Paper Group NANR 177
Attention over Heads: A Multi-Hop Attention for Neural Machine Translation
Title | Attention over Heads: A Multi-Hop Attention for Neural Machine Translation |
Authors | Shohei Iida, Ryuichiro Kimura, Hongyi Cui, Po-Hsuan Hung, Takehito Utsuro, Masaaki Nagata |
Abstract | In this paper, we propose a multi-hop attention for the Transformer. It refines the attention for an output symbol by integrating that of each head, and consists of two hops. The first hop attention is the scaled dot-product attention which is the same attention mechanism used in the original Transformer. The second hop attention is a combination of multi-layer perceptron (MLP) attention and head gate, which efficiently increases the complexity of the model by adding dependencies between heads. We demonstrate that the translation accuracy of the proposed multi-hop attention outperforms the baseline Transformer significantly, +0.85 BLEU point for the IWSLT-2017 German-to-English task and +2.58 BLEU point for the WMT-2017 German-to-English task. We also find that the number of parameters required for a multi-hop attention is smaller than that for stacking another self-attention layer and the proposed model converges significantly faster than the original Transformer. |
Tasks | Machine Translation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-2030/ |
https://www.aclweb.org/anthology/P19-2030 | |
PWC | https://paperswithcode.com/paper/attention-over-heads-a-multi-hop-attention |
Repo | |
Framework | |
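The first hop described in the abstract above is the standard scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The following is a minimal pure-Python sketch of that first hop only, for a single head; the function names and the list-of-rows layout are illustrative assumptions, and the second hop (MLP attention plus head gate) is not shown because the abstract does not specify it in enough detail.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    queries: list of d_k-dimensional rows
    keys:    list of d_k-dimensional rows (one per source position)
    values:  list of d_v-dimensional rows (one per source position)
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value rows.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return outputs
```

A query that is equally similar to every key yields uniform weights, so the output is the plain average of the value rows.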
ERL-Net: Entangled Representation Learning for Single Image De-Raining
Title | ERL-Net: Entangled Representation Learning for Single Image De-Raining |
Authors | Guoqing Wang, Changming Sun, Arcot Sowmya |
Abstract | Despite the significant progress achieved in image de-raining by training an encoder-decoder network within the image-to-image translation formulation, blurry results with missing details indicate the deficiency of the existing models. By interpreting the de-raining encoder-decoder network as a conditional generator, within which the decoder acts as a generator conditioned on the embedding learned by the encoder, the unsatisfactory output can be attributed to the low-quality embedding learned by the encoder. In this paper, we hypothesize that there exists an inherent mapping from the low-quality embedding to a latent optimal one, with which the generator (decoder) can produce much better results. To improve the de-raining results significantly over existing models, we propose to learn this mapping by formulating a residual learning branch that is capable of adaptively adding residuals to the original low-quality embedding in a representation entanglement manner. Using an embedding learned this way, the decoder is able to generate much more satisfactory de-raining results with better detail recovery and rain artefact removal, providing new state-of-the-art results on four benchmark datasets with considerable improvement (e.g., on the challenging Rain100H data, an improvement of 4.19dB on PSNR and 5% on SSIM is obtained). The entanglement can be easily adopted into any encoder-decoder based image restoration network. Besides, we propose a series of evaluation metrics to investigate the specific contribution of the proposed entangled representation learning mechanism. Codes are available at . |
Tasks | Image Restoration, Image-to-Image Translation, Representation Learning |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_ERL-Net_Entangled_Representation_Learning_for_Single_Image_De-Raining_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_ERL-Net_Entangled_Representation_Learning_for_Single_Image_De-Raining_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/erl-net-entangled-representation-learning-for |
Repo | |
Framework | |
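The abstract above reports its gain in PSNR (peak signal-to-noise ratio), which is defined as 10 * log10(MAX^2 / MSE). The sketch below computes it for two images given as flat lists of pixel intensities; the function name, the flat-list layout, and the default peak value of 255 are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities in [0, max_value].
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 4.19 dB at a fixed peak value corresponds to an MSE that is lower by a factor of 10^(4.19/10), roughly 2.6.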
A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks
Title | A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks |
Authors | Yinghao Xu, Xin Dong, Yudian Li, Hao Su |
Abstract | To reduce memory footprint and run-time latency, techniques such as neural network pruning and binarization have been explored separately. However, it is unclear how to combine the best of the two worlds to get extremely small and efficient models. In this paper, we, for the first time, define the filter-level pruning problem for binary neural networks, which cannot be solved by simply migrating existing structural pruning methods for full-precision models. A novel learning-based approach is proposed to prune filters in our main/subsidiary network framework, where the main network is responsible for learning representative features to optimize the prediction performance, and the subsidiary component works as a filter selector on the main network. To avoid gradient mismatch when training the subsidiary component, we propose a layer-wise and bottom-up scheme. We also provide the theoretical and experimental comparison between our learning-based and greedy rule-based methods. Finally, we empirically demonstrate the effectiveness of our approach applied on several binary models, including binarized NIN, VGG-11, and ResNet-18, on various image classification datasets. For binary ResNet-18 on ImageNet, we use 78.6% filters but can achieve slightly better test error 49.87% (50.02%-0.15%) than the original model. |
Tasks | Image Classification |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Xu_A_MainSubsidiary_Network_Framework_for_Simplifying_Binary_Neural_Networks_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Xu_A_MainSubsidiary_Network_Framework_for_Simplifying_Binary_Neural_Networks_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/a-mainsubsidiary-network-framework-for-1 |
Repo | |
Framework | |
Semi-Supervised Monocular 3D Face Reconstruction With End-to-End Shape-Preserved Domain Transfer
Title | Semi-Supervised Monocular 3D Face Reconstruction With End-to-End Shape-Preserved Domain Transfer |
Authors | Jingtan Piao, Chen Qian, Hongsheng Li |
Abstract | Monocular face reconstruction is a challenging task in computer vision, which aims to recover 3D face geometry from a single RGB face image. Recently, deep learning based methods have achieved great improvements on monocular face reconstruction. However, for deep learning-based methods to reach optimal performance, it is paramount to have large-scale training images with ground-truth 3D face geometry, which is generally difficult for humans to annotate. To tackle this problem, we propose a semi-supervised monocular reconstruction method, which jointly optimizes a shape-preserved domain-transfer CycleGAN and a shape estimation network. The framework is trained in a semi-supervised manner with 3D rendered images with ground-truth shapes and in-the-wild face images without any extra annotation. The CycleGAN network transforms all realistic images to have the rendered style and is end-to-end trained within the overall framework. This is the key difference compared with existing CycleGAN-based learning methods, which just used CycleGAN as a separate training sample generator. Novel landmark consistency loss and edge-aware shape estimation loss are proposed for our two networks to jointly solve the challenging face reconstruction problem. Extensive experiments on public face reconstruction datasets demonstrate the effectiveness of our overall method as well as the individual components. |
Tasks | 3D Face Reconstruction, Face Reconstruction |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Piao_Semi-Supervised_Monocular_3D_Face_Reconstruction_With_End-to-End_Shape-Preserved_Domain_Transfer_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Piao_Semi-Supervised_Monocular_3D_Face_Reconstruction_With_End-to-End_Shape-Preserved_Domain_Transfer_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-monocular-3d-face |
Repo | |
Framework | |
Computational Investigations of Pragmatic Effects in Natural Language
Title | Computational Investigations of Pragmatic Effects in Natural Language |
Authors | Jad Kabbara |
Abstract | Semantics and pragmatics are two complementary and intertwined aspects of meaning in language. The former is concerned with the literal (context-free) meaning of words and sentences, while the latter focuses on the intended meaning, one that is context-dependent. While NLP research has focused in the past mostly on semantics, the goal of this thesis is to develop computational models that leverage this pragmatic knowledge in language that is crucial to performing many NLP tasks correctly. In this proposal, we begin by reviewing the current progress in this thesis, namely, on the tasks of definiteness prediction and adverbial presupposition triggering. Then we discuss the proposed research for the remainder of the thesis which builds on this progress towards the goal of building better and more pragmatically-aware natural language generation and understanding systems. |
Tasks | Text Generation |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-3010/ |
https://www.aclweb.org/anthology/N19-3010 | |
PWC | https://paperswithcode.com/paper/computational-investigations-of-pragmatic |
Repo | |
Framework | |
Biomedical Relation Classification by single and multiple source domain adaptation
Title | Biomedical Relation Classification by single and multiple source domain adaptation |
Authors | Sinchani Chakraborty, Sudeshna Sarkar, Pawan Goyal, Mahanandeeshwar Gattu |
Abstract | Relation classification is crucial for inferring semantic relatedness between entities in a piece of text. These systems can be trained given labelled data. However, relation classification is very domain-specific and it takes a lot of effort to label data for a new domain. In this paper, we explore domain adaptation techniques for this task. While past works have focused on single source domain adaptation for bio-medical relation classification, we classify relations in an unlabeled target domain by transferring useful knowledge from one or more related source domains. In our experiments, the model improves on the state-of-the-art F1 score on 3 benchmark biomedical corpora in the single-domain scenario and on 2 out of 3 in the multi-domain scenario. When used with contextualized embeddings, there is a further boost in performance, outperforming neural-network based domain adaptation baselines in both cases. |
Tasks | Domain Adaptation, Relation Classification |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6210/ |
https://www.aclweb.org/anthology/D19-6210 | |
PWC | https://paperswithcode.com/paper/biomedical-relation-classification-by-single |
Repo | |
Framework | |
Detecting Causal Language Use in Science Findings
Title | Detecting Causal Language Use in Science Findings |
Authors | Bei Yu, Yingya Li, Jun Wang |
Abstract | Causal interpretation of correlational findings from observational studies has been a major type of misinformation in science communication. Prior studies on identifying inappropriate use of causal language relied on manual content analysis, which is not scalable for examining a large volume of science publications. In this study, we first annotated a corpus of over 3,000 PubMed research conclusion sentences, then developed a BERT-based prediction model that classifies conclusion sentences into "no relationship", "correlational", "conditional causal", and "direct causal" categories, achieving an accuracy of 0.90 and a macro-F1 of 0.88. We then applied the prediction model to measure the causal language use in the research conclusions of about 38,000 observational studies in PubMed. The prediction result shows that 21.7% of studies used direct causal language exclusively in their conclusions, and 32.4% used some direct causal language. We also found that the ratio of causal language use differs among authors from different countries, challenging the notion of a shared consensus on causal language use in the global science community. Our prediction model could also be used to help identify the inappropriate use of causal language in science publications. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1473/ |
https://www.aclweb.org/anthology/D19-1473 | |
PWC | https://paperswithcode.com/paper/detecting-causal-language-use-in-science |
Repo | |
Framework | |
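The abstract above reports both accuracy and macro-F1 for a four-way classifier. Macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below shows how the two metrics are computed from gold and predicted labels; the function names and the toy labels are illustrative assumptions, and none of the numbers come from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label equals the gold label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores over the given labels."""
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example with two of the four categories (made-up labels).
gold = ["correlational", "correlational", "direct causal", "direct causal"]
pred = ["correlational", "direct causal", "direct causal", "direct causal"]
toy_accuracy = accuracy(gold, pred)
toy_macro_f1 = macro_f1(gold, pred, ["correlational", "direct causal"])
```

In the toy example the per-class F1 scores are 2/3 and 4/5, so macro-F1 is their mean, while accuracy is 3/4.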
Multi-branch fusion network for hyperspectral image classification
Title | Multi-branch fusion network for hyperspectral image classification |
Authors | Hongmin Gao, Yao Yang, Sheng Lei, Chenming Li, Hui Zhou, Xiaoyu Qu |
Abstract | Hyperspectral remote sensing images (HSIs) are characterized by large data volume and high spectral resolution. They contain abundant spectral information and have tremendous practical value. Convolutional neural networks (CNNs) have been successfully applied to HSI classification. However, because labeled HSI samples are limited, existing CNN-based HSI classification methods are generally plagued by the small-sample-size problem and class imbalance, which pose great challenges for HSI classification. This work proposes a novel CNN architecture for HSI classification. The proposed CNN is a multi-branch fusion network, which is formed by merging multiple branches on an ordinary CNN. It can effectively extract features of HSIs. In addition, a 1 × 1 convolutional layer is introduced into the branches to reduce the number of parameters and thereby improve the classification efficiency. Furthermore, L2 regularization is introduced to improve the generalization performance of the proposed model on small sample sets. Experimental results on three benchmark hyperspectral images demonstrate that the proposed CNN can provide excellent classification performance with small training sets. |
Tasks | Hyperspectral Image Classification, Image Classification, L2 Regularization |
Published | 2019-01-18 |
URL | https://www.sciencedirect.com/science/article/pii/S0950705119300206 |
https://reader.elsevier.com/reader/sd/pii/S0950705119300206?token=A7FDEEE093FA10AFB265B5A06F464D619C6F100EF0EA1880E21B73D7B6EE6B3DF652205F88A7D36B9FDF06F9909C003B | |
PWC | https://paperswithcode.com/paper/multi-branch-fusion-network-for-hyperspectral |
Repo | |
Framework | |
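The abstract above states that 1 × 1 convolutions reduce the number of parameters. A convolutional layer with a k × k kernel, c_in input channels, and c_out output channels has k * k * c_in * c_out weights plus c_out biases, so reducing the channel count with a 1 × 1 layer before a larger kernel shrinks the total. The sketch below works through that arithmetic; the channel sizes (256 and 64) are hypothetical values chosen for illustration and are not taken from the paper.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Number of trainable parameters in a 2D convolutional layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# Hypothetical channel sizes, for illustration only.
direct = conv_params(3, 256, 256)        # one 3x3 layer, 256 -> 256
reduce = conv_params(1, 256, 64)         # 1x1 layer squeezes 256 -> 64
expand = conv_params(3, 64, 256)         # 3x3 layer on the reduced input
bottleneck = reduce + expand

print(direct, bottleneck)  # 590080 164160
```

With these assumed sizes, the 1 × 1 reduction cuts the parameter count to well under a third of the direct 3 × 3 layer.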
On Human-Aligned Risk Minimization
Title | On Human-Aligned Risk Minimization |
Authors | Liu Leqi, Adarsh Prasad, Pradeep K. Ravikumar |
Abstract | The statistical decision theoretic foundations of modern machine learning have largely focused on the minimization of the expectation of some loss function for a given task. However, seminal results in behavioral economics have shown that human decision-making is based on different risk measures than the expectation of any given loss function. In this paper, we pose the following simple question: in contrast to minimizing expected loss, could we minimize a better human-aligned risk measure? While this might not seem natural at first glance, we analyze the properties of such a revised risk measure, and surprisingly show that it might also better align with additional desiderata like fairness that have attracted considerable recent attention. We focus in particular on a class of human-aligned risk measures inspired by cumulative prospect theory. We empirically study these risk measures, and demonstrate their improved performance on desiderata such as fairness, in contrast to the traditional workhorse of expected loss minimization. |
Tasks | Decision Making |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9642-on-human-aligned-risk-minimization |
http://papers.nips.cc/paper/9642-on-human-aligned-risk-minimization.pdf | |
PWC | https://paperswithcode.com/paper/on-human-aligned-risk-minimization |
Repo | |
Framework | |
Does NMT make a difference when post-editing closely related languages? The case of Spanish-Catalan
Title | Does NMT make a difference when post-editing closely related languages? The case of Spanish-Catalan |
Authors | Sergi Alvarez, Antoni Oliver, Toni Badia |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-6708/ |
https://www.aclweb.org/anthology/W19-6708 | |
PWC | https://paperswithcode.com/paper/does-nmt-make-a-difference-when-post-editing |
Repo | |
Framework | |
Evaluating machine translation in a low-resource language combination: Spanish-Galician.
Title | Evaluating machine translation in a low-resource language combination: Spanish-Galician. |
Authors | María Do Campo Bayón, Pilar Sánchez-Gijón |
Abstract | |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-6705/ |
https://www.aclweb.org/anthology/W19-6705 | |
PWC | https://paperswithcode.com/paper/evaluating-machine-translation-in-a-low |
Repo | |
Framework | |
Domain Adaptation for Structured Output via Disentangled Patch Representations
Title | Domain Adaptation for Structured Output via Disentangled Patch Representations |
Authors | Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker |
Abstract | Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn strong supervised models like convolutional neural networks. However, these models trained on one data domain may not generalize well to other domains unequipped with annotations for model finetuning. To avoid the labor-intensive process of annotation, we develop a domain adaptation method to adapt the source data to the unlabeled target domain. To this end, we propose to learn discriminative feature representations of patches based on label histograms in the source domain, through the construction of a disentangled space. With such representations as guidance, we then use an adversarial learning scheme to push the feature representations in target patches to the closer distributions in source ones. In addition, we show that our framework can integrate a global alignment process with the proposed patch-level alignment and achieve state-of-the-art performance on semantic segmentation. Extensive ablation studies and experiments are conducted on numerous benchmark datasets with various settings, such as synthetic-to-real and cross-city scenarios. |
Tasks | Domain Adaptation, Semantic Segmentation |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1xFhiC9Y7 |
https://openreview.net/pdf?id=B1xFhiC9Y7 | |
PWC | https://paperswithcode.com/paper/domain-adaptation-for-structured-output-via-1 |
Repo | |
Framework | |
Implementing a Multi-lingual Chatbot for Positive Reinforcement in Young Learners
Title | Implementing a Multi-lingual Chatbot for Positive Reinforcement in Young Learners |
Authors | Francisca Oladipo, Abdulmalik Rufai |
Abstract | This is a humanitarian work: a counter-terrorism effort. The presentation describes the experiences of developing a multilingual, interactive chatbot trained on the corpus of two Nigerian languages (Hausa and Fulfulde), with simultaneous translation to a third (Kanuri), to stimulate conversations and deliver tailored content to the users, thereby aiding in the detection of the probability and degree of radicalization in young learners through data analysis of game moves and vocabularies. As chatbots have the ability to simulate a human conversation based on rhetorical behavior, the system is able to learn the needs of individual users through constant interaction and deliver tailored content that promotes good behavior in the Hausa, Fulfulde and Kanuri languages. |
Tasks | Chatbot |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/papers/W/W19/W19-3629/ |
https://www.aclweb.org/anthology/W19-3629 | |
PWC | https://paperswithcode.com/paper/implementing-a-multi-lingual-chatbot-for |
Repo | |
Framework | |
Learning Reward Machines for Partially Observable Reinforcement Learning
Title | Learning Reward Machines for Partially Observable Reinforcement Learning |
Authors | Rodrigo Toro Icarte, Ethan Waldie, Toryn Klassen, Rick Valenzano, Margarita Castro, Sheila McIlraith |
Abstract | Reward Machines (RMs), originally proposed for specifying problems in Reinforcement Learning (RL), provide a structured, automata-based representation of a reward function that allows an agent to decompose problems into subproblems that can be efficiently learned using off-policy learning. Here we show that RMs can be learned from experience, instead of being specified by the user, and that the resulting problem decomposition can be used to effectively solve partially observable RL problems. We pose the task of learning RMs as a discrete optimization problem where the objective is to find an RM that decomposes the problem into a set of subproblems such that the combination of their optimal memoryless policies is an optimal policy for the original problem. We show the effectiveness of this approach on three partially observable domains, where it significantly outperforms A3C, PPO, and ACER, and discuss its advantages, limitations, and broader potential. |
Tasks | Problem Decomposition |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9685-learning-reward-machines-for-partially-observable-reinforcement-learning |
http://papers.nips.cc/paper/9685-learning-reward-machines-for-partially-observable-reinforcement-learning.pdf | |
PWC | https://paperswithcode.com/paper/learning-reward-machines-for-partially |
Repo | |
Framework | |
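The abstract above describes a reward machine as an automata-based representation of a reward function. As a rough illustration of that data structure, the sketch below implements a hand-specified reward machine as a finite-state machine whose transitions fire on high-level events and emit rewards. The class name, the single-label event encoding, and the "key then door" task are all hypothetical; the paper's actual contribution, learning the machine from experience, is not shown here.

```python
class RewardMachine:
    """A finite-state machine that maps event sequences to rewards.

    transitions: dict mapping (state, event) -> (next_state, reward).
    Pairs that are not listed leave the state unchanged and give 0 reward.
    """

    def __init__(self, initial_state, transitions, terminal_states=()):
        self.initial_state = initial_state
        self.transitions = dict(transitions)
        self.terminal_states = set(terminal_states)
        self.state = initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, event):
        """Advance on one observed event and return the emitted reward."""
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def is_done(self):
        return self.state in self.terminal_states


# Hypothetical task: pick up a key, then open a door.
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "key"): ("u1", 0.0),   # subproblem 1: reach the key
        ("u1", "door"): ("u2", 1.0),  # subproblem 2: reach the door
    },
    terminal_states={"u2"},
)
rewards = [rm.step(event) for event in ["door", "key", "door"]]
```

Each machine state corresponds to one subproblem, which is what lets an agent learn a separate policy per state: opening the door before holding the key earns nothing, while the same event after the key transition earns the reward.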
Abbreviation Explorer - an interactive system for pre-evaluation of Unsupervised Abbreviation Disambiguation
Title | Abbreviation Explorer - an interactive system for pre-evaluation of Unsupervised Abbreviation Disambiguation |
Authors | Manuel R. Ciosici, Ira Assent |
Abstract | We present Abbreviation Explorer, a system that supports interactive exploration of abbreviations that are challenging for Unsupervised Abbreviation Disambiguation (UAD). Abbreviation Explorer helps to identify long-forms that are easily confused, and to pinpoint likely causes such as limitations of normalization, language switching, or inconsistent typing. It can also support determining which long-forms would benefit from additional input text for unsupervised abbreviation disambiguation. The system provides options for creating corrective rules that merge redundant long-forms with identical meaning. The identified rules can be easily applied to the already existing vector spaces used by UAD to improve disambiguation performance, while also avoiding the cost of retraining. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-4001/ |
https://www.aclweb.org/anthology/N19-4001 | |
PWC | https://paperswithcode.com/paper/abbreviation-explorer-an-interactive-system |
Repo | |
Framework | |