January 24, 2020

Paper Group NANR 177

Attention over Heads: A Multi-Hop Attention for Neural Machine Translation

Title Attention over Heads: A Multi-Hop Attention for Neural Machine Translation
Authors Shohei Iida, Ryuichiro Kimura, Hongyi Cui, Po-Hsuan Hung, Takehito Utsuro, Masaaki Nagata
Abstract In this paper, we propose a multi-hop attention for the Transformer. It refines the attention for an output symbol by integrating that of each head, and consists of two hops. The first hop attention is the scaled dot-product attention which is the same attention mechanism used in the original Transformer. The second hop attention is a combination of multi-layer perceptron (MLP) attention and head gate, which efficiently increases the complexity of the model by adding dependencies between heads. We demonstrate that the translation accuracy of the proposed multi-hop attention outperforms the baseline Transformer significantly, +0.85 BLEU point for the IWSLT-2017 German-to-English task and +2.58 BLEU point for the WMT-2017 German-to-English task. We also find that the number of parameters required for a multi-hop attention is smaller than that for stacking another self-attention layer and the proposed model converges significantly faster than the original Transformer.
Tasks Machine Translation
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-2030/
PDF https://www.aclweb.org/anthology/P19-2030
PWC https://paperswithcode.com/paper/attention-over-heads-a-multi-hop-attention
Repo
Framework
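
The abstract above describes the first hop as the scaled dot-product attention of the original Transformer. As a rough illustration of that first hop only, here is a minimal pure-Python sketch of softmax(QK^T / sqrt(d_k))V. The function names and the toy inputs are assumptions made for this example; the second hop (MLP attention plus head gate) is not shown because the abstract does not specify it in enough detail.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    Hypothetical helper for illustration; not the authors' code.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input (assumed): identical keys give uniform attention weights,
# so the output is the mean of the value vectors.
result = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 1.0], [1.0, 1.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```

With identical keys the attention weights are uniform, so `result` is the average of the two value vectors.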

ERL-Net: Entangled Representation Learning for Single Image De-Raining

Title ERL-Net: Entangled Representation Learning for Single Image De-Raining
Authors Guoqing Wang, Changming Sun, Arcot Sowmya
Abstract Despite the significant progress achieved in image de-raining by training an encoder-decoder network within the image-to-image translation formulation, blurry results with missing details indicate the deficiency of the existing models. By interpreting the de-raining encoder-decoder network as a conditional generator, within which the decoder acts as a generator conditioned on the embedding learned by the encoder, the unsatisfactory output can be attributed to the low-quality embedding learned by the encoder. In this paper, we hypothesize that there exists an inherent mapping between the low-quality embedding to a latent optimal one, with which the generator (decoder) can produce much better results. To improve the de-raining results significantly over existing models, we propose to learn this mapping by formulating a residual learning branch, that is capable of adaptively adding residuals to the original low-quality embedding in a representation entanglement manner. Using an embedding learned this way, the decoder is able to generate much more satisfactory de-raining results with better detail recovery and rain artefacts removal, providing new state-of-the-art results on four benchmark datasets with considerable improvement (i.e., on the challenging Rain100H data, an improvement of 4.19dB on PSNR and 5% on SSIM is obtained). The entanglement can be easily adopted into any encoder-decoder based image restoration networks. Besides, we propose a series of evaluation metrics to investigate the specific contribution of the proposed entangled representation learning mechanism. Codes are available at .
Tasks Image Restoration, Image-to-Image Translation, Representation Learning
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_ERL-Net_Entangled_Representation_Learning_for_Single_Image_De-Raining_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_ERL-Net_Entangled_Representation_Learning_for_Single_Image_De-Raining_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/erl-net-entangled-representation-learning-for
Repo
Framework

A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks

Title A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks
Authors Yinghao Xu, Xin Dong, Yudian Li, Hao Su
Abstract To reduce memory footprint and run-time latency, techniques such as neural network pruning and binarization have been explored separately. However, it is unclear how to combine the best of the two worlds to get extremely small and efficient models. In this paper, we, for the first time, define the filter-level pruning problem for binary neural networks, which cannot be solved by simply migrating existing structural pruning methods for full-precision models. A novel learning-based approach is proposed to prune filters in our main/subsidiary network framework, where the main network is responsible for learning representative features to optimize the prediction performance, and the subsidiary component works as a filter selector on the main network. To avoid gradient mismatch when training the subsidiary component, we propose a layer-wise and bottom-up scheme. We also provide the theoretical and experimental comparison between our learning-based and greedy rule-based methods. Finally, we empirically demonstrate the effectiveness of our approach applied on several binary models, including binarized NIN, VGG-11, and ResNet-18, on various image classification datasets. For binary ResNet-18 on ImageNet, we use 78.6% filters but can achieve slightly better test error 49.87% (50.02%-0.15%) than the original model.
Tasks Image Classification
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Xu_A_MainSubsidiary_Network_Framework_for_Simplifying_Binary_Neural_Networks_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Xu_A_MainSubsidiary_Network_Framework_for_Simplifying_Binary_Neural_Networks_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/a-mainsubsidiary-network-framework-for-1
Repo
Framework

Semi-Supervised Monocular 3D Face Reconstruction With End-to-End Shape-Preserved Domain Transfer

Title Semi-Supervised Monocular 3D Face Reconstruction With End-to-End Shape-Preserved Domain Transfer
Authors Jingtan Piao, Chen Qian, Hongsheng Li
Abstract Monocular face reconstruction is a challenging task in computer vision, which aims to recover 3D face geometry from a single RGB face image. Recently, deep learning-based methods have achieved great improvements on monocular face reconstruction. However, for deep learning-based methods to reach optimal performance, it is paramount to have large-scale training images with ground-truth 3D face geometry, which is generally difficult for humans to annotate. To tackle this problem, we propose a semi-supervised monocular reconstruction method, which jointly optimizes a shape-preserved domain-transfer CycleGAN and a shape estimation network. The framework is trained in a semi-supervised manner with 3D rendered images with ground-truth shapes and in-the-wild face images without any extra annotation. The CycleGAN network transforms all realistic images to have the rendered style and is end-to-end trained within the overall framework. This is the key difference compared with existing CycleGAN-based learning methods, which just used CycleGAN as a separate training sample generator. Novel landmark consistency loss and edge-aware shape estimation loss are proposed for our two networks to jointly solve the challenging face reconstruction problem. Extensive experiments on public face reconstruction datasets demonstrate the effectiveness of our overall method as well as the individual components.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Piao_Semi-Supervised_Monocular_3D_Face_Reconstruction_With_End-to-End_Shape-Preserved_Domain_Transfer_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Piao_Semi-Supervised_Monocular_3D_Face_Reconstruction_With_End-to-End_Shape-Preserved_Domain_Transfer_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/semi-supervised-monocular-3d-face
Repo
Framework

Computational Investigations of Pragmatic Effects in Natural Language

Title Computational Investigations of Pragmatic Effects in Natural Language
Authors Jad Kabbara
Abstract Semantics and pragmatics are two complementary and intertwined aspects of meaning in language. The former is concerned with the literal (context-free) meaning of words and sentences; the latter focuses on the intended meaning, one that is context-dependent. While NLP research has focused in the past mostly on semantics, the goal of this thesis is to develop computational models that leverage this pragmatic knowledge in language that is crucial to performing many NLP tasks correctly. In this proposal, we begin by reviewing the current progress in this thesis, namely, on the tasks of definiteness prediction and adverbial presupposition triggering. Then we discuss the proposed research for the remainder of the thesis which builds on this progress towards the goal of building better and more pragmatically-aware natural language generation and understanding systems.
Tasks Text Generation
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-3010/
PDF https://www.aclweb.org/anthology/N19-3010
PWC https://paperswithcode.com/paper/computational-investigations-of-pragmatic
Repo
Framework

Biomedical Relation Classification by single and multiple source domain adaptation

Title Biomedical Relation Classification by single and multiple source domain adaptation
Authors Sinchani Chakraborty, Sudeshna Sarkar, Pawan Goyal, Mahanandeeshwar Gattu
Abstract Relation classification is crucial for inferring semantic relatedness between entities in a piece of text. These systems can be trained given labelled data. However, relation classification is very domain-specific and it takes a lot of effort to label data for a new domain. In this paper, we explore domain adaptation techniques for this task. While past works have focused on single source domain adaptation for bio-medical relation classification, we classify relations in an unlabeled target domain by transferring useful knowledge from one or more related source domains. Our experiments with the model have shown to improve state-of-the-art F1 score on 3 benchmark biomedical corpora for single domain and on 2 out of 3 for multi-domain scenarios. When used with contextualized embeddings, there is further boost in performance outperforming neural-network based domain adaptation baselines for both the cases.
Tasks Domain Adaptation, Relation Classification
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6210/
PDF https://www.aclweb.org/anthology/D19-6210
PWC https://paperswithcode.com/paper/biomedical-relation-classification-by-single
Repo
Framework

Detecting Causal Language Use in Science Findings

Title Detecting Causal Language Use in Science Findings
Authors Bei Yu, Yingya Li, Jun Wang
Abstract Causal interpretation of correlational findings from observational studies has been a major type of misinformation in science communication. Prior studies on identifying inappropriate use of causal language relied on manual content analysis, which is not scalable for examining a large volume of science publications. In this study, we first annotated a corpus of over 3,000 PubMed research conclusion sentences, then developed a BERT-based prediction model that classifies conclusion sentences into "no relationship", "correlational", "conditional causal", and "direct causal" categories, achieving an accuracy of 0.90 and a macro-F1 of 0.88. We then applied the prediction model to measure the causal language use in the research conclusions of about 38,000 observational studies in PubMed. The prediction result shows that 21.7% studies used direct causal language exclusively in their conclusions, and 32.4% used some direct causal language. We also found that the ratio of causal language use differs among authors from different countries, challenging the notion of a shared consensus on causal language use in the global science community. Our prediction model could also be used to help identify the inappropriate use of causal language in science publications.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1473/
PDF https://www.aclweb.org/anthology/D19-1473
PWC https://paperswithcode.com/paper/detecting-causal-language-use-in-science
Repo
Framework
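
The abstract above reports a macro-F1 score over four sentence categories. For readers unfamiliar with the metric, the following is a minimal sketch of how macro-F1 is commonly computed: per-class F1 scores are averaged with equal weight, regardless of class frequency. The function name and the toy labels are assumptions for illustration and are unrelated to the paper's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy labels (assumed), using two of the category names from the abstract.
gold = ["correlational", "correlational", "direct causal", "direct causal"]
pred = ["correlational", "direct causal", "direct causal", "direct causal"]
score = macro_f1(gold, pred)
```

Here the per-class F1 values are 2/3 and 4/5, so `score` is their plain average.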

Multi-branch fusion network for hyperspectral image classification

Title Multi-branch fusion network for hyperspectral image classification
Authors Hongmin Gao, Yao Yang, Sheng Lei, Chenming Li, Hui Zhou, Xiaoyu Qu
Abstract Hyperspectral remote sensing image (HSI) has the characteristics of large data volume and high spectral resolution. It contains abundant spectral information and has tremendous applicable value. Convolutional neural network (CNN) has been successfully applied to HSI classification. However, the limited labeled samples of the HSI make the existing CNN based HSI classification methods generally be plagued by small sample size problem and class imbalance, which cause great challenges for HSI classification. This work proposes a novel CNN architecture for HSI classification. The proposed CNN is a multi-branch fusion network, which is formed by merging multiple branches on an ordinary CNN. It can effectively extract features of HSIs. In addition, the 1 × 1 convolutional layer is introduced into the branches to reduce the number of parameters and then improve the classification efficiency. Furthermore, the L2 regularization is introduced into this work to improve the generalization performance of the proposed model under small sample set. Experimental results on three benchmark hyperspectral images demonstrate that the proposed CNN can provide excellent classification performance under small training set.
Tasks Hyperspectral Image Classification, Image Classification, L2 Regularization
Published 2019-01-18
URL https://www.sciencedirect.com/science/article/pii/S0950705119300206
PDF https://reader.elsevier.com/reader/sd/pii/S0950705119300206?token=A7FDEEE093FA10AFB265B5A06F464D619C6F100EF0EA1880E21B73D7B6EE6B3DF652205F88A7D36B9FDF06F9909C003B
PWC https://paperswithcode.com/paper/multi-branch-fusion-network-for-hyperspectral
Repo
Framework
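
The abstract above states that 1 × 1 convolutional layers are introduced into the branches to reduce the number of parameters. The arithmetic behind that kind of saving can be sketched as follows. The channel counts (256 and 64) and the two-layer bottleneck layout are assumptions chosen for illustration; they are not taken from the paper.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Parameter count of a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# Assumed setting: map 256 input channels to 256 output channels.
direct = conv_params(3, 256, 256)

# Alternative: first squeeze to 64 channels with a 1x1 convolution,
# then apply the 3x3 convolution on the reduced representation.
bottleneck = conv_params(1, 256, 64) + conv_params(3, 64, 256)

reduction = 1 - bottleneck / direct
```

Under these assumed sizes, `direct` is 590,080 parameters and `bottleneck` is 164,160, a reduction of roughly 72%.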

On Human-Aligned Risk Minimization

Title On Human-Aligned Risk Minimization
Authors Liu Leqi, Adarsh Prasad, Pradeep K. Ravikumar
Abstract The statistical decision theoretic foundations of modern machine learning have largely focused on the minimization of the expectation of some loss function for a given task. However, seminal results in behavioral economics have shown that human decision-making is based on different risk measures than the expectation of any given loss function. In this paper, we pose the following simple question: in contrast to minimizing expected loss, could we minimize a better human-aligned risk measure? While this might not seem natural at first glance, we analyze the properties of such a revised risk measure, and surprisingly show that it might also better align with additional desiderata like fairness that have attracted considerable recent attention. We focus in particular on a class of human-aligned risk measures inspired by cumulative prospect theory. We empirically study these risk measures, and demonstrate their improved performance on desiderata such as fairness, in contrast to the traditional workhorse of expected loss minimization.
Tasks Decision Making
Published 2019-12-01
URL http://papers.nips.cc/paper/9642-on-human-aligned-risk-minimization
PDF http://papers.nips.cc/paper/9642-on-human-aligned-risk-minimization.pdf
PWC https://paperswithcode.com/paper/on-human-aligned-risk-minimization
Repo
Framework

Does NMT make a difference when post-editing closely related languages? The case of Spanish-Catalan

Title Does NMT make a difference when post-editing closely related languages? The case of Spanish-Catalan
Authors Sergi Alvarez, Antoni Oliver, Toni Badia
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-6708/
PDF https://www.aclweb.org/anthology/W19-6708
PWC https://paperswithcode.com/paper/does-nmt-make-a-difference-when-post-editing
Repo
Framework

Evaluating machine translation in a low-resource language combination: Spanish-Galician.

Title Evaluating machine translation in a low-resource language combination: Spanish-Galician.
Authors María Do Campo Bayón, Pilar Sánchez-Gijón
Abstract
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-6705/
PDF https://www.aclweb.org/anthology/W19-6705
PWC https://paperswithcode.com/paper/evaluating-machine-translation-in-a-low
Repo
Framework

Domain Adaptation for Structured Output via Disentangled Patch Representations

Title Domain Adaptation for Structured Output via Disentangled Patch Representations
Authors Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker
Abstract Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn strong supervised models like convolutional neural networks. However, these models trained on one data domain may not generalize well to other domains unequipped with annotations for model finetuning. To avoid the labor-intensive process of annotation, we develop a domain adaptation method to adapt the source data to the unlabeled target domain. To this end, we propose to learn discriminative feature representations of patches based on label histograms in the source domain, through the construction of a disentangled space. With such representations as guidance, we then use an adversarial learning scheme to push the feature representations in target patches to the closer distributions in source ones. In addition, we show that our framework can integrate a global alignment process with the proposed patch-level alignment and achieve state-of-the-art performance on semantic segmentation. Extensive ablation studies and experiments are conducted on numerous benchmark datasets with various settings, such as synthetic-to-real and cross-city scenarios.
Tasks Domain Adaptation, Semantic Segmentation
Published 2019-05-01
URL https://openreview.net/forum?id=B1xFhiC9Y7
PDF https://openreview.net/pdf?id=B1xFhiC9Y7
PWC https://paperswithcode.com/paper/domain-adaptation-for-structured-output-via-1
Repo
Framework

Implementing a Multi-lingual Chatbot for Positive Reinforcement in Young Learners

Title Implementing a Multi-lingual Chatbot for Positive Reinforcement in Young Learners
Authors Francisca Oladipo, Abdulmalik Rufai
Abstract This is a humanitarian work: a counter-terrorism effort. The presentation describes the experience of developing a multilingual, interactive chatbot trained on a corpus of two Nigerian languages (Hausa and Fulfulde), with simultaneous translation to a third (Kanuri), to stimulate conversations and deliver tailored content to users, thereby aiding in the detection of the probability and degree of radicalization in young learners through data analysis of the game moves and vocabulary. As chatbots have the ability to simulate a human conversation based on rhetorical behavior, the system is able to learn the needs of each individual user through constant interaction and deliver tailored content that promotes good behavior in the Hausa, Fulfulde and Kanuri languages.
Tasks Chatbot
Published 2019-08-01
URL https://www.aclweb.org/anthology/papers/W/W19/W19-3629/
PDF https://www.aclweb.org/anthology/W19-3629
PWC https://paperswithcode.com/paper/implementing-a-multi-lingual-chatbot-for
Repo
Framework

Learning Reward Machines for Partially Observable Reinforcement Learning

Title Learning Reward Machines for Partially Observable Reinforcement Learning
Authors Rodrigo Toro Icarte, Ethan Waldie, Toryn Klassen, Rick Valenzano, Margarita Castro, Sheila McIlraith
Abstract Reward Machines (RMs), originally proposed for specifying problems in Reinforcement Learning (RL), provide a structured, automata-based representation of a reward function that allows an agent to decompose problems into subproblems that can be efficiently learned using off-policy learning. Here we show that RMs can be learned from experience, instead of being specified by the user, and that the resulting problem decomposition can be used to effectively solve partially observable RL problems. We pose the task of learning RMs as a discrete optimization problem where the objective is to find an RM that decomposes the problem into a set of subproblems such that the combination of their optimal memoryless policies is an optimal policy for the original problem. We show the effectiveness of this approach on three partially observable domains, where it significantly outperforms A3C, PPO, and ACER, and discuss its advantages, limitations, and broader potential.
Tasks Problem Decomposition
Published 2019-12-01
URL http://papers.nips.cc/paper/9685-learning-reward-machines-for-partially-observable-reinforcement-learning
PDF http://papers.nips.cc/paper/9685-learning-reward-machines-for-partially-observable-reinforcement-learning.pdf
PWC https://paperswithcode.com/paper/learning-reward-machines-for-partially
Repo
Framework
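
The abstract above describes Reward Machines as a structured, automata-based representation of a reward function. A minimal sketch of that idea is a finite-state machine whose transitions are triggered by high-level events and emit rewards. The class layout, the state names (`u0`, `u1`, `u2`), and the key-and-door task are all assumptions made for illustration; the paper's contribution, learning the machine from experience, is not shown here.

```python
class RewardMachine:
    """Finite-state reward function: (state, event) -> (next_state, reward).

    Illustrative structure only; not the authors' implementation.
    """

    def __init__(self, initial_state, transitions):
        self.initial_state = initial_state
        self.transitions = transitions
        self.state = initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, event):
        # Events with no matching transition leave the state unchanged
        # and yield zero reward.
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward


# Assumed toy task: pick up a key, then open a door.
machine = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "key"): ("u1", 0.0),
        ("u1", "door"): ("u2", 1.0),
    },
)
rewards = [machine.step(event) for event in ["door", "key", "door"]]
```

The first `door` event earns nothing because the key has not been collected yet; the machine state acts as memory of that progress, which is what makes the representation useful under partial observability.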

Abbreviation Explorer - an interactive system for pre-evaluation of Unsupervised Abbreviation Disambiguation

Title Abbreviation Explorer - an interactive system for pre-evaluation of Unsupervised Abbreviation Disambiguation
Authors Manuel R. Ciosici, Ira Assent
Abstract We present Abbreviation Explorer, a system that supports interactive exploration of abbreviations that are challenging for Unsupervised Abbreviation Disambiguation (UAD). Abbreviation Explorer helps to identify long-forms that are easily confused, and to pinpoint likely causes such as limitations of normalization, language switching, or inconsistent typing. It can also support determining which long-forms would benefit from additional input text for unsupervised abbreviation disambiguation. The system provides options for creating corrective rules that merge redundant long-forms with identical meaning. The identified rules can be easily applied to the already existing vector spaces used by UAD to improve disambiguation performance, while also avoiding the cost of retraining.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-4001/
PDF https://www.aclweb.org/anthology/N19-4001
PWC https://paperswithcode.com/paper/abbreviation-explorer-an-interactive-system
Repo
Framework