January 24, 2020

Paper Group NANR 153

Understanding Data Augmentation in Neural Machine Translation: Two Perspectives towards Generalization


Title	Understanding Data Augmentation in Neural Machine Translation: Two Perspectives towards Generalization
Authors	Guanlin Li, Lemao Liu, Guoping Huang, Conghui Zhu, Tiejun Zhao
Abstract	Many Data Augmentation (DA) methods have been proposed for neural machine translation. Existing works measure the superiority of DA methods in terms of their performance on a specific test set, but we find that some DA methods do not exhibit consistent improvements across translation tasks. Based on this observation, this paper makes an initial attempt to answer a fundamental question: what benefits, which are consistent across different methods and tasks, does DA in general obtain? Inspired by recent theoretical advances in deep learning, the paper understands DA from two perspectives towards the generalization ability of a model: input sensitivity and prediction margin, which are defined independently of any specific test set and thereby may lead to findings with relatively low variance. Extensive experiments show that relatively consistent benefits across five DA methods and four translation tasks are achieved regarding both perspectives.
Tasks	Data Augmentation, Machine Translation
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1570/
PDF	https://www.aclweb.org/anthology/D19-1570
PWC	https://paperswithcode.com/paper/understanding-data-augmentation-in-neural
Repo
Framework

Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction


Title	Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction
Authors	Youngwoon Lee, Shao-Hua Sun, Sriram Somasundaram, Edward Hu, Joseph J. Lim
Abstract	Intelligent creatures acquire complex skills by exploiting previously learned skills and learning to transition between them. To empower machines with this ability, we propose transition policies which effectively connect primitive skills to perform sequential tasks without handcrafted rewards. To effectively train our transition policies, we introduce proximity predictors which induce rewards gauging proximity to suitable initial states for the next skill. The proposed method is evaluated on a diverse set of experiments for continuous control in both bi-pedal locomotion and robotic arm manipulation tasks in MuJoCo. We demonstrate that transition policies enable us to effectively learn complex tasks and the induced proximity reward computed using the initiation predictor improves training efficiency. Videos of policies learned by our algorithm and baselines can be found at https://sites.google.com/view/transitions-iclr2019 .
Tasks	Continuous Control
Published	2019-05-01
URL	https://openreview.net/forum?id=rygrBhC5tQ
PDF	https://openreview.net/pdf?id=rygrBhC5tQ
PWC	https://paperswithcode.com/paper/composing-complex-skills-by-learning
Repo
Framework

Improving Abstractive Document Summarization with Salient Information Modeling


Title	Improving Abstractive Document Summarization with Salient Information Modeling
Authors	Yongjian You, Weijia Jia, Tianyi Liu, Wenmian Yang
Abstract	Comprehensive document encoding and salient information selection are two major difficulties for generating summaries with adequate salient information. To tackle the above difficulties, we propose a Transformer-based encoder-decoder framework with two novel extensions for abstractive document summarization. Specifically, (1) to encode the documents comprehensively, we design a focus-attention mechanism and incorporate it into the encoder. This mechanism models a Gaussian focal bias on attention scores to enhance the perception of local context, which contributes to producing salient and informative summaries. (2) To distinguish salient information precisely, we design an independent saliency-selection network which manages the information flow from encoder to decoder. This network effectively reduces the influences of secondary information on the generated summaries. Experimental results on the popular CNN/Daily Mail benchmark demonstrate that our model outperforms other state-of-the-art baselines on the ROUGE metrics.
Tasks	Document Summarization
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1205/
PDF	https://www.aclweb.org/anthology/P19-1205
PWC	https://paperswithcode.com/paper/improving-abstractive-document-summarization
Repo
Framework
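
The Gaussian focal bias mentioned in the abstract can be illustrated with a minimal sketch. It assumes the bias is a distance penalty subtracted from the raw attention scores before the softmax; the fixed `center` and `sigma` arguments and the function name `focus_attention` are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def focus_attention(scores, center, sigma):
    """Softmax over attention scores after adding a Gaussian focal bias.

    scores: raw attention scores of one query over all key positions.
    center: position the attention should focus on (assumed fixed here).
    sigma:  width of the focus window (assumed fixed here).
    """
    # Penalise each key position by its squared distance from the center.
    biased = [s - (j - center) ** 2 / (2.0 * sigma ** 2)
              for j, s in enumerate(scores)]
    # Numerically stable softmax.
    peak = max(biased)
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


# With uniform raw scores, the bias alone concentrates weight around position 2.
weights = focus_attention([0.0] * 5, center=2, sigma=1.0)
```

A smaller `sigma` narrows the window, so more of the attention mass stays on the local context around `center`.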

Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network


Title	Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network
Authors	Hitesh Golchha, Mauajama Firdaus, Asif Ekbal, Pushpak Bhattacharyya
Abstract	In this paper, we propose an effective deep learning framework for inducing courteous behavior in customer care responses. The interaction between a customer and the customer care representative contributes substantially to the overall customer experience. Thus it is imperative for customer care agents and chatbots engaging with humans to be personal, cordial and emphatic to ensure customer satisfaction and retention. Our system aims at automatically transforming neutral customer care responses into courteous replies. Along with stylistic transfer (of courtesy), our system ensures that responses are coherent with the conversation history, and generates courteous expressions consistent with the emotional state of the customer. Our technique is based on a reinforced pointer-generator model for the sequence to sequence task. The model is also conditioned on a hierarchically encoded and emotionally aware conversational context. We use real interactions on Twitter between customer care professionals and aggrieved customers to create a large conversational dataset having both forms of agent responses: 'generic' and 'courteous'. We perform quantitative and qualitative analyses on established and task-specific metrics, both automatic and human evaluation based. Our evaluation shows that the proposed models can generate emotionally-appropriate courteous expressions while preserving the content. Experimental results also prove that our proposed approach performs better than the baseline models.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1091/
PDF	https://www.aclweb.org/anthology/N19-1091
PWC	https://paperswithcode.com/paper/courteously-yours-inducing-courteous-behavior
Repo
Framework

Creative Flow+ Dataset


Title	Creative Flow+ Dataset
Authors	Maria Shugrina, Ziheng Liang, Amlan Kar, Jiaman Li, Angad Singh, Karan Singh, Sanja Fidler
Abstract	We present the Creative Flow+ Dataset, the first diverse multi-style artistic video dataset richly labeled with per-pixel optical flow, occlusions, correspondences, segmentation labels, normals, and depth. Our dataset includes 3000 animated sequences rendered using styles randomly selected from 40 textured line styles and 38 shading styles, spanning the range between flat cartoon fill and wildly sketchy shading. Our dataset includes 124K+ train set frames and 10K test set frames rendered at 1500x1500 resolution, far surpassing the largest available optical flow datasets in size. While modern techniques for tasks such as optical flow estimation achieve impressive performance on realistic images and video, today there is no way to gauge their performance on non-photorealistic images. Creative Flow+ poses a new challenge to generalize real-world Computer Vision to messy stylized content. We show that learning-based optical flow methods fail to generalize to this data and struggle to compete with classical approaches, and invite new research in this area. Our dataset and a new optical flow benchmark will be publicly available at: www.cs.toronto.edu/creativeflow/. We further release the complete dataset creation pipeline, allowing the community to generate and stylize their own data on demand.
Tasks	Optical Flow Estimation
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Shugrina_Creative_Flow_Dataset_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Shugrina_Creative_Flow_Dataset_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/creative-flow-dataset
Repo
Framework

A Structural Probe for Finding Syntax in Word Representations


Title	A Structural Probe for Finding Syntax in Word Representations
Authors	John Hewitt, Christopher D. Manning
Abstract	Recent work has improved our ability to detect linguistic knowledge in word representations. However, current methods for detecting syntactic knowledge do not test whether syntax trees are represented in their entirety. In this work, we propose a structural probe, which evaluates whether syntax trees are embedded in a linear transformation of a neural network's word representation space. The probe identifies a linear transformation under which squared L2 distance encodes the distance between words in the parse tree, and one in which squared L2 norm encodes depth in the parse tree. Using our probe, we show that such transformations exist for both ELMo and BERT but not in baselines, providing evidence that entire syntax trees are embedded implicitly in deep models' vector geometry.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1419/
PDF	https://www.aclweb.org/anthology/N19-1419
PWC	https://paperswithcode.com/paper/a-structural-probe-for-finding-syntax-in-word
Repo
Framework
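
The two quantities the abstract describes, squared L2 distance and squared L2 norm under a linear transformation, can be written out as a small sketch. The matrix `B` stands in for the learned transformation; the function names and the toy values are assumptions made for illustration, and training `B` against parse-tree distances is not shown.

```python
def matvec(B, v):
    """Multiply matrix B (list of rows) by vector v."""
    return [sum(b * x for b, x in zip(row, v)) for row in B]


def probe_distance(B, h_i, h_j):
    """Squared L2 distance between two word vectors after applying B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return sum(x * x for x in matvec(B, diff))


def probe_depth(B, h):
    """Squared L2 norm of a word vector after applying B."""
    return sum(x * x for x in matvec(B, h))


# Toy example with a hypothetical 2x2 transformation.
B = [[1.0, 0.0], [0.0, 2.0]]
print(probe_distance(B, [1.0, 1.0], [0.0, 0.0]))  # 1^2 + 2^2 = 5.0
print(probe_depth(B, [1.0, 1.0]))                 # 5.0
```

In the probing setup, the first quantity would be compared with the number of edges between two words in the parse tree, and the second with a word's depth in the tree.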

Unsupervised Dialogue Spectrum Generation for Log Dialogue Ranking


Title	Unsupervised Dialogue Spectrum Generation for Log Dialogue Ranking
Authors	Xinnuo Xu, Yizhe Zhang, Lars Liden, Sungjin Lee
Abstract	Although the data-driven approaches of some recent bot building platforms make it possible for a wide range of users to easily create dialogue systems, those platforms don't offer tools for quickly identifying which log dialogues contain problems. This is important since corrections to log dialogues provide a means to improve performance after deployment. A log dialogue ranker, which ranks problematic dialogues higher, is an essential tool due to the sheer volume of log dialogues that could be generated. However, training a ranker typically requires labelling a substantial amount of data, which is not feasible for most users. In this paper, we present a novel unsupervised approach for dialogue ranking using GANs and release a corpus of labelled dialogues for evaluation and comparison with supervised methods. The evaluation result shows that our method compares favorably to supervised methods without any labelled data.
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/W19-5919/
PDF	https://www.aclweb.org/anthology/W19-5919
PWC	https://paperswithcode.com/paper/unsupervised-dialogue-spectrum-generation-for
Repo
Framework

AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation


Title	AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation
Authors	Mohammad Tavakolian, Hamed R. Tavakoli, Abdenour Hadid
Abstract	We propose an Adaptive Weighted Spatiotemporal Distillation (AWSD) technique for video representation by encoding the appearance and dynamics of the videos into a single RGB image map. This is obtained by adaptively dividing the videos into small segments and comparing two consecutive segments. This allows using pre-trained models on still images for video classification while successfully capturing the spatiotemporal variations in the videos. The adaptive segment selection enables effective encoding of the essential discriminative information of untrimmed videos. Based on Gaussian Scale Mixture, we compute the weights by extracting the mutual information between two consecutive segments. Unlike pooling-based methods, our AWSD gives more importance to the frames that characterize actions or events thanks to its adaptive segment length selection. We conducted extensive experimental analysis to evaluate the effectiveness of our proposed method and compared our results against those of recent state-of-the-art methods on four benchmark datasets, including UCF101, HMDB51, ActivityNet v1.3, and Maryland. The obtained results on these benchmark datasets showed that our method significantly outperforms earlier works and sets the new state-of-the-art performance in video classification. Code is available at the project webpage: https://mohammadt68.github.io/AWSD/
Tasks	Video Classification
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Tavakolian_AWSD_Adaptive_Weighted_Spatiotemporal_Distillation_for_Video_Representation_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Tavakolian_AWSD_Adaptive_Weighted_Spatiotemporal_Distillation_for_Video_Representation_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/awsd-adaptive-weighted-spatiotemporal
Repo
Framework

Towards Automatic Variant Analysis of Ancient Devotional Texts


Title	Towards Automatic Variant Analysis of Ancient Devotional Texts
Authors	Amir Hazem, Béatrice Daille, Dominique Stutzmann, Jacob Currie, Christine Jacquin
Abstract	We address in this paper the issue of text reuse in liturgical manuscripts of the middle ages. More specifically, we study variant readings of the Obsecro Te prayer, part of the devotional Books of Hours often used by Christians as guidance for their daily prayers. We aim at automatically extracting and categorising pairs of words and expressions that exhibit variant relations. For this purpose, we adopt a linguistic classification that allows to better characterize the variants than edit operations. Then, we study the evolution of Obsecro Te texts from a temporal and geographical axis. Finally, we contrast several unsupervised state-of-the-art approaches for the automatic extraction of Obsecro Te variants. Based on the manual observation of 772 Obsecro Te copies which show more than 21,000 variants, we show that the proposed methodology is helpful for an automatic study of variants and may serve as basis to analyze and to depict useful information from devotional texts.
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4730/
PDF	https://www.aclweb.org/anthology/W19-4730
PWC	https://paperswithcode.com/paper/towards-automatic-variant-analysis-of-ancient
Repo
Framework

nlpUP at SemEval-2019 Task 6: A Deep Neural Language Model for Offensive Language Detection


Title	nlpUP at SemEval-2019 Task 6: A Deep Neural Language Model for Offensive Language Detection
Authors	Jelena Mitrović, Bastian Birkeneder, Michael Granitzer
Abstract	This paper presents our submission for the SemEval shared task 6, sub-task A on the identification of offensive language. Our proposed model, C-BiGRU, combines a Convolutional Neural Network (CNN) with a bidirectional Recurrent Neural Network (RNN). We utilize word2vec to capture the semantic similarities between words. This composition allows us to extract long term dependencies in tweets and distinguish between offensive and non-offensive tweets. In addition, we evaluate our approach on a different dataset and show that our model is capable of detecting online aggressiveness in both English and German tweets. Our model achieved a macro F1-score of 79.40% on the SemEval dataset.
Tasks	Language Modelling
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2127/
PDF	https://www.aclweb.org/anthology/S19-2127
PWC	https://paperswithcode.com/paper/nlpup-at-semeval-2019-task-6-a-deep-neural
Repo
Framework

Dependency Parsing as Sequence Labeling with Head-Based Encoding and Multi-Task Learning


Title	Dependency Parsing as Sequence Labeling with Head-Based Encoding and Multi-Task Learning
Authors	Ophélie Lacroix
Abstract
Tasks	Dependency Parsing, Multi-Task Learning
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-7716/
PDF	https://www.aclweb.org/anthology/W19-7716
PWC	https://paperswithcode.com/paper/dependency-parsing-as-sequence-labeling-with
Repo
Framework

Adaptive Convolution for Multi-Relational Learning


Title	Adaptive Convolution for Multi-Relational Learning
Authors	Xiaotian Jiang, Quan Wang, Bin Wang
Abstract	We consider the problem of learning distributed representations for entities and relations of multi-relational data so as to predict missing links therein. Convolutional neural networks have recently shown their superiority for this problem, bringing increased model expressiveness while remaining parameter efficient. Despite the success, previous convolution designs fail to model full interactions between input entities and relations, which potentially limits the performance of link prediction. In this work we introduce ConvR, an adaptive convolutional network designed to maximize entity-relation interactions in a convolutional fashion. ConvR adaptively constructs convolution filters from relation representations, and applies these filters across entity representations to generate convolutional features. As such, ConvR enables rich interactions between entity and relation representations at diverse regions, and all the convolutional features generated will be able to capture such interactions. We evaluate ConvR on multiple benchmark datasets. Experimental results show that: (1) ConvR performs substantially better than competitive baselines in almost all the metrics and on all the datasets; (2) Compared with state-of-the-art convolutional models, ConvR is not only more effective but also more efficient. It offers a 7% increase in MRR and a 6% increase in Hits@10, while saving 12% in parameter storage.
Tasks	Link Prediction, Relational Reasoning
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1103/
PDF	https://www.aclweb.org/anthology/N19-1103
PWC	https://paperswithcode.com/paper/adaptive-convolution-for-multi-relational
Repo
Framework
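
The core idea in the abstract, building convolution filters from the relation representation and sliding them over the entity representation, can be sketched as follows. The reshaping sizes, the helper names and the toy vectors are assumptions for illustration; the nonlinearity, projection layer and scoring against candidate entities that a full model would need are omitted, and this is not the authors' code.

```python
def reshape(vec, rows, cols):
    """Reshape a flat list into a rows x cols matrix (list of lists)."""
    assert len(vec) == rows * cols
    return [vec[r * cols:(r + 1) * cols] for r in range(rows)]


def conv2d_valid(x, k):
    """2D cross-correlation of matrix x with filter k, no padding, stride 1."""
    H, W, h, w = len(x), len(x[0]), len(k), len(k[0])
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(h) for b in range(w))
             for j in range(W - w + 1)]
            for i in range(H - h + 1)]


def adaptive_conv_features(entity, relation, ent_shape, filt_shape):
    """Split the relation vector into filters and apply them to the entity."""
    fh, fw = filt_shape
    size = fh * fw
    assert len(relation) % size == 0
    x = reshape(entity, *ent_shape)
    filters = [reshape(relation[s:s + size], fh, fw)
               for s in range(0, len(relation), size)]
    return [conv2d_valid(x, f) for f in filters]


# Toy example: a 9-dim entity as a 3x3 map, an 8-dim relation as two 2x2 filters.
entity = [1, 2, 3, 4, 5, 6, 7, 8, 9]
relation = [1, 0, 0, 1, 0, 1, 1, 0]
features = adaptive_conv_features(entity, relation, (3, 3), (2, 2))
```

Because the filters come from the relation vector itself, every output feature depends on both the entity and the relation, which is the interaction the abstract emphasizes.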

Can You Unpack That? Learning to Rewrite Questions-in-Context


Title	Can You Unpack That? Learning to Rewrite Questions-in-Context
Authors	Ahmed Elgohary, Denis Peskov, Jordan Boyd-Graber
Abstract	Question answering is an AI-complete problem, but existing datasets lack key elements of language understanding such as coreference and ellipsis resolution. We consider sequential question answering: multiple questions are asked one-by-one in a conversation between a questioner and an answerer. Answering these questions is only possible through understanding the conversation history. We introduce the task of question-in-context rewriting: given the context of a conversation's history, rewrite a context-dependent question into a self-contained question with the same answer. We construct CANARD, a dataset of 40,527 questions based on QuAC (Choi et al., 2018), and train Seq2Seq models for incorporating context into standalone questions.
Tasks	Question Answering
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1605/
PDF	https://www.aclweb.org/anthology/D19-1605
PWC	https://paperswithcode.com/paper/can-you-unpack-that-learning-to-rewrite
Repo
Framework

Data-efficient Neural Text Compression with Interactive Learning


Title	Data-efficient Neural Text Compression with Interactive Learning
Authors	Avinesh P.V.S, Christian M. Meyer
Abstract	Neural sequence-to-sequence models have been successfully applied to text compression. However, these models were trained on huge automatically induced parallel corpora, which are only available for a few domains and tasks. In this paper, we propose a novel interactive setup to neural text compression that enables transferring a model to new domains and compression tasks with minimal human supervision. This is achieved by employing active learning, which intelligently samples from a large pool of unlabeled data. Using this setup, we can successfully adapt a model trained on small data of 40k samples for a headline generation task to a general text compression dataset at an acceptable compression quality with just 500 sampled instances annotated by a human.
Tasks	Active Learning
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1262/
PDF	https://www.aclweb.org/anthology/N19-1262
PWC	https://paperswithcode.com/paper/data-efficient-neural-text-compression-with
Repo
Framework

Large-scale optimal transport map estimation using projection pursuit


Title	Large-scale optimal transport map estimation using projection pursuit
Authors	Cheng Meng, Yuan Ke, Jingyi Zhang, Mengrui Zhang, Wenxuan Zhong, Ping Ma
Abstract	This paper studies the estimation of large-scale optimal transport maps (OTM), which is a well-known challenging problem owing to the curse of dimensionality. Existing literature approximates the large-scale OTM by a series of one-dimensional OTM problems through iterative random projection. Such methods, however, suffer from slow or no convergence in practice due to the nature of randomly selected projection directions. Instead, we propose an estimation method of large-scale OTM by combining the idea of projection pursuit regression and sufficient dimension reduction. The proposed method, named projection pursuit Monge map (PPMM), adaptively selects the most "informative" projection direction in each iteration. We theoretically show the proposed dimension reduction method can consistently estimate the most "informative" projection direction in each iteration. Furthermore, the PPMM algorithm weakly converges to the target large-scale OTM in a reasonable number of steps. Empirically, PPMM is computationally easy and converges fast. We assess its finite sample performance through the applications of Wasserstein distance estimation and generative models.
Tasks	Dimensionality Reduction
Published	2019-12-01
URL	http://papers.nips.cc/paper/9023-large-scale-optimal-transport-map-estimation-using-projection-pursuit
PDF	http://papers.nips.cc/paper/9023-large-scale-optimal-transport-map-estimation-using-projection-pursuit.pdf
PWC	https://paperswithcode.com/paper/large-scale-optimal-transport-map-estimation
Repo
Framework
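
The one-dimensional building block that projection-based methods iterate can be sketched as a single step: project both samples onto a direction, match the sorted projections, and move the source points along that direction. The sketch assumes equal sample sizes and takes the direction as an argument; how PPMM picks its "informative" direction is not reproduced here, and the function name `projection_step` is hypothetical.

```python
def projection_step(source, target, direction):
    """One projection step: solve 1D optimal transport along `direction`.

    source, target: lists of points (lists of floats), same length assumed.
    direction: any nonzero vector; it is normalised internally.
    Returns the source points shifted along the direction so that their
    projections match the sorted projections of the target.
    """
    assert len(source) == len(target)
    norm = sum(d * d for d in direction) ** 0.5
    u = [d / norm for d in direction]
    proj_s = [sum(x * c for x, c in zip(p, u)) for p in source]
    proj_t = sorted(sum(x * c for x, c in zip(p, u)) for p in target)
    # In 1D, the optimal map pairs the k-th smallest source with the
    # k-th smallest target projection.
    order = sorted(range(len(source)), key=proj_s.__getitem__)
    moved = [list(p) for p in source]
    for rank, idx in enumerate(order):
        shift = proj_t[rank] - proj_s[idx]
        moved[idx] = [x + shift * c for x, c in zip(source[idx], u)]
    return moved


# Toy example: moving two 2D points along the first axis.
moved = projection_step([[0, 0], [1, 0]], [[3, 5], [2, 7]], [1, 0])
```

Repeating this step over a sequence of directions gives the iterative scheme the abstract refers to; the methods differ in how each direction is chosen.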