January 25, 2020

Paper Group NANR 93

Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation


Title	Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation
Authors	Francine Chen, Yan-Ying Chen
Abstract	A common issue in training a deep learning abstractive summarization model is the lack of a large set of training summaries. This paper examines techniques for adapting from a labeled source domain to an unlabeled target domain in the context of an encoder-decoder model for text generation. In addition to adversarial domain adaptation (ADA), we introduce the use of artificial titles and sequential training to capture the grammatical style of the unlabeled target domain. Evaluation on adapting to/from news articles and Stack Exchange posts indicates that the use of these techniques can boost performance for both unsupervised adaptation and fine-tuning with limited target data.
Tasks	Abstractive Text Summarization, Domain Adaptation, Text Generation
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1211/
PDF	https://www.aclweb.org/anthology/P19-1211
PWC	https://paperswithcode.com/paper/adversarial-domain-adaptation-using
Repo
Framework

CEM-RL: Combining evolutionary and gradient-based methods for policy search


Title	CEM-RL: Combining evolutionary and gradient-based methods for policy search
Authors	Aloïs Pourchot, Olivier Sigaud
Abstract	Deep neuroevolution and deep reinforcement learning (deep RL) algorithms are two popular approaches to policy search. The former is widely applicable and rather stable, but suffers from low sample efficiency. By contrast, the latter is more sample efficient, but the most sample efficient variants are also rather unstable and highly sensitive to hyper-parameter setting. So far, these families of methods have mostly been compared as competing tools. However, an emerging approach consists in combining them so as to get the best of both worlds. Two previously existing combinations use either an ad hoc evolutionary algorithm or a goal exploration process together with the Deep Deterministic Policy Gradient (DDPG) algorithm, a sample efficient off-policy deep RL algorithm. In this paper, we propose a different combination scheme using the simple cross-entropy method (CEM) and Twin Delayed Deep Deterministic policy gradient (TD3), another off-policy deep RL algorithm which improves over DDPG. We evaluate the resulting method, CEM-RL, on a set of benchmarks classically used in deep RL. We show that CEM-RL benefits from several advantages over its competitors and offers a satisfactory trade-off between performance and sample efficiency.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=BkeU5j0ctQ
PDF	https://openreview.net/pdf?id=BkeU5j0ctQ
PWC	https://paperswithcode.com/paper/cem-rl-combining-evolutionary-and-gradient
Repo
Framework
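
The cross-entropy method (CEM) named in the abstract can be sketched on a toy objective. This is a minimal illustration only, assuming a diagonal Gaussian search distribution; `toy_score`, the population size, and the elite fraction are hypothetical choices, and the TD3 gradient steps that CEM-RL combines with CEM are not shown.

```python
import random


def cem_maximize(score, dim, iterations=60, pop_size=40, elite_frac=0.25, seed=0):
    """Toy cross-entropy method: refit a diagonal Gaussian to the best samples."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [2.0] * dim
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iterations):
        # Sample a population of candidate parameter vectors.
        population = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(pop_size)]
        # Keep the highest-scoring candidates (the elites).
        elites = sorted(population, key=score, reverse=True)[:n_elite]
        # Refit mean and standard deviation to the elites; the small constant
        # keeps the search distribution from collapsing completely.
        mean = [sum(e[i] for e in elites) / n_elite for i in range(dim)]
        std = [(sum((e[i] - mean[i]) ** 2 for e in elites) / n_elite) ** 0.5 + 0.01
               for i in range(dim)]
    return mean


def toy_score(params):
    """Hypothetical stand-in for an episode return: peak at (3, -2)."""
    target = (3.0, -2.0)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


best = cem_maximize(toy_score, dim=2)
print(best)
```

In a policy-search setting, `score` would be the return of a policy whose weights are the sampled vector, which is why each evaluation is expensive and sample efficiency matters.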

A quantitative probe into the hierarchical structure of written Chinese


Title	A quantitative probe into the hierarchical structure of written Chinese
Authors	Heng Chen, Haitao Liu
Abstract
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-7904/
PDF	https://www.aclweb.org/anthology/W19-7904
PWC	https://paperswithcode.com/paper/a-quantitative-probe-into-the-hierarchical
Repo
Framework

Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks


Title	Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks
Authors	Maxim Kodryan, Artem Grachev, Dmitry Ignatov, Dmitry Vetrov
Abstract	Reduction of the number of parameters is one of the most important goals in Deep Learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) for neural network compression. We find this method to be especially useful in language modeling tasks, where the large number of parameters in the input and output layers is often excessive. We also show that DSVI-ARD can be applied together with encoder-decoder weight tying, which yields even better sparsity and performance. Our experiments demonstrate that more than 90% of the weights in both encoder and decoder layers can be removed with a minimal quality loss.
Tasks	Language Modelling
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4306/
PDF	https://www.aclweb.org/anthology/W19-4306
PWC	https://paperswithcode.com/paper/efficient-language-modeling-with-automatic
Repo
Framework
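
To see why the input and output layers dominate the parameter count, and what weight tying and a 90% removal rate would mean in numbers, here is a small arithmetic sketch. The vocabulary and hidden sizes are hypothetical, biases are ignored, and the code does not implement DSVI-ARD itself.

```python
def embedding_params(vocab_size, hidden_size, tied):
    """Weights in the input embedding plus the output projection (no biases)."""
    per_matrix = vocab_size * hidden_size
    # With weight tying, both layers share a single matrix.
    return per_matrix if tied else 2 * per_matrix


def remaining_after_pruning(n_params, removed_fraction):
    """Number of weights left after removing the given fraction."""
    return round(n_params * (1.0 - removed_fraction))


vocab_size, hidden_size = 10_000, 256  # hypothetical sizes
untied = embedding_params(vocab_size, hidden_size, tied=False)
tied = embedding_params(vocab_size, hidden_size, tied=True)
kept = remaining_after_pruning(tied, removed_fraction=0.90)
print(untied, tied, kept)
```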

A2BCD: Asynchronous Acceleration with Optimal Complexity


Title	A2BCD: Asynchronous Acceleration with Optimal Complexity
Authors	Robert Hannah, Fei Feng, Wotao Yin
Abstract	In this paper, we propose the Asynchronous Accelerated Nonuniform Randomized Block Coordinate Descent algorithm (A2BCD). We prove A2BCD converges linearly to a solution of the convex minimization problem at the same rate as NU_ACDM, so long as the maximum delay is not too large. This is the first asynchronous Nesterov-accelerated algorithm that attains any provable speedup. Moreover, we then prove that these algorithms both have optimal complexity. Asynchronous algorithms complete much faster iterations, and A2BCD has optimal complexity. Hence we observe in experiments that A2BCD is the top-performing coordinate descent algorithm, converging up to 4-5x faster than NU_ACDM on some data sets in terms of wall-clock time. To motivate our theory and proof techniques, we also derive and analyze a continuous-time analog of our algorithm and prove it converges at the same rate.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=rylIAsCqYm
PDF	https://openreview.net/pdf?id=rylIAsCqYm
PWC	https://paperswithcode.com/paper/a2bcd-asynchronous-acceleration-with-optimal
Repo
Framework
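
For orientation, plain randomized coordinate descent on a small quadratic can be sketched as follows. This is not A2BCD: there is no acceleration, no asynchrony, and no block structure. Sampling coordinates in proportion to the diagonal entries is an assumption made for illustration, and the matrix `A` and vector `b` are hypothetical.

```python
import random


def randomized_coordinate_descent(A, b, steps=3000, seed=0):
    """Minimize 0.5 * x^T A x - b^T x for a symmetric positive definite A."""
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    # Nonuniform sampling: coordinates with larger curvature A[i][i]
    # are picked more often (an assumed rule, not the paper's).
    weights = [A[i][i] for i in range(n)]
    for _ in range(steps):
        i = rng.choices(range(n), weights=weights)[0]
        grad_i = sum(A[i][j] * x[j] for j in range(n)) - b[i]
        x[i] -= grad_i / A[i][i]  # exact minimization along coordinate i
    return x


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = randomized_coordinate_descent(A, b)
# The minimizer satisfies A x = b, so the residual should be near zero.
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
print(x, residual)
```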

Ilfhocail: A Lexicon of Irish MWEs


Title	Ilfhocail: A Lexicon of Irish MWEs
Authors	Abigail Walsh, Teresa Lynn, Jennifer Foster
Abstract	This paper describes the categorisation of Irish MWEs, and the construction of the first version of a lexicon of Irish MWEs for NLP purposes (Ilfhocail, meaning 'Multiwords'), collected from a number of resources. For the purposes of quality assurance, 530 entries of this lexicon were examined and manually annotated for POS information and MWE category.
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5120/
PDF	https://www.aclweb.org/anthology/W19-5120
PWC	https://paperswithcode.com/paper/ilfhocail-a-lexicon-of-irish-mwes
Repo
Framework

Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression


Title	Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression
Authors	Kyung-Rae Kim, Whan Choi, Yeong Jun Koh, Seong-Gyun Jeong, Chang-Su Kim
Abstract	A novel algorithm to estimate instance-level future motion in a single image is proposed in this paper. We first represent the future motion of an instance with its direction, speed, and action classes. Then, we develop a deep neural network that exploits different levels of semantic information to perform the future motion estimation. For effective future motion classification, we adopt ordinal regression. In particular, we develop a cyclic ordinal regression scheme using binary classifiers. Experiments demonstrate that the proposed algorithm provides reliable performance and thus can be used effectively for vision applications, including single- and multi-object tracking. Furthermore, we release the future motion (FM) dataset, collected from diverse sources and annotated manually, as a benchmark for single-image future motion estimation.
Tasks	Motion Estimation, Multi-Object Tracking, Object Tracking
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Kim_Instance-Level_Future_Motion_Estimation_in_a_Single_Image_Based_on_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Kim_Instance-Level_Future_Motion_Estimation_in_a_Single_Image_Based_on_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/instance-level-future-motion-estimation-in-a
Repo
Framework
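
Ordinal regression with binary classifiers is commonly set up as K-1 yes/no questions of the form "is the rank greater than k?". The sketch below shows that standard encoding and decoding only; the paper's cyclic variant is not reproduced, and the function names and example probabilities are hypothetical.

```python
def encode_ordinal(rank, num_classes):
    """Targets for the K-1 binary questions 'is the rank greater than k?'."""
    return [1 if rank > k else 0 for k in range(num_classes - 1)]


def decode_ordinal(probabilities, threshold=0.5):
    """Predicted rank = number of binary classifiers that answer 'yes'."""
    return sum(1 for p in probabilities if p > threshold)


targets = encode_ordinal(2, num_classes=5)
predicted = decode_ordinal([0.9, 0.8, 0.3, 0.1])
print(targets, predicted)
```

Compared with plain multi-class classification, this encoding makes a prediction that is off by one rank disagree with the target in only one binary output, so near misses cost less than distant ones.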

Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models


Title	Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models
Authors	Ashraf Elnagar, Omar Einea, Ridhwan Al-Debsi
Abstract
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/W19-7409/
PDF	https://www.aclweb.org/anthology/W19-7409
PWC	https://paperswithcode.com/paper/automatic-text-tagging-of-arabic-news
Repo
Framework

Span-Level Model for Relation Extraction


Title	Span-Level Model for Relation Extraction
Authors	Kalpit Dixit, Yaser Al-Onaizan
Abstract	Relation Extraction is the task of identifying entity mention spans in raw text and then identifying relations between pairs of the entity mentions. Recent approaches for this span-level task have been token-level models, which have inherent limitations. They cannot easily define and implement span-level features, cannot model overlapping entity mentions, and have cascading errors due to the use of sequential decoding. To address these concerns, we present a model which directly models all possible spans and performs joint entity mention detection and relation extraction. We report a new state-of-the-art performance of 62.83 F1 (the previous best was 60.49) on the ACE2005 dataset.
Tasks	Relation Extraction
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1525/
PDF	https://www.aclweb.org/anthology/P19-1525
PWC	https://paperswithcode.com/paper/span-level-model-for-relation-extraction
Repo
Framework
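
Modeling spans directly starts from enumerating candidate spans, which naturally includes overlapping ones. A minimal sketch follows; the `max_width` cap and the example sentence are assumptions for illustration (the abstract speaks of all possible spans), and no scoring model is shown.

```python
def enumerate_spans(tokens, max_width):
    """All (start, end) spans with end exclusive and width <= max_width."""
    return [(start, end)
            for start in range(len(tokens))
            for end in range(start + 1, min(start + max_width, len(tokens)) + 1)]


tokens = ["Bank", "of", "America", "opened"]
spans = enumerate_spans(tokens, max_width=3)
# Overlapping candidates such as "Bank of America" (0, 3) and "America" (2, 3)
# are both present, which a single token-level tag sequence cannot express.
print(spans)
```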

Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder


Title	Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder
Authors	Ryan Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko
Abstract	We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture, and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks. By enabling the decoder at each time step to write to all of the encoder output layers, Scratchpad can employ the encoder as a "scratchpad" memory to keep track of what has been generated so far and thereby guide future generation. We evaluate Scratchpad in the context of three well-studied natural language generation tasks (Machine Translation, Question Generation, and Text Summarization) and obtain state-of-the-art or comparable performance on standard datasets for each task. Qualitative assessments in the form of human judgements (question generation), attention visualization (MT), and sample output (summarization) provide further evidence of the ability of Scratchpad to generate fluent and expressive output.
Tasks	Machine Translation, Question Generation, Text Generation, Text Summarization
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1407/
PDF	https://www.aclweb.org/anthology/P19-1407
PWC	https://paperswithcode.com/paper/keeping-notes-conditional-natural-language-1
Repo
Framework

Modeling Political Framing Across Policy Issues and Contexts


Title	Modeling Political Framing Across Policy Issues and Contexts
Authors	Shima Khanehzar, Andrew Turpin, Gosia Mikolajczak
Abstract
Tasks
Published	2019-04-01
URL	https://www.aclweb.org/anthology/U19-1009/
PDF	https://www.aclweb.org/anthology/U19-1009
PWC	https://paperswithcode.com/paper/modeling-political-framing-across-policy
Repo
Framework

Multilingual sentence-level bias detection in Wikipedia


Title	Multilingual sentence-level bias detection in Wikipedia
Authors	Desislava Aleksandrova, François Lareau, Pierre André Ménard
Abstract	We propose a multilingual method for the extraction of biased sentences from Wikipedia, and use it to create corpora in Bulgarian, French and English. Sifting through the revision history of the articles that at some point had been considered biased and later corrected, we retrieve the last tagged and the first untagged revisions as the before/after snapshots of what was deemed a violation of Wikipedia's neutral point of view policy. We extract the sentences that were removed or rewritten in that edit. The approach yields sufficient data even in the case of relatively small Wikipedias, such as the Bulgarian one, where 62k articles produced 5k biased sentences. We evaluate our method by manually annotating 520 sentences for Bulgarian and French, and 744 for English. We assess the level of noise and analyze its sources. Finally, we exploit the data with well-known classification methods to detect biased sentences. Code and datasets are hosted at https://github.com/crim-ca/wiki-bias.
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1006/
PDF	https://www.aclweb.org/anthology/R19-1006
PWC	https://paperswithcode.com/paper/multilingual-sentence-level-bias-detection-in
Repo
Framework
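
The step of extracting sentences that were removed or rewritten between the before and after revisions can be sketched with a sentence-level diff. This is an illustrative approximation, not the authors' code (which is at the repository linked in the abstract); the naive sentence splitter and the example texts are assumptions.

```python
import difflib
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def removed_or_rewritten(before_text, after_text):
    """Sentences of the 'before' revision that were deleted or replaced."""
    before = split_sentences(before_text)
    after = split_sentences(after_text)
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changed.extend(before[i1:i2])
    return changed


before = ("He is a brilliant and visionary leader. He was born in 1950. "
          "Critics are obviously wrong.")
after = "He is a leader. He was born in 1950."
candidates = removed_or_rewritten(before, after)
print(candidates)
```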

Initialized Equilibrium Propagation for Backprop-Free Training


Title	Initialized Equilibrium Propagation for Backprop-Free Training
Authors	Peter O’Connor, Efstratios Gavves, Max Welling
Abstract	Deep neural networks are almost universally trained with reverse-mode automatic differentiation (a.k.a. backpropagation). Biological networks, on the other hand, appear to lack any mechanism for sending gradients back to their input neurons, and thus cannot be learning in this way. In response to this, Scellier & Bengio (2017) proposed Equilibrium Propagation - a method for gradient-based training of neural networks which uses only local learning rules and, crucially, does not rely on neurons having a mechanism for back-propagating an error gradient. Equilibrium propagation, however, has a major practical limitation: inference involves doing an iterative optimization of neural activations to find a fixed-point, and the number of steps required to closely approximate this fixed point scales poorly with the depth of the network. In response to this problem, we propose Initialized Equilibrium Propagation, which trains a feedforward network to initialize the iterative inference procedure for Equilibrium propagation. This feed-forward network learns to approximate the state of the fixed-point using a local learning rule. After training, we can simply use this initializing network for inference, resulting in a learned feedforward network. Our experiments show that this network appears to work as well as or better than the original version of Equilibrium propagation. This shows how we might go about training deep networks without using backpropagation.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=B1GMDsR5tm
PDF	https://openreview.net/pdf?id=B1GMDsR5tm
PWC	https://paperswithcode.com/paper/initialized-equilibrium-propagation-for
Repo
Framework
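
The benefit of initializing an iterative fixed-point search with a good guess can be illustrated on a scalar toy problem. This is not Equilibrium Propagation: the update rule `s <- 0.5 * tanh(s) + x` and the `feedforward_guess` function are hypothetical stand-ins for the network dynamics and the trained initializing network.

```python
import math


def iterate_to_fixed_point(x, s_init, tol=1e-10, max_steps=1000):
    """Iterate s <- 0.5 * tanh(s) + x until the update is smaller than tol."""
    s = s_init
    for step in range(1, max_steps + 1):
        s_next = 0.5 * math.tanh(s) + x
        if abs(s_next - s) < tol:
            return s_next, step
        s = s_next
    return s, max_steps


def feedforward_guess(x):
    """Hypothetical stand-in for a trained initializing network."""
    return x + 0.45


x = 1.0
cold_state, cold_steps = iterate_to_fixed_point(x, s_init=0.0)
warm_state, warm_steps = iterate_to_fixed_point(x, s_init=feedforward_guess(x))
# Both runs reach the same fixed point; the warm start needs fewer iterations.
print(cold_steps, warm_steps)
```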

Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms


Title	Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms
Authors	Diyi Yang, Jiaao Chen, Zichao Yang, Dan Jurafsky, Eduard Hovy
Abstract	Modeling what makes a request persuasive - eliciting the desired response from a reader - is critical to the study of propaganda, behavioral economics, and advertising. Yet current models can't quantify the persuasiveness of requests or extract successful persuasive strategies. Building on theories of persuasion, we propose a neural network to quantify persuasiveness and identify the persuasive strategies in advocacy requests. Our semi-supervised hierarchical neural network model is supervised by the number of people persuaded to take actions and partially supervised at the sentence level with human-labeled rhetorical strategies. Our method outperforms several baselines, uncovers persuasive strategies - offering increased interpretability of persuasive speech - and has applications for other situations with document-level supervision but only partial sentence supervision.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1364/
PDF	https://www.aclweb.org/anthology/N19-1364
PWC	https://paperswithcode.com/paper/lets-make-your-request-more-persuasive
Repo
Framework

Distribution-Interpolation Trade off in Generative Models


Title	Distribution-Interpolation Trade off in Generative Models
Authors	Damian Leśniak, Igor Sieradzki, Igor Podolak
Abstract	We investigate the properties of multidimensional probability distributions in the context of latent space prior distributions of implicit generative models. Our work revolves around the phenomena arising while decoding linear interpolations between two random latent vectors – regions of latent space in close proximity to the origin of the space are oversampled, which restricts the usability of linear interpolations as a tool to analyse the latent space. We show that the distribution mismatch can be eliminated completely by a proper choice of the latent probability distribution or using non-linear interpolations. We prove that there is a trade off between the interpolation being linear, and the latent distribution having even the most basic properties required for stable training, such as finite mean. We use the multidimensional Cauchy distribution as an example of the prior distribution, and also provide a general method of creating non-linear interpolations, that is easily applicable to a large family of commonly used latent distributions.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=SyMhLo0qKQ
PDF	https://openreview.net/pdf?id=SyMhLo0qKQ
PWC	https://paperswithcode.com/paper/distribution-interpolation-trade-off-in
Repo
Framework
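
The oversampling of regions near the origin can be checked numerically. The sketch below assumes a standard normal latent prior (a common choice, not the Cauchy prior the abstract uses as its example) and hypothetical values for the dimension and sample count. The midpoint of two independent standard normal vectors has variance 1/2 per coordinate, so its norm shrinks by a factor of about 1/sqrt(2).

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def mean_norms(dim=100, pairs=200, seed=0):
    """Average norm of latent endpoints and of their linear midpoints."""
    rng = random.Random(seed)
    end_total, mid_total = 0.0, 0.0
    for _ in range(pairs):
        z0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        z1 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        mid = [(a + b) / 2.0 for a, b in zip(z0, z1)]
        end_total += (norm(z0) + norm(z1)) / 2.0
        mid_total += norm(mid)
    return end_total / pairs, mid_total / pairs


endpoint_norm, midpoint_norm = mean_norms()
ratio = midpoint_norm / endpoint_norm
print(endpoint_norm, midpoint_norm, ratio)
```

With 100 dimensions the endpoints lie near radius 10 while the midpoints lie near radius 7, a shell the prior almost never samples, which is the mismatch the abstract describes.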