January 25, 2020

Paper Group NANR 93

Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation

Title Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation
Authors Francine Chen, Yan-Ying Chen
Abstract A common issue in training a deep learning abstractive summarization model is the lack of a large set of training summaries. This paper examines techniques for adapting from a labeled source domain to an unlabeled target domain in the context of an encoder-decoder model for text generation. In addition to adversarial domain adaptation (ADA), we introduce the use of artificial titles and sequential training to capture the grammatical style of the unlabeled target domain. Evaluation on adapting to/from news articles and Stack Exchange posts indicates that the use of these techniques can boost performance for both unsupervised adaptation and fine-tuning with limited target data.
Tasks Abstractive Text Summarization, Domain Adaptation, Text Generation
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1211/
PDF https://www.aclweb.org/anthology/P19-1211
PWC https://paperswithcode.com/paper/adversarial-domain-adaptation-using
Repo
Framework
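
The adversarial domain adaptation component in setups like this is commonly implemented with a gradient reversal layer between the encoder and a domain classifier. The paper links no code here, so the following is a minimal PyTorch sketch of that generic pattern, not the authors' exact architecture; the names (`GradReverse`, `domain_head`) and shapes are invented for illustration.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates and scales the gradient on the
    backward pass, pushing the encoder toward domain-invariant features."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Illustrative usage: a domain classifier trained through the reversal layer.
features = torch.randn(4, 128, requires_grad=True)   # stand-in for encoder outputs
domain_head = torch.nn.Linear(128, 2)                # source vs. target classifier
logits = domain_head(GradReverse.apply(features, 1.0))
loss = F.cross_entropy(logits, torch.tensor([0, 0, 1, 1]))
loss.backward()                                      # feature gradients arrive reversed
```
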
CEM-RL: Combining evolutionary and gradient-based methods for policy search

Title CEM-RL: Combining evolutionary and gradient-based methods for policy search
Authors Aloïs Pourchot, Olivier Sigaud
Abstract Deep neuroevolution and deep reinforcement learning (deep RL) algorithms are two popular approaches to policy search. The former is widely applicable and rather stable, but suffers from low sample efficiency. By contrast, the latter is more sample efficient, but the most sample efficient variants are also rather unstable and highly sensitive to hyper-parameter settings. So far, these families of methods have mostly been compared as competing tools. However, an emerging approach combines them so as to get the best of both worlds. Two previously existing combinations use either an ad hoc evolutionary algorithm or a goal exploration process together with the Deep Deterministic Policy Gradient (DDPG) algorithm, a sample efficient off-policy deep RL algorithm. In this paper, we propose a different combination scheme using the simple cross-entropy method (CEM) and Twin Delayed Deep Deterministic policy gradient (TD3), another off-policy deep RL algorithm which improves over DDPG. We evaluate the resulting method, CEM-RL, on a set of benchmarks classically used in deep RL. We show that CEM-RL benefits from several advantages over its competitors and offers a satisfactory trade-off between performance and sample efficiency.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=BkeU5j0ctQ
PDF https://openreview.net/pdf?id=BkeU5j0ctQ
PWC https://paperswithcode.com/paper/cem-rl-combining-evolutionary-and-gradient
Repo
Framework
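
At its core, CEM-RL alternates a cross-entropy-method update over policy parameters with TD3 gradient steps applied to half of each population. The toy sketch below shows only the CEM half on a synthetic fitness function; the TD3 step is marked as a comment, since reproducing it would require a full RL stack, and all names here are illustrative.

```python
import numpy as np

def cem_sketch(fitness, dim=8, pop_size=10, elite_frac=0.5, iters=100, seed=0):
    """Toy cross-entropy-method loop. In CEM-RL, half of each sampled
    population would additionally receive TD3 gradient steps before being
    evaluated; that step is only marked below."""
    rng = np.random.default_rng(seed)
    mean, var = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(elite_frac * pop_size))
    for _ in range(iters):
        pop = mean + rng.standard_normal((pop_size, dim)) * np.sqrt(var)
        # CEM-RL: pop[: pop_size // 2] = td3_gradient_steps(pop[: pop_size // 2])
        scores = np.array([fitness(p) for p in pop])
        elites = pop[np.argsort(scores)[-n_elite:]]
        mean, var = elites.mean(axis=0), elites.var(axis=0) + 1e-4
    return mean

print(cem_sketch(lambda p: -np.sum((p - 1.0) ** 2)))  # converges toward all-ones
```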

A quantitative probe into the hierarchical structure of written Chinese

Title A quantitative probe into the hierarchical structure of written Chinese
Authors Heng Chen, Haitao Liu
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7904/
PDF https://www.aclweb.org/anthology/W19-7904
PWC https://paperswithcode.com/paper/a-quantitative-probe-into-the-hierarchical
Repo
Framework

Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks

Title Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks
Authors Maxim Kodryan, Artem Grachev, Dmitry Ignatov, Dmitry Vetrov
Abstract Reduction of the number of parameters is one of the most important goals in Deep Learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) for neural network compression. We find this method to be especially useful in language modeling tasks, where the large number of parameters in the input and output layers is often excessive. We also show that DSVI-ARD can be applied together with encoder-decoder weight tying, achieving even better sparsity and performance. Our experiments demonstrate that more than 90% of the weights in both encoder and decoder layers can be removed with minimal quality loss.
Tasks Language Modelling
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4306/
PDF https://www.aclweb.org/anthology/W19-4306
PWC https://paperswithcode.com/paper/efficient-language-modeling-with-automatic
Repo
Framework
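
ARD-style variational compression prunes weights whose posterior noise dominates their mean. A common criterion from the variational-dropout literature (assumed here rather than taken from this paper) thresholds log alpha = log(sigma^2 / mu^2). A minimal sketch with made-up parameters:

```python
import torch

# Hypothetical learned variational parameters of one layer's weight posterior
# q(w) = N(mu, sigma^2); in DSVI-ARD-style training these come from the model.
mu = torch.randn(256, 512)
log_sigma2 = torch.randn(256, 512) - 3.0

# Drop weights whose noise-to-signal ratio alpha = sigma^2 / mu^2 is large.
log_alpha = log_sigma2 - torch.log(mu ** 2 + 1e-8)
mask = (log_alpha < 3.0).float()
sparse_weights = mu * mask
print(f"kept {mask.mean().item():.1%} of the weights")
```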

A2BCD: Asynchronous Acceleration with Optimal Complexity

Title A2BCD: Asynchronous Acceleration with Optimal Complexity
Authors Robert Hannah, Fei Feng, Wotao Yin
Abstract In this paper, we propose the Asynchronous Accelerated Nonuniform Randomized Block Coordinate Descent algorithm (A2BCD). We prove that A2BCD converges linearly to a solution of the convex minimization problem at the same rate as NU_ACDM, so long as the maximum delay is not too large. This is the first asynchronous Nesterov-accelerated algorithm that attains any provable speedup. Moreover, we prove that both algorithms have optimal complexity. Because asynchronous algorithms complete much faster iterations, we observe in experiments that A2BCD is the top-performing coordinate descent algorithm, converging up to 4-5x faster than NU_ACDM on some data sets in terms of wall-clock time. To motivate our theory and proof techniques, we also derive and analyze a continuous-time analog of our algorithm and prove that it converges at the same rate.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=rylIAsCqYm
PDF https://openreview.net/pdf?id=rylIAsCqYm
PWC https://paperswithcode.com/paper/a2bcd-asynchronous-acceleration-with-optimal
Repo
Framework
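
The synchronous, non-accelerated core of block coordinate descent is easy to state; A2BCD adds Nesterov acceleration and asynchronous (delayed) updates on top of it. A toy sketch of plain randomized coordinate descent on a quadratic, with both of those ingredients deliberately omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20 * np.eye(20)        # well-conditioned positive definite matrix
b = rng.standard_normal(20)

# Minimize f(x) = 0.5 * x^T A x - b^T x, one random coordinate at a time.
x = np.zeros(20)
for _ in range(5000):
    i = rng.integers(20)                   # NU_ACDM/A2BCD sample nonuniformly; uniform here
    x[i] -= (A[i] @ x - b[i]) / A[i, i]    # exact minimization along coordinate i

print(np.linalg.norm(A @ x - b))     # residual is near zero
```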

Ilfhocail: A Lexicon of Irish MWEs

Title Ilfhocail: A Lexicon of Irish MWEs
Authors Abigail Walsh, Teresa Lynn, Jennifer Foster
Abstract This paper describes the categorisation of Irish MWEs and the construction of the first version of a lexicon of Irish MWEs for NLP purposes (Ilfhocail, meaning 'Multiwords'), collected from a number of resources. For quality assurance, 530 entries of this lexicon were examined and manually annotated for POS information and MWE category.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5120/
PDF https://www.aclweb.org/anthology/W19-5120
PWC https://paperswithcode.com/paper/ilfhocail-a-lexicon-of-irish-mwes
Repo
Framework

Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression

Title Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression
Authors Kyung-Rae Kim, Whan Choi, Yeong Jun Koh, Seong-Gyun Jeong, Chang-Su Kim
Abstract A novel algorithm to estimate instance-level future motion in a single image is proposed in this paper. We first represent the future motion of an instance by its direction, speed, and action classes. Then, we develop a deep neural network that exploits different levels of semantic information to perform the future motion estimation. For effective future motion classification, we adopt ordinal regression. In particular, we develop a cyclic ordinal regression scheme using binary classifiers. Experiments demonstrate that the proposed algorithm provides reliable performance and thus can be used effectively for vision applications, including single- and multi-object tracking. Furthermore, we release the future motion (FM) dataset, collected from diverse sources and annotated manually, as a benchmark for single-image future motion estimation.
Tasks Motion Estimation, Multi-Object Tracking, Object Tracking
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Kim_Instance-Level_Future_Motion_Estimation_in_a_Single_Image_Based_on_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Kim_Instance-Level_Future_Motion_Estimation_in_a_Single_Image_Based_on_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/instance-level-future-motion-estimation-in-a
Repo
Framework
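
Ordinal regression with binary classifiers reduces a K-class ordered problem to K-1 "is the label greater than k?" decisions; the paper's cyclic variant adapts this idea to angular direction classes, which wrap around. A minimal sketch of the standard (non-cyclic) decoding step, with hypothetical probabilities:

```python
import numpy as np

def ordinal_decode(probs, threshold=0.5):
    """K-1 binary heads, head k estimating P(label > k): the predicted ordinal
    label is simply the number of confident 'greater than' votes."""
    return int(np.sum(np.asarray(probs) > threshold))

# e.g. a speed bucket from four hypothetical binary-classifier outputs
print(ordinal_decode([0.9, 0.8, 0.3, 0.1]))  # -> 2
```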

Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models

Title Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models
Authors Ashraf Elnagar, Omar Einea, Ridhwan Al-Debsi
Abstract
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-7409/
PDF https://www.aclweb.org/anthology/W19-7409
PWC https://paperswithcode.com/paper/automatic-text-tagging-of-arabic-news
Repo
Framework

Span-Level Model for Relation Extraction

Title Span-Level Model for Relation Extraction
Authors Kalpit Dixit, Yaser Al-Onaizan
Abstract Relation Extraction is the task of identifying entity mention spans in raw text and then identifying relations between pairs of the entity mentions. Recent approaches for this span-level task have been token-level models which have inherent limitations. They cannot easily define and implement span-level features, cannot model overlapping entity mentions and have cascading errors due to the use of sequential decoding. To address these concerns, we present a model which directly models all possible spans and performs joint entity mention detection and relation extraction. We report a new state-of-the-art performance of 62.83 F1 (prev best was 60.49) on the ACE2005 dataset.
Tasks Relation Extraction
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1525/
PDF https://www.aclweb.org/anthology/P19-1525
PWC https://paperswithcode.com/paper/span-level-model-for-relation-extraction
Repo
Framework
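
The key departure from token-level decoding is to enumerate and score all candidate spans directly, so overlapping mentions and span-level features come for free. A minimal sketch of the span enumeration such models start from (the scoring networks are omitted, and the function name is illustrative):

```python
def enumerate_spans(tokens, max_width=4):
    """All candidate spans up to max_width tokens, as (start, end) inclusive
    indices. A span-level model scores every candidate jointly for mention
    detection and relation classification instead of decoding token by token."""
    return [(i, j)
            for i in range(len(tokens))
            for j in range(i, min(i + max_width, len(tokens)))]

spans = enumerate_spans("Alice works for Acme Corp".split())
print(len(spans), spans[:5])   # 14 candidates for a 5-token sentence
```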

Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder

Title Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder
Authors Ryan Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko
Abstract We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture, and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks. By enabling the decoder at each time step to write to all of the encoder output layers, Scratchpad can employ the encoder as a "scratchpad" memory to keep track of what has been generated so far and thereby guide future generation. We evaluate Scratchpad in the context of three well-studied natural language generation tasks (Machine Translation, Question Generation, and Text Summarization) and obtain state-of-the-art or comparable performance on standard datasets for each task. Qualitative assessments in the form of human judgements (question generation), attention visualization (MT), and sample output (summarization) provide further evidence of the ability of Scratchpad to generate fluent and expressive output.
Tasks Machine Translation, Question Generation, Text Generation, Text Summarization
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1407/
PDF https://www.aclweb.org/anthology/P19-1407
PWC https://paperswithcode.com/paper/keeping-notes-conditional-natural-language-1
Repo
Framework
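
The mechanism's essence is that the decoder writes back into the encoder states at every step, rather than treating them as read-only. The sketch below is a hypothetical gated attentive write, not the paper's exact update equations; the names (`scratchpad_step`, `W_write`) and shapes are invented for illustration.

```python
import torch

def scratchpad_step(enc, dec, W_write):
    """One hypothetical scratchpad update: attend over encoder states, then
    write a gated update back into them so later steps can see what the
    decoder has already covered."""
    attn = torch.softmax(enc @ dec, dim=0)                     # (T,) attention
    context = (attn[:, None] * enc).sum(0)                     # (d,) summary
    update = torch.tanh(W_write(torch.cat([dec, context])))    # (d,) write vector
    write = torch.sigmoid(enc @ dec)[:, None] * attn[:, None]  # (T, 1) strength
    return (1 - write) * enc + write * update

T, d = 5, 8
enc, dec = torch.randn(T, d), torch.randn(d)
enc = scratchpad_step(enc, dec, torch.nn.Linear(2 * d, d))     # encoder states evolve
```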

Modeling Political Framing Across Policy Issues and Contexts

Title Modeling Political Framing Across Policy Issues and Contexts
Authors Shima Khanehzar, Andrew Turpin, Gosia Mikolajczak
Abstract
Tasks
Published 2019-04-01
URL https://www.aclweb.org/anthology/U19-1009/
PDF https://www.aclweb.org/anthology/U19-1009
PWC https://paperswithcode.com/paper/modeling-political-framing-across-policy
Repo
Framework

Multilingual sentence-level bias detection in Wikipedia

Title Multilingual sentence-level bias detection in Wikipedia
Authors Desislava Aleksandrova, François Lareau, Pierre André Ménard
Abstract We propose a multilingual method for the extraction of biased sentences from Wikipedia, and use it to create corpora in Bulgarian, French and English. Sifting through the revision history of the articles that at some point had been considered biased and later corrected, we retrieve the last tagged and the first untagged revisions as the before/after snapshots of what was deemed a violation of Wikipedia's neutral point of view policy. We extract the sentences that were removed or rewritten in that edit. The approach yields sufficient data even in the case of relatively small Wikipedias, such as the Bulgarian one, where 62k articles produced 5k biased sentences. We evaluate our method by manually annotating 520 sentences for Bulgarian and French, and 744 for English. We assess the level of noise and analyze its sources. Finally, we exploit the data with well-known classification methods to detect biased sentences. Code and datasets are hosted at https://github.com/crim-ca/wiki-bias.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1006/
PDF https://www.aclweb.org/anthology/R19-1006
PWC https://paperswithcode.com/paper/multilingual-sentence-level-bias-detection-in
Repo
Framework
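
The before/after snapshot idea reduces to a sentence-level diff between the last tagged and the first untagged revisions. The paper's own pipeline is at the linked repository; the sketch below just shows the core idea with Python's standard difflib and an invented example.

```python
import difflib

def removed_or_rewritten(tagged_rev, untagged_rev):
    """Sentences present in the last tagged ('biased') revision but changed or
    gone in the first untagged one -- the candidate NPOV violations."""
    matcher = difflib.SequenceMatcher(a=tagged_rev, b=untagged_rev)
    hits = []
    for op, i1, i2, _, _ in matcher.get_opcodes():
        if op in ("replace", "delete"):
            hits.extend(tagged_rev[i1:i2])
    return hits

print(removed_or_rewritten(
    ["Paris is a city.", "It is obviously the best city in Europe."],
    ["Paris is a city."]))
```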

Initialized Equilibrium Propagation for Backprop-Free Training

Title Initialized Equilibrium Propagation for Backprop-Free Training
Authors Peter O’Connor, Efstratios Gavves, Max Welling
Abstract Deep neural networks are almost universally trained with reverse-mode automatic differentiation (a.k.a. backpropagation). Biological networks, on the other hand, appear to lack any mechanism for sending gradients back to their input neurons, and thus cannot be learning in this way. In response to this, Scellier & Bengio (2017) proposed Equilibrium Propagation - a method for gradient-based training of neural networks which uses only local learning rules and, crucially, does not rely on neurons having a mechanism for back-propagating an error gradient. Equilibrium Propagation, however, has a major practical limitation: inference involves an iterative optimization of neural activations to find a fixed point, and the number of steps required to closely approximate this fixed point scales poorly with the depth of the network. In response to this problem, we propose Initialized Equilibrium Propagation, which trains a feedforward network to initialize the iterative inference procedure for Equilibrium Propagation. This feedforward network learns to approximate the state of the fixed point using a local learning rule. After training, we can simply use this initializing network for inference, resulting in a learned feedforward network. Our experiments show that this network appears to work as well as or better than the original version of Equilibrium Propagation. This shows how we might go about training deep networks without using backpropagation.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=B1GMDsR5tm
PDF https://openreview.net/pdf?id=B1GMDsR5tm
PWC https://paperswithcode.com/paper/initialized-equilibrium-propagation-for
Repo
Framework
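
The practical point is that settling to a fixed point by iterating the dynamics is expensive from a cold start and cheap from a good initialization, which is what the trained feedforward initializer provides. A toy demonstration with contractive dynamics (invented for illustration, not the paper's network):

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((50, 50))   # weak weights -> contractive dynamics
x = rng.standard_normal(50)               # clamped input

def settle(s, tol=1e-6, max_steps=10_000):
    """Iterate the dynamics s <- tanh(W s + x) to a fixed point; count steps."""
    for steps in range(1, max_steps + 1):
        s_new = np.tanh(W @ s + x)
        if np.linalg.norm(s_new - s) < tol:
            return s_new, steps
        s = s_new
    return s, max_steps

s_star, cold = settle(np.zeros(50))                          # cold start
_, warm = settle(s_star + 1e-3 * rng.standard_normal(50))    # near-fixed-point start
print(cold, warm)   # the warm start needs far fewer settling steps
```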

Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms

Title Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms
Authors Diyi Yang, Jiaao Chen, Zichao Yang, Dan Jurafsky, Eduard Hovy
Abstract Modeling what makes a request persuasive - eliciting the desired response from a reader - is critical to the study of propaganda, behavioral economics, and advertising. Yet current models can't quantify the persuasiveness of requests or extract successful persuasive strategies. Building on theories of persuasion, we propose a neural network to quantify persuasiveness and identify the persuasive strategies in advocacy requests. Our semi-supervised hierarchical neural network model is supervised by the number of people persuaded to take actions and partially supervised at the sentence level with human-labeled rhetorical strategies. Our method outperforms several baselines, uncovers persuasive strategies - offering increased interpretability of persuasive speech - and has applications for other situations with document-level supervision but only partial sentence supervision.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1364/
PDF https://www.aclweb.org/anthology/N19-1364
PWC https://paperswithcode.com/paper/lets-make-your-request-more-persuasive
Repo
Framework
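
The training signal mixes document-level supervision (how many people were persuaded) with partial sentence-level strategy labels. A hypothetical sketch of such a combined loss, with invented tensors and an invented aggregation from sentence predictions to a document score:

```python
import torch
import torch.nn.functional as F

# Invented batch: 16 sentence logits over 7 strategy classes; -1 marks the
# sentences with no human strategy label (the semi-supervised part).
sent_logits = torch.randn(16, 7, requires_grad=True)
sent_labels = torch.tensor([2, -1, 0, -1] * 4)
num_persuaded = torch.tensor(3.0)          # document-level outcome signal

# Invented aggregation from sentence predictions to a document persuasion score.
doc_score = sent_logits.softmax(-1).max(-1).values.mean()

labeled = sent_labels >= 0
loss = (F.mse_loss(doc_score, torch.log1p(num_persuaded))               # document level
        + F.cross_entropy(sent_logits[labeled], sent_labels[labeled]))  # sentence level
loss.backward()
```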

Distribution-Interpolation Trade off in Generative Models

Title Distribution-Interpolation Trade off in Generative Models
Authors Damian Leśniak, Igor Sieradzki, Igor Podolak
Abstract We investigate the properties of multidimensional probability distributions in the context of latent space prior distributions of implicit generative models. Our work revolves around the phenomena arising while decoding linear interpolations between two random latent vectors: regions of latent space in close proximity to the origin of the space are oversampled, which restricts the usability of linear interpolations as a tool to analyse the latent space. We show that the distribution mismatch can be eliminated completely by a proper choice of the latent probability distribution or by using non-linear interpolations. We prove that there is a trade-off between the interpolation being linear and the latent distribution having even the most basic properties required for stable training, such as a finite mean. We use the multidimensional Cauchy distribution as an example of the prior distribution, and also provide a general method of creating non-linear interpolations that is easily applicable to a large family of commonly used latent distributions.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SyMhLo0qKQ
PDF https://openreview.net/pdf?id=SyMhLo0qKQ
PWC https://paperswithcode.com/paper/distribution-interpolation-trade-off-in
Repo
Framework
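
The oversampling claim is easy to verify numerically: for a high-dimensional Gaussian prior, the midpoint of two samples has a norm roughly sqrt(2) smaller than a genuine prior sample, so linear interpolation walks through a region the model almost never saw during training. A quick check:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 10_000
z0, z1 = rng.standard_normal((2, n, d))    # pairs of latent samples, prior N(0, I)
mid = 0.5 * (z0 + z1)                      # linear-interpolation midpoints

print(np.linalg.norm(z0, axis=1).mean())   # ~ sqrt(512) ~ 22.6
print(np.linalg.norm(mid, axis=1).mean())  # ~ sqrt(512 / 2) ~ 16.0: mismatch
```

A Cauchy prior sidesteps this in the univariate case because the average of two independent Cauchy samples is again Cauchy with the same scale; the paper develops the multidimensional analogue.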