Paper Group NANR 93
Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation
Title | Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation |
Authors | Francine Chen, Yan-Ying Chen |
Abstract | A common issue in training a deep learning, abstractive summarization model is lack of a large set of training summaries. This paper examines techniques for adapting from a labeled source domain to an unlabeled target domain in the context of an encoder-decoder model for text generation. In addition to adversarial domain adaptation (ADA), we introduce the use of artificial titles and sequential training to capture the grammatical style of the unlabeled target domain. Evaluation on adapting to/from news articles and Stack Exchange posts indicates that the use of these techniques can boost performance for both unsupervised adaptation as well as fine-tuning with limited target data. |
Tasks | Abstractive Text Summarization, Domain Adaptation, Text Generation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1211/ |
https://www.aclweb.org/anthology/P19-1211 | |
PWC | https://paperswithcode.com/paper/adversarial-domain-adaptation-using |
Repo | |
Framework | |
CEM-RL: Combining evolutionary and gradient-based methods for policy search
Title | CEM-RL: Combining evolutionary and gradient-based methods for policy search |
Authors | Pourchot, Sigaud |
Abstract | Deep neuroevolution and deep reinforcement learning (deep RL) algorithms are two popular approaches to policy search. The former is widely applicable and rather stable, but suffers from low sample efficiency. By contrast, the latter is more sample efficient, but the most sample efficient variants are also rather unstable and highly sensitive to hyper-parameter setting. So far, these families of methods have mostly been compared as competing tools. However, an emerging approach consists in combining them so as to get the best of both worlds. Two previously existing combinations use either an ad hoc evolutionary algorithm or a goal exploration process together with the Deep Deterministic Policy Gradient (DDPG) algorithm, a sample efficient off-policy deep RL algorithm. In this paper, we propose a different combination scheme using the simple cross-entropy method (CEM) and Twin Delayed Deep Deterministic policy gradient (TD3), another off-policy deep RL algorithm which improves over DDPG. We evaluate the resulting method, CEM-RL, on a set of benchmarks classically used in deep RL. We show that CEM-RL benefits from several advantages over its competitors and offers a satisfactory trade-off between performance and sample efficiency. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BkeU5j0ctQ |
https://openreview.net/pdf?id=BkeU5j0ctQ | |
PWC | https://paperswithcode.com/paper/cem-rl-combining-evolutionary-and-gradient |
Repo | |
Framework | |
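The cross-entropy method named in the abstract above can be illustrated with a minimal sketch. This is not CEM-RL itself: the TD3 component is omitted entirely, and the toy objective `sphere`, the function name `cem_minimize`, and all hyper-parameter values are assumptions chosen for illustration only.

```python
import random
import statistics


def cem_minimize(objective, dim, iterations=50, population=40,
                 elite_frac=0.25, seed=0):
    """Toy cross-entropy method with a diagonal Gaussian search distribution."""
    rng = random.Random(seed)
    mean = [0.0] * dim   # assumed starting point
    std = [2.0] * dim    # assumed initial spread
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample a population of candidates from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the candidates with the lowest objective value (the elites).
        samples.sort(key=objective)
        elites = samples[:n_elite]
        # Refit the Gaussian to the elites; a small floor keeps std positive.
        mean = [statistics.fmean(col) for col in zip(*elites)]
        std = [statistics.pstdev(col) + 1e-3 for col in zip(*elites)]
    return mean


def sphere(x):
    """Hypothetical objective with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best = cem_minimize(sphere, dim=3)
print(best, sphere(best))
```

In a policy-search setting the sampled vectors would be policy parameters and the objective would be a (negated) episode return, which is where the low sample efficiency mentioned in the abstract comes from: every candidate has to be evaluated in the environment.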
A quantitative probe into the hierarchical structure of written Chinese
Title | A quantitative probe into the hierarchical structure of written Chinese |
Authors | Heng Chen, Haitao Liu |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-7904/ |
https://www.aclweb.org/anthology/W19-7904 | |
PWC | https://paperswithcode.com/paper/a-quantitative-probe-into-the-hierarchical |
Repo | |
Framework | |
Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks
Title | Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks |
Authors | Maxim Kodryan, Artem Grachev, Dmitry Ignatov, Dmitry Vetrov |
Abstract | Reduction of the number of parameters is one of the most important goals in Deep Learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) for neural network compression. We find this method to be especially useful in language modeling tasks, where the large number of parameters in the input and output layers is often excessive. We also show that DSVI-ARD can be applied together with encoder-decoder weight tying, which achieves even better sparsity and performance. Our experiments demonstrate that more than 90% of the weights in both encoder and decoder layers can be removed with a minimal quality loss. |
Tasks | Language Modelling |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4306/ |
https://www.aclweb.org/anthology/W19-4306 | |
PWC | https://paperswithcode.com/paper/efficient-language-modeling-with-automatic |
Repo | |
Framework | |
A2BCD: Asynchronous Acceleration with Optimal Complexity
Title | A2BCD: Asynchronous Acceleration with Optimal Complexity |
Authors | Robert Hannah, Fei Feng, Wotao Yin |
Abstract | In this paper, we propose the Asynchronous Accelerated Nonuniform Randomized Block Coordinate Descent algorithm (A2BCD). We prove A2BCD converges linearly to a solution of the convex minimization problem at the same rate as NU_ACDM, so long as the maximum delay is not too large. This is the first asynchronous Nesterov-accelerated algorithm that attains any provable speedup. Moreover, we then prove that these algorithms both have optimal complexity. Asynchronous algorithms complete much faster iterations, and A2BCD has optimal complexity. Hence we observe in experiments that A2BCD is the top-performing coordinate descent algorithm, converging up to 4-5x faster than NU_ACDM on some data sets in terms of wall-clock time. To motivate our theory and proof techniques, we also derive and analyze a continuous-time analog of our algorithm and prove it converges at the same rate. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rylIAsCqYm |
https://openreview.net/pdf?id=rylIAsCqYm | |
PWC | https://paperswithcode.com/paper/a2bcd-asynchronous-acceleration-with-optimal |
Repo | |
Framework | |
Ilfhocail: A Lexicon of Irish MWEs
Title | Ilfhocail: A Lexicon of Irish MWEs |
Authors | Abigail Walsh, Teresa Lynn, Jennifer Foster |
Abstract | This paper describes the categorisation of Irish MWEs, and the construction of the first version of a lexicon of Irish MWEs for NLP purposes (Ilfhocail, meaning 'Multiwords'), collected from a number of resources. For the purposes of quality assurance, 530 entries of this lexicon were examined and manually annotated for POS information and MWE category. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5120/ |
https://www.aclweb.org/anthology/W19-5120 | |
PWC | https://paperswithcode.com/paper/ilfhocail-a-lexicon-of-irish-mwes |
Repo | |
Framework | |
Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression
Title | Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression |
Authors | Kyung-Rae Kim, Whan Choi, Yeong Jun Koh, Seong-Gyun Jeong, Chang-Su Kim |
Abstract | A novel algorithm to estimate instance-level future motion in a single image is proposed in this paper. We first represent the future motion of an instance with its direction, speed, and action classes. Then, we develop a deep neural network that exploits different levels of semantic information to perform the future motion estimation. For effective future motion classification, we adopt ordinal regression. Especially, we develop the cyclic ordinal regression scheme using binary classifiers. Experiments demonstrate that the proposed algorithm provides reliable performance and thus can be used effectively for vision applications, including single and multi object tracking. Furthermore, we release the future motion (FM) dataset, collected from diverse sources and annotated manually, as a benchmark for single-image future motion estimation. |
Tasks | Motion Estimation, Multi-Object Tracking, Object Tracking |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Kim_Instance-Level_Future_Motion_Estimation_in_a_Single_Image_Based_on_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Kim_Instance-Level_Future_Motion_Estimation_in_a_Single_Image_Based_on_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/instance-level-future-motion-estimation-in-a |
Repo | |
Framework | |
Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models
Title | Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models |
Authors | Ashraf Elnagar, Omar Einea, Ridhwan Al-Debsi |
Abstract | |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-7409/ |
https://www.aclweb.org/anthology/W19-7409 | |
PWC | https://paperswithcode.com/paper/automatic-text-tagging-of-arabic-news |
Repo | |
Framework | |
Span-Level Model for Relation Extraction
Title | Span-Level Model for Relation Extraction |
Authors | Kalpit Dixit, Yaser Al-Onaizan |
Abstract | Relation Extraction is the task of identifying entity mention spans in raw text and then identifying relations between pairs of the entity mentions. Recent approaches for this span-level task have been token-level models which have inherent limitations. They cannot easily define and implement span-level features, cannot model overlapping entity mentions and have cascading errors due to the use of sequential decoding. To address these concerns, we present a model which directly models all possible spans and performs joint entity mention detection and relation extraction. We report a new state-of-the-art performance of 62.83 F1 (prev best was 60.49) on the ACE2005 dataset. |
Tasks | Relation Extraction |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1525/ |
https://www.aclweb.org/anthology/P19-1525 | |
PWC | https://paperswithcode.com/paper/span-level-model-for-relation-extraction |
Repo | |
Framework | |
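The idea of scoring all possible spans, rather than tagging tokens, can be sketched as follows. This is only an illustration of span enumeration under assumed conventions (half-open `(start, end)` indices, an optional `max_width` cap, and the hypothetical function name `enumerate_spans`); the model in the paper is not reproduced here.

```python
from itertools import permutations


def enumerate_spans(tokens, max_width=None):
    """Return every contiguous span as a half-open (start, end) index pair."""
    n = len(tokens)
    width = n if max_width is None else max_width
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, min(i + width, n) + 1)]


tokens = ["New", "York", "City", "mayor"]
spans = enumerate_spans(tokens)

# Overlapping candidates such as "New York" and "New York City" coexist,
# which a single token-level tag sequence cannot represent.
assert (0, 2) in spans and (0, 3) in spans

# Ordered pairs of distinct spans are the candidates for relation scoring.
pairs = list(permutations(spans, 2))
print(len(spans), len(pairs))
```

A sentence of n tokens has n(n+1)/2 spans, so the number of candidate pairs grows quickly with sentence length; capping the span width with `max_width` is one common way to keep the enumeration manageable.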
Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder
Title | Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder |
Authors | Ryan Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko |
Abstract | We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks. By enabling the decoder at each time step to write to all of the encoder output layers, Scratchpad can employ the encoder as a "scratchpad" memory to keep track of what has been generated so far and thereby guide future generation. We evaluate Scratchpad in the context of three well-studied natural language generation tasks (Machine Translation, Question Generation, and Text Summarization) and obtain state-of-the-art or comparable performance on standard datasets for each task. Qualitative assessments in the form of human judgements (question generation), attention visualization (MT), and sample output (summarization) provide further evidence of the ability of Scratchpad to generate fluent and expressive output. |
Tasks | Machine Translation, Question Generation, Text Generation, Text Summarization |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1407/ |
https://www.aclweb.org/anthology/P19-1407 | |
PWC | https://paperswithcode.com/paper/keeping-notes-conditional-natural-language-1 |
Repo | |
Framework | |
Modeling Political Framing Across Policy Issues and Contexts
Title | Modeling Political Framing Across Policy Issues and Contexts |
Authors | Shima Khanehzar, Andrew Turpin, Gosia Mikolajczak |
Abstract | |
Tasks | |
Published | 2019-04-01 |
URL | https://www.aclweb.org/anthology/U19-1009/ |
https://www.aclweb.org/anthology/U19-1009 | |
PWC | https://paperswithcode.com/paper/modeling-political-framing-across-policy |
Repo | |
Framework | |
Multilingual sentence-level bias detection in Wikipedia
Title | Multilingual sentence-level bias detection in Wikipedia |
Authors | Desislava Aleksandrova, François Lareau, Pierre André Ménard |
Abstract | We propose a multilingual method for the extraction of biased sentences from Wikipedia, and use it to create corpora in Bulgarian, French and English. Sifting through the revision history of the articles that at some point had been considered biased and later corrected, we retrieve the last tagged and the first untagged revisions as the before/after snapshots of what was deemed a violation of Wikipedia's neutral point of view policy. We extract the sentences that were removed or rewritten in that edit. The approach yields sufficient data even in the case of relatively small Wikipedias, such as the Bulgarian one, where 62k articles produced 5k biased sentences. We evaluate our method by manually annotating 520 sentences for Bulgarian and French, and 744 for English. We assess the level of noise and analyze its sources. Finally, we exploit the data with well-known classification methods to detect biased sentences. Code and datasets are hosted at https://github.com/crim-ca/wiki-bias. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1006/ |
https://www.aclweb.org/anthology/R19-1006 | |
PWC | https://paperswithcode.com/paper/multilingual-sentence-level-bias-detection-in |
Repo | |
Framework | |
Initialized Equilibrium Propagation for Backprop-Free Training
Title | Initialized Equilibrium Propagation for Backprop-Free Training |
Authors | Peter O’Connor, Efstratios Gavves, Max Welling |
Abstract | Deep neural networks are almost universally trained with reverse-mode automatic differentiation (a.k.a. backpropagation). Biological networks, on the other hand, appear to lack any mechanism for sending gradients back to their input neurons, and thus cannot be learning in this way. In response to this, Scellier & Bengio (2017) proposed Equilibrium Propagation - a method for gradient-based training of neural networks which uses only local learning rules and, crucially, does not rely on neurons having a mechanism for back-propagating an error gradient. Equilibrium propagation, however, has a major practical limitation: inference involves doing an iterative optimization of neural activations to find a fixed point, and the number of steps required to closely approximate this fixed point scales poorly with the depth of the network. In response to this problem, we propose Initialized Equilibrium Propagation, which trains a feedforward network to initialize the iterative inference procedure for Equilibrium propagation. This feedforward network learns to approximate the state of the fixed point using a local learning rule. After training, we can simply use this initializing network for inference, resulting in a learned feedforward network. Our experiments show that this network appears to work as well as or better than the original version of Equilibrium propagation. This shows how we might go about training deep networks without using backpropagation. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1GMDsR5tm |
https://openreview.net/pdf?id=B1GMDsR5tm | |
PWC | https://paperswithcode.com/paper/initialized-equilibrium-propagation-for |
Repo | |
Framework | |
Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms
Title | Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms |
Authors | Diyi Yang, Jiaao Chen, Zichao Yang, Dan Jurafsky, Eduard Hovy |
Abstract | Modeling what makes a request persuasive - eliciting the desired response from a reader - is critical to the study of propaganda, behavioral economics, and advertising. Yet current models can't quantify the persuasiveness of requests or extract successful persuasive strategies. Building on theories of persuasion, we propose a neural network to quantify persuasiveness and identify the persuasive strategies in advocacy requests. Our semi-supervised hierarchical neural network model is supervised by the number of people persuaded to take actions and partially supervised at the sentence level with human-labeled rhetorical strategies. Our method outperforms several baselines, uncovers persuasive strategies - offering increased interpretability of persuasive speech - and has applications for other situations with document-level supervision but only partial sentence supervision. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1364/ |
https://www.aclweb.org/anthology/N19-1364 | |
PWC | https://paperswithcode.com/paper/lets-make-your-request-more-persuasive |
Repo | |
Framework | |
Distribution-Interpolation Trade off in Generative Models
Title | Distribution-Interpolation Trade off in Generative Models |
Authors | Damian Leśniak, Igor Sieradzki, Igor Podolak |
Abstract | We investigate the properties of multidimensional probability distributions in the context of latent space prior distributions of implicit generative models. Our work revolves around the phenomena arising while decoding linear interpolations between two random latent vectors – regions of latent space in close proximity to the origin of the space are oversampled, which restricts the usability of linear interpolations as a tool to analyse the latent space. We show that the distribution mismatch can be eliminated completely by a proper choice of the latent probability distribution or using non-linear interpolations. We prove that there is a trade off between the interpolation being linear, and the latent distribution having even the most basic properties required for stable training, such as finite mean. We use the multidimensional Cauchy distribution as an example of the prior distribution, and also provide a general method of creating non-linear interpolations, that is easily applicable to a large family of commonly used latent distributions. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SyMhLo0qKQ |
https://openreview.net/pdf?id=SyMhLo0qKQ | |
PWC | https://paperswithcode.com/paper/distribution-interpolation-trade-off-in |
Repo | |
Framework | |
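The oversampling near the origin described in the abstract above can be checked numerically with a small sketch. The setup is an assumption for illustration: a standard normal prior, a latent dimension of 256, and 100 random pairs; none of these values come from the paper, and no generative model is involved.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


rng = random.Random(0)
dim = 256      # assumed latent dimension
n_pairs = 100  # assumed number of sampled pairs

ratios = []
for _ in range(n_pairs):
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Midpoint of the linear interpolation between the two latent vectors.
    mid = [(x + y) / 2.0 for x, y in zip(a, b)]
    # Compare the midpoint norm with the average norm of the endpoints.
    ratios.append(norm(mid) / ((norm(a) + norm(b)) / 2.0))

mean_ratio = sum(ratios) / len(ratios)
print(mean_ratio)
```

For independent standard normal endpoints the midpoint has per-coordinate variance 1/2, so its norm is close to 1/sqrt(2) of a typical sample norm. The midpoints therefore fall in a region that the prior itself rarely samples, which is the distribution mismatch the abstract refers to.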