January 25, 2020

2810 words 14 mins read

Paper Group NANR 72

Backplay: ‘Man muss immer umkehren’. Introduction to Discourse Relation Parsing and Treebanking (DISRPT): 7th Workshop on Rhetorical Structure Theory and Related Formalisms. Learning to Caption Images Through a Lifetime by Asking Questions. Ensemble Methods to Distinguish Mainland and Taiwan Chinese. An End-to-End Generative Architecture for Paraph …

Backplay: ‘Man muss immer umkehren’

Title Backplay: ‘Man muss immer umkehren’
Authors Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho, Joan Bruna
Abstract Model-free reinforcement learning (RL) requires a large number of trials to learn a good policy, especially in environments with sparse rewards. We explore a method to improve the sample efficiency when we have access to demonstrations. Our approach, Backplay, uses a single demonstration to construct a curriculum for a given task. Rather than starting each training episode in the environment’s fixed initial state, we start the agent near the end of the demonstration and move the starting point backwards during the course of training until we reach the initial state. Our contributions are that we analytically characterize the types of environments where Backplay can improve training speed, demonstrate the effectiveness of Backplay both in large grid worlds and a complex four-player zero-sum game (Pommerman), and show that Backplay compares favorably to other competitive methods known to improve sample efficiency, including reward shaping, behavioral cloning, and reverse curriculum generation.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=H1xk8jAqKQ
PDF https://openreview.net/pdf?id=H1xk8jAqKQ
PWC https://paperswithcode.com/paper/backplay-man-muss-immer-umkehren-1
Repo
Framework
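
The curriculum itself is simple enough to sketch. Below is a minimal, hypothetical Python illustration of Backplay's start-state schedule: early in training, episodes begin near the end of a demonstration, and the sampling window moves backwards toward the initial state as training progresses. The `demo` list and the commented `env.set_state` call are assumptions for illustration, not interfaces from the paper.

```python
import random

def backplay_start_state(demo, progress):
    """Pick an episode start state from a single demonstration.

    demo:     list of environment states; demo[0] is the initial state,
              demo[-1] is the end of the demonstration.
    progress: training progress in [0, 1]; 0 = start of training.
    """
    horizon = len(demo) - 1
    # Early in training (progress ~ 0) the window sits near the end of the
    # demo; as progress -> 1 it slides back to the true initial state.
    lo = int(horizon * (1.0 - progress))
    hi = min(horizon, lo + max(1, horizon // 10))  # small, nonempty window
    return demo[random.randint(lo, hi)]

# Hypothetical use inside a training loop:
# env.set_state(backplay_start_state(demo, step / total_steps))
```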

Introduction to Discourse Relation Parsing and Treebanking (DISRPT): 7th Workshop on Rhetorical Structure Theory and Related Formalisms

Title Introduction to Discourse Relation Parsing and Treebanking (DISRPT): 7th Workshop on Rhetorical Structure Theory and Related Formalisms
Authors Amir Zeldes, Debopam Das, Erick Galani Maziero, Juliano Antonio, Mikel Iruskieta
Abstract This overview summarizes the main contributions of the accepted papers at the 2019 workshop on Discourse Relation Parsing and Treebanking (DISRPT 2019). Co-located with NAACL 2019 in Minneapolis, the workshop’s aim was to bring together researchers working on corpus-based and computational approaches to discourse relations. In addition to an invited talk, eighteen papers outlined below were presented, four of which were submitted as part of a shared task on elementary discourse unit segmentation and connective detection.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-2701/
PDF https://www.aclweb.org/anthology/W19-2701
PWC https://paperswithcode.com/paper/introduction-to-discourse-relation-parsing
Repo
Framework

Learning to Caption Images Through a Lifetime by Asking Questions

Title Learning to Caption Images Through a Lifetime by Asking Questions
Authors Tingke Shen, Amlan Kar, Sanja Fidler
Abstract In order to bring artificial agents into our lives, we will need to go beyond supervised learning on closed datasets and develop the ability to continuously expand knowledge. Inspired by a student learning in a classroom, we present an agent that can continuously learn by posing natural language questions to humans. Our agent is composed of three interacting modules: one that performs captioning, another that generates questions, and a decision maker that learns when to ask questions by implicitly reasoning about the uncertainty of the agent and the expertise of the teacher. Compared to current active learning methods, which query images for full captions, our agent is able to ask pointed questions to improve the generated captions. The agent trains on the improved captions, expanding its knowledge. We show that our approach achieves better performance using less human supervision than the baselines on the challenging MSCOCO dataset.
Tasks Active Learning
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Shen_Learning_to_Caption_Images_Through_a_Lifetime_by_Asking_Questions_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Shen_Learning_to_Caption_Images_Through_a_Lifetime_by_Asking_Questions_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/learning-to-caption-images-through-a-lifetime
Repo
Framework
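
The decision maker in this pipeline is learned, but its role can be approximated with a simple uncertainty heuristic. The sketch below is a stand-in rather than the authors' module: it asks the teacher a question whenever the captioner's mean per-token entropy is high. `token_probs` and the threshold are illustrative assumptions.

```python
import math

def should_ask(token_probs, threshold=1.5):
    """Ask the teacher when the captioner looks uncertain.

    token_probs: one probability distribution (list of floats summing to 1)
                 per generated caption token.
    """
    def entropy(p):
        return -sum(q * math.log(q) for q in p if q > 0)
    mean_entropy = sum(entropy(p) for p in token_probs) / len(token_probs)
    # High average entropy = uncertain caption = worth asking a question.
    return mean_entropy > threshold
```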

Ensemble Methods to Distinguish Mainland and Taiwan Chinese

Title Ensemble Methods to Distinguish Mainland and Taiwan Chinese
Authors Hai Hu, Wen Li, He Zhou, Zuoyu Tian, Yiwen Zhang, Liang Zou
Abstract This paper describes the IUCL system at the VarDial 2019 evaluation campaign for the task of discriminating between the Mainland and Taiwan varieties of Mandarin Chinese. We first build several base classifiers, including a Naive Bayes classifier with word n-grams as features, SVMs with both character and syntactic features, and neural networks with pre-trained character/word embeddings. Then we adopt ensemble methods to combine the output of the base classifiers to make final predictions. Our ensemble models achieve the highest F1 score (0.893) in the simplified Chinese track and the second highest (0.901) in the traditional Chinese track. Our results demonstrate the effectiveness and robustness of the ensemble methods.
Tasks Word Embeddings
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-1417/
PDF https://www.aclweb.org/anthology/W19-1417
PWC https://paperswithcode.com/paper/ensemble-methods-to-distinguish-mainland-and
Repo
Framework
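
The ensemble recipe, base classifiers over different feature views combined by voting, is easy to reproduce in outline. The scikit-learn sketch below uses two of the paper's base classifier families (word n-gram Naive Bayes, character n-gram SVM) with hard voting; the exact features, neural models, and combination scheme in the paper are richer.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import VotingClassifier

# Base classifiers over two feature views of the same texts.
word_nb = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
char_svm = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 4)), LinearSVC())

# Hard voting: the majority label across base classifiers wins.
ensemble = VotingClassifier(
    [("word_nb", word_nb), ("char_svm", char_svm)], voting="hard")

# ensemble.fit(train_texts, train_labels)
# predictions = ensemble.predict(test_texts)
```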

An End-to-End Generative Architecture for Paraphrase Generation

Title An End-to-End Generative Architecture for Paraphrase Generation
Authors Qian Yang, Zhouyuan Huo, Dinghan Shen, Yong Cheng, Wenlin Wang, Guoyin Wang, Lawrence Carin
Abstract Generating high-quality paraphrases is a fundamental yet challenging natural language processing task. Despite the effectiveness of previous work based on generative models, there remain problems with exposure bias in recurrent neural networks, and often a failure to generate realistic sentences. To overcome these challenges, we propose the first end-to-end conditional generative architecture for generating paraphrases via adversarial training, which does not depend on extra linguistic information. Extensive experiments on four public datasets demonstrate that the proposed method achieves state-of-the-art results, outperforming previous generative architectures on both automatic metrics (BLEU, METEOR, and TER) and human evaluations.
Tasks Paraphrase Generation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1309/
PDF https://www.aclweb.org/anthology/D19-1309
PWC https://paperswithcode.com/paper/an-end-to-end-generative-architecture-for
Repo
Framework
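
The abstract only names the ingredients (a conditional generator trained adversarially), so the sketch below is a generic schematic of that objective, not the paper's architecture: a discriminator `D` scores (source, paraphrase) pairs and a generator `G` learns to fool it. `G`, `D`, and their signatures are assumptions; `D` is taken to output probabilities of shape `(batch, 1)`.

```python
import torch
import torch.nn.functional as F

def adversarial_step(G, D, src, tgt, g_opt, d_opt):
    """One generic adversarial training step for conditional generation."""
    fake = G(src)  # generated paraphrase (continuous representation)
    batch = src.size(0)
    real_y, fake_y = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator: score real (src, tgt) pairs high, generated pairs low.
    d_loss = (F.binary_cross_entropy(D(src, tgt), real_y)
              + F.binary_cross_entropy(D(src, fake.detach()), fake_y))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: make generated pairs look real to the discriminator.
    g_loss = F.binary_cross_entropy(D(src, G(src)), real_y)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```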

An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

Title An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation
Authors Alessandro Raganato, Raúl Vázquez, Mathias Creutz, Jörg Tiedemann
Abstract In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as a fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that translation performance does correlate with performance on trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also increase the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that training on the downstream task enables the model to identify the encoded information that is useful for that specific task, whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4304/
PDF https://www.aclweb.org/anthology/W19-4304
PWC https://paperswithcode.com/paper/an-evaluation-of-language-agnostic-inner
Repo
Framework
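
The "inner-attention" shared layer produces a fixed-size sentence vector by attending over the encoder states with several heads. Below is a minimal PyTorch sketch of that pooling, assuming structured self-attention in the style of Lin et al. (2017); the dimensions and head count `k` are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

class InnerAttentionPooling(nn.Module):
    """Fixed-size sentence representation via multi-head attention pooling."""
    def __init__(self, hidden_dim, k=10):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, k, bias=False)

    def forward(self, encoder_states):           # (batch, seq_len, hidden)
        scores = self.scorer(encoder_states)     # (batch, seq_len, k)
        weights = torch.softmax(scores, dim=1)   # attention over positions
        pooled = weights.transpose(1, 2) @ encoder_states  # (batch, k, hidden)
        return pooled.flatten(1)                 # fixed size: k * hidden

# states = torch.randn(2, 7, 512)                    # toy encoder output
# sentence_vec = InnerAttentionPooling(512)(states)  # shape (2, 5120)
```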

The Punster’s Amanuensis: The Proper Place of Humans and Machines in the Translation of Wordplay

Title The Punster’s Amanuensis: The Proper Place of Humans and Machines in the Translation of Wordplay
Authors Tristan Miller
Abstract The translation of wordplay is one of the most extensively researched problems in translation studies, but it has attracted little attention in the fields of natural language processing and machine translation. This is because today’s language technologies treat anomalies and ambiguities in the input as things that must be resolved in favour of a single “correct” interpretation, rather than preserved and interpreted in their own right. But if computers cannot yet process such creative language on their own, can they at least provide specialized support to translation professionals? In this paper, I survey the state of the art relevant to computational processing of humorous wordplay and put forth a vision of how existing theories, resources, and technologies could be adapted and extended to support interactive, computer-assisted translation.
Tasks Machine Translation
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-8707/
PDF https://www.aclweb.org/anthology/W19-8707
PWC https://paperswithcode.com/paper/the-punsters-amanuensis-the-proper-place-of
Repo
Framework

RANDOM MASK: Towards Robust Convolutional Neural Networks

Title RANDOM MASK: Towards Robust Convolutional Neural Networks
Authors Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Liwei Wang
Abstract The robustness of neural networks has recently been highlighted by adversarial examples: inputs with well-designed perturbations that are imperceptible to humans but cause the network to give incorrect outputs. In this paper, we design a new CNN architecture that by itself has good robustness. We introduce a simple but powerful technique, Random Mask, to modify existing CNN structures. We show that a CNN with Random Mask achieves state-of-the-art performance against black-box adversarial attacks without applying any adversarial training. We next investigate the adversarial examples which “fool” a CNN with Random Mask. Surprisingly, we find that these adversarial examples often “fool” humans as well. This raises fundamental questions about how to properly define adversarial examples and robustness.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SkgkJn05YX
PDF https://openreview.net/pdf?id=SkgkJn05YX
PWC https://paperswithcode.com/paper/random-mask-towards-robust-convolutional
Repo
Framework
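
The key difference from dropout is that the mask is sampled once and then frozen. A minimal PyTorch sketch of that idea follows, with illustrative shapes; the paper's choice of which layers to mask and at what ratio is not reproduced here.

```python
import torch
import torch.nn as nn

class RandomMask(nn.Module):
    """Zero out a fixed, randomly chosen subset of feature-map locations.

    Unlike dropout, the mask is sampled once at construction and kept
    fixed for both training and inference.
    """
    def __init__(self, channels, height, width, drop_ratio=0.5):
        super().__init__()
        mask = (torch.rand(1, channels, height, width) > drop_ratio).float()
        self.register_buffer("mask", mask)  # fixed; not a trainable parameter

    def forward(self, x):
        return x * self.mask

# Hypothetical use after a conv layer on 32x32 inputs:
# block = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), RandomMask(64, 32, 32))
```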

Semi-Supervised Induction of POS-Tag Lexicons with Tree Models

Title Semi-Supervised Induction of POS-Tag Lexicons with Tree Models
Authors Maciej Janicki
Abstract We approach the problem of POS tagging of morphologically rich languages in a setting where only a small amount of labeled training data is available. We show that a bigram HMM tagger benefits from re-training on a larger untagged text using Baum-Welch estimation. Most importantly, this estimation can be significantly improved by pre-guessing tags for OOV words based on morphological criteria. We consider two models for this task: a character-based recurrent neural network, which guesses the tag from the string form of the word, and a recently proposed graph-based model of morphological transformations. In the latter, the unknown POS tags can be modeled as latent variables in a way very similar to Hidden Markov Tree models and an analogue of the Forward-Backward algorithm can be formulated, which enables us to compute expected values over unknown taggings. We evaluate both the quality of the induced tag lexicon and its impact on the HMM’s tagging accuracy. In both tasks, the graph-based morphology model performs significantly better than the RNN predictor. This confirms the intuition that morphologically related words provide useful information about an unknown word’s POS tag.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1060/
PDF https://www.aclweb.org/anthology/R19-1060
PWC https://paperswithcode.com/paper/semi-supervised-induction-of-pos-tag-lexicons
Repo
Framework
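
The "pre-guessing" step amounts to seeding the HMM's emission table for out-of-vocabulary words before Baum-Welch re-estimation. The plain-Python sketch below shows that seeding step only; the data structures and the `guess_tag_dist` callable (standing in for the character RNN or the morphology model) are illustrative assumptions.

```python
def seed_oov_emissions(emissions, oov_words, guess_tag_dist):
    """Seed P(word | tag) for OOV words before Baum-Welch.

    emissions:      dict tag -> {word: P(word | tag)} from labeled data.
    oov_words:      words in the untagged corpus but not in training data.
    guess_tag_dist: callable word -> {tag: P(tag | word)}.
    """
    for word in oov_words:
        for tag, p in guess_tag_dist(word).items():
            # Informed starting point instead of a uniform guess.
            emissions.setdefault(tag, {})[word] = p
    # Renormalize each row so P(. | tag) sums to 1 again.
    for row in emissions.values():
        total = sum(row.values())
        for word in row:
            row[word] /= total
    return emissions
```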

Interpolated Spectral NGram Language Models

Title Interpolated Spectral NGram Language Models
Authors Ariadna Quattoni, Xavier Carreras
Abstract Spectral models for learning weighted non-deterministic automata have nice theoretical and algorithmic properties. Despite this, it has been challenging to obtain competitive results in language modeling tasks, for two main reasons. First, in order to capture long-range dependencies of the data, the method must use statistics from long substrings, which results in very large matrices that are difficult to decompose. The second is that the loss function behind spectral learning, based on moment matching, differs from the probabilistic metrics used to evaluate language models. In this work we employ a technique for scaling up spectral learning, and use interpolated predictions that are optimized to minimize perplexity. Our experiments in character-based language modeling show that our method matches the performance of state-of-the-art n-gram models, while being very fast to train.
Tasks Language Modelling
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1594/
PDF https://www.aclweb.org/anthology/P19-1594
PWC https://paperswithcode.com/paper/interpolated-spectral-ngram-language-models
Repo
Framework
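
The interpolation itself is a weighted mixture of next-token predictions from models of increasing order. A short sketch follows; the function names and the way models are represented are assumptions, and the paper additionally optimizes the weights on held-out data.

```python
def interpolated_prob(models, weights, history, token):
    """Mixture of next-token probabilities from n-gram models of
    increasing order.

    models:  callables m(history, token) -> P(token | history), e.g. one
             spectral model per n-gram order.
    weights: non-negative mixture weights summing to 1.
    """
    return sum(w * m(history, token) for w, m in zip(weights, models))

# Perplexity of a corpus under the mixture is then computed from the
# product of interpolated_prob over its tokens.
```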

Is Similarity Visually Grounded? Computational Model of Similarity for the Estonian language

Title Is Similarity Visually Grounded? Computational Model of Similarity for the Estonian language
Authors Claudia Kittask, Eduard Barbu
Abstract Researchers in computational linguistics build models of similarity and test them against human judgments. Although there are many empirical studies of computational models of similarity for the English language, similarity for other languages is less explored. In this study we are chiefly interested in two aspects. First, we want to know how much of human similarity judgment is grounded in visual perception. To answer this question, two neural computer vision models are used and their correlation with human-derived similarity scores is computed. Second, we investigate whether language influences the similarity computation. To this end, diverse computational models trained on Estonian resources are evaluated against human judgments.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1064/
PDF https://www.aclweb.org/anthology/R19-1064
PWC https://paperswithcode.com/paper/is-similarity-visually-grounded-computational
Repo
Framework
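
The evaluation protocol, correlating model similarities with human ratings, fits in a few lines. Here is a sketch under the usual assumptions (cosine similarity over embeddings, Spearman correlation); the variable names are illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

def similarity_correlation(vectors, word_pairs, human_scores):
    """Spearman correlation between model and human similarity scores.

    vectors:      dict word -> embedding (from a vision or text model).
    word_pairs:   list of (word_a, word_b) pairs rated by humans.
    human_scores: one human similarity rating per pair.
    """
    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    model_scores = [cosine(vectors[a], vectors[b]) for a, b in word_pairs]
    return spearmanr(model_scores, human_scores).correlation
```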

Impact of ECG Dataset Diversity on Generalization of CNN Model for Detecting QRS Complex

Title Impact of ECG Dataset Diversity on Generalization of CNN Model for Detecting QRS Complex
Authors Ahsan Habib, Chandan Karmakar, John Yearwood
Abstract Detection of QRS complexes in the electrocardiogram (ECG) signal is crucial for automated cardiac diagnosis. Automated QRS detection has been a research topic for over three decades, and several of the traditional QRS detection methods show acceptable detection accuracy; however, the applicability of these methods beyond their study-specific databases was not explored. The non-stationary nature of ECG and the signal variance of intra- and inter-patient recordings impose significant challenges on single QRS detectors to achieve reasonable performance. In real life, a promising QRS detector may be expected to achieve acceptable accuracy over diverse ECG recordings, and thus investigation of a model’s generalization capability is crucial. This paper investigates the generalization capability of convolutional neural network (CNN)-based models from intra-database (subject-wise leave-one-out and five-fold cross-validation) and inter-database (training with single and multiple databases) points of view over three publicly available ECG databases, namely MIT-BIH Arrhythmia, INCART, and QT. Leave-one-out test accuracy is 99.22%, 97.13%, and 96.25% for these databases, respectively, and inter-database tests report more than 90% accuracy with the single exception of INCART. The performance variation reveals that a CNN model’s generalization capability does not increase simply by adding more training samples; rather, the inclusion of samples from a diverse range of subjects is necessary for reasonable QRS detection accuracy.
Tasks Electrocardiography (ECG), QRS Complex Detection
Published 2019-07-10
URL https://doi.org/10.1109/ACCESS.2019.2927726
PDF https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8758818
PWC https://paperswithcode.com/paper/impact-of-ecg-dataset-diversity-on
Repo
Framework
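
As a concrete picture of the kind of model under test, here is a minimal 1-D CNN that classifies a one-second ECG window as containing a QRS complex or not. The 360-sample window (MIT-BIH's sampling rate) and all layer sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

qrs_detector = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
    nn.MaxPool1d(2),                 # 360 -> 180 samples
    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool1d(2),                 # 180 -> 90 samples
    nn.Flatten(),
    nn.Linear(32 * 90, 2),           # two classes: QRS present / absent
)

# window = torch.randn(8, 1, 360)    # batch of one-second, 360 Hz windows
# logits = qrs_detector(window)      # shape (8, 2)
```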

Improved Regret Bounds for Bandit Combinatorial Optimization

Title Improved Regret Bounds for Bandit Combinatorial Optimization
Authors Shinji Ito, Daisuke Hatano, Hanna Sumita, Kei Takemura, Takuro Fukunaga, Naonori Kakimura, Ken-Ichi Kawarabayashi
Abstract \textit{Bandit combinatorial optimization} is a bandit framework in which a player chooses an action within a given finite set $\mathcal{A} \subseteq \{0, 1\}^d$ and, in each round, incurs a loss that is the inner product of the chosen action and an unobservable loss vector in $\mathbb{R}^d$. In this paper, we aim to reveal the property that makes bandit combinatorial optimization hard. Recently, Cohen et al.~\citep{cohen2017tight} obtained a regret lower bound of $\Omega(\sqrt{d k^3 T / \log T})$, where $k$ is the maximum $\ell_1$-norm of action vectors and $T$ is the number of rounds. This lower bound was achieved by considering a continuous, strongly correlated distribution of losses. Our main contribution is to improve this bound to $\Omega(\sqrt{d k^3 T})$, removing the $\sqrt{\log T}$ factor, by means of strongly correlated losses with \textit{binary} values. The bound yields better regret lower bounds for three specific instances of bandit combinatorial optimization: the multitask bandit, bandit ranking, and the multiple-play bandit. In particular, the bound obtained for bandit ranking addresses an open problem raised in \citep{cohen2017tight}. In addition, we demonstrate that the problem becomes easier without correlations among the entries of loss vectors. In fact, if each entry of the loss vector is an independent random variable, then one can achieve a regret of $\tilde{O}(\sqrt{d k^2 T})$, which is $\sqrt{k}$ times smaller than the lower bound shown above. These results indicate that correlation among losses is the reason for large regret.
Tasks Combinatorial Optimization
Published 2019-12-01
URL http://papers.nips.cc/paper/9373-improved-regret-bounds-for-bandit-combinatorial-optimization
PDF http://papers.nips.cc/paper/9373-improved-regret-bounds-for-bandit-combinatorial-optimization.pdf
PWC https://paperswithcode.com/paper/improved-regret-bounds-for-bandit
Repo
Framework
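
For quick reference, the three bounds from the abstract side by side, with $d$ the dimension, $k$ the maximum $\ell_1$-norm of actions, and $T$ the number of rounds:

```latex
% Cohen et al. (2017), continuous strongly correlated losses:
R_T = \Omega\left(\sqrt{d k^3 T / \log T}\right)
% This paper, strongly correlated binary losses (log factor removed):
R_T = \Omega\left(\sqrt{d k^3 T}\right)
% Independent loss entries (the easier regime):
R_T = \tilde{O}\left(\sqrt{d k^2 T}\right)
```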

Fine-Grained Evaluation for Entity Linking

Title Fine-Grained Evaluation for Entity Linking
Authors Henry Rosales-Méndez, Aidan Hogan, Barbara Poblete
Abstract The Entity Linking (EL) task identifies entity mentions in a text corpus and associates them with an unambiguous identifier in a Knowledge Base. While much work has been done on the topic, we first present the results of a survey that reveals a lack of consensus in the community regarding what forms of mentions in a text and what forms of links the EL task should consider. We argue that no one definition of the Entity Linking task fits all, and rather propose a fine-grained categorization of different types of entity mentions and links. We then re-annotate three EL benchmark datasets – ACE2004, KORE50, and VoxEL – with respect to these categories. We propose a fuzzy recall metric to address the lack of consensus and conclude with fine-grained evaluation results comparing a selection of online EL systems.
Tasks Entity Linking
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1066/
PDF https://www.aclweb.org/anthology/D19-1066
PWC https://paperswithcode.com/paper/fine-grained-evaluation-for-entity-linking
Repo
Framework
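
The abstract does not define the fuzzy recall metric, so the sketch below is only one plausible reading, clearly an assumption rather than the paper's formula: gold links carry weights in (0, 1] reflecting how strongly the annotation categories count them, and recall sums the weights the system recovers.

```python
def fuzzy_recall(system_links, gold_weights):
    """A hypothetical fuzzy recall: weighted fraction of gold links found.

    system_links: set of (mention, entity) pairs output by the EL system.
    gold_weights: dict (mention, entity) -> weight in (0, 1].
    """
    found = sum(w for link, w in gold_weights.items() if link in system_links)
    return found / sum(gold_weights.values())
```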

A Tree-to-Sequence Model for Neural NLG in Task-Oriented Dialog

Title A Tree-to-Sequence Model for Neural NLG in Task-Oriented Dialog
Authors Jinfeng Rao, Kartikeya Upasani, Anusha Balakrishnan, Michael White, Anuj Kumar, Rajen Subba
Abstract Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems. Sequence-to-sequence models on flat meaning representations (MRs) have been dominant in this task, for example in the E2E NLG Challenge. Previous work has shown that a tree-structured MR can improve the model for better discourse-level structuring and sentence-level planning. In this work, we propose a tree-to-sequence model that uses a tree-LSTM encoder to leverage the tree structures in the input MR, and further enhances decoding with a structure-enhanced attention mechanism. In addition, we explore combining these enhancements with constrained decoding to improve semantic correctness. Our experiments not only show significant improvements over standard seq2seq baselines, but also demonstrate that the model is more data-efficient and generalizes better to hard scenarios.
Tasks
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-8611/
PDF https://www.aclweb.org/anthology/W19-8611
PWC https://paperswithcode.com/paper/a-tree-to-sequence-model-for-neural-nlg-in
Repo
Framework
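
The tree-LSTM encoder at the heart of the model can be sketched as a child-sum Tree-LSTM cell (Tai et al., 2015), composing each MR node from its children. This is a generic cell written in PyTorch, not the authors' implementation, and the structure-enhanced attention is omitted.

```python
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    """Child-sum Tree-LSTM cell: encode a node from its children's states."""
    def __init__(self, in_dim, h_dim):
        super().__init__()
        self.iou = nn.Linear(in_dim + h_dim, 3 * h_dim)  # input/output/update
        self.f_x = nn.Linear(in_dim, h_dim)
        self.f_h = nn.Linear(h_dim, h_dim)

    def forward(self, x, child_h, child_c):
        # x: (in_dim,) node input; child_h, child_c: (num_children, h_dim).
        h_sum = child_h.sum(dim=0)
        i, o, u = self.iou(torch.cat([x, h_sum])).chunk(3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        # One forget gate per child, so whole subtrees can be kept or dropped.
        f = torch.sigmoid(self.f_x(x) + self.f_h(child_h))
        c = i * u + (f * child_c).sum(dim=0)
        return torch.tanh(c) * o, c   # (h, c) passed up to the parent
```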