Paper Group NANR 164
Proceedings of the First Workshop on Curation and Applications of Parallel and Comparable Corpora. An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters. Simplifying metaphorical language for young readers: A corpus study on news text. Splitting Complex English Sentences. Parameter Transfer across Domains for Word …
Proceedings of the First Workshop on Curation and Applications of Parallel and Comparable Corpora
Title | Proceedings of the First Workshop on Curation and Applications of Parallel and Comparable Corpora |
Authors | |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5600/ |
https://www.aclweb.org/anthology/W17-5600 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-first-workshop-on-curation |
Repo | |
Framework | |
An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters
Title | An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters |
Authors | Matthieu Labeau, Alexandre Allauzen |
Abstract | Noise Contrastive Estimation (NCE) is a learning procedure that is regularly used to train neural language models, since it avoids the computational bottleneck caused by the output softmax. In this paper, we attempt to explain some of the weaknesses of this objective function, and to draw directions for further developments. Experiments on a small task show the issues raised by a unigram noise distribution, and that a context-dependent noise distribution, such as the bigram distribution, can solve these issues and provide stable and data-efficient learning. |
Tasks | Language Modelling, Machine Translation, Speech Recognition |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2003/ |
https://www.aclweb.org/anthology/E17-2003 | |
PWC | https://paperswithcode.com/paper/an-experimental-analysis-of-noise-contrastive |
Repo | |
Framework | |
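To make the objective above concrete, here is a minimal NumPy sketch of the NCE loss for one (context, target) pair, together with a context-dependent (bigram) noise distribution with unigram back-off. The scoring function, count dictionaries and back-off behaviour are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(score_data, scores_noise, q_data, q_noise, k):
    """NCE loss for one (context, target) pair.

    score_data   : unnormalised model score s(w, h) of the observed word
    scores_noise : array of model scores for the k sampled noise words
    q_data       : noise probability q(w | h) of the observed word
    q_noise      : array of noise probabilities of the k sampled words
    """
    # Positive term: classify the observed word as coming from the data
    pos = np.log(sigmoid(score_data - np.log(k * q_data)))
    # Negative terms: classify each noise sample as coming from the noise
    neg = np.log(sigmoid(-(scores_noise - np.log(k * q_noise)))).sum()
    return -(pos + neg)

def bigram_noise(prev_word, bigram_counts, unigram_counts):
    """Context-dependent noise: a bigram distribution with unigram back-off."""
    counts = bigram_counts.get(prev_word, unigram_counts)
    words, c = zip(*counts.items())
    p = np.array(c, dtype=float)
    return list(words), p / p.sum()
```

A unigram noise distribution would simply ignore `prev_word`; the paper's point is that conditioning the noise on the context makes the binary classification harder and the resulting training more stable and data-efficient.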
Simplifying metaphorical language for young readers: A corpus study on news text
Title | Simplifying metaphorical language for young readers: A corpus study on news text |
Authors | Magdalena Wolska, Yulia Clausen |
Abstract | The paper presents first results of an ongoing project on text simplification focusing on linguistic metaphors. Based on an analysis of a parallel corpus of news text professionally simplified for different grade levels, we identify six types of simplification choices falling into two broad categories: preserving metaphors or dropping them. An annotation study on almost 300 source sentences with metaphors (grade level 12) and their simplified counterparts (grade 4) is conducted. The results show that most metaphors are preserved, and when they are dropped, the semantic content tends to be retained but reworded without metaphorical language. In general, some of the expected tendencies in complexity reduction, measured with psycholinguistic variables linked to metaphor comprehension, are observed, suggesting good prospects for machine learning-based metaphor simplification. |
Tasks | Lexical Simplification, Reading Comprehension, Text Simplification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5035/ |
https://www.aclweb.org/anthology/W17-5035 | |
PWC | https://paperswithcode.com/paper/simplifying-metaphorical-language-for-young |
Repo | |
Framework | |
Splitting Complex English Sentences
Title | Splitting Complex English Sentences |
Authors | John Lee, J. Buddhika K. Pathirage Don |
Abstract | This paper applies parsing technology to the task of syntactic simplification of English sentences, focusing on the identification of text spans that can be removed from a complex sentence. We report the most comprehensive evaluation to-date on this task, using a dataset of sentences that exhibit simplification based on coordination, subordination, punctuation/parataxis, adjectival clauses, participial phrases, and appositive phrases. We train a decision tree with features derived from text span length, POS tags and dependency relations, and show that it significantly outperforms a parser-only baseline. |
Tasks | Lexical Simplification, Machine Translation, Reading Comprehension, Text Simplification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6307/ |
https://www.aclweb.org/anthology/W17-6307 | |
PWC | https://paperswithcode.com/paper/splitting-complex-english-sentences |
Repo | |
Framework | |
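As a rough illustration of the classification setup described in the abstract above (not the authors' system), the sketch below trains a decision tree over simple span features; the feature set, dictionary keys and toy examples are invented for the example.

```python
from sklearn.tree import DecisionTreeClassifier

def span_features(span):
    """span: dict with hypothetical keys 'tokens', 'pos', 'deprel'."""
    clause_like = {"acl", "appos", "advcl", "conj"}
    return [
        len(span["tokens"]),                                # span length
        sum(tag.startswith("VB") for tag in span["pos"]),   # verb count from POS tags
        int(span["deprel"] in clause_like),                 # clause-like dependency relation
        int("," in span["tokens"]),                         # span contains punctuation
    ]

# toy stand-ins for spans extracted from the simplification corpus
spans = [
    {"tokens": ["which", "was", "built", "in", "1900", ","],
     "pos": ["WDT", "VBD", "VBN", "IN", "CD", ","], "deprel": "acl"},
    {"tokens": ["the", "old", "bridge"],
     "pos": ["DT", "JJ", "NN"], "deprel": "nsubj"},
]
labels = [1, 0]   # 1 = span can be removed in the simplified sentence, 0 = keep

clf = DecisionTreeClassifier(max_depth=4)
clf.fit([span_features(s) for s in spans], labels)
print(clf.predict([span_features(spans[0])]))
```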
Parameter Transfer across Domains for Word Sense Disambiguation
Title | Parameter Transfer across Domains for Word Sense Disambiguation |
Authors | Sallam Abualhaija, Nina Tahmasebi, Diane Forin, Karl-Heinz Zimmermann |
Abstract | Word sense disambiguation is defined as finding the corresponding sense for a target word in a given context, and comprises a major step in text applications. Recently, it has been addressed as an optimization problem. The underlying idea is to find a sequence of senses that corresponds to the words in a given context with maximum semantic similarity. Metaheuristics like simulated annealing and D-Bees provide approximate good-enough solutions, but are usually influenced by the starting parameters. In this paper, we study the parameter tuning for both algorithms within the word sense disambiguation problem. The experiments are conducted on different datasets to cover different disambiguation scenarios. We show that D-Bees is robust and less sensitive to the initial parameters compared to simulated annealing; hence, it is sufficient to tune the parameters once and reuse them for different datasets, domains or languages. |
Tasks | Lexical Simplification, Machine Translation, Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1001/ |
https://doi.org/10.26615/978-954-452-049-6_001 | |
PWC | https://paperswithcode.com/paper/parameter-transfer-across-domains-for-word |
Repo | |
Framework | |
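The optimisation view of WSD described above can be illustrated with a small simulated-annealing sketch. The sense inventory, the similarity function and the parameter values (initial temperature, cooling rate, number of steps) below are placeholders; those parameters are exactly the kind of settings whose tuning and transfer the paper studies.

```python
import math
import random

def total_similarity(assignment, sim):
    """Sum of pairwise similarities between the chosen senses."""
    return sum(sim(assignment[i], assignment[j])
               for i in range(len(assignment))
               for j in range(i + 1, len(assignment)))

def anneal(senses_per_word, sim, t0=1.0, cooling=0.95, steps=1000):
    """senses_per_word: one list of candidate senses per word in the context."""
    current = [random.choice(s) for s in senses_per_word]
    best, best_score = current[:], total_similarity(current, sim)
    t = t0
    for _ in range(steps):
        i = random.randrange(len(current))
        candidate = current[:]
        candidate[i] = random.choice(senses_per_word[i])
        delta = total_similarity(candidate, sim) - total_similarity(current, sim)
        # accept improvements always, worse moves with a temperature-dependent probability
        if delta > 0 or random.random() < math.exp(delta / t):
            current = candidate
            score = total_similarity(current, sim)
            if score > best_score:
                best, best_score = current[:], score
        t *= cooling   # t0, cooling and steps are the "starting parameters" being tuned
    return best

# toy usage: similarity = 1 when two senses share an invented domain tag
senses = [["bank.finance", "bank.river"], ["money.finance"], ["loan.finance"]]
sim = lambda a, b: 1.0 if a.split(".")[1] == b.split(".")[1] else 0.0
print(anneal(senses, sim, steps=200))
```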
World of Bits: An Open-Domain Platform for Web-Based Agents
Title | World of Bits: An Open-Domain Platform for Web-Based Agents |
Authors | Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, Percy Liang |
Abstract | While simulated game environments have greatly accelerated research in reinforcement learning, existing environments lack the open-domain realism of tasks in computer vision or natural language processing, which operate on artifacts created by humans in natural, organic settings. To foster reinforcement learning research in such settings, we introduce the World of Bits (WoB), a platform in which agents complete tasks on the Internet by performing low-level keyboard and mouse actions. The two main challenges are: (i) to curate a large, diverse set of interesting web-based tasks, and (ii) to ensure that these tasks have a well-defined reward structure and are reproducible despite the transience of the web. To do this, we develop a methodology in which crowdworkers create tasks defined by natural language questions and provide demonstrations of how to answer the question on real websites using keyboard and mouse; HTTP traffic is cached to create a reproducible offline approximation of the web site. Finally, we show that agents trained via behavioral cloning and reinforcement learning can successfully complete a range of our web-based tasks. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=843 |
http://proceedings.mlr.press/v70/shi17a/shi17a.pdf | |
PWC | https://paperswithcode.com/paper/world-of-bits-an-open-domain-platform-for-web |
Repo | |
Framework | |
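As a toy illustration of the behavioural-cloning side of the setup above (not the WoB platform itself), the sketch below fits a classifier from screenshot features to a discretised click position; the screen size, grid resolution and random stand-in demonstrations are all invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

GRID = 8   # discretise the screen into an 8x8 grid of click targets
SIZE = 32  # toy screenshot resolution (SIZE x SIZE pixels)

def to_action(x, y):
    """Map the pixel coordinates of a demonstrated click to a grid-cell index."""
    return (y * GRID // SIZE) * GRID + (x * GRID // SIZE)

# stand-in demonstrations: (flattened screenshot, clicked pixel) pairs
rng = np.random.default_rng(0)
screens = rng.random((200, SIZE * SIZE))
clicks = rng.integers(0, SIZE, size=(200, 2))
actions = np.array([to_action(x, y) for x, y in clicks])

# behavioural cloning: supervised learning from demonstrations to actions
policy = LogisticRegression(max_iter=500).fit(screens, actions)
print(policy.predict(screens[:3]))   # predicted click cells for three screenshots
```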
An Artificial Language Evaluation of Distributional Semantic Models
Title | An Artificial Language Evaluation of Distributional Semantic Models |
Authors | Fatemeh Torabi Asr, Michael Jones |
Abstract | Recent studies of distributional semantic models have set up a competition between word embeddings obtained from predictive neural networks and word vectors obtained from abstractive count-based models. This paper is an attempt to reveal the underlying contribution of additional training data and post-processing steps on each type of model in word similarity and relatedness inference tasks. We do so by designing an artificial language framework, training a predictive and a count-based model on data sampled from this grammar, and evaluating the resulting word vectors in paradigmatic and syntagmatic tasks defined with respect to the grammar. |
Tasks | Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1015/ |
https://www.aclweb.org/anthology/K17-1015 | |
PWC | https://paperswithcode.com/paper/an-artificial-language-evaluation-of |
Repo | |
Framework | |
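A self-contained toy version of the evaluation idea above: sample sentences from a tiny artificial grammar and check that a simple count-based model recovers the paradigmatic structure. The grammar, the nonce words and the window size are invented for illustration and are far smaller than anything in the paper.

```python
import random
import numpy as np

# a tiny invented grammar: words in the same category are paradigmatically related
GRAMMAR = {"NOUN": ["dax", "wug", "blick"], "VERB": ["fep", "zup"]}

def sample_sentence():
    # S -> NOUN VERB NOUN
    return [random.choice(GRAMMAR["NOUN"]),
            random.choice(GRAMMAR["VERB"]),
            random.choice(GRAMMAR["NOUN"])]

corpus = [sample_sentence() for _ in range(5000)]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}

# a simple count-based model: co-occurrence counts within a +/-1 word window
M = np.zeros((len(vocab), len(vocab)))
for s in corpus:
    for i, w in enumerate(s):
        for j in range(max(0, i - 1), min(len(s), i + 2)):
            if i != j:
                M[idx[w], idx[s[j]]] += 1

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# paradigmatic test: same-category words should end up with similar vectors
print(cosine(M[idx["dax"]], M[idx["wug"]]))   # NOUN vs NOUN: expect high
print(cosine(M[idx["dax"]], M[idx["fep"]]))   # NOUN vs VERB: expect low
```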
Learning Word Representations with Regularization from Prior Knowledge
Title | Learning Word Representations with Regularization from Prior Knowledge |
Authors | Yan Song, Chia-Jung Lee, Fei Xia |
Abstract | Conventional word embeddings are trained with specific criteria (e.g., based on language modeling or co-occurrence) inside a single information source, disregarding the opportunity for further calibration using external knowledge. This paper presents a unified framework that leverages pre-learned or external priors, in the form of a regularizer, for enhancing conventional language model-based embedding learning. We consider two types of regularizers. The first type is derived from topic distribution by running LDA on unlabeled data. The second type is based on dictionaries that are created with human annotation efforts. To effectively learn with the regularizers, we propose a novel data structure, trajectory softmax, in this paper. The resulting embeddings are evaluated by word similarity and sentiment classification. Experimental results show that our learning framework with regularization from prior knowledge improves embedding quality across multiple datasets, compared to a diverse collection of baseline methods. |
Tasks | Calibration, Language Modelling, Learning Word Embeddings, Sentiment Analysis, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1016/ |
https://www.aclweb.org/anthology/K17-1016 | |
PWC | https://paperswithcode.com/paper/learning-word-representations-with-1 |
Repo | |
Framework | |
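Schematically, the framework above adds a prior-derived penalty to the usual embedding objective. The sketch below shows one plausible form of such a regulariser (an L2 pull towards per-word prior vectors); the paper's actual regularisers and its trajectory softmax differ, so treat this purely as an illustration.

```python
import numpy as np

def regularised_loss(model_loss, embeddings, prior_vectors, lam=0.1):
    """Add a prior-based penalty to the usual embedding objective.

    model_loss    : the language-model / skip-gram loss already computed (a scalar)
    embeddings    : (vocab_size, dim) current word embeddings
    prior_vectors : (vocab_size, dim) prior vectors aligned by word id
    lam           : strength of the regulariser
    """
    reg = ((embeddings - prior_vectors) ** 2).sum(axis=1).mean()
    return model_loss + lam * reg
```

Here `prior_vectors` could be, for instance, per-word topic proportions from LDA run on unlabeled data, or vectors derived from dictionary entries, aligned with the embedding matrix by word id.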
Tagging Named Entities in 19th Century and Modern Finnish Newspaper Material with a Finnish Semantic Tagger
Title | Tagging Named Entities in 19th Century and Modern Finnish Newspaper Material with a Finnish Semantic Tagger |
Authors | Kimmo Kettunen, Laura Löfberg |
Abstract | |
Tasks | Named Entity Recognition, Optical Character Recognition |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0204/ |
https://www.aclweb.org/anthology/W17-0204 | |
PWC | https://paperswithcode.com/paper/tagging-named-entities-in-19th-century-and |
Repo | |
Framework | |
A Joint Model for Semantic Sequences: Frames, Entities, Sentiments
Title | A Joint Model for Semantic Sequences: Frames, Entities, Sentiments |
Authors | Haoruo Peng, Snigdha Chaturvedi, Dan Roth |
Abstract | Understanding stories (sequences of events) is a crucial yet challenging natural language understanding task. These events typically carry multiple aspects of semantics including actions, entities and emotions. Not only does each individual aspect contribute to the meaning of the story; so does the interaction among these aspects. Building on this intuition, we propose to jointly model important aspects of semantic knowledge (frames, entities and sentiments) via a semantic language model. We achieve this by first representing these aspects' semantic units at an appropriate level of abstraction and then using the resulting vector representations for each semantic aspect to learn a joint representation via a neural language model. We show that the joint semantic language model is of high quality and can generate better semantic sequences than models that operate on the word level. We further demonstrate that our joint model can be applied to story cloze test and shallow discourse parsing tasks with improved performance and that each semantic aspect contributes to the model. |
Tasks | Language Modelling |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1019/ |
https://www.aclweb.org/anthology/K17-1019 | |
PWC | https://paperswithcode.com/paper/a-joint-model-for-semantic-sequences-frames |
Repo | |
Framework | |
A Type-Theoretical system for the FraCaS test suite: Grammatical Framework meets Coq
Title | A Type-Theoretical system for the FraCaS test suite: Grammatical Framework meets Coq |
Authors | Jean-Philippe Bernardy, Stergios Chatzikyriakidis |
Abstract | |
Tasks | Natural Language Inference |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-6801/ |
https://www.aclweb.org/anthology/W17-6801 | |
PWC | https://paperswithcode.com/paper/a-type-theoretical-system-for-the-fracas-test |
Repo | |
Framework | |
CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface
Title | CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface |
Authors | Tiberiu Boros, Stefan Daniel Dumitrescu, Sonia Pipa |
Abstract | Voice-enabled human-computer interfaces (HCI) that integrate automatic speech recognition, text-to-speech synthesis and natural language understanding have become a commodity, driven by the spread of smartphones and other gadgets into our daily lives. Smart assistants are able to respond to simple queries (similar to text-based question-answering systems), perform simple tasks (call a number, reject a call, etc.) and help organize appointments. With this paper we introduce a newly created process automation platform that enables the user to control applications and home appliances and to query the system for information using a natural voice interface. We offer an overview of the technologies that enabled us to construct our system and we present different usage scenarios in home and office environments. |
Tasks | Question Answering, Speech Recognition, Speech Synthesis, Text-To-Speech Synthesis |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-3009/ |
https://www.aclweb.org/anthology/E17-3009 | |
PWC | https://paperswithcode.com/paper/cassandra-a-multipurpose-configurable-voice |
Repo | |
Framework | |
Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017
Title | Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017 |
Authors | Kenji Imamura, Eiichiro Sumita |
Abstract | In this paper, we describe the NICT-2 neural machine translation system evaluated at WAT2017. This system uses multiple models as an ensemble and combines models with opposite decoding directions by reranking (called bi-directional reranking). In our experimental results on small data sets, the translation quality improved when the number of models was increased to 32 in total and did not saturate. In the experiments on large data sets, improvements of 1.59-3.32 BLEU points were achieved when six-model ensembles were combined by the bi-directional reranking. |
Tasks | Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5711/ |
https://www.aclweb.org/anthology/W17-5711 | |
PWC | https://paperswithcode.com/paper/ensemble-and-reranking-using-multiple-models |
Repo | |
Framework | |
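A minimal sketch of the bi-directional reranking described above: candidates produced by the left-to-right ensemble are rescored by right-to-left models, and the best combined score wins. The `score` method and the equal-weight combination are assumptions for illustration, not the NICT-2 implementation.

```python
def rerank(nbest, l2r_models, r2l_models, source):
    """nbest: candidate translations (token lists) from the left-to-right ensemble.
    Each model is assumed to expose score(source, target_tokens) -> log-probability."""
    def combined(candidate):
        l2r = sum(m.score(source, candidate) for m in l2r_models) / len(l2r_models)
        # right-to-left models score the reversed target sequence
        r2l = sum(m.score(source, list(reversed(candidate))) for m in r2l_models) / len(r2l_models)
        return l2r + r2l          # equal-weight combination of the two directions
    return max(nbest, key=combined)
```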
Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network
Title | Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network |
Authors | Zi-Yi Dou |
Abstract | Document-level sentiment classification is a fundamental problem which aims to predict a user's overall sentiment about a product in a document. Several methods have been proposed to tackle the problem, but most of them fail to consider the influence of users who express the sentiment and products which are evaluated. To address the issue, we propose a deep memory network for document-level sentiment classification which can capture the user and product information at the same time. To prove the effectiveness of our algorithm, we conduct experiments on IMDB and Yelp datasets and the results indicate that our model can achieve better performance than several existing methods. |
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1054/ |
https://www.aclweb.org/anthology/D17-1054 | |
PWC | https://paperswithcode.com/paper/capturing-user-and-product-information-for |
Repo | |
Framework | |
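One attention hop of a memory network conditioned on user and product vectors might look like the NumPy sketch below; the dimensions, the additive way user and product information seed the query, and the residual update are illustrative guesses rather than the paper's architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_hop(memory, query):
    """memory: (n_words, dim) document word vectors; query: (dim,) current query."""
    attn = softmax(memory @ query)    # attention weights over the document words
    output = attn @ memory            # attention-weighted summary of the document
    return output + query             # residual update of the query

dim = 4
doc = np.random.randn(10, dim)        # stand-in document word vectors
user, product = np.random.randn(dim), np.random.randn(dim)

query = user + product                # seed the query with user and product information
for _ in range(3):                    # three attention hops
    query = memory_hop(doc, query)
print(query)                          # would feed a softmax classifier over sentiment labels
```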
Near-Optimal Design of Experiments via Regret Minimization
Title | Near-Optimal Design of Experiments via Regret Minimization |
Authors | Zeyuan Allen-Zhu, Yuanzhi Li, Aarti Singh, Yining Wang |
Abstract | We consider computationally tractable methods for the experimental design problem, where k out of n design points of dimension p are selected so that certain optimality criteria are approximately satisfied. Our algorithm finds a $(1+\epsilon)$-approximate optimal design when k is a linear function of p; in contrast, existing results require k to be super-linear in p. Our algorithm also handles all popular optimality criteria, while existing ones only handle one or two such criteria. Numerical results on synthetic and real-world design problems verify the practical effectiveness of the proposed algorithm. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=609 |
http://proceedings.mlr.press/v70/allen-zhu17e/allen-zhu17e.pdf | |
PWC | https://paperswithcode.com/paper/near-optimal-design-of-experiments-via-regret |
Repo | |
Framework | |
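For intuition about the objective (though not the paper's regret-minimisation algorithm), the sketch below greedily selects k of n design points to approximately maximise the log-determinant of the information matrix (D-optimality), using the matrix determinant lemma for the marginal gain.

```python
import numpy as np

def greedy_d_optimal(X, k, ridge=1e-6):
    """Greedily pick k of the n rows of X (n x p design points) for D-optimality."""
    n, p = X.shape
    selected = []
    A = ridge * np.eye(p)                      # regularised information matrix
    for _ in range(k):
        A_inv = np.linalg.inv(A)
        best_i, best_gain = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            # log-det gain of adding x_i: log(1 + x_i^T A^{-1} x_i) (matrix determinant lemma)
            gain = np.log1p(X[i] @ A_inv @ X[i])
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
        A += np.outer(X[best_i], X[best_i])
    return selected

X = np.random.randn(50, 5)                     # 50 candidate design points in dimension 5
print(greedy_d_optimal(X, k=10))
```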