July 26, 2019


Paper Group NANR 164



Proceedings of the First Workshop on Curation and Applications of Parallel and Comparable Corpora

Title Proceedings of the First Workshop on Curation and Applications of Parallel and Comparable Corpora
Authors
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5600/
PDF https://www.aclweb.org/anthology/W17-5600
PWC https://paperswithcode.com/paper/proceedings-of-the-first-workshop-on-curation
Repo
Framework

An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters

Title An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters
Authors Matthieu Labeau, Alexandre Allauzen
Abstract Noise Contrastive Estimation (NCE) is a learning procedure that is regularly used to train neural language models, since it avoids the computational bottleneck caused by the output softmax. In this paper, we attempt to explain some of the weaknesses of this objective function, and to draw directions for further developments. Experiments on a small task show the issues raised by a unigram noise distribution, and that a context-dependent noise distribution, such as the bigram distribution, can solve these issues and provide stable and data-efficient learning.
Tasks Language Modelling, Machine Translation, Speech Recognition
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2003/
PDF https://www.aclweb.org/anthology/E17-2003
PWC https://paperswithcode.com/paper/an-experimental-analysis-of-noise-contrastive
Repo
Framework
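The NCE objective discussed in the abstract can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name, the toy inputs, and the way scores are passed in are all assumptions; only the loss formula (treating the observed word as positive and k noise samples as negatives, each score shifted by log(k·q(w))) follows the standard NCE formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(data_score, noise_scores, log_q_data, log_q_noise, k):
    """Per-example NCE loss.

    data_score: model score s(w, c) for the observed word
    noise_scores: array of k model scores for the sampled noise words
    log_q_data / log_q_noise: log-probabilities of those words under the
    noise distribution q (unigram, or context-dependent as in the paper)
    """
    # P(D=1 | w, c) = sigmoid(s(w, c) - log(k * q(w)))
    pos = np.log(sigmoid(data_score - (np.log(k) + log_q_data)))
    neg = np.sum(np.log(sigmoid(-(noise_scores - (np.log(k) + log_q_noise)))))
    return -(pos + neg)
```

The paper's comparison amounts to swapping the source of `log_q_*`: computed from a unigram model it ignores the context, while a bigram model conditions the noise distribution on the previous word.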

Simplifying metaphorical language for young readers: A corpus study on news text

Title Simplifying metaphorical language for young readers: A corpus study on news text
Authors Magdalena Wolska, Yulia Clausen
Abstract The paper presents first results of an ongoing project on text simplification focusing on linguistic metaphors. Based on an analysis of a parallel corpus of news text professionally simplified for different grade levels, we identify six types of simplification choices falling into two broad categories: preserving metaphors or dropping them. An annotation study on almost 300 source sentences with metaphors (grade level 12) and their simplified counterparts (grade 4) is conducted. The results show that most metaphors are preserved, and when they are dropped, the semantic content tends to be preserved rather than dropped; however, it is reworded without metaphorical language. In general, some of the expected tendencies in complexity reduction, measured with psycholinguistic variables linked to metaphor comprehension, are observed, suggesting good prospects for machine learning-based metaphor simplification.
Tasks Lexical Simplification, Reading Comprehension, Text Simplification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5035/
PDF https://www.aclweb.org/anthology/W17-5035
PWC https://paperswithcode.com/paper/simplifying-metaphorical-language-for-young
Repo
Framework

Splitting Complex English Sentences

Title Splitting Complex English Sentences
Authors John Lee, J. Buddhika K. Pathirage Don
Abstract This paper applies parsing technology to the task of syntactic simplification of English sentences, focusing on the identification of text spans that can be removed from a complex sentence. We report the most comprehensive evaluation to date on this task, using a dataset of sentences that exhibit simplification based on coordination, subordination, punctuation/parataxis, adjectival clauses, participial phrases, and appositive phrases. We train a decision tree with features derived from text span length, POS tags and dependency relations, and show that it significantly outperforms a parser-only baseline.
Tasks Lexical Simplification, Machine Translation, Reading Comprehension, Text Simplification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6307/
PDF https://www.aclweb.org/anthology/W17-6307
PWC https://paperswithcode.com/paper/splitting-complex-english-sentences
Repo
Framework
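The kind of span-removal decision the paper's tree learns can be illustrated with a hand-written rule; the feature names, dependency labels, and threshold below are purely hypothetical stand-ins, not the learned tree or its actual feature set.

```python
def removable(span_len, head_deprel, starts_with_comma):
    """Toy decision rule in the spirit of the paper's decision tree:
    can this text span be dropped from a complex sentence?
    Labels and the length threshold are illustrative assumptions."""
    if head_deprel in ("appos", "acl:relcl", "advcl"):
        # modifier clauses/phrases are candidate removals if short
        # enough or set off by punctuation
        return span_len <= 12 or starts_with_comma
    return False
```

A real system would learn such rules from span length, POS and dependency features, as the abstract describes.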

Parameter Transfer across Domains for Word Sense Disambiguation

Title Parameter Transfer across Domains for Word Sense Disambiguation
Authors Sallam Abualhaija, Nina Tahmasebi, Diane Forin, Karl-Heinz Zimmermann
Abstract Word sense disambiguation is defined as finding the corresponding sense for a target word in a given context, which comprises a major step in text applications. Recently, it has been addressed as an optimization problem. The idea is to find a sequence of senses that corresponds to the words in a given context with maximum semantic similarity. Metaheuristics like simulated annealing and D-Bees provide approximate, good-enough solutions, but are usually influenced by the starting parameters. In this paper, we study the parameter tuning for both algorithms within the word sense disambiguation problem. The experiments are conducted on different datasets to cover different disambiguation scenarios. We show that D-Bees is robust and less sensitive to the initial parameters compared to simulated annealing; hence, it is sufficient to tune the parameters once and reuse them for different datasets, domains or languages.
Tasks Lexical Simplification, Machine Translation, Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1001/
PDF https://doi.org/10.26615/978-954-452-049-6_001
PWC https://paperswithcode.com/paper/parameter-transfer-across-domains-for-word
Repo
Framework
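The simulated-annealing formulation of WSD the abstract mentions can be sketched directly: pick one sense per context word so that the summed pairwise semantic similarity is maximized. Everything here is a toy sketch under assumptions; the similarity function is a stand-in, and `t0`/`cooling` are exactly the kind of starting parameters the paper studies tuning.

```python
import math
import random

def wsd_simulated_annealing(senses_per_word, sim, t0=2.0, cooling=0.95,
                            steps=300, seed=0):
    """Pick one sense per word maximising summed pairwise similarity.

    senses_per_word: list of candidate sense-id lists, one per word
    sim(a, b): similarity between two sense ids (toy stand-in)
    t0, cooling: initial temperature and cooling rate (the tuned params)
    """
    rng = random.Random(seed)
    state = [rng.choice(s) for s in senses_per_word]

    def score(assign):
        return sum(sim(a, b) for i, a in enumerate(assign)
                   for b in assign[i + 1:])

    best, best_score, t = list(state), score(state), t0
    for _ in range(steps):
        i = rng.randrange(len(state))
        cand = list(state)
        cand[i] = rng.choice(senses_per_word[i])
        delta = score(cand) - score(state)
        # accept improvements always, worsenings with prob exp(delta/t)
        if delta >= 0 or rng.random() < math.exp(delta / t):
            state = cand
            if score(state) > best_score:
                best, best_score = list(state), score(state)
        t *= cooling
    return best
```

The sensitivity the paper measures shows up here as dependence of the final assignment on `t0` and `cooling`.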

World of Bits: An Open-Domain Platform for Web-Based Agents

Title World of Bits: An Open-Domain Platform for Web-Based Agents
Authors Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, Percy Liang
Abstract While simulated game environments have greatly accelerated research in reinforcement learning, existing environments lack the open-domain realism of tasks in computer vision or natural language processing, which operate on artifacts created by humans in natural, organic settings. To foster reinforcement learning research in such settings, we introduce the World of Bits (WoB), a platform in which agents complete tasks on the Internet by performing low-level keyboard and mouse actions. The two main challenges are: (i) to curate a large, diverse set of interesting web-based tasks, and (ii) to ensure that these tasks have a well-defined reward structure and are reproducible despite the transience of the web. To do this, we develop a methodology in which crowdworkers create tasks defined by natural language questions and provide demonstrations of how to answer the question on real websites using keyboard and mouse; HTTP traffic is cached to create a reproducible offline approximation of the web site. Finally, we show that agents trained via behavioral cloning and reinforcement learning can successfully complete a range of our web-based tasks.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=843
PDF http://proceedings.mlr.press/v70/shi17a/shi17a.pdf
PWC https://paperswithcode.com/paper/world-of-bits-an-open-domain-platform-for-web
Repo
Framework
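The behavioral-cloning stage the abstract mentions reduces to supervised learning on demonstrated (observation, action) pairs. The linear softmax policy and the training loop below are a minimal stand-in, not the paper's models over pixels, DOM trees, and keyboard/mouse actions.

```python
import numpy as np

def behavioral_cloning(obs, actions, n_actions, lr=0.5, epochs=200):
    """Fit a linear softmax policy on demonstrations by minimising
    cross-entropy between predicted and demonstrated actions."""
    W = np.zeros((obs.shape[1], n_actions))
    for _ in range(epochs):
        logits = obs @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        onehot = np.eye(n_actions)[actions]
        W -= lr * obs.T @ (p - onehot) / len(obs)  # gradient step
    return W

def act(W, o):
    """Greedy action of the cloned policy for observation o."""
    return int(np.argmax(o @ W))
```

In the WoB setting the policy would then be fine-tuned with reinforcement learning against the task's reward.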

An Artificial Language Evaluation of Distributional Semantic Models

Title An Artificial Language Evaluation of Distributional Semantic Models
Authors Fatemeh Torabi Asr, Michael Jones
Abstract Recent studies of distributional semantic models have set up a competition between word embeddings obtained from predictive neural networks and word vectors obtained from abstractive count-based models. This paper is an attempt to reveal the underlying contribution of additional training data and post-processing steps on each type of model in word similarity and relatedness inference tasks. We do so by designing an artificial language framework, training a predictive and a count-based model on data sampled from this grammar, and evaluating the resulting word vectors in paradigmatic and syntagmatic tasks defined with respect to the grammar.
Tasks Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1015/
PDF https://www.aclweb.org/anthology/K17-1015
PWC https://paperswithcode.com/paper/an-artificial-language-evaluation-of
Repo
Framework
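The count-based side of the comparison can be made concrete with the classic co-occurrence-plus-PPMI recipe; this is a generic sketch of that family of models, not the specific configuration evaluated in the paper.

```python
import math
from collections import Counter

def ppmi_vectors(corpus, window=1):
    """Minimal count-based distributional model: co-occurrence counts
    within a +/-window, reweighted by positive PMI."""
    pair, word = Counter(), Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            word[w] += 1
            for j in range(max(0, i - window),
                           min(len(sent), i + window + 1)):
                if i != j:
                    pair[(w, sent[j])] += 1
    total = sum(pair.values())
    n = sum(word.values())
    vecs = {w: {} for w in word}
    for (w, c), f in pair.items():
        pmi = math.log((f / total) / ((word[w] / n) * (word[c] / n)))
        if pmi > 0:  # keep only positive associations
            vecs[w][c] = pmi
    return vecs
```

With an artificial grammar as the data source, paradigmatic neighbours (words sharing contexts) end up with overlapping vectors, which is the property the paper's evaluation probes.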

Learning Word Representations with Regularization from Prior Knowledge

Title Learning Word Representations with Regularization from Prior Knowledge
Authors Yan Song, Chia-Jung Lee, Fei Xia
Abstract Conventional word embeddings are trained with specific criteria (e.g., based on language modeling or co-occurrence) inside a single information source, disregarding the opportunity for further calibration using external knowledge. This paper presents a unified framework that leverages pre-learned or external priors, in the form of a regularizer, for enhancing conventional language model-based embedding learning. We consider two types of regularizers. The first type is derived from topic distribution by running LDA on unlabeled data. The second type is based on dictionaries that are created with human annotation efforts. To effectively learn with the regularizers, we propose a novel data structure, trajectory softmax, in this paper. The resulting embeddings are evaluated by word similarity and sentiment classification. Experimental results show that our learning framework with regularization from prior knowledge improves embedding quality across multiple datasets, compared to a diverse collection of baseline methods.
Tasks Calibration, Language Modelling, Learning Word Embeddings, Sentiment Analysis, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1016/
PDF https://www.aclweb.org/anthology/K17-1016
PWC https://paperswithcode.com/paper/learning-word-representations-with-1
Repo
Framework
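The core idea of learning embeddings with a prior-based regularizer can be reduced to one gradient step on a combined loss. The quadratic penalty and the names below are an illustrative simplification, not the paper's trajectory-softmax machinery.

```python
import numpy as np

def regularized_step(emb, lm_grad, prior, lam=0.1, lr=0.05):
    """One gradient step on L = L_lm + lam * ||emb - prior||^2.

    emb: current embedding vector for a word
    lm_grad: gradient of the language-model loss w.r.t. emb
    prior: external-knowledge vector (e.g. LDA topic- or
           dictionary-derived), pulling the embedding toward it
    """
    grad = lm_grad + 2.0 * lam * (emb - prior)
    return emb - lr * grad
```

Setting `lam=0` recovers plain language-model training; increasing it trades fit to the corpus for agreement with the external prior.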

Tagging Named Entities in 19th Century and Modern Finnish Newspaper Material with a Finnish Semantic Tagger

Title Tagging Named Entities in 19th Century and Modern Finnish Newspaper Material with a Finnish Semantic Tagger
Authors Kimmo Kettunen, Laura Löfberg
Abstract
Tasks Named Entity Recognition, Optical Character Recognition
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0204/
PDF https://www.aclweb.org/anthology/W17-0204
PWC https://paperswithcode.com/paper/tagging-named-entities-in-19th-century-and
Repo
Framework

A Joint Model for Semantic Sequences: Frames, Entities, Sentiments

Title A Joint Model for Semantic Sequences: Frames, Entities, Sentiments
Authors Haoruo Peng, Snigdha Chaturvedi, Dan Roth
Abstract Understanding stories – sequences of events – is a crucial yet challenging natural language understanding task. These events typically carry multiple aspects of semantics including actions, entities and emotions. Not only does each individual aspect contribute to the meaning of the story, so does the interaction among these aspects. Building on this intuition, we propose to jointly model important aspects of semantic knowledge – frames, entities and sentiments – via a semantic language model. We achieve this by first representing these aspects' semantic units at an appropriate level of abstraction and then using the resulting vector representations for each semantic aspect to learn a joint representation via a neural language model. We show that the joint semantic language model is of high quality and can generate better semantic sequences than models that operate on the word level. We further demonstrate that our joint model can be applied to story cloze test and shallow discourse parsing tasks with improved performance and that each semantic aspect contributes to the model.
Tasks Language Modelling
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1019/
PDF https://www.aclweb.org/anthology/K17-1019
PWC https://paperswithcode.com/paper/a-joint-model-for-semantic-sequences-frames
Repo
Framework

A Type-Theoretical system for the FraCaS test suite: Grammatical Framework meets Coq

Title A Type-Theoretical system for the FraCaS test suite: Grammatical Framework meets Coq
Authors Jean-Philippe Bernardy, Stergios Chatzikyriakidis
Abstract
Tasks Natural Language Inference
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-6801/
PDF https://www.aclweb.org/anthology/W17-6801
PWC https://paperswithcode.com/paper/a-type-theoretical-system-for-the-fracas-test
Repo
Framework

CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface

Title CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface
Authors Tiberiu Boros, Stefan Daniel Dumitrescu, Sonia Pipa
Abstract Voice enabled human computer interfaces (HCI) that integrate automatic speech recognition, text-to-speech synthesis and natural language understanding have become a commodity, introduced by the immersion of smart phones and other gadgets in our daily lives. Smart assistants are able to respond to simple queries (similar to text-based question-answering systems), perform simple tasks (call a number, reject a call etc.) and help organizing appointments. With this paper we introduce a newly created process automation platform that enables the user to control applications and home appliances and to query the system for information using a natural voice interface. We offer an overview of the technologies that enabled us to construct our system and we present different usage scenarios in home and office environments.
Tasks Question Answering, Speech Recognition, Speech Synthesis, Text-To-Speech Synthesis
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-3009/
PDF https://www.aclweb.org/anthology/E17-3009
PWC https://paperswithcode.com/paper/cassandra-a-multipurpose-configurable-voice
Repo
Framework

Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017

Title Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017
Authors Kenji Imamura, Eiichiro Sumita
Abstract In this paper, we describe the NICT-2 neural machine translation system evaluated at WAT2017. This system uses multiple models as an ensemble and combines models with opposite decoding directions by reranking (called bi-directional reranking). In our experimental results on small data sets, the translation quality improved when the number of models was increased to 32 in total and did not saturate. In the experiments on large data sets, improvements of 1.59-3.32 BLEU points were achieved when six-model ensembles were combined by the bi-directional reranking.
Tasks Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5711/
PDF https://www.aclweb.org/anthology/W17-5711
PWC https://paperswithcode.com/paper/ensemble-and-reranking-using-multiple-models
Repo
Framework
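Bi-directional reranking as described in the abstract is straightforward to sketch: rescore each n-best hypothesis with a left-to-right and a right-to-left model and sort by the combined score. The scoring functions here are hypothetical stand-ins for the NICT-2 models, and simple score addition is an assumed combination rule.

```python
def rerank(nbest, forward_score, backward_score):
    """Sort an n-best list by combined forward + backward model score."""
    return sorted(nbest,
                  key=lambda hyp: forward_score(hyp) + backward_score(hyp),
                  reverse=True)
```

In the paper's setup each direction is itself an ensemble, so `forward_score`/`backward_score` would average several models' log-probabilities.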

Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network

Title Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network
Authors Zi-Yi Dou
Abstract Document-level sentiment classification is a fundamental problem which aims to predict a user's overall sentiment about a product in a document. Several methods have been proposed to tackle the problem whereas most of them fail to consider the influence of users who express the sentiment and products which are evaluated. To address the issue, we propose a deep memory network for document-level sentiment classification which could capture the user and product information at the same time. To prove the effectiveness of our algorithm, we conduct experiments on IMDB and Yelp datasets and the results indicate that our model can achieve better performance than several existing methods.
Tasks Opinion Mining, Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1054/
PDF https://www.aclweb.org/anthology/D17-1054
PWC https://paperswithcode.com/paper/capturing-user-and-product-information-for
Repo
Framework
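One attention hop of such a memory network can be sketched to show where the user and product information enters. The additive conditioning, the dimensions, and the function name are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def memory_hop(query, memory, user, product):
    """One attention hop over sentence vectors in `memory`, with the
    query conditioned on user and product vectors."""
    q = query + user + product        # inject user/product information
    att = memory @ q                  # attention scores per memory slot
    att = np.exp(att - att.max())
    att /= att.sum()                  # softmax over memory slots
    return q + att @ memory           # updated document representation
```

Stacking several such hops and feeding the result to a softmax classifier yields a document-level sentiment prediction.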

Near-Optimal Design of Experiments via Regret Minimization

Title Near-Optimal Design of Experiments via Regret Minimization
Authors Zeyuan Allen-Zhu, Yuanzhi Li, Aarti Singh, Yining Wang
Abstract We consider computationally tractable methods for the experimental design problem, where k out of n design points of dimension p are selected so that certain optimality criteria are approximately satisfied. Our algorithm finds a $(1+\epsilon)$-approximate optimal design when k is a linear function of p; in contrast, existing results require k to be super-linear in p. Our algorithm also handles all popular optimality criteria, while existing ones only handle one or two such criteria. Numerical results on synthetic and real-world design problems verify the practical effectiveness of the proposed algorithm.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=609
PDF http://proceedings.mlr.press/v70/allen-zhu17e/allen-zhu17e.pdf
PWC https://paperswithcode.com/paper/near-optimal-design-of-experiments-via-regret
Repo
Framework
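For intuition about the experimental design problem itself, a common baseline is the greedy heuristic for D-optimality: pick k of n design points maximising the log-determinant of the information matrix. To be clear, this is the standard greedy baseline, not the paper's regret-minimization algorithm, which comes with the approximation guarantee described in the abstract.

```python
import numpy as np

def greedy_d_optimal(X, k):
    """Greedily select k rows of X (n x p) maximising
    log det(sum of x x^T + eps * I)."""
    n, p = X.shape
    chosen, M = [], 1e-6 * np.eye(p)  # small ridge keeps det nonzero
    for _ in range(k):
        best_i, best_val = None, -np.inf
        for i in range(n):
            if i in chosen:
                continue
            _, val = np.linalg.slogdet(M + np.outer(X[i], X[i]))
            if val > best_val:
                best_i, best_val = i, val
        chosen.append(best_i)
        M += np.outer(X[best_i], X[best_i])
    return chosen
```

The greedy rule naturally prefers diverse design points, since adding a near-duplicate row barely grows the determinant.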