Paper Group NANR 172
ISI at the SIGMORPHON 2017 Shared Task on Morphological Reinflection. Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task. Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands. Exploring Optimal Voting in Native Language Identification. Sub-character Neural Language Modelling …
ISI at the SIGMORPHON 2017 Shared Task on Morphological Reinflection
Title | ISI at the SIGMORPHON 2017 Shared Task on Morphological Reinflection |
Authors | Abhisek Chakrabarty, Utpal Garain |
Abstract | |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-2006/ |
PWC | https://paperswithcode.com/paper/isi-at-the-sigmorphon-2017-shared-task-on |
Repo | |
Framework | |
Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task
Title | Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task |
Authors | Rajen Chatterjee, M. Amin Farajian, Matteo Negri, Marco Turchi, Ankit Srivastava, Santanu Pal |
Abstract | |
Tasks | Automatic Post-Editing, Language Modelling, Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4773/ |
PWC | https://paperswithcode.com/paper/multi-source-neural-automatic-post-editing |
Repo | |
Framework | |
Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands
Title | Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands |
Authors | Muhannad Alomari, Paul Duckworth, Majd Hawasly, David C. Hogg, Anthony G. Cohn |
Abstract | We present a cognitively plausible system capable of acquiring knowledge in language and vision from pairs of short video clips and linguistic descriptions. The aim of this work is to teach a robot manipulator how to execute natural language commands by demonstration. This is achieved by first learning a set of visual ‘concepts’ that abstract the visual feature spaces into concepts that have human-level meaning. Second, learning the mapping/grounding between words and the extracted visual concepts. Third, inducing grammar rules via a semantic representation known as Robot Control Language (RCL). We evaluate our approach against state-of-the-art supervised and unsupervised grounding and grammar induction systems, and show that a robot can learn to execute never seen-before commands from pairs of unlabelled linguistic and visual inputs. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2805/ |
PWC | https://paperswithcode.com/paper/natural-language-grounding-and-grammar |
Repo | |
Framework | |
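The word-to-concept grounding step described in the entry above can be made concrete with a toy co-occurrence counter. This is an illustrative sketch only, not the authors' system: the paired data, the concept labels, and the `ground` helper are all invented for the example (the paper learns the visual concepts themselves from video features).

```python
from collections import Counter, defaultdict

# Toy paired data: (command words, visual concepts observed in the paired clip).
pairs = [
    (["pick", "up", "the", "red", "block"], {"colour:red", "action:grasp", "shape:cube"}),
    (["move", "the", "red", "block", "left"], {"colour:red", "action:translate", "shape:cube"}),
    (["pick", "up", "the", "green", "ball"], {"colour:green", "action:grasp", "shape:sphere"}),
    (["push", "the", "red", "ball"], {"colour:red", "action:push", "shape:sphere"}),
    (["grasp", "the", "blue", "block"], {"colour:blue", "action:grasp", "shape:cube"}),
]

word_concept = defaultdict(Counter)     # word -> concept co-occurrence counts
concept_freq = Counter()

for words, concepts in pairs:
    for c in concepts:
        concept_freq[c] += 1
        for w in words:
            word_concept[w][c] += 1

def ground(word, min_count=2):
    """Concept most strongly associated with `word` (by relative co-occurrence)."""
    scores = {c: n / concept_freq[c] for c, n in word_concept[word].items() if n >= min_count}
    return max(scores, key=scores.get) if scores else None

print(ground("red"), ground("pick"))    # colour:red action:grasp
```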
Exploring Optimal Voting in Native Language Identification
Title | Exploring Optimal Voting in Native Language Identification |
Authors | Cyril Goutte, Serge Léger |
Abstract | We describe the submissions entered by the National Research Council Canada in the NLI-2017 evaluation. We mainly explored the use of voting, and various ways to optimize the choice and number of voting systems. We also explored the use of features that rely on no linguistic preprocessing. Long ngrams of characters obtained from raw text turned out to yield the best performance on all textual input (written essays and speech transcripts). Voting ensembles turned out to produce small performance gains, with little difference between the various optimization strategies we tried. Our top systems achieved accuracies of 87% on the essay track, 84% on the speech track, and close to 92% by combining essays, speech and i-vectors in the fusion track. |
Tasks | Language Identification, Native Language Identification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5041/ |
PWC | https://paperswithcode.com/paper/exploring-optimal-voting-in-native-language |
Repo | |
Framework | |
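A hedged sketch of the approach summarised in the entry above: character n-gram classifiers over raw text, combined by plurality voting. The n-gram ranges, the choice of LinearSVC, and the data are placeholders, not the NRC submission.

```python
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Placeholder training data: essays and their authors' native-language (L1) labels.
train_texts = ["this is an essay about my hometown ...", "another essay text ..."]
train_labels = ["KOR", "SPA"]

# One member system per character n-gram range; long character n-grams over raw
# text were the strongest single feature type reported in the abstract.
members = []
for lo, hi in [(2, 4), (4, 6), (6, 9)]:
    clf = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(lo, hi), sublinear_tf=True),
        LinearSVC(),
    )
    clf.fit(train_texts, train_labels)
    members.append(clf)

def vote(texts):
    """Plurality voting over the member systems' predictions."""
    per_member = [clf.predict(texts) for clf in members]
    return [Counter(preds).most_common(1)[0][0] for preds in zip(*per_member)]

print(vote(["a held-out essay ..."]))
```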
Sub-character Neural Language Modelling in Japanese
Title | Sub-character Neural Language Modelling in Japanese |
Authors | Viet Nguyen, Julian Brooke, Timothy Baldwin |
Abstract | In East Asian languages such as Japanese and Chinese, the semantics of a character are (somewhat) reflected in its sub-character elements. This paper examines the effect of using sub-characters for language modeling in Japanese. This is achieved by decomposing characters according to a range of character decomposition datasets, and training a neural language model over variously decomposed character representations. Our results indicate that language modelling can be improved through the inclusion of sub-characters, though this result depends on a good choice of decomposition dataset and the appropriate granularity of decomposition. |
Tasks | Language Modelling |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4122/ |
PWC | https://paperswithcode.com/paper/sub-character-neural-language-modelling-in |
Repo | |
Framework | |
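A minimal sketch of the pipeline described above: characters are replaced by sub-character components from a decomposition table, and a small recurrent language model is trained over the resulting symbol sequence. The three-entry decomposition table and the model sizes are illustrative only; the paper compares several real decomposition datasets and granularities.

```python
import torch
import torch.nn as nn

# Toy decomposition table (illustrative stand-in for a real decomposition dataset).
decompose = {"好": ["女", "子"], "明": ["日", "月"], "森": ["木", "木", "木"]}

def to_subchars(text):
    out = []
    for ch in text:
        out.extend(decompose.get(ch, [ch]))     # fall back to the character itself
    return out

corpus = "私は森が好き"
symbols = to_subchars(corpus)
vocab = {s: i for i, s in enumerate(sorted(set(symbols)))}
ids = torch.tensor([vocab[s] for s in symbols])

class SubCharLM(nn.Module):
    def __init__(self, v, dim=32):
        super().__init__()
        self.emb = nn.Embedding(v, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, v)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = SubCharLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)   # next-symbol prediction
for _ in range(50):
    opt.zero_grad()
    loss = loss_fn(model(x).reshape(-1, len(vocab)), y.reshape(-1))
    loss.backward()
    opt.step()
print(float(loss))
```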
Annotation of Entities and Relations in Spanish Radiology Reports
Title | Annotation of Entities and Relations in Spanish Radiology Reports |
Authors | Viviana Cotik, Darío Filippo, Roland Roller, Hans Uszkoreit, Feiyu Xu |
Abstract | Radiology reports express the results of a radiology study and contain information about anatomical entities, findings, measures and impressions of the medical doctor. The use of information extraction techniques can help physicians to access this information in order to understand data and to infer further knowledge. Supervised machine learning methods are very popular to address information extraction, but are usually domain and language dependent. To train new classification models, annotated data is required. Moreover, annotated data is also required as an evaluation resource of information extraction algorithms. However, one major drawback of processing clinical data is the low availability of annotated datasets. For this reason we performed a manual annotation of radiology reports written in Spanish. This paper presents the corpus, the annotation schema, the annotation guidelines and further insight of the data. |
Tasks | Named Entity Recognition, Relation Extraction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1025/ |
DOI | https://doi.org/10.26615/978-954-452-049-6_025 |
PWC | https://paperswithcode.com/paper/annotation-of-entities-and-relations-in |
Repo | |
Framework | |
Improving Evaluation of Document-level Machine Translation Quality Estimation
Title | Improving Evaluation of Document-level Machine Translation Quality Estimation |
Authors | Yvette Graham, Qingsong Ma, Timothy Baldwin, Qun Liu, Carla Parra, Carolina Scarton |
Abstract | Meaningful conclusions about the relative performance of NLP systems are only possible if the gold standard employed in a given evaluation is both valid and reliable. In this paper, we explore the validity of human annotations currently employed in the evaluation of document-level quality estimation for machine translation (MT). We demonstrate the degree to which MT system rankings are dependent on weights employed in the construction of the gold standard, before proposing direct human assessment as a valid alternative. Experiments show direct assessment (DA) scores for documents to be highly reliable, achieving a correlation of above 0.9 in a self-replication experiment, in addition to a substantial estimated cost reduction through quality controlled crowd-sourcing. The original gold standard based on post-edits incurs a 10–20 times greater cost than DA. |
Tasks | Machine Translation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2057/ |
PWC | https://paperswithcode.com/paper/improving-evaluation-of-document-level |
Repo | |
Framework | |
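The reliability figure quoted in the abstract above (a correlation above 0.9 in a self-replication experiment) corresponds to a simple computation: correlate per-document average DA scores from two independent collection runs over the same documents. A sketch with placeholder numbers:

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder: mean standardised DA score per document from two independent
# crowd-sourced collection runs over the same set of documents.
run_a = np.array([0.12, -0.40, 0.55, 0.03, -0.21, 0.34])
run_b = np.array([0.10, -0.35, 0.60, -0.02, -0.25, 0.30])

r, p = pearsonr(run_a, run_b)
print(f"self-replication Pearson r = {r:.3f} (p = {p:.3g})")
```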
Using a Graph-based Coherence Model in Document-Level Machine Translation
Title | Using a Graph-based Coherence Model in Document-Level Machine Translation |
Authors | Leo Born, Mohsen Mesgar, Michael Strube |
Abstract | Although coherence is an important aspect of any text generation system, it has received little attention in the context of machine translation (MT) so far. We hypothesize that the quality of document-level translation can be improved if MT models take into account the semantic relations among sentences during translation. We integrate the graph-based coherence model proposed by Mesgar and Strube (2016) with Docent (Hardmeier et al., 2012; Hardmeier, 2014), a document-level machine translation system. The application of this graph-based coherence modeling approach is novel in the context of machine translation. We evaluate the coherence model and its effects on the quality of the machine translation. The result of our experiments shows that our coherence model slightly improves the quality of translation in terms of the average Meteor score. |
Tasks | Machine Translation, Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4803/ |
PWC | https://paperswithcode.com/paper/using-a-graph-based-coherence-model-in |
Repo | |
Framework | |
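A much-simplified illustration of scoring candidate document translations with a sentence-graph coherence measure. This is not the Mesgar and Strube model or the Docent integration: here sentences are linked when their TF-IDF cosine similarity exceeds a threshold, and coherence is taken as the mean edge weight, purely to make the idea concrete.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def coherence_score(sentences, threshold=0.05):
    """Average similarity over sentence pairs judged related (a crude graph proxy)."""
    X = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(X)
    n = len(sentences)
    edges = [sim[i, j] for i in range(n) for j in range(i + 1, n) if sim[i, j] > threshold]
    return float(np.mean(edges)) if edges else 0.0

candidate_a = ["The parliament met on Tuesday.", "It adopted the budget.", "The budget passed narrowly."]
candidate_b = ["The parliament met on Tuesday.", "Bananas are rich in potassium.", "The budget passed narrowly."]
print(coherence_score(candidate_a), coherence_score(candidate_b))
```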
A Computational Study on Word Meanings and Their Distributed Representations via Polymodal Embedding
Title | A Computational Study on Word Meanings and Their Distributed Representations via Polymodal Embedding |
Authors | Joohee Park, Sung-hyon Myaeng |
Abstract | A distributed representation has become a popular approach to capturing a word meaning. Besides its success and practical value, however, questions arise about the relationships between a true word meaning and its distributed representation. In this paper, we examine such a relationship via polymodal embedding approach inspired by the theory that humans tend to use diverse sources in developing a word meaning. The result suggests that the existing embeddings lack in capturing certain aspects of word meanings which can be significantly improved by the polymodal approach. Also, we show distinct characteristics of different types of words (e.g. concreteness) via computational studies. Finally, we show our proposed embedding method outperforms the baselines in the word similarity measure tasks and the hypernym prediction tasks. |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1022/ |
PWC | https://paperswithcode.com/paper/a-computational-study-on-word-meanings-and |
Repo | |
Framework | |
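A hedged sketch of the general polymodal idea in the entry above: fuse word vectors from several sources, normalising each source so that no single modality dominates the combined representation. The random vectors stand in for real text, visual, and other embeddings, and simple concatenation is an assumption for the example rather than the paper's exact fusion method.

```python
import numpy as np

rng = np.random.default_rng(0)
words = ["banana", "justice", "run"]

# Stand-ins for embeddings learned from different sources/modalities.
modalities = {
    "text":   {w: rng.normal(size=300) for w in words},
    "visual": {w: rng.normal(size=128) for w in words},
    "sound":  {w: rng.normal(size=64) for w in words},
}

def polymodal(word):
    parts = []
    for name, table in modalities.items():
        v = table[word]
        parts.append(v / np.linalg.norm(v))   # per-modality L2 normalisation
    return np.concatenate(parts)

vec = polymodal("banana")
print(vec.shape)   # (492,)
```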
Learning to Generate Product Reviews from Attributes
Title | Learning to Generate Product Reviews from Attributes |
Authors | Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, Ke Xu |
Abstract | Automatically generating product reviews is a meaningful, yet not well-studied task in sentiment analysis. Traditional natural language generation methods rely extensively on hand-crafted rules and predefined templates. This paper presents an attention-enhanced attribute-to-sequence model to generate product reviews for given attribute information, such as user, product, and rating. The attribute encoder learns to represent input attributes as vectors. Then, the sequence decoder generates reviews by conditioning its output on these vectors. We also introduce an attention mechanism to jointly generate reviews and align words with input attributes. The proposed model is trained end-to-end to maximize the likelihood of target product reviews given the attributes. We build a publicly available dataset for the review generation task by leveraging the Amazon book reviews and their metadata. Experiments on the dataset show that our approach outperforms baseline methods and the attention mechanism significantly improves the performance of our model. |
Tasks | Sentiment Analysis, Text Generation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1059/ |
PWC | https://paperswithcode.com/paper/learning-to-generate-product-reviews-from |
Repo | |
Framework | |
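A structural sketch of an attention-enhanced attribute-to-sequence decoder in the spirit of the abstract above: attribute ids (user, product, rating) are embedded, the decoder is initialised from them, and each decoding step attends back over the attribute vectors. The dimensions, the mean-pooled initial state, and the attention form are assumptions made for the example, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class Attr2Seq(nn.Module):
    """Illustrative attribute-to-sequence model with attention over attribute vectors."""
    def __init__(self, n_users, n_products, n_ratings, vocab, dim=64):
        super().__init__()
        self.attr_emb = nn.ModuleList([
            nn.Embedding(n_users, dim),
            nn.Embedding(n_products, dim),
            nn.Embedding(n_ratings, dim),
        ])
        self.word_emb = nn.Embedding(vocab, dim)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.attn = nn.Linear(dim, dim)
        self.out = nn.Linear(2 * dim, vocab)

    def forward(self, attrs, prev_words):
        # attrs: (B, 3) ids for user/product/rating; prev_words: (B, T) shifted review.
        a = torch.stack([emb(attrs[:, i]) for i, emb in enumerate(self.attr_emb)], dim=1)  # (B, 3, D)
        h0 = a.mean(dim=1, keepdim=True).transpose(0, 1).contiguous()                      # (1, B, D)
        dec, _ = self.decoder(self.word_emb(prev_words), h0)                               # (B, T, D)
        scores = torch.bmm(self.attn(dec), a.transpose(1, 2))                              # (B, T, 3)
        ctx = torch.bmm(torch.softmax(scores, dim=-1), a)                                  # (B, T, D)
        return self.out(torch.cat([dec, ctx], dim=-1))                                     # (B, T, V)

model = Attr2Seq(n_users=100, n_products=50, n_ratings=5, vocab=1000)
attrs = torch.tensor([[3, 7, 4]])
prev = torch.randint(0, 1000, (1, 12))
print(model(attrs, prev).shape)   # torch.Size([1, 12, 1000])
```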
Unsupervised Segmentation of Phoneme Sequences based on Pitman-Yor Semi-Markov Model using Phoneme Length Context
Title | Unsupervised Segmentation of Phoneme Sequences based on Pitman-Yor Semi-Markov Model using Phoneme Length Context |
Authors | Ryu Takeda, Kazunori Komatani |
Abstract | Unsupervised segmentation of phoneme sequences is an essential process to obtain unknown words during spoken dialogues. In this segmentation, an input phoneme sequence without delimiters is converted into segmented sub-sequences corresponding to words. The Pitman-Yor semi-Markov model (PYSMM) is promising for this problem, but its performance degrades when it is applied to phoneme-level word segmentation. This is because of insufficient cues for the segmentation, e.g., homophones are improperly treated as single entries and their different contexts are also confused. We propose a phoneme-length context model for PYSMM to give a helpful cue at the phoneme-level and to predict succeeding segments more accurately. Our experiments showed that the peak performance with our context model outperformed those without such a context model by 0.045 at most in terms of F-measures of estimated segmentation. |
Tasks | Spoken Dialogue Systems |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1025/ |
PWC | https://paperswithcode.com/paper/unsupervised-segmentation-of-phoneme |
Repo | |
Framework | |
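The PYSMM itself is beyond a short example, but the setting and the role of a length prior can be made concrete with a much-simplified unigram Viterbi segmenter over an unsegmented string (romanised characters stand in for phonemes here). The lexicon probabilities and the geometric length prior are invented for the sketch; the paper's model is Bayesian and nonparametric, not this.

```python
import math

# Toy unigram "lexicon" probabilities over phoneme strings (illustrative only).
lexicon = {"konnichiwa": 0.02, "kon": 0.01, "nichi": 0.01, "wa": 0.05,
           "sekai": 0.02, "se": 0.01, "kai": 0.01}
UNK = 1e-6       # probability mass for segments not in the lexicon
P_CONT = 0.7     # geometric length prior: longer segments are progressively less likely

def seg_logprob(seg):
    length_prior = (len(seg) - 1) * math.log(P_CONT) + math.log(1 - P_CONT)
    return math.log(lexicon.get(seg, UNK)) + length_prior

def segment(phonemes, max_len=12):
    """Viterbi search for the highest-scoring segmentation of an unsegmented string."""
    n = len(phonemes)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            score = best[i][0] + seg_logprob(phonemes[i:j])
            if score > best[j][0]:
                best[j] = (score, i)
    out, j = [], n
    while j > 0:                       # backtrace
        i = best[j][1]
        out.append(phonemes[i:j])
        j = i
    return out[::-1]

print(segment("konnichiwasekai"))      # ['konnichiwa', 'sekai']
```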
Extractive Summarization Using Multi-Task Learning with Document Classification
Title | Extractive Summarization Using Multi-Task Learning with Document Classification |
Authors | Masaru Isonuma, Toru Fujino, Junichiro Mori, Yutaka Matsuo, Ichiro Sakata |
Abstract | The need for automatic document summarization that can be used for practical applications is increasing rapidly. In this paper, we propose a general framework for summarization that extracts sentences from a document using externally related information. Our work is aimed at single document summarization using small amounts of reference summaries. In particular, we address document summarization in the framework of multi-task learning using curriculum learning for sentence extraction and document classification. The proposed framework enables us to obtain better feature representations to extract sentences from documents. We evaluate our proposed summarization method on two datasets: financial report and news corpus. Experimental results demonstrate that our summarizers achieve performance that is comparable to state-of-the-art systems. |
Tasks | Document Classification, Document Summarization, Multi-Task Learning |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1223/ |
PWC | https://paperswithcode.com/paper/extractive-summarization-using-multi-task |
Repo | |
Framework | |
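A structural sketch of the multi-task setup described above: a shared document encoder with one head scoring sentences for extraction and another classifying the document. The sizes, the GRU encoder over precomputed sentence vectors, and the equal loss weighting are placeholders, not the paper's configuration (which also uses curriculum learning).

```python
import torch
import torch.nn as nn

class MultiTaskSummarizer(nn.Module):
    """Shared encoder with a sentence-extraction head and a document-classification head."""
    def __init__(self, sent_dim=128, hidden=128, n_doc_classes=4):
        super().__init__()
        self.doc_rnn = nn.GRU(sent_dim, hidden, batch_first=True, bidirectional=True)
        self.extract = nn.Linear(2 * hidden, 1)               # per-sentence "include in summary?" score
        self.classify = nn.Linear(2 * hidden, n_doc_classes)  # document-level label

    def forward(self, sent_vecs):                  # sent_vecs: (B, S, sent_dim)
        h, _ = self.doc_rnn(sent_vecs)             # (B, S, 2*hidden)
        ext_logits = self.extract(h).squeeze(-1)   # (B, S)
        doc_logits = self.classify(h.mean(dim=1))  # (B, n_doc_classes)
        return ext_logits, doc_logits

model = MultiTaskSummarizer()
sents = torch.randn(2, 10, 128)                    # 2 documents, 10 sentence vectors each
ext, doc = model(sents)

# Joint loss: both tasks update the shared encoder, as in the multi-task framework above.
ext_labels = torch.randint(0, 2, (2, 10)).float()
doc_labels = torch.randint(0, 4, (2,))
loss = nn.BCEWithLogitsLoss()(ext, ext_labels) + nn.CrossEntropyLoss()(doc, doc_labels)
loss.backward()
```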
FA3L at SemEval-2017 Task 3: A ThRee Embeddings Recurrent Neural Network for Question Answering
Title | FA3L at SemEval-2017 Task 3: A ThRee Embeddings Recurrent Neural Network for Question Answering |
Authors | Giuseppe Attardi, Antonio Carta, Federico Errica, Andrea Madotto, Ludovica Pannitto |
Abstract | In this paper we present ThReeNN, a model for Community Question Answering, Task 3, of SemEval-2017. The proposed model exploits both syntactic and semantic information to build a single and meaningful embedding space. Using a dependency parser in combination with word embeddings, the model creates sequences of inputs for a Recurrent Neural Network, which are then used for the ranking purposes of the Task. The score obtained on the official test data shows promising results. |
Tasks | Community Question Answering, Dependency Parsing, Model Selection, Question Answering, Question Similarity, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2048/ |
PWC | https://paperswithcode.com/paper/fa3l-at-semeval-2017-task-3-a-three |
Repo | |
Framework | |
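A hedged sketch of the general idea above: combine word embeddings with dependency information as input to a recurrent encoder and score question-candidate pairs for ranking. The use of relation-label embeddings, the cosine scoring, and all sizes are assumptions made for illustration, not the ThReeNN architecture.

```python
import torch
import torch.nn as nn

class PairRanker(nn.Module):
    """Encode the original question and a candidate with a shared GRU over
    word + dependency-relation embeddings, then score the pair by cosine similarity."""
    def __init__(self, vocab=5000, n_deprels=40, dim=64):
        super().__init__()
        self.word = nn.Embedding(vocab, dim)
        self.dep = nn.Embedding(n_deprels, dim // 4)
        self.rnn = nn.GRU(dim + dim // 4, dim, batch_first=True)

    def encode(self, word_ids, dep_ids):
        x = torch.cat([self.word(word_ids), self.dep(dep_ids)], dim=-1)
        _, h = self.rnn(x)
        return h[-1]                                   # (B, dim)

    def forward(self, q_words, q_deps, c_words, c_deps):
        q, c = self.encode(q_words, q_deps), self.encode(c_words, c_deps)
        return nn.functional.cosine_similarity(q, c)   # one ranking score per pair

model = PairRanker()
B, T = 3, 8
score = model(torch.randint(0, 5000, (B, T)), torch.randint(0, 40, (B, T)),
              torch.randint(0, 5000, (B, T)), torch.randint(0, 40, (B, T)))
print(score.shape)   # torch.Size([3])
```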
Turning Distributional Thesauri into Word Vectors for Synonym Extraction and Expansion
Title | Turning Distributional Thesauri into Word Vectors for Synonym Extraction and Expansion |
Authors | Olivier Ferret |
Abstract | In this article, we propose to investigate a new problem consisting in turning a distributional thesaurus into dense word vectors. We propose more precisely a method for performing such task by associating graph embedding and distributed representation adaptation. We have applied and evaluated it for English nouns at a large scale about its ability to retrieve synonyms. In this context, we have also illustrated the interest of the developed method for three different tasks: the improvement of already existing word embeddings, the fusion of heterogeneous representations and the expansion of synsets. |
Tasks | Graph Embedding, Semantic Textual Similarity, Word Embeddings |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1028/ |
PWC | https://paperswithcode.com/paper/turning-distributional-thesauri-into-word |
Repo | |
Framework | |
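One way to make "turning a distributional thesaurus into dense vectors via graph embedding" concrete is a DeepWalk-style stand-in: random walks over the weighted neighbour graph, fed to skip-gram. This is an illustrative substitute (assuming gensim is available), not the graph-embedding method evaluated in the paper.

```python
import random
from gensim.models import Word2Vec

# Toy distributional thesaurus: each word's nearest neighbours with similarity weights.
thesaurus = {
    "car": [("automobile", 0.8), ("truck", 0.6), ("vehicle", 0.7)],
    "automobile": [("car", 0.8), ("vehicle", 0.6)],
    "truck": [("car", 0.6), ("vehicle", 0.5)],
    "vehicle": [("car", 0.7), ("automobile", 0.6), ("truck", 0.5)],
}

def random_walks(graph, n_walks=20, walk_len=8, seed=0):
    """DeepWalk-style walks over the thesaurus graph, biased by similarity weights."""
    rng = random.Random(seed)
    walks = []
    for _ in range(n_walks):
        for start in graph:
            node, walk = start, [start]
            for _ in range(walk_len - 1):
                nbrs, weights = zip(*graph[node])
                node = rng.choices(nbrs, weights=weights, k=1)[0]
                walk.append(node)
            walks.append(walk)
    return walks

# Treat the walks as "sentences" and learn dense vectors with skip-gram.
model = Word2Vec(random_walks(thesaurus), vector_size=32, window=3, min_count=1, sg=1, epochs=20, seed=0)
print(model.wv.most_similar("car", topn=2))
```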
Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings
Title | Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings |
Authors | Sina Zarrieß, David Schlangen |
Abstract | There has recently been a lot of work trying to use images of referents of words for improving vector space meaning representations derived from text. We investigate the opposite direction, as it were, trying to improve visual word predictors that identify objects in images, by exploiting distributional similarity information during training. We show that for certain words (such as entry-level nouns or hypernyms), we can indeed learn better referential word meanings by taking into account their semantic similarity to other words. For other words, there is no or even a detrimental effect, compared to a learning setup that presents even semantically related objects as negative instances. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2014/ |
PWC | https://paperswithcode.com/paper/is-this-a-child-a-girl-or-a-car-exploring-the |
Repo | |
Framework | |
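A toy version of the setup above: per-word "words-as-classifiers" predictors over image-region features, where regions referred to by distributionally similar words are withheld from the negatives instead of being treated as negative instances. The random features, the similarity table, and the threshold are all invented for the sketch; the paper's training regime differs in detail.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: image-region feature vectors and the word used to refer to each region.
feats = rng.normal(size=(200, 16))
ref_words = rng.choice(["girl", "child", "car", "dog"], size=200)

# Placeholder distributional similarities to the target word "girl".
sim_to_girl = {"girl": 1.0, "child": 0.8, "dog": 0.2, "car": 0.05}

def train_word_classifier(target, sim, sim_threshold=0.5):
    """Binary referential classifier; regions named by words distributionally similar
    to the target are excluded from the negative instances."""
    pos = ref_words == target
    near = np.array([sim[w] >= sim_threshold for w in ref_words])
    keep = pos | ~near                     # drop near-synonym regions from the negatives
    clf = LogisticRegression(max_iter=1000)
    clf.fit(feats[keep], pos[keep].astype(int))
    return clf

girl_clf = train_word_classifier("girl", sim_to_girl)
print(girl_clf.predict_proba(feats[:3])[:, 1])   # P(region is describable as "girl")
```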