July 26, 2019


Paper Group NANR 172

ISI at the SIGMORPHON 2017 Shared Task on Morphological Reinflection. Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task. Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands. Exploring Optimal Voting in Native Language Identification. Sub-character Neural Language Modelling …

ISI at the SIGMORPHON 2017 Shared Task on Morphological Reinflection

Title ISI at the SIGMORPHON 2017 Shared Task on Morphological Reinflection
Authors Abhisek Chakrabarty, Utpal Garain
Abstract
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-2006/
PDF https://www.aclweb.org/anthology/K17-2006
PWC https://paperswithcode.com/paper/isi-at-the-sigmorphon-2017-shared-task-on
Repo
Framework

Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task

Title Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task
Authors Rajen Chatterjee, M. Amin Farajian, Matteo Negri, Marco Turchi, Ankit Srivastava, Santanu Pal
Abstract
Tasks Automatic Post-Editing, Language Modelling, Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4773/
PDF https://www.aclweb.org/anthology/W17-4773
PWC https://paperswithcode.com/paper/multi-source-neural-automatic-post-editing
Repo
Framework

Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands

Title Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands
Authors Muhannad Alomari, Paul Duckworth, Majd Hawasly, David C. Hogg, Anthony G. Cohn
Abstract We present a cognitively plausible system capable of acquiring knowledge in language and vision from pairs of short video clips and linguistic descriptions. The aim of this work is to teach a robot manipulator how to execute natural language commands by demonstration. This is achieved by first learning a set of visual 'concepts' that abstract the visual feature spaces into concepts that have human-level meaning. Second, learning the mapping/grounding between words and the extracted visual concepts. Third, inducing grammar rules via a semantic representation known as Robot Control Language (RCL). We evaluate our approach against state-of-the-art supervised and unsupervised grounding and grammar induction systems, and show that a robot can learn to execute never-seen-before commands from pairs of unlabelled linguistic and visual inputs.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2805/
PDF https://www.aclweb.org/anthology/W17-2805
PWC https://paperswithcode.com/paper/natural-language-grounding-and-grammar
Repo
Framework

Exploring Optimal Voting in Native Language Identification

Title Exploring Optimal Voting in Native Language Identification
Authors Cyril Goutte, Serge Léger
Abstract We describe the submissions entered by the National Research Council Canada in the NLI-2017 evaluation. We mainly explored the use of voting, and various ways to optimize the choice and number of voting systems. We also explored the use of features that rely on no linguistic preprocessing. Long n-grams of characters obtained from raw text turned out to yield the best performance on all textual input (written essays and speech transcripts). Voting ensembles turned out to produce small performance gains, with little difference between the various optimization strategies we tried. Our top systems achieved accuracies of 87% on the essay track, 84% on the speech track, and close to 92% by combining essays, speech and i-vectors in the fusion track.
Tasks Language Identification, Native Language Identification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5041/
PDF https://www.aclweb.org/anthology/W17-5041
PWC https://paperswithcode.com/paper/exploring-optimal-voting-in-native-language
Repo
Framework
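The abstract above combines independent classifiers by voting over predictions from systems built on raw character n-grams. A minimal sketch of those two ingredients (system outputs, labels, and the example text are invented for illustration, not taken from the paper):

```python
from collections import Counter

def char_ngrams(text, n):
    """Extract overlapping character n-grams from raw text,
    with no linguistic preprocessing."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def majority_vote(predictions):
    """Combine per-system L1-label predictions by plurality voting."""
    return Counter(predictions).most_common(1)[0][0]

# Toy illustration: three hypothetical systems vote on one essay's L1 label.
votes = ["DEU", "FRA", "DEU"]
print(majority_vote(votes))       # DEU
print(char_ngrams("native", 3))   # ['nat', 'ati', 'tiv', 'ive']
```

The paper's actual contribution lies in how the set and number of voting systems is optimized; the plurality rule above is only the simplest baseline combiner.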

Sub-character Neural Language Modelling in Japanese

Title Sub-character Neural Language Modelling in Japanese
Authors Viet Nguyen, Julian Brooke, Timothy Baldwin
Abstract In East Asian languages such as Japanese and Chinese, the semantics of a character are (somewhat) reflected in its sub-character elements. This paper examines the effect of using sub-characters for language modeling in Japanese. This is achieved by decomposing characters according to a range of character decomposition datasets, and training a neural language model over variously decomposed character representations. Our results indicate that language modelling can be improved through the inclusion of sub-characters, though this result depends on a good choice of decomposition dataset and the appropriate granularity of decomposition.
Tasks Language Modelling
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4122/
PDF https://www.aclweb.org/anthology/W17-4122
PWC https://paperswithcode.com/paper/sub-character-neural-language-modelling-in
Repo
Framework
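The decomposition step described above maps each character to sub-character elements before language-model training. A minimal sketch of that preprocessing (the component table below is illustrative only; the paper evaluates a range of real decomposition datasets at several granularities):

```python
# Hypothetical decomposition table: kanji -> component list.
# Real systems draw these entries from character decomposition datasets.
DECOMP = {"語": ["言", "五", "口"], "木": ["木"]}

def decompose(text, table):
    """Replace each character with its sub-character components,
    falling back to the character itself when no entry exists."""
    out = []
    for ch in text:
        out.extend(table.get(ch, [ch]))
    return out

print(decompose("語木", DECOMP))  # ['言', '五', '口', '木']
```

A neural language model is then trained over the resulting component sequences instead of (or alongside) raw character sequences.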

Annotation of Entities and Relations in Spanish Radiology Reports

Title Annotation of Entities and Relations in Spanish Radiology Reports
Authors Viviana Cotik, Darío Filippo, Roland Roller, Hans Uszkoreit, Feiyu Xu
Abstract Radiology reports express the results of a radiology study and contain information about anatomical entities, findings, measures and impressions of the medical doctor. The use of information extraction techniques can help physicians to access this information in order to understand data and to infer further knowledge. Supervised machine learning methods are very popular to address information extraction, but are usually domain and language dependent. To train new classification models, annotated data is required. Moreover, annotated data is also required as an evaluation resource of information extraction algorithms. However, one major drawback of processing clinical data is the low availability of annotated datasets. For this reason we performed a manual annotation of radiology reports written in Spanish. This paper presents the corpus, the annotation schema, the annotation guidelines and further insight of the data.
Tasks Named Entity Recognition, Relation Extraction
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1025/
PDF https://doi.org/10.26615/978-954-452-049-6_025
PWC https://paperswithcode.com/paper/annotation-of-entities-and-relations-in
Repo
Framework

Improving Evaluation of Document-level Machine Translation Quality Estimation

Title Improving Evaluation of Document-level Machine Translation Quality Estimation
Authors Yvette Graham, Qingsong Ma, Timothy Baldwin, Qun Liu, Carla Parra, Carolina Scarton
Abstract Meaningful conclusions about the relative performance of NLP systems are only possible if the gold standard employed in a given evaluation is both valid and reliable. In this paper, we explore the validity of human annotations currently employed in the evaluation of document-level quality estimation for machine translation (MT). We demonstrate the degree to which MT system rankings are dependent on weights employed in the construction of the gold standard, before proposing direct human assessment as a valid alternative. Experiments show direct assessment (DA) scores for documents to be highly reliable, achieving a correlation of above 0.9 in a self-replication experiment, in addition to a substantial estimated cost reduction through quality controlled crowd-sourcing. The original gold standard based on post-edits incurs a 10–20 times greater cost than DA.
Tasks Machine Translation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2057/
PDF https://www.aclweb.org/anthology/E17-2057
PWC https://paperswithcode.com/paper/improving-evaluation-of-document-level
Repo
Framework

Using a Graph-based Coherence Model in Document-Level Machine Translation

Title Using a Graph-based Coherence Model in Document-Level Machine Translation
Authors Leo Born, Mohsen Mesgar, Michael Strube
Abstract Although coherence is an important aspect of any text generation system, it has received little attention in the context of machine translation (MT) so far. We hypothesize that the quality of document-level translation can be improved if MT models take into account the semantic relations among sentences during translation. We integrate the graph-based coherence model proposed by Mesgar and Strube (2016) with Docent (Hardmeier et al., 2012; Hardmeier, 2014), a document-level machine translation system. The application of this graph-based coherence modeling approach is novel in the context of machine translation. We evaluate the coherence model and its effects on the quality of the machine translation. The result of our experiments shows that our coherence model slightly improves the quality of translation in terms of the average Meteor score.
Tasks Machine Translation, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4803/
PDF https://www.aclweb.org/anthology/W17-4803
PWC https://paperswithcode.com/paper/using-a-graph-based-coherence-model-in
Repo
Framework

A Computational Study on Word Meanings and Their Distributed Representations via Polymodal Embedding

Title A Computational Study on Word Meanings and Their Distributed Representations via Polymodal Embedding
Authors Joohee Park, Sung-hyon Myaeng
Abstract A distributed representation has become a popular approach to capturing a word meaning. Besides its success and practical value, however, questions arise about the relationships between a true word meaning and its distributed representation. In this paper, we examine such a relationship via a polymodal embedding approach inspired by the theory that humans tend to use diverse sources in developing a word meaning. The result suggests that existing embeddings fail to capture certain aspects of word meanings, which can be significantly improved by the polymodal approach. Also, we show distinct characteristics of different types of words (e.g. concreteness) via computational studies. Finally, we show our proposed embedding method outperforms the baselines in the word similarity measure tasks and the hypernym prediction tasks.
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1022/
PDF https://www.aclweb.org/anthology/I17-1022
PWC https://paperswithcode.com/paper/a-computational-study-on-word-meanings-and
Repo
Framework

Learning to Generate Product Reviews from Attributes

Title Learning to Generate Product Reviews from Attributes
Authors Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, Ke Xu
Abstract Automatically generating product reviews is a meaningful, yet not well-studied task in sentiment analysis. Traditional natural language generation methods rely extensively on hand-crafted rules and predefined templates. This paper presents an attention-enhanced attribute-to-sequence model to generate product reviews for given attribute information, such as user, product, and rating. The attribute encoder learns to represent input attributes as vectors. Then, the sequence decoder generates reviews by conditioning its output on these vectors. We also introduce an attention mechanism to jointly generate reviews and align words with input attributes. The proposed model is trained end-to-end to maximize the likelihood of target product reviews given the attributes. We build a publicly available dataset for the review generation task by leveraging the Amazon book reviews and their metadata. Experiments on the dataset show that our approach outperforms baseline methods and the attention mechanism significantly improves the performance of our model.
Tasks Sentiment Analysis, Text Generation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1059/
PDF https://www.aclweb.org/anthology/E17-1059
PWC https://paperswithcode.com/paper/learning-to-generate-product-reviews-from
Repo
Framework

Unsupervised Segmentation of Phoneme Sequences based on Pitman-Yor Semi-Markov Model using Phoneme Length Context

Title Unsupervised Segmentation of Phoneme Sequences based on Pitman-Yor Semi-Markov Model using Phoneme Length Context
Authors Ryu Takeda, Kazunori Komatani
Abstract Unsupervised segmentation of phoneme sequences is an essential process to obtain unknown words during spoken dialogues. In this segmentation, an input phoneme sequence without delimiters is converted into segmented sub-sequences corresponding to words. The Pitman-Yor semi-Markov model (PYSMM) is promising for this problem, but its performance degrades when it is applied to phoneme-level word segmentation. This is because of insufficient cues for the segmentation, e.g., homophones are improperly treated as single entries and their different contexts are also confused. We propose a phoneme-length context model for PYSMM to give a helpful cue at the phoneme-level and to predict succeeding segments more accurately. Our experiments showed that the peak performance with our context model outperformed those without such a context model by 0.045 at most in terms of F-measures of estimated segmentation.
Tasks Spoken Dialogue Systems
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1025/
PDF https://www.aclweb.org/anthology/I17-1025
PWC https://paperswithcode.com/paper/unsupervised-segmentation-of-phoneme
Repo
Framework
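The segmentation problem described above is to split an undelimited phoneme string into word-like units. As a rough stand-in for the paper's PYSMM (which learns its word model nonparametrically), the sketch below runs a Viterbi search under a fixed unigram word model; the lexicon entries and probabilities are invented for illustration:

```python
import math

# Toy lexicon of word probabilities over phoneme strings; a real PYSMM
# infers this distribution from data rather than fixing it in advance.
LEXICON = {"hai": 0.4, "sou": 0.3, "desu": 0.2, "ha": 0.05, "i": 0.05}

def segment(seq, max_len=4):
    """Viterbi search for the most probable segmentation of `seq`
    under a unigram word model."""
    n = len(seq)
    best = [(-math.inf, -1)] * (n + 1)   # (log-prob, backpointer)
    best[0] = (0.0, -1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            w = seq[i:j]
            if w in LEXICON and best[i][0] > -math.inf:
                score = best[i][0] + math.log(LEXICON[w])
                if score > best[j][0]:
                    best[j] = (score, i)
    words, j = [], n                     # backtrack from the end
    while j > 0:
        i = best[j][1]
        words.append(seq[i:j])
        j = i
    return words[::-1]

print(segment("haisoudesu"))  # ['hai', 'sou', 'desu']
```

The paper's phoneme-length context model adds exactly the kind of cue this toy model lacks: information about plausible segment lengths that helps disambiguate homophones and predict succeeding segments.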

Extractive Summarization Using Multi-Task Learning with Document Classification

Title Extractive Summarization Using Multi-Task Learning with Document Classification
Authors Masaru Isonuma, Toru Fujino, Junichiro Mori, Yutaka Matsuo, Ichiro Sakata
Abstract The need for automatic document summarization that can be used for practical applications is increasing rapidly. In this paper, we propose a general framework for summarization that extracts sentences from a document using externally related information. Our work is aimed at single document summarization using small amounts of reference summaries. In particular, we address document summarization in the framework of multi-task learning using curriculum learning for sentence extraction and document classification. The proposed framework enables us to obtain better feature representations to extract sentences from documents. We evaluate our proposed summarization method on two datasets: financial report and news corpus. Experimental results demonstrate that our summarizers achieve performance that is comparable to state-of-the-art systems.
Tasks Document Classification, Document Summarization, Multi-Task Learning
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1223/
PDF https://www.aclweb.org/anthology/D17-1223
PWC https://paperswithcode.com/paper/extractive-summarization-using-multi-task
Repo
Framework

FA3L at SemEval-2017 Task 3: A ThRee Embeddings Recurrent Neural Network for Question Answering

Title FA3L at SemEval-2017 Task 3: A ThRee Embeddings Recurrent Neural Network for Question Answering
Authors Giuseppe Attardi, Antonio Carta, Federico Errica, Andrea Madotto, Ludovica Pannitto
Abstract In this paper we present ThReeNN, a model for Community Question Answering, Task 3, of SemEval-2017. The proposed model exploits both syntactic and semantic information to build a single and meaningful embedding space. Using a dependency parser in combination with word embeddings, the model creates sequences of inputs for a Recurrent Neural Network, which are then used for the ranking purposes of the Task. The score obtained on the official test data shows promising results.
Tasks Community Question Answering, Dependency Parsing, Model Selection, Question Answering, Question Similarity, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2048/
PDF https://www.aclweb.org/anthology/S17-2048
PWC https://paperswithcode.com/paper/fa3l-at-semeval-2017-task-3-a-three
Repo
Framework

Turning Distributional Thesauri into Word Vectors for Synonym Extraction and Expansion

Title Turning Distributional Thesauri into Word Vectors for Synonym Extraction and Expansion
Authors Olivier Ferret
Abstract In this article, we investigate the new problem of turning a distributional thesaurus into dense word vectors. More precisely, we propose a method for this task that combines graph embedding with distributed representation adaptation. We applied and evaluated it at large scale on English nouns, measuring its ability to retrieve synonyms. In this context, we also illustrate the value of the method for three different tasks: improving already existing word embeddings, fusing heterogeneous representations, and expanding synsets.
Tasks Graph Embedding, Semantic Textual Similarity, Word Embeddings
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1028/
PDF https://www.aclweb.org/anthology/I17-1028
PWC https://paperswithcode.com/paper/turning-distributional-thesauri-into-word
Repo
Framework
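The starting point above is a distributional thesaurus: for each noun, a list of neighbours with similarity scores. The crudest way to vectorize such an entry is to read off its row of neighbour scores over a fixed vocabulary; the paper's graph-embedding approach goes well beyond this, so the sketch below (with invented entries and scores) only illustrates the input/output shape of the problem:

```python
import math

# Tiny distributional thesaurus: noun -> nearest neighbours with scores.
THESAURUS = {
    "car": {"automobile": 0.9, "truck": 0.6, "vehicle": 0.5},
    "automobile": {"car": 0.9, "vehicle": 0.7},
    "dog": {"cat": 0.5, "puppy": 0.8},
}

def to_vector(word, vocab):
    """Turn a thesaurus entry into a dense vector over a fixed vocabulary
    (a crude stand-in for the paper's graph-embedding step)."""
    neigh = THESAURUS.get(word, {})
    return [neigh.get(v, 0.0) for v in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

vocab = sorted({w for n in THESAURUS.values() for w in n} | set(THESAURUS))
v_car = to_vector("car", vocab)
v_auto = to_vector("automobile", vocab)
v_dog = to_vector("dog", vocab)
print(cosine(v_car, v_auto) > cosine(v_car, v_dog))  # True
```

Even this naive vectorization preserves some synonym structure ("car" lands closer to "automobile" than to "dog"); the paper's contribution is producing dense vectors that do this far more robustly at large scale.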

Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings

Title Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings
Authors Sina Zarrieß, David Schlangen
Abstract There has recently been a lot of work trying to use images of referents of words for improving vector space meaning representations derived from text. We investigate the opposite direction, as it were, trying to improve visual word predictors that identify objects in images, by exploiting distributional similarity information during training. We show that for certain words (such as entry-level nouns or hypernyms), we can indeed learn better referential word meanings by taking into account their semantic similarity to other words. For other words, there is no or even a detrimental effect, compared to a learning setup that presents even semantically related objects as negative instances.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2014/
PDF https://www.aclweb.org/anthology/E17-2014
PWC https://paperswithcode.com/paper/is-this-a-child-a-girl-or-a-car-exploring-the
Repo
Framework