July 26, 2019

Paper Group NANR 107

Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing. Neural Temporal Relation Extraction. Deception Detection for the Russian Language: Lexical and Syntactic Parameters. Understanding Non-Native Writings: Can a Parser Help?. XMU Neural Machine Translation Systems for WAT 2017. Depictives in Engl …

Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing

Title Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing
Authors Atsushi Fujita, Eiichiro Sumita
Abstract Aiming at facilitating the research on quality estimation (QE) and automatic post-editing (APE) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and Korean translations. As the source text, actual utterances in Japanese were extracted from the log data of our speech translation service. MT outputs were then given by phrase-based statistical MT systems. Finally, human evaluators were employed to grade the quality of MT outputs and to post-edit them. This paper describes the characteristics of the created datasets and reports on our benchmarking experiments on word-level QE, sentence-level QE, and APE conducted using the created datasets.
Tasks Automatic Post-Editing, Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5705/
PDF https://www.aclweb.org/anthology/W17-5705
PWC https://paperswithcode.com/paper/japanese-to-englishchinesekorean-datasets-for

Neural Temporal Relation Extraction

Title Neural Temporal Relation Extraction
Authors Dmitriy Dligach, Timothy Miller, Chen Lin, Steven Bethard, Guergana Savova
Abstract We experiment with neural architectures for temporal relation extraction and establish a new state-of-the-art for several scenarios. We find that neural models with only tokens as input outperform state-of-the-art hand-engineered feature-based models, that convolutional neural networks outperform LSTM models, and that encoding relation arguments with XML tags outperforms a traditional position-based encoding.
Tasks Relation Classification, Relation Extraction, Temporal Information Extraction
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2118/
PDF https://www.aclweb.org/anthology/E17-2118
PWC https://paperswithcode.com/paper/neural-temporal-relation-extraction
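
The abstract above contrasts two ways of telling a token-based model where the relation arguments are: inline XML tags versus position-based features. The following is a minimal sketch of both encodings, not the authors' code. The tag names `e` and `t`, the function names, and the example sentence are assumptions made for illustration.

```python
def mark_arguments(tokens, spans):
    """Wrap each argument span in XML-style tags inside the token sequence.

    spans maps a tag name to (start, end) token indices, end exclusive.
    """
    marked = []
    for idx, token in enumerate(tokens):
        for tag, (start, _end) in spans.items():
            if idx == start:
                marked.append(f"<{tag}>")
        marked.append(token)
        for tag, (_start, end) in spans.items():
            if idx == end - 1:
                marked.append(f"</{tag}>")
    return marked


def position_features(tokens, span):
    """Relative distance of every token to one argument span (0 inside it)."""
    start, end = span
    features = []
    for idx in range(len(tokens)):
        if idx < start:
            features.append(idx - start)    # negative: token precedes the span
        elif idx >= end:
            features.append(idx - end + 1)  # positive: token follows the span
        else:
            features.append(0)
    return features


tokens = "She was admitted on Monday".split()
tagged = mark_arguments(tokens, {"e": (2, 3), "t": (4, 5)})
distances = position_features(tokens, (2, 3))
```

With the tag encoding, the model input is a single token sequence with extra marker tokens. With the position encoding, each token needs one extra feature per argument.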

Deception Detection for the Russian Language: Lexical and Syntactic Parameters

Title Deception Detection for the Russian Language: Lexical and Syntactic Parameters
Authors Dina Pisarevskaya, Tatiana Litvinova, Olga Litvinova
Abstract The field of automated deception detection in written texts is methodologically challenging. Different linguistic levels (lexics, syntax and semantics) are basically used for different types of English texts to reveal if they are truthful or deceptive. Such parameters as POS tags and POS tags n-grams, punctuation marks, sentiment polarity of words, psycholinguistic features, fragments of syntactic structures are taken into consideration. The importance of different types of parameters was not compared for the Russian language before and should be investigated before moving to complex models and higher levels of linguistic processing. On the example of the Russian Deception Bank Corpus we estimate the impact of three groups of features (POS features including bigrams, sentiment and psycholinguistic features, syntax and readability features) on the successful deception detection and find out that POS features can be used for binary text classification, but the results should be double-checked and, if possible, improved.
Tasks Deception Detection, Information Retrieval, Text Classification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-7701/
PDF https://doi.org/10.26615/978-954-452-038-0_001
PWC https://paperswithcode.com/paper/deception-detection-for-the-russian-language

Understanding Non-Native Writings: Can a Parser Help?

Title Understanding Non-Native Writings: Can a Parser Help?
Authors Jirka Hana, Barbora Hladká
Abstract We present a pilot study on parsing non-native texts written by learners of Czech. We performed experiments that have shown that at least high-level syntactic functions, like subject, predicate, and object, can be assigned based on a parser trained on standard native language.
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-5902/
PDF https://www.aclweb.org/anthology/W17-5902
PWC https://paperswithcode.com/paper/understanding-non-native-writings-can-a

XMU Neural Machine Translation Systems for WAT 2017

Title XMU Neural Machine Translation Systems for WAT 2017
Authors Boli Wang, Zhixing Tan, Jinming Hu, Yidong Chen, Xiaodong Shi
Abstract This paper describes the Neural Machine Translation systems of Xiamen University for the shared translation tasks of WAT 2017. Our systems are based on the Encoder-Decoder framework with attention. We participated in three subtasks. We experimented with subword segmentation, synthetic training data, and model ensembling. Experiments show that all these methods can give substantial improvements.
Tasks Machine Translation, Word Embeddings
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5707/
PDF https://www.aclweb.org/anthology/W17-5707
PWC https://paperswithcode.com/paper/xmu-neural-machine-translation-systems-for-1

Depictives in English: An LTAG Approach

Title Depictives in English: An LTAG Approach
Authors Benjamin Burkhardt, Timm Lichte, Laura Kallmeyer
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6203/
PDF https://www.aclweb.org/anthology/W17-6203
PWC https://paperswithcode.com/paper/depictives-in-english-an-ltag-approach

Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph

Title Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph
Authors Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
Tasks Reading Comprehension
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-4018/
PDF https://www.aclweb.org/anthology/P17-4018
PWC https://paperswithcode.com/paper/swanson-linking-revisited-accelerating

Connecting Documentation and Revitalization: A New Approach to Language Apps

Title Connecting Documentation and Revitalization: A New Approach to Language Apps
Authors Alexa N. Little
Published 2017-03-01
URL https://www.aclweb.org/anthology/W17-0120/
PDF https://www.aclweb.org/anthology/W17-0120
PWC https://paperswithcode.com/paper/connecting-documentation-and-revitalization-a

MultiNews: A Web collection of an Aligned Multimodal and Multilingual Corpus

Title MultiNews: A Web collection of an Aligned Multimodal and Multilingual Corpus
Authors Haithem Afli, Pintu Lohar, Andy Way
Abstract Integrating Natural Language Processing (NLP) and computer vision is a promising effort. However, the applicability of these methods directly depends on the availability of a specific multimodal data that includes images and texts. In this paper, we present a collection of a Multimodal corpus of comparable texts and their images in 9 languages from the web news articles of Euronews website. This corpus has found widespread use in the NLP community in Multilingual and multimodal tasks. Here, we focus on its acquisition of the images and text data and their multilingual alignment.
Tasks Content-Based Image Retrieval, Image Retrieval, Machine Translation, Multimodal Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5602/
PDF https://www.aclweb.org/anthology/W17-5602
PWC https://paperswithcode.com/paper/multinews-a-web-collection-of-an-aligned

Idiom Type Identification with Smoothed Lexical Features and a Maximum Margin Classifier

Title Idiom Type Identification with Smoothed Lexical Features and a Maximum Margin Classifier
Authors Giancarlo Salton, Robert Ross, John Kelleher
Abstract In our work we address limitations in the state-of-the-art in idiom type identification. We investigate different approaches for a lexical fixedness metric, a component of the state-of-the-art model. We also show that our Machine Learning based approach to the idiom type identification task achieves an F1-score of 0.85, an improvement of 11 points over the state-of-the-art.
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1083/
PDF https://doi.org/10.26615/978-954-452-049-6_083
PWC https://paperswithcode.com/paper/idiom-type-identification-with-smoothed

Beyond Words: Deep Learning for Multiword Expressions and Collocations

Title Beyond Words: Deep Learning for Multiword Expressions and Collocations
Authors Valia Kordoni
Abstract Deep learning has recently shown much promise for NLP applications. Traditionally, in most NLP approaches, documents or sentences are represented by a sparse bag-of-words representation. There is now a lot of work which goes beyond this by adopting a distributed representation of words, by constructing a so-called "neural embedding" or vector space representation of each word or document. The aim of this tutorial is to go beyond the learning of word vectors and present methods for learning vector representations for Multiword Expressions and bilingual phrase pairs, all of which are useful for various NLP applications. This tutorial aims to provide attendees with a clear notion of the linguistic and distributional characteristics of Multiword Expressions (MWEs), their relevance for the intersection of deep learning and natural language processing, what methods and resources are available to support their use, and what more could be done in the future. Our target audience are researchers and practitioners in machine learning, parsing (syntactic and semantic) and language technology, not necessarily experts in MWEs, who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication.
Tasks Information Retrieval, Machine Translation
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-5005/
PDF https://www.aclweb.org/anthology/P17-5005
PWC https://paperswithcode.com/paper/beyond-words-deep-learning-for-multiword

SMT reranked NMT

Title SMT reranked NMT
Authors Terumasa Ehara
Abstract System architecture, experimental settings and experimental results of the EHR team for the WAT2017 tasks are described. We participate in three tasks: JPCen-ja, JPCzh-ja and JPCko-ja. Although the basic architecture of our system is NMT, reranking is performed using SMT results. One of the major drawbacks of NMT is under-translation and over-translation. On the other hand, SMT infrequently makes such translations. So, by reranking the n-best NMT outputs using the SMT output, such translations can be expected to be discarded. We improve the BLEU score from 46.03 to 47.08 with this technique in the JPCzh-ja task.
Tasks Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5710/
PDF https://www.aclweb.org/anthology/W17-5710
PWC https://paperswithcode.com/paper/smt-reranked-nmt
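
The reranking idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the EHR system: the abstract does not say how NMT hypotheses are compared with the SMT output, so a simple unigram-overlap F1 score stands in for that measure, and the function names and example sentences are hypothetical.

```python
from collections import Counter


def overlap_f1(candidate, reference):
    """Unigram-overlap F1 between two whitespace-tokenized sentences.

    Assumed similarity measure; the source does not specify one.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    common = sum((cand_counts & ref_counts).values())
    if common == 0:
        return 0.0
    precision = common / sum(cand_counts.values())
    recall = common / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def rerank(nbest, smt_output):
    """Order NMT n-best hypotheses by similarity to the SMT output."""
    return sorted(nbest, key=lambda hyp: overlap_f1(hyp, smt_output), reverse=True)


nbest = [
    "the cat sat",                     # drops content (under-translation)
    "the cat sat on the mat mat mat",  # repeats content (over-translation)
    "the cat sat on the mat",
]
reranked = rerank(nbest, "the cat sat on the mat")
```

A hypothesis that drops words loses recall against the SMT output, and one that repeats words loses precision, so both fall below a hypothesis of matching length and content.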

Translation Synchronization via Truncated Least Squares

Title Translation Synchronization via Truncated Least Squares
Authors Xiangru Huang, Zhenxiao Liang, Chandrajit Bajaj, Qixing Huang
Abstract In this paper, we introduce a robust algorithm, TranSync, for the 1D translation synchronization problem, in which the aim is to recover the global coordinates of a set of nodes from noisy measurements of relative coordinates along an observation graph. The basic idea of TranSync is to apply truncated least squares, where the solution at each step is used to gradually prune out noisy measurements. We analyze TranSync under both deterministic and randomized noisy models, demonstrating its robustness and stability. Experimental results on synthetic and real datasets show that TranSync is superior to state-of-the-art convex formulations in terms of both efficiency and accuracy.
Published 2017-12-01
URL http://papers.nips.cc/paper/6744-translation-synchronization-via-truncated-least-squares
PDF http://papers.nips.cc/paper/6744-translation-synchronization-via-truncated-least-squares.pdf
PWC https://paperswithcode.com/paper/translation-synchronization-via-truncated
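
The truncated least squares idea described in the abstract above can be illustrated with a small pure-Python sketch. It is not the authors' TranSync implementation. The geometric threshold schedule (`shrink`), the stopping value (`min_threshold`), anchoring node 0 at coordinate 0, and all function names are assumptions made for this example. Each measurement `(i, j, t)` is read as a noisy observation of `x[j] - x[i]`.

```python
def gauss_solve(aug):
    """Solve an augmented linear system [M | b] by Gaussian elimination."""
    m = len(aug)
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: observation graph is disconnected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        rest = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - rest) / aug[r][r]
    return x


def least_squares(n, edges):
    """Minimize sum (x[j] - x[i] - t)^2 over all edges, with x[0] fixed to 0."""
    m = n - 1
    aug = [[0.0] * (m + 1) for _ in range(m)]
    for i, j, t in edges:
        # Each edge contributes to the normal equations of both endpoints.
        for node, sign in ((j, 1.0), (i, -1.0)):
            if node == 0:
                continue  # node 0 is the anchor and has no unknown
            row = aug[node - 1]
            if j != 0:
                row[j - 1] += sign
            if i != 0:
                row[i - 1] -= sign
            row[m] += sign * t
    return [0.0] + gauss_solve(aug)


def residual(x, edge):
    i, j, t = edge
    return abs(x[j] - x[i] - t)


def truncated_least_squares(n, edges, shrink=0.5, min_threshold=1e-6, max_iter=50):
    """Alternate least-squares solves with pruning of large-residual edges."""
    kept = list(edges)
    x = least_squares(n, kept)
    threshold = max(residual(x, e) for e in kept)
    for _ in range(max_iter):
        threshold *= shrink
        if threshold < min_threshold:
            break
        pruned = [e for e in kept if residual(x, e) <= threshold]
        try:
            x = least_squares(n, pruned)
        except ValueError:
            break  # pruning further would disconnect the graph; keep last solution
        kept = pruned
    return x


# Complete graph on 5 nodes with true coordinates 0..4 and one corrupted edge.
edges = [(i, j, float(j - i)) for i in range(5) for j in range(i + 1, 5)]
edges[3] = (0, 4, 14.0)  # the true offset is 4.0
recovered = truncated_least_squares(5, edges)
```

In this toy setup a single least-squares solve spreads the corrupted measurement across several nodes, while the truncated variant discards the edge after the first pruning step. With noisy inlier measurements, `min_threshold` would need to sit near the expected noise level so that good edges are not pruned.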

Contextual Hyperedge Replacement Grammars for Abstract Meaning Representations

Title Contextual Hyperedge Replacement Grammars for Abstract Meaning Representations
Authors Frank Drewes, Anna Jonsson
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6211/
PDF https://www.aclweb.org/anthology/W17-6211
PWC https://paperswithcode.com/paper/contextual-hyperedge-replacement-grammars-for

YNU-HPCC at IJCNLP-2017 Task 4: Attention-based Bi-directional GRU Model for Customer Feedback Analysis Task of English

Title YNU-HPCC at IJCNLP-2017 Task 4: Attention-based Bi-directional GRU Model for Customer Feedback Analysis Task of English
Authors Nan Wang, Jin Wang, Xuejie Zhang
Abstract This paper describes our submission to IJCNLP 2017 shared task 4, for predicting the tags of unseen customer feedback sentences, such as comments, complaints, bugs, requests, and meaningless and undetermined statements. With the use of a neural network, a large number of deep learning methods have been developed, which perform very well on text classification. Our ensemble classification model is based on a bi-directional gated recurrent unit and an attention mechanism which shows a 3.8% improvement in classification accuracy. To enhance the model performance, we also compared it with several word-embedding models. The comparative results show that a combination of both word2vec and GloVe achieves the best performance.
Tasks Question Answering, Representation Learning, Sentiment Analysis, Text Classification
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4029/
PDF https://www.aclweb.org/anthology/I17-4029
PWC https://paperswithcode.com/paper/ynu-hpcc-at-ijcnlp-2017-task-4-attention