Paper Group NANR 107
Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing
Title | Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing |
Authors | Atsushi Fujita, Eiichiro Sumita |
Abstract | Aiming at facilitating the research on quality estimation (QE) and automatic post-editing (APE) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and Korean translations. As the source text, actual utterances in Japanese were extracted from the log data of our speech translation service. MT outputs were then given by phrase-based statistical MT systems. Finally, human evaluators were employed to grade the quality of MT outputs and to post-edit them. This paper describes the characteristics of the created datasets and reports on our benchmarking experiments on word-level QE, sentence-level QE, and APE conducted using the created datasets. |
Tasks | Automatic Post-Editing, Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5705/ |
PWC | https://paperswithcode.com/paper/japanese-to-englishchinesekorean-datasets-for |
Repo | |
Framework | |
Neural Temporal Relation Extraction
Title | Neural Temporal Relation Extraction |
Authors | Dmitriy Dligach, Timothy Miller, Chen Lin, Steven Bethard, Guergana Savova |
Abstract | We experiment with neural architectures for temporal relation extraction and establish a new state-of-the-art for several scenarios. We find that neural models with only tokens as input outperform state-of-the-art hand-engineered feature-based models, that convolutional neural networks outperform LSTM models, and that encoding relation arguments with XML tags outperforms a traditional position-based encoding. |
Tasks | Relation Classification, Relation Extraction, Temporal Information Extraction |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2118/ |
PWC | https://paperswithcode.com/paper/neural-temporal-relation-extraction |
Repo | |
Framework | |
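The XML-tag argument encoding that this abstract credits with outperforming a position-based encoding can be sketched as follows; the function name and the `<e1>`/`<e2>` tag names are illustrative, not taken from the paper:

```python
def encode_with_xml_tags(tokens, arg1_span, arg2_span):
    """Mark the two relation arguments in a token sequence with XML-style
    tags, as an alternative to position-based argument encodings.
    Spans are (start, end) token indices, inclusive."""
    out = []
    for i, tok in enumerate(tokens):
        if i == arg1_span[0]:
            out.append("<e1>")
        if i == arg2_span[0]:
            out.append("<e2>")
        out.append(tok)
        if i == arg1_span[1]:
            out.append("</e1>")
        if i == arg2_span[1]:
            out.append("</e2>")
    return out
```

The tagged sequence can then be fed to a CNN or LSTM as ordinary tokens, with the tags entering the vocabulary like any other word.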
Deception Detection for the Russian Language: Lexical and Syntactic Parameters
Title | Deception Detection for the Russian Language: Lexical and Syntactic Parameters |
Authors | Dina Pisarevskaya, Tatiana Litvinova, Olga Litvinova |
Abstract | The field of automated deception detection in written texts is methodologically challenging. Different linguistic levels (lexis, syntax and semantics) are typically used for different types of English texts to reveal whether they are truthful or deceptive. Such parameters as POS tags and POS-tag n-grams, punctuation marks, sentiment polarity of words, psycholinguistic features, and fragments of syntactic structures are taken into consideration. The importance of different types of parameters has not previously been compared for the Russian language and should be investigated before moving to complex models and higher levels of linguistic processing. Using the Russian Deception Bank Corpus, we estimate the impact of three groups of features (POS features including bigrams; sentiment and psycholinguistic features; syntax and readability features) on successful deception detection and find that POS features can be used for binary text classification, but the results should be double-checked and, if possible, improved. |
Tasks | Deception Detection, Information Retrieval, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-7701/ |
https://doi.org/10.26615/978-954-452-038-0_001 | |
PWC | https://paperswithcode.com/paper/deception-detection-for-the-russian-language |
Repo | |
Framework | |
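As a concrete illustration of the POS-feature group the abstract compares, a minimal POS-tag n-gram counter might look like this (a sketch only; the paper's exact feature pipeline is not specified here):

```python
from collections import Counter

def pos_ngram_features(pos_tags, n=2):
    """Count POS-tag n-grams (default bigrams) over a tagged sentence.
    The resulting multiset can be vectorized for a binary classifier."""
    return Counter(tuple(pos_tags[i:i + n]) for i in range(len(pos_tags) - n + 1))
```

In practice the counts would be collected per document and fed to a standard classifier such as an SVM or logistic regression.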
Understanding Non-Native Writings: Can a Parser Help?
Title | Understanding Non-Native Writings: Can a Parser Help? |
Authors | Jirka Hana, Barbora Hladká |
Abstract | We present a pilot study on parsing non-native texts written by learners of Czech. We performed experiments that have shown that at least high-level syntactic functions, like subject, predicate, and object, can be assigned based on a parser trained on standard native language. |
Tasks | |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/W17-5902/ |
PWC | https://paperswithcode.com/paper/understanding-non-native-writings-can-a |
Repo | |
Framework | |
XMU Neural Machine Translation Systems for WAT 2017
Title | XMU Neural Machine Translation Systems for WAT 2017 |
Authors | Boli Wang, Zhixing Tan, Jinming Hu, Yidong Chen, Xiaodong Shi |
Abstract | This paper describes the Neural Machine Translation systems of Xiamen University for the shared translation tasks of WAT 2017. Our systems are based on the Encoder-Decoder framework with attention. We participated in three subtasks. We experimented with subword segmentation, synthetic training data and model ensembling. Experiments show that all these methods can give substantial improvements. |
Tasks | Machine Translation, Word Embeddings |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5707/ |
PWC | https://paperswithcode.com/paper/xmu-neural-machine-translation-systems-for-1 |
Repo | |
Framework | |
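The subword segmentation the abstract mentions can be illustrated with a greedy longest-match sketch (WordPiece-style; the paper does not specify its exact segmentation algorithm, so this is only one plausible variant):

```python
def segment_subwords(word, vocab):
    """Greedy longest-match subword segmentation against a fixed vocabulary,
    falling back to single characters when no vocabulary piece matches."""
    pieces, start = [], 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # no vocabulary piece starts here; emit a single character
            pieces.append(word[start])
            start += 1
    return pieces
```

Segmenting rare words into frequent subword units shrinks the output vocabulary and lets the NMT system handle unseen word forms.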
Depictives in English: An LTAG Approach
Title | Depictives in English: An LTAG Approach |
Authors | Benjamin Burkhardt, Timm Lichte, Laura Kallmeyer |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6203/ |
PWC | https://paperswithcode.com/paper/depictives-in-english-an-ltag-approach |
Repo | |
Framework | |
Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph
Title | Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph |
Authors | Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, Mihai Surdeanu |
Abstract | |
Tasks | Reading Comprehension |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-4018/ |
PWC | https://paperswithcode.com/paper/swanson-linking-revisited-accelerating |
Repo | |
Framework | |
Connecting Documentation and Revitalization: A New Approach to Language Apps
Title | Connecting Documentation and Revitalization: A New Approach to Language Apps |
Authors | Alexa N. Little |
Abstract | |
Tasks | |
Published | 2017-03-01 |
URL | https://www.aclweb.org/anthology/W17-0120/ |
PWC | https://paperswithcode.com/paper/connecting-documentation-and-revitalization-a |
Repo | |
Framework | |
MultiNews: A Web collection of an Aligned Multimodal and Multilingual Corpus
Title | MultiNews: A Web collection of an Aligned Multimodal and Multilingual Corpus |
Authors | Haithem Afli, Pintu Lohar, Andy Way |
Abstract | Integrating Natural Language Processing (NLP) and computer vision is a promising effort. However, the applicability of these methods directly depends on the availability of specific multimodal data that includes images and texts. In this paper, we present a multimodal corpus of comparable texts and their images in 9 languages, collected from the web news articles of the Euronews website. This corpus has found widespread use in the NLP community in multilingual and multimodal tasks. Here, we focus on the acquisition of the images and text data and their multilingual alignment. |
Tasks | Content-Based Image Retrieval, Image Retrieval, Machine Translation, Multimodal Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5602/ |
PWC | https://paperswithcode.com/paper/multinews-a-web-collection-of-an-aligned |
Repo | |
Framework | |
Idiom Type Identification with Smoothed Lexical Features and a Maximum Margin Classifier
Title | Idiom Type Identification with Smoothed Lexical Features and a Maximum Margin Classifier |
Authors | Giancarlo Salton, Robert Ross, John Kelleher |
Abstract | In our work we address limitations in the state-of-the-art in idiom type identification. We investigate different approaches for a lexical fixedness metric, a component of the state-of-the-art model. We also show that our Machine Learning based approach to the idiom type identification task achieves an F1-score of 0.85, an improvement of 11 points over the state-of-the-art. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1083/ |
https://doi.org/10.26615/978-954-452-049-6_083 | |
PWC | https://paperswithcode.com/paper/idiom-type-identification-with-smoothed |
Repo | |
Framework | |
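One common formulation of the lexical fixedness metric the abstract investigates scores the canonical word pair's PMI against the PMIs of its lexical variants (e.g. synonym substitutions). The sketch below illustrates that idea; it is an assumption, not the paper's exact smoothed metric:

```python
import math

def pmi(count_pair, count_a, count_b, total):
    """Pointwise mutual information of a word pair from raw corpus counts."""
    return math.log((count_pair * total) / (count_a * count_b))

def lexical_fixedness(pmi_canonical, pmi_variants):
    """Z-score of the canonical pair's PMI against its lexical variants:
    a strongly fixed idiom stands out from its substitution variants."""
    mean = sum(pmi_variants) / len(pmi_variants)
    std = (sum((p - mean) ** 2 for p in pmi_variants) / len(pmi_variants)) ** 0.5
    return (pmi_canonical - mean) / std
```

The paper's contribution of *smoothing* these lexical features would enter where the raw counts are taken, before the PMI computation.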
Beyond Words: Deep Learning for Multiword Expressions and Collocations
Title | Beyond Words: Deep Learning for Multiword Expressions and Collocations |
Authors | Valia Kordoni |
Abstract | Deep learning has recently shown much promise for NLP applications. Traditionally, in most NLP approaches, documents or sentences are represented by a sparse bag-of-words representation. There is now a lot of work which goes beyond this by adopting a distributed representation of words, by constructing a so-called "neural embedding" or vector space representation of each word or document. The aim of this tutorial is to go beyond the learning of word vectors and present methods for learning vector representations for Multiword Expressions and bilingual phrase pairs, all of which are useful for various NLP applications. This tutorial aims to provide attendees with a clear notion of the linguistic and distributional characteristics of Multiword Expressions (MWEs), their relevance for the intersection of deep learning and natural language processing, what methods and resources are available to support their use, and what more could be done in the future. Our target audience are researchers and practitioners in machine learning, parsing (syntactic and semantic) and language technology, not necessarily experts in MWEs, who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication. |
Tasks | Information Retrieval, Machine Translation |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-5005/ |
PWC | https://paperswithcode.com/paper/beyond-words-deep-learning-for-multiword |
Repo | |
Framework | |
SMT reranked NMT
Title | SMT reranked NMT |
Authors | Terumasa Ehara |
Abstract | System architecture, experimental settings and experimental results of the EHR team for the WAT2017 tasks are described. We participate in three tasks: JPCen-ja, JPCzh-ja and JPCko-ja. Although the basic architecture of our system is NMT, a reranking technique is applied using SMT results. Major drawbacks of NMT are under-translation and over-translation, which SMT rarely produces. By reranking the n-best NMT outputs against the SMT output, such translations can be discarded. This technique improves the BLEU score from 46.03 to 47.08 in the JPCzh-ja task. |
Tasks | Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5710/ |
PWC | https://paperswithcode.com/paper/smt-reranked-nmt |
Repo | |
Framework | |
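The reranking idea in the abstract — pick the NMT hypothesis closest to the SMT output — can be sketched as follows. Unigram F1 is used here as the similarity, which is an assumption; the paper does not state its exact scoring function:

```python
from collections import Counter

def rerank_by_smt(nmt_nbest, smt_output):
    """Return the NMT n-best hypothesis most similar to the SMT output,
    so under- and over-translated hypotheses (poor overlap with the
    SMT translation) are discarded."""
    smt_counts = Counter(smt_output.split())

    def unigram_f1(hyp):
        hyp_counts = Counter(hyp.split())
        overlap = sum((hyp_counts & smt_counts).values())
        if overlap == 0:
            return 0.0
        precision = overlap / sum(hyp_counts.values())
        recall = overlap / sum(smt_counts.values())
        return 2 * precision * recall / (precision + recall)

    return max(nmt_nbest, key=unigram_f1)
```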
Translation Synchronization via Truncated Least Squares
Title | Translation Synchronization via Truncated Least Squares |
Authors | Xiangru Huang, Zhenxiao Liang, Chandrajit Bajaj, Qixing Huang |
Abstract | In this paper, we introduce a robust algorithm, \textsl{TranSync}, for the 1D translation synchronization problem, in which the aim is to recover the global coordinates of a set of nodes from noisy measurements of relative coordinates along an observation graph. The basic idea of TranSync is to apply truncated least squares, where the solution at each step is used to gradually prune out noisy measurements. We analyze TranSync under both deterministic and randomized noisy models, demonstrating its robustness and stability. Experimental results on synthetic and real datasets show that TranSync is superior to state-of-the-art convex formulations in terms of both efficiency and accuracy. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6744-translation-synchronization-via-truncated-least-squares |
http://papers.nips.cc/paper/6744-translation-synchronization-via-truncated-least-squares.pdf | |
PWC | https://paperswithcode.com/paper/translation-synchronization-via-truncated |
Repo | |
Framework | |
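TranSync's alternation between least-squares solves and residual-based pruning can be sketched in a few lines. This toy version uses a Jacobi-style iterative solver and a fixed threshold schedule, both simplifications of the paper's algorithm:

```python
def solve_ls(n, edges, iters=200):
    """Jacobi-style least-squares solve of x_j - x_i ~= d over the
    observation graph, pinning x[0] = 0 to fix the global shift.
    Edges are (i, j, d) triples with d a noisy measurement of x_j - x_i."""
    x = [0.0] * n
    for _ in range(iters):
        acc = [0.0] * n
        cnt = [0] * n
        for i, j, d in edges:
            acc[j] += x[i] + d; cnt[j] += 1
            acc[i] += x[j] - d; cnt[i] += 1
        x = [acc[k] / cnt[k] if cnt[k] else x[k] for k in range(n)]
        shift = x[0]
        x = [v - shift for v in x]
    return x

def transync(n, edges, thresholds=(4.0, 2.0, 1.0)):
    """TranSync sketch: repeatedly solve, then prune measurements whose
    residual exceeds a shrinking threshold, so outliers drop out."""
    for t in thresholds:
        x = solve_ls(n, edges)
        edges = [(i, j, d) for i, j, d in edges if abs(x[j] - x[i] - d) <= t]
    return solve_ls(n, edges)
```

On a small graph with one grossly wrong measurement, the first pruning round removes the outlier and the remaining solve recovers the exact coordinates.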
Contextual Hyperedge Replacement Grammars for Abstract Meaning Representations
Title | Contextual Hyperedge Replacement Grammars for Abstract Meaning Representations |
Authors | Frank Drewes, Anna Jonsson |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6211/ |
PWC | https://paperswithcode.com/paper/contextual-hyperedge-replacement-grammars-for |
Repo | |
Framework | |
YNU-HPCC at IJCNLP-2017 Task 4: Attention-based Bi-directional GRU Model for Customer Feedback Analysis Task of English
Title | YNU-HPCC at IJCNLP-2017 Task 4: Attention-based Bi-directional GRU Model for Customer Feedback Analysis Task of English |
Authors | Nan Wang, Jin Wang, Xuejie Zhang |
Abstract | This paper describes our submission to IJCNLP 2017 shared task 4, which predicts the tags of unseen customer feedback sentences, such as comments, complaints, bugs, requests, and meaningless and undetermined statements. Many neural-network-based deep learning methods have been developed that perform very well on text classification. Our ensemble classification model is based on a bi-directional gated recurrent unit and an attention mechanism, which yields a 3.8% improvement in classification accuracy. To enhance model performance, we also compared it with several word-embedding models. The comparative results show that a combination of word2vec and GloVe achieves the best performance. |
Tasks | Question Answering, Representation Learning, Sentiment Analysis, Text Classification |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4029/ |
PWC | https://paperswithcode.com/paper/ynu-hpcc-at-ijcnlp-2017-task-4-attention |
Repo | |
Framework | |
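The attention layer described in the abstract can be illustrated with minimal dot-product attention pooling over recurrent hidden states (a sketch only; the paper's exact attention formulation, and the bi-GRU itself, are omitted here):

```python
import math

def attention_pool(hidden_states, query):
    """Score each hidden state against a query vector, softmax the scores,
    and return the attention-weighted sum of the states. This is the
    pooling step that turns per-token GRU outputs into one sentence vector."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    m = max(scores)                       # shift for numerical stability
    exp = [math.exp(s - m) for s in scores]
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(hidden_states[0])
    return [sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)]
```

With a uniform (zero) query the pooling degenerates to mean pooling; a trained query vector instead concentrates the weights on the most tag-relevant tokens.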