May 5, 2019

Paper Group NANR 136

CSE: Conceptual Sentence Embeddings based on Attention Model


Title	CSE: Conceptual Sentence Embeddings based on Attention Model
Authors	Yashen Wang, Heyan Huang, Chong Feng, Qiang Zhou, Jiahui Gu, Xiong Gao
Abstract
Tasks	Information Retrieval, Sentence Embedding, Sentence Embeddings, Text Classification
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1048/
PDF	https://www.aclweb.org/anthology/P16-1048
PWC	https://paperswithcode.com/paper/cse-conceptual-sentence-embeddings-based-on
Repo
Framework

Learning from Within? Comparing PoS Tagging Approaches for Historical Text


Title	Learning from Within? Comparing PoS Tagging Approaches for Historical Text
Authors	Sarah Schulz, Jonas Kuhn
Abstract	In this paper, we investigate unsupervised and semi-supervised methods for part-of-speech (PoS) tagging in the context of historical German text. We locate our research in the context of Digital Humanities, where the non-canonical nature of text causes issues in a Natural Language Processing world in which tools are mainly trained on standard data. Data deviating from the norm requires tools adjusted to this data. We explore to what extent the availability of such training material and resources related to it influences the accuracy of PoS tagging. We investigate a variety of algorithms including neural nets, conditional random fields and self-learning techniques in order to find the best-fitted approach to tackle data sparsity. Although methods using resources from related languages outperform weakly supervised methods using just a few training examples, we can still reach a promising accuracy with methods abstaining from additional resources.
Tasks	Part-Of-Speech Tagging
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1684/
PDF	https://www.aclweb.org/anthology/L16-1684
PWC	https://paperswithcode.com/paper/learning-from-within-comparing-pos-tagging
Repo
Framework

Applying Neural Networks to English-Chinese Named Entity Transliteration


Title	Applying Neural Networks to English-Chinese Named Entity Transliteration
Authors	Yan Shao, Joakim Nivre
Abstract
Tasks	Information Retrieval, Machine Translation, Transliteration
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2710/
PDF	https://www.aclweb.org/anthology/W16-2710
PWC	https://paperswithcode.com/paper/applying-neural-networks-to-english-chinese
Repo
Framework

Dealing with Linguistic Divergences in English-Bhojpuri Machine Translation


Title	Dealing with Linguistic Divergences in English-Bhojpuri Machine Translation
Authors	Pitambar Behera, Neha Mourya, Vandana Pandey
Abstract	In Machine Translation, divergence is one of the major barriers and plays a deciding role in determining the efficiency of the system at hand. Translation divergences originate when there are structural discrepancies between the input and the output languages. They can be of various types depending on the issues being addressed, such as linguistic, cultural, communicative and so on. Linguistic divergences emerge when two languages owe their origin to different language families. The present study attempts to categorize different types of linguistic divergences: the lexical-semantic and the syntactic. In addition, it helps identify and resolve the divergent linguistic features between English as source language and Bhojpuri as target language. Dorr's theoretical framework (1994, 1994a) has been followed in the classification and resolution procedure. Furthermore, so far as the methodology is concerned, we have adhered to Dorr's Lexical Conceptual Structure for the resolution of divergences. This research will prove to be beneficial for developing efficient MT systems if the mentioned factors are incorporated considering the inherent structural constraints between source and target languages.
Tasks	Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-3711/
PDF	https://www.aclweb.org/anthology/W16-3711
PWC	https://paperswithcode.com/paper/dealing-with-linguistic-divergences-in
Repo
Framework

PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation


Title	PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation
Authors	Liane Guillou, Christian Hardmeier
Abstract	We present PROTEST, a test suite for the evaluation of pronoun translation by MT systems. The test suite comprises 250 hand-selected pronoun tokens and an automatic evaluation method which compares the translations of pronouns in MT output with those in the reference translation. Pronoun translations that do not match the reference are referred for manual evaluation. PROTEST is designed to support analysis of system performance at the level of individual pronoun groups, rather than to provide a single aggregate measure over all pronouns. We wish to encourage detailed analyses to highlight issues in the handling of specific linguistic mechanisms by MT systems, thereby contributing to a better understanding of those problems involved in translating pronouns. We present two use cases for PROTEST: a) for measuring improvement/degradation of an incremental system change, and b) for comparing the performance of a group of systems whose design may be largely unrelated. Following the latter use case, we demonstrate the application of PROTEST to the evaluation of the systems submitted to the DiscoMT 2015 shared task on pronoun translation.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1100/
PDF	https://www.aclweb.org/anthology/L16-1100
PWC	https://paperswithcode.com/paper/protest-a-test-suite-for-evaluating-pronouns
Repo
Framework
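
The automatic step described in the PROTEST abstract (compare each pronoun translation with the reference, send mismatches to manual evaluation, report results per pronoun group) can be sketched roughly as follows. This is a minimal illustration, not the PROTEST implementation: the `PronounExample` record, the `evaluate` function, the group labels and the case-insensitive string match are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PronounExample:
    """One test-suite item (hypothetical record layout)."""
    group: str       # pronoun group label, e.g. "anaphoric_it" (assumed name)
    reference: str   # pronoun translation found in the reference
    hypothesis: str  # pronoun translation found in the MT output


def evaluate(examples):
    """Count reference matches per group; queue mismatches for manual review."""
    matched = defaultdict(int)
    total = defaultdict(int)
    manual_queue = []
    for ex in examples:
        total[ex.group] += 1
        # Assumption: a match is a case-insensitive exact string comparison.
        if ex.hypothesis.strip().lower() == ex.reference.strip().lower():
            matched[ex.group] += 1
        else:
            manual_queue.append(ex)
    # Report (matches, total) per group instead of one aggregate score.
    per_group = {group: (matched[group], total[group]) for group in total}
    return per_group, manual_queue
```

Keeping the counts per group, rather than collapsing them into one number, mirrors the abstract's stated goal of analysing performance at the level of individual pronoun groups.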

Modified Dirichlet Distribution: Allowing Negative Parameters to Induce Stronger Sparsity


Title	Modified Dirichlet Distribution: Allowing Negative Parameters to Induce Stronger Sparsity
Authors	Kewei Tu
Abstract
Tasks	Dependency Parsing
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1208/
PDF	https://www.aclweb.org/anthology/D16-1208
PWC	https://paperswithcode.com/paper/modified-dirichlet-distribution-allowing
Repo
Framework

Sentiframes: A Resource for Verb-centered German Sentiment Inference


Title	Sentiframes: A Resource for Verb-centered German Sentiment Inference
Authors	Manfred Klenner, Michael Amsler
Abstract	In this paper, a German verb resource for verb-centered sentiment inference is introduced and evaluated. Our model specifies verb polarity frames that capture the polarity effects on the fillers of the verb's arguments given a sentence with that verb frame. Verb signatures and selectional restrictions are also part of the model. An algorithm to apply the verb resource to treebank sentences and the results of our first evaluation are discussed.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1461/
PDF	https://www.aclweb.org/anthology/L16-1461
PWC	https://paperswithcode.com/paper/sentiframes-a-resource-for-verb-centered
Repo
Framework

Incremental Global Event Extraction


Title	Incremental Global Event Extraction
Authors	Alex Judea, Michael Strube
Abstract	Event extraction is a difficult information extraction task. Li et al. (2014) explore the benefits of modeling event extraction and two related tasks, entity mention and relation extraction, jointly. This joint system achieves state-of-the-art performance in all tasks. However, as a system operating only at the sentence level, it misses valuable information from other parts of the document. In this paper, we present an incremental easy-first approach to make the global context of the entire document available to the intra-sentential, state-of-the-art event extractor. We show that our method robustly increases performance on two datasets, namely ACE 2005 and TAC 2015.
Tasks	Relation Extraction, Word Sense Disambiguation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1215/
PDF	https://www.aclweb.org/anthology/C16-1215
PWC	https://paperswithcode.com/paper/incremental-global-event-extraction
Repo
Framework

Probing the Compositionality of Intuitive Functions


Title	Probing the Compositionality of Intuitive Functions
Authors	Eric Schulz, Josh Tenenbaum, David K. Duvenaud, Maarten Speekenbrink, Samuel J. Gershman
Abstract	How do people learn about complex functional structure? Taking inspiration from other areas of cognitive science, we propose that this is accomplished by harnessing compositionality: complex structure is decomposed into simpler building blocks. We formalize this idea within the framework of Bayesian regression using a grammar over Gaussian process kernels. We show that participants prefer compositional over non-compositional function extrapolations, that samples from the human prior over functions are best described by a compositional model, and that people perceive compositional functions as more predictable than their non-compositional but otherwise similar counterparts. We argue that the compositional nature of intuitive functions is consistent with broad principles of human cognition.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6130-probing-the-compositionality-of-intuitive-functions
PDF	http://papers.nips.cc/paper/6130-probing-the-compositionality-of-intuitive-functions.pdf
PWC	https://paperswithcode.com/paper/probing-the-compositionality-of-intuitive
Repo
Framework
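
The idea of a grammar over Gaussian process kernels, as mentioned in the abstract above, can be illustrated by composing simple kernels with addition and multiplication. The sketch below is only an illustration under assumptions: the choice of base kernels (linear, RBF, periodic), their parameterisation and all function names are hypothetical and are not taken from the paper.

```python
import math


def linear():
    """Linear kernel k(x, y) = x * y."""
    return lambda x, y: x * y


def rbf(lengthscale=1.0):
    """Radial basis function kernel: smooth, local similarity."""
    def k(x, y):
        return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))
    return k


def periodic(period=1.0, lengthscale=1.0):
    """Periodic kernel: similarity repeats every `period` units."""
    def k(x, y):
        s = math.sin(math.pi * abs(x - y) / period)
        return math.exp(-2.0 * s ** 2 / lengthscale ** 2)
    return k


def add(k1, k2):
    """Sum of two kernels (itself a valid kernel)."""
    return lambda x, y: k1(x, y) + k2(x, y)


def mul(k1, k2):
    """Product of two kernels (itself a valid kernel)."""
    return lambda x, y: k1(x, y) * k2(x, y)


def gram(k, xs):
    """Gram matrix of kernel k over the input points xs."""
    return [[k(a, b) for b in xs] for a in xs]


# A compositional structure: a linear trend plus a locally periodic component.
composite = add(linear(), mul(periodic(period=1.0), rbf(lengthscale=2.0)))
```

Because sums and products of valid kernels are again valid kernels, any expression built from `add` and `mul` over the base kernels can serve as a covariance function, which is what makes a compositional search space over function structure possible.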

Applying the Cognitive Machine Translation Evaluation Approach to Arabic


Title	Applying the Cognitive Machine Translation Evaluation Approach to Arabic
Authors	Irina Temnikova, Wajdi Zaghouani, Stephan Vogel, Nizar Habash
Abstract	The goal of the cognitive machine translation (MT) evaluation approach is to build classifiers which assign post-editing effort scores to new texts. The approach helps estimate fair compensation for post-editors in the translation industry by evaluating the cognitive difficulty of post-editing MT output. The approach counts the number of errors classified in different categories on the basis of how much cognitive effort they require in order to be corrected. In this paper, we present the results of applying an existing cognitive evaluation approach to Modern Standard Arabic (MSA). We provide a comparison of the number of errors and categories of errors in three MSA texts of different MT quality (without any language-specific adaptation), as well as a comparison between MSA texts and texts from three Indo-European languages (Russian, Spanish, and Bulgarian), taken from a previous experiment. The results show how the error distributions change passing from the MSA texts of worse MT quality to MSA texts of better MT quality, as well as a similarity in distinguishing the texts of better MT quality for all four languages.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1578/
PDF	https://www.aclweb.org/anthology/L16-1578
PWC	https://paperswithcode.com/paper/applying-the-cognitive-machine-translation
Repo
Framework

Unsupervised morph segmentation and statistical language models for vocabulary expansion


Title	Unsupervised morph segmentation and statistical language models for vocabulary expansion
Authors	Matti Varjokallio, Dietrich Klakow
Abstract
Tasks	Language Modelling, Machine Translation, Optical Character Recognition, Speech Recognition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-2029/
PDF	https://www.aclweb.org/anthology/P16-2029
PWC	https://paperswithcode.com/paper/unsupervised-morph-segmentation-and
Repo
Framework

Deep multi-task learning with low level tasks supervised at lower layers


Title	Deep multi-task learning with low level tasks supervised at lower layers
Authors	Anders Søgaard, Yoav Goldberg
Abstract
Tasks	CCG Supertagging, Chunking, Domain Adaptation, Multi-Task Learning
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-2038/
PDF	https://www.aclweb.org/anthology/P16-2038
PWC	https://paperswithcode.com/paper/deep-multi-task-learning-with-low-level-tasks
Repo
Framework

Distributed Representations for Building Profiles of Users and Items from Text Reviews


Title	Distributed Representations for Building Profiles of Users and Items from Text Reviews
Authors	Wenliang Chen, Zhenjie Zhang, Zhenghua Li, Min Zhang
Abstract	In this paper, we propose an approach to learn distributed representations of users and items from text comments for recommendation systems. Traditional recommendation algorithms, e.g. collaborative filtering and matrix completion, are not designed to exploit the key information hidden in the text comments, while existing opinion mining methods do not provide direct support to recommendation systems with useful features on users and items. Our approach attempts to construct vectors to represent profiles of users and items under a unified framework to maximize word appearance likelihood. Then, the vector representations are used for a recommendation task in which we predict scores on unobserved user-item pairs without given texts. The recommendation-aware distributed representation approach is fully supported by effective and efficient learning algorithms over massive text archives. Our empirical evaluations on real datasets show that our system outperforms the state-of-the-art baseline systems.
Tasks	Decision Making, Matrix Completion, Opinion Mining, Recommendation Systems
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1202/
PDF	https://www.aclweb.org/anthology/C16-1202
PWC	https://paperswithcode.com/paper/distributed-representations-for-building
Repo
Framework

Discovering Entity Knowledge Bases on the Web


Title	Discovering Entity Knowledge Bases on the Web
Authors	Andrew Chisholm, Will Radford, Ben Hachey
Abstract
Tasks	Entity Linking, Person Recognition
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1302/
PDF	https://www.aclweb.org/anthology/W16-1302
PWC	https://paperswithcode.com/paper/discovering-entity-knowledge-bases-on-the-web
Repo
Framework

Hierarchical Permutation Complexity for Word Order Evaluation


Title	Hierarchical Permutation Complexity for Word Order Evaluation
Authors	Miloš Stanojević, Khalil Sima'an
Abstract	Existing approaches for evaluating word order in machine translation work with metrics computed directly over a permutation of word positions in system output relative to a reference translation. However, every permutation factorizes into a permutation tree (PET) built of primal permutations, i.e., atomic units that do not factorize any further. In this paper we explore the idea that permutations factorizing into (on average) shorter primal permutations should represent simpler ordering as well. Consequently, we contribute Permutation Complexity, a class of metrics over PETs and their extension to forests, and define tight metrics, a sub-class of metrics implementing this idea. Subsequently we define example tight metrics and empirically test them in word order evaluation. Experiments on the WMT13 data sets for ten language pairs show that a tight metric is more often than not better than the baselines.
Tasks	Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1204/
PDF	https://www.aclweb.org/anthology/C16-1204
PWC	https://paperswithcode.com/paper/hierarchical-permutation-complexity-for-word
Repo
Framework
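
The abstract above describes primal permutations as atomic units that do not factorize any further. One way to make that notion concrete is a brute-force check: a permutation fails to be primal if some proper contiguous window of at least two positions covers a set of consecutive values, since such a window could be collapsed into a single node. The function below is a hypothetical sketch of that check only; it is not the paper's PET construction and does not compute any of its metrics.

```python
def is_primal(perm):
    """Return True if perm (a permutation of 1..n) has no proper sub-block.

    A proper sub-block is a contiguous window of length k, 2 <= k < n,
    whose values form a set of consecutive integers.
    """
    n = len(perm)
    if sorted(perm) != list(range(1, n + 1)):
        raise ValueError("expected a permutation of 1..n")
    for k in range(2, n):
        for start in range(n - k + 1):
            window = perm[start:start + k]
            # Values are distinct, so a spread of k - 1 means they are consecutive.
            if max(window) - min(window) == k - 1:
                return False
    return True
```

For instance, `is_primal([2, 4, 1, 3])` returns `True`, while `is_primal([1, 2, 3])` returns `False` because the window `[1, 2]` already forms a block. The check is cubic in the permutation length, which is acceptable for illustration but not for large-scale evaluation.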