Paper Group NANR 136
CSE: Conceptual Sentence Embeddings based on Attention Model
Title | CSE: Conceptual Sentence Embeddings based on Attention Model |
Authors | Yashen Wang, Heyan Huang, Chong Feng, Qiang Zhou, Jiahui Gu, Xiong Gao |
Abstract | |
Tasks | Information Retrieval, Sentence Embedding, Sentence Embeddings, Text Classification |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1048/ |
PWC | https://paperswithcode.com/paper/cse-conceptual-sentence-embeddings-based-on |
Repo | |
Framework | |
Learning from Within? Comparing PoS Tagging Approaches for Historical Text
Title | Learning from Within? Comparing PoS Tagging Approaches for Historical Text |
Authors | Sarah Schulz, Jonas Kuhn |
Abstract | In this paper, we investigate unsupervised and semi-supervised methods for part-of-speech (PoS) tagging in the context of historical German text. We locate our research in the context of Digital Humanities, where the non-canonical nature of text causes issues in a Natural Language Processing world in which tools are mainly trained on standard data. Data deviating from the norm requires tools adjusted to this data. We explore to what extent the availability of such training material and resources related to it influences the accuracy of PoS tagging. We investigate a variety of algorithms including neural nets, conditional random fields and self-learning techniques in order to find the best-fitted approach to tackle data sparsity. Although methods using resources from related languages outperform weakly supervised methods using just a few training examples, we can still reach a promising accuracy with methods abstaining from additional resources. |
Tasks | Part-Of-Speech Tagging |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1684/ |
PWC | https://paperswithcode.com/paper/learning-from-within-comparing-pos-tagging |
Repo | |
Framework | |
Applying Neural Networks to English-Chinese Named Entity Transliteration
Title | Applying Neural Networks to English-Chinese Named Entity Transliteration |
Authors | Yan Shao, Joakim Nivre |
Abstract | |
Tasks | Information Retrieval, Machine Translation, Transliteration |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2710/ |
PWC | https://paperswithcode.com/paper/applying-neural-networks-to-english-chinese |
Repo | |
Framework | |
Dealing with Linguistic Divergences in English-Bhojpuri Machine Translation
Title | Dealing with Linguistic Divergences in English-Bhojpuri Machine Translation |
Authors | Pitambar Behera, Neha Mourya, Vandana Pandey |
Abstract | In Machine Translation, divergence is one of the major barriers and plays a deciding role in determining the efficiency of the system at hand. Translation divergences originate when there are structural discrepancies between the input and the output languages. They can be of various types depending on the issues being addressed, such as linguistic, cultural, communicative and so on. Linguistic divergences emerge because the two languages owe their origin to different language families. The present study attempts to categorize different types of linguistic divergences: the lexical-semantic and the syntactic. In addition, it helps identify and resolve the divergent linguistic features between English as source language and Bhojpuri as target language. Dorr's theoretical framework (1994, 1994a) has been followed in the classification and resolution procedure. Furthermore, as far as the methodology is concerned, we have adhered to Dorr's Lexical Conceptual Structure for the resolution of divergences. This research will prove to be beneficial for developing efficient MT systems if the mentioned factors are incorporated considering the inherent structural constraints between source and target languages. |
Tasks | Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3711/ |
PWC | https://paperswithcode.com/paper/dealing-with-linguistic-divergences-in |
Repo | |
Framework | |
PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation
Title | PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation |
Authors | Liane Guillou, Christian Hardmeier |
Abstract | We present PROTEST, a test suite for the evaluation of pronoun translation by MT systems. The test suite comprises 250 hand-selected pronoun tokens and an automatic evaluation method which compares the translations of pronouns in MT output with those in the reference translation. Pronoun translations that do not match the reference are referred for manual evaluation. PROTEST is designed to support analysis of system performance at the level of individual pronoun groups, rather than to provide a single aggregate measure over all pronouns. We wish to encourage detailed analyses to highlight issues in the handling of specific linguistic mechanisms by MT systems, thereby contributing to a better understanding of those problems involved in translating pronouns. We present two use cases for PROTEST: a) for measuring improvement/degradation of an incremental system change, and b) for comparing the performance of a group of systems whose design may be largely unrelated. Following the latter use case, we demonstrate the application of PROTEST to the evaluation of the systems submitted to the DiscoMT 2015 shared task on pronoun translation. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1100/ |
PWC | https://paperswithcode.com/paper/protest-a-test-suite-for-evaluating-pronouns |
Repo | |
Framework | |
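The triage step described in the PROTEST abstract (automatic match against the reference, mismatches referred for manual evaluation, results reported per pronoun group) can be illustrated with a minimal sketch. This is not the PROTEST implementation: the record layout, the group labels, the example pronouns, and the `triage` function are all hypothetical, and matching is simplified to a case-insensitive string comparison.

```python
from collections import defaultdict

def triage(examples):
    """Split pronoun translations into automatic matches and manual-review cases.

    Each example is a dict with hypothetical keys:
      "group": pronoun group label, "ref": reference pronoun, "mt": MT pronoun.
    Returns per-group counts and the list of examples needing manual review.
    """
    summary = defaultdict(lambda: {"matched": 0, "manual": 0})
    manual_queue = []
    for ex in examples:
        # Simplifying assumption: a match is an exact, case-insensitive string match.
        if ex["mt"].strip().lower() == ex["ref"].strip().lower():
            summary[ex["group"]]["matched"] += 1
        else:
            summary[ex["group"]]["manual"] += 1
            manual_queue.append(ex)
    return dict(summary), manual_queue

# Hypothetical English-to-French examples, for illustration only.
examples = [
    {"group": "anaphoric", "ref": "elle", "mt": "Elle"},
    {"group": "anaphoric", "ref": "elle", "mt": "il"},
    {"group": "pleonastic", "ref": "il", "mt": "il"},
]
summary, manual_queue = triage(examples)
```

Keeping the counts per group rather than collapsing them into one score mirrors the stated design goal of analysing individual pronoun groups.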
Modified Dirichlet Distribution: Allowing Negative Parameters to Induce Stronger Sparsity
Title | Modified Dirichlet Distribution: Allowing Negative Parameters to Induce Stronger Sparsity |
Authors | Kewei Tu |
Abstract | |
Tasks | Dependency Parsing |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1208/ |
PWC | https://paperswithcode.com/paper/modified-dirichlet-distribution-allowing |
Repo | |
Framework | |
Sentiframes: A Resource for Verb-centered German Sentiment Inference
Title | Sentiframes: A Resource for Verb-centered German Sentiment Inference |
Authors | Manfred Klenner, Michael Amsler |
Abstract | In this paper, a German verb resource for verb-centered sentiment inference is introduced and evaluated. Our model specifies verb polarity frames that capture the polarity effects on the fillers of the verb's arguments given a sentence with that verb frame. Verb signatures and selectional restrictions are also part of the model. An algorithm to apply the verb resource to treebank sentences and the results of our first evaluation are discussed. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1461/ |
PWC | https://paperswithcode.com/paper/sentiframes-a-resource-for-verb-centered |
Repo | |
Framework | |
Incremental Global Event Extraction
Title | Incremental Global Event Extraction |
Authors | Alex Judea, Michael Strube |
Abstract | Event extraction is a difficult information extraction task. Li et al. (2014) explore the benefits of modeling event extraction and two related tasks, entity mention and relation extraction, jointly. This joint system achieves state-of-the-art performance in all tasks. However, as a system operating only at the sentence level, it misses valuable information from other parts of the document. In this paper, we present an incremental easy-first approach to make the global context of the entire document available to the intra-sentential, state-of-the-art event extractor. We show that our method robustly increases performance on two datasets, namely ACE 2005 and TAC 2015. |
Tasks | Relation Extraction, Word Sense Disambiguation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1215/ |
PWC | https://paperswithcode.com/paper/incremental-global-event-extraction |
Repo | |
Framework | |
Probing the Compositionality of Intuitive Functions
Title | Probing the Compositionality of Intuitive Functions |
Authors | Eric Schulz, Josh Tenenbaum, David K. Duvenaud, Maarten Speekenbrink, Samuel J. Gershman |
Abstract | How do people learn about complex functional structure? Taking inspiration from other areas of cognitive science, we propose that this is accomplished by harnessing compositionality: complex structure is decomposed into simpler building blocks. We formalize this idea within the framework of Bayesian regression using a grammar over Gaussian process kernels. We show that participants prefer compositional over non-compositional function extrapolations, that samples from the human prior over functions are best described by a compositional model, and that people perceive compositional functions as more predictable than their non-compositional but otherwise similar counterparts. We argue that the compositional nature of intuitive functions is consistent with broad principles of human cognition. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6130-probing-the-compositionality-of-intuitive-functions |
http://papers.nips.cc/paper/6130-probing-the-compositionality-of-intuitive-functions.pdf | |
PWC | https://paperswithcode.com/paper/probing-the-compositionality-of-intuitive |
Repo | |
Framework | |
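The idea of a grammar over Gaussian process kernels, mentioned in the abstract above, can be illustrated with a small sketch: a few base kernels and two operators (sum and product) that build more complex kernels from simpler ones. The particular base kernels, their hyperparameters, and the function names here are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def rbf(x, y, lengthscale=1.0):
    """Radial basis function kernel: smooth, local similarity."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)

def linear(x, y):
    """Linear kernel: supports straight-line trends."""
    return x * y

def periodic(x, y, period=1.0, lengthscale=1.0):
    """Periodic kernel: similarity repeats every `period` units."""
    return math.exp(-2.0 * math.sin(math.pi * abs(x - y) / period) ** 2 / lengthscale ** 2)

def add(k1, k2):
    """Grammar rule K -> K + K: superimpose two structures."""
    return lambda x, y: k1(x, y) + k2(x, y)

def mul(k1, k2):
    """Grammar rule K -> K * K: modulate one structure by another."""
    return lambda x, y: k1(x, y) * k2(x, y)

def gram(kernel, xs):
    """Gram (covariance) matrix of a kernel over a list of scalar inputs."""
    return [[kernel(a, b) for b in xs] for a in xs]

# A compositional kernel: a linear trend plus a locally periodic component.
composite = add(linear, mul(periodic, rbf))
xs = [0.0, 0.5, 1.0, 1.5]
K = gram(composite, xs)
```

Because sums and products of valid kernels are again valid kernels, any expression built this way can serve as a Gaussian process covariance function.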
Applying the Cognitive Machine Translation Evaluation Approach to Arabic
Title | Applying the Cognitive Machine Translation Evaluation Approach to Arabic |
Authors | Irina Temnikova, Wajdi Zaghouani, Stephan Vogel, Nizar Habash |
Abstract | The goal of the cognitive machine translation (MT) evaluation approach is to build classifiers which assign post-editing effort scores to new texts. The approach helps estimate fair compensation for post-editors in the translation industry by evaluating the cognitive difficulty of post-editing MT output. The approach counts the number of errors classified in different categories on the basis of how much cognitive effort they require in order to be corrected. In this paper, we present the results of applying an existing cognitive evaluation approach to Modern Standard Arabic (MSA). We provide a comparison of the number of errors and categories of errors in three MSA texts of different MT quality (without any language-specific adaptation), as well as a comparison between MSA texts and texts from three Indo-European languages (Russian, Spanish, and Bulgarian), taken from a previous experiment. The results show how the error distributions change passing from the MSA texts of worse MT quality to MSA texts of better MT quality, as well as a similarity in distinguishing the texts of better MT quality for all four languages. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1578/ |
PWC | https://paperswithcode.com/paper/applying-the-cognitive-machine-translation |
Repo | |
Framework | |
Unsupervised morph segmentation and statistical language models for vocabulary expansion
Title | Unsupervised morph segmentation and statistical language models for vocabulary expansion |
Authors | Matti Varjokallio, Dietrich Klakow |
Abstract | |
Tasks | Language Modelling, Machine Translation, Optical Character Recognition, Speech Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2029/ |
PWC | https://paperswithcode.com/paper/unsupervised-morph-segmentation-and |
Repo | |
Framework | |
Deep multi-task learning with low level tasks supervised at lower layers
Title | Deep multi-task learning with low level tasks supervised at lower layers |
Authors | Anders Søgaard, Yoav Goldberg |
Abstract | |
Tasks | CCG Supertagging, Chunking, Domain Adaptation, Multi-Task Learning |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2038/ |
PWC | https://paperswithcode.com/paper/deep-multi-task-learning-with-low-level-tasks |
Repo | |
Framework | |
Distributed Representations for Building Profiles of Users and Items from Text Reviews
Title | Distributed Representations for Building Profiles of Users and Items from Text Reviews |
Authors | Wenliang Chen, Zhenjie Zhang, Zhenghua Li, Min Zhang |
Abstract | In this paper, we propose an approach to learn distributed representations of users and items from text comments for recommendation systems. Traditional recommendation algorithms, e.g. collaborative filtering and matrix completion, are not designed to exploit the key information hidden in the text comments, while existing opinion mining methods do not provide direct support to recommendation systems with useful features on users and items. Our approach attempts to construct vectors to represent profiles of users and items under a unified framework to maximize word appearance likelihood. Then, the vector representations are used for a recommendation task in which we predict scores on unobserved user-item pairs without given texts. The recommendation-aware distributed representation approach is fully supported by effective and efficient learning algorithms over massive text archives. Our empirical evaluations on real datasets show that our system outperforms the state-of-the-art baseline systems. |
Tasks | Decision Making, Matrix Completion, Opinion Mining, Recommendation Systems |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1202/ |
PWC | https://paperswithcode.com/paper/distributed-representations-for-building |
Repo | |
Framework | |
Discovering Entity Knowledge Bases on the Web
Title | Discovering Entity Knowledge Bases on the Web |
Authors | Andrew Chisholm, Will Radford, Ben Hachey |
Abstract | |
Tasks | Entity Linking, Person Recognition |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-1302/ |
PWC | https://paperswithcode.com/paper/discovering-entity-knowledge-bases-on-the-web |
Repo | |
Framework | |
Hierarchical Permutation Complexity for Word Order Evaluation
Title | Hierarchical Permutation Complexity for Word Order Evaluation |
Authors | Miloš Stanojević, Khalil Sima'an |
Abstract | Existing approaches for evaluating word order in machine translation work with metrics computed directly over a permutation of word positions in system output relative to a reference translation. However, every permutation factorizes into a permutation tree (PET) built of primal permutations, i.e., atomic units that do not factorize any further. In this paper we explore the idea that permutations factorizing into (on average) shorter primal permutations should represent simpler ordering as well. Consequently, we contribute Permutation Complexity, a class of metrics over PETs and their extension to forests, and define tight metrics, a sub-class of metrics implementing this idea. Subsequently we define example tight metrics and empirically test them in word order evaluation. Experiments on the WMT13 data sets for ten language pairs show that a tight metric is more often than not better than the baselines. |
Tasks | Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1204/ |
PWC | https://paperswithcode.com/paper/hierarchical-permutation-complexity-for-word |
Repo | |
Framework | |
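The notion of a primal permutation in the abstract above (an atomic unit that does not factorize any further) can be made concrete with a short check. The sketch assumes one common way to operationalize it: a permutation is treated as primal when no proper contiguous block of two or more positions covers a contiguous range of values. The function name `is_primal` is hypothetical, and the code does not reproduce the paper's metrics or its PET construction.

```python
def is_primal(perm):
    """Return True if `perm` cannot be factorized into smaller blocks.

    Assumes `perm` is a sequence of distinct integers. A proper contiguous
    block (length >= 2 and shorter than the whole sequence) whose values form
    a contiguous range could be collapsed into a single node of a permutation
    tree, so finding one means the permutation is not primal.
    """
    n = len(perm)
    for start in range(n):
        lo = hi = perm[start]
        for end in range(start + 1, n):
            lo = min(lo, perm[end])
            hi = max(hi, perm[end])
            block_len = end - start + 1
            # Values are distinct, so the block spans a contiguous range
            # exactly when max - min equals block length - 1.
            if block_len < n and hi - lo == end - start:
                return False
    return True

monotone = is_primal([1, 2, 3, 4])   # factorizes into binary steps
swap = is_primal([2, 1])             # a single inversion is atomic
crossing = is_primal([2, 4, 1, 3])   # cannot be broken into smaller blocks
```

Under this definition there are no primal permutations of length three, so the shortest non-binary atomic units have length four, such as `[2, 4, 1, 3]`.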