Paper Group NANR 138
Estimating the amenability of new domains for deception detection
Title | Estimating the amenability of new domains for deception detection |
Authors | Eileen Fitzpatrick, Joan Bachenko |
Abstract | |
Tasks | Deception Detection |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0804/ |
PWC | https://paperswithcode.com/paper/estimating-the-amenibility-of-new-domains-for |
Repo | |
Framework | |
Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading
Title | Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading |
Authors | Ulrike Padó |
Abstract | Automated short-answer grading is key to help close the automation loop for large-scale, computerised testing in education. A wide range of features on different levels of linguistic processing has been proposed so far. We investigate the relative importance of the different types of features across a range of standard corpora (both from a language skill and content assessment context, in English and in German). We find that features on the lexical, text similarity and dependency level often suffice to approximate full-model performance. Features derived from semantic processing particularly benefit the linguistically more varied answers in content assessment corpora. |
Tasks | Natural Language Inference |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1206/ |
PWC | https://paperswithcode.com/paper/get-semantic-with-me-the-usefulness-of |
Repo | |
Framework | |
Joint Learning of Local and Global Features for Entity Linking via Neural Networks
Title | Joint Learning of Local and Global Features for Entity Linking via Neural Networks |
Authors | Thien Huu Nguyen, Nicolas Fauceglia, Mariano Rodriguez Muro, Oktie Hassanzadeh, Alfio Massimiliano Gliozzo, Mohammad Sadoghi |
Abstract | Previous studies have highlighted the necessity for entity linking systems to capture the local entity-mention similarities and the global topical coherence. We introduce a novel framework based on convolutional neural networks and recurrent neural networks to simultaneously model the local and global features for entity linking. The proposed model benefits from the capacity of convolutional neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. Our evaluation on multiple datasets demonstrates the effectiveness of the model and yields the state-of-the-art performance on such datasets. In addition, we examine the entity linking systems on the domain adaptation setting that further demonstrates the cross-domain robustness of the proposed model. |
Tasks | Domain Adaptation, Entity Linking |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1218/ |
PWC | https://paperswithcode.com/paper/joint-learning-of-local-and-global-features |
Repo | |
Framework | |
Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ
Title | Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ |
Authors | David Wible, Nai-Lung Tsao |
Abstract | Adult second language learners face the daunting but underappreciated task of mastering patterns of language use that are neither products of fully productive grammar rules nor frozen items to be memorized. Word Midas, a web browser extension, targets this uncharted territory of lexicogrammar by detecting multiword tokens of lexicogrammatical patterning in real time in situ within the noisy digital texts from the user's unscripted web browsing or other digital venues. The language model powering Word Midas is StringNet, a densely cross-indexed navigable network of one billion lexicogrammatical patterns of English. These resources are described and their functionality is illustrated with a detailed scenario. |
Tasks | Language Modelling |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2005/ |
PWC | https://paperswithcode.com/paper/word-midas-powered-by-stringnet-discovering |
Repo | |
Framework | |
Multiword Expressions Dataset for Indian Languages
Title | Multiword Expressions Dataset for Indian Languages |
Authors | Dhirendra Singh, Sudha Bhingardive, Pushpak Bhattacharyya |
Abstract | Multiword Expressions (MWEs) are used frequently in natural languages, but understanding the diversity in MWEs is one of the open problems in the area of Natural Language Processing. In the context of Indian languages, MWEs play an important role. In this paper, we present an MWE annotation dataset created for Indian languages, viz. Hindi and Marathi. We extract possible MWE candidates using two repositories: 1) the POS-tagged corpus and 2) the IndoWordNet synsets. Annotation is done for two types of MWEs: compound nouns and light verb constructions. In the process of annotation, human annotators tag valid MWEs from these candidates based on the standard guidelines provided to them. We obtained 3178 compound nouns and 2556 light verb constructions in Hindi and 1003 compound nouns and 2416 light verb constructions in Marathi using the two repositories mentioned above. The resulting resource is publicly available and can be used as a gold standard for Hindi and Marathi MWE systems. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1369/ |
PWC | https://paperswithcode.com/paper/multiword-expressions-dataset-for-indian |
Repo | |
Framework | |
A Corpus of Images and Text in Online News
Title | A Corpus of Images and Text in Online News |
Authors | Laura Hollink, Adriatik Bedjeti, Martin van Harmelen, Desmond Elliott |
Abstract | In recent years, several datasets have been released that include images and text, giving impulse to new methods that combine natural language processing and computer vision. However, there is a need for datasets of images in their natural textual context. The ION corpus contains 300K news articles published between August 2014 - 2015 in five online newspapers from two countries. The 1-year coverage over multiple publishers ensures a broad scope in terms of topics, image quality and editorial viewpoints. The corpus consists of JSON-LD files with the following data about each article: the original URL of the article on the news publisher's website, the date of publication, the headline of the article, the URL of the image displayed with the article (if any), and the caption of that image. Neither the article text nor the images themselves are included in the corpus. Instead, the images are distributed as high-dimensional feature vectors extracted from a Convolutional Neural Network, anticipating their use in computer vision tasks. The article text is represented as a list of automatically generated entity and topic annotations in the form of Wikipedia/DBpedia pages. This facilitates the selection of subsets of the corpus for separate analysis or evaluation. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1219/ |
PWC | https://paperswithcode.com/paper/a-corpus-of-images-and-text-in-online-news |
Repo | |
Framework | |
SDP-JAIST: A Shallow Discourse Parsing system @ CoNLL 2016 Shared Task
Title | SDP-JAIST: A Shallow Discourse Parsing system @ CoNLL 2016 Shared Task |
Authors | Minh Nguyen |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-2020/ |
PWC | https://paperswithcode.com/paper/sdp-jaist-a-shallow-discourse-parsing-system |
Repo | |
Framework | |
A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions
Title | A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions |
Authors | Ryohei Sasano, Manabu Okumura |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1211/ |
PWC | https://paperswithcode.com/paper/a-corpus-based-analysis-of-canonical-word |
Repo | |
Framework | |
ArchiMob - A Corpus of Spoken Swiss German
Title | ArchiMob - A Corpus of Spoken Swiss German |
Authors | Tanja Samardžić, Yves Scherrer, Elvira Glaser |
Abstract | Swiss dialects of German are, unlike most dialects of well standardised languages, widely used in everyday communication. Despite this fact, automatic processing of Swiss German is still a considerable challenge due to the fact that it is mostly a spoken variety rarely recorded and that it is subject to considerable regional variation. This paper presents a freely available general-purpose corpus of spoken Swiss German suitable for linguistic research, but also for training automatic tools. The corpus is a result of a long design process, intensive manual work and specially adapted computational processing. We first describe how the documents were transcribed, segmented and aligned with the sound source, and how inconsistent transcriptions were unified through an additional normalisation layer. We then present a bootstrapping approach to automatic normalisation using different machine-translation-inspired methods. Furthermore, we evaluate the performance of part-of-speech taggers on our data and show how the same bootstrapping approach improves part-of-speech tagging by 10% over four rounds. Finally, we present the modalities of access of the corpus as well as the data format. |
Tasks | Machine Translation, Part-Of-Speech Tagging |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1641/ |
PWC | https://paperswithcode.com/paper/archimob-a-corpus-of-spoken-swiss-german |
Repo | |
Framework | |
Can SMT and RBMT Improve each other’s Performance?- An Experiment with English-Hindi Translation
Title | Can SMT and RBMT Improve each other’s Performance?- An Experiment with English-Hindi Translation |
Authors | Debajyoty Banik, Sukanta Sen, Asif Ekbal, Pushpak Bhattacharyya |
Abstract | |
Tasks | Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-6303/ |
PWC | https://paperswithcode.com/paper/can-smt-and-rbmt-improve-each-others |
Repo | |
Framework | |
Using Contextual Information for Machine Translation Evaluation
Title | Using Contextual Information for Machine Translation Evaluation |
Authors | Marina Fomicheva, Núria Bel |
Abstract | Automatic evaluation of Machine Translation (MT) is typically approached by measuring similarity between the candidate MT and a human reference translation. An important limitation of existing evaluation systems is that they are unable to distinguish candidate-reference differences that arise due to acceptable linguistic variation from the differences induced by MT errors. In this paper we present a new metric, UPF-Cobalt, that addresses this issue by taking into consideration the syntactic contexts of candidate and reference words. The metric applies a penalty when the words are similar but the contexts in which they occur are not equivalent. In this way, Machine Translations (MTs) that are different from the human translation but still essentially correct are distinguished from those that share a high number of words with the reference but alter the meaning of the sentence due to translation errors. The results show that the proposed method is indeed beneficial for automatic MT evaluation. We report experiments based on two different evaluation tasks with various types of manual quality assessment. The metric significantly outperforms state-of-the-art evaluation systems in varying evaluation settings. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1437/ |
PWC | https://paperswithcode.com/paper/using-contextual-information-for-machine |
Repo | |
Framework | |
AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons
Title | AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons |
Authors | Nora Al-Twairesh, Hend Al-Khalifa, Abdulmalik Al-Salman |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1066/ |
PWC | https://paperswithcode.com/paper/arasenti-large-scale-twitter-specific-arabic |
Repo | |
Framework | |
Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs
Title | Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs |
Authors | Sina Zarrieß, David Schlangen |
Abstract | |
Tasks | Text Generation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1058/ |
PWC | https://paperswithcode.com/paper/easy-things-first-installments-improve |
Repo | |
Framework | |
On the verbalization patterns of part-whole relations in isiZulu
Title | On the verbalization patterns of part-whole relations in isiZulu |
Authors | C. Maria Keet, Langa Khumalo |
Abstract | |
Tasks | Speech Recognition, Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-6629/ |
PWC | https://paperswithcode.com/paper/on-the-verbalization-patterns-of-part-whole |
Repo | |
Framework | |
Learning Non-Linear Functions for Text Classification
Title | Learning Non-Linear Functions for Text Classification |
Authors | Cohan Sujay Carlos, Geetanjali Rakshit |
Abstract | |
Tasks | Text Categorization, Text Classification |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-6327/ |
PWC | https://paperswithcode.com/paper/learning-non-linear-functions-for-text |
Repo | |
Framework | |