May 5, 2019


Paper Group NANR 138

Estimating the amenability of new domains for deception detection. Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading. Joint Learning of Local and Global Features for Entity Linking via Neural Networks. Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ. Multiword Expressions D …

Estimating the amenability of new domains for deception detection


Title	Estimating the amenability of new domains for deception detection
Authors	Eileen Fitzpatrick, Joan Bachenko
Abstract
Tasks	Deception Detection
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0804/
PDF	https://www.aclweb.org/anthology/W16-0804
PWC	https://paperswithcode.com/paper/estimating-the-amenibility-of-new-domains-for
Repo
Framework

Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading


Title	Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading
Authors	Ulrike Padó
Abstract	Automated short-answer grading is key to help close the automation loop for large-scale, computerised testing in education. A wide range of features on different levels of linguistic processing has been proposed so far. We investigate the relative importance of the different types of features across a range of standard corpora (both from a language skill and content assessment context, in English and in German). We find that features on the lexical, text similarity and dependency level often suffice to approximate full-model performance. Features derived from semantic processing particularly benefit the linguistically more varied answers in content assessment corpora.
Tasks	Natural Language Inference
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1206/
PDF	https://www.aclweb.org/anthology/C16-1206
PWC	https://paperswithcode.com/paper/get-semantic-with-me-the-usefulness-of
Repo
Framework

Joint Learning of Local and Global Features for Entity Linking via Neural Networks


Title	Joint Learning of Local and Global Features for Entity Linking via Neural Networks
Authors	Thien Huu Nguyen, Nicolas Fauceglia, Mariano Rodriguez Muro, Oktie Hassanzadeh, Alfio Massimiliano Gliozzo, Mohammad Sadoghi
Abstract	Previous studies have highlighted the necessity for entity linking systems to capture the local entity-mention similarities and the global topical coherence. We introduce a novel framework based on convolutional neural networks and recurrent neural networks to simultaneously model the local and global features for entity linking. The proposed model benefits from the capacity of convolutional neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. Our evaluation on multiple datasets demonstrates the effectiveness of the model and yields the state-of-the-art performance on such datasets. In addition, we examine the entity linking systems on the domain adaptation setting that further demonstrates the cross-domain robustness of the proposed model.
Tasks	Domain Adaptation, Entity Linking
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1218/
PDF	https://www.aclweb.org/anthology/C16-1218
PWC	https://paperswithcode.com/paper/joint-learning-of-local-and-global-features
Repo
Framework

Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ


Title	Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ
Authors	David Wible, Nai-Lung Tsao
Abstract	Adult second language learners face the daunting but underappreciated task of mastering patterns of language use that are neither products of fully productive grammar rules nor frozen items to be memorized. Word Midas, a web browser extension, targets this uncharted territory of lexicogrammar by detecting multiword tokens of lexicogrammatical patterning in real time in situ within the noisy digital texts from the user's unscripted web browsing or other digital venues. The language model powering Word Midas is StringNet, a densely cross-indexed navigable network of one billion lexicogrammatical patterns of English. These resources are described and their functionality is illustrated with a detailed scenario.
Tasks	Language Modelling
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2005/
PDF	https://www.aclweb.org/anthology/C16-2005
PWC	https://paperswithcode.com/paper/word-midas-powered-by-stringnet-discovering
Repo
Framework

Multiword Expressions Dataset for Indian Languages


Title	Multiword Expressions Dataset for Indian Languages
Authors	Dhirendra Singh, Sudha Bhingardive, Pushpak Bhattacharyya
Abstract	Multiword Expressions (MWEs) are used frequently in natural languages, but understanding the diversity in MWEs is one of the open problems in the area of Natural Language Processing. In the context of Indian languages, MWEs play an important role. In this paper, we present an MWE annotation dataset created for Indian languages viz., Hindi and Marathi. We extract possible MWE candidates using two repositories: 1) the POS-tagged corpus and 2) the IndoWordNet synsets. Annotation is done for two types of MWEs: compound nouns and light verb constructions. In the process of annotation, human annotators tag valid MWEs from these candidates based on the standard guidelines provided to them. We obtained 3178 compound nouns and 2556 light verb constructions in Hindi and 1003 compound nouns and 2416 light verb constructions in Marathi using the two repositories mentioned before. This created resource is made available publicly and can be used as a gold standard for Hindi and Marathi MWE systems.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1369/
PDF	https://www.aclweb.org/anthology/L16-1369
PWC	https://paperswithcode.com/paper/multiword-expressions-dataset-for-indian
Repo
Framework

A Corpus of Images and Text in Online News


Title	A Corpus of Images and Text in Online News
Authors	Laura Hollink, Adriatik Bedjeti, Martin van Harmelen, Desmond Elliott
Abstract	In recent years, several datasets have been released that include images and text, giving impulse to new methods that combine natural language processing and computer vision. However, there is a need for datasets of images in their natural textual context. The ION corpus contains 300K news articles published between August 2014 - 2015 in five online newspapers from two countries. The 1-year coverage over multiple publishers ensures a broad scope in terms of topics, image quality and editorial viewpoints. The corpus consists of JSON-LD files with the following data about each article: the original URL of the article on the news publisher's website, the date of publication, the headline of the article, the URL of the image displayed with the article (if any), and the caption of that image. Neither the article text nor the images themselves are included in the corpus. Instead, the images are distributed as high-dimensional feature vectors extracted from a Convolutional Neural Network, anticipating their use in computer vision tasks. The article text is represented as a list of automatically generated entity and topic annotations in the form of Wikipedia/DBpedia pages. This facilitates the selection of subsets of the corpus for separate analysis or evaluation.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1219/
PDF	https://www.aclweb.org/anthology/L16-1219
PWC	https://paperswithcode.com/paper/a-corpus-of-images-and-text-in-online-news
Repo
Framework

SDP-JAIST: A Shallow Discourse Parsing system @ CoNLL 2016 Shared Task


Title	SDP-JAIST: A Shallow Discourse Parsing system @ CoNLL 2016 Shared Task
Authors	Minh Nguyen
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-2020/
PDF	https://www.aclweb.org/anthology/K16-2020
PWC	https://paperswithcode.com/paper/sdp-jaist-a-shallow-discourse-parsing-system
Repo
Framework

A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions


Title	A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions
Authors	Ryohei Sasano, Manabu Okumura
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1211/
PDF	https://www.aclweb.org/anthology/P16-1211
PWC	https://paperswithcode.com/paper/a-corpus-based-analysis-of-canonical-word
Repo
Framework

ArchiMob - A Corpus of Spoken Swiss German


Title	ArchiMob - A Corpus of Spoken Swiss German
Authors	Tanja Samardžić, Yves Scherrer, Elvira Glaser
Abstract	Swiss dialects of German are, unlike most dialects of well standardised languages, widely used in everyday communication. Despite this fact, automatic processing of Swiss German is still a considerable challenge due to the fact that it is mostly a spoken variety rarely recorded and that it is subject to considerable regional variation. This paper presents a freely available general-purpose corpus of spoken Swiss German suitable for linguistic research, but also for training automatic tools. The corpus is a result of a long design process, intensive manual work and specially adapted computational processing. We first describe how the documents were transcribed, segmented and aligned with the sound source, and how inconsistent transcriptions were unified through an additional normalisation layer. We then present a bootstrapping approach to automatic normalisation using different machine-translation-inspired methods. Furthermore, we evaluate the performance of part-of-speech taggers on our data and show how the same bootstrapping approach improves part-of-speech tagging by 10% over four rounds. Finally, we present the modalities of access of the corpus as well as the data format.
Tasks	Machine Translation, Part-Of-Speech Tagging
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1641/
PDF	https://www.aclweb.org/anthology/L16-1641
PWC	https://paperswithcode.com/paper/archimob-a-corpus-of-spoken-swiss-german
Repo
Framework

Can SMT and RBMT Improve each other’s Performance?- An Experiment with English-Hindi Translation


Title	Can SMT and RBMT Improve each other’s Performance?- An Experiment with English-Hindi Translation
Authors	Debajyoty Banik, Sukanta Sen, Asif Ekbal, Pushpak Bhattacharyya
Abstract
Tasks	Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-6303/
PDF	https://www.aclweb.org/anthology/W16-6303
PWC	https://paperswithcode.com/paper/can-smt-and-rbmt-improve-each-others
Repo
Framework

Using Contextual Information for Machine Translation Evaluation


Title	Using Contextual Information for Machine Translation Evaluation
Authors	Marina Fomicheva, Núria Bel
Abstract	Automatic evaluation of Machine Translation (MT) is typically approached by measuring similarity between the candidate MT and a human reference translation. An important limitation of existing evaluation systems is that they are unable to distinguish candidate-reference differences that arise due to acceptable linguistic variation from the differences induced by MT errors. In this paper we present a new metric, UPF-Cobalt, that addresses this issue by taking into consideration the syntactic contexts of candidate and reference words. The metric applies a penalty when the words are similar but the contexts in which they occur are not equivalent. In this way, Machine Translations (MTs) that are different from the human translation but still essentially correct are distinguished from those that share a high number of words with the reference but alter the meaning of the sentence due to translation errors. The results show that the method proposed is indeed beneficial for automatic MT evaluation. We report experiments based on two different evaluation tasks with various types of manual quality assessment. The metric significantly outperforms state-of-the-art evaluation systems in varying evaluation settings.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1437/
PDF	https://www.aclweb.org/anthology/L16-1437
PWC	https://paperswithcode.com/paper/using-contextual-information-for-machine
Repo
Framework

AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons


Title	AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons
Authors	Nora Al-Twairesh, Hend Al-Khalifa, Abdulmalik Al-Salman
Abstract
Tasks	Sentiment Analysis
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1066/
PDF	https://www.aclweb.org/anthology/P16-1066
PWC	https://paperswithcode.com/paper/arasenti-large-scale-twitter-specific-arabic
Repo
Framework

Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs


Title	Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs
Authors	Sina Zarrieß, David Schlangen
Abstract
Tasks	Text Generation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1058/
PDF	https://www.aclweb.org/anthology/P16-1058
PWC	https://paperswithcode.com/paper/easy-things-first-installments-improve
Repo
Framework

On the verbalization patterns of part-whole relations in isiZulu


Title	On the verbalization patterns of part-whole relations in isiZulu
Authors	C. Maria Keet, Langa Khumalo
Abstract
Tasks	Speech Recognition, Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-6629/
PDF	https://www.aclweb.org/anthology/W16-6629
PWC	https://paperswithcode.com/paper/on-the-verbalization-patterns-of-part-whole
Repo
Framework

Learning Non-Linear Functions for Text Classification


Title	Learning Non-Linear Functions for Text Classification
Authors	Cohan Sujay Carlos, Geetanjali Rakshit
Abstract
Tasks	Text Categorization, Text Classification
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-6327/
PDF	https://www.aclweb.org/anthology/W16-6327
PWC	https://paperswithcode.com/paper/learning-non-linear-functions-for-text
Repo
Framework