May 5, 2019

Paper Group NANR 138

Estimating the amenability of new domains for deception detection

Title Estimating the amenability of new domains for deception detection
Authors Eileen Fitzpatrick, Joan Bachenko
Abstract
Tasks Deception Detection
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0804/
PDF https://www.aclweb.org/anthology/W16-0804
PWC https://paperswithcode.com/paper/estimating-the-amenibility-of-new-domains-for
Repo
Framework

Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading

Title Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading
Authors Ulrike Padó
Abstract Automated short-answer grading is key to help close the automation loop for large-scale, computerised testing in education. A wide range of features on different levels of linguistic processing has been proposed so far. We investigate the relative importance of the different types of features across a range of standard corpora (both from a language skill and content assessment context, in English and in German). We find that features on the lexical, text similarity and dependency level often suffice to approximate full-model performance. Features derived from semantic processing particularly benefit the linguistically more varied answers in content assessment corpora.
Tasks Natural Language Inference
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1206/
PDF https://www.aclweb.org/anthology/C16-1206
PWC https://paperswithcode.com/paper/get-semantic-with-me-the-usefulness-of
Repo
Framework

Joint Learning of Local and Global Features for Entity Linking via Neural Networks

Title Joint Learning of Local and Global Features for Entity Linking via Neural Networks
Authors Thien Huu Nguyen, Nicolas Fauceglia, Mariano Rodriguez Muro, Oktie Hassanzadeh, Alfio Massimiliano Gliozzo, Mohammad Sadoghi
Abstract Previous studies have highlighted the necessity for entity linking systems to capture the local entity-mention similarities and the global topical coherence. We introduce a novel framework based on convolutional neural networks and recurrent neural networks to simultaneously model the local and global features for entity linking. The proposed model benefits from the capacity of convolutional neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. Our evaluation on multiple datasets demonstrates the effectiveness of the model and yields the state-of-the-art performance on such datasets. In addition, we examine the entity linking systems on the domain adaptation setting that further demonstrates the cross-domain robustness of the proposed model.
Tasks Domain Adaptation, Entity Linking
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1218/
PDF https://www.aclweb.org/anthology/C16-1218
PWC https://paperswithcode.com/paper/joint-learning-of-local-and-global-features
Repo
Framework

Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ

Title Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ
Authors David Wible, Nai-Lung Tsao
Abstract Adult second language learners face the daunting but underappreciated task of mastering patterns of language use that are neither products of fully productive grammar rules nor frozen items to be memorized. Word Midas, a web browser extension, targets this uncharted territory of lexicogrammar by detecting multiword tokens of lexicogrammatical patterning in real time in situ within the noisy digital texts from the user's unscripted web browsing or other digital venues. The language model powering Word Midas is StringNet, a densely cross-indexed navigable network of one billion lexicogrammatical patterns of English. These resources are described and their functionality is illustrated with a detailed scenario.
Tasks Language Modelling
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2005/
PDF https://www.aclweb.org/anthology/C16-2005
PWC https://paperswithcode.com/paper/word-midas-powered-by-stringnet-discovering
Repo
Framework

Multiword Expressions Dataset for Indian Languages

Title Multiword Expressions Dataset for Indian Languages
Authors Dhirendra Singh, Sudha Bhingardive, Pushpak Bhattacharyya
Abstract Multiword Expressions (MWEs) are used frequently in natural languages, but understanding the diversity in MWEs is one of the open problems in the area of Natural Language Processing. In the context of Indian languages, MWEs play an important role. In this paper, we present an MWE annotation dataset created for Indian languages, viz. Hindi and Marathi. We extract possible MWE candidates using two repositories: 1) the POS-tagged corpus and 2) the IndoWordNet synsets. Annotation is done for two types of MWEs: compound nouns and light verb constructions. In the process of annotation, human annotators tag valid MWEs from these candidates based on the standard guidelines provided to them. We obtained 3178 compound nouns and 2556 light verb constructions in Hindi and 1003 compound nouns and 2416 light verb constructions in Marathi using the two repositories mentioned before. This created resource is made available publicly and can be used as a gold standard for Hindi and Marathi MWE systems.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1369/
PDF https://www.aclweb.org/anthology/L16-1369
PWC https://paperswithcode.com/paper/multiword-expressions-dataset-for-indian
Repo
Framework

A Corpus of Images and Text in Online News

Title A Corpus of Images and Text in Online News
Authors Laura Hollink, Adriatik Bedjeti, Martin van Harmelen, Desmond Elliott
Abstract In recent years, several datasets have been released that include images and text, giving impulse to new methods that combine natural language processing and computer vision. However, there is a need for datasets of images in their natural textual context. The ION corpus contains 300K news articles published between August 2014 - 2015 in five online newspapers from two countries. The 1-year coverage over multiple publishers ensures a broad scope in terms of topics, image quality and editorial viewpoints. The corpus consists of JSON-LD files with the following data about each article: the original URL of the article on the news publisher's website, the date of publication, the headline of the article, the URL of the image displayed with the article (if any), and the caption of that image. Neither the article text nor the images themselves are included in the corpus. Instead, the images are distributed as high-dimensional feature vectors extracted from a Convolutional Neural Network, anticipating their use in computer vision tasks. The article text is represented as a list of automatically generated entity and topic annotations in the form of Wikipedia/DBpedia pages. This facilitates the selection of subsets of the corpus for separate analysis or evaluation.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1219/
PDF https://www.aclweb.org/anthology/L16-1219
PWC https://paperswithcode.com/paper/a-corpus-of-images-and-text-in-online-news
Repo
Framework

SDP-JAIST: A Shallow Discourse Parsing system @ CoNLL 2016 Shared Task

Title SDP-JAIST: A Shallow Discourse Parsing system @ CoNLL 2016 Shared Task
Authors Minh Nguyen
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-2020/
PDF https://www.aclweb.org/anthology/K16-2020
PWC https://paperswithcode.com/paper/sdp-jaist-a-shallow-discourse-parsing-system
Repo
Framework

A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions

Title A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions
Authors Ryohei Sasano, Manabu Okumura
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1211/
PDF https://www.aclweb.org/anthology/P16-1211
PWC https://paperswithcode.com/paper/a-corpus-based-analysis-of-canonical-word
Repo
Framework

ArchiMob - A Corpus of Spoken Swiss German

Title ArchiMob - A Corpus of Spoken Swiss German
Authors Tanja Samardžić, Yves Scherrer, Elvira Glaser
Abstract Swiss dialects of German are, unlike most dialects of well standardised languages, widely used in everyday communication. Despite this fact, automatic processing of Swiss German is still a considerable challenge due to the fact that it is mostly a spoken variety rarely recorded and that it is subject to considerable regional variation. This paper presents a freely available general-purpose corpus of spoken Swiss German suitable for linguistic research, but also for training automatic tools. The corpus is a result of a long design process, intensive manual work and specially adapted computational processing. We first describe how the documents were transcribed, segmented and aligned with the sound source, and how inconsistent transcriptions were unified through an additional normalisation layer. We then present a bootstrapping approach to automatic normalisation using different machine-translation-inspired methods. Furthermore, we evaluate the performance of part-of-speech taggers on our data and show how the same bootstrapping approach improves part-of-speech tagging by 10% over four rounds. Finally, we present the modalities of access of the corpus as well as the data format.
Tasks Machine Translation, Part-Of-Speech Tagging
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1641/
PDF https://www.aclweb.org/anthology/L16-1641
PWC https://paperswithcode.com/paper/archimob-a-corpus-of-spoken-swiss-german
Repo
Framework

Can SMT and RBMT Improve each other’s Performance?- An Experiment with English-Hindi Translation

Title Can SMT and RBMT Improve each other’s Performance?- An Experiment with English-Hindi Translation
Authors Debajyoty Banik, Sukanta Sen, Asif Ekbal, Pushpak Bhattacharyya
Abstract
Tasks Machine Translation
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6303/
PDF https://www.aclweb.org/anthology/W16-6303
PWC https://paperswithcode.com/paper/can-smt-and-rbmt-improve-each-others
Repo
Framework

Using Contextual Information for Machine Translation Evaluation

Title Using Contextual Information for Machine Translation Evaluation
Authors Marina Fomicheva, Núria Bel
Abstract Automatic evaluation of Machine Translation (MT) is typically approached by measuring similarity between the candidate MT and a human reference translation. An important limitation of existing evaluation systems is that they are unable to distinguish candidate-reference differences that arise due to acceptable linguistic variation from the differences induced by MT errors. In this paper we present a new metric, UPF-Cobalt, that addresses this issue by taking into consideration the syntactic contexts of candidate and reference words. The metric applies a penalty when the words are similar but the contexts in which they occur are not equivalent. In this way, Machine Translations (MTs) that are different from the human translation but still essentially correct are distinguished from those that share a high number of words with the reference but alter the meaning of the sentence due to translation errors. The results show that the method proposed is indeed beneficial for automatic MT evaluation. We report experiments based on two different evaluation tasks with various types of manual quality assessment. The metric significantly outperforms state-of-the-art evaluation systems in varying evaluation settings.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1437/
PDF https://www.aclweb.org/anthology/L16-1437
PWC https://paperswithcode.com/paper/using-contextual-information-for-machine
Repo
Framework

AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons

Title AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons
Authors Nora Al-Twairesh, Hend Al-Khalifa, Abdulmalik Al-Salman
Abstract
Tasks Sentiment Analysis
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1066/
PDF https://www.aclweb.org/anthology/P16-1066
PWC https://paperswithcode.com/paper/arasenti-large-scale-twitter-specific-arabic
Repo
Framework

Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs

Title Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs
Authors Sina Zarrieß, David Schlangen
Abstract
Tasks Text Generation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1058/
PDF https://www.aclweb.org/anthology/P16-1058
PWC https://paperswithcode.com/paper/easy-things-first-installments-improve
Repo
Framework

On the verbalization patterns of part-whole relations in isiZulu

Title On the verbalization patterns of part-whole relations in isiZulu
Authors C. Maria Keet, Langa Khumalo
Abstract
Tasks Speech Recognition, Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-6629/
PDF https://www.aclweb.org/anthology/W16-6629
PWC https://paperswithcode.com/paper/on-the-verbalization-patterns-of-part-whole
Repo
Framework

Learning Non-Linear Functions for Text Classification

Title Learning Non-Linear Functions for Text Classification
Authors Cohan Sujay Carlos, Geetanjali Rakshit
Abstract
Tasks Text Categorization, Text Classification
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6327/
PDF https://www.aclweb.org/anthology/W16-6327
PWC https://paperswithcode.com/paper/learning-non-linear-functions-for-text
Repo
Framework