May 5, 2019

Paper Group NANR 51

Expressions of Anxiety in Political Texts

Title Expressions of Anxiety in Political Texts
Authors Ludovic Rheault
Abstract
Tasks Decision Making, Decision Making Under Uncertainty, Semantic Textual Similarity
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-5612/
PDF https://www.aclweb.org/anthology/W16-5612
PWC https://paperswithcode.com/paper/expressions-of-anxiety-in-political-texts
Repo
Framework

Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests

Title Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests
Authors Takuya Matsuzaki, Akira Fujita, Naoya Todo, Noriko H. Arai
Abstract This paper reports on an experiment where 795 human participants answered to the questions taken from second language proficiency tests that were translated to their native language. The output of three machine translation systems and two different human translations were used as the test material. We classified the translation errors in the questions according to an error taxonomy and analyzed the participants’ response on the basis of the type and frequency of the translation errors. Through the analysis, we identified several types of errors that deteriorated most the accuracy of the participants’ answers, their confidence on the answers, and their overall evaluation of the translation quality.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1440/
PDF https://www.aclweb.org/anthology/L16-1440
PWC https://paperswithcode.com/paper/translation-errors-and-incomprehensibility-a
Repo
Framework

On Generating Characteristic-rich Question Sets for QA Evaluation

Title On Generating Characteristic-rich Question Sets for QA Evaluation
Authors Yu Su, Huan Sun, Brian Sadler, Mudhakar Srivatsa, Izzeddin Gür, Zenghui Yan, Xifeng Yan
Abstract
Tasks Question Answering
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1054/
PDF https://www.aclweb.org/anthology/D16-1054
PWC https://paperswithcode.com/paper/on-generating-characteristic-rich-question
Repo
Framework

Structured Sparse Regression via Greedy Hard Thresholding

Title Structured Sparse Regression via Greedy Hard Thresholding
Authors Prateek Jain, Nikhil Rao, Inderjit S. Dhillon
Abstract Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups. For very large datasets and under standard sparsity constraints, hard thresholding methods have proven to be extremely efficient, but such methods require NP hard projections when dealing with overlapping groups. In this paper, we show that such NP-hard projections can not only be avoided by appealing to submodular optimization, but such methods come with strong theoretical guarantees even in the presence of poorly conditioned data (i.e. say when two features have correlation $\geq 0.99$), which existing analyses cannot handle. These methods exhibit an interesting computation-accuracy trade-off and can be extended to significantly harder problems such as sparse overlapping groups. Experiments on both real and synthetic data validate our claims and demonstrate that the proposed methods are orders of magnitude faster than other greedy and convex relaxation techniques for learning with group-structured sparsity.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6425-structured-sparse-regression-via-greedy-hard-thresholding
PDF http://papers.nips.cc/paper/6425-structured-sparse-regression-via-greedy-hard-thresholding.pdf
PWC https://paperswithcode.com/paper/structured-sparse-regression-via-greedy-hard-1
Repo
Framework
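
As a rough illustration of the hard-thresholding idea mentioned in the abstract above, here is a minimal sketch for the simpler case of non-overlapping groups. It is not the authors' algorithm (which targets overlapping groups through greedy, submodular selection); the function names `group_hard_threshold` and `group_iht`, the step-size rule, and the iteration count are all assumptions made for this example.

```python
import numpy as np

def group_hard_threshold(w, groups, k):
    """Keep the k groups with the largest Euclidean norm and zero the rest.

    Assumes the groups do not overlap (a simplification for this sketch).
    """
    norms = [np.linalg.norm(w[g]) for g in groups]
    keep = np.argsort(norms)[-k:]  # indices of the k largest group norms
    out = np.zeros_like(w)
    for i in keep:
        out[groups[i]] = w[groups[i]]
    return out

def group_iht(X, y, groups, k, step=None, iters=200):
    """Iterative hard thresholding for least squares with k active groups."""
    d = X.shape[1]
    if step is None:
        # Hypothetical default: 1 / (largest singular value of X)^2.
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ w - y)  # gradient of 0.5 * ||Xw - y||^2
        w = group_hard_threshold(w - step * grad, groups, k)
    return w
```

Each iteration takes a gradient step on the squared loss and then projects onto vectors supported on at most `k` groups; with overlapping groups this projection is the step the abstract describes as NP-hard.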

Cross-lingual RDF Thesauri Interlinking

Title Cross-lingual RDF Thesauri Interlinking
Authors Tatiana Lesnikova, Jérôme David, Jérôme Euzenat
Abstract Various lexical resources are being published in RDF. To enhance the usability of these resources, identical resources in different data sets should be linked. If lexical resources are described in different natural languages, then techniques to deal with multilinguality are required for interlinking. In this paper, we evaluate machine translation for interlinking concepts, i.e., generic entities named with a common noun or term. In our previous work, the evaluated method has been applied on named entities. We conduct two experiments involving different thesauri in different languages. The first experiment involves concepts from the TheSoz multilingual thesaurus in three languages: English, French and German. The second experiment involves concepts from the EuroVoc and AGROVOC thesauri in English and Chinese respectively. Our results demonstrate that machine translation can be beneficial for cross-lingual thesauri interlinking independently of a dataset structure.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1387/
PDF https://www.aclweb.org/anthology/L16-1387
PWC https://paperswithcode.com/paper/cross-lingual-rdf-thesauri-interlinking
Repo
Framework

Detecting Cross-Cultural Differences Using a Multilingual Topic Model

Title Detecting Cross-Cultural Differences Using a Multilingual Topic Model
Authors E.D. Gutiérrez, Ekaterina Shutova, Patricia Lichtenstein, Gerard de Melo, Luca Gilardi
Abstract Understanding cross-cultural differences has important implications for world affairs and many aspects of the life of society. Yet, the majority of text-mining methods to date focus on the analysis of monolingual texts. In contrast, we present a statistical model that simultaneously learns a set of common topics from multilingual, non-parallel data and automatically discovers the differences in perspectives on these topics across linguistic communities. We perform a behavioural evaluation of a subset of the differences identified by our model in English and Spanish to investigate their psychological validity.
Tasks Decision Making
Published 2016-01-01
URL https://www.aclweb.org/anthology/Q16-1004/
PDF https://www.aclweb.org/anthology/Q16-1004
PWC https://paperswithcode.com/paper/detecting-cross-cultural-differences-using-a
Repo
Framework

A Linear Baseline Classifier for Cross-Lingual Pronoun Prediction

Title A Linear Baseline Classifier for Cross-Lingual Pronoun Prediction
Authors Jörg Tiedemann
Abstract
Tasks Feature Engineering, Language Modelling, Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2356/
PDF https://www.aclweb.org/anthology/W16-2356
PWC https://paperswithcode.com/paper/a-linear-baseline-classifier-for-cross
Repo
Framework

SuperCAT: The (New and Improved) Corpus Analysis Toolkit

Title SuperCAT: The (New and Improved) Corpus Analysis Toolkit
Authors K. Bretonnel Cohen, William A. Baumgartner Jr., Irina Temnikova
Abstract This paper reports SuperCAT, a corpus analysis toolkit. It is a radical extension of SubCAT, the Sublanguage Corpus Analysis Toolkit, from sublanguage analysis to corpus analysis in general. The idea behind SuperCAT is that representative corpora have no tendency towards closure―that is, they tend towards infinity. In contrast, non-representative corpora have a tendency towards closure―roughly, finiteness. SuperCAT focuses on general techniques for the quantitative description of the characteristics of any corpus (or other language sample), particularly concerning the characteristics of lexical distributions. Additionally, SuperCAT features a complete re-engineering of the previous SubCAT architecture.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1442/
PDF https://www.aclweb.org/anthology/L16-1442
PWC https://paperswithcode.com/paper/supercat-the-new-and-improved-corpus-analysis
Repo
Framework

Review on the Existing Language Resources for Languages of France

Title Review on the Existing Language Resources for Languages of France
Authors Thibault Grouas, Valérie Mapelli, Quentin Samier
Abstract With the support of the DGLFLF, ELDA conducted an inventory of existing language resources for the regional languages of France. The main aim of this inventory was to assess the exploitability of the identified resources within technologies. A total of 2,299 Language Resources were identified. As a second step, a deeper analysis of a set of three language groups (Breton, Occitan, overseas languages) was carried out along with a focus of their exploitability within three technologies: automatic translation, voice recognition/synthesis and spell checkers. The survey was followed by the organisation of the TLRF2015 Conference which aimed to present the state of the art in the field of the Technologies for Regional Languages of France. The next step will be to activate the network of specialists built up during the TLRF conference and to begin the organisation of a second TLRF conference. Meanwhile, the French Ministry of Culture continues its actions related to linguistic diversity and technology, in particular through a project with Wikimedia France related to contributions to Wikipedia in regional languages, the upcoming new version of the “Corpus de la Parole” and the reinforcement of the DGLFLF’s Observatory of Linguistic Practices.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1719/
PDF https://www.aclweb.org/anthology/L16-1719
PWC https://paperswithcode.com/paper/review-on-the-existing-language-resources-for
Repo
Framework

The Howard University System Submission for the Shared Task in Language Identification in Spanish-English Codeswitching

Title The Howard University System Submission for the Shared Task in Language Identification in Spanish-English Codeswitching
Authors Rouzbeh Shirvani, Mario Piergallini, Gauri Shankar Gautam, Mohamed Chouikha
Abstract
Tasks Language Identification
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-5815/
PDF https://www.aclweb.org/anthology/W16-5815
PWC https://paperswithcode.com/paper/the-howard-university-system-submission-for
Repo
Framework

Detecting Optional Arguments of Verbs

Title Detecting Optional Arguments of Verbs
Authors András Kornai, Dávid Márk Nemeskey, Gábor Recski
Abstract We propose a novel method for detecting optional arguments of Hungarian verbs using only positive data. We introduce a custom variant of collexeme analysis that explicitly models the noise in verb frames. Our method is, for the most part, unsupervised: we use the spectral clustering algorithm described in Brew and Schulte im Walde (2002) to build a noise model from a short, manually verified seed list of verbs. We experimented with both raw count- and context-based clusterings and found their performance almost identical. The code for our algorithm and the frame list are freely available at http://hlt.bme.hu/en/resources/tade.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1448/
PDF https://www.aclweb.org/anthology/L16-1448
PWC https://paperswithcode.com/paper/detecting-optional-arguments-of-verbs
Repo
Framework

Leveraging Native Data to Correct Preposition Errors in Learners’ Dutch

Title Leveraging Native Data to Correct Preposition Errors in Learners’ Dutch
Authors Lennart Kloppenburg, Malvina Nissim
Abstract We address the task of automatically correcting preposition errors in learners’ Dutch by modelling preposition usage in native language. Specifically, we build two models exploiting a large corpus of Dutch. The first is a binary model for detecting whether a preposition should be used at all in a given position or not. The second is a multiclass model for selecting the appropriate preposition in case one should be used. The models are tested on native as well as learners data. For the latter we exploit a crowdsourcing strategy to elicit native judgements. On native test data the models perform very well, showing that we can model preposition usage appropriately. However, the evaluation on learners’ data shows that while detecting that a given preposition is wrong is doable reasonably well, detecting the absence of a preposition is a lot more difficult. Observing such results and the data we deal with, we envisage various ways of improving performance, and report them in the final section of this article.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1449/
PDF https://www.aclweb.org/anthology/L16-1449
PWC https://paperswithcode.com/paper/leveraging-native-data-to-correct-preposition
Repo
Framework

A Preliminary Study of Disputation Behavior in Online Debating Forum

Title A Preliminary Study of Disputation Behavior in Online Debating Forum
Authors Zhongyu Wei, Yandi Xia, Chen Li, Yang Liu, Zachary Stallbohm, Yi Li, Yang Jin
Abstract
Tasks Argument Mining
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2820/
PDF https://www.aclweb.org/anthology/W16-2820
PWC https://paperswithcode.com/paper/a-preliminary-study-of-disputation-behavior
Repo
Framework

Multilevel Annotation of Agreement and Disagreement in Italian News Blogs

Title Multilevel Annotation of Agreement and Disagreement in Italian News Blogs
Authors Fabio Celli, Giuseppe Riccardi, Firoj Alam
Abstract In this paper, we present a corpus of news blog conversations in Italian annotated with gold standard agreement/disagreement relations at message and sentence levels. This is the first resource of this kind in Italian. From the analysis of ADRs at the two levels emerged that agreement annotated at message level is consistent and generally reflected at sentence level, moreover, the argumentation structure of disagreement is more complex than agreement. The manual error analysis revealed that this resource is useful not only for the analysis of argumentation, but also for the detection of irony/sarcasm in online debates. The corpus and annotation tool are available for research purposes on request.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1451/
PDF https://www.aclweb.org/anthology/L16-1451
PWC https://paperswithcode.com/paper/multilevel-annotation-of-agreement-and
Repo
Framework

Discontinuous Verb Phrases in Parsing and Machine Translation of English and German

Title Discontinuous Verb Phrases in Parsing and Machine Translation of English and German
Authors Sharid Loáiciga, Kristina Gulordava
Abstract In this paper, we focus on the verb-particle (V-Prt) split construction in English and German and its difficulty for parsing and Machine Translation (MT). For German, we use an existing test suite of V-Prt split constructions, while for English, we build a new and comparable test suite from raw data. These two data sets are then used to perform an analysis of errors in dependency parsing, word-level alignment and MT, which arise from the discontinuous order in V-Prt split constructions. In the automatic alignments of parallel corpora, most of the particles align to NULL. These mis-alignments and the inability of phrase-based MT system to recover discontinuous phrases result in low quality translations of V-Prt split constructions both in English and German. However, our results show that the V-Prt split phrases are correctly parsed in 90% of cases, suggesting that syntactic-based MT should perform better on these constructions. We evaluate a syntactic-based MT system on German and compare its performance to the phrase-based system.
Tasks Dependency Parsing, Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1453/
PDF https://www.aclweb.org/anthology/L16-1453
PWC https://paperswithcode.com/paper/discontinuous-verb-phrases-in-parsing-and
Repo
Framework