May 5, 2019

Paper Group NANR 51

Expressions of Anxiety in Political Texts


Title	Expressions of Anxiety in Political Texts
Authors	Ludovic Rheault
Abstract
Tasks	Decision Making, Decision Making Under Uncertainty, Semantic Textual Similarity
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-5612/
PDF	https://www.aclweb.org/anthology/W16-5612
PWC	https://paperswithcode.com/paper/expressions-of-anxiety-in-political-texts
Repo
Framework

Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests


Title	Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests
Authors	Takuya Matsuzaki, Akira Fujita, Naoya Todo, Noriko H. Arai
Abstract	This paper reports on an experiment in which 795 human participants answered questions taken from second language proficiency tests that had been translated into their native language. The output of three machine translation systems and two different human translations were used as the test material. We classified the translation errors in the questions according to an error taxonomy and analyzed the participants' responses on the basis of the type and frequency of the translation errors. Through the analysis, we identified several types of errors that most deteriorated the accuracy of the participants' answers, their confidence in the answers, and their overall evaluation of the translation quality.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1440/
PDF	https://www.aclweb.org/anthology/L16-1440
PWC	https://paperswithcode.com/paper/translation-errors-and-incomprehensibility-a
Repo
Framework

On Generating Characteristic-rich Question Sets for QA Evaluation


Title	On Generating Characteristic-rich Question Sets for QA Evaluation
Authors	Yu Su, Huan Sun, Brian Sadler, Mudhakar Srivatsa, Izzeddin Gür, Zenghui Yan, Xifeng Yan
Abstract
Tasks	Question Answering
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1054/
PDF	https://www.aclweb.org/anthology/D16-1054
PWC	https://paperswithcode.com/paper/on-generating-characteristic-rich-question
Repo
Framework

Structured Sparse Regression via Greedy Hard Thresholding


Title	Structured Sparse Regression via Greedy Hard Thresholding
Authors	Prateek Jain, Nikhil Rao, Inderjit S. Dhillon
Abstract	Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups. For very large datasets and under standard sparsity constraints, hard thresholding methods have proven to be extremely efficient, but such methods require NP-hard projections when dealing with overlapping groups. In this paper, we show that such NP-hard projections can not only be avoided by appealing to submodular optimization, but such methods come with strong theoretical guarantees even in the presence of poorly conditioned data (i.e. say when two features have correlation $\geq 0.99$), which existing analyses cannot handle. These methods exhibit an interesting computation-accuracy trade-off and can be extended to significantly harder problems such as sparse overlapping groups. Experiments on both real and synthetic data validate our claims and demonstrate that the proposed methods are orders of magnitude faster than other greedy and convex relaxation techniques for learning with group-structured sparsity.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6425-structured-sparse-regression-via-greedy-hard-thresholding
PDF	http://papers.nips.cc/paper/6425-structured-sparse-regression-via-greedy-hard-thresholding.pdf
PWC	https://paperswithcode.com/paper/structured-sparse-regression-via-greedy-hard-1
Repo
Framework

Cross-lingual RDF Thesauri Interlinking


Title	Cross-lingual RDF Thesauri Interlinking
Authors	Tatiana Lesnikova, Jérôme David, Jérôme Euzenat
Abstract	Various lexical resources are being published in RDF. To enhance the usability of these resources, identical resources in different data sets should be linked. If lexical resources are described in different natural languages, then techniques to deal with multilinguality are required for interlinking. In this paper, we evaluate machine translation for interlinking concepts, i.e., generic entities named with a common noun or term. In our previous work, the evaluated method has been applied on named entities. We conduct two experiments involving different thesauri in different languages. The first experiment involves concepts from the TheSoz multilingual thesaurus in three languages: English, French and German. The second experiment involves concepts from the EuroVoc and AGROVOC thesauri in English and Chinese respectively. Our results demonstrate that machine translation can be beneficial for cross-lingual thesauri interlinking independently of a dataset structure.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1387/
PDF	https://www.aclweb.org/anthology/L16-1387
PWC	https://paperswithcode.com/paper/cross-lingual-rdf-thesauri-interlinking
Repo
Framework

Detecting Cross-Cultural Differences Using a Multilingual Topic Model


Title	Detecting Cross-Cultural Differences Using a Multilingual Topic Model
Authors	E.D. Gutiérrez, Ekaterina Shutova, Patricia Lichtenstein, Gerard de Melo, Luca Gilardi
Abstract	Understanding cross-cultural differences has important implications for world affairs and many aspects of the life of society. Yet, the majority of text-mining methods to date focus on the analysis of monolingual texts. In contrast, we present a statistical model that simultaneously learns a set of common topics from multilingual, non-parallel data and automatically discovers the differences in perspectives on these topics across linguistic communities. We perform a behavioural evaluation of a subset of the differences identified by our model in English and Spanish to investigate their psychological validity.
Tasks	Decision Making
Published	2016-01-01
URL	https://www.aclweb.org/anthology/Q16-1004/
PDF	https://www.aclweb.org/anthology/Q16-1004
PWC	https://paperswithcode.com/paper/detecting-cross-cultural-differences-using-a
Repo
Framework

A Linear Baseline Classifier for Cross-Lingual Pronoun Prediction


Title	A Linear Baseline Classifier for Cross-Lingual Pronoun Prediction
Authors	Jörg Tiedemann
Abstract
Tasks	Feature Engineering, Language Modelling, Machine Translation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2356/
PDF	https://www.aclweb.org/anthology/W16-2356
PWC	https://paperswithcode.com/paper/a-linear-baseline-classifier-for-cross
Repo
Framework

SuperCAT: The (New and Improved) Corpus Analysis Toolkit


Title	SuperCAT: The (New and Improved) Corpus Analysis Toolkit
Authors	K. Bretonnel Cohen, William A. Baumgartner Jr., Irina Temnikova
Abstract	This paper reports SuperCAT, a corpus analysis toolkit. It is a radical extension of SubCAT, the Sublanguage Corpus Analysis Toolkit, from sublanguage analysis to corpus analysis in general. The idea behind SuperCAT is that representative corpora have no tendency towards closure―that is, they tend towards infinity. In contrast, non-representative corpora have a tendency towards closure―roughly, finiteness. SuperCAT focuses on general techniques for the quantitative description of the characteristics of any corpus (or other language sample), particularly concerning the characteristics of lexical distributions. Additionally, SuperCAT features a complete re-engineering of the previous SubCAT architecture.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1442/
PDF	https://www.aclweb.org/anthology/L16-1442
PWC	https://paperswithcode.com/paper/supercat-the-new-and-improved-corpus-analysis
Repo
Framework

Review on the Existing Language Resources for Languages of France


Title	Review on the Existing Language Resources for Languages of France
Authors	Thibault Grouas, Valérie Mapelli, Quentin Samier
Abstract	With the support of the DGLFLF, ELDA conducted an inventory of existing language resources for the regional languages of France. The main aim of this inventory was to assess the exploitability of the identified resources within technologies. A total of 2,299 Language Resources were identified. As a second step, a deeper analysis of a set of three language groups (Breton, Occitan, overseas languages) was carried out along with a focus of their exploitability within three technologies: automatic translation, voice recognition/synthesis and spell checkers. The survey was followed by the organisation of the TLRF2015 Conference which aimed to present the state of the art in the field of the Technologies for Regional Languages of France. The next step will be to activate the network of specialists built up during the TLRF conference and to begin the organisation of a second TLRF conference. Meanwhile, the French Ministry of Culture continues its actions related to linguistic diversity and technology, in particular through a project with Wikimedia France related to contributions to Wikipedia in regional languages, the upcoming new version of the "Corpus de la Parole" and the reinforcement of the DGLFLF's Observatory of Linguistic Practices.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1719/
PDF	https://www.aclweb.org/anthology/L16-1719
PWC	https://paperswithcode.com/paper/review-on-the-existing-language-resources-for
Repo
Framework

The Howard University System Submission for the Shared Task in Language Identification in Spanish-English Codeswitching


Title	The Howard University System Submission for the Shared Task in Language Identification in Spanish-English Codeswitching
Authors	Rouzbeh Shirvani, Mario Piergallini, Gauri Shankar Gautam, Mohamed Chouikha
Abstract
Tasks	Language Identification
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-5815/
PDF	https://www.aclweb.org/anthology/W16-5815
PWC	https://paperswithcode.com/paper/the-howard-university-system-submission-for
Repo
Framework

Detecting Optional Arguments of Verbs


Title	Detecting Optional Arguments of Verbs
Authors	András Kornai, Dávid Márk Nemeskey, Gábor Recski
Abstract	We propose a novel method for detecting optional arguments of Hungarian verbs using only positive data. We introduce a custom variant of collexeme analysis that explicitly models the noise in verb frames. Our method is, for the most part, unsupervised: we use the spectral clustering algorithm described in Brew and Schulte im Walde (2002) to build a noise model from a short, manually verified seed list of verbs. We experimented with both raw count- and context-based clusterings and found their performance almost identical. The code for our algorithm and the frame list are freely available at http://hlt.bme.hu/en/resources/tade.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1448/
PDF	https://www.aclweb.org/anthology/L16-1448
PWC	https://paperswithcode.com/paper/detecting-optional-arguments-of-verbs
Repo
Framework

Leveraging Native Data to Correct Preposition Errors in Learners’ Dutch


Title	Leveraging Native Data to Correct Preposition Errors in Learners’ Dutch
Authors	Lennart Kloppenburg, Malvina Nissim
Abstract	We address the task of automatically correcting preposition errors in learners' Dutch by modelling preposition usage in native language. Specifically, we build two models exploiting a large corpus of Dutch. The first is a binary model for detecting whether a preposition should be used at all in a given position or not. The second is a multiclass model for selecting the appropriate preposition in case one should be used. The models are tested on native as well as learners data. For the latter we exploit a crowdsourcing strategy to elicit native judgements. On native test data the models perform very well, showing that we can model preposition usage appropriately. However, the evaluation on learners' data shows that while detecting that a given preposition is wrong is doable reasonably well, detecting the absence of a preposition is a lot more difficult. Observing such results and the data we deal with, we envisage various ways of improving performance, and report them in the final section of this article.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1449/
PDF	https://www.aclweb.org/anthology/L16-1449
PWC	https://paperswithcode.com/paper/leveraging-native-data-to-correct-preposition
Repo
Framework

A Preliminary Study of Disputation Behavior in Online Debating Forum


Title	A Preliminary Study of Disputation Behavior in Online Debating Forum
Authors	Zhongyu Wei, Yandi Xia, Chen Li, Yang Liu, Zachary Stallbohm, Yi Li, Yang Jin
Abstract
Tasks	Argument Mining
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2820/
PDF	https://www.aclweb.org/anthology/W16-2820
PWC	https://paperswithcode.com/paper/a-preliminary-study-of-disputation-behavior
Repo
Framework

Multilevel Annotation of Agreement and Disagreement in Italian News Blogs


Title	Multilevel Annotation of Agreement and Disagreement in Italian News Blogs
Authors	Fabio Celli, Giuseppe Riccardi, Firoj Alam
Abstract	In this paper, we present a corpus of news blog conversations in Italian annotated with gold standard agreement/disagreement relations (ADRs) at message and sentence levels. This is the first resource of this kind in Italian. The analysis of ADRs at the two levels showed that agreement annotated at message level is consistent and generally reflected at sentence level; moreover, the argumentation structure of disagreement is more complex than that of agreement. The manual error analysis revealed that this resource is useful not only for the analysis of argumentation, but also for the detection of irony/sarcasm in online debates. The corpus and annotation tool are available for research purposes on request.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1451/
PDF	https://www.aclweb.org/anthology/L16-1451
PWC	https://paperswithcode.com/paper/multilevel-annotation-of-agreement-and
Repo
Framework

Discontinuous Verb Phrases in Parsing and Machine Translation of English and German


Title	Discontinuous Verb Phrases in Parsing and Machine Translation of English and German
Authors	Sharid Loáiciga, Kristina Gulordava
Abstract	In this paper, we focus on the verb-particle (V-Prt) split construction in English and German and its difficulty for parsing and Machine Translation (MT). For German, we use an existing test suite of V-Prt split constructions, while for English, we build a new and comparable test suite from raw data. These two data sets are then used to perform an analysis of errors in dependency parsing, word-level alignment and MT, which arise from the discontinuous order in V-Prt split constructions. In the automatic alignments of parallel corpora, most of the particles align to NULL. These mis-alignments and the inability of phrase-based MT system to recover discontinuous phrases result in low quality translations of V-Prt split constructions both in English and German. However, our results show that the V-Prt split phrases are correctly parsed in 90% of cases, suggesting that syntactic-based MT should perform better on these constructions. We evaluate a syntactic-based MT system on German and compare its performance to the phrase-based system.
Tasks	Dependency Parsing, Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1453/
PDF	https://www.aclweb.org/anthology/L16-1453
PWC	https://paperswithcode.com/paper/discontinuous-verb-phrases-in-parsing-and
Repo
Framework