Paper Group NANR 5
Orthogonality regularizer for question answering. Error Typology and Remediation Strategies for Requirements Written in English by Non-Native Speakers. Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario. Estonian Dependency Treebank: from Constraint Grammar tagset to Universal Dependencies. Chatbot Technology with …
Orthogonality regularizer for question answering
Title | Orthogonality regularizer for question answering |
Authors | Chunyang Xiao, Guillaume Bouchard, Marc Dymetman, Claire Gardent |
Abstract | |
Tasks | Information Retrieval, Open-Domain Question Answering, Question Answering |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/S16-2019/ |
PWC | https://paperswithcode.com/paper/orthogonality-regularizer-for-question |
Repo | |
Framework | |
Error Typology and Remediation Strategies for Requirements Written in English by Non-Native Speakers
Title | Error Typology and Remediation Strategies for Requirements Written in English by Non-Native Speakers |
Authors | Marie Garnier, Patrick Saint-Dizier |
Abstract | In most international industries, English is the main language of communication for technical documents. These documents are designed to be as unambiguous as possible for their users. For international industries based in non-English speaking countries, the professionals in charge of writing requirements are often non-native speakers of English, who rarely receive adequate training in the use of English for this task. As a result, requirements can contain a relatively large diversity of lexical and grammatical errors, which are not eliminated by the use of guidelines from controlled languages. This article investigates the distribution of errors in a corpus of requirements written in English by native speakers of French. Errors are defined on the basis of grammaticality and acceptability principles, and classified using comparable categories. Results show a high proportion of errors in the Noun Phrase, notably through modifier stacking, and errors consistent with simplification strategies. Comparisons with similar corpora in other genres reveal the specificity of the distribution of errors in requirements. This research also introduces possible applied uses, in the form of strategies for the automatic detection of errors, and in-person training provided by certification boards in requirements authoring. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1029/ |
PWC | https://paperswithcode.com/paper/error-typology-and-remediation-strategies-for |
Repo | |
Framework | |
Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario
Title | Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario |
Authors | Lena Keiper, Andrea Horbach, Stefan Thater |
Abstract | We present a novel method to automatically improve the accuracy of part-of-speech taggers on learner language. The key idea underlying our approach is to exploit the structure of a typical language learner task and automatically induce POS information for out-of-vocabulary (OOV) words. To evaluate the effectiveness of our approach, we add manual POS and normalization information to an existing language learner corpus. Our evaluation shows an increase in accuracy from 72.4% to 81.5% on OOV words. |
Tasks | Reading Comprehension |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1030/ |
PWC | https://paperswithcode.com/paper/improving-pos-tagging-of-german-learner |
Repo | |
Framework | |
Estonian Dependency Treebank: from Constraint Grammar tagset to Universal Dependencies
Title | Estonian Dependency Treebank: from Constraint Grammar tagset to Universal Dependencies |
Authors | Kadri Muischnek, Kaili Müürisep, Tiina Puolakainen |
Abstract | This paper presents the first version of Estonian Universal Dependencies Treebank which has been semi-automatically acquired from Estonian Dependency Treebank and comprises ca 400,000 words (ca 30,000 sentences) representing the genres of fiction, newspapers and scientific writing. Article analyses the differences between two annotation schemes and the conversion procedure to Universal Dependencies format. The conversion has been conducted by manually created Constraint Grammar transfer rules. As the rules enable to consider unbounded context, include lexical information and both flat and tree structure features at the same time, the method has proved to be reliable and flexible enough to handle most of transformations. The automatic conversion procedure achieved LAS 95.2%, UAS 96.3% and LA 98.4%. If punctuation marks were excluded from the calculations, we observed LAS 96.4%, UAS 97.7% and LA 98.2%. Still the refinement of the guidelines and methodology is needed in order to re-annotate some syntactic phenomena, e.g. inter-clausal relations. Although automatic rules usually make quite a good guess even in obscure conditions, some relations should be checked and annotated manually after the main conversion. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1247/ |
PWC | https://paperswithcode.com/paper/estonian-dependency-treebank-from-constraint |
Repo | |
Framework | |
Chatbot Technology with Synthetic Voices in the Acquisition of an Endangered Language: Motivation, Development and Evaluation of a Platform for Irish
Title | Chatbot Technology with Synthetic Voices in the Acquisition of an Endangered Language: Motivation, Development and Evaluation of a Platform for Irish |
Authors | Neasa Ní Chiaráin, Ailbhe Ní Chasaide |
Abstract | This paper describes the development and evaluation of a chatbot platform designed for the teaching/learning of Irish. The chatbot uses synthetic voices developed for the dialects of Irish. Speech-enabled chatbot technology offers a potentially powerful tool for dealing with the challenges of teaching/learning an endangered language where learners have limited access to native speaker models of the language and limited exposure to the language in a truly communicative setting. The sociolinguistic context that motivates the present development is explained. The evaluation of the chatbot was carried out in 13 schools by 228 pupils and consisted of two parts. Firstly, learners' opinions of the overall chatbot platform as a learning environment were elicited. Secondly, learners evaluated the intelligibility, quality, and attractiveness of the synthetic voices used in this platform. Results were overwhelmingly positive to both the learning platform and the synthetic voices and indicate that the time may now be ripe for language learning applications which exploit speech and language technologies. It is further argued that these technologies have a particularly vital role to play in the maintenance of the endangered language. |
Tasks | Chatbot |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1547/ |
PWC | https://paperswithcode.com/paper/chatbot-technology-with-synthetic-voices-in |
Repo | |
Framework | |
Read my points: Effect of animation type when speech-reading from EMA data
Title | Read my points: Effect of animation type when speech-reading from EMA data |
Authors | Kristy James, Martijn Wieling |
Abstract | |
Tasks | Motion Capture |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2014/ |
PWC | https://paperswithcode.com/paper/read-my-points-effect-of-animation-type-when |
Repo | |
Framework | |
Using longest common subsequence and character models to predict word forms
Title | Using longest common subsequence and character models to predict word forms |
Authors | Alexey Sorokin |
Abstract | |
Tasks | Lemmatization, Morphological Inflection |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2009/ |
PWC | https://paperswithcode.com/paper/using-longest-common-subsequence-and |
Repo | |
Framework | |
Arabic Language WEKA-Based Dialect Classifier for Arabic Automatic Speech Recognition Transcripts
Title | Arabic Language WEKA-Based Dialect Classifier for Arabic Automatic Speech Recognition Transcripts |
Authors | Areej Alshutayri, Eric Atwell, Abdulrahman Alosaimy, James Dickins, Michael Ingleby, Janet Watson |
Abstract | This paper describes an Arabic dialect identification system which we developed for the Discriminating Similar Languages (DSL) 2016 shared task. We classified Arabic dialects by using Waikato Environment for Knowledge Analysis (WEKA) data analytic tool which contains many alternative filters and classifiers for machine learning. We experimented with several classifiers and the best accuracy was achieved using the Sequential Minimal Optimization (SMO) algorithm for training and testing process set to three different feature-sets for each testing process. Our approach achieved an accuracy equal to 42.85% which is considerably worse in comparison to the evaluation scores on the training set of 80-90% and with training set "60:40" percentage split which achieved accuracy around 50%. We observed that Buckwalter transcripts from the Saarland Automatic Speech Recognition (ASR) system are given without short vowels, though the Buckwalter system has notation for these. We elaborate such observations, describe our methods and analyse the training dataset. |
Tasks | Language Identification, Speech Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4826/ |
PWC | https://paperswithcode.com/paper/arabic-language-weka-based-dialect-classifier |
Repo | |
Framework | |
QASSIT at SemEval-2016 Task 13: On the integration of Semantic Vectors in Pretopological Spaces for Lexical Taxonomy Acquisition
Title | QASSIT at SemEval-2016 Task 13: On the integration of Semantic Vectors in Pretopological Spaces for Lexical Taxonomy Acquisition |
Authors | Guillaume Cleuziou, Jose G. Moreno |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1205/ |
PWC | https://paperswithcode.com/paper/qassit-at-semeval-2016-task-13-on-the |
Repo | |
Framework | |
Implicit Polarity and Implicit Aspect Recognition in Opinion Mining
Title | Implicit Polarity and Implicit Aspect Recognition in Opinion Mining |
Authors | Huan-Yuan Chen, Hsin-Hsi Chen |
Abstract | |
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2004/ |
PWC | https://paperswithcode.com/paper/implicit-polarity-and-implicit-aspect |
Repo | |
Framework | |
Inferring Perceived Demographics from User Emotional Tone and User-Environment Emotional Contrast
Title | Inferring Perceived Demographics from User Emotional Tone and User-Environment Emotional Contrast |
Authors | Svitlana Volkova, Yoram Bachrach |
Abstract | |
Tasks | Recommendation Systems |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1148/ |
PWC | https://paperswithcode.com/paper/inferring-perceived-demographics-from-user |
Repo | |
Framework | |
An Unsupervised Morphological Criterion for Discriminating Similar Languages
Title | An Unsupervised Morphological Criterion for Discriminating Similar Languages |
Authors | Adrien Barbaresi |
Abstract | In this study conducted on the occasion of the Discriminating between Similar Languages shared task, I introduce an additional decision factor focusing on the token and subtoken level. The motivation behind this submission is to test whether a morphologically-informed criterion can add linguistically relevant information to global categorization and thus improve performance. The contributions of this paper are (1) a description of the unsupervised, low-resource method; (2) an evaluation and analysis of its raw performance; and (3) an assessment of its impact within a model comprising common indicators used in language identification. I present and discuss the systems used in the task A, a 12-way language identification task comprising varieties of five main language groups. Additionally I introduce a new off-the-shelf Naive Bayes classifier using a contrastive word and subword n-gram model ("Bayesline") which outperforms the best submissions. |
Tasks | Language Identification, Text Categorization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4827/ |
PWC | https://paperswithcode.com/paper/an-unsupervised-morphological-criterion-for |
Repo | |
Framework | |
Morphological Smoothing and Extrapolation of Word Embeddings
Title | Morphological Smoothing and Extrapolation of Word Embeddings |
Authors | Ryan Cotterell, Hinrich Schütze, Jason Eisner |
Abstract | |
Tasks | Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1156/ |
PWC | https://paperswithcode.com/paper/morphological-smoothing-and-extrapolation-of |
Repo | |
Framework | |
Brave New World: Uncovering Topical Dynamics in the ACL Anthology Reference Corpus Using Term Life Cycle Information
Title | Brave New World: Uncovering Topical Dynamics in the ACL Anthology Reference Corpus Using Term Life Cycle Information |
Authors | Anne-Kathrin Schumann |
Abstract | |
Tasks | Text Classification |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2101/ |
PWC | https://paperswithcode.com/paper/brave-new-world-uncovering-topical-dynamics |
Repo | |
Framework | |
COMMIT at SemEval-2016 Task 5: Sentiment Analysis with Rhetorical Structure Theory
Title | COMMIT at SemEval-2016 Task 5: Sentiment Analysis with Rhetorical Structure Theory |
Authors | Kim Schouten, Flavius Frasincar |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1057/ |
PWC | https://paperswithcode.com/paper/commit-at-semeval-2016-task-5-sentiment |
Repo | |
Framework | |