Paper Group NANR 59
Dealing with Data Sparseness in SMT with Factored Models and Morphological Expansion: a Case Study on Croatian
Title | Dealing with Data Sparseness in SMT with Factored Models and Morphological Expansion: a Case Study on Croatian |
Authors | Victor M. Sánchez-Cartagena, Nikola Ljubešić, Filip Klubička |
Abstract | |
Tasks | Machine Translation |
Published | 2016-01-01 |
URL | https://www.aclweb.org/anthology/W16-3421/ |
PWC | https://paperswithcode.com/paper/dealing-with-data-sparseness-in-smt-with |
Repo | |
Framework | |
Disfluent but effective? A quantitative study of disfluencies and conversational moves in team discourse
Title | Disfluent but effective? A quantitative study of disfluencies and conversational moves in team discourse |
Authors | Felix Gervits, Kathleen Eberhard, Matthias Scheutz |
Abstract | Situated dialogue systems that interact with humans as part of a team (e.g., robot teammates) need to be able to use information from communication channels to gauge the coordination level and effectiveness of the team. Currently, the feasibility of this end goal is limited by several gaps in both the empirical and computational literature. The purpose of this paper is to address those gaps in the following ways: (1) investigate which properties of task-oriented discourse correspond with effective performance in human teams, and (2) discuss how and to what extent these properties can be utilized in spoken dialogue systems. To this end, we analyzed natural language data from a unique corpus of spontaneous, task-oriented dialogue (CReST corpus), which was annotated for disfluencies and conversational moves. We found that effective teams made more self-repair disfluencies and used specific communication strategies to facilitate grounding and coordination. Our results indicate that truly robust and natural dialogue systems will need to interpret highly disfluent utterances and also utilize specific collaborative mechanisms to facilitate grounding. These data shed light on effective communication in performance scenarios and directly inform the development of robust dialogue systems for situated artificial agents. |
Tasks | Decision Making, Spoken Dialogue Systems |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1317/ |
PWC | https://paperswithcode.com/paper/disfluent-but-effective-a-quantitative-study |
Repo | |
Framework | |
Real-Time Understanding of Complex Discriminative Scene Descriptions
Title | Real-Time Understanding of Complex Discriminative Scene Descriptions |
Authors | Ramesh Manuvinakurike, Casey Kennington, David DeVault, David Schlangen |
Abstract | |
Tasks | Spoken Dialogue Systems |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3630/ |
PWC | https://paperswithcode.com/paper/real-time-understanding-of-complex |
Repo | |
Framework | |
Recognition of non-domain phrases in automatically extracted lists of terms
Title | Recognition of non-domain phrases in automatically extracted lists of terms |
Authors | Agnieszka Mykowiecka, Malgorzata Marciniak, Piotr Rychlik |
Abstract | In the paper, we address the problem of recognition of non-domain phrases in terminology lists obtained with an automatic term extraction tool. We focus on identification of multi-word phrases that are general terms and discourse function expressions. We tested several methods based on domain corpora comparison and a method based on contexts of phrases identified in a large corpus of general language. We compared the results of the methods to manual annotation. The results show that the task is quite hard as the inter-annotator agreement is low. Several tested methods achieved similar overall results, although the phrase ordering varied between methods. The most successful method, with a precision of about 0.75 at the midpoint of the tested list, was the context-based method using a modified contextual diversity coefficient. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4703/ |
PWC | https://paperswithcode.com/paper/recognition-of-non-domain-phrases-in |
Repo | |
Framework | |
Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization
Title | Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization |
Authors | Muhammad Humayoun, Hwanjo Yu |
Abstract | Preprocessing is a preliminary step in many fields including IR and NLP. The effect of basic preprocessing settings on English for text summarization is well-studied. However, there is no such effort found for the Urdu language (to the best of our knowledge). In this study, we analyze the effect of basic preprocessing settings for single-document text summarization for Urdu, on a benchmark corpus using various experiments. The analysis is performed using the state-of-the-art algorithms for extractive summarization and the effect of stopword removal, lemmatization, and stemming is analyzed. Results showed that these pre-processing settings improve the results. |
Tasks | Lemmatization, Text Summarization |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1585/ |
PWC | https://paperswithcode.com/paper/analyzing-pre-processing-settings-for-urdu |
Repo | |
Framework | |
Supervised classification of end-of-lines in clinical text with no manual annotation
Title | Supervised classification of end-of-lines in clinical text with no manual annotation |
Authors | Pierre Zweigenbaum, Cyril Grouin, Thomas Lavergne |
Abstract | In some plain text documents, end-of-line marks may or may not mark the boundary of a text unit (e.g., of a paragraph). This vexing problem is likely to impact subsequent natural language processing components, but is seldom addressed in the literature. We propose a method which uses no manual annotation to classify whether end-of-lines must actually be seen as simple spaces (soft line breaks) or as true text unit boundaries. This method, which includes self-training and co-training steps based on token and line length features, achieves 0.943 F-measure on a corpus of short e-books with controlled format, F=0.904 on a random sample of 24 clinical texts with soft line breaks, and F=0.898 on a larger set of mixed clinical texts which may or may not contain soft line breaks, a fairly high value for a method with no manual annotation. |
Tasks | Language Modelling, Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5109/ |
PWC | https://paperswithcode.com/paper/supervised-classification-of-end-of-lines-in |
Repo | |
Framework | |
Parallel Chinese-English Entities, Relations and Events Corpora
Title | Parallel Chinese-English Entities, Relations and Events Corpora |
Authors | Justin Mott, Ann Bies, Zhiyi Song, Stephanie Strassel |
Abstract | This paper introduces the parallel Chinese-English Entities, Relations and Events (ERE) corpora developed by Linguistic Data Consortium under the DARPA Deep Exploration and Filtering of Text (DEFT) Program. Original Chinese newswire and discussion forum documents are annotated for two versions of the ERE task. The texts are manually translated into English and then annotated for the same ERE tasks on the English translation, resulting in a rich parallel resource that has utility for performers within the DEFT program, for participants in NIST's Knowledge Base Population evaluations, and for cross-language projection research more generally. |
Tasks | Knowledge Base Population |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1589/ |
PWC | https://paperswithcode.com/paper/parallel-chinese-english-entities-relations |
Repo | |
Framework | |
Detecting Implicit Expressions of Affect from Text using Semantic Knowledge on Common Concept Properties
Title | Detecting Implicit Expressions of Affect from Text using Semantic Knowledge on Common Concept Properties |
Authors | Alexandra Balahur, Hristo Tanev |
Abstract | Emotions are an important part of the human experience. They are responsible for the adaptation and integration in the environment, offering, most of the time together with the cognitive system, the appropriate responses to stimuli in the environment. As such, they are an important component in decision-making processes. In today's society, the avalanche of stimuli present in the environment (physical or virtual) makes people more prone to respond to stronger affective stimuli (i.e., those that are related to their basic needs and motivations: survival, food, shelter, etc.). In media reporting, this is translated in the use of arguments (factual data) that are known to trigger specific (strong, affective) behavioural reactions from the readers. This paper describes initial efforts to detect such arguments from text, based on the properties of concepts. The final system able to retrieve and label this type of data from the news in traditional and social platforms is intended to be integrated into the Europe Media Monitor family of applications to detect texts that trigger certain (especially negative) reactions from the public, with consequences on citizen safety and security. |
Tasks | Decision Making |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1185/ |
PWC | https://paperswithcode.com/paper/detecting-implicit-expressions-of-affect-from |
Repo | |
Framework | |
mib at SemEval-2016 Task 4a: Exploiting lexicon based features for Sentiment Analysis in Twitter
Title | mib at SemEval-2016 Task 4a: Exploiting lexicon based features for Sentiment Analysis in Twitter |
Authors | Vittoria Cozza, Marinella Petrocchi |
Abstract | |
Tasks | Recommendation Systems, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1019/ |
PWC | https://paperswithcode.com/paper/mib-at-semeval-2016-task-4a-exploiting |
Repo | |
Framework | |
SoNLP-DP System for ConLL-2016 Chinese Shallow Discourse Parsing
Title | SoNLP-DP System for ConLL-2016 Chinese Shallow Discourse Parsing |
Authors | Junhui Li, Fang Kong, Sheng Li, Muhua Zhu, Guodong Zhou |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-2011/ |
PWC | https://paperswithcode.com/paper/sonlp-dp-system-for-conll-2016-chinese |
Repo | |
Framework | |
The Query of Everything: Developing Open-Domain, Natural-Language Queries for BOLT Information Retrieval
Title | The Query of Everything: Developing Open-Domain, Natural-Language Queries for BOLT Information Retrieval |
Authors | Kira Griffitt, Stephanie Strassel |
Abstract | The DARPA BOLT Information Retrieval evaluations target open-domain natural-language queries over a large corpus of informal text in English, Chinese and Egyptian Arabic. We outline the goals of BOLT IR, comparing it with the prior GALE Distillation task. After discussing the properties of the BOLT IR corpus, we provide a detailed description of the query creation process, contrasting the summary query format presented to systems at run time with the full query format created by annotators. We describe the relevance criteria used to assess BOLT system responses, highlighting the evolution of the procedures used over the three evaluation phases. We provide a detailed review of the decision points model for relevance assessment introduced during Phase 2, and conclude with information about inter-assessor consistency achieved with the decision points assessment model. |
Tasks | Information Retrieval |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1593/ |
PWC | https://paperswithcode.com/paper/the-query-of-everything-developing-open |
Repo | |
Framework | |
bunji at SemEval-2016 Task 5: Neural and Syntactic Models of Entity-Attribute Relationship for Aspect-based Sentiment Analysis
Title | bunji at SemEval-2016 Task 5: Neural and Syntactic Models of Entity-Attribute Relationship for Aspect-based Sentiment Analysis |
Authors | Toshihiko Yanase, Kohsuke Yanai, Misa Sato, Toshinori Miyoshi, Yoshiki Niwa |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Multi-Label Classification, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1046/ |
PWC | https://paperswithcode.com/paper/bunji-at-semeval-2016-task-5-neural-and |
Repo | |
Framework | |
BUTknot at SemEval-2016 Task 5: Supervised Machine Learning with Term Substitution Approach in Aspect Category Detection
Title | BUTknot at SemEval-2016 Task 5: Supervised Machine Learning with Term Substitution Approach in Aspect Category Detection |
Authors | Jakub Macháček |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Multi-Label Classification, Opinion Mining, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1048/ |
PWC | https://paperswithcode.com/paper/butknot-at-semeval-2016-task-5-supervised |
Repo | |
Framework | |
ECNU at SemEval 2016 Task 6: Relevant or Not? Supportive or Not? A Two-step Learning System for Automatic Detecting Stance in Tweets
Title | ECNU at SemEval 2016 Task 6: Relevant or Not? Supportive or Not? A Two-step Learning System for Automatic Detecting Stance in Tweets |
Authors | Zhihua Zhang, Man Lan |
Abstract | |
Tasks | Feature Engineering, Sentiment Analysis, Stance Detection |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1073/ |
PWC | https://paperswithcode.com/paper/ecnu-at-semeval-2016-task-6-relevant-or-not |
Repo | |
Framework | |
NileTMRG at SemEval-2016 Task 7: Deriving Prior Polarities for Arabic Sentiment Terms
Title | NileTMRG at SemEval-2016 Task 7: Deriving Prior Polarities for Arabic Sentiment Terms |
Authors | Samhaa R. El-Beltagy |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1079/ |
PWC | https://paperswithcode.com/paper/niletmrg-at-semeval-2016-task-7-deriving |
Repo | |
Framework | |