Paper Group NANR 135
Combining Multiple Corpora for Readability Assessment for People with Cognitive Disabilities
Title | Combining Multiple Corpora for Readability Assessment for People with Cognitive Disabilities |
Authors | Victoria Yaneva, Constantin Orăsan, Richard Evans, Omid Rohanian |
Abstract | Given the lack of large user-evaluated corpora in disability-related NLP research (e.g. text simplification or readability assessment for people with cognitive disabilities), the question of choosing suitable training data for NLP models is not straightforward. The use of large generic corpora may be problematic because such data may not reflect the needs of the target population. The use of the available user-evaluated corpora may be problematic because these datasets are not large enough to be used as training data. In this paper we explore a third approach, in which a large generic corpus is combined with a smaller population-specific corpus to train a classifier which is evaluated using two sets of unseen user-evaluated data. One of these sets, the ASD Comprehension corpus, is developed for the purposes of this study and made freely available. We explore the effects of the size and type of the training data used on the performance of the classifiers, and the effects of the type of the unseen test datasets on the classification performance. |
Tasks | Text Simplification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5013/ |
https://www.aclweb.org/anthology/W17-5013 | |
PWC | https://paperswithcode.com/paper/combining-multiple-corpora-for-readability |
Repo | |
Framework | |
Connecting the Dots: Towards Human-Level Grammatical Error Correction
Title | Connecting the Dots: Towards Human-Level Grammatical Error Correction |
Authors | Shamil Chollampatt, Hwee Tou Ng |
Abstract | We build a grammatical error correction (GEC) system primarily based on the state-of-the-art statistical machine translation (SMT) approach, using task-specific features and tuning, and further enhance it with the modeling power of neural network joint models. The SMT-based system is weak in generalizing beyond patterns seen during training and lacks granularity below the word level. To address this issue, we incorporate a character-level SMT component targeting the misspelled words that the original SMT-based system fails to correct. Our final system achieves 53.14% F0.5 score on the benchmark CoNLL-2014 test set, an improvement of 3.62% F0.5 over the best previous published score. |
Tasks | Grammatical Error Correction, Language Modelling, Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5037/ |
https://www.aclweb.org/anthology/W17-5037 | |
PWC | https://paperswithcode.com/paper/connecting-the-dots-towards-human-level |
Repo | |
Framework | |
Neural Relation Extraction with Multi-lingual Attention
Title | Neural Relation Extraction with Multi-lingual Attention |
Authors | Yankai Lin, Zhiyuan Liu, Maosong Sun |
Abstract | Relation extraction has been widely used for finding unknown relational facts from plain text. Most existing methods focus on exploiting mono-lingual data for relation extraction, ignoring massive information from the texts in various languages. To address this issue, we introduce a multi-lingual neural relation extraction framework, which employs mono-lingual attention to utilize the information within mono-lingual texts and further proposes cross-lingual attention to consider the information consistency and complementarity among cross-lingual texts. Experimental results on real-world datasets show that our model can take advantage of multi-lingual texts and consistently achieve significant improvements on relation extraction as compared with baselines. |
Tasks | Information Retrieval, Question Answering, Relation Extraction |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1004/ |
https://www.aclweb.org/anthology/P17-1004 | |
PWC | https://paperswithcode.com/paper/neural-relation-extraction-with-multi-lingual |
Repo | |
Framework | |
Log-linear Models for Uyghur Segmentation in Spoken Language Translation
Title | Log-linear Models for Uyghur Segmentation in Spoken Language Translation |
Authors | Chenggang Mi, Yating Yang, Rui Dong, Xi Zhou, Lei Wang, Xiao Li, Tonghai Jiang |
Abstract | To alleviate data sparsity in spoken Uyghur machine translation, we proposed a log-linear based morphological segmentation approach. Instead of learning a model only from a monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation based on both bilingual and monolingual corpora. Our approach relies on several features, such as a traditional conditional random field (CRF) feature, a bilingual word alignment feature and a monolingual suffix-word co-occurrence feature. Experimental results show that our proposed segmentation model for Uyghur spoken translation achieved a 1.6 BLEU score improvement compared with the state-of-the-art baseline. |
Tasks | Machine Translation, Word Alignment |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1065/ |
https://doi.org/10.26615/978-954-452-049-6_065 | |
PWC | https://paperswithcode.com/paper/log-linear-models-for-uyghur-segmentation-in |
Repo | |
Framework | |
TWINA at SemEval-2017 Task 4: Twitter Sentiment Analysis with Ensemble Gradient Boost Tree Classifier
Title | TWINA at SemEval-2017 Task 4: Twitter Sentiment Analysis with Ensemble Gradient Boost Tree Classifier |
Authors | Naveen Kumar Laskari, Suresh Kumar Sanampudi |
Abstract | This paper describes the TWINA system, with which we participated in SemEval-2017 Task 4B (Topic Based Message Polarity Classification – Two point scale) and 4D (two-point scale Tweet quantification). We implemented an ensemble-based Gradient Boost Trees classification method for both tasks. Our system could perform well for task 4D and ranked 13th among 15 teams; for task 4B our model ranked 23rd. |
Tasks | Information Retrieval, Sentiment Analysis, Twitter Sentiment Analysis |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2109/ |
https://www.aclweb.org/anthology/S17-2109 | |
PWC | https://paperswithcode.com/paper/twina-at-semeval-2017-task-4-twitter |
Repo | |
Framework | |
Processo de construção de um corpus anotado com Entidades Geológicas visando REN (Building an annotated corpus with geological entities for NER) [In Portuguese]
Title | Processo de construção de um corpus anotado com Entidades Geológicas visando REN (Building an annotated corpus with geological entities for NER) [In Portuguese] |
Authors | Daniela Amaral, Sandra Collovini, Anny Figueira, Renata Vieira, Marco Gonzalez |
Abstract | |
Tasks | Named Entity Recognition |
Published | 2017-10-01 |
URL | https://www.aclweb.org/anthology/W17-6609/ |
https://www.aclweb.org/anthology/W17-6609 | |
PWC | https://paperswithcode.com/paper/processo-de-construaao-de-um-corpus-anotado |
Repo | |
Framework | |
Proceedings of the Third Arabic Natural Language Processing Workshop
Title | Proceedings of the Third Arabic Natural Language Processing Workshop |
Authors | |
Abstract | |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1300/ |
https://www.aclweb.org/anthology/W17-1300 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-third-arabic-natural |
Repo | |
Framework | |
Interactive Visual Analysis of Transcribed Multi-Party Discourse
Title | Interactive Visual Analysis of Transcribed Multi-Party Discourse |
Authors | Mennatallah El-Assady, Annette Hautli-Janisz, Valentin Gold, Miriam Butt, Katharina Holzinger, Daniel Keim |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-4009/ |
https://www.aclweb.org/anthology/P17-4009 | |
PWC | https://paperswithcode.com/paper/interactive-visual-analysis-of-transcribed |
Repo | |
Framework | |
Demographic-aware word associations
Title | Demographic-aware word associations |
Authors | Aparna Garimella, Carmen Banea, Rada Mihalcea |
Abstract | Variations of word associations across different groups of people can provide insights into people's psychologies and their world views. To capture these variations, we introduce the task of demographic-aware word associations. We build a new gold standard dataset consisting of word association responses for approximately 300 stimulus words, collected from more than 800 respondents of different gender (male/female) and from different locations (India/United States), and show that there are significant variations in the word associations made by these groups. We also introduce a new demographic-aware word association model based on a neural net skip-gram architecture, and show how computational methods for measuring word associations that specifically account for writer demographics can outperform generic methods that are agnostic to such information. |
Tasks | Information Retrieval, Keyword Extraction, Relation Extraction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1242/ |
https://www.aclweb.org/anthology/D17-1242 | |
PWC | https://paperswithcode.com/paper/demographic-aware-word-associations |
Repo | |
Framework | |
Learning Multimodal Gender Profile using Neural Networks
Title | Learning Multimodal Gender Profile using Neural Networks |
Authors | Carlos Pérez Estruch, Roberto Paredes Palacios, Paolo Rosso |
Abstract | Gender identification in social networks is one of the most popular aspects of user profile learning. Traditionally it has been linked to author profiling, a difficult problem to solve because of the little difference in the use of language between genders. This situation has led to the need of taking into account other information apart from textual data, favoring the emergence of multimodal data. The aim of this paper is to apply neural networks to perform data fusion, using an existing multimodal corpus, the NUS-MSS data set, that (not only) contains text data, but also image and location information. We improved previous results in terms of macro accuracy (87.8%) obtaining the state-of-the-art performance of 91.3%. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1075/ |
https://doi.org/10.26615/978-954-452-049-6_075 | |
PWC | https://paperswithcode.com/paper/learning-multimodal-gender-profile-using |
Repo | |
Framework | |
Distinguishing Japanese Non-standard Usages from Standard Ones
Title | Distinguishing Japanese Non-standard Usages from Standard Ones |
Authors | Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, Manabu Okumura |
Abstract | We focus on non-standard usages of common words on social media. In the context of social media, words sometimes have other usages that are totally different from their original. In this study, we attempt to distinguish non-standard usages on social media from standard ones in an unsupervised manner. Our basic idea is that non-standardness can be measured by the inconsistency between the expected meaning of the target word and the given context. For this purpose, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging the context embedding outperforms other methods and provide us with findings, for example, on how to construct context embeddings and which corpus to use. |
Tasks | Machine Translation, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1246/ |
https://www.aclweb.org/anthology/D17-1246 | |
PWC | https://paperswithcode.com/paper/distinguishing-japanese-non-standard-usages |
Repo | |
Framework | |
Structured Learning for Context-aware Spoken Language Understanding of Robotic Commands
Title | Structured Learning for Context-aware Spoken Language Understanding of Robotic Commands |
Authors | Andrea Vanzo, Danilo Croce, Roberto Basili, Daniele Nardi |
Abstract | Service robots are expected to operate in specific environments, where the presence of humans plays a key role. A major feature of such robotics platforms is thus the ability to react to spoken commands. This requires the understanding of the user utterance with an accuracy able to trigger the robot reaction. Such correct interpretation of linguistic exchanges depends on physical, cognitive and language-dependent aspects related to the environment. In this work, we present the empirical evaluation of an adaptive Spoken Language Understanding chain for robotic commands, that explicitly depends on the operational environment during both the learning and recognition stages. The effectiveness of such a context-sensitive command interpretation is tested against an extension of an already existing corpus of commands, that introduced explicit perceptual knowledge: this enabled deeper measures proving that more accurate disambiguation capabilities can be actually obtained. |
Tasks | Spoken Language Understanding |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2804/ |
https://www.aclweb.org/anthology/W17-2804 | |
PWC | https://paperswithcode.com/paper/structured-learning-for-context-aware-spoken |
Repo | |
Framework | |
Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech
Title | Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech |
Authors | Julian Hough, David Schlangen |
Abstract | We present the joint task of incremental disfluency detection and utterance segmentation and a simple deep learning system which performs it on transcripts and ASR results. We show how the constraints of the two tasks interact. Our joint-task system outperforms the equivalent individual task systems, provides competitive results and is suitable for future use in conversation agents in the psychiatric domain. |
Tasks | Speech Recognition |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1031/ |
https://www.aclweb.org/anthology/E17-1031 | |
PWC | https://paperswithcode.com/paper/joint-incremental-disfluency-detection-and |
Repo | |
Framework | |
Inter-Weighted Alignment Network for Sentence Pair Modeling
Title | Inter-Weighted Alignment Network for Sentence Pair Modeling |
Authors | Gehui Shen, Yunlun Yang, Zhi-Hong Deng |
Abstract | Sentence pair modeling is a crucial problem in the field of natural language processing. In this paper, we propose a model to measure the similarity of a sentence pair focusing on the interaction information. We utilize the word level similarity matrix to discover fine-grained alignment of two sentences. It should be emphasized that each word in a sentence has a different importance from the perspective of semantic composition, so we exploit two novel and efficient strategies to explicitly calculate a weight for each word. Although the proposed model only uses a sequential LSTM for sentence modeling without any external resource such as syntactic parser tree and additional lexicon features, experimental results show that our model achieves state-of-the-art performance on three datasets of two tasks. |
Tasks | Machine Translation, Natural Language Inference, Paraphrase Identification, Question Answering, Relation Classification, Semantic Composition, Semantic Textual Similarity, Sentence Pair Modeling, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1122/ |
https://www.aclweb.org/anthology/D17-1122 | |
PWC | https://paperswithcode.com/paper/inter-weighted-alignment-network-for-sentence |
Repo | |
Framework | |
KeLP at SemEval-2017 Task 3: Learning Pairwise Patterns in Community Question Answering
Title | KeLP at SemEval-2017 Task 3: Learning Pairwise Patterns in Community Question Answering |
Authors | Simone Filice, Giovanni Da San Martino, Alessandro Moschitti |
Abstract | This paper describes the KeLP system participating in the SemEval-2017 community Question Answering (cQA) task. The system is a refinement of the kernel-based sentence pair modeling we proposed for the previous year challenge. It is implemented within the Kernel-based Learning Platform called KeLP, from which we inherit the team's name. Our primary submission ranked first in subtask A, and third in subtasks B and C, being the only systems appearing in the top-3 ranking for all the English subtasks. This shows that the proposed framework, which has minor variations among the three subtasks, is extremely flexible and effective in tackling learning tasks defined on sentence pairs. |
Tasks | Community Question Answering, Question Answering, Relational Reasoning, Sentence Pair Modeling |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2053/ |
https://www.aclweb.org/anthology/S17-2053 | |
PWC | https://paperswithcode.com/paper/kelp-at-semeval-2017-task-3-learning-pairwise |
Repo | |
Framework | |