Paper Group NANR 142
N-ary Biographical Relation Extraction using Shortest Path Dependencies
Title | N-ary Biographical Relation Extraction using Shortest Path Dependencies |
Authors | Gitansh Khirbat, Jianzhong Qi, Rui Zhang |
Abstract | |
Tasks | Entity Extraction, Named Entity Recognition, Question Answering, Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/U16-1008/ |
PWC | https://paperswithcode.com/paper/n-ary-biographical-relation-extraction-using |
Repo | |
Framework | |
Multilingual Multimodal Language Processing Using Neural Networks
Title | Multilingual Multimodal Language Processing Using Neural Networks |
Authors | Mitesh M Khapra, Sarath Chandar |
Abstract | |
Tasks | Machine Translation, Representation Learning |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-4002/ |
PWC | https://paperswithcode.com/paper/multilingual-multimodal-language-processing |
Repo | |
Framework | |
Phrase Level Segmentation and Labelling of Machine Translation Errors
Title | Phrase Level Segmentation and Labelling of Machine Translation Errors |
Authors | Frédéric Blain, Varvara Logacheva, Lucia Specia |
Abstract | This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases. This new level of QE aims to provide a natural balance between QE at word and sentence-level, which are either too fine grained or too coarse levels for some applications. However, phrase-level QE implies an intrinsic challenge: how to segment a machine translation into sequence of words (contiguous or not) that represent an error. We discuss three possible segmentation strategies to automatically extract erroneous phrases. We evaluate these strategies against annotations at phrase-level produced by humans, using a new dataset collected for this purpose. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1356/ |
PWC | https://paperswithcode.com/paper/phrase-level-segmentation-and-labelling-of |
Repo | |
Framework | |
Multimodal Mood Classification - A Case Study of Differences in Hindi and Western Songs
Title | Multimodal Mood Classification - A Case Study of Differences in Hindi and Western Songs |
Authors | Braja Gopal Patra, Dipankar Das, Sivaji Bandyopadhyay |
Abstract | Music information retrieval has emerged as a mainstream research area in the past two decades. Experiments on music mood classification have been performed mainly on Western music based on audio, lyrics and a combination of both. Unfortunately, due to the scarcity of digitalized resources, Indian music fares poorly in music mood retrieval research. In this paper, we identified the mood taxonomy and prepared multimodal mood annotated datasets for Hindi and Western songs. We identified important audio and lyric features using correlation based feature selection technique. Finally, we developed mood classification systems using Support Vector Machines and Feed Forward Neural Networks based on the features collected from audio, lyrics, and a combination of both. The best performing multimodal systems achieved F-measures of 75.1 and 83.5 for classifying the moods of the Hindi and Western songs respectively using Feed Forward Neural Networks. A comparative analysis indicates that the selected features work well for mood classification of the Western songs and produces better results as compared to the mood classification systems for Hindi songs. |
Tasks | Feature Selection, Information Retrieval, Music Information Retrieval |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1186/ |
PWC | https://paperswithcode.com/paper/multimodal-mood-classification-a-case-study |
Repo | |
Framework | |
Mining linguistic tone patterns with symbolic representation
Title | Mining linguistic tone patterns with symbolic representation |
Authors | Shuo Zhang |
Abstract | |
Tasks | Information Retrieval, Music Information Retrieval, Speech Recognition, Time Series, Time Series Classification |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2001/ |
PWC | https://paperswithcode.com/paper/mining-linguistic-tone-patterns-with-symbolic |
Repo | |
Framework | |
Constructing a Dictionary Describing Feature Changes of Arguments in Event Sentences
Title | Constructing a Dictionary Describing Feature Changes of Arguments in Event Sentences |
Authors | Tetsuaki Nakamura, Daisuke Kawahara |
Abstract | |
Tasks | Common Sense Reasoning |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-1006/ |
PWC | https://paperswithcode.com/paper/constructing-a-dictionary-describing-feature |
Repo | |
Framework | |
Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP
Title | Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP |
Authors | |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2500/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-1st-workshop-on-evaluating |
Repo | |
Framework | |
Discriminative Analysis of Linguistic Features for Typological Study
Title | Discriminative Analysis of Linguistic Features for Typological Study |
Authors | Hiroya Takamura, Ryo Nagata, Yoshifumi Kawasaki |
Abstract | We address the task of automatically estimating the missing values of linguistic features by making use of the fact that some linguistic features in typological databases are informative to each other. The questions to address in this work are (i) how much predictive power do features have on the value of another feature? (ii) to what extent can we attribute this predictive power to genealogical or areal factors, as opposed to being provided by tendencies or implicational universals? To address these questions, we conduct a discriminative or predictive analysis on the typological database. Specifically, we use a machine-learning classifier to estimate the value of each feature of each language using the values of the other features, under different choices of training data: all the other languages, or all the other languages except for the ones having the same origin or area with the target language. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1011/ |
PWC | https://paperswithcode.com/paper/discriminative-analysis-of-linguistic |
Repo | |
Framework | |
Improving Topic Model Clustering of Newspaper Comments for Summarisation
Title | Improving Topic Model Clustering of Newspaper Comments for Summarisation |
Authors | Clare Llewellyn, Claire Grover, Jon Oberlander |
Abstract | |
Tasks | Topic Models |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-3007/ |
PWC | https://paperswithcode.com/paper/improving-topic-model-clustering-of-newspaper |
Repo | |
Framework | |
Adapting an Entity Centric Model for Portuguese Coreference Resolution
Title | Adapting an Entity Centric Model for Portuguese Coreference Resolution |
Authors | Evandro Fonseca, Renata Vieira, Aline Vanin |
Abstract | This paper presents the adaptation of an Entity Centric Model for Portuguese coreference resolution, considering 10 named entity categories. The model was evaluated on named entities using the HAREM Portuguese corpus, and the results are 81.0% precision and 58.3% recall overall. The resulting system is freely available. |
Tasks | Coreference Resolution |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1023/ |
PWC | https://paperswithcode.com/paper/adapting-an-entity-centric-model-for |
Repo | |
Framework | |
Task-Oriented Intrinsic Evaluation of Semantic Textual Similarity
Title | Task-Oriented Intrinsic Evaluation of Semantic Textual Similarity |
Authors | Nils Reimers, Philip Beyer, Iryna Gurevych |
Abstract | Semantic Textual Similarity (STS) is a foundational NLP task and can be used in a wide range of tasks. To determine the STS of two texts, hundreds of different STS systems exist, however, for an NLP system designer, it is hard to decide which system is the best one. To answer this question, an intrinsic evaluation of the STS systems is conducted by comparing the output of the system to human judgments on semantic similarity. The comparison is usually done using Pearson correlation. In this work, we show that relying on intrinsic evaluations with Pearson correlation can be misleading. In three common STS based tasks we could observe that the Pearson correlation was especially ill-suited to detect the best STS system for the task and other evaluation measures were much better suited. In this work we define how the validity of an intrinsic evaluation can be assessed and compare different intrinsic evaluation methods. Understanding of the properties of the targeted task is crucial and we propose a framework for conducting the intrinsic evaluation which takes the properties of the targeted task into account. |
Tasks | Question Answering, Semantic Similarity, Semantic Textual Similarity, Text Summarization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1009/ |
PWC | https://paperswithcode.com/paper/task-oriented-intrinsic-evaluation-of |
Repo | |
Framework | |
Expanding wordnets to new languages with multilingual sense disambiguation
Title | Expanding wordnets to new languages with multilingual sense disambiguation |
Authors | Mihael Arcan, John Philip McCrae, Paul Buitelaar |
Abstract | Princeton WordNet is one of the most important resources for natural language processing, but is only available for English. While it has been translated using the expand approach to many other languages, this is an expensive manual process. Therefore it would be beneficial to have a high-quality automatic translation approach that would support NLP techniques, which rely on WordNet in new languages. The translation of wordnets is fundamentally complex because of the need to translate all senses of a word including low frequency senses, which is very challenging for current machine translation approaches. For this reason we leverage existing translations of WordNet in other languages to identify contextual information for wordnet senses from a large set of generic parallel corpora. We evaluate our approach using 10 translated wordnets for European languages. Our experiment shows a significant improvement over translation without any contextual information. Furthermore, we evaluate how the choice of pivot languages affects performance of multilingual word sense disambiguation. |
Tasks | Information Retrieval, Machine Translation, Sentiment Analysis, Word Sense Disambiguation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1010/ |
PWC | https://paperswithcode.com/paper/expanding-wordnets-to-new-languages-with |
Repo | |
Framework | |
Deceptive Opinion Spam Detection Using Neural Network
Title | Deceptive Opinion Spam Detection Using Neural Network |
Authors | Yafeng Ren, Yue Zhang |
Abstract | Deceptive opinion spam detection has attracted significant attention from both business and research communities. Existing approaches are based on manual discrete features, which can capture linguistic and psychological cues. However, such features fail to encode the semantic meaning of a document from the discourse perspective, which limits the performance. In this paper, we empirically explore a neural network model to learn document-level representation for detecting deceptive opinion spam. In particular, given a document, the model learns sentence representations with a convolutional neural network, which are combined using a gated recurrent neural network with attention mechanism to model discourse information and yield a document vector. Finally, the document representation is used directly as features to identify deceptive opinion spam. Experimental results on three domains (Hotel, Restaurant, and Doctor) show that our proposed method outperforms state-of-the-art methods. |
Tasks | Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1014/ |
PWC | https://paperswithcode.com/paper/deceptive-opinion-spam-detection-using-neural |
Repo | |
Framework | |
Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data
Title | Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data |
Authors | Julian Bleicken, Thomas Hanke, Uta Salden, Sven Wagner |
Abstract | For publishing sign language corpus data on the web, anonymization is crucial even if it is impossible to hide the visual appearance of the signers: In a small community, even vague references to third persons may be enough to identify those persons. In the case of the DGS Korpus (German Sign Language corpus) project, we want to publish data as a contribution to the cultural heritage of the sign language community while annotation of the data is still ongoing. This poses the question how well anonymization can be achieved given that no full linguistic analysis of the data is available. Basically, we combine analysis of all data that we have, including named entity recognition on translations into German. For this, we use the WebLicht language technology infrastructure. We report on the reliability of these methods in this special context and also illustrate how the anonymization of the video data is technically achieved in order to minimally disturb the viewer. |
Tasks | Named Entity Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1526/ |
PWC | https://paperswithcode.com/paper/using-a-language-technology-infrastructure |
Repo | |
Framework | |
Exploring Text Links for Coherent Multi-Document Summarization
Title | Exploring Text Links for Coherent Multi-Document Summarization |
Authors | Xun Wang, Masaaki Nishino, Tsutomu Hirao, Katsuhito Sudoh, Masaaki Nagata |
Abstract | Summarization aims to represent source documents by a shortened passage. Existing methods focus on the extraction of key information, but often neglect coherence. Hence the generated summaries suffer from a lack of readability. To address this problem, we have developed a graph-based method by exploring the links between text to produce coherent summaries. Our approach involves finding a sequence of sentences that best represent the key information in a coherent way. In contrast to the previous methods that focus only on salience, the proposed method addresses both coherence and informativeness based on textual linkages. We conduct experiments on the DUC2004 summarization task data set. A performance comparison reveals that the summaries generated by the proposed system achieve comparable results in terms of the ROUGE metric, and show improvements in readability by human evaluation. |
Tasks | Document Summarization, Multi-Document Summarization, Text Generation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1021/ |
PWC | https://paperswithcode.com/paper/exploring-text-links-for-coherent-multi |
Repo | |
Framework | |