Paper Group NANR 204
Recurrent Support Vector Machines For Slot Tagging In Spoken Language Understanding
Title | Recurrent Support Vector Machines For Slot Tagging In Spoken Language Understanding |
Authors | Yangyang Shi, Kaisheng Yao, Hu Chen, Dong Yu, Yi-Cheng Pan, Mei-Yuh Hwang |
Abstract | |
Tasks | Chunking, Spoken Language Understanding |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1044/ |
PWC | https://paperswithcode.com/paper/recurrent-support-vector-machines-for-slot |
Repo | |
Framework | |
Identification, characterization, and grounding of gradable terms in clinical text
Title | Identification, characterization, and grounding of gradable terms in clinical text |
Authors | Chaitanya Shivade, Marie-Catherine de Marneffe, Eric Fosler-Lussier, Albert M. Lai |
Abstract | |
Tasks | Decision Making |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2903/ |
PWC | https://paperswithcode.com/paper/identification-characterization-and-grounding |
Repo | |
Framework | |
Comparison of Emotional Understanding in Modality-Controlled Environments using Multimodal Online Emotional Communication Corpus
Title | Comparison of Emotional Understanding in Modality-Controlled Environments using Multimodal Online Emotional Communication Corpus |
Authors | Yoshiko Arimoto, Kazuo Okanoya |
Abstract | In online computer-mediated communication, speakers were considered to have experienced difficulties in catching their partner's emotions and in conveying their own emotions. To explain why online emotional communication is so difficult and to investigate how this problem should be solved, multimodal online emotional communication corpus was constructed by recording approximately 100 speakers' emotional expressions and reactions in a modality-controlled environment. Speakers communicated over the Internet using video chat, voice chat or text chat; their face-to-face conversations were used for comparison purposes. The corpora incorporated emotional labels by evaluating the speaker's dynamic emotional states and the measurements of the speaker's facial expression, vocal expression and autonomic nervous system activity. For the initial study of this project, which used a large-scale emotional communication corpus, the accuracy of online emotional understanding was assessed to demonstrate the emotional labels evaluated by the speakers and to summarize the speaker's answers on the questionnaire regarding the difference between an online chat and face-to-face conversations in which they actually participated. The results revealed that speakers have difficulty communicating their emotions in online communication environments, regardless of the type of communication modality and that inaccurate emotional understanding occurs more frequently in online computer-mediated communication than in face-to-face communication. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1343/ |
PWC | https://paperswithcode.com/paper/comparison-of-emotional-understanding-in |
Repo | |
Framework | |
Strategy and Policy Learning for Non-Task-Oriented Conversational Systems
Title | Strategy and Policy Learning for Non-Task-Oriented Conversational Systems |
Authors | Zhou Yu, Ziyu Xu, Alan W Black, Alexander Rudnicky |
Abstract | |
Tasks | Machine Translation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3649/ |
PWC | https://paperswithcode.com/paper/strategy-and-policy-learning-for-non-task |
Repo | |
Framework | |
Deep LSTM based Feature Mapping for Query Classification
Title | Deep LSTM based Feature Mapping for Query Classification |
Authors | Yangyang Shi, Kaisheng Yao, Le Tian, Daxin Jiang |
Abstract | |
Tasks | Speech Recognition |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1176/ |
PWC | https://paperswithcode.com/paper/deep-lstm-based-feature-mapping-for-query |
Repo | |
Framework | |
Sinhala Short Sentence Similarity Calculation using Corpus-Based and Knowledge-Based Similarity Measures
Title | Sinhala Short Sentence Similarity Calculation using Corpus-Based and Knowledge-Based Similarity Measures |
Authors | Jcs Kadupitiya, Surangika Ranathunga, Gihan Dias |
Abstract | Currently, corpus-based similarity, string-based similarity, and knowledge-based similarity techniques are used to compare short phrases. However, no work has been conducted on the similarity of phrases in the Sinhala language. In this paper, we present a hybrid methodology to compute the similarity between two Sinhala sentences using a Semantic Similarity Measurement technique (corpus-based similarity measurement plus knowledge-based similarity measurement) that makes use of word order information. Since Sinhala WordNet is still under construction, we used lexical resources in performing this semantic similarity calculation. Evaluation using 4000 sentence pairs yielded an average MSE of 0.145 and a Pearson correlation factor of 0.832. |
Tasks | Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3705/ |
PWC | https://paperswithcode.com/paper/sinhala-short-sentence-similarity-calculation |
Repo | |
Framework | |
WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation on Rare Words
Title | WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation on Rare Words |
Authors | Luisa Bentivogli, Mauro Cettolo, M. Amin Farajian, Marcello Federico |
Abstract | This paper presents WAGS (Word Alignment Gold Standard), a novel benchmark which allows extensive evaluation of WA tools on out-of-vocabulary (OOV) and rare words. WAGS is a subset of the Common Test section of the Europarl English-Italian parallel corpus, and is specifically tailored to OOV and rare words. WAGS is composed of 6,715 sentence pairs containing 11,958 occurrences of OOV and rare words up to frequency 15 in the Europarl Training set (5,080 English words and 6,878 Italian words), representing almost 3% of the whole text. Since WAGS is focused on OOV/rare words, manual alignments are provided for these words only, and not for the whole sentences. Two off-the-shelf word aligners have been evaluated on WAGS, and results have been compared to those obtained on an existing benchmark tailored to full text alignment. The results obtained confirm that WAGS is a valuable resource, which allows a statistically sound evaluation of WA systems' performance on OOV and rare words, as well as extensive data analyses. WAGS is publicly released under a Creative Commons Attribution license. |
Tasks | Word Alignment |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1562/ |
PWC | https://paperswithcode.com/paper/wags-a-beautiful-english-italian-benchmark |
Repo | |
Framework | |
Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets
Title | Crowdsourcing-based Annotation of Emotions in Filipino and English Tweets |
Authors | Fermin Roberto Lapitan, Riza Theresa Batista-Navarro, Eliezer Albacea |
Abstract | The automatic analysis of emotions conveyed in social media content, e.g., tweets, has many beneficial applications. In the Philippines, one of the most disaster-prone countries in the world, such methods could potentially enable first responders to make timely decisions despite the risk of data deluge. However, recognising emotions expressed in Philippine-generated tweets, which are mostly written in Filipino, English or a mix of both, is a non-trivial task. In order to facilitate the development of natural language processing (NLP) methods that will automate such type of analysis, we have built a corpus of tweets whose predominant emotions have been manually annotated by means of crowdsourcing. Defining measures ensuring that only high-quality annotations were retained, we have produced a gold standard corpus of 1,146 emotion-labelled Filipino and English tweets. We validate the value of this manually produced resource by demonstrating that an automatic emotion-prediction method based on the use of a publicly available word-emotion association lexicon was unable to reproduce the labels assigned via crowdsourcing. While we are planning to make a few extensions to the corpus in the near future, its current version has been made publicly available in order to foster the development of emotion analysis methods based on advanced Filipino and English NLP. |
Tasks | Emotion Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3708/ |
PWC | https://paperswithcode.com/paper/crowdsourcing-based-annotation-of-emotions-in |
Repo | |
Framework | |
Automatic Creation of a Sentence Aligned Sinhala-Tamil Parallel Corpus
Title | Automatic Creation of a Sentence Aligned Sinhala-Tamil Parallel Corpus |
Authors | Riyafa Abdul Hameed, Nadeeshani Pathirennehelage, Anusha Ihalapathirana, Maryam Ziyad Mohamed, Surangika Ranathunga, Sanath Jayasena, Gihan Dias, Sandareka Fernando |
Abstract | A sentence aligned parallel corpus is an important prerequisite in statistical machine translation. However, manual creation of such a parallel corpus is time consuming, and requires experts fluent in both languages. Automatic creation of a sentence aligned parallel corpus using parallel text is the solution to this problem. In this paper, we present the first ever empirical evaluation carried out to identify the best method to automatically create a sentence aligned Sinhala-Tamil parallel corpus. Annual reports from Sri Lankan government institutions were used as the parallel text for aligning. Despite both Sinhala and Tamil being under-resourced languages, we were able to achieve an F-score value of 0.791 using a hybrid approach that makes use of a bilingual dictionary. |
Tasks | Machine Translation, Word Alignment |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3713/ |
PWC | https://paperswithcode.com/paper/automatic-creation-of-a-sentence-aligned |
Repo | |
Framework | |
Clustering-based Phonetic Projection in Mismatched Crowdsourcing Channels for Low-resourced ASR
Title | Clustering-based Phonetic Projection in Mismatched Crowdsourcing Channels for Low-resourced ASR |
Authors | Wenda Chen, Mark Hasegawa-Johnson, Nancy Chen, Preethi Jyothi, Lav Varshney |
Abstract | Acquiring labeled speech for low-resource languages is a difficult task in the absence of native speakers of the language. One solution to this problem involves collecting speech transcriptions from crowd workers who are foreign or non-native speakers of a given target language. From these mismatched transcriptions, one can derive probabilistic phone transcriptions that are defined over the set of all target language phones using a noisy channel model. This paper extends prior work on deriving probabilistic transcriptions (PTs) from mismatched transcriptions by 1) modelling multilingual channels and 2) introducing a clustering-based phonetic mapping technique to improve the quality of PTs. Mismatched crowdsourcing for multilingual channels has certain properties of projection mapping, e.g., it can be interpreted as a clustering based on singular value decomposition of the segment alignments. To this end, we explore the use of distinctive feature weights, lexical tone confusions, and a two-step clustering algorithm to learn projections of phoneme segments from mismatched multilingual transcriber languages to the target language. We evaluate our techniques using mismatched transcriptions for Cantonese speech acquired from native English and Mandarin speakers. We observe a 5-9% relative reduction in phone error rate for the predicted Cantonese phone transcriptions using our proposed techniques compared with the previous PT method. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3714/ |
PWC | https://paperswithcode.com/paper/clustering-based-phonetic-projection-in |
Repo | |
Framework | |
DISCO: A System Leveraging Semantic Search in Document Review
Title | DISCO: A System Leveraging Semantic Search in Document Review |
Authors | Ngoc Phuoc An Vo, Fabien Guillot, Caroline Privault |
Abstract | This paper presents Disco, a prototype for supporting knowledge workers in exploring, reviewing and sorting collections of textual data. The goal is to facilitate, accelerate and improve the discovery of information. To this end, it combines Semantic Relatedness techniques with a review workflow developed in a tangible environment. Disco uses a semantic model that is leveraged on-line in the course of search sessions, and accessed through natural hand-gesture, in a simple and intuitive way. |
Tasks | Text Categorization, Text Clustering |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2014/ |
PWC | https://paperswithcode.com/paper/disco-a-system-leveraging-semantic-search-in |
Repo | |
Framework | |
Accounting ngrams and multi-word terms can improve topic models
Title | Accounting ngrams and multi-word terms can improve topic models |
Authors | Michael Nokel, Natalia Loukachevitch |
Abstract | |
Tasks | Information Retrieval, Text Clustering, Topic Models, Word Sense Disambiguation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1806/ |
PWC | https://paperswithcode.com/paper/accounting-ngrams-and-multi-word-terms-can |
Repo | |
Framework | |
Improve Sentiment Analysis of Citations with Author Modelling
Title | Improve Sentiment Analysis of Citations with Author Modelling |
Authors | Zheng Ma, Jinseok Nam, Karsten Weihe |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0420/ |
PWC | https://paperswithcode.com/paper/improve-sentiment-analysis-of-citations-with |
Repo | |
Framework | |
Active Learning with Oracle Epiphany
Title | Active Learning with Oracle Epiphany |
Authors | Tzu-Kuo Huang, Lihong Li, Ara Vartanian, Saleema Amershi, Jerry Zhu |
Abstract | We present a theoretical analysis of active learning with more realistic interactions with human oracles. Previous empirical studies have shown oracles abstaining on difficult queries until accumulating enough information to make label decisions. We formalize this phenomenon with an “oracle epiphany model” and analyze active learning query complexity under such oracles for both the realizable and the agnostic cases. Our analysis shows that active learning is possible with oracle epiphany, but incurs an additional cost depending on when the epiphany happens. Our results suggest new, principled active learning approaches with realistic oracles. |
Tasks | Active Learning |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6155-active-learning-with-oracle-epiphany |
PDF | http://papers.nips.cc/paper/6155-active-learning-with-oracle-epiphany.pdf |
PWC | https://paperswithcode.com/paper/active-learning-with-oracle-epiphany |
Repo | |
Framework | |
Coreference Annotation Scheme and Relation Types for Hindi
Title | Coreference Annotation Scheme and Relation Types for Hindi |
Authors | Vandan Mujadia, Palash Gupta, Dipti Misra Sharma |
Abstract | This paper describes a coreference annotation scheme, coreference annotation specific issues and their solutions through our proposed annotation scheme for Hindi. We introduce different co-reference relation types between continuous mentions of the same coreference chain such as "Part-of", "Function-value pair" etc. We used Jaccard similarity based Krippendorff's alpha to demonstrate consistency in annotation scheme, annotation and corpora. To ease the coreference annotation process, we built a semi-automatic Coreference Annotation Tool (CAT). We also provide statistics of coreference annotation on Hindi Dependency Treebank (HDTB). |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1025/ |
PWC | https://paperswithcode.com/paper/coreference-annotation-scheme-and-relation |
Repo | |
Framework | |