July 26, 2019


Paper Group NANR 138

Correcting General Purpose ASR Errors using Posteriors. Multi-source morphosyntactic tagging for spoken Rusyn. Did you ever read about Frogs drinking Coffee? Investigating the Compositionality of Multi-Emoji Expressions. PLN-PUCRS at EmoInt-2017: Psycholinguistic features for emotion intensity prediction in tweets. Stanford’s Graph-based Neural Dep …

Correcting General Purpose ASR Errors using Posteriors

Title Correcting General Purpose ASR Errors using Posteriors
Authors Sunil Kumar Kopparapu, C. Anantaram
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-7535/
PDF https://www.aclweb.org/anthology/W17-7535
PWC https://paperswithcode.com/paper/correcting-general-purpose-asr-errors-using
Repo
Framework

Multi-source morphosyntactic tagging for spoken Rusyn

Title Multi-source morphosyntactic tagging for spoken Rusyn
Authors Yves Scherrer, Achim Rabus
Abstract This paper deals with the development of morphosyntactic taggers for spoken varieties of the Slavic minority language Rusyn. As neither annotated corpora nor parallel corpora are electronically available for Rusyn, we propose to combine existing resources from the etymologically close Slavic languages Russian, Ukrainian, Slovak, and Polish and adapt them to Rusyn. Using MarMoT as tagging toolkit, we show that a tagger trained on a balanced set of the four source languages outperforms single language taggers by about 9%, and that additional automatically induced morphosyntactic lexicons lead to further improvements. The best observed accuracies for Rusyn are 82.4% for part-of-speech tagging and 75.5% for full morphological tagging.
Tasks Morphological Tagging, Part-Of-Speech Tagging
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1210/
PDF https://www.aclweb.org/anthology/W17-1210
PWC https://paperswithcode.com/paper/multi-source-morphosyntactic-tagging-for
Repo
Framework

Did you ever read about Frogs drinking Coffee? Investigating the Compositionality of Multi-Emoji Expressions

Title Did you ever read about Frogs drinking Coffee? Investigating the Compositionality of Multi-Emoji Expressions
Authors Rebeca Padilla López, Fabienne Cap
Abstract In this work, we present a first attempt to investigate multi-emoji expressions and whether they behave similarly to multiword expressions in terms of non-compositionality. We focus on the combination of the frog and the hot beverage emoji, but also show some preliminary results for other non-compositional emoji combinations. We use off-the-shelf sentiment analysers as well as manual classifications to approach the compositionality of these emoji combinations.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5215/
PDF https://www.aclweb.org/anthology/W17-5215
PWC https://paperswithcode.com/paper/did-you-ever-read-about-frogs-drinking-coffee
Repo
Framework

PLN-PUCRS at EmoInt-2017: Psycholinguistic features for emotion intensity prediction in tweets

Title PLN-PUCRS at EmoInt-2017: Psycholinguistic features for emotion intensity prediction in tweets
Authors Henrique Santos, Renata Vieira
Abstract Linguistic Inquiry and Word Count (LIWC) is a rich dictionary that maps words into several psychological categories such as Affective, Social, Cognitive, Perceptual and Biological processes. In this work, we have used LIWC psycholinguistic categories to train regression models and predict emotion intensity in tweets for the EmoInt-2017 task. Results show that LIWC features may boost emotion intensity prediction on the basis of a low dimension set.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5225/
PDF https://www.aclweb.org/anthology/W17-5225
PWC https://paperswithcode.com/paper/pln-pucrs-at-emoint-2017-psycholinguistic
Repo
Framework

Stanford’s Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task

Title Stanford’s Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task
Authors Timothy Dozat, Peng Qi, Christopher D. Manning
Abstract This paper describes the neural dependency parser submitted by Stanford to the CoNLL 2017 Shared Task on parsing Universal Dependencies. Our system uses relatively simple LSTM networks to produce part of speech tags and labeled dependency parses from segmented and tokenized sequences of words. In order to address the rare word problem that abounds in languages with complex morphology, we include a character-based word representation that uses an LSTM to produce embeddings from sequences of characters. Our system was ranked first according to all five relevant metrics for the system: UPOS tagging (93.09%), XPOS tagging (82.27%), unlabeled attachment score (81.30%), labeled attachment score (76.30%), and content word labeled attachment score (72.57%).
Tasks Accuracy Metrics, Dependency Parsing
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-3002/
PDF https://www.aclweb.org/anthology/K17-3002
PWC https://paperswithcode.com/paper/stanfords-graph-based-neural-dependency
Repo
Framework
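
The attachment scores quoted in the abstract above can be illustrated with a small sketch. It uses the standard definitions: unlabeled attachment score (UAS) is the share of tokens whose predicted head is correct, and labeled attachment score (LAS) additionally requires the correct dependency label. The function name `attachment_scores`, the token representation, and the toy data are assumptions made for illustration; this is not the shared task's official evaluation script.

```python
from typing import List, Tuple

# Each token is (head_index, dependency_label); head 0 denotes the root.
# This representation is an assumption made for the sketch.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions in [0, 1] over aligned token lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS: only the head index has to match.
    head_hits = sum(1 for (gh, _), (ph, _) in zip(gold, pred) if gh == ph)
    # LAS: both the head index and the label have to match.
    label_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return head_hits / len(gold), label_hits / len(gold)


# Toy example with invented data (not taken from the shared task).
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "iobj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")  # prints UAS=0.75 LAS=0.50
```

In the toy example the third token has a wrong head and the fourth has the right head but a wrong label, so three of four heads and two of four head-label pairs are correct.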

IITP at EmoInt-2017: Measuring Intensity of Emotions using Sentence Embeddings and Optimized Features

Title IITP at EmoInt-2017: Measuring Intensity of Emotions using Sentence Embeddings and Optimized Features
Authors Md Shad Akhtar, Palaash Sawant, Asif Ekbal, Jyoti Pawar, Pushpak Bhattacharyya
Abstract This paper describes the system that we submitted as part of our participation in the shared task on Emotion Intensity (EmoInt-2017). We propose a Long short term memory (LSTM) based architecture cascaded with Support Vector Regressor (SVR) for intensity prediction. We also employ Particle Swarm Optimization (PSO) based feature selection algorithm for obtaining an optimized feature set for training and evaluation. System evaluation shows interesting results on the four emotion datasets i.e. anger, fear, joy and sadness. In comparison to the other participating teams our system was ranked 5th in the competition.
Tasks Emotion Recognition, Feature Selection, Sentence Embeddings, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5229/
PDF https://www.aclweb.org/anthology/W17-5229
PWC https://paperswithcode.com/paper/iitp-at-emoint-2017-measuring-intensity-of
Repo
Framework

Deep Learning for Biomedical Information Retrieval: Learning Textual Relevance from Click Logs

Title Deep Learning for Biomedical Information Retrieval: Learning Textual Relevance from Click Logs
Authors Sunil Mohan, Nicolas Fiorini, Sun Kim, Zhiyong Lu
Abstract We describe a Deep Learning approach to modeling the relevance of a document's text to a query, applied to biomedical literature. Instead of mapping each document and query to a common semantic space, we compute a variable-length difference vector between the query and document which is then passed through a deep convolution stage followed by a deep regression network to produce the estimated probability of the document's relevance to the query. Despite the small amount of training data, this approach produces a more robust predictor than computing similarities between semantic vector representations of the query and document, and also results in significant improvements over traditional IR text factors. In the future, we plan to explore its application in improving PubMed search.
Tasks Information Retrieval, Text Matching
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2328/
PDF https://www.aclweb.org/anthology/W17-2328
PWC https://paperswithcode.com/paper/deep-learning-for-biomedical-information
Repo
Framework

Traversal-Free Word Vector Evaluation in Analogy Space

Title Traversal-Free Word Vector Evaluation in Analogy Space
Authors Xiaoyin Che, Nico Ring, Willi Raschkowski, Haojin Yang, Christoph Meinel
Abstract In this paper, we propose an alternative evaluation metric for word analogy questions (A is to B as C is to D) in word vector evaluation. Unlike the traditional method, which predicts the fourth word from the given three, we measure the similarity directly on the "relations" of the two given word pairs, in effect shifting the relation vectors into a new analogy space. Cosine and Euclidean distances are then calculated as measurements. Observation and experiments show that the proposed analogy space evaluation can offer a more comprehensive evaluation of word vectors with word analogy questions. Meanwhile, computational complexity is remarkably reduced by avoiding traversal of the vocabulary.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5302/
PDF https://www.aclweb.org/anthology/W17-5302
PWC https://paperswithcode.com/paper/traversal-free-word-vector-evaluation-in
Repo
Framework
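
A minimal sketch of the idea in the abstract above, assuming that the "relation" of a word pair is simply the difference of the two word vectors. The abstract does not spell out how the analogy space is constructed, so that choice, the helper names, and the toy embeddings are all assumptions. The point it shows is that both relation vectors are compared directly, with no search over the vocabulary for a fourth word.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def relation(a: Vector, b: Vector) -> Vector:
    """Relation vector of a word pair, assumed here to be b - a."""
    return [y - x for x, y in zip(a, b)]


def cosine_similarity(u: Vector, v: Vector) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_u * norm_v)


def euclidean_distance(u: Vector, v: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def analogy_scores(vectors: Dict[str, Vector],
                   a: str, b: str, c: str, d: str) -> Tuple[float, float]:
    """Compare relation (a -> b) with relation (c -> d) without scanning the vocabulary."""
    r1 = relation(vectors[a], vectors[b])
    r2 = relation(vectors[c], vectors[d])
    return cosine_similarity(r1, r2), euclidean_distance(r1, r2)


# Invented two-dimensional embeddings, for illustration only.
toy = {
    "man": [1.0, 0.0], "woman": [1.0, 1.0],
    "king": [3.0, 0.0], "queen": [3.0, 1.0],
}
cos, dist = analogy_scores(toy, "man", "woman", "king", "queen")
print(cos, dist)  # prints 1.0 0.0
```

Each question costs a fixed number of vector operations, whereas the traditional evaluation has to rank every vocabulary entry as a candidate for the fourth word.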

Iconic Locations in Swedish Sign Language: Mapping Form to Meaning with Lexical Databases

Title Iconic Locations in Swedish Sign Language: Mapping Form to Meaning with Lexical Databases
Authors Carl Börstell, Robert Östling
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0226/
PDF https://www.aclweb.org/anthology/W17-0226
PWC https://paperswithcode.com/paper/iconic-locations-in-swedish-sign-language
Repo
Framework

Recognizing Textual Entailment in Twitter Using Word Embeddings

Title Recognizing Textual Entailment in Twitter Using Word Embeddings
Authors Octavia-Maria Şulea
Abstract In this paper, we investigate the application of machine learning techniques and word embeddings to the task of Recognizing Textual Entailment (RTE) in Social Media. We look at a manually labeled dataset consisting of user generated short texts posted on Twitter (tweets) and related to four recent media events (the Charlie Hebdo shooting, the Ottawa shooting, the Sydney Siege, and the German Wings crash) and test to what extent neural techniques and embeddings are able to distinguish between tweets that entail or contradict each other or that claim unrelated things. We obtain comparable results to the state of the art in a train-test setting, but we show that, due to the noisy aspect of the data, results plummet in an evaluation strategy crafted to better simulate a real-life train-test scenario.
Tasks Information Retrieval, Machine Translation, Natural Language Inference, Question Answering, Text Classification, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5306/
PDF https://www.aclweb.org/anthology/W17-5306
PWC https://paperswithcode.com/paper/recognizing-textual-entailment-in-twitter
Repo
Framework

Towards Full Text Shallow Discourse Relation Annotation: Experiments with Cross-Paragraph Implicit Relations in the PDTB

Title Towards Full Text Shallow Discourse Relation Annotation: Experiments with Cross-Paragraph Implicit Relations in the PDTB
Authors Rashmi Prasad, Katherine Forbes Riley, Alan Lee
Abstract Full text discourse parsing relies on texts comprehensively annotated with discourse relations. To this end, we address a significant gap in the inter-sentential discourse relations annotated in the Penn Discourse Treebank (PDTB), namely the class of cross-paragraph implicit relations, which account for 30% of inter-sentential relations in the corpus. We present our annotation study to explore the incidence rate of adjacent vs. non-adjacent implicit relations in cross-paragraph contexts, and the relative degree of difficulty in annotating them. Our experiments show a high incidence of non-adjacent relations that are difficult to annotate reliably, suggesting the practicality of backing off from their annotation to reduce noise for corpus-based studies. Our resulting guidelines follow the PDTB adjacency constraint for implicits while employing an underspecified representation of non-adjacent implicits, and yield 62% inter-annotator agreement on this task.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-5502/
PDF https://www.aclweb.org/anthology/W17-5502
PWC https://paperswithcode.com/paper/towards-full-text-shallow-discourse-relation
Repo
Framework

As bases de dados verbais ADESSE e ViPEr: uma análise constrastiva das construções locativas em espanhol e em português (The verbal databases ADESSE and ViPEr: a contrastive analysis of locative constructs in Spanish and Portuguese)[In Portuguese]

Title As bases de dados verbais ADESSE e ViPEr: uma análise constrastiva das construções locativas em espanhol e em português (The verbal databases ADESSE and ViPEr: a contrastive analysis of locative constructs in Spanish and Portuguese)[In Portuguese]
Authors Roana Rodrigues, Oto Vale, Laura Alonso Alemany
Abstract
Tasks
Published 2017-10-01
URL https://www.aclweb.org/anthology/W17-6631/
PDF https://www.aclweb.org/anthology/W17-6631
PWC https://paperswithcode.com/paper/as-bases-de-dados-verbais-adesse-e-viper-uma
Repo
Framework

Multilingual and Cross-Lingual Complex Word Identification

Title Multilingual and Cross-Lingual Complex Word Identification
Authors Seid Muhie Yimam, Sanja Štajner, Martin Riedl, Chris Biemann
Abstract Complex Word Identification (CWI) is an important task in lexical simplification and text accessibility. Due to the lack of CWI datasets, previous works largely depend on Simple English Wikipedia and edit histories for obtaining 'gold standard' annotations, which are of doubtable quality, and limited only to English. We collect complex words/phrases (CP) for English, German and Spanish, annotated by both native and non-native speakers, and propose language independent features that can be used to train multilingual and cross-lingual CWI models. We show that the performance of cross-lingual CWI systems (using a model trained on one language and applying it on the other languages) is comparable to the performance of monolingual CWI systems.
Tasks Complex Word Identification, Lexical Simplification
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1104/
PDF https://doi.org/10.26615/978-954-452-049-6_104
PWC https://paperswithcode.com/paper/multilingual-and-cross-lingual-complex-word
Repo
Framework

Detecting Untranslated Content for Neural Machine Translation

Title Detecting Untranslated Content for Neural Machine Translation
Authors Isao Goto, Hideki Tanaka
Abstract Despite its promise, neural machine translation (NMT) has a serious problem in that source content may be mistakenly left untranslated. The ability to detect untranslated content is important for the practical use of NMT. We evaluate two types of probability with which to detect untranslated content: the cumulative attention (ATN) probability and back translation (BT) probability from the target sentence to the source sentence. Experiments on detecting untranslated content in Japanese-English patent translations show that ATN and BT are each more effective than random choice, BT is more effective than ATN, and the combination of the two provides further improvements. We also confirmed the effectiveness of using ATN and BT to rerank the n-best NMT outputs.
Tasks Machine Translation
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-3206/
PDF https://www.aclweb.org/anthology/W17-3206
PWC https://paperswithcode.com/paper/detecting-untranslated-content-for-neural
Repo
Framework
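
A rough sketch of how accumulated attention could be used to flag source content that may have been left untranslated, as discussed in the abstract above. The abstract does not give the authors' exact ATN probability, so this is a simplified stand-in: it sums each source token's attention weight over all target steps and flags tokens whose total falls below a threshold. The function names, the threshold value, and the attention matrix are hypothetical.

```python
from typing import List


def cumulative_attention(attention: List[List[float]]) -> List[float]:
    """Sum attention over target steps for each source position.

    attention[t][j] is the weight placed on source token j when
    generating target token t (one row per target step).
    """
    if not attention:
        return []
    n_src = len(attention[0])
    if any(len(row) != n_src for row in attention):
        raise ValueError("all attention rows must cover the same source length")
    return [sum(row[j] for row in attention) for j in range(n_src)]


def flag_low_coverage(attention: List[List[float]], threshold: float = 0.3) -> List[int]:
    """Return source positions whose accumulated attention is below the threshold.

    The threshold is an arbitrary value chosen for this sketch.
    """
    totals = cumulative_attention(attention)
    return [j for j, total in enumerate(totals) if total < threshold]


# Invented attention matrix: 2 target steps over 3 source tokens.
attention = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
]
print(flag_low_coverage(attention))  # prints [2]
```

Here the third source token receives almost no attention at any target step, which is the kind of signal that a cumulative-attention check is meant to surface.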

Demonstration of interactive teaching for end-to-end dialog control with hybrid code networks

Title Demonstration of interactive teaching for end-to-end dialog control with hybrid code networks
Authors Jason D. Williams, Lars Liden
Abstract This is a demonstration of interactive teaching for practical end-to-end dialog systems driven by a recurrent neural network. In this approach, a developer teaches the network by interacting with the system and providing on-the-spot corrections. Once a system is deployed, a developer can also correct mistakes in logged dialogs. This demonstration shows both of these teaching methods applied to dialog systems in three domains: pizza ordering, restaurant information, and weather forecasts.
Tasks Dialog Learning, Entity Extraction, Intent Detection
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-5511/
PDF https://www.aclweb.org/anthology/W17-5511
PWC https://paperswithcode.com/paper/demonstration-of-interactive-teaching-for-end
Repo
Framework