May 4, 2019

Paper Group NANR 202

Classifying ASR Transcriptions According to Arabic Dialect

Title Classifying ASR Transcriptions According to Arabic Dialect
Authors Abualsoud Hanani, Aziz Qaroush, Stephen Taylor
Abstract We describe several systems for identifying short samples of Arabic dialects. The systems were prepared for the shared task of the 2016 DSL Workshop. Our best system, an SVM using character tri-gram features, achieved an accuracy on the test data for the task of 0.4279, compared to a baseline of 0.20 for chance guesses or 0.2279 if we had always chosen the same most frequent class in the test set. This compares with the results of the team with the best weighted F1 score, which was an accuracy of 0.5117. The team entries seem to fall into cohorts, with all the teams in a cohort within a standard-deviation of each other, and our three entries are in the third cohort, which is about seven standard deviations from the top.
Tasks Language Modelling
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4817/
PDF https://www.aclweb.org/anthology/W16-4817
PWC https://paperswithcode.com/paper/classifying-asr-transcriptions-according-to
Repo
Framework
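
The abstract above names character tri-gram features and a most-frequent-class baseline. The following is a minimal sketch of those two ingredients only, not the authors' system: the SVM itself is omitted, and the function names, the space padding, and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def char_trigrams(text: str) -> Counter:
    """Count overlapping character trigrams.

    Assumption: one space of padding on each side so that word-initial
    and word-final characters also appear in a trigram.
    """
    padded = f" {text} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def majority_baseline_accuracy(labels: list) -> float:
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical toy input; not data from the shared task.
features = char_trigrams("salam")
toy_labels = ["A", "B", "C", "D", "E", "A"]
baseline = majority_baseline_accuracy(toy_labels)
```

Counts like these would typically be turned into a sparse vector per sample before being passed to a classifier. The baseline helper mirrors the "always choose the most frequent class" comparison in the abstract.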

Correcting Errors in a Treebank Based on Tree Mining

Title Correcting Errors in a Treebank Based on Tree Mining
Authors Kanta Suzuki, Yoshihide Kato, Shigeki Matsubara
Abstract This paper provides a new method to correct annotation errors in a treebank. The previous error correction method constructs a pseudo parallel corpus where incorrect partial parse trees are paired with correct ones, and extracts error correction rules from the parallel corpus. By applying these rules to a treebank, the method corrects errors. However, this method does not achieve wide coverage of error correction. To achieve wide coverage, our method adopts a different approach. In our method, we consider that an infrequent pattern which can be transformed to a frequent one is an annotation error pattern. Based on a tree mining technique, our method seeks such infrequent tree patterns, and constructs error correction rules each of which consists of an infrequent pattern and a corresponding frequent pattern. We conducted an experiment using the Penn Treebank. We obtained 1,987 rules which are not constructed by the previous method, and the rules achieved good precision.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1244/
PDF https://www.aclweb.org/anthology/L16-1244
PWC https://paperswithcode.com/paper/correcting-errors-in-a-treebank-based-on-tree
Repo
Framework
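
The core idea in the abstract above, that an infrequent pattern which can be transformed into a frequent one is a candidate annotation error, can be sketched on a much simpler representation. This is an illustrative simplification rather than the paper's tree mining method: patterns are flat tuples of labels, "transformable" is assumed to mean "differs in exactly one label", and the thresholds `min_freq` and `max_rare` are hypothetical.

```python
from collections import Counter


def propose_rules(patterns, min_freq=3, max_rare=1):
    """Pair each rare pattern with frequent patterns one label away.

    Assumption: a pattern is a flat tuple such as ("NP", "DT", "NN"),
    and a single label substitution is the only allowed transformation.
    """
    counts = Counter(patterns)
    frequent = [p for p, c in counts.items() if c >= min_freq]
    rules = []
    for rare, count in counts.items():
        if count > max_rare:
            continue
        for freq in frequent:
            same_length = len(rare) == len(freq)
            if same_length and sum(a != b for a, b in zip(rare, freq)) == 1:
                rules.append((rare, freq))  # (suspected error, correction)
    return rules


# Hypothetical toy observations; not drawn from the Penn Treebank.
observed = (
    [("NP", "DT", "NN")] * 4
    + [("NP", "DT", "VB")]
    + [("VP", "VB", "NP")] * 3
)
rules = propose_rules(observed)
```

On this toy input the single rare pattern is paired with the frequent pattern that differs from it only in the last label. Real tree patterns would need a proper subtree representation and a richer notion of transformation.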

UTA DLNLP at SemEval-2016 Task 12: Deep Learning Based Natural Language Processing System for Clinical Information Identification from Clinical Notes and Pathology Reports

Title UTA DLNLP at SemEval-2016 Task 12: Deep Learning Based Natural Language Processing System for Clinical Information Identification from Clinical Notes and Pathology Reports
Authors Peng Li, Heng Huang
Abstract
Tasks Information Retrieval, Language Modelling, Machine Translation, Named Entity Recognition, Paraphrase Identification, Question Answering, Representation Learning, Semantic Role Labeling
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1197/
PDF https://www.aclweb.org/anthology/S16-1197
PWC https://paperswithcode.com/paper/uta-dlnlp-at-semeval-2016-task-12-deep
Repo
Framework

Dealing with word-internal modification and spelling variation in data-driven lemmatization

Title Dealing with word-internal modification and spelling variation in data-driven lemmatization
Authors Fabian Barteld, Ingrid Schröder, Heike Zinsmeister
Abstract
Tasks Information Retrieval, Lemmatization
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2106/
PDF https://www.aclweb.org/anthology/W16-2106
PWC https://paperswithcode.com/paper/dealing-with-word-internal-modification-and
Repo
Framework

EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis

Title EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis
Authors David Vilares, Miguel A. Alonso, Carlos Gómez-Rodríguez
Abstract Code-switching texts are those that contain terms in two or more different languages, and they appear increasingly often in social media. The aim of this paper is to provide a resource to the research community to evaluate the performance of sentiment classification techniques on this complex multilingual environment, proposing an English-Spanish corpus of tweets with code-switching (EN-ES-CS CORPUS). The tweets are labeled according to two well-known criteria used for this purpose: SentiStrength and a trinary scale (positive, neutral and negative categories). Preliminary work on the resource is already done, providing a set of baselines for the research community.
Tasks Sentiment Analysis
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1655/
PDF https://www.aclweb.org/anthology/L16-1655
PWC https://paperswithcode.com/paper/en-es-cs-an-english-spanish-code-switching
Repo
Framework

WordForce: Visualizing Controversial Words in Debates

Title WordForce: Visualizing Controversial Words in Debates
Authors Wei-Fan Chen, Fang-Yu Lin, Lun-Wei Ku
Abstract This paper presents WordForce, a system powered by the state of the art neural network model to visualize the learned user-dependent word embeddings from each post according to the post content and its engaged users. It generates the scatter plots to show the force of a word, i.e., whether the semantics of word embeddings from posts of different stances are clearly separated from the aspect of this controversial word. In addition, WordForce provides the dispersion and the distance of word embeddings from posts of different stance groups, and proposes the most controversial words accordingly to show clues to what people argue about in a debate.
Tasks Sentiment Analysis, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2057/
PDF https://www.aclweb.org/anthology/C16-2057
PWC https://paperswithcode.com/paper/wordforce-visualizing-controversial-words-in
Repo
Framework

Classifying Emotions in Customer Support Dialogues in Social Media

Title Classifying Emotions in Customer Support Dialogues in Social Media
Authors Jonathan Herzig, Guy Feigenblat, Michal Shmueli-Scheuer, David Konopnicki, Anat Rafaeli, Daniel Altman, David Spivak
Abstract
Tasks
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3609/
PDF https://www.aclweb.org/anthology/W16-3609
PWC https://paperswithcode.com/paper/classifying-emotions-in-customer-support
Repo
Framework

Universal Dependencies for Persian

Title Universal Dependencies for Persian
Authors Mojgan Seraji, Filip Ginter, Joakim Nivre
Abstract The Persian Universal Dependency Treebank (Persian UD) is a recent effort of treebanking Persian with Universal Dependencies (UD), an ongoing project that designs unified and cross-linguistically valid grammatical representations including part-of-speech tags, morphological features, and dependency relations. The Persian UD is the converted version of the Uppsala Persian Dependency Treebank (UPDT) to the universal dependencies framework and consists of nearly 6,000 sentences and 152,871 word tokens with an average sentence length of 25 words. In addition to the universal dependencies syntactic annotation guidelines, the two treebanks differ in tokenization. All words containing unsegmented clitics (pronominal and copula clitics) annotated with complex labels in the UPDT have been separated from the clitics and appear with distinct labels in the Persian UD. The treebank has its original syntactic annotation scheme based on Stanford Typed Dependencies. In this paper, we present the approaches taken in the development of the Persian UD.
Tasks Tokenization
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1374/
PDF https://www.aclweb.org/anthology/L16-1374
PWC https://paperswithcode.com/paper/universal-dependencies-for-persian
Repo
Framework

Retrieval Term Prediction Using Deep Learning Methods

Title Retrieval Term Prediction Using Deep Learning Methods
Authors Qing Ma, Ibuki Tanigawa, Masaki Murata
Abstract
Tasks Chunking, Denoising, Information Retrieval, Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Semantic Role Labeling, Speech Recognition
Published 2016-10-01
URL https://www.aclweb.org/anthology/Y16-3001/
PDF https://www.aclweb.org/anthology/Y16-3001
PWC https://paperswithcode.com/paper/retrieval-term-prediction-using-deep-learning
Repo
Framework

Empirical comparison of dependency conversions for RST discourse trees

Title Empirical comparison of dependency conversions for RST discourse trees
Authors Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata
Abstract
Tasks Text Summarization
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3616/
PDF https://www.aclweb.org/anthology/W16-3616
PWC https://paperswithcode.com/paper/empirical-comparison-of-dependency
Repo
Framework

Extracting PDTB Discourse Relations from Student Essays

Title Extracting PDTB Discourse Relations from Student Essays
Authors Kate Forbes-Riley, Fan Zhang, Diane Litman
Abstract
Tasks
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3615/
PDF https://www.aclweb.org/anthology/W16-3615
PWC https://paperswithcode.com/paper/extracting-pdtb-discourse-relations-from
Repo
Framework

Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features

Title Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features
Authors Maximilian Köper, Sabine Schulte im Walde
Abstract
Tasks Machine Translation, Word Sense Disambiguation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2042/
PDF https://www.aclweb.org/anthology/P16-2042
PWC https://paperswithcode.com/paper/automatic-semantic-classification-of-german
Repo
Framework

Automatic parsing as an efficient pre-annotation tool for historical texts

Title Automatic parsing as an efficient pre-annotation tool for historical texts
Authors Hanne Martine Eckhoff, Aleksandrs Berdičevskis
Abstract Historical treebanks tend to be manually annotated, which is not surprising, since state-of-the-art parsers are not accurate enough to ensure high-quality annotation for historical texts. We test whether automatic parsing can be an efficient pre-annotation tool for Old East Slavic texts. We use the TOROT treebank from the PROIEL treebank family. We convert the PROIEL format to the CONLL format and use MaltParser to create syntactic pre-annotation. Using the most conservative evaluation method, which takes into account PROIEL-specific features, MaltParser by itself yields 0.845 unlabelled attachment score, 0.779 labelled attachment score and 0.741 secondary dependency accuracy (note, though, that the test set comes from a relatively simple genre and contains rather short sentences). Experiments with human annotators show that preparsing, if limited to sentences where no changes to word or sentence boundaries are required, increases their annotation rate. For experienced annotators, the speed gain varies from 5.80% to 16.57%, for inexperienced annotators from 14.61% to 32.17% (using conservative estimates). There are no strong reliable differences in the annotation accuracy, which means that there is no reason to suspect that using preparsing might lower the final annotation quality.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4009/
PDF https://www.aclweb.org/anthology/W16-4009
PWC https://paperswithcode.com/paper/automatic-parsing-as-an-efficient-pre
Repo
Framework
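
The unlabelled and labelled attachment scores quoted in the abstract above follow the standard definitions: the fraction of tokens with the correct head, and the fraction with both the correct head and the correct relation label. The sketch below computes the plain versions only. It does not model the PROIEL-specific features or the secondary dependency accuracy the abstract mentions, and the function name, the token representation, and the toy relation labels are assumptions.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) for two aligned lists of (head_index, label) pairs."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and aligned")
    total = len(gold)
    # A token counts towards UAS when its predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # It counts towards LAS only when head and label both match.
    labelled_hits = sum(g == p for g, p in zip(gold, predicted))
    return head_hits / total, labelled_hits / total


# Hypothetical four-token sentence; head 0 marks the root.
gold = [(2, "sub"), (0, "pred"), (2, "obj"), (3, "atr")]
pred = [(2, "sub"), (0, "pred"), (2, "obl"), (2, "atr")]
uas, las = attachment_scores(gold, pred)
```

In this toy sentence, three of four heads are correct and two of four tokens have both head and label correct, so the labelled score can never exceed the unlabelled one.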

Fast and Easy Short Answer Grading with High Accuracy

Title Fast and Easy Short Answer Grading with High Accuracy
Authors Md Arafat Sultan, Cristobal Salazar, Tamara Sumner
Abstract
Tasks Semantic Textual Similarity
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1123/
PDF https://www.aclweb.org/anthology/N16-1123
PWC https://paperswithcode.com/paper/fast-and-easy-short-answer-grading-with-high
Repo
Framework

Semantic classifications for detection of verb metaphors

Title Semantic classifications for detection of verb metaphors
Authors Beata Beigman Klebanov, Chee Wee Leong, E. Dario Gutierrez, Ekaterina Shutova, Michael Flor
Abstract
Tasks Topic Models
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2017/
PDF https://www.aclweb.org/anthology/P16-2017
PWC https://paperswithcode.com/paper/semantic-classifications-for-detection-of
Repo
Framework