July 26, 2019

Paper Group NANR 128

Improving POS Tagging in Old Spanish Using TEITOK. Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language. The Labeled Segmentation of Printed Books. Detecting spelling variants in non-standard texts. Estonian Copular and Existential Constructions as an UD Annotation Problem. Churn Identification in Microblog …

Improving POS Tagging in Old Spanish Using TEITOK

Title Improving POS Tagging in Old Spanish Using TEITOK
Authors Maarten Janssen, Josep Ausensi, Josep Fontana
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0502/
PDF https://www.aclweb.org/anthology/W17-0502
PWC https://paperswithcode.com/paper/improving-pos-tagging-in-old-spanish-using
Repo
Framework

Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language

Title Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language
Authors Benyamin Ahmadnia, Javier Serrano, Gholamreza Haffari
Abstract This paper investigates the pivot language technique, in which a bridging language is used to improve the quality of Persian-Spanish low-resource Statistical Machine Translation (SMT). Here English serves as the bridging language: the Persian-English SMT system is combined with the English-Spanish one, so that the relatively large corpora available for each pair can support the Persian-Spanish pairing. Our results indicate that the pivot language technique outperforms the direct SMT processes currently in use between Persian and Spanish. Furthermore, we investigate the sentence-level and phrase-level pivot strategies in turn and demonstrate that, in the context of the Persian-Spanish SMT system, phrase-level pivoting outperforms sentence-level pivoting. Finally, we suggest a method called the combination model, in which the standard direct model and the best triangulation pivoting model are blended in order to reach a high-quality translation.
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1004/
PDF https://doi.org/10.26615/978-954-452-049-6_004
PWC https://paperswithcode.com/paper/persian-spanish-low-resource-statistical
Repo
Framework
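
Phrase-level pivoting of the kind this abstract describes is often realised by triangulating two phrase tables through their shared English side. The sketch below is a minimal illustration of that idea, not the authors' system: the function name `triangulate`, the toy phrase tables, and all probabilities are hypothetical.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Estimate P(tgt | src) by summing P(pivot | src) * P(tgt | pivot)
    over every pivot phrase the two tables share."""
    src_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_pivot.items():
        for pivot, p_src_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_pivot_tgt in pivot_tgt.get(pivot, {}).items():
                src_tgt[src][tgt] += p_src_pivot * p_pivot_tgt
    return {src: dict(tgts) for src, tgts in src_tgt.items()}


# Hypothetical toy tables: Persian -> English and English -> Spanish.
fa_en = {"ketab": {"book": 0.9, "the book": 0.1}}
en_es = {
    "book": {"libro": 0.8, "reservar": 0.2},
    "the book": {"el libro": 1.0},
}

fa_es = triangulate(fa_en, en_es)
for target, prob in sorted(fa_es["ketab"].items(), key=lambda kv: -kv[1]):
    print(f"ketab -> {target}: {prob:.2f}")
```

Sentence-level pivoting would instead translate each Persian sentence into English and then translate that single output into Spanish, so an error in the first step cannot be recovered in the second.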

The Labeled Segmentation of Printed Books

Title The Labeled Segmentation of Printed Books
Authors Lara McConnaughey, Jennifer Dai, David Bamman
Abstract We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as Table of Contents, Preface, Index) to the document structure of printed books. We manually annotate the page-level structural categories for a large dataset totaling 294,816 pages in 1,055 books evenly sampled from 1750-1922, and present empirical results comparing the performance of several classes of models. The best-performing model, a bidirectional LSTM with rich features, achieves an overall accuracy of 95.8 and a class-balanced macro F-score of 71.4.
Tasks Optical Character Recognition
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1077/
PDF https://www.aclweb.org/anthology/D17-1077
PWC https://paperswithcode.com/paper/the-labeled-segmentation-of-printed-books
Repo
Framework
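
The gap between the reported accuracy (95.8) and the class-balanced macro F-score (71.4) is typical of page labeling, where a few categories dominate. The sketch below shows how the two metrics diverge on imbalanced labels. It assumes macro F is the unweighted mean of per-class F1 scores; the label names and counts are invented for illustration and are not taken from the paper.

```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Invented example: 18 body pages and 2 index pages; one index page is missed.
gold = ["body"] * 18 + ["index"] * 2
pred = ["body"] * 19 + ["index"] * 1

print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

A single error on the rare class barely moves accuracy but halves that class's recall, which pulls the macro average down.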

Detecting spelling variants in non-standard texts

Title Detecting spelling variants in non-standard texts
Authors Fabian Barteld
Abstract Spelling variation in non-standard language, e.g. computer-mediated communication and historical texts, is usually treated as a deviation from a standard spelling, e.g. 2mr as a non-standard spelling for tomorrow. Consequently, in normalization (the standard approach to dealing with spelling variation), so-called non-standard words are mapped to their corresponding standard words. However, there is not always a corresponding standard word. This can be the case for single types (like emoticons in computer-mediated communication) or for a complete language, e.g. texts from historical languages that did not develop a standard variety. The approach presented in this thesis proposal deals with spelling variation in the absence of reference to a standard. The task is to detect pairs of types that are variants of the same morphological word. An approach for spelling-variant detection is presented, where pairs of potential spelling variants are generated with Levenshtein distance and subsequently filtered by supervised machine learning. The approach is evaluated on historical Low German texts. Finally, further perspectives are discussed.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-4002/
PDF https://www.aclweb.org/anthology/E17-4002
PWC https://paperswithcode.com/paper/detecting-spelling-variants-in-non-standard
Repo
Framework
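
The candidate-generation step named in this abstract (pairing types by Levenshtein distance) can be sketched as follows. This is a minimal illustration under assumptions: the distance threshold, the function names, and the toy word list are hypothetical, and the supervised filtering step is not shown.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # delete char_a
                            curr[j - 1] + 1,      # insert char_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def candidate_pairs(types, max_distance=1):
    """Return all pairs of distinct types within max_distance edits of each other."""
    unique = sorted(set(types))
    return [(a, b) for a, b in combinations(unique, 2)
            if levenshtein(a, b) <= max_distance]


# Illustrative toy types only; not data from the paper.
types = ["vnde", "unde", "und", "dat", "dath"]
print(candidate_pairs(types))
```

The pairwise comparison is quadratic in the number of types, which is acceptable for a sketch but would need indexing on a full corpus. The generated pairs over-generate by design, which is why a filter such as the supervised classifier mentioned in the abstract is applied afterwards.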

Estonian Copular and Existential Constructions as an UD Annotation Problem

Title Estonian Copular and Existential Constructions as an UD Annotation Problem
Authors Kadri Muischnek, Kaili Müürisep
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0410/
PDF https://www.aclweb.org/anthology/W17-0410
PWC https://paperswithcode.com/paper/estonian-copular-and-existential
Repo
Framework

Churn Identification in Microblogs using Convolutional Neural Networks with Structured Logical Knowledge

Title Churn Identification in Microblogs using Convolutional Neural Networks with Structured Logical Knowledge
Authors Mourad Gridach, Hatem Haddad, Hala Mulki
Abstract For brands, gaining a new customer is more expensive than keeping an existing one. Therefore, the ability to keep customers in a brand is becoming more challenging these days. Churn happens when a customer leaves a brand for a competitor. Most previous work considers the problem of churn prediction using Call Detail Records (CDRs). In this paper, we use micro-posts to classify customers as churny or non-churny. We explore the power of convolutional neural networks (CNNs), since they have achieved state-of-the-art results in various computer vision and NLP applications. However, end-to-end models have limitations, such as the need for a large amount of labeled data and the uninterpretability of the models. We investigate the use of CNNs augmented with structured logic rules to overcome or reduce these issues. We developed our system, called Churn_teacher, using an iterative distillation method that transfers the knowledge, extracted using just the combination of three logic rules, directly into the weights of the DNNs. Furthermore, we used weight normalization to speed up training of our convolutional neural networks. Experimental results showed that with just these three rules, we were able to obtain state-of-the-art results on a publicly available Twitter dataset about three telecom brands.
Tasks Language Modelling, Machine Translation, Sentiment Analysis, Speech Recognition
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4403/
PDF https://www.aclweb.org/anthology/W17-4403
PWC https://paperswithcode.com/paper/churn-identification-in-microblogs-using
Repo
Framework

Bootstrapping for Numerical Open IE

Title Bootstrapping for Numerical Open IE
Authors Swarnadeep Saha, Harinder Pal, Mausam
Abstract We design and release BONIE, the first open numerical relation extractor, for extracting Open IE tuples where one of the arguments is a number or a quantity-unit phrase. BONIE uses bootstrapping to learn the specific dependency patterns that express numerical relations in a sentence. BONIE's novelty lies in task-specific customizations, such as inferring implicit relations, which are clear due to context such as units (e.g., 'square kilometers' suggests area, even if the word 'area' is missing in the sentence). BONIE obtains 1.5x yield and 15 point precision gain on numerical facts over a state-of-the-art Open IE system.
Tasks Open Information Extraction
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2050/
PDF https://www.aclweb.org/anthology/P17-2050
PWC https://paperswithcode.com/paper/bootstrapping-for-numerical-open-ie
Repo
Framework

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

Title Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Authors
Abstract
Tasks
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-5000/
PDF https://www.aclweb.org/anthology/P17-5000
PWC https://paperswithcode.com/paper/proceedings-of-acl-2017-tutorial-abstracts
Repo
Framework

Inter-Annotator Agreement in Sentiment Analysis: Machine Learning Perspective

Title Inter-Annotator Agreement in Sentiment Analysis: Machine Learning Perspective
Authors Victoria Bobicev, Marina Sokolova
Abstract Manual text annotation is an essential part of Big Text analytics. Although annotators work with limited parts of data sets, their results are extrapolated by automated text classification and affect the final classification results. Reliability of annotations and adequacy of assigned labels are especially important in the case of sentiment annotations. In the current study we examine inter-annotator agreement in multi-class, multi-label sentiment annotation of messages. We used several annotation agreement measures, as well as statistical analysis and Machine Learning to assess the resulting annotations.
Tasks Sentiment Analysis, Text Classification
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1015/
PDF https://doi.org/10.26615/978-954-452-049-6_015
PWC https://paperswithcode.com/paper/inter-annotator-agreement-in-sentiment
Repo
Framework
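
The abstract does not name the agreement measures it uses. As one widely used example, Cohen's kappa corrects the raw agreement between two annotators for the agreement expected by chance. The sketch below assumes exactly two annotators and a single label per message, which simplifies the multi-label setting the paper studies; the labels and annotations are invented.

```python
from collections import Counter


def cohens_kappa(ann_a, ann_b):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    n = len(ann_a)
    observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    # Chance agreement: probability both annotators pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented annotations for ten messages.
ann_a = ["pos", "pos", "neg", "neg", "neu", "pos", "neg", "neu", "pos", "neg"]
ann_b = ["pos", "neg", "neg", "neg", "neu", "pos", "pos", "neu", "pos", "neg"]

print(f"kappa = {cohens_kappa(ann_a, ann_b):.4f}")
```

Here the annotators agree on 8 of 10 messages, but because chance agreement is 0.36, kappa is noticeably lower than the raw 0.8.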

A Crowdsourcing Approach for Annotating Causal Relation Instances in Wikipedia

Title A Crowdsourcing Approach for Annotating Causal Relation Instances in Wikipedia
Authors Kazuaki Hanawa, Akira Sasaki, Naoaki Okazaki, Kentaro Inui
Abstract
Tasks Named Entity Recognition, Question Answering, Stance Detection
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1045/
PDF https://www.aclweb.org/anthology/Y17-1045
PWC https://paperswithcode.com/paper/a-crowdsourcing-approach-for-annotating
Repo
Framework

Extracting a Lexicon of Discourse Connectives in Czech from an Annotated Corpus

Title Extracting a Lexicon of Discourse Connectives in Czech from an Annotated Corpus
Authors Pavlína Synková, Magdaléna Rysová, Lucie Poláková, Jiří Mírovský
Abstract
Tasks Machine Translation, Text Generation
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1032/
PDF https://www.aclweb.org/anthology/Y17-1032
PWC https://paperswithcode.com/paper/extracting-a-lexicon-of-discourse-connectives
Repo
Framework

Standard and nonstandard lexicon in aviation English: A corpus linguistic study

Title Standard and nonstandard lexicon in aviation English: A corpus linguistic study
Authors Ramsey Ferrer, Jollene Empinado, Eloisa Marie Calico, Jan Yharie Floro
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1010/
PDF https://www.aclweb.org/anthology/Y17-1010
PWC https://paperswithcode.com/paper/standard-and-nonstandard-lexicon-in-aviation
Repo
Framework

Constructing narrative using a generative model and continuous action policies

Title Constructing narrative using a generative model and continuous action policies
Authors Emmanouil Theofanis Chourdakis, Joshua Reiss
Abstract
Tasks Paraphrase Identification, Q-Learning, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3905/
PDF https://www.aclweb.org/anthology/W17-3905
PWC https://paperswithcode.com/paper/constructing-narrative-using-a-generative
Repo
Framework

On the order of Words in Italian: a Study on Genre vs Complexity

Title On the order of Words in Italian: a Study on Genre vs Complexity
Authors Dominique Brunato, Felice Dell'Orletta
Abstract
Tasks Dependency Parsing
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6505/
PDF https://www.aclweb.org/anthology/W17-6505
PWC https://paperswithcode.com/paper/on-the-order-of-words-in-italian-a-study-on
Repo
Framework

A Dictionary-Based Comparison of Autobiographies by People and Murderous Monsters

Title A Dictionary-Based Comparison of Autobiographies by People and Murderous Monsters
Authors Micah Iserman, Molly Ireland
Abstract
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/papers/W17-3109/w17-3109
PDF https://www.aclweb.org/anthology/W17-3109
PWC https://paperswithcode.com/paper/a-dictionary-based-comparison-of
Repo
Framework