July 26, 2019


Paper Group NANR 128


Improving POS Tagging in Old Spanish Using TEITOK


Title	Improving POS Tagging in Old Spanish Using TEITOK
Authors	Maarten Janssen, Josep Ausensi, Josep Fontana
Abstract
Tasks
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0502/
PDF	https://www.aclweb.org/anthology/W17-0502
PWC	https://paperswithcode.com/paper/improving-pos-tagging-in-old-spanish-using
Repo
Framework

Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language


Title	Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language
Authors	Benyamin Ahmadnia, Javier Serrano, Gholamreza Haffari
Abstract	This paper focuses exclusively on the pivot language technique, in which a bridging language is used to improve the quality of low-resource Persian-Spanish Statistical Machine Translation (SMT). In this case, English is used as the bridging language, and the Persian-English SMT is combined with the English-Spanish one, where the relatively large corpora of each may be used in support of the Persian-Spanish pairing. Our results indicate that the pivot language technique outperforms the direct SMT processes currently in use between Persian and Spanish. Furthermore, we investigate the sentence translation pivot strategy and the phrase translation pivot strategy in turn, and demonstrate that, in the context of the Persian-Spanish SMT system, phrase-level pivoting outperforms sentence-level pivoting. Finally, we suggest a method called the combination model, in which the standard direct model and the best triangulation pivoting model are blended in order to reach a high-quality translation.
Tasks	Machine Translation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1004/
PDF	https://doi.org/10.26615/978-954-452-049-6_004
PWC	https://paperswithcode.com/paper/persian-spanish-low-resource-statistical
Repo
Framework
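
The phrase-level pivoting (triangulation) mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `triangulate`, the dictionary layout of the phrase tables, and every phrase and probability in the toy data are assumptions made up for the example. It uses the common marginalisation p(target | source) = Σ over pivot phrases of p(pivot | source) · p(target | pivot).

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Combine two phrase tables through a shared pivot language.

    Each table maps a phrase to {translation: probability}.
    The induced table sums over every pivot phrase that links
    a source phrase to a target phrase.
    """
    src_to_tgt = {}
    for src, pivots in src_to_pivot.items():
        scores = defaultdict(float)
        for pivot, p_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                scores[tgt] += p_pivot * p_tgt
        if scores:
            src_to_tgt[src] = dict(scores)
    return src_to_tgt


# Toy, invented entries (transliterated Persian -> English -> Spanish).
fa_en = {"ketab": {"book": 0.9, "volume": 0.1}}
en_es = {
    "book": {"libro": 0.8, "reservar": 0.2},
    "volume": {"volumen": 0.7, "tomo": 0.3},
}

fa_es = triangulate(fa_en, en_es)
best = max(fa_es["ketab"], key=fa_es["ketab"].get)
print(best)  # libro
```

Sentence-level pivoting, by contrast, would translate whole sentences into English and then translate that output into Spanish, so errors from the first system are passed directly to the second.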

The Labeled Segmentation of Printed Books


Title	The Labeled Segmentation of Printed Books
Authors	Lara McConnaughey, Jennifer Dai, David Bamman
Abstract	We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as Table of Contents, Preface, Index) to the document structure of printed books. We manually annotate the page-level structural categories for a large dataset totaling 294,816 pages in 1,055 books evenly sampled from 1750-1922, and present empirical results comparing the performance of several classes of models. The best-performing model, a bidirectional LSTM with rich features, achieves an overall accuracy of 95.8 and a class-balanced macro F-score of 71.4.
Tasks	Optical Character Recognition
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1077/
PDF	https://www.aclweb.org/anthology/D17-1077
PWC	https://paperswithcode.com/paper/the-labeled-segmentation-of-printed-books
Repo
Framework
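
The gap between overall accuracy and class-balanced macro F-score reported in the abstract above is typical of skewed label distributions. The sketch below shows how the two metrics are computed and why they can diverge. The label names and the toy data are assumptions for illustration only and are not taken from the paper's dataset.

```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so rare classes count as much as frequent ones."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical page labels: a frequent class and a rare one.
gold = ["body"] * 18 + ["index"] * 2
pred = ["body"] * 20  # a model that never predicts the rare class

print(accuracy(gold, pred))  # 0.9
print(round(macro_f1(gold, pred), 3))  # 0.474
```

A classifier that ignores the rare class still scores 90% accuracy here, while the macro F-score drops below 0.5 because the rare class contributes an F1 of zero.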

Detecting spelling variants in non-standard texts


Title	Detecting spelling variants in non-standard texts
Authors	Fabian Barteld
Abstract	Spelling variation in non-standard language, e.g. computer-mediated communication and historical texts, is usually treated as a deviation from a standard spelling, e.g. 2mr as a non-standard spelling of tomorrow. Consequently, in normalization (the standard approach to dealing with spelling variation), so-called non-standard words are mapped to their corresponding standard words. However, there is not always a corresponding standard word. This can be the case for single types (like emoticons in computer-mediated communication) or a complete language, e.g. texts from historical languages that did not develop a standard variety. The approach presented in this thesis proposal deals with spelling variation in the absence of reference to a standard. The task is to detect pairs of types that are variants of the same morphological word. An approach for spelling-variant detection is presented, where pairs of potential spelling variants are generated with Levenshtein distance and subsequently filtered by supervised machine learning. The approach is evaluated on historical Low German texts. Finally, further perspectives are discussed.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-4002/
PDF	https://www.aclweb.org/anthology/E17-4002
PWC	https://paperswithcode.com/paper/detecting-spelling-variants-in-non-standard
Repo
Framework
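
The candidate-generation step described in the abstract above (pairs of types within a small Levenshtein distance) can be sketched as below. This is an illustrative sketch only: the distance threshold of 1, the function names, and the toy word list are assumptions, and the supervised filtering step from the abstract is not shown.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                        # delete char_a
                curr[j - 1] + 1,                    # insert char_b
                prev[j - 1] + (char_a != char_b),   # substitute or match
            ))
        prev = curr
    return prev[-1]


def candidate_pairs(types, max_distance=1):
    """All unordered pairs of distinct types within max_distance edits."""
    unique = sorted(set(types))
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if levenshtein(a, b) <= max_distance
    ]


# Invented toy types, not drawn from the paper's corpus.
types = ["vnde", "unde", "und", "hus", "huse"]
print(candidate_pairs(types))
# [('hus', 'huse'), ('und', 'unde'), ('unde', 'vnde')]
```

Distance alone over-generates: a pair such as `('hus', 'huse')` could just as well be two different inflected forms as two spellings of one form, which is the kind of case a subsequent filter has to decide. Comparing all pairs is quadratic in the vocabulary size, so a larger type inventory would need an index or blocking scheme.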

Estonian Copular and Existential Constructions as an UD Annotation Problem


Title	Estonian Copular and Existential Constructions as an UD Annotation Problem
Authors	Kadri Muischnek, Kaili Müürisep
Abstract
Tasks
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0410/
PDF	https://www.aclweb.org/anthology/W17-0410
PWC	https://paperswithcode.com/paper/estonian-copular-and-existential
Repo
Framework

Churn Identification in Microblogs using Convolutional Neural Networks with Structured Logical Knowledge


Title	Churn Identification in Microblogs using Convolutional Neural Networks with Structured Logical Knowledge
Authors	Mourad Gridach, Hatem Haddad, Hala Mulki
Abstract	For brands, gaining a new customer is more expensive than keeping an existing one. Therefore, the ability to keep customers in a brand is becoming more challenging these days. Churn happens when a customer leaves a brand for a competitor. Most previous work considers the problem of churn prediction using Call Detail Records (CDRs). In this paper, we use micro-posts to classify customers as churny or non-churny. We explore the power of convolutional neural networks (CNNs), since they have achieved state-of-the-art results in various computer vision and NLP applications. However, end-to-end models have some limitations, such as the need for a large amount of labeled data and the uninterpretability of these models. We investigate the use of CNNs augmented with structured logic rules to overcome or reduce these issues. We developed our system, called Churn_teacher, by using an iterative distillation method that transfers the knowledge, extracted using just the combination of three logic rules, directly into the weights of the DNNs. Furthermore, we used weight normalization to speed up the training of our convolutional neural networks. Experimental results showed that with just these three rules, we were able to achieve state-of-the-art results on a publicly available Twitter dataset about three Telecom brands.
Tasks	Language Modelling, Machine Translation, Sentiment Analysis, Speech Recognition
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4403/
PDF	https://www.aclweb.org/anthology/W17-4403
PWC	https://paperswithcode.com/paper/churn-identification-in-microblogs-using
Repo
Framework

Bootstrapping for Numerical Open IE


Title	Bootstrapping for Numerical Open IE
Authors	Swarnadeep Saha, Harinder Pal, Mausam
Abstract	We design and release BONIE, the first open numerical relation extractor, for extracting Open IE tuples where one of the arguments is a number or a quantity-unit phrase. BONIE uses bootstrapping to learn the specific dependency patterns that express numerical relations in a sentence. BONIE's novelty lies in task-specific customizations, such as inferring implicit relations, which are clear due to context such as units (e.g., 'square kilometers' suggests area, even if the word 'area' is missing in the sentence). BONIE obtains 1.5x yield and 15 point precision gain on numerical facts over a state-of-the-art Open IE system.
Tasks	Open Information Extraction
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2050/
PDF	https://www.aclweb.org/anthology/P17-2050
PWC	https://paperswithcode.com/paper/bootstrapping-for-numerical-open-ie
Repo
Framework

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts


Title	Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Authors
Abstract
Tasks
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-5000/
PDF	https://www.aclweb.org/anthology/P17-5000
PWC	https://paperswithcode.com/paper/proceedings-of-acl-2017-tutorial-abstracts
Repo
Framework

Inter-Annotator Agreement in Sentiment Analysis: Machine Learning Perspective


Title	Inter-Annotator Agreement in Sentiment Analysis: Machine Learning Perspective
Authors	Victoria Bobicev, Marina Sokolova
Abstract	Manual text annotation is an essential part of Big Text analytics. Although annotators work with limited parts of data sets, their results are extrapolated by automated text classification and affect the final classification results. Reliability of annotations and adequacy of assigned labels are especially important in the case of sentiment annotations. In the current study we examine inter-annotator agreement in multi-class, multi-label sentiment annotation of messages. We used several annotation agreement measures, as well as statistical analysis and Machine Learning to assess the resulting annotations.
Tasks	Sentiment Analysis, Text Classification
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1015/
PDF	https://doi.org/10.26615/978-954-452-049-6_015
PWC	https://paperswithcode.com/paper/inter-annotator-agreement-in-sentiment
Repo
Framework
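
The abstract above refers to "several annotation agreement measures" without naming them. As one widely used example, Cohen's kappa corrects the raw agreement between two annotators for the agreement expected by chance. The sketch below is an assumption-laden illustration: it is not necessarily a measure used in the paper, it handles only the single-label case (the paper's setting is multi-label), and the annotations are invented.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items (one label each)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions, summed over labels.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented sentiment annotations for six messages.
annotator_a = ["pos", "pos", "neg", "neg", "neu", "pos"]
annotator_b = ["pos", "neg", "neg", "neg", "neu", "pos"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 3))  # 0.739
```

Here the annotators agree on 5 of 6 items (0.833), but because some of that agreement would be expected by chance, kappa is lower at about 0.739.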

A Crowdsourcing Approach for Annotating Causal Relation Instances in Wikipedia


Title	A Crowdsourcing Approach for Annotating Causal Relation Instances in Wikipedia
Authors	Kazuaki Hanawa, Akira Sasaki, Naoaki Okazaki, Kentaro Inui
Abstract
Tasks	Named Entity Recognition, Question Answering, Stance Detection
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1045/
PDF	https://www.aclweb.org/anthology/Y17-1045
PWC	https://paperswithcode.com/paper/a-crowdsourcing-approach-for-annotating
Repo
Framework

Extracting a Lexicon of Discourse Connectives in Czech from an Annotated Corpus


Title	Extracting a Lexicon of Discourse Connectives in Czech from an Annotated Corpus
Authors	Pavlína Synková, Magdaléna Rysová, Lucie Poláková, Jiří Mírovský
Abstract
Tasks	Machine Translation, Text Generation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1032/
PDF	https://www.aclweb.org/anthology/Y17-1032
PWC	https://paperswithcode.com/paper/extracting-a-lexicon-of-discourse-connectives
Repo
Framework

Standard and nonstandard lexicon in aviation English: A corpus linguistic study


Title	Standard and nonstandard lexicon in aviation English: A corpus linguistic study
Authors	Ramsey Ferrer, Jollene Empinado, Eloisa Marie Calico, Jan Yharie Floro
Abstract
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1010/
PDF	https://www.aclweb.org/anthology/Y17-1010
PWC	https://paperswithcode.com/paper/standard-and-nonstandard-lexicon-in-aviation
Repo
Framework

Constructing narrative using a generative model and continuous action policies


Title	Constructing narrative using a generative model and continuous action policies
Authors	Emmanouil Theofanis Chourdakis, Joshua Reiss
Abstract
Tasks	Paraphrase Identification, Q-Learning, Text Generation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-3905/
PDF	https://www.aclweb.org/anthology/W17-3905
PWC	https://paperswithcode.com/paper/constructing-narrative-using-a-generative
Repo
Framework

On the order of Words in Italian: a Study on Genre vs Complexity


Title	On the order of Words in Italian: a Study on Genre vs Complexity
Authors	Dominique Brunato, Felice Dell'Orletta
Abstract
Tasks	Dependency Parsing
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-6505/
PDF	https://www.aclweb.org/anthology/W17-6505
PWC	https://paperswithcode.com/paper/on-the-order-of-words-in-italian-a-study-on
Repo
Framework

A Dictionary-Based Comparison of Autobiographies by People and Murderous Monsters


Title	A Dictionary-Based Comparison of Autobiographies by People and Murderous Monsters
Authors	Micah Iserman, Molly Ireland
Abstract
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/papers/W17-3109/w17-3109
PDF	https://www.aclweb.org/anthology/W17-3109
PWC	https://paperswithcode.com/paper/a-dictionary-based-comparison-of
Repo
Framework