May 5, 2019

Paper Group NANR 34

Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning


Title	Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning
Authors
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-1000/
PDF	https://www.aclweb.org/anthology/K16-1000
PWC	https://paperswithcode.com/paper/proceedings-of-the-20th-signll-conference-on
Repo
Framework

Fine-Grained Chinese Discourse Relation Labelling


Title	Fine-Grained Chinese Discourse Relation Labelling
Authors	Huan-Yuan Chen, Wan-Shan Liao, Hen-Hsen Huang, Hsin-Hsi Chen
Abstract	This paper explores several aspects together for a fine-grained Chinese discourse analysis. We deal with the issues of ambiguous discourse markers, ambiguous marker linkings, and more than one discourse marker. A universal feature representation is proposed. The pair-once postulation, cross-discourse-unit-first rule and word-pair-marker-first rule select a set of discourse markers from ambiguous linkings. Marker-Sum feature considers total contribution of markers and Marker-Preference feature captures the probability distribution of discourse functions of a representative marker by using preference rule. The HIT Chinese discourse relation treebank (HIT-CDTB) is used to evaluate the proposed models. The 25-way classifier achieves 0.57 micro-averaged F-score.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1164/
PDF	https://www.aclweb.org/anthology/L16-1164
PWC	https://paperswithcode.com/paper/fine-grained-chinese-discourse-relation
Repo
Framework
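
The micro-averaged F-score reported in the abstract above pools true positives, false positives and false negatives over all classes before computing precision and recall. The following is a minimal sketch of that calculation, not the authors' evaluation code; the label names and the toy predictions are hypothetical.

```python
def micro_f1(gold, pred):
    # Pool true positives, false positives and false negatives over all labels.
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1
            elif g == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical discourse-relation labels, for illustration only.
gold = ["cause", "cause", "contrast", "condition"]
pred = ["cause", "contrast", "contrast", "condition"]
score = micro_f1(gold, pred)
```

When every instance carries exactly one label, as here, the micro-averaged F-score coincides with accuracy.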

TopoText: Interactive Digital Mapping of Literary Text


Title	TopoText: Interactive Digital Mapping of Literary Text
Authors	Randa El Khatib, Julia El Zini, David Wrisley, Mohamad Jaber, Shady Elbassuoni
Abstract	We demonstrate TopoText, an interactive tool for digital mapping of literary text. TopoText takes as input a literary piece of text such as a novel or a biography article and automatically extracts all place names in the text. The identified places are then geoparsed and displayed on an interactive map. TopoText calculates the number of times a place was mentioned in the text, which is then reflected on the map allowing the end-user to grasp the importance of the different places within the text. It also displays the most frequent words mentioned within a specified proximity of a place name in context or across the entire text. This can also be faceted according to part of speech tags. Finally, TopoText keeps the human in the loop by allowing the end-user to disambiguate places and to provide specific place annotations. All extracted information such as geolocations, place frequencies, as well as all user-provided annotations can be automatically exported as a CSV file that can be imported later by the same user or other users.
Tasks	Part-Of-Speech Tagging
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2040/
PDF	https://www.aclweb.org/anthology/C16-2040
PWC	https://paperswithcode.com/paper/topotext-interactive-digital-mapping-of
Repo
Framework
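
The place-frequency and proximity-word counts described in the TopoText abstract can be illustrated with a minimal sketch. This is not TopoText's implementation: the gazetteer set, the regex tokenizer and the function names are assumptions made for illustration, and a real system would use a named-entity recognizer and a geoparser rather than a fixed list of place names.

```python
from collections import Counter
import re

def tokenize(text):
    # Lowercased alphabetic tokens; a stand-in for a real tokenizer.
    return re.findall(r"[a-z]+", text.lower())

def place_frequencies(tokens, places):
    # Count how often each known place name is mentioned.
    return Counter(t for t in tokens if t in places)

def words_near(tokens, place, window):
    # Count words within `window` tokens on either side of each mention.
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == place:
            lo = max(0, i - window)
            counts.update(tokens[lo:i] + tokens[i + 1:i + 1 + window])
    return counts

text = ("She left Paris for Cairo. In Cairo the river was wide; "
        "Paris felt far from Cairo.")
places = {"paris", "cairo"}  # hypothetical gazetteer
tokens = tokenize(text)
freq = place_frequencies(tokens, places)
near = words_near(tokens, "cairo", 2)
```

Here `freq` maps each place to its mention count, which is the quantity the abstract says is reflected on the map, and `near` holds the words found within two tokens of each mention of one place.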

The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources


Title	The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources
Authors	Georg Rehm
Abstract	Language Resources (LRs) are an essential ingredient of current approaches in Linguistics, Computational Linguistics, Language Technology and related fields. LRs are collections of spoken or written language data, typically annotated with linguistic analysis information. Different types of LRs exist, for example, corpora, ontologies, lexicons, collections of spoken language data (audio), or collections that also include video (multimedia, multimodal). Often, LRs are distributed with specific tools, documentation, manuals or research publications. The different phases that involve creating and distributing an LR can be conceptualised as a life cycle. While the idea of handling the LR production and maintenance process in terms of a life cycle has been brought up quite some time ago, a best practice model or common approach can still be considered a research gap. This article wants to help fill this gap by proposing an initial version of a generic Language Resource Life Cycle that can be used to inform, direct, control and evaluate LR research and development activities (including description, management, production, validation and evaluation workflows).
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1388/
PDF	https://www.aclweb.org/anthology/L16-1388
PWC	https://paperswithcode.com/paper/the-language-resource-life-cycle-towards-a
Repo
Framework

Identifying Referenced Text in Scientific Publications by Summarisation and Classification Techniques


Title	Identifying Referenced Text in Scientific Publications by Summarisation and Classification Techniques
Authors	Stefan Klampfl, Andi Rexha, Roman Kern
Abstract
Tasks	Document Summarization, Information Retrieval
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1514/
PDF	https://www.aclweb.org/anthology/W16-1514
PWC	https://paperswithcode.com/paper/identifying-referenced-text-in-scientific
Repo
Framework

Efficient techniques for parsing with tree automata


Title	Efficient techniques for parsing with tree automata
Authors	Jonas Groschwitz, Alexander Koller, Mark Johnson
Abstract
Tasks	Machine Translation, Semantic Parsing
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1192/
PDF	https://www.aclweb.org/anthology/P16-1192
PWC	https://paperswithcode.com/paper/efficient-techniques-for-parsing-with-tree
Repo
Framework

Towards Building a Political Protest Database to Explain Changes in the Welfare State


Title	Towards Building a Political Protest Database to Explain Changes in the Welfare State
Authors	Çağıl Sönmez, Arzucan Özgür, Erdem Yörük
Abstract
Tasks	Time Series
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2113/
PDF	https://www.aclweb.org/anthology/W16-2113
PWC	https://paperswithcode.com/paper/towards-building-a-political-protest-database
Repo
Framework

Neural Attention for Learning to Rank Questions in Community Question Answering


Title	Neural Attention for Learning to Rank Questions in Community Question Answering
Authors	Salvatore Romeo, Giovanni Da San Martino, Alberto Barrón-Cedeño, Alessandro Moschitti, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Mitra Mohtarami, James Glass
Abstract	In real-world data, e.g., from Web forums, text is often contaminated with redundant or irrelevant content, which leads to introducing noise in machine learning algorithms. In this paper, we apply Long Short-Term Memory networks with an attention mechanism, which can select important parts of text for the task of similar question retrieval from community Question Answering (cQA) forums. In particular, we use the attention weights for both selecting entire sentences and their subparts, i.e., word/chunk, from shallow syntactic trees. More interestingly, we apply tree kernels to the filtered text representations, thus exploiting the implicit features of the subtree space for learning question reranking. Our results show that the attention-based pruning allows for achieving the top position in the cQA challenge of SemEval 2016, with a relatively large gap from the other participants while greatly decreasing running time.
Tasks	Community Question Answering, Learning-To-Rank, Natural Language Inference, Question Answering
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1163/
PDF	https://www.aclweb.org/anthology/C16-1163
PWC	https://paperswithcode.com/paper/neural-attention-for-learning-to-rank
Repo
Framework

Cooperative Graphical Models


Title	Cooperative Graphical Models
Authors	Josip Djolonga, Stefanie Jegelka, Sebastian Tschiatschek, Andreas Krause
Abstract	We study a rich family of distributions that capture variable interactions significantly more expressive than those representable with low-treewidth or pairwise graphical models, or log-supermodular models. We call these cooperative graphical models. Yet, this family retains structure, which we carefully exploit for efficient inference techniques. Our algorithms combine the polyhedral structure of submodular functions in new ways with variational inference methods to obtain both lower and upper bounds on the partition function. While our fully convex upper bound is minimized as an SDP or via tree-reweighted belief propagation, our lower bound is tightened via belief propagation or mean-field algorithms. The resulting algorithms are easy to implement and, as our experiments show, effectively obtain good bounds and marginals for synthetic and real-world examples.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6122-cooperative-graphical-models
PDF	http://papers.nips.cc/paper/6122-cooperative-graphical-models.pdf
PWC	https://paperswithcode.com/paper/cooperative-graphical-models
Repo
Framework

Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis


Title	Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis
Authors	Orphée De Clercq, Véronique Hoste
Abstract	The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as aspect-based sentiment analysis. In this paper we present the first full aspect-based sentiment analysis pipeline for Dutch and apply it to customer reviews. To this purpose, we collected reviews from two different domains, i.e. restaurant and smartphone reviews. Both corpora have been manually annotated using newly developed guidelines that comply with standard practices in the field. For our experimental pipeline we perceive aspect-based sentiment analysis as a task consisting of three main subtasks which have to be tackled incrementally: aspect term extraction, aspect category classification and polarity classification. First experiments on our Dutch restaurant corpus reveal that this is indeed a feasible approach that yields promising results.
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1465/
PDF	https://www.aclweb.org/anthology/L16-1465
PWC	https://paperswithcode.com/paper/rude-waiter-but-mouthwatering-pastries-an
Repo
Framework

Substring-based unsupervised transliteration with phonetic and contextual knowledge


Title	Substring-based unsupervised transliteration with phonetic and contextual knowledge
Authors	Anoop Kunchukuttan, Pushpak Bhattacharyya, Mitesh M. Khapra
Abstract
Tasks	Information Retrieval, Machine Translation, Transliteration
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-1027/
PDF	https://www.aclweb.org/anthology/K16-1027
PWC	https://paperswithcode.com/paper/substring-based-unsupervised-transliteration
Repo
Framework

The Gavagai Living Lexicon


Title	The Gavagai Living Lexicon
Authors	Magnus Sahlgren, Amaru Cuba Gyllensten, Fredrik Espinoza, Ola Hamfors, Jussi Karlgren, Fredrik Olsson, Per Persson, Akshay Viswanathan, Anders Holst
Abstract	This paper presents the Gavagai Living Lexicon, which is an online distributional semantic model currently available in 20 different languages. We describe the underlying distributional semantic model, and how we have solved some of the challenges in applying such a model to large amounts of streaming data. We also describe the architecture of our implementation, and discuss how we deal with continuous quality assurance of the lexicon.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1053/
PDF	https://www.aclweb.org/anthology/L16-1053
PWC	https://paperswithcode.com/paper/the-gavagai-living-lexicon
Repo
Framework

Odin’s Runes: A Rule Language for Information Extraction


Title	Odin’s Runes: A Rule Language for Information Extraction
Authors	Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Mihai Surdeanu
Abstract	Odin is an information extraction framework that applies cascades of finite state automata over both surface text and syntactic dependency graphs. Support for syntactic patterns allows us to concisely define relations that are otherwise difficult to express in languages such as Common Pattern Specification Language (CPSL), which are currently limited to shallow linguistic features. The interaction of lexical and syntactic automata provides robustness and flexibility when writing extraction rules. This paper describes Odin’s declarative language for writing these cascaded automata.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1050/
PDF	https://www.aclweb.org/anthology/L16-1050
PWC	https://paperswithcode.com/paper/odins-runes-a-rule-language-for-information
Repo
Framework

Automatic Detection of Arabicized Berber and Arabic Varieties


Title	Automatic Detection of Arabicized Berber and Arabic Varieties
Authors	Wafia Adouane, Nasredine Semmar, Richard Johansson, Victoria Bobicev
Abstract	Automatic Language Identification (ALI) is the detection of the natural language of an input text by a machine. It is the first necessary step to do any language-dependent natural language processing task. Various methods have been successfully applied to a wide range of languages, and the state-of-the-art automatic language identifiers are mainly based on character n-gram models trained on huge corpora. However, there are many languages which are not yet automatically processed, for instance minority and informal languages. Many of these languages are only spoken and do not exist in a written format. Social media platforms and new technologies have facilitated the emergence of written format for these spoken languages based on pronunciation. The latter are not well represented on the Web, commonly referred to as under-resourced languages, and the currently available ALI tools fail to properly recognize them. In this paper, we revisit the problem of ALI with the focus on Arabicized Berber and dialectal Arabic short texts. We introduce new resources and evaluate the existing methods. The results show that machine learning models combined with lexicons are well suited for detecting Arabicized Berber and different Arabic varieties and distinguishing between them, giving a macro-average F-score of 92.94%.
Tasks	Language Identification
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4809/
PDF	https://www.aclweb.org/anthology/W16-4809
PWC	https://paperswithcode.com/paper/automatic-detection-of-arabicized-berber-and
Repo
Framework
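
The abstract above notes that state-of-the-art language identifiers are mainly based on character n-gram models. The following minimal sketch shows that general idea with trigram profiles and cosine similarity. It is not the method evaluated in the paper, which combines machine learning models with lexicons; the training sentences, the language labels and all function names are assumptions for illustration.

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    # Pad with spaces so word boundaries contribute n-grams too.
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    # Cosine similarity between two n-gram count vectors.
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def train(samples):
    # Build one aggregated n-gram profile per language label.
    profiles = {}
    for label, text in samples:
        profiles.setdefault(label, Counter()).update(char_ngrams(text))
    return profiles

def identify(text, profiles):
    # Return the label whose profile is most similar to the input.
    query = char_ngrams(text)
    return max(profiles, key=lambda label: cosine(query, profiles[label]))

# Tiny hypothetical training set; real systems train on large corpora.
samples = [
    ("en", "the weather is nice and the streets are quiet"),
    ("en", "this is the house that they built"),
    ("fr", "le temps est beau et les rues sont calmes"),
    ("fr", "c'est la maison que nous avons construite"),
]
profiles = train(samples)
guess_en = identify("the house is quiet", profiles)
guess_fr = identify("la maison est calme", profiles)
```

Short, informal texts such as those the paper targets give sparse n-gram vectors, which is one reason a profile-only approach like this sketch can struggle on closely related varieties.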

The Gun Violence Database: A new task and data set for NLP


Title	The Gun Violence Database: A new task and data set for NLP
Authors	Ellie Pavlick, Heng Ji, Xiaoman Pan, Chris Callison-Burch
Abstract
Tasks	Coreference Resolution, Relation Extraction
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1106/
PDF	https://www.aclweb.org/anthology/D16-1106
PWC	https://paperswithcode.com/paper/the-gun-violence-database-a-new-task-and-data
Repo
Framework