May 5, 2019

Paper Group NANR 34

Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

Title Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning
Authors
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-1000/
PDF https://www.aclweb.org/anthology/K16-1000
PWC https://paperswithcode.com/paper/proceedings-of-the-20th-signll-conference-on
Repo
Framework

Fine-Grained Chinese Discourse Relation Labelling

Title Fine-Grained Chinese Discourse Relation Labelling
Authors Huan-Yuan Chen, Wan-Shan Liao, Hen-Hsen Huang, Hsin-Hsi Chen
Abstract This paper explores several aspects together for a fine-grained Chinese discourse analysis. We deal with the issues of ambiguous discourse markers, ambiguous marker linkings, and more than one discourse marker. A universal feature representation is proposed. The pair-once postulation, cross-discourse-unit-first rule and word-pair-marker-first rule select a set of discourse markers from ambiguous linkings. Marker-Sum feature considers total contribution of markers and Marker-Preference feature captures the probability distribution of discourse functions of a representative marker by using preference rule. The HIT Chinese discourse relation treebank (HIT-CDTB) is used to evaluate the proposed models. The 25-way classifier achieves 0.57 micro-averaged F-score.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1164/
PDF https://www.aclweb.org/anthology/L16-1164
PWC https://paperswithcode.com/paper/fine-grained-chinese-discourse-relation
Repo
Framework

TopoText: Interactive Digital Mapping of Literary Text

Title TopoText: Interactive Digital Mapping of Literary Text
Authors Randa El Khatib, Julia El Zini, David Wrisley, Mohamad Jaber, Shady Elbassuoni
Abstract We demonstrate TopoText, an interactive tool for digital mapping of literary text. TopoText takes as input a literary piece of text such as a novel or a biography article and automatically extracts all place names in the text. The identified places are then geoparsed and displayed on an interactive map. TopoText calculates the number of times a place was mentioned in the text, which is then reflected on the map allowing the end-user to grasp the importance of the different places within the text. It also displays the most frequent words mentioned within a specified proximity of a place name in context or across the entire text. This can also be faceted according to part of speech tags. Finally, TopoText keeps the human in the loop by allowing the end-user to disambiguate places and to provide specific place annotations. All extracted information such as geolocations, place frequencies, as well as all user-provided annotations can be automatically exported as a CSV file that can be imported later by the same user or other users.
Tasks Part-Of-Speech Tagging
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2040/
PDF https://www.aclweb.org/anthology/C16-2040
PWC https://paperswithcode.com/paper/topotext-interactive-digital-mapping-of
Repo
Framework

The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources

Title The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources
Authors Georg Rehm
Abstract Language Resources (LRs) are an essential ingredient of current approaches in Linguistics, Computational Linguistics, Language Technology and related fields. LRs are collections of spoken or written language data, typically annotated with linguistic analysis information. Different types of LRs exist, for example, corpora, ontologies, lexicons, collections of spoken language data (audio), or collections that also include video (multimedia, multimodal). Often, LRs are distributed with specific tools, documentation, manuals or research publications. The different phases that involve creating and distributing an LR can be conceptualised as a life cycle. While the idea of handling the LR production and maintenance process in terms of a life cycle has been brought up quite some time ago, a best practice model or common approach can still be considered a research gap. This article wants to help fill this gap by proposing an initial version of a generic Language Resource Life Cycle that can be used to inform, direct, control and evaluate LR research and development activities (including description, management, production, validation and evaluation workflows).
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1388/
PDF https://www.aclweb.org/anthology/L16-1388
PWC https://paperswithcode.com/paper/the-language-resource-life-cycle-towards-a
Repo
Framework

Identifying Referenced Text in Scientific Publications by Summarisation and Classification Techniques

Title Identifying Referenced Text in Scientific Publications by Summarisation and Classification Techniques
Authors Stefan Klampfl, Andi Rexha, Roman Kern
Abstract
Tasks Document Summarization, Information Retrieval
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-1514/
PDF https://www.aclweb.org/anthology/W16-1514
PWC https://paperswithcode.com/paper/identifying-referenced-text-in-scientific
Repo
Framework

Efficient techniques for parsing with tree automata

Title Efficient techniques for parsing with tree automata
Authors Jonas Groschwitz, Alexander Koller, Mark Johnson
Abstract
Tasks Machine Translation, Semantic Parsing
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1192/
PDF https://www.aclweb.org/anthology/P16-1192
PWC https://paperswithcode.com/paper/efficient-techniques-for-parsing-with-tree
Repo
Framework

Towards Building a Political Protest Database to Explain Changes in the Welfare State

Title Towards Building a Political Protest Database to Explain Changes in the Welfare State
Authors Çağıl Sönmez, Arzucan Özgür, Erdem Yörük
Abstract
Tasks Time Series
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2113/
PDF https://www.aclweb.org/anthology/W16-2113
PWC https://paperswithcode.com/paper/towards-building-a-political-protest-database
Repo
Framework

Neural Attention for Learning to Rank Questions in Community Question Answering

Title Neural Attention for Learning to Rank Questions in Community Question Answering
Authors Salvatore Romeo, Giovanni Da San Martino, Alberto Barrón-Cedeño, Alessandro Moschitti, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Mitra Mohtarami, James Glass
Abstract In real-world data, e.g., from Web forums, text is often contaminated with redundant or irrelevant content, which leads to introducing noise in machine learning algorithms. In this paper, we apply Long Short-Term Memory networks with an attention mechanism, which can select important parts of text for the task of similar question retrieval from community Question Answering (cQA) forums. In particular, we use the attention weights for both selecting entire sentences and their subparts, i.e., word/chunk, from shallow syntactic trees. More interestingly, we apply tree kernels to the filtered text representations, thus exploiting the implicit features of the subtree space for learning question reranking. Our results show that the attention-based pruning allows for achieving the top position in the cQA challenge of SemEval 2016, with a relatively large gap from the other participants while greatly decreasing running time.
Tasks Community Question Answering, Learning-To-Rank, Natural Language Inference, Question Answering
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1163/
PDF https://www.aclweb.org/anthology/C16-1163
PWC https://paperswithcode.com/paper/neural-attention-for-learning-to-rank
Repo
Framework

Cooperative Graphical Models

Title Cooperative Graphical Models
Authors Josip Djolonga, Stefanie Jegelka, Sebastian Tschiatschek, Andreas Krause
Abstract We study a rich family of distributions that capture variable interactions significantly more expressive than those representable with low-treewidth or pairwise graphical models, or log-supermodular models. We call these cooperative graphical models. Yet, this family retains structure, which we carefully exploit for efficient inference techniques. Our algorithms combine the polyhedral structure of submodular functions in new ways with variational inference methods to obtain both lower and upper bounds on the partition function. While our fully convex upper bound is minimized as an SDP or via tree-reweighted belief propagation, our lower bound is tightened via belief propagation or mean-field algorithms. The resulting algorithms are easy to implement and, as our experiments show, effectively obtain good bounds and marginals for synthetic and real-world examples.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6122-cooperative-graphical-models
PDF http://papers.nips.cc/paper/6122-cooperative-graphical-models.pdf
PWC https://paperswithcode.com/paper/cooperative-graphical-models
Repo
Framework

Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis

Title Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis
Authors Orphée De Clercq, Véronique Hoste
Abstract The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as aspect-based sentiment analysis. In this paper we present the first full aspect-based sentiment analysis pipeline for Dutch and apply it to customer reviews. To this purpose, we collected reviews from two different domains, i.e. restaurant and smartphone reviews. Both corpora have been manually annotated using newly developed guidelines that comply to standard practices in the field. For our experimental pipeline we perceive aspect-based sentiment analysis as a task consisting of three main subtasks which have to be tackled incrementally: aspect term extraction, aspect category classification and polarity classification. First experiments on our Dutch restaurant corpus reveal that this is indeed a feasible approach that yields promising results.
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1465/
PDF https://www.aclweb.org/anthology/L16-1465
PWC https://paperswithcode.com/paper/rude-waiter-but-mouthwatering-pastries-an
Repo
Framework

Substring-based unsupervised transliteration with phonetic and contextual knowledge

Title Substring-based unsupervised transliteration with phonetic and contextual knowledge
Authors Anoop Kunchukuttan, Pushpak Bhattacharyya, Mitesh M. Khapra
Abstract
Tasks Information Retrieval, Machine Translation, Transliteration
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-1027/
PDF https://www.aclweb.org/anthology/K16-1027
PWC https://paperswithcode.com/paper/substring-based-unsupervised-transliteration
Repo
Framework

The Gavagai Living Lexicon

Title The Gavagai Living Lexicon
Authors Magnus Sahlgren, Amaru Cuba Gyllensten, Fredrik Espinoza, Ola Hamfors, Jussi Karlgren, Fredrik Olsson, Per Persson, Akshay Viswanathan, Anders Holst
Abstract This paper presents the Gavagai Living Lexicon, which is an online distributional semantic model currently available in 20 different languages. We describe the underlying distributional semantic model, and how we have solved some of the challenges in applying such a model to large amounts of streaming data. We also describe the architecture of our implementation, and discuss how we deal with continuous quality assurance of the lexicon.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1053/
PDF https://www.aclweb.org/anthology/L16-1053
PWC https://paperswithcode.com/paper/the-gavagai-living-lexicon
Repo
Framework

Odin’s Runes: A Rule Language for Information Extraction

Title Odin’s Runes: A Rule Language for Information Extraction
Authors Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Mihai Surdeanu
Abstract Odin is an information extraction framework that applies cascades of finite state automata over both surface text and syntactic dependency graphs. Support for syntactic patterns allows us to concisely define relations that are otherwise difficult to express in languages such as Common Pattern Specification Language (CPSL), which are currently limited to shallow linguistic features. The interaction of lexical and syntactic automata provides robustness and flexibility when writing extraction rules. This paper describes Odin’s declarative language for writing these cascaded automata.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1050/
PDF https://www.aclweb.org/anthology/L16-1050
PWC https://paperswithcode.com/paper/odins-runes-a-rule-language-for-information
Repo
Framework

Automatic Detection of Arabicized Berber and Arabic Varieties

Title Automatic Detection of Arabicized Berber and Arabic Varieties
Authors Wafia Adouane, Nasredine Semmar, Richard Johansson, Victoria Bobicev
Abstract Automatic Language Identification (ALI) is the detection of the natural language of an input text by a machine. It is the first necessary step to do any language-dependent natural language processing task. Various methods have been successfully applied to a wide range of languages, and the state-of-the-art automatic language identifiers are mainly based on character n-gram models trained on huge corpora. However, there are many languages which are not yet automatically processed, for instance minority and informal languages. Many of these languages are only spoken and do not exist in a written format. Social media platforms and new technologies have facilitated the emergence of written format for these spoken languages based on pronunciation. The latter are not well represented on the Web, commonly referred to as under-resourced languages, and the current available ALI tools fail to properly recognize them. In this paper, we revisit the problem of ALI with the focus on Arabicized Berber and dialectal Arabic short texts. We introduce new resources and evaluate the existing methods. The results show that machine learning models combined with lexicons are well suited for detecting Arabicized Berber and different Arabic varieties and distinguishing between them, giving a macro-average F-score of 92.94%.
Tasks Language Identification
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4809/
PDF https://www.aclweb.org/anthology/W16-4809
PWC https://paperswithcode.com/paper/automatic-detection-of-arabicized-berber-and
Repo
Framework

The Gun Violence Database: A new task and data set for NLP

Title The Gun Violence Database: A new task and data set for NLP
Authors Ellie Pavlick, Heng Ji, Xiaoman Pan, Chris Callison-Burch
Abstract
Tasks Coreference Resolution, Relation Extraction
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1106/
PDF https://www.aclweb.org/anthology/D16-1106
PWC https://paperswithcode.com/paper/the-gun-violence-database-a-new-task-and-data
Repo
Framework