May 4, 2019

Paper Group NANR 189

Consistent Word Segmentation, Part-of-Speech Tagging and Dependency Labelling Annotation for Chinese Language

Title Consistent Word Segmentation, Part-of-Speech Tagging and Dependency Labelling Annotation for Chinese Language
Authors Mo Shen, Wingmui Li, HyunJeong Choe, Chenhui Chu, Daisuke Kawahara, Sadao Kurohashi
Abstract In this paper, we propose a new annotation approach to Chinese word segmentation, part-of-speech (POS) tagging and dependency labelling that aims to overcome the two major issues in traditional morphology-based annotation: Inconsistency and data sparsity. We re-annotate the Penn Chinese Treebank 5.0 (CTB5) and demonstrate the advantages of this approach compared to the original CTB5 annotation through word segmentation, POS tagging and machine translation experiments.
Tasks Chinese Word Segmentation, Machine Translation, Morphological Analysis, Part-Of-Speech Tagging
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1029/
PDF https://www.aclweb.org/anthology/C16-1029
PWC https://paperswithcode.com/paper/consistent-word-segmentation-part-of-speech
Repo
Framework

Splitting compounds with ngrams

Title Splitting compounds with ngrams
Authors Naomi Tachikawa Shapiro
Abstract Compound words with unmarked word boundaries are problematic for many tasks in NLP and computational linguistics, including information extraction, machine translation, and syllabification. This paper introduces a simple, proof-of-concept language modeling approach to automatic compound segmentation, as applied to Finnish. This approach utilizes an off-the-shelf morphological analyzer to split training words into their constituent morphemes. A language model is subsequently trained on ngrams composed of morphemes, morpheme boundaries, and word boundaries. Linguistic constraints are then used to weed out phonotactically ill-formed segmentations, thereby allowing the language model to select the best grammatical segmentation. This approach achieves an accuracy of ~97%.
Tasks Language Modelling, Machine Translation, Morphological Analysis, Semantic Parsing
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1061/
PDF https://www.aclweb.org/anthology/C16-1061
PWC https://paperswithcode.com/paper/splitting-compounds-with-ngrams
Repo
Framework

Using phone features to improve dialogue state tracking generalisation to unseen states

Title Using phone features to improve dialogue state tracking generalisation to unseen states
Authors Iñigo Casanueva, Thomas Hain, Mauro Nicolao, Phil Green
Abstract
Tasks Dialogue State Tracking, Spoken Language Understanding
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3611/
PDF https://www.aclweb.org/anthology/W16-3611
PWC https://paperswithcode.com/paper/using-phone-features-to-improve-dialogue
Repo
Framework

LVCSR System on a Hybrid GPU-CPU Embedded Platform for Real-Time Dialog Applications

Title LVCSR System on a Hybrid GPU-CPU Embedded Platform for Real-Time Dialog Applications
Authors Alexei V. Ivanov, Patrick L. Lange, David Suendermann-Oeft
Abstract
Tasks Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition, Spoken Language Understanding
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3627/
PDF https://www.aclweb.org/anthology/W16-3627
PWC https://paperswithcode.com/paper/lvcsr-system-on-a-hybrid-gpu-cpu-embedded
Repo
Framework

Unravelling Names of Fictional Characters

Title Unravelling Names of Fictional Characters
Authors Katerina Papantoniou, Stasinos Konstantopoulos
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1203/
PDF https://www.aclweb.org/anthology/P16-1203
PWC https://paperswithcode.com/paper/unravelling-names-of-fictional-characters
Repo
Framework

Analysing the Integration of Semantic Web Features for Document Planning across Genres

Title Analysing the Integration of Semantic Web Features for Document Planning across Genres
Authors Marta Vicente, Elena Lloret
Abstract
Tasks Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3513/
PDF https://www.aclweb.org/anthology/W16-3513
PWC https://paperswithcode.com/paper/analysing-the-integration-of-semantic-web
Repo
Framework

Fast Coupled Sequence Labeling on Heterogeneous Annotations via Context-aware Pruning

Title Fast Coupled Sequence Labeling on Heterogeneous Annotations via Context-aware Pruning
Authors Zhenghua Li, Jiayuan Chao, Min Zhang, Jiwen Yang
Abstract
Tasks
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1072/
PDF https://www.aclweb.org/anthology/D16-1072
PWC https://paperswithcode.com/paper/fast-coupled-sequence-labeling-on
Repo
Framework

Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection

Title Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection
Authors Sarah Ita Levitan, Yocheved Levitan, Guozhen An, Michelle Levine, Rivka Levitan, Andrew Rosenberg, Julia Hirschberg
Abstract
Tasks Deception Detection
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0806/
PDF https://www.aclweb.org/anthology/W16-0806
PWC https://paperswithcode.com/paper/identifying-individual-differences-in-gender
Repo
Framework

Zero-resource Dependency Parsing: Boosting Delexicalized Cross-lingual Transfer with Linguistic Knowledge

Title Zero-resource Dependency Parsing: Boosting Delexicalized Cross-lingual Transfer with Linguistic Knowledge
Authors Lauriane Aufrant, Guillaume Wisniewski, François Yvon
Abstract This paper studies cross-lingual transfer for dependency parsing, focusing on very low-resource settings where delexicalized transfer is the only fully automatic option. We show how to boost parsing performance by rewriting the source sentences so as to better match the linguistic regularities of the target language. We contrast a data-driven approach with an approach relying on linguistically motivated rules automatically extracted from the World Atlas of Language Structures. Our findings are backed up by experiments involving 40 languages. They show that both approaches greatly outperform the baseline, the knowledge-driven method yielding the best accuracies, with average improvements of +2.9 UAS, and up to +90 UAS (absolute) on some frequent PoS configurations.
Tasks Active Learning, Cross-Lingual Transfer, Dependency Parsing, Machine Translation
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1012/
PDF https://www.aclweb.org/anthology/C16-1012
PWC https://paperswithcode.com/paper/zero-resource-dependency-parsing-boosting
Repo
Framework

Political News Sentiment Analysis for Under-resourced Languages

Title Political News Sentiment Analysis for Under-resourced Languages
Authors Patrik F. Bakken, Terje A. Bratlie, Cristina Marco, Jon Atle Gulla
Abstract This paper presents classification results for the analysis of sentiment in political news articles. The domain of political news is particularly challenging, as journalists are presumably objective, whilst at the same time opinions can be subtly expressed. To deal with this challenge, in this work we conduct a two-step classification model, distinguishing first subjective and second positive and negative sentiment texts. More specifically, we propose a shallow machine learning approach where only minimal features are needed to train the classifier, including sentiment-bearing Co-Occurring Terms (COTs) and negation words. This approach yields close to state-of-the-art results. Contrary to results in other domains, the use of negations as features does not have a positive impact in the evaluation results. This method is particularly suited for languages that suffer from a lack of resources, such as sentiment lexicons or parsers, and for those systems that need to function in real-time.
Tasks Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1281/
PDF https://www.aclweb.org/anthology/C16-1281
PWC https://paperswithcode.com/paper/political-news-sentiment-analysis-for-under
Repo
Framework

Discontinuity (Re)²-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing

Title Discontinuity (Re)²-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing
Authors Yannick Versley
Abstract
Tasks Constituency Parsing, Dependency Parsing
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0907/
PDF https://www.aclweb.org/anthology/W16-0907
PWC https://paperswithcode.com/paper/discontinuity-re-visited-a-minimalist
Repo
Framework

Finding metaphorical triggers through source (not target) domain lexicalization patterns

Title Finding metaphorical triggers through source (not target) domain lexicalization patterns
Authors Jenny Lederer
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-1101/
PDF https://www.aclweb.org/anthology/W16-1101
PWC https://paperswithcode.com/paper/finding-metaphorical-triggers-through-source
Repo
Framework

Legacy language atlas data mining: mapping Kru languages

Title Legacy language atlas data mining: mapping Kru languages
Authors Dafydd Gibbon
Abstract An online tool based on dialectometric methods, DistGraph, is applied to a group of Kru languages of Côte d'Ivoire, Liberia and Burkina Faso. The inputs to this resource consist of tables of languages x linguistic features (e.g. phonological, lexical or grammatical), and statistical and graphical outputs are generated which show similarities and differences between the languages in terms of the features as virtual distances. In the present contribution, attention is focussed on the consonant systems of the languages, a traditional starting point for language comparison. The data are harvested from a legacy language data resource based on fieldwork in the 1970s and 1980s, a language atlas of the Kru languages. The method on which the online tool is based extends beyond documentation of individual languages to the documentation of language groups, and supports difference-based prioritisation in education programmes, decisions on language policy and documentation and conservation funding, as well as research on language typology and heritage documentation of history and migration.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1515/
PDF https://www.aclweb.org/anthology/L16-1515
PWC https://paperswithcode.com/paper/legacy-language-atlas-data-mining-mapping-kru
Repo
Framework

Language Resource Addition Strategies for Raw Text Parsing

Title Language Resource Addition Strategies for Raw Text Parsing
Authors Atsushi Ushiku, Tetsuro Sasada, Shinsuke Mori
Abstract We focus on the improvement of accuracy of raw text parsing, from the viewpoint of language resource addition. In Japanese, the raw text parsing is divided into three steps: word segmentation, part-of-speech tagging, and dependency parsing. We investigate the contribution of language resource addition in each of three steps to the improvement in accuracy for two domain corpora. The experimental results show that this improvement depends on the target domain. For example, when we handle well-written texts of limited vocabulary, white paper, an effective language resource is a word-POS pair sequence corpus for the parsing accuracy. So we conclude that it is important to check out the characteristics of the target domain and to choose a suitable language resource addition strategy for the parsing accuracy improvement.
Tasks Dependency Parsing, Part-Of-Speech Tagging
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1105/
PDF https://www.aclweb.org/anthology/L16-1105
PWC https://paperswithcode.com/paper/language-resource-addition-strategies-for-raw
Repo
Framework

Most "babies" are "little" and most "problems" are "huge": Compositional Entailment in Adjective-Nouns

Title Most "babies" are "little" and most "problems" are "huge": Compositional Entailment in Adjective-Nouns
Authors Ellie Pavlick, Chris Callison-Burch
Abstract
Tasks Common Sense Reasoning, Natural Language Inference
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1204/
PDF https://www.aclweb.org/anthology/P16-1204
PWC https://paperswithcode.com/paper/most-babies-are-little-and-most-problems-are
Repo
Framework