May 4, 2019


Paper Group NANR 189


Consistent Word Segmentation, Part-of-Speech Tagging and Dependency Labelling Annotation for Chinese Language


Title	Consistent Word Segmentation, Part-of-Speech Tagging and Dependency Labelling Annotation for Chinese Language
Authors	Mo Shen, Wingmui Li, HyunJeong Choe, Chenhui Chu, Daisuke Kawahara, Sadao Kurohashi
Abstract	In this paper, we propose a new annotation approach to Chinese word segmentation, part-of-speech (POS) tagging and dependency labelling that aims to overcome the two major issues in traditional morphology-based annotation: Inconsistency and data sparsity. We re-annotate the Penn Chinese Treebank 5.0 (CTB5) and demonstrate the advantages of this approach compared to the original CTB5 annotation through word segmentation, POS tagging and machine translation experiments.
Tasks	Chinese Word Segmentation, Machine Translation, Morphological Analysis, Part-Of-Speech Tagging
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1029/
PDF	https://www.aclweb.org/anthology/C16-1029
PWC	https://paperswithcode.com/paper/consistent-word-segmentation-part-of-speech
Repo
Framework

Splitting compounds with ngrams


Title	Splitting compounds with ngrams
Authors	Naomi Tachikawa Shapiro
Abstract	Compound words with unmarked word boundaries are problematic for many tasks in NLP and computational linguistics, including information extraction, machine translation, and syllabification. This paper introduces a simple, proof-of-concept language modeling approach to automatic compound segmentation, as applied to Finnish. This approach utilizes an off-the-shelf morphological analyzer to split training words into their constituent morphemes. A language model is subsequently trained on ngrams composed of morphemes, morpheme boundaries, and word boundaries. Linguistic constraints are then used to weed out phonotactically ill-formed segmentations, thereby allowing the language model to select the best grammatical segmentation. This approach achieves an accuracy of ~97%.
Tasks	Language Modelling, Machine Translation, Morphological Analysis, Semantic Parsing
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1061/
PDF	https://www.aclweb.org/anthology/C16-1061
PWC	https://paperswithcode.com/paper/splitting-compounds-with-ngrams
Repo
Framework
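
The abstract above describes scoring candidate segmentations with a language model trained on ngrams of morphemes and boundary symbols. A minimal sketch of that idea follows, assuming a bigram model with add-one smoothing. The toy training data, the boundary symbols ("=" for a compound boundary, "-" for a morpheme boundary), and all function names are hypothetical and not taken from the paper, and the phonotactic filtering step is omitted.

```python
import math
from collections import Counter

# Hypothetical toy training data: words already split into morphemes.
# "=" marks a compound boundary, "-" a plain morpheme boundary,
# "<w>" and "</w>" mark word boundaries. In the paper, the morpheme
# splits come from an off-the-shelf morphological analyzer.
TRAINING = [
    ["<w>", "kirja", "=", "kauppa", "</w>"],
    ["<w>", "kauppa", "-", "ssa", "</w>"],
    ["<w>", "kirja", "-", "ssa", "</w>"],
    ["<w>", "talo", "-", "ssa", "</w>"],
]

# Two hypothetical candidate analyses of the same morpheme sequence.
CANDIDATES = [
    ["<w>", "kirja", "=", "kauppa", "-", "ssa", "</w>"],
    ["<w>", "kirja", "-", "kauppa", "=", "ssa", "</w>"],
]


def train_bigrams(sequences):
    """Count context tokens and adjacent token pairs."""
    contexts, bigrams = Counter(), Counter()
    for seq in sequences:
        contexts.update(seq[:-1])
        bigrams.update(zip(seq, seq[1:]))
    return contexts, bigrams


def log_prob(seq, contexts, bigrams, vocab_size):
    """Add-one smoothed bigram log-probability of a token sequence."""
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(seq, seq[1:])
    )


def best_segmentation(candidates, contexts, bigrams, vocab_size):
    """Return the candidate with the highest model score."""
    return max(
        candidates,
        key=lambda c: log_prob(c, contexts, bigrams, vocab_size),
    )


CONTEXTS, BIGRAMS = train_bigrams(TRAINING)
VOCAB_SIZE = len({tok for seq in TRAINING for tok in seq})

if __name__ == "__main__":
    print(best_segmentation(CANDIDATES, CONTEXTS, BIGRAMS, VOCAB_SIZE))
```

Both candidates contain the same number of tokens, so the comparison is not biased by sequence length. A real system would also need a policy for unseen morphemes.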

Using phone features to improve dialogue state tracking generalisation to unseen states


Title	Using phone features to improve dialogue state tracking generalisation to unseen states
Authors	Iñigo Casanueva, Thomas Hain, Mauro Nicolao, Phil Green
Abstract
Tasks	Dialogue State Tracking, Spoken Language Understanding
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3611/
PDF	https://www.aclweb.org/anthology/W16-3611
PWC	https://paperswithcode.com/paper/using-phone-features-to-improve-dialogue
Repo
Framework

LVCSR System on a Hybrid GPU-CPU Embedded Platform for Real-Time Dialog Applications


Title	LVCSR System on a Hybrid GPU-CPU Embedded Platform for Real-Time Dialog Applications
Authors	Alexei V. Ivanov, Patrick L. Lange, David Suendermann-Oeft
Abstract
Tasks	Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition, Spoken Language Understanding
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3627/
PDF	https://www.aclweb.org/anthology/W16-3627
PWC	https://paperswithcode.com/paper/lvcsr-system-on-a-hybrid-gpu-cpu-embedded
Repo
Framework

Unravelling Names of Fictional Characters


Title	Unravelling Names of Fictional Characters
Authors	Katerina Papantoniou, Stasinos Konstantopoulos
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1203/
PDF	https://www.aclweb.org/anthology/P16-1203
PWC	https://paperswithcode.com/paper/unravelling-names-of-fictional-characters
Repo
Framework

Analysing the Integration of Semantic Web Features for Document Planning across Genres


Title	Analysing the Integration of Semantic Web Features for Document Planning across Genres
Authors	Marta Vicente, Elena Lloret
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3513/
PDF	https://www.aclweb.org/anthology/W16-3513
PWC	https://paperswithcode.com/paper/analysing-the-integration-of-semantic-web
Repo
Framework

Fast Coupled Sequence Labeling on Heterogeneous Annotations via Context-aware Pruning


Title	Fast Coupled Sequence Labeling on Heterogeneous Annotations via Context-aware Pruning
Authors	Zhenghua Li, Jiayuan Chao, Min Zhang, Jiwen Yang
Abstract
Tasks
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1072/
PDF	https://www.aclweb.org/anthology/D16-1072
PWC	https://paperswithcode.com/paper/fast-coupled-sequence-labeling-on
Repo
Framework

Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection


Title	Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection
Authors	Sarah Ita Levitan, Yocheved Levitan, Guozhen An, Michelle Levine, Rivka Levitan, Andrew Rosenberg, Julia Hirschberg
Abstract
Tasks	Deception Detection
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0806/
PDF	https://www.aclweb.org/anthology/W16-0806
PWC	https://paperswithcode.com/paper/identifying-individual-differences-in-gender
Repo
Framework

Zero-resource Dependency Parsing: Boosting Delexicalized Cross-lingual Transfer with Linguistic Knowledge


Title	Zero-resource Dependency Parsing: Boosting Delexicalized Cross-lingual Transfer with Linguistic Knowledge
Authors	Lauriane Aufrant, Guillaume Wisniewski, François Yvon
Abstract	This paper studies cross-lingual transfer for dependency parsing, focusing on very low-resource settings where delexicalized transfer is the only fully automatic option. We show how to boost parsing performance by rewriting the source sentences so as to better match the linguistic regularities of the target language. We contrast a data-driven approach with an approach relying on linguistically motivated rules automatically extracted from the World Atlas of Language Structures. Our findings are backed up by experiments involving 40 languages. They show that both approaches greatly outperform the baseline, the knowledge-driven method yielding the best accuracies, with average improvements of +2.9 UAS, and up to +90 UAS (absolute) on some frequent PoS configurations.
Tasks	Active Learning, Cross-Lingual Transfer, Dependency Parsing, Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1012/
PDF	https://www.aclweb.org/anthology/C16-1012
PWC	https://paperswithcode.com/paper/zero-resource-dependency-parsing-boosting
Repo
Framework

Political News Sentiment Analysis for Under-resourced Languages


Title	Political News Sentiment Analysis for Under-resourced Languages
Authors	Patrik F. Bakken, Terje A. Bratlie, Cristina Marco, Jon Atle Gulla
Abstract	This paper presents classification results for the analysis of sentiment in political news articles. The domain of political news is particularly challenging, as journalists are presumably objective, whilst at the same time opinions can be subtly expressed. To deal with this challenge, in this work we conduct a two-step classification model, distinguishing first subjective and second positive and negative sentiment texts. More specifically, we propose a shallow machine learning approach where only minimal features are needed to train the classifier, including sentiment-bearing Co-Occurring Terms (COTs) and negation words. This approach yields close to state-of-the-art results. Contrary to results in other domains, the use of negations as features does not have a positive impact in the evaluation results. This method is particularly suited for languages that suffer from a lack of resources, such as sentiment lexicons or parsers, and for those systems that need to function in real-time.
Tasks	Sentiment Analysis
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1281/
PDF	https://www.aclweb.org/anthology/C16-1281
PWC	https://paperswithcode.com/paper/political-news-sentiment-analysis-for-under
Repo
Framework

Discontinuity (Re)²-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing


Title	Discontinuity (Re)²-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing
Authors	Yannick Versley
Abstract
Tasks	Constituency Parsing, Dependency Parsing
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0907/
PDF	https://www.aclweb.org/anthology/W16-0907
PWC	https://paperswithcode.com/paper/discontinuity-re-visited-a-minimalist
Repo
Framework

Finding metaphorical triggers through source (not target) domain lexicalization patterns


Title	Finding metaphorical triggers through source (not target) domain lexicalization patterns
Authors	Jenny Lederer
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1101/
PDF	https://www.aclweb.org/anthology/W16-1101
PWC	https://paperswithcode.com/paper/finding-metaphorical-triggers-through-source
Repo
Framework

Legacy language atlas data mining: mapping Kru languages


Title	Legacy language atlas data mining: mapping Kru languages
Authors	Dafydd Gibbon
Abstract	An online tool based on dialectometric methods, DistGraph, is applied to a group of Kru languages of Côte d'Ivoire, Liberia and Burkina Faso. The inputs to this resource consist of tables of languages x linguistic features (e.g. phonological, lexical or grammatical), and statistical and graphical outputs are generated which show similarities and differences between the languages in terms of the features as virtual distances. In the present contribution, attention is focussed on the consonant systems of the languages, a traditional starting point for language comparison. The data are harvested from a legacy language data resource based on fieldwork in the 1970s and 1980s, a language atlas of the Kru languages. The method on which the online tool is based extends beyond documentation of individual languages to the documentation of language groups, and supports difference-based prioritisation in education programmes, decisions on language policy and documentation and conservation funding, as well as research on language typology and heritage documentation of history and migration.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1515/
PDF	https://www.aclweb.org/anthology/L16-1515
PWC	https://paperswithcode.com/paper/legacy-language-atlas-data-mining-mapping-kru
Repo
Framework
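
The abstract above describes turning a table of languages x linguistic features into pairwise "virtual distances". A minimal sketch of that step follows, assuming binary features and a normalized Hamming distance. The abstract does not state which distance measure DistGraph uses, so the measure, the language labels, and the feature values here are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy table: rows are languages, columns are binary features
# (for example, presence or absence of a given consonant).
# Labels and values are invented for illustration only.
FEATURES = {
    "LangA": [1, 0, 1, 1],
    "LangB": [1, 1, 1, 0],
    "LangC": [0, 0, 1, 1],
}


def hamming_distance(a, b):
    """Fraction of feature positions at which two languages differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_table(features):
    """Pairwise distances keyed by alphabetically ordered language pairs."""
    return {
        (p, q): hamming_distance(features[p], features[q])
        for p, q in combinations(sorted(features), 2)
    }


if __name__ == "__main__":
    for pair, dist in distance_table(FEATURES).items():
        print(pair, dist)
```

A table like this could then feed a clustering or graph-drawing step. That part is left out because the abstract gives no detail on it.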

Language Resource Addition Strategies for Raw Text Parsing


Title	Language Resource Addition Strategies for Raw Text Parsing
Authors	Atsushi Ushiku, Tetsuro Sasada, Shinsuke Mori
Abstract	We focus on the improvement of accuracy of raw text parsing, from the viewpoint of language resource addition. In Japanese, the raw text parsing is divided into three steps: word segmentation, part-of-speech tagging, and dependency parsing. We investigate the contribution of language resource addition in each of three steps to the improvement in accuracy for two domain corpora. The experimental results show that this improvement depends on the target domain. For example, when we handle well-written texts of limited vocabulary, white paper, an effective language resource is a word-POS pair sequence corpus for the parsing accuracy. So we conclude that it is important to check out the characteristics of the target domain and to choose a suitable language resource addition strategy for the parsing accuracy improvement.
Tasks	Dependency Parsing, Part-Of-Speech Tagging
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1105/
PDF	https://www.aclweb.org/anthology/L16-1105
PWC	https://paperswithcode.com/paper/language-resource-addition-strategies-for-raw
Repo
Framework

Most "babies" are "little" and most "problems" are "huge": Compositional Entailment in Adjective-Nouns


Title	Most "babies" are "little" and most "problems" are "huge": Compositional Entailment in Adjective-Nouns
Authors	Ellie Pavlick, Chris Callison-Burch
Abstract
Tasks	Common Sense Reasoning, Natural Language Inference
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1204/
PDF	https://www.aclweb.org/anthology/P16-1204
PWC	https://paperswithcode.com/paper/most-babies-are-little-and-most-problems-are
Repo
Framework