May 5, 2019

Paper Group NANR 16

Annotation and Analysis of Discourse Relations, Temporal Relations and Multi-Layered Situational Relations in Japanese Texts. Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text. Automatic Generation of Context-Based Fill-in-the-Blank Exercises Using Co-occurrence Likelihoods and Google n-grams. Retrofitting …

Annotation and Analysis of Discourse Relations, Temporal Relations and Multi-Layered Situational Relations in Japanese Texts

Title Annotation and Analysis of Discourse Relations, Temporal Relations and Multi-Layered Situational Relations in Japanese Texts
Authors Kimi Kaneko, Saku Sugawara, Koji Mineshima, Daisuke Bekki
Abstract This paper proposes a methodology for building a specialized Japanese data set for recognizing temporal relations and discourse relations. In addition to temporal and discourse relations, multi-layered situational relations that distinguish generic and specific states belonging to different layers in a discourse are annotated. Our methodology has been applied to 170 text fragments taken from Wikinews articles in Japanese. The validity of our methodology is evaluated and analyzed in terms of degree of annotator agreement and frequency of errors.
Tasks Natural Language Inference, Question Answering, Text Summarization
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5402/
PDF https://www.aclweb.org/anthology/W16-5402
PWC https://paperswithcode.com/paper/annotation-and-analysis-of-discourse
Repo
Framework

Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text

Title Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text
Authors Bo Han, Afshin Rahimi, Leon Derczynski, Timothy Baldwin
Abstract This paper presents the shared task for English Twitter geolocation prediction in WNUT 2016. We discuss details of task settings, data preparations and participant systems. The derived dataset and performance figures from each system provide baselines for future research in this realm.
Tasks Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3928/
PDF https://www.aclweb.org/anthology/W16-3928
PWC https://paperswithcode.com/paper/twitter-geolocation-prediction-shared-task-of
Repo
Framework

Automatic Generation of Context-Based Fill-in-the-Blank Exercises Using Co-occurrence Likelihoods and Google n-grams

Title Automatic Generation of Context-Based Fill-in-the-Blank Exercises Using Co-occurrence Likelihoods and Google n-grams
Authors Jennifer Hill, Rahul Simha
Abstract
Tasks Reading Comprehension
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0503/
PDF https://www.aclweb.org/anthology/W16-0503
PWC https://paperswithcode.com/paper/automatic-generation-of-context-based-fill-in
Repo
Framework

Retrofitting Word Vectors of MeSH Terms to Improve Semantic Similarity Measures

Title Retrofitting Word Vectors of MeSH Terms to Improve Semantic Similarity Measures
Authors Zhiguo Yu, Trevor Cohen, Byron Wallace, Elmer Bernstam, Todd Johnson
Abstract
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6106/
PDF https://www.aclweb.org/anthology/W16-6106
PWC https://paperswithcode.com/paper/retrofitting-word-vectors-of-mesh-terms-to
Repo
Framework

Rule-based Automatic Multi-word Term Extraction and Lemmatization

Title Rule-based Automatic Multi-word Term Extraction and Lemmatization
Authors Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac
Abstract In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from the mining domain containing more than 600,000 simple word forms. Extracted and lemmatized multi-word terms are filtered in order to reject falsely offered lemmas and then ranked by introducing measures that combine linguistic and statistical information (C-Value, T-Score, LLR, and Keyness). Mean average precision for retrieval of MWU forms ranges from 0.789 to 0.804, while mean average precision of lemma production ranges from 0.956 to 0.960. The evaluation showed that 94% of distinct multi-word forms were evaluated as proper multi-word units, and among them 97% were associated with correct lemmas.
Tasks Lemmatization
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1081/
PDF https://www.aclweb.org/anthology/L16-1081
PWC https://paperswithcode.com/paper/rule-based-automatic-multi-word-term
Repo
Framework

A Dependency Treebank of the Chinese Buddhist Canon

Title A Dependency Treebank of the Chinese Buddhist Canon
Authors Tak-sum Wong, John Lee
Abstract We present a dependency treebank of the Chinese Buddhist Canon, which contains 1,514 texts with about 50 million Chinese characters. The treebank was created by an automatic parser trained on a smaller treebank, containing four manually annotated sutras (Lee and Kong, 2014). We report results on word segmentation, part-of-speech tagging and dependency parsing, and discuss challenges posed by the processing of medieval Chinese. In a case study, we exploit the treebank to examine verbs frequently associated with Buddha, and to analyze usage patterns of quotative verbs in direct speech. Our results suggest that certain quotative verbs imply status differences between the speaker and the listener.
Tasks Dependency Parsing, Part-Of-Speech Tagging
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1265/
PDF https://www.aclweb.org/anthology/L16-1265
PWC https://paperswithcode.com/paper/a-dependency-treebank-of-the-chinese-buddhist
Repo
Framework

“How Bullying is this Message?”: A Psychometric Thermometer for Bullying

Title “How Bullying is this Message?”: A Psychometric Thermometer for Bullying
Authors Parma Nand, Rivindu Perera, Abhijeet Kasture
Abstract
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/papers/C16-1067/c16-1067
PDF https://www.aclweb.org/anthology/C16-1067
PWC https://paperswithcode.com/paper/how-bullying-is-this-message-a-psychometric
Repo
Framework

Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun

Title Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun
Authors Václava Kettnerová, Eduard Bejček
Abstract In this paper, we focus on Czech complex predicates formed by a light verb and a predicative noun expressed as the direct object. Although Czech, as an inflectional language encoding syntactic relations via morphological cases, provides an excellent opportunity to study the distribution of valency complements in the syntactic structure with complex predicates, this distribution has not been described so far. On the basis of a manual analysis of the richly annotated data from the Prague Dependency Treebank, we thus formulate principles governing this distribution. In an automatic experiment, we verify these principles on well-formed syntactic structures from the Prague Dependency Treebank and the Prague Czech-English Dependency Treebank with very satisfactory results: the distribution of 97% of valency complements in the surface structure is governed by the proposed principles. These results corroborate that the surface structure formation of complex predicates is a regular process.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1082/
PDF https://www.aclweb.org/anthology/L16-1082
PWC https://paperswithcode.com/paper/distribution-of-valency-complements-in-czech
Repo
Framework

Clustering with Bregman Divergences: an Asymptotic Analysis

Title Clustering with Bregman Divergences: an Asymptotic Analysis
Authors Chaoyue Liu, Mikhail Belkin
Abstract Clustering, in particular $k$-means clustering, is a central topic in data analysis. Clustering with Bregman divergences is a recently proposed generalization of $k$-means clustering which has already been widely used in applications. In this paper we analyze theoretical properties of Bregman clustering when the number of the clusters $k$ is large. We establish quantization rates and describe the limiting distribution of the centers as $k\to \infty$, extending well-known results for $k$-means clustering.
Tasks Quantization
Published 2016-12-01
URL http://papers.nips.cc/paper/6550-clustering-with-bregman-divergences-an-asymptotic-analysis
PDF http://papers.nips.cc/paper/6550-clustering-with-bregman-divergences-an-asymptotic-analysis.pdf
PWC https://paperswithcode.com/paper/clustering-with-bregman-divergences-an
Repo
Framework
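
The abstract above treats Bregman clustering as a generalization of $k$-means. A minimal sketch of that idea, assuming a Lloyd-style alternation of assignment and mean updates: the names `bregman_cluster`, `assign`, `generalized_kl` and `squared_euclidean` are illustrative and do not come from the paper. It relies on the known property that the arithmetic mean of a cluster minimizes its total Bregman divergence, so only the assignment step changes when the divergence is swapped.

```python
import math


def squared_euclidean(x, c):
    """Bregman divergence of phi(x) = ||x||^2; recovers ordinary k-means."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def generalized_kl(x, c):
    """Bregman divergence of phi(x) = sum x_i log x_i (strictly positive inputs)."""
    return sum(xi * math.log(xi / ci) - xi + ci for xi, ci in zip(x, c))


def mean(points):
    """Coordinate-wise arithmetic mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def assign(points, centers, divergence):
    """Index of the center with the smallest divergence, for each point."""
    return [
        min(range(len(centers)), key=lambda j: divergence(p, centers[j]))
        for p in points
    ]


def bregman_cluster(points, centers, divergence, iterations=20):
    """Alternate assignment and mean updates until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        labels = assign(points, centers, divergence)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            # An empty cluster keeps its previous center.
            new_centers.append(mean(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers, divergence)


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 2.0), (8.0, 9.0), (9.0, 9.0)]
    print(bregman_cluster(data, [(1.0, 1.0), (9.0, 9.0)], generalized_kl))
```

Passing `squared_euclidean` instead of `generalized_kl` gives plain $k$-means with the same update step. The sketch does not touch the large-$k$ quantization rates that the paper analyzes.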

Phrase-Level Combination of SMT and TM Using Constrained Word Lattice

Title Phrase-Level Combination of SMT and TM Using Constrained Word Lattice
Authors Liangyou Li, Andy Way, Qun Liu
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2045/
PDF https://www.aclweb.org/anthology/P16-2045
PWC https://paperswithcode.com/paper/phrase-level-combination-of-smt-and-tm-using
Repo
Framework

A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions

Title A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions
Authors Chaya Liebeskind, Yaakov HaCohen-Kerner
Abstract A verb-noun Multi-Word Expression (MWE) is a combination of a verb and a noun with or without other words, in which the combination has a meaning different from the meaning of the words considered separately. In this paper, we present a new lexical resource of Hebrew Verb-Noun MWEs (VN-MWEs). The VN-MWEs of this resource were manually collected and annotated from five different web resources. In addition, we analyze the lexical properties of Hebrew VN-MWEs by classifying them into three types: morphological, syntactic, and semantic. These two contributions are essential for designing algorithms for automatic VN-MWE extraction. The analysis suggests some interesting features of VN-MWEs for exploration. The lexical resource makes it possible to sample a set of positive examples for Hebrew VN-MWEs. This set of examples can be used either for training supervised algorithms or as seeds in unsupervised bootstrapping algorithms. Thus, this resource is a first step towards automatic identification of Hebrew VN-MWEs, which is important for natural language understanding, generation and translation systems.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1083/
PDF https://www.aclweb.org/anthology/L16-1083
PWC https://paperswithcode.com/paper/a-lexical-resource-of-hebrew-verb-noun-multi
Repo
Framework

K-SRL: Instance-based Learning for Semantic Role Labeling

Title K-SRL: Instance-based Learning for Semantic Role Labeling
Authors Alan Akbik, Yunyao Li
Abstract Semantic role labeling (SRL) is the task of identifying and labeling predicate-argument structures in sentences with semantic frame and role labels. A known challenge in SRL is the large number of low-frequency exceptions in training data, which are highly context-specific and difficult to generalize. To overcome this challenge, we propose the use of instance-based learning that performs no explicit generalization, but rather extrapolates predictions from the most similar instances in the training data. We present a variant of k-nearest neighbors (kNN) classification with composite features to identify nearest neighbors for SRL. We show that high-quality predictions can be derived from a very small number of similar instances. In a comparative evaluation we experimentally demonstrate that our instance-based learning approach significantly outperforms current state-of-the-art systems on both in-domain and out-of-domain data, reaching F1-scores of 89.28% and 79.91% respectively.
Tasks Machine Translation, Question Answering, Semantic Role Labeling
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1058/
PDF https://www.aclweb.org/anthology/C16-1058
PWC https://paperswithcode.com/paper/k-srl-instance-based-learning-for-semantic
Repo
Framework
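
The abstract above describes predicting from the most similar training instances without building an explicit model. A minimal sketch of plain kNN with majority voting, assuming instances are represented as sets of string features and similarity is the number of shared features: the feature strings, the `overlap` measure and the function names are hypothetical, and this is not the composite-feature variant the paper proposes.

```python
from collections import Counter


def overlap(a, b):
    """Similarity between two instances: the number of features they share."""
    return len(a & b)


def knn_predict(train, query, k=3):
    """Label a query by majority vote among its k most similar training instances.

    train is a list of (feature_set, label) pairs. No model is fitted;
    every prediction is read directly off the stored instances.
    """
    ranked = sorted(train, key=lambda item: overlap(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy training instances with made-up features and role labels.
    train = [
        ({"lemma=give", "dep=nsubj"}, "A0"),
        ({"lemma=give", "dep=dobj"}, "A1"),
        ({"lemma=give", "dep=nsubj", "pos=NN"}, "A0"),
        ({"lemma=take", "dep=dobj"}, "A1"),
    ]
    print(knn_predict(train, {"lemma=give", "dep=nsubj", "pos=NNP"}, k=3))
```

Because nothing is generalized at training time, a rare but well-matching instance can still decide a prediction, which is the property the abstract highlights for low-frequency exceptions.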

SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking

Title SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking
Authors Marie-Jean Meurs, Hayda Almeida, Ludovic Jean-Louis, Eric Charton
Abstract This paper presents SemLinker, an open source system that discovers named entities, connects them to a reference knowledge base, and clusters them semantically. SemLinker relies on several modules that perform surface form generation, mutual disambiguation, entity clustering, and make use of two annotation engines. SemLinker was evaluated in the English Entity Discovery and Linking track of the Text Analysis Conference on Knowledge Base Population, organized by the US National Institute of Standards and Technology. Along with the SemLinker source code, we release our annotation files containing the discovered named entities, their types, and position across processed documents.
Tasks Knowledge Base Population
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1085/
PDF https://www.aclweb.org/anthology/L16-1085
PWC https://paperswithcode.com/paper/semlinker-a-modular-and-open-source-framework
Repo
Framework

OSU_CHGCG at SemEval-2016 Task 9 : Chinese Semantic Dependency Parsing with Generalized Categorial Grammar

Title OSU_CHGCG at SemEval-2016 Task 9 : Chinese Semantic Dependency Parsing with Generalized Categorial Grammar
Authors Manjuan Duan, Lifeng Jin, William Schuler
Abstract
Tasks Dependency Parsing, Question Answering, Semantic Dependency Parsing
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1189/
PDF https://www.aclweb.org/anthology/S16-1189
PWC https://paperswithcode.com/paper/osu_chgcg-at-semeval-2016-task-9-chinese
Repo
Framework

Multilingual Aliasing for Auto-Generating Proposition Banks

Title Multilingual Aliasing for Auto-Generating Proposition Banks
Authors Alan Akbik, Xinyu Guan, Yunyao Li
Abstract Semantic Role Labeling (SRL) is the task of identifying the predicate-argument structure in sentences with semantic frame and role labels. For the English language, the Proposition Bank provides both a lexicon of all possible semantic frames and large amounts of labeled training data. In order to expand SRL beyond English, previous work investigated automatic approaches based on parallel corpora to automatically generate Proposition Banks for new target languages (TLs). However, this approach heuristically produces the frame lexicon from word alignments, leading to a range of lexicon-level errors and inconsistencies. To address these issues, we propose to manually alias TL verbs to existing English frames. For instance, the German verb drehen may evoke several meanings, including "turn something" and "film something". Accordingly, we alias the former to the frame TURN.01 and the latter to a group of frames that includes FILM.01 and SHOOT.03. We execute a large-scale manual aliasing effort for three target languages and apply the new lexicons to automatically generate large Proposition Banks for Chinese, French and German with manually curated frames. We present a detailed evaluation in which we find that our proposed approach significantly increases the quality and consistency of the generated Proposition Banks. We release these resources to the research community.
Tasks Machine Translation, Question Answering, Semantic Role Labeling
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1327/
PDF https://www.aclweb.org/anthology/C16-1327
PWC https://paperswithcode.com/paper/multilingual-aliasing-for-auto-generating
Repo
Framework