Paper Group NANR 16
Annotation and Analysis of Discourse Relations, Temporal Relations and Multi-Layered Situational Relations in Japanese Texts
Title | Annotation and Analysis of Discourse Relations, Temporal Relations and Multi-Layered Situational Relations in Japanese Texts |
Authors | Kimi Kaneko, Saku Sugawara, Koji Mineshima, Daisuke Bekki |
Abstract | This paper proposes a methodology for building a specialized Japanese data set for recognizing temporal relations and discourse relations. In addition to temporal and discourse relations, multi-layered situational relations that distinguish generic and specific states belonging to different layers in a discourse are annotated. Our methodology has been applied to 170 text fragments taken from Wikinews articles in Japanese. The validity of our methodology is evaluated and analyzed in terms of degree of annotator agreement and frequency of errors. |
Tasks | Natural Language Inference, Question Answering, Text Summarization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5402/ |
https://www.aclweb.org/anthology/W16-5402 | |
PWC | https://paperswithcode.com/paper/annotation-and-analysis-of-discourse |
Repo | |
Framework | |
Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text
Title | Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text |
Authors | Bo Han, Afshin Rahimi, Leon Derczynski, Timothy Baldwin |
Abstract | This paper presents the shared task for English Twitter geolocation prediction in WNUT 2016. We discuss details of task settings, data preparations and participant systems. The derived dataset and performance figures from each system provide baselines for future research in this realm. |
Tasks | Sentiment Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3928/ |
https://www.aclweb.org/anthology/W16-3928 | |
PWC | https://paperswithcode.com/paper/twitter-geolocation-prediction-shared-task-of |
Repo | |
Framework | |
Automatic Generation of Context-Based Fill-in-the-Blank Exercises Using Co-occurrence Likelihoods and Google n-grams
Title | Automatic Generation of Context-Based Fill-in-the-Blank Exercises Using Co-occurrence Likelihoods and Google n-grams |
Authors | Jennifer Hill, Rahul Simha |
Abstract | |
Tasks | Reading Comprehension |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0503/ |
https://www.aclweb.org/anthology/W16-0503 | |
PWC | https://paperswithcode.com/paper/automatic-generation-of-context-based-fill-in |
Repo | |
Framework | |
Retrofitting Word Vectors of MeSH Terms to Improve Semantic Similarity Measures
Title | Retrofitting Word Vectors of MeSH Terms to Improve Semantic Similarity Measures |
Authors | Zhiguo Yu, Trevor Cohen, Byron Wallace, Elmer Bernstam, Todd Johnson |
Abstract | |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-6106/ |
https://www.aclweb.org/anthology/W16-6106 | |
PWC | https://paperswithcode.com/paper/retrofitting-word-vectors-of-mesh-terms-to |
Repo | |
Framework | |
Rule-based Automatic Multi-word Term Extraction and Lemmatization
Title | Rule-based Automatic Multi-word Term Extraction and Lemmatization |
Authors | Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac |
Abstract | In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from the mining domain containing more than 600,000 simple word forms. Extracted and lemmatized multi-word terms are filtered in order to reject falsely offered lemmas and then ranked by introducing measures that combine linguistic and statistical information (C-Value, T-Score, LLR, and Keyness). Mean average precision for retrieval of MWU forms ranges from 0.789 to 0.804, while mean average precision of lemma production ranges from 0.956 to 0.960. The evaluation showed that 94% of distinct multi-word forms were evaluated as proper multi-word units, and among them 97% were associated with correct lemmas. |
Tasks | Lemmatization |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1081/ |
https://www.aclweb.org/anthology/L16-1081 | |
PWC | https://paperswithcode.com/paper/rule-based-automatic-multi-word-term |
Repo | |
Framework | |
A Dependency Treebank of the Chinese Buddhist Canon
Title | A Dependency Treebank of the Chinese Buddhist Canon |
Authors | Tak-sum Wong, John Lee |
Abstract | We present a dependency treebank of the Chinese Buddhist Canon, which contains 1,514 texts with about 50 million Chinese characters. The treebank was created by an automatic parser trained on a smaller treebank, containing four manually annotated sutras (Lee and Kong, 2014). We report results on word segmentation, part-of-speech tagging and dependency parsing, and discuss challenges posed by the processing of medieval Chinese. In a case study, we exploit the treebank to examine verbs frequently associated with Buddha, and to analyze usage patterns of quotative verbs in direct speech. Our results suggest that certain quotative verbs imply status differences between the speaker and the listener. |
Tasks | Dependency Parsing, Part-Of-Speech Tagging |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1265/ |
https://www.aclweb.org/anthology/L16-1265 | |
PWC | https://paperswithcode.com/paper/a-dependency-treebank-of-the-chinese-buddhist |
Repo | |
Framework | |
"How Bullying is this Message?": A Psychometric Thermometer for Bullying
Title | "How Bullying is this Message?": A Psychometric Thermometer for Bullying |
Authors | Parma Nand, Rivindu Perera, Abhijeet Kasture |
Abstract | |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/papers/C16-1067/c16-1067 |
https://www.aclweb.org/anthology/C16-1067 | |
PWC | https://paperswithcode.com/paper/how-bullying-is-this-message-a-psychometric |
Repo | |
Framework | |
Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun
Title | Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun |
Authors | Václava Kettnerová, Eduard Bejček |
Abstract | In this paper, we focus on Czech complex predicates formed by a light verb and a predicative noun expressed as the direct object. Although Czech, as an inflectional language encoding syntactic relations via morphological cases, provides an excellent opportunity to study the distribution of valency complements in the syntactic structure with complex predicates, this distribution has not been described so far. On the basis of a manual analysis of the richly annotated data from the Prague Dependency Treebank, we thus formulate principles governing this distribution. In an automatic experiment, we verify these principles on well-formed syntactic structures from the Prague Dependency Treebank and the Prague Czech-English Dependency Treebank with very satisfactory results: the distribution of 97% of valency complements in the surface structure is governed by the proposed principles. These results corroborate that the surface structure formation of complex predicates is a regular process. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1082/ |
https://www.aclweb.org/anthology/L16-1082 | |
PWC | https://paperswithcode.com/paper/distribution-of-valency-complements-in-czech |
Repo | |
Framework | |
Clustering with Bregman Divergences: an Asymptotic Analysis
Title | Clustering with Bregman Divergences: an Asymptotic Analysis |
Authors | Chaoyue Liu, Mikhail Belkin |
Abstract | Clustering, in particular $k$-means clustering, is a central topic in data analysis. Clustering with Bregman divergences is a recently proposed generalization of $k$-means clustering which has already been widely used in applications. In this paper we analyze theoretical properties of Bregman clustering when the number of the clusters $k$ is large. We establish quantization rates and describe the limiting distribution of the centers as $k\to \infty$, extending well-known results for $k$-means clustering. |
Tasks | Quantization |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6550-clustering-with-bregman-divergences-an-asymptotic-analysis |
http://papers.nips.cc/paper/6550-clustering-with-bregman-divergences-an-asymptotic-analysis.pdf | |
PWC | https://paperswithcode.com/paper/clustering-with-bregman-divergences-an |
Repo | |
Framework | |
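The abstract above describes Bregman clustering as a generalization of $k$-means. As a rough illustration only (not the authors' code, and unrelated to their asymptotic analysis), the following sketch computes a Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> and runs a Lloyd-style loop with it. All function names here are hypothetical, and the sketch assumes the standard result that the arithmetic mean is the optimal cluster representative for any Bregman divergence.

```python
import math

def bregman_divergence(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad_phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# phi(x) = ||x||^2 yields squared Euclidean distance (ordinary k-means).
def sq_norm(x):
    return sum(xi * xi for xi in x)

def sq_norm_grad(x):
    return [2.0 * xi for xi in x]

# phi(x) = sum_i x_i log x_i yields generalized KL divergence (x_i > 0).
def neg_entropy(x):
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

def bregman_kmeans(points, centers, phi, grad_phi, iterations=10):
    """Lloyd-style hard clustering: assign by divergence, update by mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            best = min(
                range(len(centers)),
                key=lambda j: bregman_divergence(phi, grad_phi, p, centers[j]),
            )
            clusters[best].append(p)
        # Keep the old center when a cluster receives no points.
        centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else c
            for members, c in zip(clusters, centers)
        ]
    return centers
```

Swapping `sq_norm` for `neg_entropy` changes only the assignment step; the mean update stays the same, which is what makes the family a direct generalization of $k$-means.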
Phrase-Level Combination of SMT and TM Using Constrained Word Lattice
Title | Phrase-Level Combination of SMT and TM Using Constrained Word Lattice |
Authors | Liangyou Li, Andy Way, Qun Liu |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2045/ |
https://www.aclweb.org/anthology/P16-2045 | |
PWC | https://paperswithcode.com/paper/phrase-level-combination-of-smt-and-tm-using |
Repo | |
Framework | |
A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions
Title | A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions |
Authors | Chaya Liebeskind, Yaakov HaCohen-Kerner |
Abstract | A verb-noun Multi-Word Expression (MWE) is a combination of a verb and a noun with or without other words, in which the combination has a meaning different from the meaning of the words considered separately. In this paper, we present a new lexical resource of Hebrew Verb-Noun MWEs (VN-MWEs). The VN-MWEs of this resource were manually collected and annotated from five different web resources. In addition, we analyze the lexical properties of Hebrew VN-MWEs by classifying them to three types: morphological, syntactic, and semantic. These two contributions are essential for designing algorithms for automatic VN-MWEs extraction. The analysis suggests some interesting features of VN-MWEs for exploration. The lexical resource enables to sample a set of positive examples for Hebrew VN-MWEs. This set of examples can either be used for training supervised algorithms or as seeds in unsupervised bootstrapping algorithms. Thus, this resource is a first step towards automatic identification of Hebrew VN-MWEs, which is important for natural language understanding, generation and translation systems. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1083/ |
https://www.aclweb.org/anthology/L16-1083 | |
PWC | https://paperswithcode.com/paper/a-lexical-resource-of-hebrew-verb-noun-multi |
Repo | |
Framework | |
K-SRL: Instance-based Learning for Semantic Role Labeling
Title | K-SRL: Instance-based Learning for Semantic Role Labeling |
Authors | Alan Akbik, Yunyao Li |
Abstract | Semantic role labeling (SRL) is the task of identifying and labeling predicate-argument structures in sentences with semantic frame and role labels. A known challenge in SRL is the large number of low-frequency exceptions in training data, which are highly context-specific and difficult to generalize. To overcome this challenge, we propose the use of instance-based learning that performs no explicit generalization, but rather extrapolates predictions from the most similar instances in the training data. We present a variant of k-nearest neighbors (kNN) classification with composite features to identify nearest neighbors for SRL. We show that high-quality predictions can be derived from a very small number of similar instances. In a comparative evaluation we experimentally demonstrate that our instance-based learning approach significantly outperforms current state-of-the-art systems on both in-domain and out-of-domain data, reaching F1-scores of 89.28% and 79.91% respectively. |
Tasks | Machine Translation, Question Answering, Semantic Role Labeling |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1058/ |
https://www.aclweb.org/anthology/C16-1058 | |
PWC | https://paperswithcode.com/paper/k-srl-instance-based-learning-for-semantic |
Repo | |
Framework | |
SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking
Title | SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking |
Authors | Marie-Jean Meurs, Hayda Almeida, Ludovic Jean-Louis, Eric Charton |
Abstract | This paper presents SemLinker, an open source system that discovers named entities, connects them to a reference knowledge base, and clusters them semantically. SemLinker relies on several modules that perform surface form generation, mutual disambiguation, entity clustering, and make use of two annotation engines. SemLinker was evaluated in the English Entity Discovery and Linking track of the Text Analysis Conference on Knowledge Base Population, organized by the US National Institute of Standards and Technology. Along with the SemLinker source code, we release our annotation files containing the discovered named entities, their types, and position across processed documents. |
Tasks | Knowledge Base Population |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1085/ |
https://www.aclweb.org/anthology/L16-1085 | |
PWC | https://paperswithcode.com/paper/semlinker-a-modular-and-open-source-framework |
Repo | |
Framework | |
OSU_CHGCG at SemEval-2016 Task 9 : Chinese Semantic Dependency Parsing with Generalized Categorial Grammar
Title | OSU_CHGCG at SemEval-2016 Task 9 : Chinese Semantic Dependency Parsing with Generalized Categorial Grammar |
Authors | Manjuan Duan, Lifeng Jin, William Schuler |
Abstract | |
Tasks | Dependency Parsing, Question Answering, Semantic Dependency Parsing |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1189/ |
https://www.aclweb.org/anthology/S16-1189 | |
PWC | https://paperswithcode.com/paper/osu_chgcg-at-semeval-2016-task-9-chinese |
Repo | |
Framework | |
Multilingual Aliasing for Auto-Generating Proposition Banks
Title | Multilingual Aliasing for Auto-Generating Proposition Banks |
Authors | Alan Akbik, Xinyu Guan, Yunyao Li |
Abstract | Semantic Role Labeling (SRL) is the task of identifying the predicate-argument structure in sentences with semantic frame and role labels. For the English language, the Proposition Bank provides both a lexicon of all possible semantic frames and large amounts of labeled training data. In order to expand SRL beyond English, previous work investigated automatic approaches based on parallel corpora to automatically generate Proposition Banks for new target languages (TLs). However, this approach heuristically produces the frame lexicon from word alignments, leading to a range of lexicon-level errors and inconsistencies. To address these issues, we propose to manually alias TL verbs to existing English frames. For instance, the German verb drehen may evoke several meanings, including "turn something" and "film something". Accordingly, we alias the former to the frame TURN.01 and the latter to a group of frames that includes FILM.01 and SHOOT.03. We execute a large-scale manual aliasing effort for three target languages and apply the new lexicons to automatically generate large Proposition Banks for Chinese, French and German with manually curated frames. We present a detailed evaluation in which we find that our proposed approach significantly increases the quality and consistency of the generated Proposition Banks. We release these resources to the research community. |
Tasks | Machine Translation, Question Answering, Semantic Role Labeling |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1327/ |
https://www.aclweb.org/anthology/C16-1327 | |
PWC | https://paperswithcode.com/paper/multilingual-aliasing-for-auto-generating |
Repo | |
Framework | |