May 5, 2019

Paper Group NANR 12

Agreement and Disagreement: Comparison of Points of View in the Political Domain

Title Agreement and Disagreement: Comparison of Points of View in the Political Domain
Authors Stefano Menini, Sara Tonelli
Abstract The automated comparison of points of view between two politicians is a very challenging task, due not only to the lack of annotated resources, but also to the different dimensions that contribute to the definition of agreement and disagreement. In order to shed light on this complex task, we first carry out a pilot study to manually annotate the components involved in detecting agreement and disagreement. Then, based on these findings, we implement different features to capture them automatically via supervised classification. We do not focus on debates in dialogical form, but rather consider sets of documents in which politicians may express their position on different topics in an implicit or explicit way, as during an electoral campaign. We create and make available three different datasets.
Tasks Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1232/
PDF https://www.aclweb.org/anthology/C16-1232
PWC https://paperswithcode.com/paper/agreement-and-disagreement-comparison-of
Repo
Framework

CogALex-V Shared Task: GHHH - Detecting Semantic Relations via Word Embeddings

Title CogALex-V Shared Task: GHHH - Detecting Semantic Relations via Word Embeddings
Authors Mohammed Attia, Suraj Maharjan, Younes Samih, Laura Kallmeyer, Thamar Solorio
Abstract This paper describes our system submission to the CogALex-2016 Shared Task on Corpus-Based Identification of Semantic Relations. Our system won first place for Task-1 and second place for Task-2. On the test set, our system achieves an F-measure of 88.1% (79.0% for TRUE only) for Task-1 on detecting semantic similarity, and 76.0% (42.3% when excluding RANDOM) for Task-2 on identifying finer-grained semantic relations. In our experiments, we try word analogy, linear regression, and multi-task Convolutional Neural Networks (CNNs) with word embeddings from publicly available word vectors. We found that linear regression performs better in the binary classification (Task-1), while CNNs perform better in the multi-class semantic classification (Task-2). We assume that word analogy is better suited to deterministic answers than to handling the ambiguity of one-to-many and many-to-many relationships. We also show that classifier performance could benefit from balancing the distribution of labels in the training data.
Tasks Information Retrieval, Knowledge Graphs, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5311/
PDF https://www.aclweb.org/anthology/W16-5311
PWC https://paperswithcode.com/paper/cogalex-v-shared-task-ghhh-detecting-semantic
Repo
Framework
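
The GHHH abstract closes by noting that classifier performance could benefit from balancing the label distribution in the training data. The abstract does not say how the authors balanced their data, so the following is only a minimal sketch of one common option, random downsampling to the size of the rarest label. The function name `downsample_to_balance`, the toy word pairs, and the label names other than RANDOM are hypothetical.

```python
import random
from collections import Counter, defaultdict


def downsample_to_balance(examples, seed=0):
    """Return a label-balanced subset of (item, label) pairs.

    Every label is cut down to the size of the rarest label by
    random sampling without replacement.
    """
    by_label = defaultdict(list)
    for item, label in examples:
        by_label[label].append((item, label))

    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: items labelled with a relation type,
# with RANDOM heavily over-represented.
toy = (
    [(("pair", i), "RANDOM") for i in range(50)]
    + [(("pair", i), "SYN") for i in range(50, 60)]
    + [(("pair", i), "HYPER") for i in range(60, 65)]
)

balanced = downsample_to_balance(toy)
label_counts = Counter(label for _, label in balanced)
print(label_counts)
```

Downsampling discards training examples; oversampling the rare labels or weighting the loss are alternatives with different trade-offs, and which one suits a given task is an empirical question.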

A Study of the Bump Alternation in Japanese from the Perspective of Extended/Onset Causation

Title A Study of the Bump Alternation in Japanese from the Perspective of Extended/Onset Causation
Authors Natsuno Aoki, Kentaro Nakatani
Abstract This paper deals with a seldom-studied object/oblique alternation phenomenon in Japanese, which we call the bump alternation. This phenomenon, first discussed by Sadanobu (1990), is similar to the English with/against alternation (for example, compare hit the wall with the bat [=immobile-as-direct-object frame] to hit the bat against the wall [=mobile-as-direct-object frame]). However, in the Japanese version, the case frame remains constant. Although we fundamentally question Sadanobu's acceptability judgment, we also claim that the causation type (i.e., whether the event is an instance of onset or extended causation; Talmy, 1988; 2000) could make an improvement. An extended causative interpretation could improve the acceptability of the otherwise awkward immobile-as-direct-object frame. We examined this claim through a rating study, and the results showed an interaction between the Causation type (extended/onset) and the Object type (mobile/immobile) in the direction we predicted. We propose that a perspective shift on what is moving causes the “extended causation” advantage.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5317/
PDF https://www.aclweb.org/anthology/W16-5317
PWC https://paperswithcode.com/paper/a-study-of-the-bump-alternation-in-japanese
Repo
Framework

The Trouble with Machine Translation Coherence

Title The Trouble with Machine Translation Coherence
Authors Karin Sim Smith, Wilker Aziz, Lucia Specia
Abstract
Tasks Machine Translation
Published 2016-01-01
URL https://www.aclweb.org/anthology/W16-3407/
PDF https://www.aclweb.org/anthology/W16-3407
PWC https://paperswithcode.com/paper/the-trouble-with-machine-translation
Repo
Framework

GhoSt-PV: A Representative Gold Standard of German Particle Verbs

Title GhoSt-PV: A Representative Gold Standard of German Particle Verbs
Authors Stefan Bott, Nana Khvtisavrishvili, Max Kisselew, Sabine Schulte im Walde
Abstract German particle verbs represent a frequent type of multi-word expression that forms a highly productive paradigm in the lexicon. Like other multi-word expressions, particle verbs exhibit various levels of compositionality. One of the major obstacles for the study of compositionality is the lack of representative gold standards of human ratings. In order to address this bottleneck, this paper presents such a gold standard data set containing 400 randomly selected German particle verbs. It is balanced across several particle types and three frequency bands, and accompanied by human ratings on the degree of semantic compositionality.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5318/
PDF https://www.aclweb.org/anthology/W16-5318
PWC https://paperswithcode.com/paper/ghost-pv-a-representative-gold-standard-of
Repo
Framework

Improving Information Extraction from Wikipedia Texts using Basic English

Title Improving Information Extraction from Wikipedia Texts using Basic English
Authors Teresa Rodríguez-Ferreira, Adrián Rabadán, Raquel Hervás, Alberto Díaz
Abstract The aim of this paper is to study the effect that the use of Basic English versus common English has on information extraction from online resources. The amount of online information available to the public grows exponentially, and is potentially an excellent resource for information extraction. The problem is that this information often comes in an unstructured format, such as plain text. In order to retrieve knowledge from this type of text, it must first be analysed to find the relevant details, and the nature of the language used can greatly impact the quality of the extracted information. In this paper, we compare triplets that represent definitions or properties of concepts obtained from three online collaborative resources (English Wikipedia, Simple English Wikipedia and Simple English Wiktionary) and study the differences in the results when Basic English is used instead of common English. The results show that resources written in Basic English produce fewer triplets, but of higher quality.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1062/
PDF https://www.aclweb.org/anthology/L16-1062
PWC https://paperswithcode.com/paper/improving-information-extraction-from
Repo
Framework

Lexfom: a lexical functions ontology model

Title Lexfom: a lexical functions ontology model
Authors Alexsandro Fonseca, Fatiha Sadat, François Lareau
Abstract A lexical function represents a type of relation that exists between lexical units (words or expressions) in any language. For example, antonymy is a type of relation that is represented by the lexical function Anti: Anti(big) = small. Those relations include both paradigmatic relations, i.e. vertical relations, such as synonymy, antonymy and meronymy, and syntagmatic relations, i.e. horizontal relations, such as objective qualification (legitimate demand), subjective qualification (fruitful analysis), positive evaluation (good review) and support verbs (pay a visit, subject to an interrogation). In this paper, we present the Lexical Functions Ontology Model (lexfom) to represent lexical functions and the relations among lexical units. Lexfom is divided into four modules: lexical function representation (lfrep), lexical function family (lffam), lexical function semantic perspective (lfsem) and lexical function relations (lfrel). Moreover, we show how it combines with the Lexical Model for Ontologies (lemon) for the transformation of lexical networks into semantic web formats. So far, we have implemented 100 simple and 500 complex lexical functions, and encoded about 8,000 syntagmatic and 46,000 paradigmatic relations, for the French language.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5320/
PDF https://www.aclweb.org/anthology/W16-5320
PWC https://paperswithcode.com/paper/lexfom-a-lexical-functions-ontology-model
Repo
Framework
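
To make the notion of a lexical function in the Lexfom abstract concrete, here is a small sketch that treats a lexical function as a lookup from a (function, keyword) pair to a set of values, so that Anti(big) = small becomes a table entry. This is not the lexfom ontology or its lemon encoding; the class `LexicalFunctionTable` and its methods are hypothetical, and the name `Oper1` for the support-verb relation is taken from the general lexical-function literature rather than from the abstract.

```python
from collections import defaultdict


class LexicalFunctionTable:
    """Toy store mapping (function name, keyword) to a set of values."""

    def __init__(self):
        self._values = defaultdict(set)

    def add(self, function, keyword, value):
        """Record that `value` is one result of function(keyword)."""
        self._values[(function, keyword)].add(value)

    def apply(self, function, keyword):
        """Return a copy of the values of function(keyword), possibly empty."""
        # .get avoids creating an empty entry for unknown pairs.
        return set(self._values.get((function, keyword), set()))


table = LexicalFunctionTable()
table.add("Anti", "big", "small")    # paradigmatic relation: antonymy
table.add("Oper1", "visit", "pay")   # syntagmatic relation: support verb ("pay a visit")

print(table.apply("Anti", "big"))
print(table.apply("Oper1", "visit"))
```

A set is used for the values because a lexical function can return several lexical units for the same keyword.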

CASSAurus: A Resource of Simpler Spanish Synonyms

Title CASSAurus: A Resource of Simpler Spanish Synonyms
Authors Ricardo Baeza-Yates, Luz Rello, Julia Dembowski
Abstract In this work we introduce and describe a language resource composed of lists of simpler synonyms for Spanish. The synonyms are divided into different senses taken from the Spanish OpenThesaurus, where context disambiguation was performed using statistical information from the Web and Google Books Ngrams. This resource is freely available online and can be used for different NLP tasks such as lexical simplification. Indeed, it has already been integrated into four tools.
Tasks Lexical Simplification
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1151/
PDF https://www.aclweb.org/anthology/L16-1151
PWC https://paperswithcode.com/paper/cassaurus-a-resource-of-simpler-spanish
Repo
Framework

Processing Document Collections to Automatically Extract Linked Data: Semantic Storytelling Technologies for Smart Curation Workflows

Title Processing Document Collections to Automatically Extract Linked Data: Semantic Storytelling Technologies for Smart Curation Workflows
Authors Peter Bourgonje, Julian Moreno Schneider, Georg Rehm, Felix Sasaki
Abstract
Tasks Efficient Exploration, Entity Linking, Machine Translation, Named Entity Recognition, Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3503/
PDF https://www.aclweb.org/anthology/W16-3503
PWC https://paperswithcode.com/paper/processing-document-collections-to
Repo
Framework

Multiword Expressions at the Grammar-Lexicon Interface

Title Multiword Expressions at the Grammar-Lexicon Interface
Authors Timothy Baldwin
Abstract In this talk, I will outline a range of challenges presented by multiword expressions in terms of (lexicalist) precision grammar engineering, and different strategies for accommodating those challenges, in an attempt to strike the right balance in terms of generalisation and over- and under-generation.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3802/
PDF https://www.aclweb.org/anthology/W16-3802
PWC https://paperswithcode.com/paper/multiword-expressions-at-the-grammar-lexicon
Repo
Framework

Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks

Title Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks
Authors Sebastian Schuster, Christopher D. Manning
Abstract Many shallow natural language understanding tasks use dependency trees to extract relations between content words. However, strict surface-structure dependency trees tend to follow the linguistic structure of sentences too closely and frequently fail to provide direct relations between content words. To mitigate this problem, the original Stanford Dependencies representation also defines two dependency graph representations which contain additional and augmented relations that explicitly capture otherwise implicit relations between content words. In this paper, we revisit and extend these dependency graph representations in light of the recent Universal Dependencies (UD) initiative and provide a detailed account of an enhanced and an enhanced++ English UD representation. We further present a converter from constituency to basic, i.e., strict surface structure, UD trees, and a converter from basic UD trees to enhanced and enhanced++ English UD graphs. We release both converters as part of Stanford CoreNLP and the Stanford Parser.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1376/
PDF https://www.aclweb.org/anthology/L16-1376
PWC https://paperswithcode.com/paper/enhanced-english-universal-dependencies-an
Repo
Framework

A Unified Architecture for Semantic Role Labeling and Relation Classification

Title A Unified Architecture for Semantic Role Labeling and Relation Classification
Authors Jiang Guo, Wanxiang Che, Haifeng Wang, Ting Liu, Jun Xu
Abstract This paper describes a unified neural architecture for identifying and classifying multi-typed semantic relations between words in a sentence. We investigate two typical and well-studied tasks: semantic role labeling (SRL) which identifies the relations between predicates and arguments, and relation classification (RC) which focuses on the relation between two entities or nominals. While mostly studied separately in prior work, we show that the two tasks can be effectively connected and modeled using a general architecture. Experiments on CoNLL-2009 benchmark datasets show that our SRL models significantly outperform state-of-the-art approaches. Our RC models also yield competitive performance with the best published records. Furthermore, we show that the two tasks can be trained jointly with multi-task learning, resulting in additive significant improvements for SRL.
Tasks Feature Engineering, Information Retrieval, Multi-Task Learning, Named Entity Recognition, Part-Of-Speech Tagging, Relation Classification, Semantic Role Labeling
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1120/
PDF https://www.aclweb.org/anthology/C16-1120
PWC https://paperswithcode.com/paper/a-unified-architecture-for-semantic-role
Repo
Framework

An Overview of BPPT’s Indonesian Language Resources

Title An Overview of BPPT’s Indonesian Language Resources
Authors Gunarso Gunarso, Hammam Riza
Abstract This paper describes various Indonesian language resources that the Agency for the Assessment and Application of Technology (BPPT) has developed and collected since the mid-'80s, when we joined MMTS (Multilingual Machine Translation System), an international project coordinated by CICC-Japan to develop a machine translation system for five Asian languages (Bahasa Indonesia, Malay, Thai, Japanese, and Chinese). Since then, we have been actively doing many types of research in the fields of statistical machine translation, speech recognition, and speech synthesis, which require many text and speech corpora. The most recent cooperation within ASEAN-IVO, the development of the Indonesian ALT (Asian Language Treebank), has added new NLP tools.
Tasks Machine Translation, Speech Recognition, Speech Synthesis
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5409/
PDF https://www.aclweb.org/anthology/W16-5409
PWC https://paperswithcode.com/paper/an-overview-of-bppts-indonesian-language
Repo
Framework

BCCWJ-DepPara: A Syntactic Annotation Treebank on the ‘Balanced Corpus of Contemporary Written Japanese’

Title BCCWJ-DepPara: A Syntactic Annotation Treebank on the ‘Balanced Corpus of Contemporary Written Japanese’
Authors Masayuki Asahara, Yuji Matsumoto
Abstract Paratactic syntactic structures are difficult to represent in syntactic dependency tree structures. As such, we propose an annotation schema for syntactic dependency annotation of Japanese, in which coordinate structures are split from and overlaid on bunsetsu-based (base phrase unit) dependency. The schema represents nested coordinate structures, non-constituent conjuncts, and forward sharing as the set of regions. The annotation was performed on the core data of the ‘Balanced Corpus of Contemporary Written Japanese’, which comprised about one million words and 1980 samples from six registers, such as newspapers, books, magazines, and web texts.
Tasks Dependency Parsing
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5406/
PDF https://www.aclweb.org/anthology/W16-5406
PWC https://paperswithcode.com/paper/bccwj-deppara-a-syntactic-annotation-treebank
Repo
Framework

Construction and Analysis of a Large Vietnamese Text Corpus

Title Construction and Analysis of a Large Vietnamese Text Corpus
Authors Dieu-Thu Le, Uwe Quasthoff
Abstract This paper presents a new Vietnamese text corpus which contains around 4.05 billion words. It is a collection of Wikipedia texts, newspaper articles and random web texts. The paper describes the process of collecting, cleaning and creating the corpus. Processing Vietnamese text poses several challenges. For example, unlike many languages written in Latin script, Vietnamese does not use blanks to separate words, so common tokenization approaches, such as treating blanks as word boundaries, do not work. A short review of different approaches to Vietnamese tokenization is presented, together with how the corpus has been processed and created. After that, some statistical analysis of this data is reported, including the number of syllables, average word length, sentence length and topic analysis. The corpus is integrated into a framework which allows searching and browsing. Using this web interface, users can find out how many times a particular word appears in the corpus, sample sentences where this word occurs, and its left and right neighbors.
Tasks Tokenization
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1065/
PDF https://www.aclweb.org/anthology/L16-1065
PWC https://paperswithcode.com/paper/construction-and-analysis-of-a-large
Repo
Framework
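
The Vietnamese corpus abstract points out that blanks do not mark word boundaries in Vietnamese (they separate syllables, and a word may span several of them). The abstract does not say which tokenizer the authors used, so the sketch below only illustrates the problem with one simple dictionary-based technique, greedy longest matching. The function `longest_match_tokenize`, the two-entry lexicon, and the example sentence are hypothetical.

```python
def longest_match_tokenize(text, lexicon, max_syllables=4):
    """Greedy longest-match word segmentation over whitespace-separated syllables.

    At each position the longest syllable sequence found in `lexicon` is
    taken as one word; a single syllable is the fallback.
    """
    syllables = text.split()
    tokens = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_syllables, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += n
                break
    return tokens


# Hypothetical mini-lexicon of multi-syllable words.
lexicon = {"sinh viên", "tiếng Việt"}   # "student", "Vietnamese (language)"
sentence = "sinh viên học tiếng Việt"   # "students study Vietnamese"

naive_tokens = sentence.split()                      # splits on blanks: syllables
tokens = longest_match_tokenize(sentence, lexicon)   # groups syllables into words
print(naive_tokens)
print(tokens)
```

Greedy matching cannot resolve genuinely ambiguous syllable sequences, which is one reason statistical and learned segmenters exist for Vietnamese.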