May 4, 2019


Paper Group NANR 228

A Large DataBase of Hypernymy Relations Extracted from the Web.

Title A Large DataBase of Hypernymy Relations Extracted from the Web.
Authors Julian Seitner, Christian Bizer, Kai Eckert, Stefano Faralli, Robert Meusel, Heiko Paulheim, Simone Paolo Ponzetto
Abstract Hypernymy relations (those where a hyponym term shares an "isa" relationship with its hypernym) play a key role in many Natural Language Processing (NLP) tasks, e.g. ontology learning, automatically building or extending knowledge bases, or word sense disambiguation and induction. In fact, such relations may provide the basis for the construction of more complex structures such as taxonomies, or be used as effective background knowledge for many word understanding applications. We present a publicly available database containing more than 400 million hypernymy relations we extracted from the CommonCrawl web corpus. We describe the infrastructure we developed to iterate over the web corpus for extracting the hypernymy relations and store them effectively in a large database. This collection of relations represents a rich source of knowledge and may be useful for many researchers. We offer the tuple dataset for public download and an Application Programming Interface (API) to help other researchers programmatically query the database.
Tasks Word Sense Disambiguation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1056/
PDF https://www.aclweb.org/anthology/L16-1056
PWC https://paperswithcode.com/paper/a-large-database-of-hypernymy-relations
Repo
Framework
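
To make the idea of a hypernymy tuple concrete, here is a minimal sketch of pulling (hyponym, hypernym) pairs out of a sentence. The abstract does not say which extraction rules the authors used, so the single "such as" pattern, the `PATTERN` regex, and the `extract_isa` function are all assumptions made for illustration, not the paper's pipeline.

```python
import re

# Hypothetical pattern (an assumption, not taken from the paper):
# "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
PATTERN = re.compile(
    r"\b(\w+) such as "
    r"(\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)

# Separators between listed hyponyms: commas, "and", "or".
SEPARATOR = re.compile(r"\s*,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_isa(sentence):
    """Return (hyponym, hypernym) tuples found by the "such as" pattern."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        hypernym = match.group(1)
        for hyponym in SEPARATOR.split(match.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    print(extract_isa("He likes fruits such as apples, pears and plums."))
```

A single pattern like this only handles one-word terms and would be noisy on real web text; a database of the scale described above would presumably need many more patterns plus filtering.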

How Regular is Japanese Loanword Adaptation? A Computational Study

Title How Regular is Japanese Loanword Adaptation? A Computational Study
Authors Lingshuang Mao, Mans Hulden
Abstract The modifications that foreign loanwords undergo when adapted into Japanese have been the subject of much study in linguistics. The scholarly interest of the topic can be attributed to the fact that Japanese loanwords undergo a complex series of phonological adaptations, something which has been puzzling scholars for decades. While previous studies of Japanese loanword accommodation have focused on specific phonological phenomena of limited scope, the current study leverages computational methods to provide a more complete description of all the sound changes that occur when adopting English words into Japanese. To investigate this, we have developed a parallel corpus of 250 English transcriptions and their respective Japanese equivalents. These words were then used to develop a wide-coverage finite state transducer based phonological grammar that mimics the behavior of the Japanese adaptation process. By developing rules with the goal of accounting completely for a large number of borrowings and analyzing forms mistakenly generated by the system, we discovered an internal inconsistency inside the loanword phonology of the Japanese language, something arguably underestimated by previous studies. The result of the investigation suggests that there are multiple 'dimensions' that shape the output form of the current Japanese loanwords. These dimensions include orthography, phonetics, and historical changes.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1081/
PDF https://www.aclweb.org/anthology/C16-1081
PWC https://paperswithcode.com/paper/how-regular-is-japanese-loanword-adaptation-a
Repo
Framework

Lifelong Learning with Weighted Majority Votes

Title Lifelong Learning with Weighted Majority Votes
Authors Anastasia Pentina, Ruth Urner
Abstract Better understanding of the potential benefits of information transfer and representation learning is an important step towards the goal of building intelligent systems that are able to persist in the world and learn over time. In this work, we consider a setting where the learner encounters a stream of tasks but is able to retain only limited information from each encountered task, such as a learned predictor. In contrast to most previous works analyzing this scenario, we do not make any distributional assumptions on the task generating process. Instead, we formulate a complexity measure that captures the diversity of the observed tasks. We provide a lifelong learning algorithm with error guarantees for every observed task (rather than on average). We show sample complexity reductions in comparison to solving every task in isolation in terms of our task complexity measure. Further, our algorithmic framework can naturally be viewed as learning a representation from encountered tasks with a neural network.
Tasks Representation Learning
Published 2016-12-01
URL http://papers.nips.cc/paper/6095-lifelong-learning-with-weighted-majority-votes
PDF http://papers.nips.cc/paper/6095-lifelong-learning-with-weighted-majority-votes.pdf
PWC https://paperswithcode.com/paper/lifelong-learning-with-weighted-majority
Repo
Framework
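
As a rough illustration of the weighted majority vote named in the title above, the sketch below combines several binary predictors by summing their weighted votes and taking the sign. It is not the authors' algorithm: the threshold predictors, the weights, and the `weighted_majority_vote` name are hypothetical and only show the basic mechanism.

```python
from typing import Callable, Sequence

# A predictor maps an input to a label in {-1, +1}.
Predictor = Callable[[float], int]


def weighted_majority_vote(
    predictors: Sequence[Predictor],
    weights: Sequence[float],
    x: float,
) -> int:
    """Return +1 if the weighted sum of votes is non-negative, else -1."""
    if len(predictors) != len(weights):
        raise ValueError("need exactly one weight per predictor")
    score = sum(w * h(x) for h, w in zip(predictors, weights))
    return 1 if score >= 0 else -1


# Hypothetical predictors, e.g. ones retained from earlier tasks.
predictors = [
    lambda x: 1 if x > 0 else -1,
    lambda x: 1 if x > 2 else -1,
    lambda x: 1 if x > -1 else -1,
]
weights = [0.5, 0.3, 0.2]  # illustrative values only

if __name__ == "__main__":
    for x in (-2.0, -0.5, 1.0, 3.0):
        print(x, weighted_majority_vote(predictors, weights, x))
```

In this toy setup a new task would only need to learn the weights over stored predictors rather than a predictor from scratch, which is one intuitive reading of the sample complexity reduction the abstract mentions.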

Social and linguistic behavior and its correlation to trait empathy

Title Social and linguistic behavior and its correlation to trait empathy
Authors Marina Litvak, Jahna Otterbacher, Chee Siang Ang, David Atkins
Abstract A growing body of research exploits social media behaviors to gauge psychological characteristics, though trait empathy has received little attention. Because of its intimate link to the ability to relate to others, our research aims to predict participants' levels of empathy, given their textual and friending behaviors on Facebook. Using Poisson regression, we compared the variance explained in Davis' Interpersonal Reactivity Index (IRI) scores on four constructs (empathic concern, personal distress, fantasy, perspective taking), by two classes of variables: 1) post content and 2) linguistic style. Our study lays the groundwork for a greater understanding of empathy's role in facilitating interactions on social media.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4314/
PDF https://www.aclweb.org/anthology/W16-4314
PWC https://paperswithcode.com/paper/social-and-linguistic-behavior-and-its
Repo
Framework

Synthesis of MCMC and Belief Propagation

Title Synthesis of MCMC and Belief Propagation
Authors Sung-Soo Ahn, Michael Chertkov, Jinwoo Shin
Abstract Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast and empirically very successful, but in general lacks control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach, which allows the BP error to be expressed as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we first propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on a transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6318-synthesis-of-mcmc-and-belief-propagation
PDF http://papers.nips.cc/paper/6318-synthesis-of-mcmc-and-belief-propagation.pdf
PWC https://paperswithcode.com/paper/synthesis-of-mcmc-and-belief-propagation
Repo
Framework

Sentiment Analysis in Social Networks through Topic modeling

Title Sentiment Analysis in Social Networks through Topic modeling
Authors Debashis Naskar, Sidahmed Mokaddem, Miguel Rebollo, Eva Onaindia
Abstract In this paper, we analyze the sentiments derived from the conversations that occur in social networks. Our goal is to identify the sentiments of the users in the social network through their conversations. We conduct a study to determine whether users of social networks (Twitter in particular) tend to gather together according to the likeness of their sentiments. In our proposed framework, (1) we use ANEW, a lexical dictionary, to identify affective emotional feelings associated with a message according to Russell's model of affect; (2) we design a topic modeling mechanism called Sent_LDA, based on the Latent Dirichlet Allocation (LDA) generative model, which allows us to find the topic distribution in a general conversation and we associate topics with emotions; (3) we detect communities in the network according to the density and frequency of the messages among the users; and (4) we compare the sentiments of the communities by using Russell's model of affect versus polarity and we measure the extent to which topic distribution strengthens likeness in the sentiments of the users of a community. This work contributes a topic modeling methodology to analyze the sentiments in conversations that take place in social networks.
Tasks Sentiment Analysis
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1008/
PDF https://www.aclweb.org/anthology/L16-1008
PWC https://paperswithcode.com/paper/sentiment-analysis-in-social-networks-through
Repo
Framework

NLP and Public Engagement: The Case of the Italian School Reform

Title NLP and Public Engagement: The Case of the Italian School Reform
Authors Tommaso Caselli, Giovanni Moretti, Rachele Sprugnoli, Sara Tonelli, Damien Lanfrey, Donatella Solda Kutzmann
Abstract In this paper we present PIERINO (PIattaforma per l'Estrazione e il Recupero di INformazione Online), a system that was implemented in collaboration with the Italian Ministry of Education, University and Research to analyse the citizens' comments given in the #labuonascuola survey. The platform includes various levels of automatic analysis such as key-concept extraction and word co-occurrences. Each analysis is displayed through an intuitive view using different types of visualizations, for example radar charts and sunburst. PIERINO was effectively used to support shaping the last Italian school reform, proving the potential of NLP in the context of policy making.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1063/
PDF https://www.aclweb.org/anthology/L16-1063
PWC https://paperswithcode.com/paper/nlp-and-public-engagement-the-case-of-the
Repo
Framework

The Kyutech corpus and topic segmentation using a combined method

Title The Kyutech corpus and topic segmentation using a combined method
Authors Takashi Yamamura, Kazutaka Shimada, Shintaro Kawahara
Abstract Summarization of multi-party conversation is one of the important tasks in natural language processing. In this paper, we explain a Japanese corpus and a topic segmentation task. To the best of our knowledge, the corpus is the first Japanese corpus annotated for summarization tasks and freely available to anyone. We call it "the Kyutech corpus." The task of the corpus is a decision-making task with four participants and it contains utterances with time information, topic segmentation and reference summaries. As a case study for the corpus, we describe a method that combines LCSeg and TopicTiling for a topic segmentation task. We discuss the effectiveness and the problems of the combined method through the experiment with the Kyutech corpus.
Tasks Decision Making
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5412/
PDF https://www.aclweb.org/anthology/W16-5412
PWC https://paperswithcode.com/paper/the-kyutech-corpus-and-topic-segmentation
Repo
Framework

The Cloud of Knowing: Non-factive al-ta ‘know’ (as a Neg-raiser) in Korean

Title The Cloud of Knowing: Non-factive al-ta ‘know’ (as a Neg-raiser) in Korean
Authors Chungmin Lee, Seungjin Hong
Abstract
Tasks Rumour Detection
Published 2016-10-01
URL https://www.aclweb.org/anthology/Y16-3026/
PDF https://www.aclweb.org/anthology/Y16-3026
PWC https://paperswithcode.com/paper/the-cloud-of-knowing-non-factive-al-ta-aknowa
Repo
Framework

Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter

Title Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter
Authors Michal Lukasik, P. K. Srijith, Duy Vu, Kalina Bontcheva, Arkaitz Zubiaga, Trevor Cohn
Abstract
Tasks Rumour Detection, Sentiment Analysis
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2064/
PDF https://www.aclweb.org/anthology/P16-2064
PWC https://paperswithcode.com/paper/hawkes-processes-for-continuous-time-sequence
Repo
Framework

Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network

Title Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network
Authors Anh Tuan Luu, Yi Tay, Siu Cheung Hui, See Kiong Ng
Abstract
Tasks Learning Word Embeddings, Machine Translation, Question Answering, Sentiment Analysis, Word Embeddings
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1039/
PDF https://www.aclweb.org/anthology/D16-1039
PWC https://paperswithcode.com/paper/learning-term-embeddings-for-taxonomic
Repo
Framework

Improving the Morphological Analysis of Classical Sanskrit

Title Improving the Morphological Analysis of Classical Sanskrit
Authors Oliver Hellwig
Abstract The paper describes a new tagset for the morphological disambiguation of Sanskrit, and compares the accuracy of two machine learning methods (Conditional Random Fields, deep recurrent neural networks) for this task, with a special focus on how to model the lexicographic information. It reports a significant improvement over previously published results.
Tasks Lemmatization, Morphological Analysis, Tokenization
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3715/
PDF https://www.aclweb.org/anthology/W16-3715
PWC https://paperswithcode.com/paper/improving-the-morphological-analysis-of
Repo
Framework

Comprehensive and Consistent PropBank Light Verb Annotation

Title Comprehensive and Consistent PropBank Light Verb Annotation
Authors Claire Bonial, Martha Palmer
Abstract Recent efforts have focused on expanding the annotation coverage of PropBank from verb relations to adjective and noun relations, as well as light verb constructions (e.g., make an offer, take a bath). While each new relation type has presented unique annotation challenges, ensuring consistent and comprehensive annotation of light verb constructions has proved particularly challenging, given that light verb constructions are semi-productive, difficult to define, and there are often borderline cases. This research describes the iterative process of developing PropBank annotation guidelines for light verb constructions, the current guidelines, and a comparison to related resources.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1628/
PDF https://www.aclweb.org/anthology/L16-1628
PWC https://paperswithcode.com/paper/comprehensive-and-consistent-propbank-light
Repo
Framework

A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories

Title A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
Authors Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, James Allen
Abstract
Tasks Question Answering, Text Summarization
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1098/
PDF https://www.aclweb.org/anthology/N16-1098
PWC https://paperswithcode.com/paper/a-corpus-and-cloze-evaluation-for-deeper
Repo
Framework

PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits

Title PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits
Authors Maja Popović, Mihael Arčan
Abstract We present a freely available corpus containing source language texts from different domains along with their automatically generated translations into several distinct morphologically rich languages, their post-edited versions, and error annotations of the performed post-edit operations. We believe that the corpus will be useful for many different applications. The main advantage of the approach used for creation of the corpus is the fusion of post-editing and error classification tasks, which have usually been seen as two independent tasks, although naturally they are not. We also show benefits of coupling automatic and manual error classification which facilitates the complex manual error annotation task as well as the development of automatic error classification tools. In addition, the approach facilitates annotation of language pair related issues.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1005/
PDF https://www.aclweb.org/anthology/L16-1005
PWC https://paperswithcode.com/paper/pe2rr-corpus-manual-error-annotation-of
Repo
Framework