Paper Group NANR 228
A Large DataBase of Hypernymy Relations Extracted from the Web.. How Regular is Japanese Loanword Adaptation? A Computational Study. Lifelong Learning with Weighted Majority Votes. Social and linguistic behavior and its correlation to trait empathy. Synthesis of MCMC and Belief Propagation. Sentiment Analysis in Social Networks through Topic modeli …
A Large DataBase of Hypernymy Relations Extracted from the Web.
Title | A Large DataBase of Hypernymy Relations Extracted from the Web. |
Authors | Julian Seitner, Christian Bizer, Kai Eckert, Stefano Faralli, Robert Meusel, Heiko Paulheim, Simone Paolo Ponzetto |
Abstract | Hypernymy relations (those where a hyponym term shares an "isa" relationship with its hypernym) play a key role for many Natural Language Processing (NLP) tasks, e.g. ontology learning, automatically building or extending knowledge bases, or word sense disambiguation and induction. In fact, such relations may provide the basis for the construction of more complex structures such as taxonomies, or be used as effective background knowledge for many word understanding applications. We present a publicly available database containing more than 400 million hypernymy relations we extracted from the CommonCrawl web corpus. We describe the infrastructure we developed to iterate over the web corpus for extracting the hypernymy relations and store them effectively into a large database. This collection of relations represents a rich source of knowledge and may be useful for many researchers. We offer the tuple dataset for public download and an Application Programming Interface (API) to help other researchers programmatically query the database. | |
Tasks | Word Sense Disambiguation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1056/ |
https://www.aclweb.org/anthology/L16-1056 | |
PWC | https://paperswithcode.com/paper/a-large-database-of-hypernymy-relations |
Repo | |
Framework | |
How Regular is Japanese Loanword Adaptation? A Computational Study
Title | How Regular is Japanese Loanword Adaptation? A Computational Study |
Authors | Lingshuang Mao, Mans Hulden |
Abstract | The modifications that foreign loanwords undergo when adapted into Japanese have been the subject of much study in linguistics. The scholarly interest of the topic can be attributed to the fact that Japanese loanwords undergo a complex series of phonological adaptations, something which has been puzzling scholars for decades. While previous studies of Japanese loanword accommodation have focused on specific phonological phenomena of limited scope, the current study leverages computational methods to provide a more complete description of all the sound changes that occur when adopting English words into Japanese. To investigate this, we have developed a parallel corpus of 250 English transcriptions and their respective Japanese equivalents. These words were then used to develop a wide-coverage finite state transducer based phonological grammar that mimics the behavior of the Japanese adaptation process. By developing rules with the goal of accounting completely for a large number of borrowings and analyzing forms mistakenly generated by the system, we discovered an internal inconsistency inside the loanword phonology of the Japanese language, something arguably underestimated by previous studies. The result of the investigation suggests that there are multiple 'dimensions' that shape the output form of the current Japanese loanwords. These dimensions include orthography, phonetics, and historical changes. | |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1081/ |
https://www.aclweb.org/anthology/C16-1081 | |
PWC | https://paperswithcode.com/paper/how-regular-is-japanese-loanword-adaptation-a |
Repo | |
Framework | |
Lifelong Learning with Weighted Majority Votes
Title | Lifelong Learning with Weighted Majority Votes |
Authors | Anastasia Pentina, Ruth Urner |
Abstract | Better understanding of the potential benefits of information transfer and representation learning is an important step towards the goal of building intelligent systems that are able to persist in the world and learn over time. In this work, we consider a setting where the learner encounters a stream of tasks but is able to retain only limited information from each encountered task, such as a learned predictor. In contrast to most previous works analyzing this scenario, we do not make any distributional assumptions on the task generating process. Instead, we formulate a complexity measure that captures the diversity of the observed tasks. We provide a lifelong learning algorithm with error guarantees for every observed task (rather than on average). We show sample complexity reductions in comparison to solving every task in isolation in terms of our task complexity measure. Further, our algorithmic framework can naturally be viewed as learning a representation from encountered tasks with a neural network. |
Tasks | Representation Learning |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6095-lifelong-learning-with-weighted-majority-votes |
http://papers.nips.cc/paper/6095-lifelong-learning-with-weighted-majority-votes.pdf | |
PWC | https://paperswithcode.com/paper/lifelong-learning-with-weighted-majority |
Repo | |
Framework | |
Social and linguistic behavior and its correlation to trait empathy
Title | Social and linguistic behavior and its correlation to trait empathy |
Authors | Marina Litvak, Jahna Otterbacher, Chee Siang Ang, David Atkins |
Abstract | A growing body of research exploits social media behaviors to gauge psychological characteristics, though trait empathy has received little attention. Because of its intimate link to the ability to relate to others, our research aims to predict participants' levels of empathy, given their textual and friending behaviors on Facebook. Using Poisson regression, we compared the variance explained in Davis' Interpersonal Reactivity Index (IRI) scores on four constructs (empathic concern, personal distress, fantasy, perspective taking), by two classes of variables: 1) post content and 2) linguistic style. Our study lays the groundwork for a greater understanding of empathy's role in facilitating interactions on social media. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4314/ |
https://www.aclweb.org/anthology/W16-4314 | |
PWC | https://paperswithcode.com/paper/social-and-linguistic-behavior-and-its |
Repo | |
Framework | |
Synthesis of MCMC and Belief Propagation
Title | Synthesis of MCMC and Belief Propagation |
Authors | Sung-Soo Ahn, Michael Chertkov, Jinwoo Shin |
Abstract | Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast and empirically very successful, but in general lacks control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach, which allows the BP error to be expressed as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial-time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we, first, propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6318-synthesis-of-mcmc-and-belief-propagation |
http://papers.nips.cc/paper/6318-synthesis-of-mcmc-and-belief-propagation.pdf | |
PWC | https://paperswithcode.com/paper/synthesis-of-mcmc-and-belief-propagation |
Repo | |
Framework | |
Sentiment Analysis in Social Networks through Topic modeling
Title | Sentiment Analysis in Social Networks through Topic modeling |
Authors | Debashis Naskar, Sidahmed Mokaddem, Miguel Rebollo, Eva Onaindia |
Abstract | In this paper, we analyze the sentiments derived from the conversations that occur in social networks. Our goal is to identify the sentiments of the users in the social network through their conversations. We conduct a study to determine whether users of social networks (Twitter in particular) tend to gather together according to the likeness of their sentiments. In our proposed framework, (1) we use ANEW, a lexical dictionary, to identify affective emotional feelings associated with a message according to Russell's model of affect; (2) we design a topic modeling mechanism called Sent_LDA, based on the Latent Dirichlet Allocation (LDA) generative model, which allows us to find the topic distribution in a general conversation and we associate topics with emotions; (3) we detect communities in the network according to the density and frequency of the messages among the users; and (4) we compare the sentiments of the communities by using Russell's model of affect versus polarity and we measure the extent to which topic distribution strengthens likeness in the sentiments of the users of a community. This work contributes a topic modeling methodology to analyze the sentiments in conversations that take place in social networks. |
Tasks | Sentiment Analysis |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1008/ |
https://www.aclweb.org/anthology/L16-1008 | |
PWC | https://paperswithcode.com/paper/sentiment-analysis-in-social-networks-through |
Repo | |
Framework | |
NLP and Public Engagement: The Case of the Italian School Reform
Title | NLP and Public Engagement: The Case of the Italian School Reform |
Authors | Tommaso Caselli, Giovanni Moretti, Rachele Sprugnoli, Sara Tonelli, Damien Lanfrey, Donatella Solda Kutzmann |
Abstract | In this paper we present PIERINO (PIattaforma per l'Estrazione e il Recupero di INformazione Online), a system that was implemented in collaboration with the Italian Ministry of Education, University and Research to analyse the citizens' comments given in the #labuonascuola survey. The platform includes various levels of automatic analysis such as key-concept extraction and word co-occurrences. Each analysis is displayed through an intuitive view using different types of visualizations, for example radar charts and sunburst. PIERINO was effectively used to support shaping the last Italian school reform, proving the potential of NLP in the context of policy making. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1063/ |
https://www.aclweb.org/anthology/L16-1063 | |
PWC | https://paperswithcode.com/paper/nlp-and-public-engagement-the-case-of-the |
Repo | |
Framework | |
The Kyutech corpus and topic segmentation using a combined method
Title | The Kyutech corpus and topic segmentation using a combined method |
Authors | Takashi Yamamura, Kazutaka Shimada, Shintaro Kawahara |
Abstract | Summarization of multi-party conversation is one of the important tasks in natural language processing. In this paper, we explain a Japanese corpus and a topic segmentation task. To the best of our knowledge, the corpus is the first Japanese corpus annotated for summarization tasks and freely available to anyone. We call it "the Kyutech corpus." The task of the corpus is a decision-making task with four participants and it contains utterances with time information, topic segmentation and reference summaries. As a case study for the corpus, we describe a method combined with LCSeg and TopicTiling for a topic segmentation task. We discuss the effectiveness and the problems of the combined method through the experiment with the Kyutech corpus. | |
Tasks | Decision Making |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5412/ |
https://www.aclweb.org/anthology/W16-5412 | |
PWC | https://paperswithcode.com/paper/the-kyutech-corpus-and-topic-segmentation |
Repo | |
Framework | |
The Cloud of Knowing: Non-factive al-ta ‘know’ (as a Neg-raiser) in Korean
Title | The Cloud of Knowing: Non-factive al-ta ‘know’ (as a Neg-raiser) in Korean |
Authors | Chungmin Lee, Seungjin Hong |
Abstract | |
Tasks | Rumour Detection |
Published | 2016-10-01 |
URL | https://www.aclweb.org/anthology/Y16-3026/ |
https://www.aclweb.org/anthology/Y16-3026 | |
PWC | https://paperswithcode.com/paper/the-cloud-of-knowing-non-factive-al-ta-aknowa |
Repo | |
Framework | |
Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter
Title | Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter |
Authors | Michal Lukasik, P. K. Srijith, Duy Vu, Kalina Bontcheva, Arkaitz Zubiaga, Trevor Cohn |
Abstract | |
Tasks | Rumour Detection, Sentiment Analysis |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2064/ |
https://www.aclweb.org/anthology/P16-2064 | |
PWC | https://paperswithcode.com/paper/hawkes-processes-for-continuous-time-sequence |
Repo | |
Framework | |
Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network
Title | Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network |
Authors | Anh Tuan Luu, Yi Tay, Siu Cheung Hui, See Kiong Ng |
Abstract | |
Tasks | Learning Word Embeddings, Machine Translation, Question Answering, Sentiment Analysis, Word Embeddings |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1039/ |
https://www.aclweb.org/anthology/D16-1039 | |
PWC | https://paperswithcode.com/paper/learning-term-embeddings-for-taxonomic |
Repo | |
Framework | |
Improving the Morphological Analysis of Classical Sanskrit
Title | Improving the Morphological Analysis of Classical Sanskrit |
Authors | Oliver Hellwig |
Abstract | The paper describes a new tagset for the morphological disambiguation of Sanskrit, and compares the accuracy of two machine learning methods (Conditional Random Fields, deep recurrent neural networks) for this task, with a special focus on how to model the lexicographic information. It reports a significant improvement over previously published results. |
Tasks | Lemmatization, Morphological Analysis, Tokenization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3715/ |
https://www.aclweb.org/anthology/W16-3715 | |
PWC | https://paperswithcode.com/paper/improving-the-morphological-analysis-of |
Repo | |
Framework | |
Comprehensive and Consistent PropBank Light Verb Annotation
Title | Comprehensive and Consistent PropBank Light Verb Annotation |
Authors | Claire Bonial, Martha Palmer |
Abstract | Recent efforts have focused on expanding the annotation coverage of PropBank from verb relations to adjective and noun relations, as well as light verb constructions (e.g., make an offer, take a bath). While each new relation type has presented unique annotation challenges, ensuring consistent and comprehensive annotation of light verb constructions has proved particularly challenging, given that light verb constructions are semi-productive, difficult to define, and there are often borderline cases. This research describes the iterative process of developing PropBank annotation guidelines for light verb constructions, the current guidelines, and a comparison to related resources. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1628/ |
https://www.aclweb.org/anthology/L16-1628 | |
PWC | https://paperswithcode.com/paper/comprehensive-and-consistent-propbank-light |
Repo | |
Framework | |
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
Title | A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories |
Authors | Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, James Allen |
Abstract | |
Tasks | Question Answering, Text Summarization |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1098/ |
https://www.aclweb.org/anthology/N16-1098 | |
PWC | https://paperswithcode.com/paper/a-corpus-and-cloze-evaluation-for-deeper |
Repo | |
Framework | |
PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits
Title | PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits |
Authors | Maja Popović, Mihael Arčan |
Abstract | We present a freely available corpus containing source language texts from different domains along with their automatically generated translations into several distinct morphologically rich languages, their post-edited versions, and error annotations of the performed post-edit operations. We believe that the corpus will be useful for many different applications. The main advantage of the approach used for creation of the corpus is the fusion of post-editing and error classification tasks, which have usually been seen as two independent tasks, although naturally they are not. We also show benefits of coupling automatic and manual error classification which facilitates the complex manual error annotation task as well as the development of automatic error classification tools. In addition, the approach facilitates annotation of language pair related issues. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1005/ |
https://www.aclweb.org/anthology/L16-1005 | |
PWC | https://paperswithcode.com/paper/pe2rr-corpus-manual-error-annotation-of |
Repo | |
Framework | |