May 4, 2019

Paper Group NANR 172

Keynote Lecture 2: Neural (and other Machine Learning) Approaches to Text Normalization. Visualization of Dynamic Reference Graphs. Context Tailoring for Text Normalization. The Limits of Learning with Missing Data. Unsupervised Timeline Generation for Wikipedia History Articles. Tuning Bayes Baseline for Dialect Detection. Learning and Forecasting …

Keynote Lecture 2: Neural (and other Machine Learning) Approaches to Text Normalization


Title	Keynote Lecture 2: Neural (and other Machine Learning) Approaches to Text Normalization
Authors	Richard Sproat
Abstract
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-6323/
PDF	https://www.aclweb.org/anthology/W16-6323
PWC	https://paperswithcode.com/paper/keynote-lecture-2-neural-and-other-machine
Repo
Framework

Visualization of Dynamic Reference Graphs


Title	Visualization of Dynamic Reference Graphs
Authors	Ivan Rodin, Ekaterina Chernyak, Mikhail Dubov, Boris Mirkin
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1406/
PDF	https://www.aclweb.org/anthology/W16-1406
PWC	https://paperswithcode.com/paper/visualization-of-dynamic-reference-graphs
Repo
Framework

Context Tailoring for Text Normalization


Title	Context Tailoring for Text Normalization
Authors	Seniz Demir
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1402/
PDF	https://www.aclweb.org/anthology/W16-1402
PWC	https://paperswithcode.com/paper/context-tailoring-for-text-normalization
Repo
Framework

The Limits of Learning with Missing Data


Title	The Limits of Learning with Missing Data
Authors	Brian Bullins, Elad Hazan, Tomer Koren
Abstract	We study regression and classification in a setting where the learning algorithm is allowed to access only a limited number of attributes per example, known as the limited attribute observation model. In this well-studied model, we provide the first lower bounds giving a limit on the precision attainable by any algorithm for several variants of regression, notably linear regression with the absolute loss and the squared loss, as well as for classification with the hinge loss. We complement these lower bounds with a general purpose algorithm that gives an upper bound on the achievable precision limit in the setting of learning with missing data.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6171-the-limits-of-learning-with-missing-data
PDF	http://papers.nips.cc/paper/6171-the-limits-of-learning-with-missing-data.pdf
PWC	https://paperswithcode.com/paper/the-limits-of-learning-with-missing-data
Repo
Framework
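
The limited attribute observation model named in the abstract above can be illustrated with a small sketch. This is not the paper's algorithm: the function names `observe_attributes` and `unbiased_estimate`, the uniform sampling scheme, and the example vector are all assumptions made for illustration. The sketch only shows the standard idea that rescaling k revealed attributes by d/k gives an estimate equal to the full example in expectation.

```python
import random


def observe_attributes(x, k, rng):
    """Reveal only k of the d attributes of x, chosen uniformly without replacement.

    Hypothetical helper: models the learner's restricted view of one example.
    """
    d = len(x)
    idx = rng.sample(range(d), k)
    return {i: x[i] for i in idx}


def unbiased_estimate(observed, d, k):
    """Rescale the revealed attributes by d/k and fill the rest with zeros.

    Each attribute is revealed with probability k/d, so the expectation of
    the returned vector equals the original example.
    """
    est = [0.0] * d
    for i, v in observed.items():
        est[i] = v * d / k
    return est


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, -2.0, 0.5, 3.0]  # made-up example, not data from the paper
    k = 2
    trials = 20000
    mean = [0.0] * len(x)
    for _ in range(trials):
        est = unbiased_estimate(observe_attributes(x, k, rng), len(x), k)
        mean = [m + e / trials for m, e in zip(mean, est)]
    # The running mean should approach x as the number of trials grows.
    print(mean)
```

A single estimate is noisy, since half of its entries are zero when k = d/2. Limits on attainable precision of the kind the abstract describes arise in settings like this, where only such partial views are available.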

Unsupervised Timeline Generation for Wikipedia History Articles


Title	Unsupervised Timeline Generation for Wikipedia History Articles
Authors	Sandro Bauer, Simone Teufel
Abstract
Tasks
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1259/
PDF	https://www.aclweb.org/anthology/D16-1259
PWC	https://paperswithcode.com/paper/unsupervised-timeline-generation-for
Repo
Framework

Tuning Bayes Baseline for Dialect Detection


Title	Tuning Bayes Baseline for Dialect Detection
Authors	Hector-Hugo Franco-Penya, Liliana Mamani Sanchez
Abstract	This paper describes an analysis of our submissions to the Dialect Detection Shared Task 2016. We proposed three different systems that involved simplistic features, to name: a Naive-bayes system, a Support Vector Machines-based system and a Tree Kernel-based system. These systems underperform when compared to other submissions in this shared task, since the best one achieved an accuracy of ~0.834.
Tasks	Domain Adaptation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4829/
PDF	https://www.aclweb.org/anthology/W16-4829
PWC	https://paperswithcode.com/paper/tuning-bayes-baseline-for-dialect-detection
Repo
Framework

Learning and Forecasting Opinion Dynamics in Social Networks


Title	Learning and Forecasting Opinion Dynamics in Social Networks
Authors	Abir De, Isabel Valera, Niloy Ganguly, Sourangshu Bhattacharya, Manuel Gomez Rodriguez
Abstract	Social media and social networking sites have become a global pinboard for exposition and discussion of news, topics, and ideas, where social media users often update their opinions about a particular topic by learning from the opinions shared by their friends. In this context, can we learn a data-driven model of opinion dynamics that is able to accurately forecast users’ opinions? In this paper, we introduce SLANT, a probabilistic modeling framework of opinion dynamics, which represents users’ opinions over time by means of marked jump diffusion stochastic differential equations, and allows for efficient model simulation and parameter estimation from historical fine grained event data. We then leverage our framework to derive a set of efficient predictive formulas for opinion forecasting and identify conditions under which opinions converge to a steady state. Experiments on data gathered from Twitter show that our model provides a good fit to the data and our formulas achieve more accurate forecasting than alternatives.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6193-learning-and-forecasting-opinion-dynamics-in-social-networks
PDF	http://papers.nips.cc/paper/6193-learning-and-forecasting-opinion-dynamics-in-social-networks.pdf
PWC	https://paperswithcode.com/paper/learning-and-forecasting-opinion-dynamics-in
Repo
Framework

Learning Text Similarity with Siamese Recurrent Networks


Title	Learning Text Similarity with Siamese Recurrent Networks
Authors	Paul Neculoiu, Maarten Versteegh, Mihai Rotaru
Abstract
Tasks	Recommendation Systems, Representation Learning, Semantic Textual Similarity, Sentiment Analysis
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1617/
PDF	https://www.aclweb.org/anthology/W16-1617
PWC	https://paperswithcode.com/paper/learning-text-similarity-with-siamese
Repo
Framework

Building the Macedonian-Croatian Parallel Corpus


Title	Building the Macedonian-Croatian Parallel Corpus
Authors	Ines Cebović, Marko Tadić
Abstract	In this paper we present the newly created parallel corpus of two under-resourced languages, namely, Macedonian-Croatian Parallel Corpus (mk-hr_pcorp) that has been collected during 2015 at the Faculty of Humanities and Social Sciences, University of Zagreb. The mk-hr_pcorp is a unidirectional (mk→hr) parallel corpus composed of synchronic fictional prose texts received already in digital form with over 500 Kw in each language. The corpus was sentence segmented and provides 39,735 aligned sentences. The alignment was done automatically and then post-corrected manually. The alignments order was shuffled and this enabled the corpus to be available under CC-BY license through META-SHARE. However, this prevents the research in language units over the sentence level.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1671/
PDF	https://www.aclweb.org/anthology/L16-1671
PWC	https://paperswithcode.com/paper/building-the-macedonian-croatian-parallel
Repo
Framework

Crowdsourcing a Large Dataset of Domain-Specific Context-Sensitive Semantic Verb Relations


Title	Crowdsourcing a Large Dataset of Domain-Specific Context-Sensitive Semantic Verb Relations
Authors	Maria Sukhareva, Judith Eckle-Kohler, Ivan Habernal, Iryna Gurevych
Abstract	We present a new large dataset of 12403 context-sensitive verb relations manually annotated via crowdsourcing. These relations capture fine-grained semantic information between verb-centric propositions, such as temporal or entailment relations. We propose a novel semantic verb relation scheme and design a multi-step annotation approach for scaling-up the annotations using crowdsourcing. We employ several quality measures and report on agreement scores. The resulting dataset is available under a permissive CreativeCommons license at www.ukp.tu-darmstadt.de/data/verb-relations/. It represents a valuable resource for various applications, such as automatic information consolidation or automatic summarization.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1338/
PDF	https://www.aclweb.org/anthology/L16-1338
PWC	https://paperswithcode.com/paper/crowdsourcing-a-large-dataset-of-domain
Repo
Framework

A study on the production of collocations by European Portuguese learners


Title	A study on the production of collocations by European Portuguese learners
Authors	Ângela Costa, Luísa Coheur, Teresa Lino
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1814/
PDF	https://www.aclweb.org/anthology/W16-1814
PWC	https://paperswithcode.com/paper/a-study-on-the-production-of-collocations-by
Repo
Framework

The Effects of Data Collection Methods in Twitter


Title	The Effects of Data Collection Methods in Twitter
Authors	Sunghwan Mac Kim, Stephen Wan, Cécile Paris, Brian Jin, Bella Robinson
Abstract
Tasks	Keyword Spotting
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-5611/
PDF	https://www.aclweb.org/anthology/W16-5611
PWC	https://paperswithcode.com/paper/the-effects-of-data-collection-methods-in
Repo
Framework

A Framework for Discriminative Rule Selection in Hierarchical Moses


Title	A Framework for Discriminative Rule Selection in Hierarchical Moses
Authors	Fabienne Braune, Alexander Fraser, Hal Daumé III, Aleš Tamchyna
Abstract
Tasks	Machine Translation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2210/
PDF	https://www.aclweb.org/anthology/W16-2210
PWC	https://paperswithcode.com/paper/a-framework-for-discriminative-rule-selection
Repo
Framework

Semantic Indexing of Multilingual Corpora and its Application on the History Domain


Title	Semantic Indexing of Multilingual Corpora and its Application on the History Domain
Authors	Alessandro Raganato, Jose Camacho-Collados, Antonio Raganato, Yunseo Joung
Abstract	The increasing amount of multilingual text collections available in different domains makes its automatic processing essential for the development of a given field. However, standard processing techniques based on statistical clues and keyword searches have clear limitations. Instead, we propose a knowledge-based processing pipeline which overcomes most of the limitations of these techniques. This, in turn, enables direct comparison across texts in different languages without the need of translation. In this paper we show the potential of this approach for semantically indexing multilingual text collections in the history domain. In our experiments we used a version of the Bible translated in four different languages, evaluating the precision of our semantic indexing pipeline and showing its reliability on the cross-lingual text retrieval task.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4019/
PDF	https://www.aclweb.org/anthology/W16-4019
PWC	https://paperswithcode.com/paper/semantic-indexing-of-multilingual-corpora-and
Repo
Framework

Minions at SemEval-2016 Task 4: or how to build a sentiment analyzer using off-the-shelf resources?


Title	Minions at SemEval-2016 Task 4: or how to build a sentiment analyzer using off-the-shelf resources?
Authors	Călin-Cristian Ciubotariu, Marius-Valentin Hrișca, Mihail Gliga, Diana Darabană, Diana Trandabăț, Adrian Iftene
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1038/
PDF	https://www.aclweb.org/anthology/S16-1038
PWC	https://paperswithcode.com/paper/minions-at-semeval-2016-task-4-or-how-to
Repo
Framework