Paper Group NANR 207
Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging. Cross-Lingual Named Entity Recognition via Wikification. DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets. IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Mod …
Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging
Title | Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging |
Authors | Laura Hernández-Domínguez, Sylvie Ratté, Boyd Davis, Charlene Pope |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1903/ |
PWC | https://paperswithcode.com/paper/conversing-with-the-elderly-in-latin-america |
Repo | |
Framework | |
Cross-Lingual Named Entity Recognition via Wikification
Title | Cross-Lingual Named Entity Recognition via Wikification |
Authors | Chen-Tse Tsai, Stephen Mayhew, Dan Roth |
Abstract | |
Tasks | Entity Linking, Information Retrieval, Named Entity Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-1022/ |
PWC | https://paperswithcode.com/paper/cross-lingual-named-entity-recognition-via |
Repo | |
Framework | |
DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets
Title | DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets |
Authors | Fabrice Dugas, Eric Nichols |
Abstract | In this paper, we describe the DeepNNNER entry to The 2nd Workshop on Noisy User-generated Text (WNUT) Shared Task #2: Named Entity Recognition in Twitter. Our shared task submission adopts the bidirectional LSTM-CNN model of Chiu and Nichols (2016), as it has been shown to perform well on both newswire and Web texts. It uses word embeddings trained on large-scale Web text collections together with text normalization to cope with the diversity in Web texts, and lexicons for target named entity classes constructed from publicly-available sources. Extended evaluation comparing the effectiveness of various word embeddings, text normalization, and lexicon settings shows that our system achieves a maximum F1-score of 47.24, performance surpassing that of the shared task's second-ranked system. |
Tasks | Feature Engineering, Named Entity Recognition, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3924/ |
PWC | https://paperswithcode.com/paper/deepnnner-applying-blstm-cnns-and-extended |
Repo | |
Framework | |
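The entry above describes a bidirectional LSTM-CNN tagger. As a minimal sketch of that architecture (a character-level CNN feeding a word-level BiLSTM), here is a hedged PyTorch version; all layer sizes are illustrative assumptions, and the lexicon and normalization features from the paper are omitted:

```python
import torch
import torch.nn as nn

class BLSTMCNNTagger(nn.Module):
    """Sketch of a Chiu & Nichols (2016)-style tagger: a character-level CNN
    feeds a word-level BiLSTM. Dimensions are illustrative assumptions."""
    def __init__(self, word_vocab, char_vocab, n_tags,
                 word_dim=100, char_dim=25, char_filters=30, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Character CNN: convolve over each word's characters, then max-pool.
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(word_dim + char_filters, hidden,
                              batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, words, chars):
        # words: (batch, seq); chars: (batch, seq, max_word_len)
        b, s, w = chars.shape
        c = self.char_emb(chars).view(b * s, w, -1).transpose(1, 2)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values.view(b, s, -1)
        x = torch.cat([self.word_emb(words), c], dim=-1)
        h, _ = self.bilstm(x)
        return self.out(h)  # per-token tag scores
```

Decoding the per-token scores with a softmax (or a CRF layer, as in Chiu and Nichols) yields the tag sequence.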
IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT
Title | IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT |
Authors | Sandhya Singh, Anoop Kunchukuttan, Pushpak Bhattacharyya |
Abstract | This paper describes IIT Bombay's submission as a part of the shared task in WAT 2016 for the English–Indonesian language pair. The results reported here are for both directions of the language pair. Among the various approaches experimented with, the Operation Sequence Model (OSM) and the Neural Language Model have been submitted for WAT. The OSM approach integrates the translation and reordering processes, resulting in relatively improved translation. Similarly, the neural experiment integrates a Neural Language Model with Statistical Machine Translation (SMT) as a feature for translation. The Neural Probabilistic Language Model (NPLM) gave relatively high BLEU points for the Indonesian-to-English translation system, while the Neural Network Joint Model (NNJM) performed better for the English-to-Indonesian direction. The results indicate improvement over the baseline phrase-based SMT by 0.61 BLEU points for the English-Indonesian system and 0.55 BLEU points for the Indonesian-English translation system. |
Tasks | Language Modelling, Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4604/ |
PWC | https://paperswithcode.com/paper/iit-bombayas-english-indonesian-submission-at |
Repo | |
Framework | |
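The abstract above describes feeding a neural LM score into the SMT system as one feature of a log-linear combination. A hedged sketch of that general idea, framed as n-best rescoring rather than the paper's actual Moses/NPLM integration; the feature weights and the `nlm_logprob` callable are assumptions:

```python
def rescore_nbest(hypotheses, weights, nlm_logprob):
    """Rerank SMT n-best hypotheses with a neural LM score added as one
    log-linear feature. `hypotheses` is a list of (translation, features)
    pairs, where `features` maps feature names to log-scores; `nlm_logprob`
    is any callable returning a sentence log-probability (assumed here)."""
    def score(hyp):
        translation, feats = hyp
        total = sum(weights[name] * value for name, value in feats.items())
        return total + weights["nlm"] * nlm_logprob(translation)
    return max(hypotheses, key=score)
```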
A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter
Title | A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter |
Authors | Yasuhide Miura, Motoki Taniguchi, Tomoki Taniguchi, Tomoko Ohkuma |
Abstract | This paper describes a model that we submitted to W-NUT 2016 Shared Task #1: Geolocation Prediction in Twitter. Our model classifies a tweet or a user to a city using a simple neural network structure with fully-connected layers and average pooling processes. Drawing on the findings of previous geolocation prediction approaches, we integrated various user metadata along with message texts and trained the model with them. In the test run of the task, the model achieved an accuracy of 40.91% and a median distance error of 69.50 km in message-level prediction, and an accuracy of 47.55% and a median distance error of 16.13 km in user-level prediction. These results are moderate performances in terms of accuracy and the best performances in terms of distance. The results show a promising extension of neural-network-based models for geolocation prediction, where recent advances in neural networks can be added to enhance our current simple model. |
Tasks | Denoising |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3931/ |
PWC | https://paperswithcode.com/paper/a-simple-scalable-neural-networks-based-model |
Repo | |
Framework | |
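A minimal sketch of the model structure the abstract describes: average-pooled embeddings of the message text and of user metadata fields, concatenated and passed through fully-connected layers to a softmax over candidate cities. Dimensions and the number of metadata fields are illustrative assumptions, not the submitted system:

```python
import torch
import torch.nn as nn

class GeoClassifier(nn.Module):
    """Average-pooled text and metadata embeddings -> fully-connected
    layers -> city logits. Sizes and field count are assumptions."""
    def __init__(self, vocab, n_cities, emb_dim=100, hidden=300, n_meta_fields=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb_dim, padding_idx=0)
        self.fc = nn.Sequential(
            nn.Linear(emb_dim * (1 + n_meta_fields), hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_cities),
        )
        self.n_meta = n_meta_fields

    def forward(self, text_ids, meta_ids):
        # text_ids: (batch, seq); meta_ids: (batch, n_meta_fields, seq)
        pooled = [self.emb(text_ids).mean(dim=1)]
        for i in range(self.n_meta):
            pooled.append(self.emb(meta_ids[:, i]).mean(dim=1))
        return self.fc(torch.cat(pooled, dim=-1))  # city logits
```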
Argument Mining: the Bottleneck of Knowledge and Language Resources
Title | Argument Mining: the Bottleneck of Knowledge and Language Resources |
Authors | Patrick Saint-Dizier |
Abstract | Given a controversial issue, argument mining from natural language texts (newspapers, and any form of text on the Internet) is extremely challenging: domain knowledge is often required, together with appropriate forms of inference, to identify arguments. This contribution explores the types of knowledge that are required and how they can be paired with reasoning schemes, language processing and language resources to accurately mine arguments. We show via corpus analysis that the Generative Lexicon, enhanced in different manners and viewed as both a lexicon and a domain knowledge representation, is a relevant approach. In this paper, corpus annotation for argument mining is first developed, then we show how the generative lexicon approach must be adapted and how it can be paired with language processing patterns to extract and specify the nature of arguments. Our approach to argument mining is thus knowledge-driven. |
Tasks | Argument Mining |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1156/ |
PWC | https://paperswithcode.com/paper/argument-mining-the-bottleneck-of-knowledge |
Repo | |
Framework | |
Modelling the Usage of Discourse Connectives as Rational Speech Acts
Title | Modelling the Usage of Discourse Connectives as Rational Speech Acts |
Authors | Frances Yung, Kevin Duh, Taku Komura, Yuji Matsumoto |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-1030/ |
PWC | https://paperswithcode.com/paper/modelling-the-usage-of-discourse-connectives |
Repo | |
Framework | |
Finding Arguments as Sequence Labeling in Discourse Parsing
Title | Finding Arguments as Sequence Labeling in Discourse Parsing |
Authors | Ziwei Fan, Zhenghua Li, Min Zhang |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-2021/ |
PWC | https://paperswithcode.com/paper/finding-arguments-as-sequence-labeling-in |
Repo | |
Framework | |
Flexible and Reliable Text Analytics in the Digital Humanities – Some Methodological Considerations
Title | Flexible and Reliable Text Analytics in the Digital Humanities – Some Methodological Considerations |
Authors | Jonas Kuhn |
Abstract | The availability of Language Technology Resources and Tools generates a considerable methodological potential in the Digital Humanities: aspects of research questions from the Humanities and Social Sciences can be addressed on text collections in ways that were unavailable to traditional approaches. I start this talk by sketching some sample scenarios of Digital Humanities projects which involve various Humanities and Social Science disciplines, noting that the potential for a meaningful contribution to higher-level questions is highest when the employed language technological models are carefully tailored both (a) to characteristics of the given target corpus, and (b) to relevant analytical subtasks feeding the discipline-specific research questions. Keeping up a multidisciplinary perspective, I then point out a recurrent dilemma in Digital Humanities projects that follow the conventional set-up of collaboration: to build high-quality computational models for the data, fixed analytical targets should be specified as early as possible – but to be able to respond to Humanities questions as they evolve over the course of analysis, the analytical machinery should be kept maximally flexible. To reach both, I argue for a novel collaborative culture that rests on a more interleaved, continuous dialogue. (Re-)Specification of analytical targets should be an ongoing process in which the Humanities Scholars and Social Scientists play a role that is as important as the Computational Scientists' role. A promising approach lies in the identification of re-occurring types of analytical subtasks, beyond linguistic standard tasks, which can form building blocks for text analysis across disciplines, and for which corpus-based characterizations (viz. annotations) can be collected, compared and revised. On such grounds, computational modeling is more directly tied to the evolving research questions, and hence the seemingly opposing needs of reliable target specifications vs. “malleable” frameworks of analysis can be reconciled. Experimental work following this approach is under way in the Center for Reflected Text Analytics (CRETA) in Stuttgart. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4001/ |
PWC | https://paperswithcode.com/paper/flexible-and-reliable-text-analytics-in-the |
Repo | |
Framework | |
Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language
Title | Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language |
Authors | Teruaki Oka, Tomoaki Kono |
Abstract | We are constructing annotated diachronic corpora of the Japanese language. As part of this work, we construct a corpus of Manyosyu, which is an old Japanese poetry anthology. In this paper, we describe how to align the transcribed text and its original text semi-automatically, to be able to cross-reference them in our Manyosyu corpus. Although we align the original characters to the transcribed words manually, we preliminarily align the transcribed and original characters by using an unsupervised automatic alignment technique from statistical machine translation to alleviate the work. We found that automatic alignment achieves an F1-measure of 0.83; thus, each poem has 1–2 alignment errors. However, finding these errors and modifying them is less work-intensive and more efficient than fully manual annotation. The alignment probabilities can be utilized in this modification. Moreover, we found that we can locate the uncertain transcriptions in our corpus and compare them to other transcriptions by using the alignment probabilities. |
Tasks | Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4006/ |
PWC | https://paperswithcode.com/paper/original-transcribed-text-alignment-for |
Repo | |
Framework | |
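The unsupervised alignment step borrows word-alignment machinery from statistical machine translation. As an illustration of that machinery (not the authors' actual aligner), a minimal IBM Model 1 EM loop estimating translation probabilities between transcribed tokens and original characters might look like this:

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=10):
    """Minimal IBM Model 1 EM estimator for probabilities
    t(original_char | transcribed_word). `bitext` is a list of
    (transcribed_tokens, original_tokens) pairs. A toy sketch of the kind
    of unsupervised SMT alignment the paper uses, not its actual pipeline."""
    t = defaultdict(lambda: 1.0)  # uniform-ish initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in bitext:          # E-step: expected co-occurrence counts
            for o in tgt:
                norm = sum(t[(o, s)] for s in src)
                for s in src:
                    frac = t[(o, s)] / norm
                    count[(o, s)] += frac
                    total[s] += frac
        for (o, s) in count:             # M-step: renormalize
            t[(o, s)] = count[(o, s)] / total[s]
    return t
```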
Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis
Title | Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis |
Authors | Sven Buechel, Johannes Hellrich, Udo Hahn |
Abstract | We here describe a novel methodology for measuring affective language in historical text by expanding an affective lexicon and jointly adapting it to prior language stages. We automatically construct a lexicon for word-emotion association of 18th and 19th century German which is then validated against expert ratings. Subsequently, this resource is used to identify distinct emotional patterns and trace long-term emotional trends in different genres of writing spanning several centuries. |
Tasks | Emotion Recognition, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4008/ |
PWC | https://paperswithcode.com/paper/feelings-from-the-pastaadapting-affective |
Repo | |
Framework | |
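One simple way to realize the lexicon expansion the abstract describes is to score out-of-lexicon words by similarity-weighted averaging of seed-word emotion ratings in an embedding space trained on the period text. This k-NN scheme is an assumption for illustration, not necessarily the authors' induction method:

```python
import numpy as np

def induce_emotion_score(word_vec, seed_vecs, seed_scores, k=10):
    """Assign an emotion rating to an out-of-lexicon word as the
    similarity-weighted mean of its k nearest seed words' ratings.
    `seed_vecs` is (n_seeds, dim), `seed_scores` is (n_seeds,); embeddings
    are assumed to be trained on the historical corpus. Illustrative only."""
    sims = seed_vecs @ word_vec / (
        np.linalg.norm(seed_vecs, axis=1) * np.linalg.norm(word_vec) + 1e-9)
    top = np.argsort(sims)[-k:]          # indices of the k nearest seeds
    w = np.clip(sims[top], 0.0, None)    # ignore negative similarities
    return float(w @ seed_scores[top] / (w.sum() + 1e-9))
```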
Unsupervised Text Segmentation Using Semantic Relatedness Graphs
Title | Unsupervised Text Segmentation Using Semantic Relatedness Graphs |
Authors | Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto |
Abstract | |
Tasks | Information Retrieval, Text Summarization, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/S16-2016/ |
PWC | https://paperswithcode.com/paper/unsupervised-text-segmentation-using-semantic |
Repo | |
Framework | |
Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction
Title | Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction |
Authors | Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon |
Abstract | Time series prediction problems are becoming increasingly high-dimensional in modern applications, such as climatology and demand forecasting. For example, in the latter problem, the number of items for which demand needs to be forecast might be as large as 50,000. In addition, the data is generally noisy and full of missing values. Thus, modern applications require methods that are highly scalable, and can deal with noisy data in terms of corruptions or missing values. However, classical time series methods usually fall short of handling these issues. In this paper, we present a temporal regularized matrix factorization (TRMF) framework which supports data-driven temporal learning and forecasting. We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values. Our proposed TRMF is highly general, and subsumes many existing approaches for time series analysis. We make interesting connections to graph regularization methods in the context of learning the dependencies in an autoregressive framework. Experimental results show the superiority of TRMF in terms of scalability and prediction quality. In particular, TRMF is two orders of magnitude faster than other methods on a problem of dimension 50,000, and generates better forecasts on real-world datasets such as Wal-mart E-commerce datasets. |
Tasks | Time Series, Time Series Analysis, Time Series Prediction |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6160-temporal-regularized-matrix-factorization-for-high-dimensional-time-series-prediction |
PDF | http://papers.nips.cc/paper/6160-temporal-regularized-matrix-factorization-for-high-dimensional-time-series-prediction.pdf |
PWC | https://paperswithcode.com/paper/temporal-regularized-matrix-factorization-for |
Repo | |
Framework | |
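The core TRMF idea is to factorize the partially observed series matrix Y ≈ FX while penalizing latent temporal factors X that deviate from an autoregressive model. A toy gradient step on that objective, with an assumed lag set and diagonal AR weights (the paper develops dedicated scalable solvers, not plain gradient descent):

```python
import numpy as np

def trmf_step(Y, F, X, W, lags, lam_ar=1.0, lam_f=0.1, lr=1e-3):
    """One gradient step for a toy TRMF objective:
        ||M * (Y - F X)||^2 + lam_ar * sum_t ||x_t - sum_l W_l x_{t-l}||^2
        + lam_f * ||F||^2,
    where M masks missing entries (NaNs in Y). Y: (n, T), F: (n, k),
    X: (k, T), W: dict lag -> (k,) diagonal AR weights. Illustrative only."""
    M = ~np.isnan(Y)
    R = np.where(M, np.nan_to_num(Y) - F @ X, 0.0)   # masked residual
    gF = -2 * R @ X.T + 2 * lam_f * F
    gX = -2 * F.T @ R
    for t in range(max(lags), X.shape[1]):
        e = X[:, t] - sum(W[l] * X[:, t - l] for l in lags)  # AR residual
        gX[:, t] += 2 * lam_ar * e
        for l in lags:
            gX[:, t - l] -= 2 * lam_ar * W[l] * e
    return F - lr * gF, X - lr * gX
```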
Towards Non-projective High-Order Dependency Parser
Title | Towards Non-projective High-Order Dependency Parser |
Authors | Wenjing Fang, Kenny Zhu, Yizhong Wang, Jia Tan |
Abstract | This paper presents a novel high-order dependency parsing framework that targets non-projective treebanks. It imitates how a human parses sentences in an intuitive way. At every step of the parse, it determines which word is the easiest to process among all the remaining words, identifies its head word and then folds it under the head word. Further, this work is flexible enough to be augmented with other parsing techniques. |
Tasks | Dependency Parsing |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2052/ |
PWC | https://paperswithcode.com/paper/towards-non-projective-high-order-dependency |
Repo | |
Framework | |
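A toy rendering of the easiest-first procedure the abstract sketches: at each step, pick the remaining word whose attachment the model is most confident about, attach it to its predicted head, and fold it away. The `score` function is an assumed black box standing in for the paper's model:

```python
def easiest_first_parse(words, score):
    """Greedy easiest-first parsing sketch. `score(dep, head, remaining)` is
    an assumed confidence that `head` governs `dep` given the words still
    unattached. Returns a dict dep_index -> head_index (-1 = root)."""
    remaining = list(range(len(words)))
    heads = {}
    while len(remaining) > 1:
        # Find the most confidently attachable (dependent, head) pair.
        dep, head, _ = max(
            ((d, h, score(d, h, remaining))
             for d in remaining for h in remaining if h != d),
            key=lambda t: t[2])
        heads[dep] = head
        remaining.remove(dep)   # fold the dependent under its head
    heads[remaining[0]] = -1    # last remaining word is the root
    return heads
```

Because dependents can attach to heads on either side in any order, nothing in this scheme forces projectivity, which is the point of the non-projective framing.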
The Generalized Reparameterization Gradient
Title | The Generalized Reparameterization Gradient |
Authors | Francisco R. Ruiz, Michalis Titsias, David Blei |
Abstract | The reparameterization gradient has become a widely used method to obtain Monte Carlo gradients to optimize the variational objective. However, this technique does not easily apply to commonly used distributions such as beta or gamma without further approximations, and most practical applications of the reparameterization gradient fit Gaussian distributions. In this paper, we introduce the generalized reparameterization gradient, a method that extends the reparameterization gradient to a wider class of variational distributions. Generalized reparameterizations use invertible transformations of the latent variables which lead to transformed distributions that weakly depend on the variational parameters. This results in new Monte Carlo gradients that combine reparameterization gradients and score function gradients. We demonstrate our approach on variational inference for two complex probabilistic models. The generalized reparameterization is effective: even a single sample from the variational distribution is enough to obtain a low-variance gradient. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6328-the-generalized-reparameterization-gradient |
PDF | http://papers.nips.cc/paper/6328-the-generalized-reparameterization-gradient.pdf |
PWC | https://paperswithcode.com/paper/the-generalized-reparameterization-gradient-1 |
Repo | |
Framework | |
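Read as a formula (hedged, from the abstract alone): with z = T(ε; θ) and ε ~ q(ε; θ) only weakly dependent on θ, the gradient of the variational objective splits into a reparameterization term and a score-function term:

```latex
\nabla_\theta \, \mathbb{E}_{q(z;\theta)}[f(z)]
  = \underbrace{\mathbb{E}_{q(\varepsilon;\theta)}\!\big[\nabla_z f(z)\,
      \nabla_\theta T(\varepsilon;\theta)\big]}_{\text{reparameterization term}}
  + \underbrace{\mathbb{E}_{q(\varepsilon;\theta)}\!\big[f(z)\,
      \nabla_\theta \log q(\varepsilon;\theta)\big]}_{\text{score-function term}},
  \qquad z = T(\varepsilon;\theta).
```

When q(ε; θ) is fully independent of θ, the score-function term vanishes and the estimator reduces to the standard reparameterization gradient.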