May 4, 2019

2175 words 11 mins read

Paper Group NANR 207

Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging. Cross-Lingual Named Entity Recognition via Wikification. DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets. IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Mod …

Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging

Title Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging
Authors Laura Hernández-Domínguez, Sylvie Ratté, Boyd Davis, Charlene Pope
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1903/
PDF https://www.aclweb.org/anthology/W16-1903
PWC https://paperswithcode.com/paper/conversing-with-the-elderly-in-latin-america
Repo
Framework

Cross-Lingual Named Entity Recognition via Wikification

Title Cross-Lingual Named Entity Recognition via Wikification
Authors Chen-Tse Tsai, Stephen Mayhew, Dan Roth
Abstract
Tasks Entity Linking, Information Retrieval, Named Entity Recognition
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-1022/
PDF https://www.aclweb.org/anthology/K16-1022
PWC https://paperswithcode.com/paper/cross-lingual-named-entity-recognition-via
Repo
Framework

DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets

Title DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets
Authors Fabrice Dugas, Eric Nichols
Abstract In this paper, we describe the DeepNNNER entry to The 2nd Workshop on Noisy User-generated Text (WNUT) Shared Task #2: Named Entity Recognition in Twitter. Our shared task submission adopts the bidirectional LSTM-CNN model of Chiu and Nichols (2016), as it has been shown to perform well on both newswire and Web texts. It uses word embeddings trained on large-scale Web text collections together with text normalization to cope with the diversity in Web texts, and lexicons for target named entity classes constructed from publicly-available sources. Extended evaluation comparing the effectiveness of various word embeddings, text normalization, and lexicon settings shows that our system achieves a maximum F1-score of 47.24, performance surpassing that of the shared task's second-ranked system.
Tasks Feature Engineering, Named Entity Recognition, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3924/
PDF https://www.aclweb.org/anthology/W16-3924
PWC https://paperswithcode.com/paper/deepnnner-applying-blstm-cnns-and-extended
Repo
Framework
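
The entry above adopts the BLSTM-CNN tagger of Chiu and Nichols (2016). As a rough illustration, here is a minimal PyTorch sketch of that architecture's core: a character-level CNN whose pooled output is concatenated with word embeddings and fed to a bidirectional LSTM with per-token tag scores. All layer sizes, vocabulary sizes, and names are illustrative assumptions, not the DeepNNNER configuration; the system's lexicon features and text normalization are omitted.

```python
# Minimal sketch of a BiLSTM-CNN sequence tagger in the spirit of
# Chiu and Nichols (2016). Sizes and names are illustrative.
import torch
import torch.nn as nn

class BLSTMCNNTagger(nn.Module):
    def __init__(self, n_words, n_chars, n_tags,
                 word_dim=100, char_dim=25, char_filters=30, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)   # ideally pretrained
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # Character CNN: convolve over each word's characters, then max-pool.
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(word_dim + char_filters, hidden,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, max_word_len)
        b, t, c = chars.shape
        ce = self.char_emb(chars.view(b * t, c)).transpose(1, 2)  # (b*t, dim, c)
        cf = torch.relu(self.char_cnn(ce)).max(dim=2).values      # (b*t, filters)
        x = torch.cat([self.word_emb(words), cf.view(b, t, -1)], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)  # per-token tag scores; train with cross-entropy
```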

IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT

Title IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT
Authors Sandhya Singh, Anoop Kunchukuttan, Pushpak Bhattacharyya
Abstract This paper describes IIT Bombay's submission as a part of the shared task in WAT 2016 for the English–Indonesian language pair. The results reported here are for both directions of the language pair. Among the various approaches experimented with, the Operation Sequence Model (OSM) and a Neural Language Model have been submitted for WAT. The OSM approach integrates the translation and reordering processes, resulting in relatively improved translation. Similarly, the neural experiment integrates a Neural Language Model with Statistical Machine Translation (SMT) as a feature for translation. The Neural Probabilistic Language Model (NPLM) gave relatively high BLEU points for the Indonesian-to-English translation system, while the Neural Network Joint Model (NNJM) performed better for the English-to-Indonesian direction. The results indicate an improvement over the baseline phrase-based SMT of 0.61 BLEU points for the English-Indonesian system and 0.55 BLEU points for the Indonesian-English translation system.
Tasks Language Modelling, Machine Translation
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4604/
PDF https://www.aclweb.org/anthology/W16-4604
PWC https://paperswithcode.com/paper/iit-bombayas-english-indonesian-submission-at
Repo
Framework
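
For context on how a neural LM enters SMT "as a feature": in a phrase-based log-linear model, the decoder scores each translation hypothesis as a weighted sum of feature log-scores, so an NPLM or NNJM log-probability can be added as one more term. A minimal sketch under assumed feature names, values, and weights, not the authors' actual decoder setup:

```python
# Minimal sketch of log-linear hypothesis scoring in phrase-based SMT,
# with a neural LM log-probability added as one extra feature.
# Feature names, values, and weights are illustrative assumptions.
import math

def score_hypothesis(features, weights):
    """features/weights map feature name -> log-score / tuned weight."""
    return sum(weights[name] * value for name, value in features.items())

features = {
    "phrase_translation": math.log(0.42),
    "lexical_weighting": math.log(0.35),
    "distortion": -2.0,
    "ngram_lm": math.log(0.012),
    "neural_lm": math.log(0.020),  # e.g. NPLM/NNJM log-probability as a feature
}
weights = {"phrase_translation": 0.2, "lexical_weighting": 0.2,
           "distortion": 0.1, "ngram_lm": 0.25, "neural_lm": 0.25}

print(score_hypothesis(features, weights))
```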

A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter

Title A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter
Authors Yasuhide Miura, Motoki Taniguchi, Tomoki Taniguchi, Tomoko Ohkuma
Abstract This paper describes a model that we submitted to W-NUT 2016 Shared Task #1: Geolocation Prediction in Twitter. Our model classifies a tweet or a user to a city using a simple neural network structure with fully-connected layers and average pooling processes. Following the findings of previous geolocation prediction approaches, we integrated various user metadata along with message texts and trained the model with them. In the test run of the task, the model achieved an accuracy of 40.91% and a median distance error of 69.50 km in message-level prediction, and an accuracy of 47.55% and a median distance error of 16.13 km in user-level prediction. These results represent moderate performance in terms of accuracy and the best performance in terms of distance error. The results show a promising direction for extending neural network based models for geolocation prediction, where recent advances in neural networks can be added to enhance our current simple model.
Tasks Denoising
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3931/
PDF https://www.aclweb.org/anthology/W16-3931
PWC https://paperswithcode.com/paper/a-simple-scalable-neural-networks-based-model
Repo
Framework
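
A rough sketch of the kind of model the abstract describes: average-pool embeddings for the message text and for each metadata field, concatenate them, and pass the result through fully-connected layers to a city softmax. The number of fields, sizes, and names below are assumptions for illustration, not the authors' configuration.

```python
# Sketch: averaged embeddings per field -> concatenate -> MLP -> city logits.
import torch
import torch.nn as nn

class GeoClassifier(nn.Module):
    def __init__(self, vocab, n_cities, emb_dim=100, n_fields=4, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb_dim, padding_idx=0)
        self.mlp = nn.Sequential(
            nn.Linear(n_fields * emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_cities),
        )

    def forward(self, fields):
        # fields: list of n_fields LongTensors, each (batch, tokens),
        # e.g. tweet text, user location string, description, timezone.
        pooled = [self.emb(f).mean(dim=1) for f in fields]  # average pooling
        return self.mlp(torch.cat(pooled, dim=-1))          # city logits
```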

Argument Mining: the Bottleneck of Knowledge and Language Resources

Title Argument Mining: the Bottleneck of Knowledge and Language Resources
Authors Patrick Saint-Dizier
Abstract Given a controversial issue, argument mining from natural language texts (newspapers and any form of text on the Internet) is extremely challenging: domain knowledge is often required together with appropriate forms of inferences to identify arguments. This contribution explores the types of knowledge that are required and how they can be paired with reasoning schemes, language processing and language resources to accurately mine arguments. We show via corpus analysis that the Generative Lexicon, enhanced in different manners and viewed as both a lexicon and a domain knowledge representation, is a relevant approach. In this paper, corpus annotation for argument mining is first developed; we then show how the Generative Lexicon approach must be adapted and how it can be paired with language processing patterns to extract and specify the nature of arguments. Our approach to argument mining is thus knowledge-driven.
Tasks Argument Mining
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1156/
PDF https://www.aclweb.org/anthology/L16-1156
PWC https://paperswithcode.com/paper/argument-mining-the-bottleneck-of-knowledge
Repo
Framework

Modelling the Usage of Discourse Connectives as Rational Speech Acts

Title Modelling the Usage of Discourse Connectives as Rational Speech Acts
Authors Frances Yung, Kevin Duh, Taku Komura, Yuji Matsumoto
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-1030/
PDF https://www.aclweb.org/anthology/K16-1030
PWC https://paperswithcode.com/paper/modelling-the-usage-of-discourse-connectives
Repo
Framework

Finding Arguments as Sequence Labeling in Discourse Parsing

Title Finding Arguments as Sequence Labeling in Discourse Parsing
Authors Ziwei Fan, Zhenghua Li, Min Zhang
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-2021/
PDF https://www.aclweb.org/anthology/K16-2021
PWC https://paperswithcode.com/paper/finding-arguments-as-sequence-labeling-in
Repo
Framework

Flexible and Reliable Text Analytics in the Digital Humanities – Some Methodological Considerations

Title Flexible and Reliable Text Analytics in the Digital Humanities – Some Methodological Considerations
Authors Jonas Kuhn
Abstract The availability of Language Technology Resources and Tools generates a considerable methodological potential in the Digital Humanities: aspects of research questions from the Humanities and Social Sciences can be addressed on text collections in ways that were unavailable to traditional approaches. I start this talk by sketching some sample scenarios of Digital Humanities projects which involve various Humanities and Social Science disciplines, noting that the potential for a meaningful contribution to higher-level questions is highest when the employed language technological models are carefully tailored both (a) to characteristics of the given target corpus, and (b) to relevant analytical subtasks feeding the discipline-specific research questions. Keeping up a multidisciplinary perspective, I then point out a recurrent dilemma in Digital Humanities projects that follow the conventional set-up of collaboration: to build high-quality computational models for the data, fixed analytical targets should be specified as early as possible – but to be able to respond to Humanities questions as they evolve over the course of analysis, the analytical machinery should be kept maximally flexible. To reach both, I argue for a novel collaborative culture that rests on a more interleaved, continuous dialogue. (Re-)Specification of analytical targets should be an ongoing process in which the Humanities Scholars and Social Scientists play a role that is as important as the Computational Scientists' role. A promising approach lies in the identification of re-occurring types of analytical subtasks, beyond linguistic standard tasks, which can form building blocks for text analysis across disciplines, and for which corpus-based characterizations (viz. annotations) can be collected, compared and revised. On such grounds, computational modeling is more directly tied to the evolving research questions, and hence the seemingly opposing needs of reliable target specifications vs. "malleable" frameworks of analysis can be reconciled. Experimental work following this approach is under way in the Center for Reflected Text Analytics (CRETA) in Stuttgart.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4001/
PDF https://www.aclweb.org/anthology/W16-4001
PWC https://paperswithcode.com/paper/flexible-and-reliable-text-analytics-in-the
Repo
Framework

Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language

Title Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language
Authors Teruaki Oka, Tomoaki Kono
Abstract We are constructing annotated diachronic corpora of the Japanese language. As part of this work, we construct a corpus of Manyosyu, which is an old Japanese poetry anthology. In this paper, we describe how to align the transcribed text and its original text semi-automatically to be able to cross-reference them in our Manyosyu corpus. Although we align the original characters to the transcribed words manually, we preliminarily align the transcribed and original characters by using an unsupervised automatic alignment technique from statistical machine translation to alleviate the work. We found that automatic alignment achieves an F1-measure of 0.83; thus, each poem has 1–2 alignment errors. However, finding these errors and modifying them is less work-intensive and more efficient than fully manual annotation. The alignment probabilities can be utilized in this modification. Moreover, we found that we can locate the uncertain transcriptions in our corpus and compare them to other transcriptions by using the alignment probabilities.
Tasks Machine Translation
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4006/
PDF https://www.aclweb.org/anthology/W16-4006
PWC https://paperswithcode.com/paper/original-transcribed-text-alignment-for
Repo
Framework
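
The unsupervised alignment step described above can be done with IBM-style translation models from statistical machine translation. A compact EM sketch of IBM Model 1, treating transcribed and original characters as the two "languages"; the data is a toy, and the actual corpus work would use a full SMT alignment tool rather than this minimal loop:

```python
# Minimal IBM Model 1 EM sketch for unsupervised character alignment.
# Toy parallel data for illustration only.
from collections import defaultdict

pairs = [(["a", "b"], ["x", "y"]), (["a"], ["x"])]  # (source, target) pairs

t = defaultdict(lambda: 1.0)             # uniform init of t(target | source)
for _ in range(10):                      # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for src, tgt in pairs:
        for w_t in tgt:                  # E-step: expected alignment counts
            z = sum(t[(w_t, w_s)] for w_s in src)
            for w_s in src:
                c = t[(w_t, w_s)] / z
                count[(w_t, w_s)] += c
                total[w_s] += c
    for (w_t, w_s), c in count.items():  # M-step: renormalize
        t[(w_t, w_s)] = c / total[w_s]

print(t[("x", "a")])  # approaches 1.0 on this toy data
```

The resulting translation probabilities play the role of the "alignment probabilities" the abstract mentions: low-probability links flag the places a human annotator should check first.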

Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis

Title Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis
Authors Sven Buechel, Johannes Hellrich, Udo Hahn
Abstract We here describe a novel methodology for measuring affective language in historical text by expanding an affective lexicon and jointly adapting it to prior language stages. We automatically construct a lexicon for word-emotion association of 18th and 19th century German which is then validated against expert ratings. Subsequently, this resource is used to identify distinct emotional patterns and trace long-term emotional trends in different genres of writing spanning several centuries.
Tasks Emotion Recognition, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4008/
PDF https://www.aclweb.org/anthology/W16-4008
PWC https://paperswithcode.com/paper/feelings-from-the-pastaadapting-affective
Repo
Framework
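
One common way to expand an affective lexicon with word embeddings (listed in this entry's tasks) is to estimate a new word's emotion value as a similarity-weighted average over its nearest seed words. A generic sketch with made-up vectors and ratings, not the authors' exact induction method:

```python
# Generic sketch: propagate emotion ratings from a seed lexicon to a new
# word via cosine-similarity-weighted kNN in embedding space.
# Vectors and ratings below are made-up illustrations.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def induce_rating(word_vec, seeds, k=2):
    """seeds: list of (embedding, valence rating) pairs."""
    sims = sorted(((cosine(word_vec, v), r) for v, r in seeds), reverse=True)[:k]
    total = sum(s for s, _ in sims)
    return sum(s * r for s, r in sims) / total

rng = np.random.default_rng(0)
seeds = [(rng.normal(size=50), 0.9), (rng.normal(size=50), -0.7),
         (rng.normal(size=50), 0.2)]
print(induce_rating(rng.normal(size=50), seeds))
```

For historical adaptation as in the paper, the embeddings themselves would be trained on period text, so the induced ratings reflect the earlier language stage.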

Unsupervised Text Segmentation Using Semantic Relatedness Graphs

Title Unsupervised Text Segmentation Using Semantic Relatedness Graphs
Authors Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto
Abstract
Tasks Information Retrieval, Text Summarization, Word Embeddings
Published 2016-08-01
URL https://www.aclweb.org/anthology/S16-2016/
PDF https://www.aclweb.org/anthology/S16-2016
PWC https://paperswithcode.com/paper/unsupervised-text-segmentation-using-semantic
Repo
Framework

Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction

Title Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction
Authors Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon
Abstract Time series prediction problems are becoming increasingly high-dimensional in modern applications, such as climatology and demand forecasting. For example, in the latter problem, the number of items for which demand needs to be forecast might be as large as 50,000. In addition, the data is generally noisy and full of missing values. Thus, modern applications require methods that are highly scalable, and can deal with noisy data in terms of corruptions or missing values. However, classical time series methods usually fall short of handling these issues. In this paper, we present a temporal regularized matrix factorization (TRMF) framework which supports data-driven temporal learning and forecasting. We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values. Our proposed TRMF is highly general, and subsumes many existing approaches for time series analysis. We make interesting connections to graph regularization methods in the context of learning the dependencies in an autoregressive framework. Experimental results show the superiority of TRMF in terms of scalability and prediction quality. In particular, TRMF is two orders of magnitude faster than other methods on a problem of dimension 50,000, and generates better forecasts on real-world datasets such as Wal-mart E-commerce datasets.
Tasks Time Series, Time Series Analysis, Time Series Prediction
Published 2016-12-01
URL http://papers.nips.cc/paper/6160-temporal-regularized-matrix-factorization-for-high-dimensional-time-series-prediction
PDF http://papers.nips.cc/paper/6160-temporal-regularized-matrix-factorization-for-high-dimensional-time-series-prediction.pdf
PWC https://paperswithcode.com/paper/temporal-regularized-matrix-factorization-for
Repo
Framework
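
The TRMF idea above can be stated compactly: approximate the data matrix Y (n series × T timesteps) as F X, with F the item factors and X the temporal factors, while penalizing X to follow an autoregressive model across time. A toy gradient-descent sketch with an AR(1) penalty; the lag structure, optimizer, and dense (no missing values) data are simplifying assumptions, whereas the paper uses scalable solvers, general lag sets, and handles missing entries:

```python
# Toy sketch of Temporal Regularized Matrix Factorization (TRMF):
# Y (n x T) ~ F @ X with an AR(1) penalty ||x_t - w * x_{t-1}||^2 on X.
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 20, 50, 3
Y = rng.normal(size=(n, T))

F = rng.normal(scale=0.1, size=(n, k))
X = rng.normal(scale=0.1, size=(k, T))
w, lam, lr = 0.8, 0.5, 0.01

for step in range(500):
    R = F @ X - Y                  # reconstruction residual
    D = X[:, 1:] - w * X[:, :-1]   # AR(1) violations along time
    gF = R @ X.T
    gX = F.T @ R
    gX[:, 1:] += lam * D           # gradient of penalty w.r.t. x_t
    gX[:, :-1] += -lam * w * D     # ... and w.r.t. x_{t-1}
    F -= lr * gF
    X -= lr * gX

print(np.linalg.norm(F @ X - Y) / np.linalg.norm(Y))
```

Forecasting then falls out of the AR structure: future columns of X are extrapolated with the learned autoregressive weights and mapped back through F.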

Towards Non-projective High-Order Dependency Parser

Title Towards Non-projective High-Order Dependency Parser
Authors Wenjing Fang, Kenny Zhu, Yizhong Wang, Jia Tan
Abstract This paper presents a novel high-order dependency parsing framework that targets non-projective treebanks. It imitates how a human parses sentences in an intuitive way. At every step of the parse, it determines which word is the easiest to process among all the remaining words, identifies its head word and then folds it under the head word. Further, this work is flexible enough to be augmented with other parsing techniques.
Tasks Dependency Parsing
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2052/
PDF https://www.aclweb.org/anthology/C16-2052
PWC https://paperswithcode.com/paper/towards-non-projective-high-order-dependency
Repo
Framework
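
The "easiest-first" procedure the abstract describes can be sketched as a loop: repeatedly score all (dependent, head) candidates among the remaining words, commit the most confident attachment, and fold the dependent away. The scorer below is a deterministic stub standing in for the paper's learned high-order model:

```python
# Skeleton of easy-first dependency parsing: repeatedly attach the word
# the model is most confident about, then fold it under its head.
import random

def score(dep, head):
    """Stub confidence that `dep` attaches to `head`; replace with a model."""
    random.seed(hash((dep, head)) & 0xFFFF)
    return random.random()

def easy_first_parse(words):
    remaining = list(words)
    heads = {}
    while len(remaining) > 1:
        # Most confident (dependent, head) pair among remaining words.
        dep, head = max(((d, h) for d in remaining for h in remaining if d != h),
                        key=lambda p: score(*p))
        heads[dep] = head
        remaining.remove(dep)  # fold dep under head
    heads[remaining[0]] = "ROOT"
    return heads

print(easy_first_parse(["A", "hearing", "is", "scheduled", "today"]))
```

Because a head may lie anywhere among the remaining words regardless of position, crossing (non-projective) arcs arise naturally in this scheme.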

The Generalized Reparameterization Gradient

Title The Generalized Reparameterization Gradient
Authors Francisco R. Ruiz, Michalis Titsias, David Blei
Abstract The reparameterization gradient has become a widely used method to obtain Monte Carlo gradients to optimize the variational objective. However, this technique does not easily apply to commonly used distributions such as beta or gamma without further approximations, and most practical applications of the reparameterization gradient fit Gaussian distributions. In this paper, we introduce the generalized reparameterization gradient, a method that extends the reparameterization gradient to a wider class of variational distributions. Generalized reparameterizations use invertible transformations of the latent variables which lead to transformed distributions that weakly depend on the variational parameters. This results in new Monte Carlo gradients that combine reparameterization gradients and score function gradients. We demonstrate our approach on variational inference for two complex probabilistic models. The generalized reparameterization is effective: even a single sample from the variational distribution is enough to obtain a low-variance gradient.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6328-the-generalized-reparameterization-gradient
PDF http://papers.nips.cc/paper/6328-the-generalized-reparameterization-gradient.pdf
PWC https://paperswithcode.com/paper/the-generalized-reparameterization-gradient-1
Repo
Framework
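
For context on what is being generalized: the standard Gaussian reparameterization trick writes z = mu + sigma * eps with eps ~ N(0, 1), so Monte Carlo gradients of E_q[f(z)] flow through the transformation. A minimal numpy illustration of that baseline estimator; the paper's contribution, extending this to gamma, beta, etc. via weakly parameter-dependent transformations plus a score-function correction, is not reproduced here:

```python
# Standard Gaussian reparameterization gradient: the baseline that the
# generalized reparameterization gradient extends to gamma/beta etc.
# Estimates d/d(mu, log_sigma) of E_{z ~ N(mu, sigma^2)}[f(z)] for f(z) = z^2.
import numpy as np

rng = np.random.default_rng(0)
mu, log_sigma = 0.5, -1.0
sigma = np.exp(log_sigma)

eps = rng.standard_normal(10_000)
z = mu + sigma * eps               # reparameterized samples
df_dz = 2 * z                      # gradient of f(z) = z^2

grad_mu = np.mean(df_dz)                       # dz/dmu = 1
grad_log_sigma = np.mean(df_dz * sigma * eps)  # dz/dlog_sigma = sigma * eps

# Analytic check: E[z^2] = mu^2 + sigma^2, so d/dmu = 2*mu = 1.0 and
# d/dlog_sigma = 2*sigma^2 ~= 0.271.
print(grad_mu, grad_log_sigma)
```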