Paper Group NANR 207
Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging. Cross-Lingual Named Entity Recognition via Wikification. DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets. IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Mod …
Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging
Title | Conversing with the elderly in Latin America: a new cohort for multimodal, multilingual longitudinal studies on aging |
Authors | Laura Hernández-Domínguez, Sylvie Ratté, Boyd Davis, Charlene Pope |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1903/ |
PWC | https://paperswithcode.com/paper/conversing-with-the-elderly-in-latin-america |
Repo | |
Framework | |
Cross-Lingual Named Entity Recognition via Wikification
Title | Cross-Lingual Named Entity Recognition via Wikification |
Authors | Chen-Tse Tsai, Stephen Mayhew, Dan Roth |
Abstract | |
Tasks | Entity Linking, Information Retrieval, Named Entity Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-1022/ |
PWC | https://paperswithcode.com/paper/cross-lingual-named-entity-recognition-via |
Repo | |
Framework | |
DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets
Title | DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets |
Authors | Fabrice Dugas, Eric Nichols |
Abstract | In this paper, we describe the DeepNNNER entry to The 2nd Workshop on Noisy User-generated Text (WNUT) Shared Task #2: Named Entity Recognition in Twitter. Our shared task submission adopts the bidirectional LSTM-CNN model of Chiu and Nichols (2016), as it has been shown to perform well on both newswire and Web texts. It uses word embeddings trained on large-scale Web text collections together with text normalization to cope with the diversity in Web texts, and lexicons for target named entity classes constructed from publicly-available sources. Extended evaluation comparing the effectiveness of various word embeddings, text normalization, and lexicon settings shows that our system achieves a maximum F1-score of 47.24, performance surpassing that of the shared task's second-ranked system. |
Tasks | Feature Engineering, Named Entity Recognition, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3924/ |
PWC | https://paperswithcode.com/paper/deepnnner-applying-blstm-cnns-and-extended |
Repo | |
Framework | |
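The entry above describes a bidirectional LSTM-CNN tagger. As a minimal sketch of that architecture (a character-level CNN feeding a word-level BiLSTM), here is a hedged PyTorch version; all layer sizes are illustrative assumptions, and the lexicon and normalization features from the paper are omitted:

```python
import torch
import torch.nn as nn

class BLSTMCNNTagger(nn.Module):
    """Sketch of a Chiu & Nichols (2016)-style tagger: a character-level CNN
    feeds a word-level BiLSTM. Dimensions are illustrative assumptions."""
    def __init__(self, word_vocab, char_vocab, n_tags,
                 word_dim=100, char_dim=25, char_filters=30, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Character CNN: convolve over each word's characters, then max-pool.
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(word_dim + char_filters, hidden,
                              batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, words, chars):
        # words: (batch, seq); chars: (batch, seq, max_word_len)
        b, s, w = chars.shape
        c = self.char_emb(chars).view(b * s, w, -1).transpose(1, 2)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values.view(b, s, -1)
        x = torch.cat([self.word_emb(words), c], dim=-1)
        h, _ = self.bilstm(x)
        return self.out(h)  # per-token tag scores
```

Decoding the per-token scores with a softmax (or a CRF layer, as in Chiu and Nichols) yields the tag sequence.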
IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT
Title | IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT |
Authors | Sandhya Singh, Anoop Kunchukuttan, Pushpak Bhattacharyya |
Abstract | This paper describes IIT Bombay's submission as a part of the shared task in WAT 2016 for the English–Indonesian language pair. The results reported here are for both directions of the language pair. Among the various approaches experimented with, the Operation Sequence Model (OSM) and the Neural Language Model have been submitted for WAT. The OSM approach integrates the translation and reordering processes, resulting in relatively improved translation. Similarly, the neural experiment integrates a Neural Language Model with Statistical Machine Translation (SMT) as a feature for translation. The Neural Probabilistic Language Model (NPLM) gave relatively high BLEU points for the Indonesian-to-English translation system, while the Neural Network Joint Model (NNJM) performed better for the English-to-Indonesian direction. The results indicate improvement over the baseline phrase-based SMT by 0.61 BLEU points for the English-Indonesian system and 0.55 BLEU points for the Indonesian-English translation system. |
Tasks | Language Modelling, Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4604/ |
PWC | https://paperswithcode.com/paper/iit-bombayas-english-indonesian-submission-at |
Repo | |
Framework | |
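The abstract above describes feeding a neural LM score into the SMT system as one feature of a log-linear combination. A hedged sketch of that general idea, framed as n-best rescoring rather than the paper's actual Moses/NPLM integration; the feature weights and the `nlm_logprob` callable are assumptions:

```python
def rescore_nbest(hypotheses, weights, nlm_logprob):
    """Rerank SMT n-best hypotheses with a neural LM score added as one
    log-linear feature. `hypotheses` is a list of (translation, features)
    pairs, where `features` maps feature names to log-scores; `nlm_logprob`
    is any callable returning a sentence log-probability (assumed here)."""
    def score(hyp):
        translation, feats = hyp
        total = sum(weights[name] * value for name, value in feats.items())
        return total + weights["nlm"] * nlm_logprob(translation)
    return max(hypotheses, key=score)
```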
A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter
Title | A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter |
Authors | Yasuhide Miura, Motoki Taniguchi, Tomoki Taniguchi, Tomoko Ohkuma |
Abstract | This paper describes a model that we submitted to W-NUT 2016 Shared Task #1: Geolocation Prediction in Twitter. Our model classifies a tweet or a user to a city using a simple neural network structure with fully-connected layers and average pooling processes. Drawing on the findings of previous geolocation prediction approaches, we integrated various user metadata along with message texts and trained the model with them. In the test run of the task, the model achieved an accuracy of 40.91% and a median distance error of 69.50 km in message-level prediction, and an accuracy of 47.55% and a median distance error of 16.13 km in user-level prediction. These results are moderate performances in terms of accuracy and the best performances in terms of distance. The results show a promising extension of neural-network-based models for geolocation prediction, where recent advances in neural networks can be added to enhance our current simple model. |
Tasks | Denoising |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3931/ |
PWC | https://paperswithcode.com/paper/a-simple-scalable-neural-networks-based-model |
Repo | |
Framework | |
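A minimal sketch of the model structure the abstract describes: average-pooled embeddings of the message text and of user metadata fields, concatenated and passed through fully-connected layers to a softmax over candidate cities. Dimensions and the number of metadata fields are illustrative assumptions, not the submitted system:

```python
import torch
import torch.nn as nn

class GeoClassifier(nn.Module):
    """Average-pooled text and metadata embeddings -> fully-connected
    layers -> city logits. Sizes and field count are assumptions."""
    def __init__(self, vocab, n_cities, emb_dim=100, hidden=300, n_meta_fields=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb_dim, padding_idx=0)
        self.fc = nn.Sequential(
            nn.Linear(emb_dim * (1 + n_meta_fields), hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_cities),
        )
        self.n_meta = n_meta_fields

    def forward(self, text_ids, meta_ids):
        # text_ids: (batch, seq); meta_ids: (batch, n_meta_fields, seq)
        pooled = [self.emb(text_ids).mean(dim=1)]
        for i in range(self.n_meta):
            pooled.append(self.emb(meta_ids[:, i]).mean(dim=1))
        return self.fc(torch.cat(pooled, dim=-1))  # city logits
```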
Argument Mining: the Bottleneck of Knowledge and Language Resources
Title | Argument Mining: the Bottleneck of Knowledge and Language Resources |
Authors | Patrick Saint-Dizier |
Abstract | Given a controversial issue, argument mining from natural language texts (newspapers, and any form of text on the Internet) is extremely challenging: domain knowledge is often required, together with appropriate forms of inference, to identify arguments. This contribution explores the types of knowledge that are required and how they can be paired with reasoning schemes, language processing and language resources to accurately mine arguments. We show via corpus analysis that the Generative Lexicon, enhanced in different manners and viewed as both a lexicon and a domain knowledge representation, is a relevant approach. In this paper, corpus annotation for argument mining is first developed, then we show how the generative lexicon approach must be adapted and how it can be paired with language processing patterns to extract and specify the nature of arguments. Our approach to argument mining is thus knowledge-driven. |
Tasks | Argument Mining |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1156/ |
PWC | https://paperswithcode.com/paper/argument-mining-the-bottleneck-of-knowledge |
Repo | |
Framework | |
Modelling the Usage of Discourse Connectives as Rational Speech Acts
Title | Modelling the Usage of Discourse Connectives as Rational Speech Acts |
Authors | Frances Yung, Kevin Duh, Taku Komura, Yuji Matsumoto |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-1030/ |
PWC | https://paperswithcode.com/paper/modelling-the-usage-of-discourse-connectives |
Repo | |
Framework | |
Finding Arguments as Sequence Labeling in Discourse Parsing
Title | Finding Arguments as Sequence Labeling in Discourse Parsing |
Authors | Ziwei Fan, Zhenghua Li, Min Zhang |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-2021/ |
PWC | https://paperswithcode.com/paper/finding-arguments-as-sequence-labeling-in |
Repo | |
Framework | |
Flexible and Reliable Text Analytics in the Digital Humanities – Some Methodological Considerations
Title | Flexible and Reliable Text Analytics in the Digital Humanities – Some Methodological Considerations |
Authors | Jonas Kuhn |
Abstract | The availability of Language Technology Resources and Tools generates a considerable methodological potential in the Digital Humanities: aspects of research questions from the Humanities and Social Sciences can be addressed on text collections in ways that were unavailable to traditional approaches. I start this talk by sketching some sample scenarios of Digital Humanities projects which involve various Humanities and Social Science disciplines, noting that the potential for a meaningful contribution to higher-level questions is highest when the employed language technological models are carefully tailored both (a) to characteristics of the given target corpus, and (b) to relevant analytical subtasks feeding the discipline-specific research questions. Keeping up a multidisciplinary perspective, I then point out a recurrent dilemma in Digital Humanities projects that follow the conventional set-up of collaboration: to build high-quality computational models for the data, fixed analytical targets should be specified as early as possible – but to be able to respond to Humanities questions as they evolve over the course of analysis, the analytical machinery should be kept maximally flexible. To reach both, I argue for a novel collaborative culture that rests on a more interleaved, continuous dialogue. (Re-)Specification of analytical targets should be an ongoing process in which the Humanities Scholars and Social Scientists play a role that is as important as the Computational Scientists' role. A promising approach lies in the identification of re-occurring types of analytical subtasks, beyond linguistic standard tasks, which can form building blocks for text analysis across disciplines, and for which corpus-based characterizations (viz. annotations) can be collected, compared and revised. On such grounds, computational modeling is more directly tied to the evolving research questions, and hence the seemingly opposing needs of reliable target specifications vs. “malleable” frameworks of analysis can be reconciled. Experimental work following this approach is under way in the Center for Reflected Text Analytics (CRETA) in Stuttgart. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4001/ |
PWC | https://paperswithcode.com/paper/flexible-and-reliable-text-analytics-in-the |
Repo | |
Framework | |
Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language
Title | Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language |
Authors | Teruaki Oka, Tomoaki Kono |
Abstract | We are constructing annotated diachronic corpora of the Japanese language. As part of this work, we construct a corpus of Manyosyu, which is an old Japanese poetry anthology. In this paper, we describe how to align the transcribed text and its original text semi-automatically, to be able to cross-reference them in our Manyosyu corpus. Although we align the original characters to the transcribed words manually, we preliminarily align the transcribed and original characters by using an unsupervised automatic alignment technique from statistical machine translation to alleviate the work. We found that automatic alignment achieves an F1-measure of 0.83; thus, each poem has 1–2 alignment errors. However, finding these errors and modifying them is less work-intensive and more efficient than fully manual annotation. The alignment probabilities can be utilized in this modification. Moreover, we found that we can locate the uncertain transcriptions in our corpus and compare them to other transcriptions by using the alignment probabilities. |
Tasks | Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4006/ |
PWC | https://paperswithcode.com/paper/original-transcribed-text-alignment-for |
Repo | |
Framework | |
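The unsupervised alignment step borrows word-alignment machinery from statistical machine translation. As an illustration of that machinery (not the authors' actual aligner), a minimal IBM Model 1 EM loop estimating translation probabilities between transcribed tokens and original characters might look like this:

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=10):
    """Minimal IBM Model 1 EM estimator for probabilities
    t(original_char | transcribed_word). `bitext` is a list of
    (transcribed_tokens, original_tokens) pairs. A toy sketch of the kind
    of unsupervised SMT alignment the paper uses, not its actual pipeline."""
    t = defaultdict(lambda: 1.0)  # uniform-ish initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in bitext:          # E-step: expected co-occurrence counts
            for o in tgt:
                norm = sum(t[(o, s)] for s in src)
                for s in src:
                    frac = t[(o, s)] / norm
                    count[(o, s)] += frac
                    total[s] += frac
        for (o, s) in count:             # M-step: renormalize
            t[(o, s)] = count[(o, s)] / total[s]
    return t
```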
Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis
Title | Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis |
Authors | Sven Buechel, Johannes Hellrich, Udo Hahn |
Abstract | We here describe a novel methodology for measuring affective language in historical text by expanding an affective lexicon and jointly adapting it to prior language stages. We automatically construct a lexicon for word-emotion association of 18th and 19th century German which is then validated against expert ratings. Subsequently, this resource is used to identify distinct emotional patterns and trace long-term emotional trends in different genres of writing spanning several centuries. |
Tasks | Emotion Recognition, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4008/ |
PWC | https://paperswithcode.com/paper/feelings-from-the-pastaadapting-affective |
Repo | |
Framework | |
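One simple way to realize the lexicon expansion the abstract describes is to score out-of-lexicon words by similarity-weighted averaging of seed-word emotion ratings in an embedding space trained on the period text. This k-NN scheme is an assumption for illustration, not necessarily the authors' induction method:

```python
import numpy as np

def induce_emotion_score(word_vec, seed_vecs, seed_scores, k=10):
    """Assign an emotion rating to an out-of-lexicon word as the
    similarity-weighted mean of its k nearest seed words' ratings.
    `seed_vecs` is (n_seeds, dim), `seed_scores` is (n_seeds,); embeddings
    are assumed to be trained on the historical corpus. Illustrative only."""
    sims = seed_vecs @ word_vec / (
        np.linalg.norm(seed_vecs, axis=1) * np.linalg.norm(word_vec) + 1e-9)
    top = np.argsort(sims)[-k:]          # indices of the k nearest seeds
    w = np.clip(sims[top], 0.0, None)    # ignore negative similarities
    return float(w @ seed_scores[top] / (w.sum() + 1e-9))
```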
Unsupervised Text Segmentation Using Semantic Relatedness Graphs
Title | Unsupervised Text Segmentation Using Semantic Relatedness Graphs |
Authors | Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto |
Abstract | |
Tasks | Information Retrieval, Text Summarization, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/S16-2016/ |
PWC | https://paperswithcode.com/paper/unsupervised-text-segmentation-using-semantic |
Repo | |
Framework | |
Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction
Title | Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction |
Authors | Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon |
Abstract | Time series prediction problems are becoming increasingly high-dimensional in modern applications, such as climatology and demand forecasting. For example, in the latter problem, the number of items for which demand needs to be forecast might be as large as 50,000. In addition, the data is generally noisy and full of missing values. Thus, modern applications require methods that are highly scalable, and can deal with noisy data in terms of corruptions or missing values. However, classical time series methods usually fall short of handling these issues. In this paper, we present a temporal regularized matrix factorization (TRMF) framework which supports data-driven temporal learning and forecasting. We develop novel regularization schemes and use scalable matrix factorization methods that are eminently suited for high-dimensional time series data that has many missing values. Our proposed TRMF is highly general, and subsumes many existing approaches for time series analysis. We make interesting connections to graph regularization methods in the context of learning the dependencies in an autoregressive framework. Experimental results show the superiority of TRMF in terms of scalability and prediction quality. In particular, TRMF is two orders of magnitude faster than other methods on a problem of dimension 50,000, and generates better forecasts on real-world datasets such as Wal-mart E-commerce datasets. |
Tasks | Time Series, Time Series Analysis, Time Series Prediction |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6160-temporal-regularized-matrix-factorization-for-high-dimensional-time-series-prediction |
PDF | http://papers.nips.cc/paper/6160-temporal-regularized-matrix-factorization-for-high-dimensional-time-series-prediction.pdf |
PWC | https://paperswithcode.com/paper/temporal-regularized-matrix-factorization-for |
Repo | |
Framework | |
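The core TRMF idea is to factorize the partially observed series matrix Y ≈ FX while penalizing latent temporal factors X that deviate from an autoregressive model. A toy gradient step on that objective, with an assumed lag set and diagonal AR weights (the paper develops dedicated scalable solvers, not plain gradient descent):

```python
import numpy as np

def trmf_step(Y, F, X, W, lags, lam_ar=1.0, lam_f=0.1, lr=1e-3):
    """One gradient step for a toy TRMF objective:
        ||M * (Y - F X)||^2 + lam_ar * sum_t ||x_t - sum_l W_l x_{t-l}||^2
        + lam_f * ||F||^2,
    where M masks missing entries (NaNs in Y). Y: (n, T), F: (n, k),
    X: (k, T), W: dict lag -> (k,) diagonal AR weights. Illustrative only."""
    M = ~np.isnan(Y)
    R = np.where(M, np.nan_to_num(Y) - F @ X, 0.0)   # masked residual
    gF = -2 * R @ X.T + 2 * lam_f * F
    gX = -2 * F.T @ R
    for t in range(max(lags), X.shape[1]):
        e = X[:, t] - sum(W[l] * X[:, t - l] for l in lags)  # AR residual
        gX[:, t] += 2 * lam_ar * e
        for l in lags:
            gX[:, t - l] -= 2 * lam_ar * W[l] * e
    return F - lr * gF, X - lr * gX
```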
Towards Non-projective High-Order Dependency Parser
Title | Towards Non-projective High-Order Dependency Parser |
Authors | Wenjing Fang, Kenny Zhu, Yizhong Wang, Jia Tan |
Abstract | This paper presents a novel high-order dependency parsing framework that targets non-projective treebanks. It imitates how a human parses sentences in an intuitive way. At every step of the parse, it determines which word is the easiest to process among all the remaining words, identifies its head word and then folds it under the head word. Further, this work is flexible enough to be augmented with other parsing techniques. |
Tasks | Dependency Parsing |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2052/ |
PWC | https://paperswithcode.com/paper/towards-non-projective-high-order-dependency |
Repo | |
Framework | |
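A toy rendering of the easiest-first procedure the abstract sketches: at each step, pick the remaining word whose attachment the model is most confident about, attach it to its predicted head, and fold it away. The `score` function is an assumed black box standing in for the paper's model:

```python
def easiest_first_parse(words, score):
    """Greedy easiest-first parsing sketch. `score(dep, head, remaining)` is
    an assumed confidence that `head` governs `dep` given the words still
    unattached. Returns a dict dep_index -> head_index (-1 = root)."""
    remaining = list(range(len(words)))
    heads = {}
    while len(remaining) > 1:
        # Find the most confidently attachable (dependent, head) pair.
        dep, head, _ = max(
            ((d, h, score(d, h, remaining))
             for d in remaining for h in remaining if h != d),
            key=lambda t: t[2])
        heads[dep] = head
        remaining.remove(dep)   # fold the dependent under its head
    heads[remaining[0]] = -1    # last remaining word is the root
    return heads
```

Because dependents can attach to heads on either side in any order, nothing in this scheme forces projectivity, which is the point of the non-projective framing.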
The Generalized Reparameterization Gradient
Title | The Generalized Reparameterization Gradient |
Authors | Francisco R. Ruiz, Michalis Titsias, David Blei |
Abstract | The reparameterization gradient has become a widely used method to obtain Monte Carlo gradients to optimize the variational objective. However, this technique does not easily apply to commonly used distributions such as beta or gamma without further approximations, and most practical applications of the reparameterization gradient fit Gaussian distributions. In this paper, we introduce the generalized reparameterization gradient, a method that extends the reparameterization gradient to a wider class of variational distributions. Generalized reparameterizations use invertible transformations of the latent variables which lead to transformed distributions that weakly depend on the variational parameters. This results in new Monte Carlo gradients that combine reparameterization gradients and score function gradients. We demonstrate our approach on variational inference for two complex probabilistic models. The generalized reparameterization is effective: even a single sample from the variational distribution is enough to obtain a low-variance gradient. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6328-the-generalized-reparameterization-gradient |
PDF | http://papers.nips.cc/paper/6328-the-generalized-reparameterization-gradient.pdf |
PWC | https://paperswithcode.com/paper/the-generalized-reparameterization-gradient-1 |
Repo | |
Framework | |
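Read as a formula (hedged, from the abstract alone): with z = T(ε; θ) and ε ~ q(ε; θ) only weakly dependent on θ, the gradient of the variational objective splits into a reparameterization term and a score-function term:

```latex
\nabla_\theta \, \mathbb{E}_{q(z;\theta)}[f(z)]
  = \underbrace{\mathbb{E}_{q(\varepsilon;\theta)}\!\big[\nabla_z f(z)\,
      \nabla_\theta T(\varepsilon;\theta)\big]}_{\text{reparameterization term}}
  + \underbrace{\mathbb{E}_{q(\varepsilon;\theta)}\!\big[f(z)\,
      \nabla_\theta \log q(\varepsilon;\theta)\big]}_{\text{score-function term}},
  \qquad z = T(\varepsilon;\theta).
```

When q(ε; θ) is fully independent of θ, the score-function term vanishes and the estimator reduces to the standard reparameterization gradient.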