May 5, 2019

Paper Group NANR 98

A Global Analysis of Emoji Usage


Title	A Global Analysis of Emoji Usage
Authors	Nikola Ljubešić, Darja Fišer
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2610/
PDF	https://www.aclweb.org/anthology/W16-2610
PWC	https://paperswithcode.com/paper/a-global-analysis-of-emoji-usage
Repo
Framework

Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik


Title	Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik
Authors	Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin
Abstract	This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as a part of a pilot study of "Linguistic Rapid Response" to potential emergency humanitarian relief situations. In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing human linguist judgments. We show promising results on both a four-month exercise in Sorani and a two-day exercise in Tajik, achieved with minimal annotation costs.
Tasks	Named Entity Recognition
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1095/
PDF	https://www.aclweb.org/anthology/C16-1095
PWC	https://paperswithcode.com/paper/named-entity-recognition-for-linguistic-rapid
Repo
Framework

CDE-IIITH at SemEval-2016 Task 12: Extraction of Temporal Information from Clinical documents using Machine Learning techniques


Title	CDE-IIITH at SemEval-2016 Task 12: Extraction of Temporal Information from Clinical documents using Machine Learning techniques
Authors	Veera Raghavendra Chikka
Abstract
Tasks	Relation Extraction, Temporal Information Extraction
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1192/
PDF	https://www.aclweb.org/anthology/S16-1192
PWC	https://paperswithcode.com/paper/cde-iiith-at-semeval-2016-task-12-extraction
Repo
Framework

Sentiment Analysis of Tweets in Three Indian Languages


Title	Sentiment Analysis of Tweets in Three Indian Languages
Authors	Shanta Phani, Shibamouli Lahiri, Arindam Biswas
Abstract	In this paper, we describe the results of sentiment analysis on tweets in three Indian languages: Bengali, Hindi, and Tamil. We used the recently released SAIL dataset (Patra et al., 2015), and obtained state-of-the-art results in all three languages. Our features are simple, robust, scalable, and language-independent. Further, we show that these simple features provide better results than more complex and language-specific features, in two separate classification tasks. Detailed feature analysis and error analysis have been reported, along with learning curves for Hindi and Bengali.
Tasks	Opinion Mining, Sentiment Analysis
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-3710/
PDF	https://www.aclweb.org/anthology/W16-3710
PWC	https://paperswithcode.com/paper/sentiment-analysis-of-tweets-in-three-indian
Repo
Framework

A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora


Title	A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora
Authors	Tanja Samardžić, Maja Miličević
Abstract	Verb aspect is a grammatical and lexical category that encodes temporal unfolding and duration of events described by verbs. It is a potentially interesting source of information for various computational tasks, but has so far not been studied in much depth from the perspective of automatic processing. Slavic languages are particularly interesting in this respect, as they encode aspect through complex and not entirely consistent lexical derivations involving prefixation and suffixation. Focusing on Croatian and Serbian, in this paper we propose a novel framework for automatic classification of their verb types into a number of fine-grained aspectual classes based on the observable morphology of verb forms. In addition, we provide a set of around 2000 verbs classified based on our framework. This set can be used for linguistic research as well as for testing automatic classification on a larger scale. With minor adjustments the approach is also applicable to other Slavic languages.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1728/
PDF	https://www.aclweb.org/anthology/L16-1728
PWC	https://paperswithcode.com/paper/a-framework-for-automatic-acquisition-of
Repo
Framework

Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition


Title	Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
Authors	Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao
Abstract	Contexts are crucial for action recognition in video. Current methods often mine contexts after extracting hierarchical local features and focus on their high-order encodings. This paper instead explores contexts as early as possible and leverages their evolutions for action recognition. In particular, we introduce a novel architecture called deep alternative neural network (DANN) stacking alternative layers. Each alternative layer consists of a volumetric convolutional layer followed by a recurrent layer. The former acts as local feature learner while the latter is used to collect contexts. Compared with feed-forward neural networks, DANN learns contexts of local features from the very beginning. This setting helps to preserve hierarchical context evolutions which we show are essential to recognize similar actions. Besides, we present an adaptive method to determine the temporal size for network input based on optical flow energy, and develop a volumetric pyramid pooling layer to deal with input clips of arbitrary sizes. We demonstrate the advantages of DANN on two benchmarks HMDB51 and UCF101 and report competitive or superior results to the state-of-the-art.
Tasks	Optical Flow Estimation, Temporal Action Localization
Published	2016-12-01
URL	http://papers.nips.cc/paper/6335-deep-alternative-neural-network-exploring-contexts-as-early-as-possible-for-action-recognition
PDF	http://papers.nips.cc/paper/6335-deep-alternative-neural-network-exploring-contexts-as-early-as-possible-for-action-recognition.pdf
PWC	https://paperswithcode.com/paper/deep-alternative-neural-network-exploring
Repo
Framework

Multi-source named entity typing for social media


Title	Multi-source named entity typing for social media
Authors	Reuth Vexler, Einat Minkov
Abstract
Tasks	Entity Linking, Entity Typing, Named Entity Recognition, Question Answering, Relation Extraction
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2702/
PDF	https://www.aclweb.org/anthology/W16-2702
PWC	https://paperswithcode.com/paper/multi-source-named-entity-typing-for-social
Repo
Framework

Intra-Sentential Subject Zero Anaphora Resolution using Multi-Column Convolutional Neural Network


Title	Intra-Sentential Subject Zero Anaphora Resolution using Multi-Column Convolutional Neural Network
Authors	Ryu Iida, Kentaro Torisawa, Jong-Hoon Oh, Canasai Kruengkrai, Julien Kloetzer
Abstract
Tasks	Machine Translation
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1132/
PDF	https://www.aclweb.org/anthology/D16-1132
PWC	https://paperswithcode.com/paper/intra-sentential-subject-zero-anaphora
Repo
Framework

Broad Twitter Corpus: A Diverse Named Entity Recognition Resource


Title	Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
Authors	Leon Derczynski, Kalina Bontcheva, Ian Roberts
Abstract	One of the main obstacles, hampering method development and comparative evaluation of named entity recognition in social media, is the lack of a sizeable, diverse, high quality annotated corpus, analogous to the CoNLL'2003 news dataset. For instance, the biggest Ritter tweet corpus is only 45,000 tokens, a mere 15% the size of CoNLL'2003. Another major shortcoming is the lack of temporal, geographic, and author diversity. This paper introduces the Broad Twitter Corpus (BTC), which is not only significantly bigger, but sampled across different regions, temporal periods, and types of Twitter users. The gold-standard named entity annotations are made by a combination of NLP experts and crowd workers, which enables us to harness crowd recall while maintaining high quality. We also measure the entity drift observed in our dataset (i.e. how entity representation varies over time), and compare to newswire. The corpus is released openly, including source text and intermediate annotations.
Tasks	Named Entity Recognition
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1111/
PDF	https://www.aclweb.org/anthology/C16-1111
PWC	https://paperswithcode.com/paper/broad-twitter-corpus-a-diverse-named-entity
Repo
Framework

Neural Attention Model for Classification of Sentences that Support Promoting/Suppressing Relationship


Title	Neural Attention Model for Classification of Sentences that Support Promoting/Suppressing Relationship
Authors	Yuta Koreeda, Toshihiko Yanase, Kohsuke Yanai, Misa Sato, Yoshiki Niwa
Abstract
Tasks	Argument Mining, Aspect-Based Sentiment Analysis, Decision Making, Sentiment Analysis
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2809/
PDF	https://www.aclweb.org/anthology/W16-2809
PWC	https://paperswithcode.com/paper/neural-attention-model-for-classification-of
Repo
Framework

Unshared Task at the 3rd Workshop on Argument Mining: Perspective Based Local Agreement and Disagreement in Online Debate


Title	Unshared Task at the 3rd Workshop on Argument Mining: Perspective Based Local Agreement and Disagreement in Online Debate
Authors	Chantal van Son, Tommaso Caselli, Antske Fokkens, Isa Maks, Roser Morante, Lora Aroyo, Piek Vossen
Abstract
Tasks	Argument Mining
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2819/
PDF	https://www.aclweb.org/anthology/W16-2819
PWC	https://paperswithcode.com/paper/unshared-task-at-the-3rd-workshop-on-argument
Repo
Framework

Inferring Implicit Causal Relationships in Biomedical Literature


Title	Inferring Implicit Causal Relationships in Biomedical Literature
Authors	Halil Kilicoglu
Abstract
Tasks	Drug Discovery, Named Entity Recognition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2906/
PDF	https://www.aclweb.org/anthology/W16-2906
PWC	https://paperswithcode.com/paper/inferring-implicit-causal-relationships-in
Repo
Framework

Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words


Title	Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words
Authors	Gustavo Paetzold, Lucia Specia
Abstract	Exploring language usage through frequency analysis in large corpora is a defining feature in most recent work in corpus and computational linguistics. From a psycholinguistic perspective, however, the corpora used in these contributions are often not representative of language usage: they are either domain-specific, limited in size, or extracted from unreliable sources. In an effort to address this limitation, we introduce SubIMDB, a corpus of everyday language spoken text we created which contains over 225 million words. The corpus was extracted from 38,102 subtitles of family, comedy and children's movies and series, and is the first sizeable structured corpus of subtitles made available. Our experiments show that word frequency norms extracted from this corpus are more effective than those from well-known norms such as Kucera-Francis, HAL and SUBTLEXus in predicting various psycholinguistic properties of words, such as lexical decision times, familiarity, age of acquisition and simplicity. We also provide evidence that contradicts the long-standing assumption that the ideal size for a corpus can be determined solely based on how well its word frequencies correlate with lexical decision times.
Tasks	Text Simplification
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1157/
PDF	https://www.aclweb.org/anthology/C16-1157
PWC	https://paperswithcode.com/paper/collecting-and-exploring-everyday-language
Repo
Framework

什麼時候「認真就輸了」？——語料庫中「認真」一詞的語意變化 (Do We Lose When Being Serious? —Change in Meaning of the Word "Renzen(認真)" in Corpora)


Title	什麼時候「認真就輸了」？——語料庫中「認真」一詞的語意變化 (Do We Lose When Being Serious? —Change in Meaning of the Word "Renzen(認真)" in Corpora)
Authors	Pei-Yi Chen, Siaw-Fong Chung
Abstract
Tasks
Published	2016-10-01
URL	https://www.aclweb.org/anthology/O16-1008/
PDF	https://www.aclweb.org/anthology/O16-1008
PWC	https://paperswithcode.com/paper/aeo14aaeacae14-aoai14aaeaaoa-aeacaa-eceaeado
Repo
Framework

Automatic Biomedical Term Polysemy Detection


Title	Automatic Biomedical Term Polysemy Detection
Authors	Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche, Maguelonne Teisseire
Abstract	Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which makes it possible to find the different meanings of a term. Polysemy detection is also important for information extraction (IE) systems and for building or enriching terminologies and ontologies. In this paper, we present a novel approach to detect whether a biomedical term is polysemic, with the long-term goal of enriching biomedical ontologies. This approach is based on the extraction of new features. In this context we propose to extract features in two ways: (i) directly from the text dataset, and (ii) from an induced graph. Our method obtains an accuracy and F-measure of 0.978.
Tasks	Word Sense Induction
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1266/
PDF	https://www.aclweb.org/anthology/L16-1266
PWC	https://paperswithcode.com/paper/automatic-biomedical-term-polysemy-detection
Repo
Framework