May 5, 2019

Paper Group NANR 98

A Global Analysis of Emoji Usage

Title A Global Analysis of Emoji Usage
Authors Nikola Ljubešić, Darja Fišer
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2610/
PDF https://www.aclweb.org/anthology/W16-2610
PWC https://paperswithcode.com/paper/a-global-analysis-of-emoji-usage
Repo
Framework

Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik

Title Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik
Authors Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin
Abstract This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as a part of a pilot study of "Linguistic Rapid Response" to potential emergency humanitarian relief situations. In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing human linguist judgments. We show promising results on both a four-month exercise in Sorani and a two-day exercise in Tajik, achieved with minimal annotation costs.
Tasks Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1095/
PDF https://www.aclweb.org/anthology/C16-1095
PWC https://paperswithcode.com/paper/named-entity-recognition-for-linguistic-rapid
Repo
Framework

CDE-IIITH at SemEval-2016 Task 12: Extraction of Temporal Information from Clinical documents using Machine Learning techniques

Title CDE-IIITH at SemEval-2016 Task 12: Extraction of Temporal Information from Clinical documents using Machine Learning techniques
Authors Veera Raghavendra Chikka
Abstract
Tasks Relation Extraction, Temporal Information Extraction
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1192/
PDF https://www.aclweb.org/anthology/S16-1192
PWC https://paperswithcode.com/paper/cde-iiith-at-semeval-2016-task-12-extraction
Repo
Framework

Sentiment Analysis of Tweets in Three Indian Languages

Title Sentiment Analysis of Tweets in Three Indian Languages
Authors Shanta Phani, Shibamouli Lahiri, Arindam Biswas
Abstract In this paper, we describe the results of sentiment analysis on tweets in three Indian languages: Bengali, Hindi, and Tamil. We used the recently released SAIL dataset (Patra et al., 2015), and obtained state-of-the-art results in all three languages. Our features are simple, robust, scalable, and language-independent. Further, we show that these simple features provide better results than more complex and language-specific features, in two separate classification tasks. Detailed feature analysis and error analysis have been reported, along with learning curves for Hindi and Bengali.
Tasks Opinion Mining, Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3710/
PDF https://www.aclweb.org/anthology/W16-3710
PWC https://paperswithcode.com/paper/sentiment-analysis-of-tweets-in-three-indian
Repo
Framework

A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora

Title A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora
Authors Tanja Samardžić, Maja Miličević
Abstract Verb aspect is a grammatical and lexical category that encodes temporal unfolding and duration of events described by verbs. It is a potentially interesting source of information for various computational tasks, but has so far not been studied in much depth from the perspective of automatic processing. Slavic languages are particularly interesting in this respect, as they encode aspect through complex and not entirely consistent lexical derivations involving prefixation and suffixation. Focusing on Croatian and Serbian, in this paper we propose a novel framework for automatic classification of their verb types into a number of fine-grained aspectual classes based on the observable morphology of verb forms. In addition, we provide a set of around 2000 verbs classified based on our framework. This set can be used for linguistic research as well as for testing automatic classification on a larger scale. With minor adjustments the approach is also applicable to other Slavic languages.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1728/
PDF https://www.aclweb.org/anthology/L16-1728
PWC https://paperswithcode.com/paper/a-framework-for-automatic-acquisition-of
Repo
Framework

Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition

Title Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
Authors Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao
Abstract Contexts are crucial for action recognition in video. Current methods often mine contexts after extracting hierarchical local features and focus on their high-order encodings. This paper instead explores contexts as early as possible and leverages their evolutions for action recognition. In particular, we introduce a novel architecture called deep alternative neural network (DANN) stacking alternative layers. Each alternative layer consists of a volumetric convolutional layer followed by a recurrent layer. The former acts as local feature learner while the latter is used to collect contexts. Compared with feed-forward neural networks, DANN learns contexts of local features from the very beginning. This setting helps to preserve hierarchical context evolutions which we show are essential to recognize similar actions. Besides, we present an adaptive method to determine the temporal size for network input based on optical flow energy, and develop a volumetric pyramid pooling layer to deal with input clips of arbitrary sizes. We demonstrate the advantages of DANN on two benchmarks HMDB51 and UCF101 and report competitive or superior results to the state-of-the-art.
Tasks Optical Flow Estimation, Temporal Action Localization
Published 2016-12-01
URL http://papers.nips.cc/paper/6335-deep-alternative-neural-network-exploring-contexts-as-early-as-possible-for-action-recognition
PDF http://papers.nips.cc/paper/6335-deep-alternative-neural-network-exploring-contexts-as-early-as-possible-for-action-recognition.pdf
PWC https://paperswithcode.com/paper/deep-alternative-neural-network-exploring
Repo
Framework

Multi-source named entity typing for social media

Title Multi-source named entity typing for social media
Authors Reuth Vexler, Einat Minkov
Abstract
Tasks Entity Linking, Entity Typing, Named Entity Recognition, Question Answering, Relation Extraction
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2702/
PDF https://www.aclweb.org/anthology/W16-2702
PWC https://paperswithcode.com/paper/multi-source-named-entity-typing-for-social
Repo
Framework

Intra-Sentential Subject Zero Anaphora Resolution using Multi-Column Convolutional Neural Network

Title Intra-Sentential Subject Zero Anaphora Resolution using Multi-Column Convolutional Neural Network
Authors Ryu Iida, Kentaro Torisawa, Jong-Hoon Oh, Canasai Kruengkrai, Julien Kloetzer
Abstract
Tasks Machine Translation
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1132/
PDF https://www.aclweb.org/anthology/D16-1132
PWC https://paperswithcode.com/paper/intra-sentential-subject-zero-anaphora
Repo
Framework

Broad Twitter Corpus: A Diverse Named Entity Recognition Resource

Title Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
Authors Leon Derczynski, Kalina Bontcheva, Ian Roberts
Abstract One of the main obstacles, hampering method development and comparative evaluation of named entity recognition in social media, is the lack of a sizeable, diverse, high quality annotated corpus, analogous to the CoNLL'2003 news dataset. For instance, the biggest Ritter tweet corpus is only 45,000 tokens, a mere 15% the size of CoNLL'2003. Another major shortcoming is the lack of temporal, geographic, and author diversity. This paper introduces the Broad Twitter Corpus (BTC), which is not only significantly bigger, but sampled across different regions, temporal periods, and types of Twitter users. The gold-standard named entity annotations are made by a combination of NLP experts and crowd workers, which enables us to harness crowd recall while maintaining high quality. We also measure the entity drift observed in our dataset (i.e. how entity representation varies over time), and compare to newswire. The corpus is released openly, including source text and intermediate annotations.
Tasks Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1111/
PDF https://www.aclweb.org/anthology/C16-1111
PWC https://paperswithcode.com/paper/broad-twitter-corpus-a-diverse-named-entity
Repo
Framework

Neural Attention Model for Classification of Sentences that Support Promoting/Suppressing Relationship

Title Neural Attention Model for Classification of Sentences that Support Promoting/Suppressing Relationship
Authors Yuta Koreeda, Toshihiko Yanase, Kohsuke Yanai, Misa Sato, Yoshiki Niwa
Abstract
Tasks Argument Mining, Aspect-Based Sentiment Analysis, Decision Making, Sentiment Analysis
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2809/
PDF https://www.aclweb.org/anthology/W16-2809
PWC https://paperswithcode.com/paper/neural-attention-model-for-classification-of
Repo
Framework

Unshared Task at the 3rd Workshop on Argument Mining: Perspective Based Local Agreement and Disagreement in Online Debate

Title Unshared Task at the 3rd Workshop on Argument Mining: Perspective Based Local Agreement and Disagreement in Online Debate
Authors Chantal van Son, Tommaso Caselli, Antske Fokkens, Isa Maks, Roser Morante, Lora Aroyo, Piek Vossen
Abstract
Tasks Argument Mining
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2819/
PDF https://www.aclweb.org/anthology/W16-2819
PWC https://paperswithcode.com/paper/unshared-task-at-the-3rd-workshop-on-argument
Repo
Framework

Inferring Implicit Causal Relationships in Biomedical Literature

Title Inferring Implicit Causal Relationships in Biomedical Literature
Authors Halil Kilicoglu
Abstract
Tasks Drug Discovery, Named Entity Recognition
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2906/
PDF https://www.aclweb.org/anthology/W16-2906
PWC https://paperswithcode.com/paper/inferring-implicit-causal-relationships-in
Repo
Framework

Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words

Title Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words
Authors Gustavo Paetzold, Lucia Specia
Abstract Exploring language usage through frequency analysis in large corpora is a defining feature in most recent work in corpus and computational linguistics. From a psycholinguistic perspective, however, the corpora used in these contributions are often not representative of language usage: they are either domain-specific, limited in size, or extracted from unreliable sources. In an effort to address this limitation, we introduce SubIMDB, a corpus of everyday language spoken text we created which contains over 225 million words. The corpus was extracted from 38,102 subtitles of family, comedy and children movies and series, and is the first sizeable structured corpus of subtitles made available. Our experiments show that word frequency norms extracted from this corpus are more effective than those from well-known norms such as Kucera-Francis, HAL and SUBTLEXus in predicting various psycholinguistic properties of words, such as lexical decision times, familiarity, age of acquisition and simplicity. We also provide evidence that contradicts the long-standing assumption that the ideal size for a corpus can be determined solely based on how well its word frequencies correlate with lexical decision times.
Tasks Text Simplification
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1157/
PDF https://www.aclweb.org/anthology/C16-1157
PWC https://paperswithcode.com/paper/collecting-and-exploring-everyday-language
Repo
Framework

什麼時候「認真就輸了」?——語料庫中「認真」一詞的語意變化 (Do We Lose When Being Serious? —Change in Meaning of the Word "Renzen (認真)" in Corpora)

Title 什麼時候「認真就輸了」?——語料庫中「認真」一詞的語意變化 (Do We Lose When Being Serious? —Change in Meaning of the Word "Renzen (認真)" in Corpora)
Authors Pei-Yi Chen, Siaw-Fong Chung
Abstract
Tasks
Published 2016-10-01
URL https://www.aclweb.org/anthology/O16-1008/
PDF https://www.aclweb.org/anthology/O16-1008
PWC https://paperswithcode.com/paper/aeo14aaeacae14-aoai14aaeaaoa-aeacaa-eceaeado
Repo
Framework

Automatic Biomedical Term Polysemy Detection

Title Automatic Biomedical Term Polysemy Detection
Authors Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche, Maguelonne Teisseire
Abstract Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which makes it possible to find different meanings for a term. Polysemy detection is also important for information extraction (IE) systems and for building or enriching terminologies and ontologies. In this paper, we present a novel approach to detect if a biomedical term is polysemic, with the long-term goal of enriching biomedical ontologies. This approach is based on the extraction of new features. In this context we propose to extract features in two ways: (i) directly from the text dataset, and (ii) from an induced graph. Our method obtains an Accuracy and F-Measure of 0.978.
Tasks Word Sense Induction
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1266/
PDF https://www.aclweb.org/anthology/L16-1266
PWC https://paperswithcode.com/paper/automatic-biomedical-term-polysemy-detection
Repo
Framework