July 26, 2019

2412 words 12 mins read

Paper Group NANR 8

A Shallow Neural Network for Native Language Identification with Character N-grams. Fewer features perform well at Native Language Identification task. Domain Adaptation from User-level Facebook Models to County-level Twitter Predictions. The Power of Character N-grams in Native Language Identification. Addressing Problems across Linguistic Levels …

A Shallow Neural Network for Native Language Identification with Character N-grams

Title A Shallow Neural Network for Native Language Identification with Character N-grams
Authors Yunita Sari, Muhammad Rifqi Fatchurrahman, Meisyarah Dwiastuti
Abstract This paper describes the systems submitted by the GadjahMada team to the Native Language Identification (NLI) Shared Task 2017. Our models use a continuous representation of character n-grams which is learned jointly with a feed-forward neural network classifier. Character n-grams have proved effective for style-based identification tasks, including NLI. Results on the test set demonstrate that the proposed model performs very well on the essay and fusion tracks, obtaining more than 0.8 in both macro F-score and accuracy.
Tasks Language Identification, Native Language Identification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5027/
PDF https://www.aclweb.org/anthology/W17-5027
PWC https://paperswithcode.com/paper/a-shallow-neural-network-for-native-language
Repo
Framework
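
The entry above describes learning continuous character n-gram representations jointly with a shallow feed-forward classifier. A rough, hedged approximation in scikit-learn is sketched below: it feeds TF-IDF-weighted character n-grams into a small multilayer perceptron rather than learning joint n-gram embeddings, and the data, n-gram range and layer size are toy assumptions, not the authors' setup.

```python
# Rough approximation only: TF-IDF character n-grams fed to a small MLP.
# The paper instead learns continuous n-gram representations jointly with
# the classifier; data, ranges and sizes here are toy assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

train_texts = ["I am agree with this statement .", "He has give me many informations ."]
train_labels = ["SPA", "GER"]

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)),
    MLPClassifier(hidden_layer_sizes=(128,), max_iter=200, random_state=0),
)
model.fit(train_texts, train_labels)
print(model.predict(["She have told me much informations ."]))
```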

Fewer features perform well at Native Language Identification task

Title Fewer features perform well at Native Language Identification task
Authors Taraka Rama, Çağrı Çöltekin
Abstract This paper describes our results at the NLI Shared Task 2017. We participated in the essay, speech, and fusion tracks, which use text, speech, and i-vectors for the task of identifying the native language of the given input. In the essay track, a linear SVM system using word bigrams and character 7-grams performed best. In the speech track, an LDA classifier based only on i-vectors performed better than a combination system using text features from speech transcriptions and i-vectors. In the fusion track, we experimented with systems that combined i-vectors with higher-order n-gram features, combined i-vectors with word unigrams, a mean-probability ensemble, and a stacked ensemble. Our finding is that word unigrams in combination with i-vectors achieve a higher score than systems trained with a larger number of n-gram features. Our best-performing systems achieved F1-scores of 87.16%, 83.33% and 91.75% on the essay track, the speech track and the fusion track respectively.
Tasks Language Identification, Native Language Identification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5028/
PDF https://www.aclweb.org/anthology/W17-5028
PWC https://paperswithcode.com/paper/fewer-features-perform-well-at-native
Repo
Framework
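
For the essay-track configuration described above (a linear SVM over word bigrams and character 7-grams), a minimal sketch follows. The feature union and hyperparameters are illustrative assumptions, and the data is a toy stand-in for the TOEFL11 essays.

```python
# Sketch of the best essay-track setup described above: a linear SVM over a
# union of word-bigram and character 7-gram features. Toy data; assumed settings.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.svm import LinearSVC

features = FeatureUnion([
    ("word_bigrams", TfidfVectorizer(analyzer="word", ngram_range=(2, 2))),
    ("char_7grams", TfidfVectorizer(analyzer="char", ngram_range=(7, 7))),
])
clf = make_pipeline(features, LinearSVC(C=1.0))

essays = ["I am agree with the statement .", "He have much informations ."]
labels = ["SPA", "GER"]
clf.fit(essays, labels)
print(clf.predict(["She have many informations ."]))
```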

Domain Adaptation from User-level Facebook Models to County-level Twitter Predictions

Title Domain Adaptation from User-level Facebook Models to County-level Twitter Predictions
Authors Daniel Rieman, Kokil Jaidka, H. Andrew Schwartz, Lyle Ungar
Abstract Several studies have demonstrated how language models of user attributes, such as personality, can be built by using the Facebook language of social media users in conjunction with their responses to psychology questionnaires. Applying these models to make general predictions about attributes of communities, such as personality distributions across US counties, is challenging because it requires (1) coping with the potential unavailability of the original training data due to privacy and ethical regulations, (2) adapting Facebook language models to Twitter language without retraining the model, and (3) adapting from individual users to county-level collections of tweets. We propose a two-step algorithm, Target Side Domain Adaptation (TSDA), for such domain adaptation when no labeled Twitter/county data is available. TSDA corrects for the different word distributions between Facebook and Twitter and for the varying word distributions across counties by adjusting target-side word frequencies; no changes to the trained model are made. In the case of predicting the Big Five county-level personality traits, TSDA outperforms a state-of-the-art domain adaptation method and gives county-level predictions that have fewer extreme outliers, higher year-to-year stability, and higher correlation with county-level outcomes.
Tasks Domain Adaptation
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1077/
PDF https://www.aclweb.org/anthology/I17-1077
PWC https://paperswithcode.com/paper/domain-adaptation-from-user-level-facebook
Repo
Framework
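
The abstract sketches the idea behind TSDA: correct target-side word frequencies so that an unchanged Facebook-trained model can be applied to county-level Twitter aggregates. The exact correction is not spelled out above, so the snippet below is only a loose, assumption-laden illustration that rescales county word frequencies by the ratio of source- to target-platform relative frequencies before applying a toy linear lexicon model.

```python
# Hedged sketch in the spirit of TSDA: county-level Twitter word frequencies
# are rescaled toward the Facebook (source) distribution before applying a
# pre-trained linear lexicon model unchanged. The ratio-based correction is an
# illustrative assumption, not the paper's exact method; all numbers are toys.
from collections import Counter

facebook_freq = Counter({"friends": 120, "party": 80, "work": 200})   # source corpus counts
twitter_freq = Counter({"friends": 60, "party": 160, "work": 100})    # target corpus counts
model_weights = {"friends": 0.4, "party": -0.2, "work": 0.1}          # trained Facebook model
intercept = 0.0

def platform_ratio(word):
    """Ratio of source to target relative frequency for one word."""
    fb_total, tw_total = sum(facebook_freq.values()), sum(twitter_freq.values())
    fb = facebook_freq[word] / fb_total
    tw = twitter_freq[word] / tw_total
    return fb / tw if tw > 0 else 1.0

def predict_county(county_counts):
    """Apply the unchanged model to frequency-corrected county features."""
    total = sum(county_counts.values())
    score = intercept
    for word, count in county_counts.items():
        corrected = (count / total) * platform_ratio(word)
        score += model_weights.get(word, 0.0) * corrected
    return score

print(predict_county(Counter({"friends": 10, "party": 30, "work": 5})))
```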

The Power of Character N-grams in Native Language Identification

Title The Power of Character N-grams in Native Language Identification
Authors Artur Kulmizev, Bo Blankers, Johannes Bjerva, Malvina Nissim, Gertjan van Noord, Barbara Plank, Martijn Wieling
Abstract In this paper, we explore the performance of a linear SVM trained on language independent character features for the NLI Shared Task 2017. Our basic system (GRONINGEN) achieves the best performance (87.56 F1-score) on the evaluation set using only 1-9 character n-grams as features. We compare this against several ensemble and meta-classifiers in order to examine how the linear system fares when combined with other, especially non-linear classifiers. Special emphasis is placed on the topic bias that exists by virtue of the assessment essay prompt distribution.
Tasks Language Identification, Native Language Identification, Text Classification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5043/
PDF https://www.aclweb.org/anthology/W17-5043
PWC https://paperswithcode.com/paper/the-power-of-character-n-grams-in-native
Repo
Framework
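
The GRONINGEN system described above is essentially a linear SVM over character 1-9 grams, which is easy to sketch; the toy data, TF-IDF weighting and SVM settings below are assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of a linear SVM over character 1-9 grams only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

nli_clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 9), sublinear_tf=True),
    LinearSVC(),
)
essays = ["In my opinion , the informations is useful .",
          "I am agree that students must to work hard ."]
labels = ["GER", "SPA"]
nli_clf.fit(essays, labels)
print(nli_clf.predict(["He said me that he have agree ."]))
```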

Addressing Problems across Linguistic Levels in SMT: Combining Approaches to Model Morphology, Syntax and Lexical Choice

Title Addressing Problems across Linguistic Levels in SMT: Combining Approaches to Model Morphology, Syntax and Lexical Choice
Authors Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Abstract Many errors in phrase-based SMT can be attributed to problems on three linguistic levels: morphological complexity in the target language, structural differences and lexical choice. We explore combinations of linguistically motivated approaches to address these problems in English-to-German SMT and show that they are complementary to one another, but also that the popular verbal pre-ordering can cause problems on the morphological and lexical level. A discriminative classifier can overcome these problems, in particular when enriching standard lexical features with features geared towards verbal inflection.
Tasks Word Alignment, Word Sense Disambiguation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2099/
PDF https://www.aclweb.org/anthology/E17-2099
PWC https://paperswithcode.com/paper/addressing-problems-across-linguistic-levels
Repo
Framework

Unsupervised Detection of Argumentative Units though Topic Modeling Techniques

Title Unsupervised Detection of Argumentative Units though Topic Modeling Techniques
Authors Alfio Ferrara, Stefano Montanelli, Georgios Petasis
Abstract In this paper we present a new unsupervised approach, “Attraction to Topics” (A2T), for the detection of argumentative units, a sub-task of argument mining. Motivated by the importance of topic identification in manual annotation, we examine whether topic modeling can be used for performing unsupervised detection of argumentative sentences, and to what extent topic modeling can be used to classify sentences as claims and premises. Preliminary evaluation results suggest that topic information can be successfully used for the detection of argumentative sentences, at least for the corpora used for evaluation. Our approach has been evaluated on two English corpora, the first of which contains 90 persuasive essays, while the second is a collection of 340 documents from user-generated content.
Tasks Argument Mining, Opinion Mining, Stance Detection
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5113/
PDF https://www.aclweb.org/anthology/W17-5113
PWC https://paperswithcode.com/paper/unsupervised-detection-of-argumentative-units
Repo
Framework
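
Since the abstract does not spell out how "attraction" to a topic is scored, the sketch below is only an assumption-laden illustration of the general idea: fit a topic model over sentences and flag as argumentative those whose topic distribution concentrates on a single topic. The threshold and data are placeholders.

```python
# Hedged illustration: LDA over sentences, then flag sentences strongly
# "attracted" to one topic. The concrete A2T scoring is not described in the
# abstract; the max-topic-probability threshold is an assumption.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

sentences = [
    "Schools should ban junk food because it harms student health.",
    "The cafeteria opens at noon.",
    "Banning junk food reduces obesity, which studies link to poor grades.",
    "I had lunch with my friends yesterday.",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(sentences)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)           # per-sentence topic distributions

THRESHOLD = 0.7                        # assumed cut-off, not from the paper
for sent, dist in zip(sentences, theta):
    attracted = dist.max() >= THRESHOLD
    print(f"{'ARG ' if attracted else 'NONE'}  {sent}")
```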

Show Me Your Variance and I Tell You Who You Are - Deriving Compound Compositionality from Word Alignments

Title Show Me Your Variance and I Tell You Who You Are - Deriving Compound Compositionality from Word Alignments
Authors Fabienne Cap
Abstract We use word alignment variance as an indicator for the non-compositionality of German and English noun compounds. Our work-in-progress results are on their own not competitive with state-of-the-art approaches, but they show that alignment variance is correlated with compositionality and thus worth a closer look in the future.
Tasks Word Alignment
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1713/
PDF https://www.aclweb.org/anthology/W17-1713
PWC https://paperswithcode.com/paper/show-me-your-variance-and-i-tell-you-who-you
Repo
Framework
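
The abstract describes measuring how variable a compound's word alignments are across a parallel corpus. The exact variance measure is not stated, so the sketch below uses the normalized entropy of the aligned translations as an illustrative stand-in, with toy alignment data.

```python
# Hedged sketch: for each compound, collect its aligned target translations and
# measure how spread out that distribution is. Normalized entropy is used here
# as a stand-in for the paper's (unspecified) alignment-variance measure.
import math
from collections import Counter

# Toy alignment data: compound -> target phrases it was aligned to.
alignments = {
    "Apfelbaum": ["apple tree", "apple tree", "apple tree", "tree"],
    "Ohrwurm":   ["earworm", "catchy tune", "tune", "song", "earworm"],
}

def normalized_entropy(translations):
    counts = Counter(translations)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p, 2) for p in probs)
    max_entropy = math.log(len(counts), 2) if len(counts) > 1 else 1.0
    return entropy / max_entropy

for compound, targets in alignments.items():
    print(compound, round(normalized_entropy(targets), 3))
```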

Not All Segments are Created Equal: Syntactically Motivated Sentiment Analysis in Lexical Space

Title Not All Segments are Created Equal: Syntactically Motivated Sentiment Analysis in Lexical Space
Authors Muhammad Abdul-Mageed
Abstract Although there is by now a considerable amount of research on subjectivity and sentiment analysis for morphologically rich languages, it is still unclear how lexical information can best be modeled in these languages. To bridge this gap, we build effective models exploiting exclusively gold- and machine-segmented lexical input and successfully employ syntactically motivated feature selection to improve classification. Our best models achieve accuracies significantly above the baselines, with 67.93% and 69.37% for subjectivity and sentiment classification respectively.
Tasks Feature Selection, Sentiment Analysis
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1318/
PDF https://www.aclweb.org/anthology/W17-1318
PWC https://paperswithcode.com/paper/not-all-segments-are-created-equal
Repo
Framework
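
The abstract mentions syntactically motivated feature selection without detailing the criteria, so the sketch below illustrates one plausible reading: keep only lexical features whose part-of-speech tags fall in an assumed whitelist before training a standard classifier. Tags, data and classifier are all placeholders, not the paper's setup.

```python
# Hedged sketch: filter lexical features by an assumed POS-tag whitelist before
# classification. The paper's actual syntactic criteria are not given above.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

KEEP_TAGS = {"ADJ", "VERB", "NOUN"}   # assumed tag whitelist

def select_tokens(tagged_sentence):
    """Keep only words whose POS tag is in the whitelist."""
    return " ".join(w for w, t in tagged_sentence if t in KEEP_TAGS)

# Toy pre-tagged, pre-segmented input standing in for gold/machine segments.
train = [
    [("the", "DET"), ("movie", "NOUN"), ("was", "VERB"), ("wonderful", "ADJ")],
    [("a", "DET"), ("boring", "ADJ"), ("plot", "NOUN"), ("ruined", "VERB"), ("it", "PRON")],
]
labels = ["pos", "neg"]

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit([select_tokens(s) for s in train], labels)
print(clf.predict([select_tokens([("wonderful", "ADJ"), ("acting", "NOUN")])]))
```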

Reviewers for Volume 43

Title Reviewers for Volume 43
Authors
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/J17-4008/
PDF https://www.aclweb.org/anthology/J17-4008
PWC https://paperswithcode.com/paper/reviewers-for-volume-43
Repo
Framework

IBA-Sys at SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News

Title IBA-Sys at SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News
Authors Zarmeen Nasim
Abstract This paper presents the details of our system IBA-Sys, which participated in the SemEval task on Fine-Grained Sentiment Analysis on Financial Microblogs and News. Our system participated in both tracks. For the microblogs track, a supervised learning approach was adopted and the regressor was trained with the XGBoost regression algorithm on lexicon features. For the news headlines track, an ensemble of regressors was used to predict the sentiment score: one regressor was trained on TF-IDF features and another on n-gram features. The source code is available on GitHub.
Tasks Sentiment Analysis
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2140/
PDF https://www.aclweb.org/anthology/S17-2140
PWC https://paperswithcode.com/paper/iba-sys-at-semeval-2017-task-5-fine-grained
Repo
Framework
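
The description above is concrete enough for a short sketch: an XGBoost regressor over lexicon features for microblogs, and a simple average of a word TF-IDF regressor and a character n-gram regressor for headlines. The specific features, hyperparameters and toy data below are assumptions, not the submitted system.

```python
# Sketch of the two tracks described above, with assumed feature details.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from xgboost import XGBRegressor

# Microblogs: toy lexicon features (e.g. positive/negative term counts).
X_lex = np.array([[3, 0], [0, 2], [1, 1]], dtype=float)
y_micro = np.array([0.7, -0.6, 0.1])
micro_model = XGBRegressor(n_estimators=50, max_depth=2)
micro_model.fit(X_lex, y_micro)

# Headlines: average of a word TF-IDF regressor and a char n-gram regressor.
headlines = ["Shares surge after strong earnings", "Profits collapse amid lawsuit"]
y_head = np.array([0.8, -0.7])
reg_word = make_pipeline(TfidfVectorizer(), Ridge()).fit(headlines, y_head)
reg_char = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(2, 4)), Ridge()).fit(headlines, y_head)

test = ["Earnings surge lifts shares"]
print((reg_word.predict(test) + reg_char.predict(test)) / 2)
```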

Data Augmentation for Visual Question Answering

Title Data Augmentation for Visual Question Answering
Authors Kushal Kafle, Mohammed Yousefhussien, Christopher Kanan
Abstract Data augmentation is widely used to train deep neural networks for image classification tasks. Simply flipping images can help learning tremendously by increasing the number of training images by a factor of two. However, little work has been done studying data augmentation in natural language processing. Here, we describe two methods for data augmentation for Visual Question Answering (VQA). The first uses existing semantic annotations to generate new questions. The second method is a generative approach using recurrent neural networks. Experiments show that the proposed data augmentation improves performance of both baseline and state-of-the-art VQA algorithms.
Tasks Data Augmentation, Image Classification, Question Answering, Text Generation, Visual Question Answering
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3529/
PDF https://www.aclweb.org/anthology/W17-3529
PWC https://paperswithcode.com/paper/data-augmentation-for-visual-question
Repo
Framework
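
The first augmentation method above turns existing semantic annotations into new questions. A hedged, template-based illustration follows; the annotation schema, templates and field names are invented for the example, and the RNN-based generative method is not reproduced.

```python
# Hedged sketch of template-based question generation from (toy) annotations.
annotations = [
    {"image_id": 1, "object": "dog", "color": "brown", "count": 2},
    {"image_id": 2, "object": "car", "color": "red", "count": 1},
]

def generate_qa(ann):
    """Yield synthetic (question, answer) pairs from one annotation record."""
    yield (f"What color is the {ann['object']}?", ann["color"])
    yield (f"How many {ann['object']}s are in the image?", str(ann["count"]))
    yield (f"Is there a {ann['object']} in the image?", "yes")

augmented = [(ann["image_id"], q, a) for ann in annotations for q, a in generate_qa(ann)]
for image_id, question, answer in augmented:
    print(image_id, question, "->", answer)
```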

Computational Sarcasm

Title Computational Sarcasm
Authors Pushpak Bhattacharyya, Aditya Joshi
Abstract Sarcasm is a form of verbal irony that is intended to express contempt or ridicule. Motivated by the challenges sarcastic text poses to sentiment analysis, computational approaches to sarcasm have witnessed growing interest at NLP forums in the past decade. Computational sarcasm refers to automatic approaches pertaining to sarcasm. The tutorial will provide a bird's-eye view of the research in computational sarcasm for text, while focusing on significant milestones. The tutorial begins with linguistic theories of sarcasm, with a focus on incongruity: a useful notion that underlies sarcasm and other forms of figurative language. Since the most significant work in computational sarcasm is sarcasm detection, i.e., predicting whether a given piece of text is sarcastic or not, sarcasm detection forms the focus hereafter. We begin our discussion of sarcasm detection with datasets, touching on strategies, challenges and the nature of datasets. Then, we describe algorithms for sarcasm detection: rule-based (where specific evidence of sarcasm is utilised as a rule), statistical classifier-based (where features are designed for a statistical classifier), a topic model-based technique, and deep learning-based algorithms. For each of these algorithms, we refer to our work on sarcasm detection and share our learnings. Since contextual information beyond the text to be classified is useful for sarcasm detection, we then describe approaches that use such information through conversational context or author-specific context. We follow this with novel areas in computational sarcasm such as sarcasm generation and sarcasm vs. irony classification. We then summarise the tutorial and describe future directions based on errors reported in past work. The tutorial will end with a demonstration of our work on sarcasm detection. This tutorial will be of interest to researchers investigating computational sarcasm and related areas such as computational humour, figurative language understanding, and emotion and sentiment analysis. The tutorial is motivated by our continually evolving survey of sarcasm detection, available on arXiv: Joshi, Aditya, Pushpak Bhattacharyya, and Mark James Carman. “Automatic Sarcasm Detection: A Survey.” arXiv preprint arXiv:1602.03426 (2016).
Tasks Sarcasm Detection, Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-3002/
PDF https://www.aclweb.org/anthology/D17-3002
PWC https://paperswithcode.com/paper/computational-sarcasm
Repo
Framework

CrystalNest at SemEval-2017 Task 4: Using Sarcasm Detection for Enhancing Sentiment Classification and Quantification

Title CrystalNest at SemEval-2017 Task 4: Using Sarcasm Detection for Enhancing Sentiment Classification and Quantification
Authors Raj Kumar Gupta, Yinping Yang
Abstract This paper describes a system developed for a shared sentiment analysis task and its subtasks organized by SemEval-2017. A key feature of our system is the embedded ability to detect sarcasm in order to enhance the performance of sentiment classification. We first constructed an affect-cognition-sociolinguistics sarcasm feature model and trained an SVM-based classifier for detecting sarcastic expressions in general tweets. For sentiment prediction, we developed CrystalNest, a two-level cascade classification system using features that combine the sarcasm score derived from our sarcasm classifier, sentiment scores from Alchemy, the NRC lexicon, n-grams, word embedding vectors, and part-of-speech features. We found that the sarcasm-detection-derived features consistently benefited key sentiment analysis evaluation metrics, to different degrees, across the four subtasks A-D.
Tasks Opinion Mining, Sarcasm Detection, Sentiment Analysis
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2103/
PDF https://www.aclweb.org/anthology/S17-2103
PWC https://paperswithcode.com/paper/crystalnest-at-semeval-2017-task-4-using
Repo
Framework
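
The two-level cascade described above can be sketched compactly: a first-stage sarcasm classifier produces a score that is appended to the second-stage sentiment classifier's features. The features, data and classifiers below are toy stand-ins rather than the CrystalNest implementation.

```python
# Hedged sketch of a sarcasm-score cascade for sentiment classification.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

sarcasm_texts = ["Oh great, another Monday morning", "Lovely weather for a walk today"]
sarcasm_labels = [1, 0]                     # 1 = sarcastic (toy labels)
tweets = ["I just love waiting in line for hours", "This phone works really well"]
sentiments = ["negative", "positive"]

# Stage 1: sarcasm classifier trained on its own (toy) labeled data.
sarcasm_vec = TfidfVectorizer()
sarcasm_clf = LinearSVC().fit(sarcasm_vec.fit_transform(sarcasm_texts), sarcasm_labels)

def sarcasm_score(texts):
    """Continuous sarcasm score from the first-stage classifier."""
    return sarcasm_clf.decision_function(sarcasm_vec.transform(texts)).reshape(-1, 1)

# Stage 2: sentiment classifier over text features plus the sarcasm score.
sent_vec = TfidfVectorizer()
X_sent = hstack([sent_vec.fit_transform(tweets), sarcasm_score(tweets)])
sent_clf = LinearSVC().fit(X_sent, sentiments)

test = ["I love being ignored by customer service"]
X_test = hstack([sent_vec.transform(test), sarcasm_score(test)])
print(sent_clf.predict(X_test))
```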

Learning Cognitive Features from Gaze Data for Sentiment and Sarcasm Classification using Convolutional Neural Network

Title Learning Cognitive Features from Gaze Data for Sentiment and Sarcasm Classification using Convolutional Neural Network
Authors Abhijit Mishra, Kuntal Dey, Pushpak Bhattacharyya
Abstract Cognitive NLP systems, i.e., NLP systems that make use of behavioral data, augment traditional text-based features with cognitive features extracted from eye-movement patterns, EEG signals, brain imaging, etc. Such extraction of features is typically manual. We contend that manual extraction of features may not be the best way to tackle the text subtleties that characteristically prevail in complex classification tasks like sentiment analysis and sarcasm detection, and that even the extraction and choice of features should be delegated to the learning system. We introduce a framework to automatically extract cognitive features from the eye-movement/gaze data of human readers reading the text and use them, along with textual features, for the tasks of sentiment polarity and sarcasm detection. Our proposed framework is based on a Convolutional Neural Network (CNN). The CNN learns features from both gaze and text and uses them to classify the input text. We test our technique on published sentiment- and sarcasm-labeled datasets enriched with gaze information, and show that using a combination of automatically learned text and gaze features often yields better classification performance than (i) CNN-based systems that rely on text input alone and (ii) existing systems that rely on handcrafted gaze and textual features.
Tasks EEG, Sarcasm Detection, Sentiment Analysis, Text Classification
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1035/
PDF https://www.aclweb.org/anthology/P17-1035
PWC https://paperswithcode.com/paper/learning-cognitive-features-from-gaze-data
Repo
Framework
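
A hedged sketch of the two-branch idea described above follows: one convolutional branch over word embeddings and one over per-word gaze features, merged before classification. The framework (Keras), layer sizes and gaze feature dimensionality are assumptions, not the paper's architecture.

```python
# Hedged two-branch CNN sketch: text branch over word embeddings, gaze branch
# over per-word gaze features (e.g. fixation duration), merged for a binary
# sentiment/sarcasm decision. Sizes and framework are assumptions.
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.layers import (Concatenate, Conv1D, Dense, Embedding,
                                     GlobalMaxPooling1D, Input)

MAX_LEN, VOCAB, GAZE_DIM = 20, 1000, 2   # assumed sizes

text_in = Input(shape=(MAX_LEN,), name="word_ids")
text_branch = Embedding(VOCAB, 50)(text_in)
text_branch = Conv1D(64, 3, activation="relu")(text_branch)
text_branch = GlobalMaxPooling1D()(text_branch)

gaze_in = Input(shape=(MAX_LEN, GAZE_DIM), name="gaze_features")
gaze_branch = Conv1D(32, 3, activation="relu")(gaze_in)
gaze_branch = GlobalMaxPooling1D()(gaze_branch)

merged = Concatenate()([text_branch, gaze_branch])
output = Dense(1, activation="sigmoid")(Dense(64, activation="relu")(merged))

model = Model(inputs=[text_in, gaze_in], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Toy batch: random word ids and gaze features for two sentences.
words = np.random.randint(0, VOCAB, size=(2, MAX_LEN))
gaze = np.random.rand(2, MAX_LEN, GAZE_DIM)
model.fit([words, gaze], np.array([1, 0]), epochs=1, verbose=0)
```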

Os Provérbios em manuais de ensino de Português Língua Não Materna (The Proverbs of teaching manuals in Non-Native Portuguese) [In Portuguese]

Title Os Provérbios em manuais de ensino de Português Língua Não Materna (The Proverbs of teaching manuals in Non-Native Portuguese) [In Portuguese]
Authors Sónia Reis, Jorge Baptista
Abstract
Tasks
Published 2017-10-01
URL https://www.aclweb.org/anthology/W17-6629/
PDF https://www.aclweb.org/anthology/W17-6629
PWC https://paperswithcode.com/paper/os-provarbios-em-manuais-de-ensino-de
Repo
Framework