July 26, 2019

1972 words 10 mins read

Paper Group NANR 156

Debunking Sentiment Lexicons: A Case of Domain-Specific Sentiment Classification for Croatian. Comparison of Short-Text Sentiment Analysis Methods for Croatian. Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories. Modelling metaphor with attribute-based semantics. Single and Cross-domain Polarity Classification using …

Debunking Sentiment Lexicons: A Case of Domain-Specific Sentiment Classification for Croatian

Title Debunking Sentiment Lexicons: A Case of Domain-Specific Sentiment Classification for Croatian
Authors Paula Gombar, Zoran Medić, Domagoj Alagić, Jan Šnajder
Abstract Sentiment lexicons are widely used as an intuitive and inexpensive way of tackling sentiment classification, often within a simple lexicon word-counting approach or as part of a supervised model. However, it is an open question whether these approaches can compete with supervised models that use only word-representation features. We address this question in the context of domain-specific sentiment classification for Croatian. We experiment with the graph-based acquisition of sentiment lexicons, analyze their quality, and investigate how effectively they can be used in sentiment classification. Our results indicate that, even with as few as 500 labeled instances, a supervised model substantially outperforms a word-counting model. We also observe that adding lexicon-based features does not significantly improve supervised sentiment classification.
Tasks Sentiment Analysis, Stock Price Prediction
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1409/
PDF https://www.aclweb.org/anthology/W17-1409
PWC https://paperswithcode.com/paper/debunking-sentiment-lexicons-a-case-of-domain
Repo
Framework
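
The contrast tested in this abstract can be sketched in a few lines. Below is a minimal, hypothetical comparison of a lexicon word-counting classifier against a supervised bag-of-words model; the tiny English lexicon and toy data are placeholders, not the paper's graph-acquired Croatian lexicons or labeled datasets.

```python
# Minimal sketch (not the paper's code): a lexicon word-counting baseline
# versus a supervised model over word-representation features only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

positive = {"good", "great", "excellent"}   # placeholder sentiment lexicon
negative = {"bad", "poor", "terrible"}

def lexicon_vote(text):
    """Word-counting baseline: sign of (#positive - #negative) tokens."""
    tokens = text.lower().split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return 1 if score >= 0 else 0

texts = ["great product , works well", "terrible support , bad experience",
         "excellent value", "poor quality"]
labels = [1, 0, 1, 0]

# Supervised model over bag-of-words features alone.
vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

print([lexicon_vote(t) for t in texts])            # lexicon word-counting predictions
print(clf.predict(vec.transform(texts)).tolist())  # supervised predictions
```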

Comparison of Short-Text Sentiment Analysis Methods for Croatian

Title Comparison of Short-Text Sentiment Analysis Methods for Croatian
Authors Leon Rotim, Jan Šnajder
Abstract We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets, word embeddings outperform string kernels, which in turn outperform word and n-gram bag-of-words baselines.
Tasks Sentiment Analysis, Stock Price Prediction, Text Classification, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1411/
PDF https://www.aclweb.org/anthology/W17-1411
PWC https://paperswithcode.com/paper/comparison-of-short-text-sentiment-analysis
Repo
Framework
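
For readers unfamiliar with the corpus- and preprocessing-free string kernels mentioned above, here is a rough sketch of one simple variant (a binary character p-spectrum kernel) plugged into an SVM via a precomputed Gram matrix; the toy strings and the choice p=3 are assumptions, not the paper's setup.

```python
# Rough sketch of a preprocessing-free string kernel used with an SVM.
# This binary character p-spectrum kernel is only a stand-in for the
# string-kernel systems compared in the paper.
import numpy as np
from sklearn.svm import SVC

def spectrum(text, p=3):
    """Set of character p-grams occurring in the text."""
    return {text[i:i + p] for i in range(len(text) - p + 1)}

def kernel_matrix(A, B, p=3):
    SA, SB = [spectrum(a, p) for a in A], [spectrum(b, p) for b in B]
    return np.array([[len(sa & sb) for sb in SB] for sa in SA], dtype=float)

train = ["odlican film, super", "grozan film, lose", "super iskustvo", "lose iskustvo"]
y = [1, 0, 1, 0]
test = ["super film", "grozno i lose"]

svm = SVC(kernel="precomputed").fit(kernel_matrix(train, train), y)
print(svm.predict(kernel_matrix(test, train)))  # toy predictions
```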

Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories

Title Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories
Authors
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-7600/
PDF https://www.aclweb.org/anthology/W17-7600
PWC https://paperswithcode.com/paper/proceedings-of-the-16th-international
Repo
Framework

Modelling metaphor with attribute-based semantics

Title Modelling metaphor with attribute-based semantics
Authors Luana Bulat, Stephen Clark, Ekaterina Shutova
Abstract One of the key problems in computational metaphor modelling is finding the optimal level of abstraction of semantic representations, such that these are able to capture and generalise metaphorical mechanisms. In this paper we present the first metaphor identification method that uses representations constructed from property norms. Such norms have been previously shown to provide a cognitively plausible representation of concepts in terms of semantic properties. Our results demonstrate that such property-based semantic representations provide a suitable model of cross-domain knowledge projection in metaphors, outperforming standard distributional models on a metaphor identification task.
Tasks Machine Translation, Natural Language Inference, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2084/
PDF https://www.aclweb.org/anthology/E17-2084
PWC https://paperswithcode.com/paper/modelling-metaphor-with-attribute-based
Repo
Framework
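
A hedged illustration of what an attribute-based (property-norm) representation could look like in practice: each concept is a sparse vector over semantic properties, and a word pair is classified as metaphorical or literal from the concatenation of the two vectors. The hand-written properties, pairs, and labels below are invented placeholders, not the norms or dataset used in the paper.

```python
# Illustrative sketch only: property-norm representations for metaphor
# identification. Properties and labels are toy placeholders.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

norms = {
    "fire":  {"is_hot": 1, "is_dangerous": 1, "gives_light": 1},
    "idea":  {"is_abstract": 1, "is_mental": 1},
    "stone": {"is_hard": 1, "is_heavy": 1},
    "heart": {"is_organ": 1, "pumps_blood": 1},
}

def pair_features(w1, w2):
    # Concatenate the two property vectors by prefixing feature names.
    feats = {f"w1_{k}": v for k, v in norms.get(w1, {}).items()}
    feats.update({f"w2_{k}": v for k, v in norms.get(w2, {}).items()})
    return feats

pairs = [("fire", "idea"), ("stone", "heart"), ("fire", "stone"), ("heart", "heart")]
labels = [1, 1, 0, 0]  # 1 = metaphorical, 0 = literal (toy labels)

vec = DictVectorizer()
X = vec.fit_transform([pair_features(a, b) for a, b in pairs])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(vec.transform([pair_features("fire", "heart")])))
```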

Single and Cross-domain Polarity Classification using String Kernels

Title Single and Cross-domain Polarity Classification using String Kernels
Authors Rosa M. Giménez-Pérez, Marc Franco-Salvador, Paolo Rosso
Abstract The polarity classification task aims at automatically identifying whether a subjective text is positive or negative. When the target domain is different from those where a model was trained, we refer to a cross-domain setting. That setting usually implies the use of a domain adaptation method. In this work, we study the single and cross-domain polarity classification tasks from the string kernels perspective. Contrary to classical domain adaptation methods, which employ texts from both domains to detect pivot features, we do not use the target domain for training. Our approach detects the lexical peculiarities that characterise the text polarity and maps them into a domain independent space by means of kernel discriminant analysis. Experimental results show state-of-the-art performance in single and cross-domain polarity classification.
Tasks Domain Adaptation, Text Classification
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2089/
PDF https://www.aclweb.org/anthology/E17-2089
PWC https://paperswithcode.com/paper/single-and-cross-domain-polarity
Repo
Framework

UWat-Emote at EmoInt-2017: Emotion Intensity Detection using Affect Clues, Sentiment Polarity and Word Embeddings

Title UWat-Emote at EmoInt-2017: Emotion Intensity Detection using Affect Clues, Sentiment Polarity and Word Embeddings
Authors Vineet John, Olga Vechtomova
Abstract This paper describes the UWaterloo affect prediction system developed for EmoInt-2017. We delve into our feature selection approach for affect intensity, affect presence, sentiment intensity and sentiment presence lexica alongside pre-trained word embeddings, which are utilized to extract emotion intensity signals from tweets in an ensemble learning approach. The system employs emotion specific model training, and utilizes distinct models for each of the emotion corpora in isolation. Our system utilizes gradient boosted regression as the primary learning technique to predict the final emotion intensities.
Tasks Emotion Classification, Feature Selection, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5235/
PDF https://www.aclweb.org/anthology/W17-5235
PWC https://paperswithcode.com/paper/uwat-emote-at-emoint-2017-emotion-intensity
Repo
Framework
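
The pipeline described above, lexicon signals plus pre-trained embeddings fed to gradient boosted regression with one model per emotion, might be sketched roughly as follows; the random embeddings and two-word lexicon are stand-ins, not the UWaterloo feature set.

```python
# Hedged sketch of the overall pipeline: per-emotion gradient boosted
# regression over concatenated lexicon scores and word-embedding features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=25) for w in ["furious", "calm", "angry", "fine"]}
affect_lexicon = {"furious": 0.9, "angry": 0.8}  # placeholder intensity lexicon

def featurize(tweet):
    tokens = tweet.lower().split()
    vecs = [emb[t] for t in tokens if t in emb] or [np.zeros(25)]
    lex = [affect_lexicon.get(t, 0.0) for t in tokens]
    return np.concatenate([np.mean(vecs, axis=0), [sum(lex), max(lex, default=0.0)]])

tweets = ["furious about this", "feeling fine", "so angry", "calm morning"]
intensity = [0.9, 0.2, 0.8, 0.1]  # gold emotion-intensity scores (toy)

# One model per emotion corpus; a single emotion is shown here.
model = GradientBoostingRegressor(n_estimators=50)
model.fit(np.vstack([featurize(t) for t in tweets]), intensity)
print(model.predict(featurize("angry and furious").reshape(1, -1)))
```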

Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Title Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Authors
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1000/
PDF https://www.aclweb.org/anthology/I17-1000
PWC https://paperswithcode.com/paper/proceedings-of-the-eighth-international-joint
Repo
Framework

Predicting Emotional Word Ratings using Distributional Representations and Signed Clustering

Title Predicting Emotional Word Ratings using Distributional Representations and Signed Clustering
Authors João Sedoc, Daniel Preoţiuc-Pietro, Lyle Ungar
Abstract Inferring the emotional content of words is important for text-based sentiment analysis, dialogue systems and psycholinguistics, but word ratings are expensive to collect at scale and across languages or domains. We develop a method that automatically extends word-level ratings to unrated words using signed clustering of vector space word representations along with affect ratings. We use our method to determine a word's valence and arousal, which determine its position on the circumplex model of affect, the most popular dimensional model of emotion. Our method achieves superior out-of-sample word rating prediction on both affective dimensions across three different languages when compared to state-of-the-art word similarity based methods. Our method can assist building word ratings for new languages and improve downstream tasks such as sentiment analysis and emotion detection.
Tasks Sentiment Analysis
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2090/
PDF https://www.aclweb.org/anthology/E17-2090
PWC https://paperswithcode.com/paper/predicting-emotional-word-ratings-using
Repo
Framework
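
A simplified sketch of the rating-propagation idea: cluster word vectors and let unrated words inherit the mean rating of rated words in their cluster. Plain k-means stands in here for the paper's signed clustering, and the vectors and seed valences are toy values.

```python
# Simplified sketch: extend valence ratings from rated to unrated words
# within clusters of word vectors (k-means used in place of signed clustering).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
words = ["joy", "happy", "delight", "grief", "sad", "mourn"]
vectors = {w: rng.normal(size=10) + (2.0 if i < 3 else -2.0)
           for i, w in enumerate(words)}
seed_valence = {"joy": 8.2, "grief": 2.1}  # known ratings (toy)

X = np.vstack([vectors[w] for w in words])
cluster = dict(zip(words, KMeans(n_clusters=2, n_init=10).fit_predict(X)))

# Each unrated word inherits the mean rating of rated words in its cluster.
for w in words:
    if w not in seed_valence:
        peers = [seed_valence[r] for r in seed_valence if cluster[r] == cluster[w]]
        print(w, round(float(np.mean(peers)), 2) if peers else "no rated peer")
```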

A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency

Title A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency
Authors Ron Appel, Pietro Perona
Abstract There is a need for simple yet accurate white-box learning systems that train quickly and with little data. To this end, we showcase REBEL, a multi-class boosting method, and present a novel family of weak learners called localized similarities. Our framework provably minimizes the training error of any dataset at an exponential rate. We carry out experiments on a variety of synthetic and real datasets, demonstrating a consistent tendency to avoid overfitting. We evaluate our method on MNIST and standard UCI datasets against other state-of-the-art methods, showing the empirical proficiency of our method.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=675
PDF http://proceedings.mlr.press/v70/appel17a/appel17a.pdf
PWC https://paperswithcode.com/paper/a-simple-multi-class-boosting-framework-with
Repo
Framework
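
REBEL and its localized-similarity weak learners are not reproduced here; as a rough point of reference for the multi-class boosting setting, the snippet below runs a generic AdaBoost-over-stumps baseline on sklearn's bundled digits data (an assumption, not the paper's MNIST/UCI protocol).

```python
# Generic multi-class boosting baseline, NOT REBEL: AdaBoost with decision
# stumps on a small bundled dataset, just to illustrate the setting.
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = AdaBoostClassifier(n_estimators=200).fit(X_tr, y_tr)
print("test accuracy:", round(clf.score(X_te, y_te), 3))
```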

Context-Aware Graph Segmentation for Graph-Based Translation

Title Context-Aware Graph Segmentation for Graph-Based Translation
Authors Liangyou Li, Andy Way, Qun Liu
Abstract In this paper, we present an improved graph-based translation model which segments an input graph into node-induced subgraphs by taking source context into consideration. Translations are generated by combining subgraph translations left-to-right using beam search. Experiments on Chinese–English and German–English demonstrate that the context-aware segmentation significantly improves the baseline graph-based model.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2095/
PDF https://www.aclweb.org/anthology/E17-2095
PWC https://paperswithcode.com/paper/context-aware-graph-segmentation-for-graph
Repo
Framework
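
The left-to-right combination step mentioned in the abstract can be illustrated with a generic beam search over pre-segmented units; the segments, candidate translations, and scores below are invented, and the graph segmentation model itself is not reproduced.

```python
# Generic left-to-right beam search over pre-segmented units; all candidate
# translations and (log-)scores are toy placeholders.
segments = [
    [("the house", -0.1), ("a house", -0.4)],
    [("is big", -0.2), ("was big", -0.5)],
]

def beam_search(segments, beam_size=2):
    beam = [("", 0.0)]  # (partial translation, cumulative log score)
    for options in segments:
        expanded = [((prefix + " " + cand).strip(), score + s)
                    for prefix, score in beam
                    for cand, s in options]
        beam = sorted(expanded, key=lambda x: x[1], reverse=True)[:beam_size]
    return beam

for hyp, score in beam_search(segments):
    print(f"{score:.2f}  {hyp}")
```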

Ranking Convolutional Recurrent Neural Networks for Purchase Stage Identification on Imbalanced Twitter Data

Title Ranking Convolutional Recurrent Neural Networks for Purchase Stage Identification on Imbalanced Twitter Data
Authors Heike Adel, Francine Chen, Yan-Ying Chen
Abstract Users often use social media to share their interest in products. We propose to identify purchase stages from Twitter data following the AIDA model (Awareness, Interest, Desire, Action). In particular, we define the task of classifying the purchase stage of each tweet in a user's tweet sequence. We introduce RCRNN, a Ranking Convolutional Recurrent Neural Network which computes tweet representations using convolution over word embeddings and models a tweet sequence with gated recurrent units. Also, we consider various methods to cope with the imbalanced label distribution in our data and show that a ranking layer outperforms class weights.
Tasks Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2094/
PDF https://www.aclweb.org/anthology/E17-2094
PWC https://paperswithcode.com/paper/ranking-convolutional-recurrent-neural
Repo
Framework
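
An architectural sketch, in PyTorch, of the tweet-sequence idea described above: a convolution over word embeddings yields a tweet representation and a GRU models the sequence of tweets. The dimensions are arbitrary, and the paper's ranking layer and class-imbalance handling are omitted in favour of plain per-tweet logits.

```python
# Architectural sketch only: convolution over word embeddings per tweet,
# GRU over the tweet sequence, per-tweet purchase-stage logits.
import torch
import torch.nn as nn

class TweetSequenceModel(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=50, conv_channels=64,
                 hidden=64, num_stages=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=3, padding=1)
        self.gru = nn.GRU(conv_channels, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_stages)

    def forward(self, token_ids):
        # token_ids: (batch, n_tweets, n_tokens)
        b, n_tweets, n_tokens = token_ids.shape
        x = self.emb(token_ids.view(b * n_tweets, n_tokens))    # (B*T, L, E)
        x = torch.relu(self.conv(x.transpose(1, 2)))            # (B*T, C, L)
        tweet_repr = x.max(dim=2).values.view(b, n_tweets, -1)  # (B, T, C)
        seq, _ = self.gru(tweet_repr)                           # (B, T, H)
        return self.out(seq)                                    # (B, T, stages)

model = TweetSequenceModel()
logits = model(torch.randint(0, 1000, (2, 5, 12)))  # 2 users, 5 tweets, 12 tokens
print(logits.shape)  # torch.Size([2, 5, 4])
```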

Reranking Translation Candidates Produced by Several Bilingual Word Similarity Sources

Title Reranking Translation Candidates Produced by Several Bilingual Word Similarity Sources
Authors Laurent Jakubina, Philippe Langlais
Abstract We investigate the reranking of the output of several distributional approaches on the Bilingual Lexicon Induction task. We show that reranking an n-best list produced by any of those approaches leads to very substantial improvements. We further demonstrate that combining several n-best lists by reranking is an effective way of further boosting performance.
Tasks Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2096/
PDF https://www.aclweb.org/anthology/E17-2096
PWC https://paperswithcode.com/paper/reranking-translation-candidates-produced-by
Repo
Framework
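
A minimal sketch of combining several n-best lists by reranking: candidates from each similarity source are merged and rescored with a weighted sum. The source names, scores, and weights are invented; the paper's reranker is learned from data, not hand-weighted.

```python
# Toy reranker over merged n-best lists from several similarity sources.
def rerank(nbest_by_source, weights):
    combined = {}
    for source, nbest in nbest_by_source.items():
        for candidate, score in nbest:
            combined[candidate] = combined.get(candidate, 0.0) + weights[source] * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

nbest_by_source = {
    "embeddings": [("maison", 0.82), ("domicile", 0.61), ("batiment", 0.40)],
    "context_counts": [("maison", 0.70), ("foyer", 0.55)],
}
weights = {"embeddings": 0.6, "context_counts": 0.4}

for candidate, score in rerank(nbest_by_source, weights)[:3]:
    print(f"{score:.3f}  {candidate}")
```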

Uniform Deviation Bounds for k-Means Clustering

Title Uniform Deviation Bounds for k-Means Clustering
Authors Olivier Bachem, Mario Lucic, S. Hamed Hassani, Andreas Krause
Abstract Uniform deviation bounds limit the difference between a model’s expected loss and its loss on an empirical sample uniformly for all models in a learning problem. In this paper, we provide a novel framework to obtain uniform deviation bounds for loss functions which are unbounded. As a result, we obtain competitive uniform deviation bounds for k-Means clustering under weak assumptions on the underlying distribution. If the fourth moment is bounded, we prove a rate of $O(m^{-1/2})$ compared to the previously known $O(m^{-1/4})$ rate. Furthermore, we show that the rate also depends on the kurtosis – the normalized fourth moment which measures the “tailedness” of a distribution. We also provide improved rates under progressively stronger assumptions, namely, bounded higher moments, subgaussianity and bounded support of the underlying distribution.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=523
PDF http://proceedings.mlr.press/v70/bachem17a/bachem17a.pdf
PWC https://paperswithcode.com/paper/uniform-deviation-bounds-for-k-means
Repo
Framework
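
For reference, one standard way to write down what the abstract calls a uniform deviation bound (the notation below is assumed, not copied from the paper): for samples $x_1, \dots, x_m \sim P$ and the k-means loss $\phi(x, Q) = \min_{q \in Q} \lVert x - q \rVert^2$ over a set $Q$ of $k$ centers, such a bound controls, simultaneously for all $Q$,

$$\sup_{Q : |Q| = k} \left| \mathbb{E}_{x \sim P}\big[\phi(x, Q)\big] - \frac{1}{m} \sum_{i=1}^{m} \phi(x_i, Q) \right| \le \varepsilon(m),$$

where, per the abstract, $\varepsilon(m) = O(m^{-1/2})$ under a bounded fourth moment (with constants depending on the kurtosis), improving on the previously known $O(m^{-1/4})$ rate.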

Lexicalized Reordering for Left-to-Right Hierarchical Phrase-based Translation

Title Lexicalized Reordering for Left-to-Right Hierarchical Phrase-based Translation
Authors Maryam Siahbani, Anoop Sarkar
Abstract Phrase-based and hierarchical phrase-based (Hiero) translation models differ radically in the way reordering is modeled. Lexicalized reordering models play an important role in phrase-based MT and such models have been added to CKY-based decoders for Hiero. Watanabe et al. (2006) proposed a promising decoding algorithm for Hiero (LR-Hiero) that visits input spans in arbitrary order and produces the translation in left to right (LR) order which leads to far fewer language model calls and leads to a considerable speedup in decoding. We introduce a novel shift-reduce algorithm to LR-Hiero to decode with our lexicalized reordering model (LRM) and show that it improves translation quality for Czech-English, Chinese-English and German-English.
Tasks Language Modelling, Machine Translation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2097/
PDF https://www.aclweb.org/anthology/E17-2097
PWC https://paperswithcode.com/paper/lexicalized-reordering-for-left-to-right
Repo
Framework

Gradient Boosted Decision Trees for High Dimensional Sparse Output

Title Gradient Boosted Decision Trees for High Dimensional Sparse Output
Authors Si Si, Huan Zhang, S. Sathiya Keerthi, Dhruv Mahajan, Inderjit S. Dhillon, Cho-Jui Hsieh
Abstract In this paper, we study the gradient boosted decision trees (GBDT) when the output space is high dimensional and sparse. For example, in multilabel classification, the output space is a $L$-dimensional 0/1 vector, where $L$ is number of labels that can grow to millions and beyond in many modern applications. We show that vanilla GBDT can easily run out of memory or encounter near-forever running time in this regime, and propose a new GBDT variant, GBDT-SPARSE, to resolve this problem by employing $L_0$ regularization. We then discuss in detail how to utilize this sparsity to conduct GBDT training, including splitting the nodes, computing the sparse residual, and predicting in sublinear time. Finally, we apply our algorithm to extreme multilabel classification problems, and show that the proposed GBDT-SPARSE achieves an order of magnitude improvements in model size and prediction time over existing methods, while yielding similar performance.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=870
PDF http://proceedings.mlr.press/v70/si17a/si17a.pdf
PWC https://paperswithcode.com/paper/gradient-boosted-decision-trees-for-high
Repo
Framework
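
The L0-style sparsification that the abstract attributes to GBDT-SPARSE can be illustrated on a single leaf: the dense mean residual over the $L$-dimensional output space is truncated to its q largest-magnitude coordinates, so leaf predictions stay sparse. This is a toy fragment, not the training algorithm.

```python
# Toy illustration of an L0-style sparse leaf value: keep only the q
# largest-magnitude coordinates of the dense mean residual.
import numpy as np

def sparse_leaf_value(residuals, q):
    """residuals: (n_examples_in_leaf, L) array of current residual vectors."""
    dense = residuals.mean(axis=0)          # ordinary GBDT leaf value
    keep = np.argsort(np.abs(dense))[-q:]   # q coordinates with largest |value|
    sparse = np.zeros_like(dense)
    sparse[keep] = dense[keep]              # L0 "budget" of q nonzeros
    return sparse

rng = np.random.default_rng(0)
residuals = rng.normal(size=(8, 20)) * (rng.random(20) < 0.2)  # mostly-zero labels
print(np.count_nonzero(sparse_leaf_value(residuals, q=3)))     # at most q nonzeros
```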