Paper Group NANR 125
Building an annotated dataset of app store reviews with Appraisal features in English and Spanish. RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian. A prototype finite-state morphological analyser for Chukchi. Building a Morphological Treebank for German from a Linguistic Database. Robust Subspace Approximation in a Stream. Towards Realistic Predictors. Autonomous Sub-domain Modeling for Dialogue Policy with Hierarchical Deep Reinforcement Learning. Coding Structures and Actions with the COSTA Scheme in Medical Conversations. Biomedical Event Extraction Using Convolutional Neural Networks and Dependency Parsing. Constraining MGbank: Agreement, L-Selection and Supertagging in Minimalist Grammars. Natural Language Generation for Polysynthetic Languages: Language Teaching and Learning Software for Kanyen'kéha (Mohawk). Sentiment Classification towards Question-Answering with Hierarchical Matching Network. Analytic Expressions for Probabilistic Moments of PL-DNN With Gaussian Input. Measuring Frame Instance Relatedness. UniMelb at SemEval-2018 Task 12: Generative Implication using LSTMs, Siamese Networks and Semantic Representations with Synonym Fuzzing.
Building an annotated dataset of app store reviews with Appraisal features in English and Spanish
Title | Building an annotated dataset of app store reviews with Appraisal features in English and Spanish |
Authors | Natalia Mora, Julia Lavid-López |
Abstract | This paper describes the creation and annotation of a dataset consisting of 250 English and Spanish app store reviews from Google's Play Store with Appraisal features. Appraisal is one of the most influential linguistic frameworks for the analysis of evaluation and opinion in discourse due to its insightful descriptive features. However, it has not been extensively applied in NLP in spite of its potential for the classification of the subjective content of these reviews. We describe the dataset, the annotation scheme and guidelines, the agreement studies, the annotation results and their impact on the characterisation of this genre. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1103/ |
https://www.aclweb.org/anthology/W18-1103 | |
PWC | https://paperswithcode.com/paper/building-an-annotated-dataset-of-app-store |
Repo | |
Framework | |
RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian
Title | RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian |
Authors | Anna Rogers, Alexey Romanov, Anna Rumshisky, Svitlana Volkova, Mikhail Gronas, Alex Gribov |
Abstract | This paper presents RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages. RuSentiment is currently the largest in its class for Russian, with 31,185 posts annotated with a Fleiss' kappa of 0.58 (3 annotations per post). To diversify the dataset, 6,950 posts were pre-selected with an active learning-style strategy. We report baseline classification results, and we also release the best-performing embeddings trained on 3.2B tokens of Russian VKontakte posts. |
Tasks | Active Learning, Sentiment Analysis, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1064/ |
https://www.aclweb.org/anthology/C18-1064 | |
PWC | https://paperswithcode.com/paper/rusentiment-an-enriched-sentiment-analysis |
Repo | |
Framework | |
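The abstract above reports agreement as a Fleiss' kappa of 0.58 with 3 annotations per post. As a reference point only, and not code from the paper, here is a minimal sketch of how Fleiss' kappa is computed when every item receives the same number of ratings; the toy label matrix is invented for illustration.

```python
import numpy as np

def fleiss_kappa(counts):
    """counts[i, j] = number of raters who assigned category j to item i.
    Every row must sum to the same number of raters (here, 3 per post)."""
    n_raters = counts.sum(axis=1)[0]
    # Observed agreement: mean proportion of agreeing rater pairs per item.
    p_i = (counts * (counts - 1)).sum(axis=1) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Chance agreement from the overall category proportions.
    p_j = counts.sum(axis=0) / counts.sum()
    p_e = (p_j ** 2).sum()
    return (p_bar - p_e) / (1.0 - p_e)

# Toy matrix: 5 posts, 3 annotators each, labels {negative, neutral, positive}.
toy = np.array([[3, 0, 0],
                [2, 1, 0],
                [0, 3, 0],
                [1, 1, 1],
                [0, 0, 3]])
print(round(float(fleiss_kappa(toy)), 3))
```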
A prototype finite-state morphological analyser for Chukchi
Title | A prototype finite-state morphological analyser for Chukchi |
Authors | Vasilisa Andriyanets, Francis Tyers |
Abstract | In this article we describe the application of finite-state transducers to the morphological and phonological systems of Chukchi, a polysynthetic language spoken in the north of the Russian Federation. The language exhibits progressive and regressive vowel harmony, productive incorporation and extensive circumfixing. To implement the analyser we use the well-known Helsinki Finite-State Toolkit (HFST). The resulting model covers the majority of the morphological and phonological processes. A brief evaluation carried out on publicly available corpora shows that the coverage of the transducer is between 53% and 76%. An error evaluation of 100 randomly selected corpus tokens that were not covered by the analyser shows that most of the morphological processes are covered and that the majority of errors are caused by a limited stem lexicon. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4804/ |
https://www.aclweb.org/anthology/W18-4804 | |
PWC | https://paperswithcode.com/paper/a-prototype-finite-state-morphological |
Repo | |
Framework | |
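The coverage figures quoted above (between 53% and 76%) refer to naive coverage: the share of corpus tokens for which the transducer returns at least one analysis. A minimal sketch of that evaluation style is shown below; the `analyse` callable and the toy lexicon are stand-ins, not the paper's HFST transducer.

```python
def coverage(tokens, analyse):
    """Naive coverage: fraction of tokens receiving at least one analysis.

    `analyse` is assumed to map a surface token to a list of analyses
    (e.g. a thin wrapper around an HFST lookup); an empty list means the
    token is uncovered, typically because of a missing stem."""
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if analyse(t))
    return covered / len(tokens)

# Toy stand-in for the transducer: only two "stems" are in the lexicon.
toy_lexicon = {"stemA": ["stemA<n><abs><sg>"], "stemB": ["stemB<v><pres>"]}
toy_analyse = lambda token: toy_lexicon.get(token, [])

print(coverage(["stemA", "stemB", "unknown"], toy_analyse))  # 2 of 3 covered
```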
Building a Morphological Treebank for German from a Linguistic Database
Title | Building a Morphological Treebank for German from a Linguistic Database |
Authors | Petra Steiner, Josef Ruppenhofer |
Abstract | |
Tasks | Morphological Analysis |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1613/ |
https://www.aclweb.org/anthology/L18-1613 | |
PWC | https://paperswithcode.com/paper/building-a-morphological-treebank-for-german |
Repo | |
Framework | |
Robust Subspace Approximation in a Stream
Title | Robust Subspace Approximation in a Stream |
Authors | Roie Levin, Anish Prasad Sevekari, David Woodruff |
Abstract | We study robust subspace estimation in the streaming and distributed settings. Given a set of n data points $\{a_i\}_{i=1}^n$ in $\mathbb{R}^d$ and an integer k, we wish to find a linear subspace S of dimension k for which $\sum_i M(\mathrm{dist}(S, a_i))$ is minimized, where $\mathrm{dist}(S, x) := \min_{y \in S} \lVert x - y \rVert_2$ and $M(\cdot)$ is some loss function. When M is the identity function, S gives a subspace that is more robust to outliers than that provided by the truncated SVD. Though the problem is NP-hard, it is approximable within a $(1+\epsilon)$ factor in polynomial time when k and $\epsilon$ are constant. We give the first sublinear approximation algorithm for this problem in the turnstile streaming and arbitrary partition distributed models, achieving the same time guarantees as in the offline case. Our algorithm is the first based entirely on oblivious dimensionality reduction, and significantly simplifies prior methods for this problem, which held in neither the streaming nor distributed models. |
Tasks | Dimensionality Reduction |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8267-robust-subspace-approximation-in-a-stream |
http://papers.nips.cc/paper/8267-robust-subspace-approximation-in-a-stream.pdf | |
PWC | https://paperswithcode.com/paper/robust-subspace-approximation-in-a-stream |
Repo | |
Framework | |
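To make the objective above concrete, the sketch below evaluates, for a fixed candidate subspace, the robust objective $\sum_i \mathrm{dist}(S, a_i)$ (M the identity) next to the squared objective minimized by the truncated SVD. It only illustrates why the identity loss is less dominated by outliers; it is not the paper's streaming algorithm.

```python
import numpy as np

def residual_norms(A, U):
    """dist(S, a_i) = ||a_i - U U^T a_i||_2 for each row a_i of A,
    where the orthonormal columns of U span the subspace S."""
    return np.linalg.norm(A - A @ U @ U.T, axis=1)

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
A[:3] *= 50                       # a few gross outliers

# Candidate k-dimensional subspace (k = 2) from the truncated SVD of A.
_, _, Vt = np.linalg.svd(A, full_matrices=False)
U = Vt[:2].T

d = residual_norms(A, U)
print("robust objective  sum_i dist(S, a_i)   =", round(d.sum(), 2))
print("SVD objective     sum_i dist(S, a_i)^2 =", round((d ** 2).sum(), 2))
```

Because the squared objective grows quadratically in each residual, the few outlier rows dominate it far more than they dominate the robust sum.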
Towards Realistic Predictors
Title | Towards Realistic Predictors |
Authors | Pei Wang, Nuno Vasconcelos |
Abstract | A new class of predictors, denoted realistic predictors, is defined. These are predictors that, like humans, assess the difficulty of examples, refuse to work on those deemed too hard, but guarantee good performance on the ones they do operate on. In this paper, we consider a particular case: realistic classifiers. The central problem in realistic classification, the design of an inductive predictor of hardness scores, is considered. It is argued that this should be a predictor independent of the classifier itself, but tuned to it, and learned without explicit supervision, so as to learn from its mistakes. A new architecture is proposed to accomplish these goals by complementing the classifier with an auxiliary hardness prediction network (HP-Net). Sharing the same inputs as the classifier, the HP-Net outputs hardness scores that are fed to the classifier as loss weights. In turn, the classifier's output is fed to the HP-Net through a newly defined loss, a variant of the cross-entropy loss. The two networks are trained jointly in an adversarial way where, as the classifier learns to improve its predictions, the HP-Net refines its hardness scores. Given the learned hardness predictor, a simple implementation of realistic classifiers is proposed by rejecting examples with large scores. Experimental results not only provide evidence in support of the effectiveness of the proposed architecture and the learned hardness predictor, but also show that the realistic classifier always improves performance on the examples that it accepts to classify, performing better on these examples than an equivalent non-realistic classifier. Together, these results make it possible for realistic classifiers to guarantee good performance. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Pei_Wang_Towards_Realistic_Predictors_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Pei_Wang_Towards_Realistic_Predictors_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/towards-realistic-predictors |
Repo | |
Framework | |
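Two mechanisms from the abstract above lend themselves to a tiny sketch: hardness scores used as per-example loss weights during training, and rejection of examples whose predicted hardness exceeds a threshold at test time. The code below is a heavily simplified illustration of those two ideas, not the paper's architecture or training procedure; the layer sizes and the threshold are invented.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

classifier = nn.Linear(128, 10)                                   # stand-in head
hp_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

x = torch.randn(32, 128)                   # toy batch of features
y = torch.randint(0, 10, (32,))

# The HP-Net shares the classifier's input and emits a hardness score per
# example, which here weights the classification loss.
hardness = torch.sigmoid(hp_net(x)).squeeze(1)
per_example_loss = F.cross_entropy(classifier(x), y, reduction="none")
weighted_loss = (hardness * per_example_loss).mean()

# A "realistic" classifier rejects examples it deems too hard.
threshold = 0.7                            # invented for this sketch
accepted = hardness < threshold
predictions = classifier(x[accepted]).argmax(dim=1)
print(f"accepted {int(accepted.sum())} of {len(x)} examples")
```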
Autonomous Sub-domain Modeling for Dialogue Policy with Hierarchical Deep Reinforcement Learning
Title | Autonomous Sub-domain Modeling for Dialogue Policy with Hierarchical Deep Reinforcement Learning |
Authors | Giovanni Yoko Kristianto, Huiwen Zhang, Bin Tong, Makoto Iwayama, Yoshiyuki Kobayashi |
Abstract | Solving composite tasks, which consist of several inherent sub-tasks, remains a challenge in the research area of dialogue. Current studies have tackled this issue by manually decomposing the composite tasks into several sub-domains. However, much human effort is inevitable. This paper proposes a dialogue framework that autonomously models meaningful sub-domains and learns the policy over them. Our experiments show that our framework outperforms the baseline without sub-domains by 11% in terms of success rate, and is competitive with the framework using manually defined sub-domains. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5702/ |
https://www.aclweb.org/anthology/W18-5702 | |
PWC | https://paperswithcode.com/paper/autonomous-sub-domain-modeling-for-dialogue |
Repo | |
Framework | |
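As a generic illustration of the two-level structure described above (not the paper's model), the sketch below shows a meta-policy that selects a sub-domain and a sub-policy that selects a dialogue action within it, both as tabular epsilon-greedy choices; the sub-domain and action names are invented.

```python
import random
from collections import defaultdict

EPSILON = 0.1
meta_q = defaultdict(float)   # Q(state, sub_domain)
sub_q = defaultdict(float)    # Q((sub_domain, state), action)

SUB_DOMAINS = ["request_slots", "confirm_booking"]       # invented
ACTIONS = {"request_slots": ["ask_date", "ask_party_size"],
           "confirm_booking": ["confirm", "offer_alternative"]}

def epsilon_greedy(q_table, keys):
    if random.random() < EPSILON:
        return random.choice(keys)
    return max(keys, key=lambda k: q_table[k])

def act(state):
    # Level 1: the meta-policy hands control to one sub-domain.
    sub = epsilon_greedy(meta_q, [(state, d) for d in SUB_DOMAINS])[1]
    # Level 2: the sub-policy picks an action inside that sub-domain.
    action = epsilon_greedy(sub_q, [((sub, state), a) for a in ACTIONS[sub]])[1]
    return sub, action

print(act("user_wants_table"))
```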
Coding Structures and Actions with the COSTA Scheme in Medical Conversations
Title | Coding Structures and Actions with the COSTA Scheme in Medical Conversations |
Authors | Nan Wang, Yan Song, Fei Xia |
Abstract | This paper describes the COSTA scheme for coding structures and actions in conversation. Informed by Conversation Analysis, the scheme introduces an innovative method for marking multi-layer structural organization of conversation and a structure-informed taxonomy of actions. In addition, we create a corpus of naturally occurring medical conversations, containing 318 video-recorded and manually transcribed pediatric consultations. Based on the annotated corpus, we investigate 1) treatment decision-making process in medical conversations, and 2) effects of physician-caregiver communication behaviors on antibiotic over-prescribing. Although the COSTA annotation scheme is developed based on data from the task-specific domain of pediatric consultations, it can be easily extended to apply to more general domains and other languages. |
Tasks | Decision Making |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2309/ |
https://www.aclweb.org/anthology/W18-2309 | |
PWC | https://paperswithcode.com/paper/coding-structures-and-actions-with-the-costa |
Repo | |
Framework | |
Biomedical Event Extraction Using Convolutional Neural Networks and Dependency Parsing
Title | Biomedical Event Extraction Using Convolutional Neural Networks and Dependency Parsing |
Authors | Jari Björne, Tapio Salakoski |
Abstract | Event and relation extraction are central tasks in biomedical text mining. Where relation extraction concerns the detection of semantic connections between pairs of entities, event extraction expands this concept with the addition of trigger words, multiple arguments and nested events, in order to more accurately model the diversity of natural language. In this work we develop a convolutional neural network that can be used for both event and relation extraction. We use a linear representation of the input text, where information is encoded with various vector space embeddings. Most notably, we encode the parse graph into this linear space using dependency path embeddings. We integrate our neural network into the open source Turku Event Extraction System (TEES) framework. Using this system, our machine learning model can be easily applied to a large set of corpora from e.g. the BioNLP, DDI Extraction and BioCreative shared tasks. We evaluate our system on 12 different event, relation and NER corpora, showing good generalizability to many tasks and achieving improved performance on several corpora. |
Tasks | Dependency Parsing, Relation Extraction, Semantic Parsing, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2311/ |
https://www.aclweb.org/anthology/W18-2311 | |
PWC | https://paperswithcode.com/paper/biomedical-event-extraction-using |
Repo | |
Framework | |
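The classifier described above runs a convolutional network over a linear representation of the text in which each token position concatenates several embeddings, notably dependency-path embeddings. The sketch below is a minimal, generic version of that input layout, not the TEES implementation; all dimensions are invented.

```python
import torch
import torch.nn as nn

class TokenCNN(nn.Module):
    """Toy classifier over a token sequence where each position concatenates
    a word embedding and a dependency-path embedding (dimensions invented)."""

    def __init__(self, word_dim=100, path_dim=25, n_filters=64, n_classes=5):
        super().__init__()
        self.conv = nn.Conv1d(word_dim + path_dim, n_filters,
                              kernel_size=3, padding=1)
        self.out = nn.Linear(n_filters, n_classes)

    def forward(self, word_emb, path_emb):
        x = torch.cat([word_emb, path_emb], dim=-1)    # (batch, seq, dim)
        x = torch.relu(self.conv(x.transpose(1, 2)))   # (batch, filters, seq)
        x = x.max(dim=2).values                        # global max pooling
        return self.out(x)

model = TokenCNN()
logits = model(torch.randn(8, 20, 100), torch.randn(8, 20, 25))
print(logits.shape)  # torch.Size([8, 5])
```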
Constraining MGbank: Agreement, L-Selection and Supertagging in Minimalist Grammars
Title | Constraining MGbank: Agreement, L-Selection and Supertagging in Minimalist Grammars |
Authors | John Torr |
Abstract | This paper reports on two strategies that have been implemented for improving the efficiency and precision of wide-coverage Minimalist Grammar (MG) parsing. The first extends the formalism presented in Torr and Stabler (2016) with a mechanism for enforcing fine-grained selectional restrictions and agreements. The second is a method for factoring computationally costly null heads out from bottom-up MG parsing; this has the additional benefit of rendering the formalism fully compatible for the first time with highly efficient Markovian supertaggers. These techniques aided in the task of generating MGbank, the first wide-coverage corpus of Minimalist Grammar derivation trees. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1055/ |
https://www.aclweb.org/anthology/P18-1055 | |
PWC | https://paperswithcode.com/paper/constraining-mgbank-agreement-l-selection-and |
Repo | |
Framework | |
Natural Language Generation for Polysynthetic Languages: Language Teaching and Learning Software for Kanyen'kéha (Mohawk)
Title | Natural Language Generation for Polysynthetic Languages: Language Teaching and Learning Software for Kanyen'kéha (Mohawk) |
Authors | Greg Lessard, Nathan Brinklow, Michael Levison |
Abstract | Kanyen'kéha (in English, Mohawk) is an Iroquoian language spoken primarily in Eastern Canada (Ontario, Québec). Classified as endangered, it has only a small number of speakers and very few younger native speakers. Consequently, teachers and courses, teaching materials and software are urgently needed. In the case of software, the polysynthetic nature of Kanyen'kéha means that the number of possible combinations grows exponentially and soon surpasses attempts to capture variant forms by hand. It is in this context that we describe an attempt to produce language teaching materials based on a generative approach. A natural language generation environment (ivi/Vinci) embedded in a web environment (VinciLingua) makes it possible to produce, by rule, variant forms of indefinite complexity. These may be used as models to explore, or as materials to which learners respond. Generated materials may take the form of written text, oral utterances, or images; responses may be typed on a keyboard, gestural (using a mouse) or, to a limited extent, oral. The software also provides complex orthographic, morphological and syntactic analysis of learner productions. We describe the trajectory of development of materials for a suite of four courses on Kanyen'kéha, the first of which will be taught in the fall of 2018. |
Tasks | Text Generation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4805/ |
https://www.aclweb.org/anthology/W18-4805 | |
PWC | https://paperswithcode.com/paper/natural-language-generation-for-polysynthetic |
Repo | |
Framework | |
Sentiment Classification towards Question-Answering with Hierarchical Matching Network
Title | Sentiment Classification towards Question-Answering with Hierarchical Matching Network |
Authors | Chenlin Shen, Changlong Sun, Jingjing Wang, Yangyang Kang, Shoushan Li, Xiaozhong Liu, Luo Si, Min Zhang, Guodong Zhou |
Abstract | In an e-commerce environment, a user-oriented question-answering (QA) text pair can carry rich sentiment information. In this study, we propose a novel task and method to address QA sentiment analysis. In particular, we create a high-quality annotated corpus with specially designed annotation guidelines for QA-style sentiment classification. On this basis, we propose a three-stage hierarchical matching network to explore deep sentiment information in a QA text pair. First, we segment both the question and answer text into sentences and construct a number of [Q-sentence, A-sentence] units in each QA text pair. Then, by leveraging a QA bidirectional matching layer, the proposed approach can learn the matching vectors of each [Q-sentence, A-sentence] unit. Finally, we characterize the importance of the generated matching vectors via a self-matching attention layer. Experimental results, compared against a number of state-of-the-art baselines, demonstrate the effectiveness of the proposed approach for QA-style sentiment classification. |
Tasks | Opinion Mining, Question Answering, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1401/ |
https://www.aclweb.org/anthology/D18-1401 | |
PWC | https://paperswithcode.com/paper/sentiment-classification-towards-question |
Repo | |
Framework | |
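The pipeline described above has three stages: build [Q-sentence, A-sentence] units, match each unit, and aggregate the matches with self-matching attention. The sketch below is a heavily simplified, non-learned analogue of that pipeline (cosine matching and a softmax over unit scores), not the paper's network; the sentence encoder is a deterministic stand-in.

```python
import numpy as np

def sentences(text):
    return [s.strip() for s in text.split(".") if s.strip()]

def encode(sentence, dim=16):
    """Stand-in sentence encoder: deterministic pseudo-random vectors.
    The paper learns sentence representations instead."""
    rng = np.random.default_rng(sum(ord(c) for c in sentence) % 2**32)
    return rng.normal(size=dim)

def qa_match_score(question, answer):
    # Stage 1: all [Q-sentence, A-sentence] units.
    units = [(q, a) for q in sentences(question) for a in sentences(answer)]
    # Stage 2 (simplified): match each unit by cosine similarity.
    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    scores = np.array([cosine(encode(q), encode(a)) for q, a in units])
    # Stage 3 (simplified): attention-style softmax weighting of unit scores.
    weights = np.exp(scores) / np.exp(scores).sum()
    return float((weights * scores).sum())

print(qa_match_score("Is the battery life good. Does it overheat.",
                     "The battery lasts two days. It never gets hot."))
```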
Analytic Expressions for Probabilistic Moments of PL-DNN With Gaussian Input
Title | Analytic Expressions for Probabilistic Moments of PL-DNN With Gaussian Input |
Authors | Adel Bibi, Modar Alfadly, Bernard Ghanem |
Abstract | The outstanding performance of deep neural networks (DNNs), for the visual recognition task in particular, has been demonstrated on several large-scale benchmarks. This performance has immensely strengthened the line of research that aims to understand and analyze the driving reasons behind the effectiveness of these networks. One important aspect of this analysis has recently gained much attention, namely the reaction of a DNN to noisy input. This has spawned research on developing adversarial input attacks as well as training strategies that make DNNs more robust against these attacks. To this end, we derive in this paper exact analytic expressions for the first and second moments (mean and variance) of a small piecewise linear (PL) network (Affine, ReLU, Affine) subject to general Gaussian input. We experimentally show that these expressions are tight under simple linearizations of deeper PL-DNNs, especially popular architectures in the literature (e.g. LeNet and AlexNet). Extensive experiments on image classification show that these expressions can be used to study the behaviour of the output mean of the logits for each class, the interclass confusion and the pixel-level spatial noise sensitivity of the network. Moreover, we show how these expressions can be used to systematically construct targeted and non-targeted adversarial attacks. |
Tasks | Image Classification |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Bibi_Analytic_Expressions_for_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Bibi_Analytic_Expressions_for_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/analytic-expressions-for-probabilistic |
Repo | |
Framework | |
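As a small, self-contained reference for the kind of closed form involved (not the paper's expressions, which cover a full Affine-ReLU-Affine network under multivariate Gaussian input): for a scalar Gaussian $z \sim \mathcal{N}(\mu, \sigma^2)$, the well-known moments of $\mathrm{ReLU}(z)$ are $\mathbb{E}[\mathrm{ReLU}(z)] = \mu\,\Phi(\mu/\sigma) + \sigma\,\phi(\mu/\sigma)$ and $\mathbb{E}[\mathrm{ReLU}(z)^2] = (\mu^2 + \sigma^2)\,\Phi(\mu/\sigma) + \mu\sigma\,\phi(\mu/\sigma)$. The sketch below checks these against Monte Carlo sampling.

```python
import numpy as np
from scipy.stats import norm

def relu_moments(mu, sigma):
    """Exact mean and variance of ReLU(z) for z ~ N(mu, sigma^2)."""
    a = mu / sigma
    mean = mu * norm.cdf(a) + sigma * norm.pdf(a)
    second = (mu**2 + sigma**2) * norm.cdf(a) + mu * sigma * norm.pdf(a)
    return mean, second - mean**2

mu, sigma = 0.5, 2.0
z = np.random.default_rng(0).normal(mu, sigma, size=1_000_000)
r = np.maximum(z, 0.0)
print(relu_moments(mu, sigma))   # analytic mean and variance
print(r.mean(), r.var())         # Monte Carlo check
```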
Measuring Frame Instance Relatedness
Title | Measuring Frame Instance Relatedness |
Authors | Valerio Basile, Roque Lopez Condori, Elena Cabrio |
Abstract | Frame semantics is a well-established framework to represent the meaning of natural language in computational terms. In this work, we aim to propose a quantitative measure of relatedness between pairs of frame instances. We test our method on a dataset of sentence pairs, highlighting the correlation between our metric and human judgments of semantic similarity. Furthermore, we propose an application of our measure for clustering frame instances to extract prototypical knowledge from natural language. |
Tasks | Reading Comprehension, Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-2029/ |
https://www.aclweb.org/anthology/S18-2029 | |
PWC | https://paperswithcode.com/paper/measuring-frame-instance-relatedness |
Repo | |
Framework | |
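As a toy illustration of what a relatedness score between frame instances could look like (this is not the measure proposed in the paper), the sketch below represents each instance as a frame name plus its frame-element fillers and scores a pair by a weighted combination of frame identity and filler overlap; the frames and the weight are chosen for illustration.

```python
def relatedness(inst_a, inst_b, frame_weight=0.5):
    """Toy relatedness between two frame instances, each given as
    (frame_name, {frame_element: filler, ...}); the weighting is invented."""
    frame_score = 1.0 if inst_a[0] == inst_b[0] else 0.0
    fillers_a, fillers_b = set(inst_a[1].values()), set(inst_b[1].values())
    union = fillers_a | fillers_b
    filler_score = len(fillers_a & fillers_b) / len(union) if union else 0.0
    return frame_weight * frame_score + (1 - frame_weight) * filler_score

a = ("Commerce_buy", {"Buyer": "Mary", "Goods": "a car"})
b = ("Commerce_buy", {"Buyer": "John", "Goods": "a car"})
print(round(relatedness(a, b), 2))  # 0.67: same frame, one shared filler
```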
UniMelb at SemEval-2018 Task 12: Generative Implication using LSTMs, Siamese Networks and Semantic Representations with Synonym Fuzzing
Title | UniMelb at SemEval-2018 Task 12: Generative Implication using LSTMs, Siamese Networks and Semantic Representations with Synonym Fuzzing |
Authors | Anirudh Joshi, Tim Baldwin, Richard O. Sinnott, Cecile Paris |
Abstract | This paper describes a warrant classification system for SemEval 2018 Task 12, which attempts to learn semantic representations of reasons, claims and warrants. The system consists of 3 stacked LSTMs: one for the reason, one for the claim, and one shared Siamese network for the 2 candidate warrants. Our main contribution is to force the embeddings into a shared feature space using vector operations, semantic similarity classification, Siamese networks, and multi-task learning. In doing so, we learn a form of generative implication, encoding implication interrelationships between reasons, claims, and the associated correct and incorrect warrants. We further augment the limited task data by utilizing WordNet synonym "fuzzing". When applied to SemEval 2018 Task 12, our system performs well on the development data, and officially ranked 8th among 21 teams. |
Tasks | Multi-Task Learning, Semantic Similarity, Semantic Textual Similarity, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1190/ |
https://www.aclweb.org/anthology/S18-1190 | |
PWC | https://paperswithcode.com/paper/unimelb-at-semeval-2018-task-12-generative |
Repo | |
Framework | |
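The data augmentation step described above, WordNet synonym "fuzzing", replaces words with WordNet synonyms to enlarge the limited training set. The sketch below illustrates that generic idea with NLTK's WordNet interface; it is not the authors' implementation, and the replacement probability is invented.

```python
import random
from nltk.corpus import wordnet  # requires: nltk.download("wordnet")

def synonym_fuzz(tokens, p=0.3, seed=0):
    """Randomly replace tokens with a WordNet synonym (any synset lemma).
    A generic sketch of synonym-based augmentation, not the paper's code."""
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        lemmas = {l.name().replace("_", " ")
                  for s in wordnet.synsets(tok) for l in s.lemmas()} - {tok}
        out.append(rng.choice(sorted(lemmas)) if lemmas and rng.random() < p
                   else tok)
    return out

print(synonym_fuzz("the company raised prices to increase profit".split()))
```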