January 25, 2020


Paper Group NANR 15


GenderQuant: Quantifying Mention-Level Genderedness


Title	GenderQuant: Quantifying Mention-Level Genderedness
Authors	Ananya, Nitya Parthasarthi, Sameer Singh
Abstract	Language is gendered if the context surrounding a mention is suggestive of a particular binary gender for that mention. Detecting the different ways in which language is gendered is an important task since gendered language can bias NLP models (such as for coreference resolution). This task is challenging since genderedness is often expressed in subtle ways. Existing approaches need considerable annotation efforts for each language, domain, and author, and often require handcrafted lexicons and features. Additionally, these approaches do not provide a quantifiable measure of how gendered the text is, nor are they applicable at the fine-grained mention level. In this paper, we use existing NLP pipelines to automatically annotate gender of mentions in the text. On corpora labeled using this method, we train a supervised classifier to predict the gender of any mention from its context and evaluate it on unseen text. The model confidence for a mention's gender can be used as a proxy to indicate the level of genderedness of the context. We test this gendered language detector on movie summaries, movie reviews, news articles, and fiction novels, achieving an AUC-ROC of up to 0.71, and observe that the model predictions agree with human judgments collected for this task. We also provide examples of detected gendered sentences from aforementioned domains.
Tasks	Coreference Resolution
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1303/
PDF	https://www.aclweb.org/anthology/N19-1303
PWC	https://paperswithcode.com/paper/genderquant-quantifying-mention-level
Repo
Framework
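
The abstract treats classifier confidence as a proxy for genderedness and reports AUC-ROC. The following is a minimal sketch of both ideas, not the authors' implementation: the `genderedness` scoring rule, the function names, and the toy data are all assumptions made for illustration.

```python
def genderedness(p_female: float) -> float:
    """Map a predicted probability to a score in [0, 1].

    Hypothetical rule: 0 means the context looks neutral (p = 0.5),
    1 means the classifier is fully confident in either gender.
    """
    return abs(p_female - 0.5) * 2.0


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive example
    outscores a random negative one (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = mention annotated female, 0 = annotated male.
labels = [1, 1, 1, 0, 0, 0]
p_female = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]

print(auc_roc(labels, p_female))
print([round(genderedness(p), 2) for p in p_female])
```

The pairwise definition of AUC-ROC is quadratic in the number of examples; it is used here only because it makes the metric's meaning explicit.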

Improving classification of Adverse Drug Reactions through Using Sentiment Analysis and Transfer Learning


Title	Improving classification of Adverse Drug Reactions through Using Sentiment Analysis and Transfer Learning
Authors	Hassan Alhuzali, Sophia Ananiadou
Abstract	The availability of large-scale and real-time data on social media has motivated research into adverse drug reactions (ADRs). ADR classification helps to identify negative effects of drugs, which can guide health professionals and pharmaceutical companies in making medications safer and advocating patients' safety. Based on the observation that in social media, negative sentiment is frequently expressed towards ADRs, this study presents a neural model that combines sentiment analysis with transfer learning techniques to improve ADR detection in social media postings. Our system is firstly trained to classify sentiment in tweets concerning current affairs, using the SemEval17-task4A corpus. We then apply transfer learning to adapt the model to the task of detecting ADRs in social media postings. We show that, in combination with rich representations of words and their contexts, transfer learning is beneficial, especially given the large degree of vocabulary overlap between the current affairs posts in the SemEval17-task4A corpus and posts about ADRs. We compare our results with previous approaches, and show that our model can outperform them by up to 3% F-score.
Tasks	Sentiment Analysis, Transfer Learning
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5036/
PDF	https://www.aclweb.org/anthology/W19-5036
PWC	https://paperswithcode.com/paper/improving-classification-of-adverse-drug
Repo
Framework

Coached Conversational Preference Elicitation: A Case Study in Understanding Movie Preferences


Title	Coached Conversational Preference Elicitation: A Case Study in Understanding Movie Preferences
Authors	Filip Radlinski, Krisztian Balog, Bill Byrne, Karthik Krishnamoorthi
Abstract	Conversational recommendation has recently attracted significant attention. As systems must understand users' preferences, training them has called for conversational corpora, typically derived from task-oriented conversations. We observe that such corpora often do not reflect how people naturally describe preferences. We present a new approach to obtaining user preferences in dialogue: Coached Conversational Preference Elicitation. It allows collection of natural yet structured conversational preferences. Studying the dialogues in one domain, we present a brief quantitative analysis of how people describe movie preferences at scale. Demonstrating the methodology, we release the CCPE-M dataset to the community with over 500 movie preference dialogues expressing over 10,000 preferences.
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/W19-5941/
PDF	https://www.aclweb.org/anthology/W19-5941
PWC	https://paperswithcode.com/paper/coached-conversational-preference-elicitation
Repo
Framework

Embeddia at SemEval-2019 Task 6: Detecting Hate with Neural Network and Transfer Learning Approaches


Title	Embeddia at SemEval-2019 Task 6: Detecting Hate with Neural Network and Transfer Learning Approaches
Authors	Andraž Pelicon, Matej Martinc, Petra Kralj Novak
Abstract	SemEval 2019 Task 6 was OffensEval: Identifying and Categorizing Offensive Language in Social Media. The task was further divided into three sub-tasks: offensive language identification, automatic categorization of offense types, and offense target identification. In this paper, we present the approaches used by the Embeddia team, who ranked fourth, eighteenth and fifth on the three sub-tasks. A different model was trained for each sub-task. For the first sub-task, we used a BERT model fine-tuned on the OLID dataset, while for the second and third sub-tasks we developed a custom neural network architecture which combines bag-of-words features and automatically generated sequence-based features. Our results show that combining automatically and manually crafted features fed into a neural architecture outperforms the transfer learning approach on more unbalanced datasets.
Tasks	Language Identification, Transfer Learning
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2108/
PDF	https://www.aclweb.org/anthology/S19-2108
PWC	https://paperswithcode.com/paper/embeddia-at-semeval-2019-task-6-detecting
Repo
Framework

Challenges of language change and variation: towards an extended treebank of Medieval French


Title	Challenges of language change and variation: towards an extended treebank of Medieval French
Authors	Mathilde Regnault, Sophie Prévost, Eric Villemonte de la Clergerie
Abstract
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-7816/
PDF	https://www.aclweb.org/anthology/W19-7816
PWC	https://paperswithcode.com/paper/challenges-of-language-change-and-variation
Repo
Framework

MSO with tests and reducts


Title	MSO with tests and reducts
Authors	Tim Fernando, David Woods, Carl Vogel
Abstract	Tests added to Kleene algebra (by Kozen and others) are considered within Monadic Second Order logic over strings, where they are likened to statives in natural language. Reducts are formed over tests and non-tests alike, specifying what is observable. Notions of temporal granularity are based on observable change, under the assumption that a finite set bounds what is observable (with the possibility of stretching such bounds by moving to a larger finite set). String projections at different granularities are conjoined by superpositions that provide another variant of concatenation for Booleans.
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/W19-3106/
PDF	https://www.aclweb.org/anthology/W19-3106
PWC	https://paperswithcode.com/paper/mso-with-tests-and-reducts
Repo
Framework

Biologically-Constrained Graphs for Global Connectomics Reconstruction


Title	Biologically-Constrained Graphs for Global Connectomics Reconstruction
Authors	Brian Matejek, Daniel Haehn, Haidong Zhu, Donglai Wei, Toufiq Parag, Hanspeter Pfister
Abstract	Most current state-of-the-art connectome reconstruction pipelines have two major steps: initial pixel-based segmentation with affinity prediction and watershed transform, and refined segmentation by merging over-segmented regions. These methods rely only on local context and are typically agnostic to the underlying biology. Since a few merge errors can lead to several incorrectly merged neuronal processes, these algorithms are currently tuned towards over-segmentation producing an overburden of costly proofreading. We propose a third step for connectomics reconstruction pipelines to refine an over-segmentation using both local and global context with an emphasis on adhering to the underlying biology. We first extract a graph from an input segmentation where nodes correspond to segment labels and edges indicate potential split errors in the over-segmentation. In order to increase throughput and allow for large-scale reconstruction, we employ biologically inspired geometric constraints based on neuron morphology to reduce the number of nodes and edges. Next, two neural networks learn these neuronal shapes to further aid the graph construction process. Lastly, we reformulate the region merging problem as a graph partitioning one to leverage global context. We demonstrate the performance of our approach on four real-world connectomics datasets with an average variation of information improvement of 21.3%.
Tasks	Electron Microscopy Image Segmentation, graph construction, graph partitioning
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Matejek_Biologically-Constrained_Graphs_for_Global_Connectomics_Reconstruction_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Matejek_Biologically-Constrained_Graphs_for_Global_Connectomics_Reconstruction_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/biologically-constrained-graphs-for-global
Repo
Framework
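
The abstract reports its result as an improvement in variation of information (VI), a standard metric for comparing two segmentations (lower is better). As a rough illustration of what the metric measures, here is a minimal sketch using the usual definition VI(A, B) = H(A|B) + H(B|A); the flat label lists and toy values are assumptions, and real connectomics volumes would be 3D arrays handled with array libraries.

```python
from collections import Counter
from math import log2


def variation_of_information(seg_a, seg_b):
    """VI(A, B) = H(A|B) + H(B|A) in bits, for two labelings of the same voxels."""
    if len(seg_a) != len(seg_b) or not seg_a:
        raise ValueError("segmentations must be non-empty and the same length")
    n = len(seg_a)
    count_a = Counter(seg_a)
    count_b = Counter(seg_b)
    count_ab = Counter(zip(seg_a, seg_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # Contribution of this label pair to H(B|A) and H(A|B), respectively.
        vi -= p_ab * (log2(n_ab / count_a[a]) + log2(n_ab / count_b[b]))
    return vi


# Toy example (made up): segment 1 of the ground truth is split in two.
ground_truth = [1, 1, 1, 1, 2, 2, 2, 2]
over_segmented = [1, 1, 3, 3, 2, 2, 2, 2]
after_merge = [1, 1, 1, 1, 2, 2, 2, 2]

print(variation_of_information(ground_truth, over_segmented))
print(variation_of_information(ground_truth, after_merge))
```

In this toy case the split error costs half a bit, and merging the two fragments back together brings VI to zero, which is the kind of correction the proposed refinement step targets.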

Proceedings of the Second Workshop on Economics and Natural Language Processing


Title	Proceedings of the Second Workshop on Economics and Natural Language Processing
Authors
Abstract
Tasks
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5100/
PDF	https://www.aclweb.org/anthology/D19-5100
PWC	https://paperswithcode.com/paper/proceedings-of-the-second-workshop-on-20
Repo
Framework

Unsupervised Hyper-alignment for Multilingual Word Embeddings


Title	Unsupervised Hyper-alignment for Multilingual Word Embeddings
Authors	Jean Alaux, Edouard Grave, Marco Cuturi, Armand Joulin
Abstract	We consider the problem of aligning continuous word representations, learned in multiple languages, to a common space. It was recently shown that, in the case of two languages, it is possible to learn such a mapping without supervision. This paper extends this line of work to the problem of aligning multiple languages to a common space. A solution is to independently map all languages to a pivot language. Unfortunately, this degrades the quality of indirect word translation. We thus propose a novel formulation that ensures composable mappings, leading to better alignments. We evaluate our method by jointly aligning word vectors in eleven languages, showing consistent improvement with indirect mappings while maintaining competitive performance on direct word translation.
Tasks	Multilingual Word Embeddings, Word Embeddings
Published	2019-05-01
URL	https://openreview.net/forum?id=HJe62s09tX
PDF	https://openreview.net/pdf?id=HJe62s09tX
PWC	https://paperswithcode.com/paper/unsupervised-hyper-alignment-for-multilingual
Repo
Framework
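
To make the notion of "composable mappings" concrete, the sketch below shows the property itself rather than the paper's optimization method, which the abstract does not spell out. It assumes each language has an orthogonal map into a common space and uses 2D rotations with arbitrary angles as hypothetical stand-ins for those maps.

```python
from math import cos, isclose, sin


def rotation(theta):
    """2x2 rotation matrix, standing in for an orthogonal map into the common space."""
    return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    """Transpose of a 2x2 matrix (the inverse, for an orthogonal matrix)."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def pairwise_map(q_src, q_tgt):
    """Map source-language vectors into the target space via the common space."""
    return matmul(transpose(q_tgt), q_src)


# Hypothetical per-language maps into the common space (angles are arbitrary).
q = {"en": rotation(0.3), "fr": rotation(1.1), "de": rotation(-0.7)}

direct = pairwise_map(q["en"], q["de"])
via_fr = matmul(pairwise_map(q["fr"], q["de"]), pairwise_map(q["en"], q["fr"]))

# With maps defined through a shared space, en -> fr -> de agrees with en -> de.
composes = all(
    isclose(direct[i][j], via_fr[i][j], abs_tol=1e-12)
    for i in range(2)
    for j in range(2)
)
print(composes)
```

Pairwise maps that are each estimated independently carry no such guarantee, which is one way to read the abstract's remark that indirect word translation degrades.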

Neural Machine Translation for English–Kazakh with Morphological Segmentation and Synthetic Data


Title	Neural Machine Translation for English–Kazakh with Morphological Segmentation and Synthetic Data
Authors	Antonio Toral, Lukas Edman, Galiya Yeshmagambetova, Jennifer Spenader
Abstract	This paper presents the systems submitted by the University of Groningen to the English–Kazakh language pair (both translation directions) for the WMT 2019 news translation task. We explore the potential benefits of (i) morphological segmentation (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English–Kazakh data and (iii) synthetic data, both for the source and for the target language. Our best submissions ranked second for Kazakh→English and third for English→Kazakh in terms of the BLEU automatic evaluation metric.
Tasks	Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5343/
PDF	https://www.aclweb.org/anthology/W19-5343
PWC	https://paperswithcode.com/paper/neural-machine-translation-for-english-kazakh
Repo
Framework

Lexicon Guided Attentive Neural Network Model for Argument Mining


Title	Lexicon Guided Attentive Neural Network Model for Argument Mining
Authors	Jian-Fu Lin, Kuo Yu Huang, Hen-Hsen Huang, Hsin-Hsi Chen
Abstract	Identification of argumentative components is an important stage of argument mining. Lexicon information is reported as one of the most frequently used features in the argument mining research. In this paper, we propose a methodology to integrate lexicon information into a neural network model by attention mechanism. We conduct experiments on the UKP dataset, which is collected from heterogeneous sources and contains several text types, e.g., microblog, Wikipedia, and news. We explore lexicons from various application scenarios such as sentiment analysis and emotion detection. We also compare the experimental results of leveraging different lexicons.
Tasks	Argument Mining, Sentiment Analysis
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4508/
PDF	https://www.aclweb.org/anthology/W19-4508
PWC	https://paperswithcode.com/paper/lexicon-guided-attentive-neural-network-model
Repo
Framework

Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering


Title	Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering
Authors	Xinglin Piao, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin
Abstract	Unsupervised clustering for high-dimensional data (such as image sets or video) is a hard problem in data processing and data mining, since these data always lie on a manifold (such as a Grassmann manifold). Inspired by Low Rank representation theory, researchers have proposed a series of effective clustering methods for high-dimensional data with a non-linear metric. However, most of these methods adopt the traditional single nuclear norm as the relaxation of the rank function, which leads to a suboptimal solution that deviates from the original one. In this paper, we propose a new low rank model for the high-dimensional data clustering task on the Grassmann manifold, based on the Double Nuclear norm, which is used to better approximate the rank minimization of a matrix. Further, to consider the inner geometry or structure of the data space, we integrate adaptive Laplacian regularization to construct the local relationship of data samples. The proposed models have been assessed on several public datasets for image set clustering. The experimental results show that the proposed models outperform state-of-the-art clustering methods.
Tasks
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Piao_Double_Nuclear_Norm_Based_Low_Rank_Representation_on_Grassmann_Manifolds_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Piao_Double_Nuclear_Norm_Based_Low_Rank_Representation_on_Grassmann_Manifolds_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/double-nuclear-norm-based-low-rank
Repo
Framework

Keeping Consistency of Sentence Generation and Document Classification with Multi-Task Learning


Title	Keeping Consistency of Sentence Generation and Document Classification with Multi-Task Learning
Authors	Toru Nishino, Shotaro Misawa, Ryuji Kano, Tomoki Taniguchi, Yasuhide Miura, Tomoko Ohkuma
Abstract	The automated generation of information indicating the characteristics of articles such as headlines, key phrases, summaries and categories helps writers to alleviate their workload. Previous research has tackled these tasks using neural abstractive summarization and classification methods. However, the outputs may be inconsistent if they are generated individually. The purpose of our study is to generate multiple outputs consistently. We introduce a multi-task learning model with a shared encoder and multiple decoders for each task. We propose a novel loss function called hierarchical consistency loss to maintain consistency among the attention weights of the decoders. To evaluate the consistency, we employ a human evaluation. The results show that our model generates more consistent headlines, key phrases and categories. In addition, our model outperforms the baseline model on the ROUGE scores, and generates more adequate and fluent headlines.
Tasks	Abstractive Text Summarization, Document Classification, Multi-Task Learning
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1315/
PDF	https://www.aclweb.org/anthology/D19-1315
PWC	https://paperswithcode.com/paper/keeping-consistency-of-sentence-generation
Repo
Framework

On Meaning-Preserving Adversarial Perturbations for Sequence-to-Sequence Models


Title	On Meaning-Preserving Adversarial Perturbations for Sequence-to-Sequence Models
Authors	Paul Michel, Graham Neubig, Xian Li, Juan Miguel Pino
Abstract	Adversarial examples have been shown to be an effective way of assessing the robustness of neural sequence-to-sequence (seq2seq) models, by applying perturbations to the input of a model leading to large degradation in performance. However, these perturbations are only indicative of a weakness in the model if they do not change the semantics of the input in a way that would change the expected output. Using the example of machine translation (MT), we propose a new evaluation framework for adversarial attacks on seq2seq models taking meaning preservation into account and demonstrate that existing methods may not preserve meaning in general. Based on these findings, we propose new constraints for attacks on word-based MT systems and show, via human and automatic evaluation, that they produce more semantically similar adversarial inputs. Furthermore, we show that performing adversarial training with meaning-preserving attacks is beneficial to the model in terms of adversarial robustness without hurting test performance.
Tasks	Machine Translation
Published	2019-05-01
URL	https://openreview.net/forum?id=BylkG20qYm
PDF	https://openreview.net/pdf?id=BylkG20qYm
PWC	https://paperswithcode.com/paper/on-meaning-preserving-adversarial
Repo
Framework

SyntaxFest 2019 Invited talk - Quantitative Computational Syntax: dependencies, intervention effects and word embeddings


Title	SyntaxFest 2019 Invited talk - Quantitative Computational Syntax: dependencies, intervention effects and word embeddings
Authors	Paola Merlo
Abstract
Tasks	Word Embeddings
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-7801/
PDF	https://www.aclweb.org/anthology/W19-7801
PWC	https://paperswithcode.com/paper/syntaxfest-2019-invited-talk-quantitative
Repo
Framework