Paper Group NANR 94
CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech. A Convex Relaxation for Multi-Graph Matching. Categorizing and Inferring the Relationship between the Text and Image of Twitter Posts. Customizing Neural Machine Translation for Subtitling. Bootstrapping a Neural Morphological Analyzer for St. Lawrence Island Yupik from a Finite-State Transducer. Improving Low-Resource Morphological Learning with Intermediate Forms from Finite State Transducers. Analyzing Linguistic Differences between Owner and Staff Attributed Tweets. Learning Multimodal Graph-to-Graph Translation for Molecule Optimization. Phylogenic Multi-Lingual Dependency Parsing. Deep Clustering by Gaussian Mixture Variational Autoencoders With Graph Embedding. Improving Neural Machine Translation by Achieving Knowledge Transfer with Sentence Alignment Learning. Predicting Future Frames Using Retrospective Cycle GAN. CLPsych 2019 Shared Task: Predicting the Degree of Suicide Risk in Reddit Posts. GTCOM Neural Machine Translation Systems for WMT19. News2vec: News Network Embedding with Subnode Information.
CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech
Title | CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech |
Authors | Yi-Ling Chung, Elizaveta Kuzmenko, Serra Sinem Tekiroglu, Marco Guerini |
Abstract | Although there is an unprecedented effort to provide adequate responses in terms of laws and policies to hate content on social media platforms, dealing with hatred online is still a tough problem. Tackling hate speech in the standard way of content deletion or user suspension may be charged with censorship and overblocking. One alternative strategy, which has so far received little attention from the research community, is to actually oppose hate content with counter-narratives (i.e. informed textual responses). In this paper, we describe the creation of the first large-scale, multilingual, expert-based dataset of hate-speech/counter-narrative pairs. This dataset has been built with the effort of more than 100 operators from three different NGOs that applied their training and expertise to the task. Together with the collected data we also provide additional annotations about expert demographics, hate and response type, and data augmentation through translation and paraphrasing. Finally, we provide initial experiments to assess the quality of our data. |
Tasks | Data Augmentation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1271/ |
PWC | https://paperswithcode.com/paper/conan-counter-narratives-through |
Repo | |
Framework | |
A Convex Relaxation for Multi-Graph Matching
Title | A Convex Relaxation for Multi-Graph Matching |
Authors | Paul Swoboda, Dagmar Kainmüller, Ashkan Mokarian, Christian Theobalt, Florian Bernard |
Abstract | We present a convex relaxation for the multi-graph matching problem. Our formulation allows for partial pairwise matchings, guarantees cycle consistency, and our objective incorporates both linear and quadratic costs. Moreover, we also present an extension to higher-order costs. In order to solve the convex relaxation we employ a message passing algorithm that optimizes the dual problem. We experimentally compare our algorithm on established benchmark problems from computer vision, as well as on large problems from biological image analysis, the size of which exceed previously investigated multi-graph matching instances. |
Tasks | Graph Matching |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Swoboda_A_Convex_Relaxation_for_Multi-Graph_Matching_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Swoboda_A_Convex_Relaxation_for_Multi-Graph_Matching_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/a-convex-relaxation-for-multi-graph-matching |
Repo | |
Framework | |
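As a concrete reading aid for the formulation sketched in the abstract above, here is a generic pairwise multi-graph matching objective with cycle consistency in LaTeX. The notation (cost vectors $c^{pq}$, quadratic costs $Q^{pq}$, partial matching matrices $X^{pq}$) is generic rather than the paper's own, and the exact relaxation and higher-order extension are not reproduced.

```latex
% Generic multi-graph matching objective over graphs G_1, ..., G_K.
% X^{pq} is a partial matching between the nodes of G_p and G_q;
% c^{pq} holds linear (unary) costs, Q^{pq} holds quadratic (pairwise) costs.
\begin{align}
  \min_{\{X^{pq}\}} \quad
    & \sum_{p < q} \Big( \langle c^{pq}, X^{pq} \rangle
      + \langle \operatorname{vec}(X^{pq}),\, Q^{pq}\operatorname{vec}(X^{pq}) \rangle \Big) \\
  \text{s.t.} \quad
    & X^{pq} \in \{0,1\}^{n_p \times n_q}, \quad
      X^{pq}\mathbf{1} \le \mathbf{1}, \quad (X^{pq})^{\top}\mathbf{1} \le \mathbf{1}, \\
    & X^{pq} X^{qr} \le X^{pr} \ \ \text{elementwise, for all } p, q, r
      \quad \text{(cycle consistency for partial matchings)}.
\end{align}
```

The convex relaxation described in the abstract replaces the integrality constraints with a tractable counterpart and optimizes the resulting dual with message passing.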
Categorizing and Inferring the Relationship between the Text and Image of Twitter Posts
Title | Categorizing and Inferring the Relationship between the Text and Image of Twitter Posts |
Authors | Alakananda Vempala, Daniel Preoțiuc-Pietro |
Abstract | Text in social media posts is frequently accompanied by images in order to provide content, supply context, or to express feelings. This paper studies how the meaning of the entire tweet is composed through the relationship between its textual content and its image. We build and release a data set of image tweets annotated with four classes which express whether the text or the image provides additional information to the other modality. We show that by combining the text and image information, we can build a machine learning approach that accurately distinguishes between the relationship types. Further, we derive insights into how these relationships are materialized through text and image content analysis and how they are impacted by user demographic traits. These methods can be used in several downstream applications including pre-training image tagging models, collecting distantly supervised data for image captioning, and can be directly used in end-user applications to optimize screen estate. |
Tasks | Image Captioning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1272/ |
PWC | https://paperswithcode.com/paper/categorizing-and-inferring-the-relationship |
Repo | |
Framework | |
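To make the modelling task described above concrete, here is a minimal late-fusion sketch: text and image features are concatenated and fed to a linear classifier over the four relationship classes. The feature extractors are placeholders (random vectors), so this is not the authors' model, only an illustration of combining the two modalities.

```python
# Minimal late-fusion sketch for the four-way text-image relationship task.
# Random vectors stand in for real tweet-text and image features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_tweets, text_dim, image_dim = 1000, 300, 512

text_feats = rng.normal(size=(n_tweets, text_dim))    # stand-in for text features
image_feats = rng.normal(size=(n_tweets, image_dim))  # stand-in for image features
labels = rng.integers(0, 4, size=n_tweets)            # four relationship classes

# Late fusion: concatenate the modalities and train a linear classifier.
features = np.concatenate([text_feats, image_feats], axis=1)
X_tr, X_te, y_tr, y_te = train_test_split(features, labels, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```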
Customizing Neural Machine Translation for Subtitling
Title | Customizing Neural Machine Translation for Subtitling |
Authors | Evgeny Matusov, Patrick Wilken, Yota Georgakopoulou |
Abstract | In this work, we customized a neural machine translation system for translation of subtitles in the domain of entertainment. The neural translation model was adapted to the subtitling content and style and extended by a simple, yet effective technique for utilizing inter-sentence context for short sentences such as dialog turns. The main contribution of the paper is a novel subtitle segmentation algorithm that predicts the end of a subtitle line given the previous word-level context using a recurrent neural network learned from human segmentation decisions. This model is combined with subtitle length and duration constraints established in the subtitling industry. We conducted a thorough human evaluation with two post-editors (English-to-Spanish translation of a documentary and a sitcom). It showed a notable productivity increase of up to 37% as compared to translating from scratch and significant reductions in human translation edit rate in comparison with the post-editing of the baseline non-adapted system without a learned segmentation model. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5209/ |
PWC | https://paperswithcode.com/paper/customizing-neural-machine-translation-for |
Repo | |
Framework | |
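The segmentation model described in the abstract above can be pictured with a small sketch: a recurrent network reads the translated words and predicts, after each word, whether a subtitle line break should follow, and decoding additionally enforces a hard character limit. The layer sizes, the 0.5 threshold and the 42-character limit below are illustrative assumptions, not values from the paper.

```python
# Sketch of a learned subtitle segmenter: a GRU reads words left to right and
# emits, per word, a logit for "insert a line break after this word".
import torch
import torch.nn as nn

class SubtitleSegmenter(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.break_score = nn.Linear(hidden_dim, 1)

    def forward(self, word_ids):                    # word_ids: (batch, seq_len)
        hidden, _ = self.rnn(self.embed(word_ids))
        return self.break_score(hidden).squeeze(-1)  # one break logit per word

def segment(words, break_probs, max_chars=42):
    """Greedy segmentation: break where the model is confident (probabilities,
    i.e. sigmoid of the logits above), or when the assumed character limit
    would otherwise be exceeded."""
    lines, current = [], []
    for word, p in zip(words, break_probs):
        if current and len(" ".join(current + [word])) > max_chars:
            lines.append(" ".join(current)); current = []
        current.append(word)
        if p > 0.5:
            lines.append(" ".join(current)); current = []
    if current:
        lines.append(" ".join(current))
    return lines
```

In practice the model's logits would be passed through a sigmoid before calling `segment`, and the length/duration constraints of the subtitling industry would replace the simple character limit used here.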
Bootstrapping a Neural Morphological Analyzer for St. Lawrence Island Yupik from a Finite-State Transducer
Title | Bootstrapping a Neural Morphological Analyzer for St. Lawrence Island Yupik from a Finite-State Transducer |
Authors | Lane Schwartz, Emily Chen, Benjamin Hunt, Sylvia L.R. Schreiner |
Abstract | |
Tasks | |
Published | 2019-02-01 |
URL | https://www.aclweb.org/anthology/W19-6012/ |
PWC | https://paperswithcode.com/paper/bootstrapping-a-neural-morphological-analyzer |
Repo | |
Framework | |
Improving Low-Resource Morphological Learning with Intermediate Forms from Finite State Transducers
Title | Improving Low-Resource Morphological Learning with Intermediate Forms from Finite State Transducers |
Authors | Sarah Moeller, Ghazaleh Kazeminejad, Andrew Cowell, Mans Hulden |
Abstract | |
Tasks | |
Published | 2019-02-01 |
URL | https://www.aclweb.org/anthology/W19-6011/ |
PWC | https://paperswithcode.com/paper/improving-low-resource-morphological-learning |
Repo | |
Framework | |
Analyzing Linguistic Differences between Owner and Staff Attributed Tweets
Title | Analyzing Linguistic Differences between Owner and Staff Attributed Tweets |
Authors | Daniel Preoțiuc-Pietro, Rita Devlin Marier |
Abstract | Research on social media has to date assumed that all posts from an account are authored by the same person. In this study, we challenge this assumption and study the linguistic differences between posts signed by the account owner or attributed to their staff. We introduce a novel data set of tweets posted by U.S. politicians who self-reported their tweets using a signature. We analyze the linguistic topics and style features that distinguish the two types of tweets. Predictive results show that we are able to predict owner and staff attributed tweets with good accuracy, even when not using any training data from that account. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1274/ |
PWC | https://paperswithcode.com/paper/analyzing-linguistic-differences-between |
Repo | |
Framework | |
Learning Multimodal Graph-to-Graph Translation for Molecule Optimization
Title | Learning Multimodal Graph-to-Graph Translation for Molecule Optimization |
Authors | Wengong Jin, Kevin Yang, Regina Barzilay, Tommi Jaakkola |
Abstract | We view molecule optimization as a graph-to-graph translation problem. The goal is to learn to map from one molecular graph to another with better properties based on an available corpus of paired molecules. Since molecules can be optimized in different ways, there are multiple viable translations for each input graph. A key challenge is therefore to model diverse translation outputs. Our primary contributions include a junction tree encoder-decoder for learning diverse graph translations along with a novel adversarial training method for aligning distributions of molecules. Diverse output distributions in our model are explicitly realized by low-dimensional latent vectors that modulate the translation process. We evaluate our model on multiple molecule optimization tasks and show that our model outperforms previous state-of-the-art baselines by a significant margin. |
Tasks | Graph-To-Graph Translation |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1xJAsA5F7 |
PDF | https://openreview.net/pdf?id=B1xJAsA5F7 |
PWC | https://paperswithcode.com/paper/learning-multimodal-graph-to-graph |
Repo | |
Framework | |
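The role of the low-dimensional latent vectors mentioned in the abstract can be illustrated with a toy sketch: the decoder conditions on both the encoded input and a sampled latent code z, so decoding the same input with different z values yields different outputs. This is deliberately not a junction-tree graph model; plain vectors stand in for molecular graphs purely to show the mechanism.

```python
# Toy illustration of latent-modulated translation: different sampled z vectors
# give different ("diverse") outputs for the same input.
import torch
import torch.nn as nn

class LatentTranslator(nn.Module):
    def __init__(self, in_dim=64, z_dim=8, hid=128, out_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hid + z_dim, hid), nn.ReLU(),
                                     nn.Linear(hid, out_dim))
        self.z_dim = z_dim

    def forward(self, x, z=None):
        h = self.encoder(x)
        if z is None:                         # sample a fresh latent code per call
            z = torch.randn(x.size(0), self.z_dim)
        return self.decoder(torch.cat([h, z], dim=-1))

model = LatentTranslator()
x = torch.randn(4, 64)                        # stand-ins for source molecules
y1, y2 = model(x), model(x)                   # two decodings of the same inputs
print((y1 - y2).abs().mean())                 # nonzero: different z, different outputs
```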
Phylogenic Multi-Lingual Dependency Parsing
Title | Phylogenic Multi-Lingual Dependency Parsing |
Authors | Mathieu Dehouck, Pascal Denis |
Abstract | Languages evolve and diverge over time. Their evolutionary history is often depicted in the shape of a phylogenetic tree. Assuming parsing models are representations of their languages' grammars, their evolution should follow a structure similar to that of the phylogenetic tree. In this paper, drawing inspiration from multi-task learning, we make use of the phylogenetic tree to guide the learning of multi-lingual dependency parsers, leveraging languages' structural similarities. Experiments on data from the Universal Dependencies project show that phylogenetic training is beneficial to low-resourced languages and to well-resourced language families. As a side product of phylogenetic training, our model is able to perform zero-shot parsing of previously unseen languages. |
Tasks | Dependency Parsing, Multi-Task Learning |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1017/ |
PWC | https://paperswithcode.com/paper/phylogenic-multi-lingual-dependency-parsing |
Repo | |
Framework | |
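A hedged sketch of what phylogenetic training can look like in practice: walk the language tree from the root, initialize each node's parser from its parent, and fine-tune it on the treebanks of all languages below that node, so leaf models are language-specific while sharing ancestry. The tree fragment and the `train` helper below are hypothetical placeholders, not the authors' implementation; zero-shot parsing of an unseen language would then use the model of its closest ancestor node.

```python
# Sketch of phylogenetic training: descend the language tree, initializing each
# node's model from its parent and training on the languages it dominates.
import copy

phylo_tree = {                                 # toy fragment of a phylogeny
    "Indo-European": ["Romance", "Germanic"],
    "Romance": ["French", "Spanish"],
    "Germanic": ["English", "German"],
}

def languages_under(node):
    children = phylo_tree.get(node, [])
    if not children:                           # a leaf is an actual language
        return [node]
    return [lang for c in children for lang in languages_under(c)]

def train(model, languages):
    # Placeholder: fine-tune `model` on the treebanks of `languages`.
    return model

def phylogenetic_training(node, parent_model):
    model = train(copy.deepcopy(parent_model), languages_under(node))
    trained = {node: model}
    for child in phylo_tree.get(node, []):
        trained.update(phylogenetic_training(child, model))
    return trained

models = phylogenetic_training("Indo-European", parent_model=object())
```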
Deep Clustering by Gaussian Mixture Variational Autoencoders With Graph Embedding
Title | Deep Clustering by Gaussian Mixture Variational Autoencoders With Graph Embedding |
Authors | Linxiao Yang, Ngai-Man Cheung, Jiaying Li, Jun Fang |
Abstract | We propose DGG: Deep clustering via a Gaussian-mixture variational autoencoder (VAE) with Graph embedding. To facilitate clustering, we apply a Gaussian mixture model (GMM) as the prior in the VAE. To handle data with complex spread, we apply graph embedding. Our idea is that graph information, which captures local data structures, is an excellent complement to the deep GMM. Combining them helps the network learn powerful representations that follow both global model and local structural constraints. Therefore, our method unifies model-based and similarity-based approaches for clustering. To combine graph embedding with the probabilistic deep GMM, we propose a novel stochastic extension of graph embedding: we treat samples as nodes on a graph and minimize the weighted distance between their posterior distributions. We use the Jensen-Shannon divergence as the distance. We combine the divergence minimization with the log-likelihood maximization of the deep GMM. We derive formulations to obtain a unified objective that enables simultaneous deep representation learning and clustering. Our experimental results show that our proposed DGG outperforms recent deep Gaussian mixture methods (model-based) and deep spectral clustering (similarity-based). Our results highlight the advantages of combining model-based and similarity-based clustering as proposed in this work. Our code is published here: https://github.com/dodoyang0929/DGG.git |
Tasks | Graph Embedding, Representation Learning |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Yang_Deep_Clustering_by_Gaussian_Mixture_Variational_Autoencoders_With_Graph_Embedding_ICCV_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_ICCV_2019/papers/Yang_Deep_Clustering_by_Gaussian_Mixture_Variational_Autoencoders_With_Graph_Embedding_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/deep-clustering-by-gaussian-mixture |
Repo | |
Framework | |
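The graph-embedding term described in the abstract (minimizing a weighted distance between posterior distributions of neighbouring samples) can be sketched as follows: for each graph edge (i, j) with weight w_ij, penalize the Jensen-Shannon divergence between the diagonal-Gaussian posteriors q_i and q_j. Since the JS divergence between Gaussians has no closed form, it is estimated here by Monte Carlo; the estimator and all shapes are illustrative assumptions, and in the full objective this term would be combined with the deep GMM log-likelihood.

```python
# Sketch of the graph regularizer: weighted Jensen-Shannon divergence between
# the diagonal-Gaussian posteriors of neighbouring samples, estimated by MC.
import numpy as np

def gauss_logpdf(x, mu, var):
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var, axis=-1)

def js_divergence_mc(mu1, var1, mu2, var2, n_samples=256, rng=None):
    rng = rng or np.random.default_rng(0)
    x1 = mu1 + np.sqrt(var1) * rng.standard_normal((n_samples, mu1.shape[-1]))
    x2 = mu2 + np.sqrt(var2) * rng.standard_normal((n_samples, mu2.shape[-1]))
    def log_mix(x):                                   # log of the 50/50 mixture
        return np.logaddexp(gauss_logpdf(x, mu1, var1),
                            gauss_logpdf(x, mu2, var2)) - np.log(2.0)
    kl1 = np.mean(gauss_logpdf(x1, mu1, var1) - log_mix(x1))   # KL(q_i || m)
    kl2 = np.mean(gauss_logpdf(x2, mu2, var2) - log_mix(x2))   # KL(q_j || m)
    return 0.5 * (kl1 + kl2)

def graph_regularizer(edges, weights, mus, vars_):
    """Sum over graph edges of w_ij * JS(q_i, q_j)."""
    return sum(w * js_divergence_mc(mus[i], vars_[i], mus[j], vars_[j])
               for (i, j), w in zip(edges, weights))
```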
Improving Neural Machine Translation by Achieving Knowledge Transfer with Sentence Alignment Learning
Title | Improving Neural Machine Translation by Achieving Knowledge Transfer with Sentence Alignment Learning |
Authors | Xuewen Shi, Heyan Huang, Wenguan Wang, Ping Jian, Yi-Kun Tang |
Abstract | Neural Machine Translation (NMT) optimized by Maximum Likelihood Estimation (MLE) lacks the guarantee of translation adequacy. To alleviate this problem, we propose an NMT approach that heightens adequacy in machine translation by transferring the semantic knowledge learned from bilingual sentence alignment. Specifically, we first design a discriminator that learns to estimate a sentence alignment score over translation candidates, and then the learned semantic knowledge is transferred to the NMT model under an adversarial learning framework. We also propose a gated self-attention based encoder for sentence embedding. Furthermore, an N-pair training loss is introduced in our framework to aid the discriminator in better capturing lexical evidence in translation candidates. Experimental results show that our proposed method outperforms baseline NMT models on Chinese-to-English and English-to-German translation tasks. Further analysis also indicates the detailed semantic knowledge transferred from the discriminator to the NMT model. |
Tasks | Machine Translation, Sentence Embedding, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/K19-1025/ |
PWC | https://paperswithcode.com/paper/improving-neural-machine-translation-by-2 |
Repo | |
Framework | |
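The N-pair training loss mentioned in the abstract can be sketched as a batch-softmax objective: each source-sentence embedding should score its own reference translation higher than the other translations in the batch. The encoders producing the embeddings are not reproduced here; random tensors stand in for them, so this shows only the loss, not the paper's full adversarial framework.

```python
# Sketch of an N-pair style loss: the diagonal of the score matrix holds the
# positive (source, reference) pairs; all other columns act as negatives.
import torch
import torch.nn.functional as F

def n_pair_loss(src_emb, tgt_emb):
    """src_emb, tgt_emb: (batch, dim); row i of tgt_emb is the positive for row i."""
    scores = src_emb @ tgt_emb.t()                  # pairwise alignment scores
    targets = torch.arange(src_emb.size(0))         # diagonal entries are positives
    return F.cross_entropy(scores, targets)

src = F.normalize(torch.randn(8, 256), dim=-1)      # stand-in source embeddings
tgt = F.normalize(torch.randn(8, 256), dim=-1)      # stand-in translation embeddings
print(n_pair_loss(src, tgt))
```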
Predicting Future Frames Using Retrospective Cycle GAN
Title | Predicting Future Frames Using Retrospective Cycle GAN |
Authors | Yong-Hoon Kwon, Min-Gyu Park |
Abstract | Recent advances in deep learning have significantly improved the performance of video prediction; however, top-performing algorithms start to generate blurry predictions as they attempt to predict farther future frames. In this paper, we propose a unified generative adversarial network for predicting accurate and temporally consistent future frames over time, even in a challenging environment. The key idea is to train a single generator that can predict both future and past frames while enforcing the consistency of bi-directional prediction using retrospective cycle constraints. Moreover, we employ two discriminators, not only to identify fake frames but also to distinguish image sequences containing fake frames from real sequences. The latter discriminator, the sequence discriminator, plays a crucial role in predicting temporally consistent future frames. We experimentally verify the proposed framework on various real-world videos captured by car-mounted cameras, surveillance cameras, and arbitrary devices, comparing against state-of-the-art methods. |
Tasks | Video Prediction |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Kwon_Predicting_Future_Frames_Using_Retrospective_Cycle_GAN_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Kwon_Predicting_Future_Frames_Using_Retrospective_Cycle_GAN_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/predicting-future-frames-using-retrospective |
Repo | |
Framework | |
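The retrospective cycle constraint described in the abstract can be sketched as follows: a single generator G predicts the future frame from past frames; the sequence is then reversed with the predicted frame substituted in, and G must recover the earliest frame. The sketch keeps only the reconstruction terms; the adversarial losses, the frame and sequence discriminators, and the actual network architecture are omitted, and the exact composition of the reversed clip is an assumption.

```python
# Sketch of the retrospective cycle reconstruction terms; G is a placeholder
# network mapping a (T-1)-frame clip to a single predicted frame.
import torch
import torch.nn.functional as F

def retrospective_cycle_loss(G, frames):
    """frames: (batch, T, C, H, W) ground-truth sequence."""
    past, future = frames[:, :-1], frames[:, -1]
    pred_future = G(past)                                   # forward prediction
    # Reverse time, substituting the *predicted* future frame, and predict back.
    reversed_clip = torch.flip(
        torch.cat([frames[:, 1:-1], pred_future.unsqueeze(1)], dim=1), dims=[1])
    pred_first = G(reversed_clip)                           # retrospective prediction
    return F.l1_loss(pred_future, future) + F.l1_loss(pred_first, frames[:, 0])
```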
CLPsych 2019 Shared Task: Predicting the Degree of Suicide Risk in Reddit Posts
Title | CLPsych 2019 Shared Task: Predicting the Degree of Suicide Risk in Reddit Posts |
Authors | Ayah Zirikly, Philip Resnik, Özlem Uzuner, Kristy Hollingshead |
Abstract | The shared task for the 2019 Workshop on Computational Linguistics and Clinical Psychology (CLPsych'19) introduced an assessment of suicide risk based on social media postings, using data from Reddit to identify users at no, low, moderate, or severe risk. Two variations of the task focused on users whose posts to the r/SuicideWatch subreddit indicated they might be at risk; a third task looked at screening users based only on their more everyday (non-SuicideWatch) posts. We received submissions from 15 different teams, and the results provide progress and insight into the value of language signal in helping to predict risk level. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-3003/ |
PWC | https://paperswithcode.com/paper/clpsych-2019-shared-task-predicting-the |
Repo | |
Framework | |
GTCOM Neural Machine Translation Systems for WMT19
Title | GTCOM Neural Machine Translation Systems for WMT19 |
Authors | Chao Bei, Hao Zong, Conghu Yuan, Qingming Liu, Baoyong Fan |
Abstract | This paper describes the Global Tone Communication Co., Ltd.'s submission to the WMT19 shared news translation task. We participate in six directions: English to (Gujarati, Lithuanian and Finnish) and (Gujarati, Lithuanian and Finnish) to English. We obtain the best BLEU scores among all participants in the English-to-Gujarati and Lithuanian-to-English directions (28.2 and 36.3 respectively). The submitted systems mainly rely on back-translation, knowledge distillation and reranking to build a competitive model for this task. We also apply a language model to filter monolingual data, back-translated data and parallel data; the data-filtering techniques include rule-based filtering and language models. In addition, we conduct several experiments to validate different knowledge distillation techniques and right-to-left (R2L) reranking. |
Tasks | Language Modelling, Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5305/ |
PWC | https://paperswithcode.com/paper/gtcom-neural-machine-translation-systems-for |
Repo | |
Framework | |
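The language-model-based data filtering mentioned in the abstract can be sketched as a perplexity threshold: score each candidate sentence with an in-domain LM and keep only the low-perplexity ones. A tiny add-one-smoothed unigram model stands in below for whatever LM was actually used, and the threshold value is purely illustrative.

```python
# Sketch of LM-based data filtering: keep sentences whose perplexity under an
# in-domain language model falls below a threshold.
import math
from collections import Counter

def train_unigram(corpus):
    counts = Counter(tok for line in corpus for tok in line.split())
    total = sum(counts.values())
    vocab = len(counts) + 1                                   # +1 for unseen tokens
    return lambda tok: (counts[tok] + 1) / (total + vocab)    # add-one smoothing

def perplexity(prob, line):
    toks = line.split()
    return math.exp(-sum(math.log(prob(t)) for t in toks) / max(len(toks), 1))

in_domain = ["the match ended in a draw", "the government announced new rules"]
candidates = ["the new rules ended the match", "xjq zzv qqq lorem"]

prob = train_unigram(in_domain)
kept = [s for s in candidates if perplexity(prob, s) < 15.0]  # illustrative threshold
print(kept)   # the fluent in-domain sentence survives, the noise line does not
```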
News2vec: News Network Embedding with Subnode Information
Title | News2vec: News Network Embedding with Subnode Information |
Authors | Ye Ma, Lu Zong, Yikang Yang, Jionglong Su |
Abstract | With the development of NLP technologies, news can be automatically categorized and labeled according to a variety of characteristics, and at the same time be represented as low-dimensional embeddings. However, a systematic approach that effectively integrates the inherited features and inter-textual knowledge of news to represent the collective information with a dense vector is still lacking. With the aim of filling this gap, the News2vec model is proposed to allow the distributed representation of news taking into account its associated features. To describe the cross-document linkages between news, a network consisting of news and its attributes is constructed. Moreover, the News2vec model treats the news node as a bag of features by developing the Subnode model. Based on the biased random walk and the skip-gram model, each news feature is mapped to a vector, and the news is thus represented as the sum of its features. This approach offers an easy solution to create embeddings for unseen news nodes based on their attributes. To evaluate our model, dimension reduction plots and correlation heat-maps are created to visualize the news vectors, together with the application of two downstream tasks, stock movement prediction and news recommendation. By comparing with other established text/sentence embedding models, we show that News2vec achieves state-of-the-art performance on these news-related tasks. |
Tasks | Dimensionality Reduction, Network Embedding, Sentence Embedding |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1490/ |
PWC | https://paperswithcode.com/paper/news2vec-news-network-embedding-with-subnode |
Repo | |
Framework | |
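The Subnode idea from the abstract can be sketched directly: a news item is embedded as the sum of the embeddings of its features, so an item never seen as a node can still be embedded from its attributes alone. The feature vectors below are random stand-ins for vectors learned with biased random walks plus skip-gram, and the feature names are invented for illustration.

```python
# Sketch of the subnode representation: a news item = sum of its feature vectors.
import numpy as np

rng = np.random.default_rng(0)
dim = 64
feature_vectors = {f: rng.normal(size=dim)
                   for f in ["topic:markets", "topic:politics",
                             "ticker:AAPL", "sentiment:positive", "sentiment:negative"]}

def embed_news(features):
    """Sum of subnode (feature) embeddings; unknown features are skipped."""
    vecs = [feature_vectors[f] for f in features if f in feature_vectors]
    return np.sum(vecs, axis=0) if vecs else np.zeros(dim)

seen_item = embed_news(["topic:markets", "ticker:AAPL", "sentiment:positive"])
unseen_item = embed_news(["topic:politics", "sentiment:negative"])  # never observed as a node
print(seen_item.shape, unseen_item.shape)
```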