October 20, 2019

Paper Group AWR 207

Self-Imitation Learning. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. Cross-Target Stance Classification with Self-Attention Networks. Appendix - Recommended Statistical Significance Tests for NLP Tasks. Abstractive Summarization of Reddit Posts with Multi-level Memory Networks. Semantic Human …

Self-Imitation Learning


Title	Self-Imitation Learning
Authors	Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee
Abstract	This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent’s past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive to the state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks.
Tasks	Atari Games, Imitation Learning
Published	2018-06-14
URL	http://arxiv.org/abs/1806.05635v1
PDF	http://arxiv.org/pdf/1806.05635v1.pdf
PWC	https://paperswithcode.com/paper/self-imitation-learning
Repo	https://github.com/rwightman/pytorch-opensim-rl
Framework	pytorch
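
The abstract's idea of learning "to reproduce the agent's past good decisions" can be illustrated with a loss that is active only when a stored return exceeds the current value estimate. The sketch below is a minimal plain-Python illustration, assuming the clipped-advantage form max(R - V, 0) for a single stored transition; the function name `sil_losses` and the `value_coef` weight are hypothetical and are not taken from the listed repository.

```python
import math


def sil_losses(log_prob, value, ret, value_coef=0.01):
    """Self-imitation losses for one stored transition (illustrative only).

    log_prob: log-probability the current policy gives the stored action
    value:    current value estimate V(s) for the stored state
    ret:      discounted return R that was actually observed
    """
    # Clipped advantage: positive only when the past outcome beat the estimate.
    gap = max(ret - value, 0.0)
    # Raise the probability of the stored action in proportion to the gap.
    policy_loss = -log_prob * gap
    # Pull the value estimate up toward the better-than-expected return.
    value_loss = 0.5 * gap ** 2
    return policy_loss + value_coef * value_loss, policy_loss, value_loss


# A past decision that turned out better than expected contributes a loss...
good = sil_losses(log_prob=math.log(0.25), value=1.0, ret=3.0)
# ...while one that turned out worse than expected is ignored.
bad = sil_losses(log_prob=math.log(0.25), value=1.0, ret=0.5)
```

Because the gap is clipped at zero, transitions with disappointing returns produce no gradient at all, so only past experiences that beat the current estimate are imitated.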

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples


Title	Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Authors	Anish Athalye, Nicholas Carlini, David Wagner
Abstract	We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.
Tasks	Adversarial Attack, Adversarial Defense
Published	2018-02-01
URL	http://arxiv.org/abs/1802.00420v4
PDF	http://arxiv.org/pdf/1802.00420v4.pdf
PWC	https://paperswithcode.com/paper/obfuscated-gradients-give-a-false-sense-of
Repo	https://github.com/anishathalye/obfuscated-gradients
Framework	tf

Cross-Target Stance Classification with Self-Attention Networks


Title	Cross-Target Stance Classification with Self-Attention Networks
Authors	Chang Xu, Cecile Paris, Surya Nepal, Ross Sparks
Abstract	In stance classification, the target on which the stance is made defines the boundary of the task, and a classifier is usually trained for prediction on the same target. In this work, we explore the potential for generalizing classifiers between different targets, and propose a neural model that can apply what has been learned from a source target to a destination target. We show that our model can find useful information shared between relevant targets which improves generalization in certain scenarios.
Tasks
Published	2018-05-17
URL	http://arxiv.org/abs/1805.06593v2
PDF	http://arxiv.org/pdf/1805.06593v2.pdf
PWC	https://paperswithcode.com/paper/cross-target-stance-classification-with-self
Repo	https://github.com/nuaaxc/cross_target_stance_classification
Framework	tf

Appendix - Recommended Statistical Significance Tests for NLP Tasks


Title	Appendix - Recommended Statistical Significance Tests for NLP Tasks
Authors	Rotem Dror, Roi Reichart
Abstract	Statistical significance testing plays an important role when drawing conclusions from experimental results in NLP papers. Particularly, it is a valuable tool when one would like to establish the superiority of one algorithm over another. This appendix complements the guide for testing statistical significance in NLP presented in Dror et al. (2018) by proposing valid statistical tests for the common tasks and evaluation measures in the field.
Tasks
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01448v1
PDF	http://arxiv.org/pdf/1809.01448v1.pdf
PWC	https://paperswithcode.com/paper/appendix-recommended-statistical-significance
Repo	https://github.com/rtmdrr/testSignificanceNLP
Framework	none
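
To make the idea of testing whether one algorithm is superior to another concrete, here is a minimal sketch of a paired sign-flip permutation test on per-example scores. It is one commonly used sampling-based test, shown as an illustration under stated assumptions; it is not necessarily the test the appendix recommends for any particular task or metric, and the function name and parameters are hypothetical rather than taken from the listed repository.

```python
import random


def paired_permutation_test(scores_a, scores_b, trials=10000, seed=0):
    """Approximate two-sided p-value for the mean paired score difference."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs)) / len(diffs)
    hits = 0
    for _ in range(trials):
        # Under the null hypothesis the two systems are exchangeable,
        # so each paired difference may have its sign flipped at random.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) / len(diffs) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (trials + 1)


# System A is correct on all 30 examples, system B on none of them.
p_clear = paired_permutation_test([1] * 30, [0] * 30, trials=2000)
# Identical systems: every permutation is at least as extreme as the data.
p_same = paired_permutation_test([1, 0, 1], [1, 0, 1], trials=100)
```

The test assumes the two systems were scored on the same examples; a small p-value suggests the observed difference is unlikely to arise from chance assignment of outputs to systems.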

Abstractive Summarization of Reddit Posts with Multi-level Memory Networks


Title	Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
Authors	Byeongchang Kim, Hyunwoo Kim, Gunhee Kim
Abstract	We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model. First, we collect Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit. We use such informal crowd-generated posts as text source, in contrast with existing datasets that mostly use formal documents as source such as news articles. Thus, our dataset could less suffer from some biases that key sentences usually locate at the beginning of the text and favorable summary candidates are already inside the text in similar forms. Second, we propose a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store the information of text from different levels of abstraction. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show the Reddit TIFU dataset is highly abstractive and the MMN outperforms the state-of-the-art summarization models.
Tasks	Abstractive Text Summarization
Published	2018-11-02
URL	http://arxiv.org/abs/1811.00783v2
PDF	http://arxiv.org/pdf/1811.00783v2.pdf
PWC	https://paperswithcode.com/paper/abstractive-summarization-of-reddit-posts
Repo	https://github.com/ctr4si/MMN
Framework	none

Semantic Human Matting


Title	Semantic Human Matting
Authors	Quan Chen, Tiezheng Ge, Yanyu Xu, Zhiqiang Zhang, Xinxin Yang, Kun Gai
Abstract	Human matting, high quality extraction of humans from natural images, is crucial for a wide variety of applications. Since the matting problem is severely under-constrained, most previous methods require user interactions to take user designated trimaps or scribbles as constraints. This user-in-the-loop nature makes them difficult to be applied to large scale data or time-sensitive scenarios. In this paper, instead of using explicit user input constraints, we employ implicit semantic constraints learned from data and propose an automatic human matting algorithm (SHM). SHM is the first algorithm that learns to jointly fit both semantic information and high quality details with deep networks. In practice, simultaneously learning both coarse semantics and fine details is challenging. We propose a novel fusion strategy which naturally gives a probabilistic estimation of the alpha matte. We also construct a very large dataset with high quality annotations consisting of 35,513 unique foregrounds to facilitate the learning and evaluation of human matting. Extensive experiments on this dataset and plenty of real images show that SHM achieves comparable results with state-of-the-art interactive matting methods.
Tasks	Image Matting
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01354v2
PDF	http://arxiv.org/pdf/1809.01354v2.pdf
PWC	https://paperswithcode.com/paper/semantic-human-matting
Repo	https://github.com/pkang2017/image-matting
Framework	none

DESlib: A Dynamic ensemble selection library in Python


Title	DESlib: A Dynamic ensemble selection library in Python
Authors	Rafael M. O. Cruz, Luiz G. Hafemann, Robert Sabourin, George D. C. Cavalcanti
Abstract	DESlib is an open-source Python library providing the implementation of several dynamic selection techniques. The library is divided into three modules: (i) dcs, containing the implementation of dynamic classifier selection methods (DCS); (ii) des, containing the implementation of dynamic ensemble selection methods (DES); (iii) static, with the implementation of static ensemble techniques. The library is fully documented (documentation available online on Read the Docs), has a high test coverage (codecov.io) and is part of the scikit-learn-contrib supported projects. Documentation, code and examples can be found on its GitHub page: https://github.com/scikit-learn-contrib/DESlib.
Tasks
Published	2018-02-14
URL	http://arxiv.org/abs/1802.04967v3
PDF	http://arxiv.org/pdf/1802.04967v3.pdf
PWC	https://paperswithcode.com/paper/deslib-a-dynamic-ensemble-selection-library
Repo	https://github.com/redavtalab/DES
Framework	none

Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System


Title	Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System
Authors	Pai Liu, Jingwei Gan, Rajan K. Chakrabarty
Abstract	We introduce a deep learning method to simulate the motion of particles trapped in a chaotic recirculating flame. The Lagrangian trajectories of particles, captured using a high-speed camera and subsequently reconstructed in 3-dimensional space, were used to train a variational autoencoder (VAE) which comprises multiple layers of convolutional neural networks. We show that the trajectories, which are statistically representative of those determined in experiments, can be generated using the VAE network. The performance of our model is evaluated with respect to the accuracy and generalization of the outputs.
Tasks
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11896v2
PDF	http://arxiv.org/pdf/1811.11896v2.pdf
PWC	https://paperswithcode.com/paper/variational-autoencoding-the-lagrangian
Repo	https://github.com/deadzombie2333/Lagrangian_simulation_VAE
Framework	tf

Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora


Title	Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora
Authors	Sajawel Ahmed, Alexander Mehler
Abstract	This study improves the performance of neural named entity recognition by a margin of up to 11% in F-score on the example of a low-resource language like German, thereby outperforming existing baselines and establishing a new state-of-the-art on each single open-source dataset. Rather than designing deeper and wider hybrid neural architectures, we gather all available resources and perform a detailed optimization and grammar-dependent morphological processing consisting of lemmatization and part-of-speech tagging prior to exposing the raw data to any training process. We test our approach in a threefold monolingual experimental setup of a) single, b) joint, and c) optimized training and shed light on the dependency of downstream-tasks on the size of corpora used to compute word embeddings.
Tasks	Lemmatization, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published	2018-07-26
URL	http://arxiv.org/abs/1807.10675v1
PDF	http://arxiv.org/pdf/1807.10675v1.pdf
PWC	https://paperswithcode.com/paper/resource-size-matters-improving-neural-named
Repo	https://github.com/FID-Biodiversity/GermanWordEmbeddings-NER
Framework	none

Training Competitive Binary Neural Networks from Scratch


Title	Training Competitive Binary Neural Networks from Scratch
Authors	Joseph Bethge, Marvin Bornstein, Adrian Loy, Haojin Yang, Christoph Meinel
Abstract	Convolutional neural networks have achieved astonishing results in different application areas. Various methods that allow us to use these models on mobile and embedded devices have been proposed. Especially binary neural networks are a promising approach for devices with low computational power. However, training accurate binary models from scratch remains a challenge. Previous work often uses prior knowledge from full-precision models and complex training strategies. In our work, we focus on increasing the performance of binary neural networks without such prior knowledge and a much simpler training strategy. In our experiments we show that we are able to achieve state-of-the-art results on standard benchmark datasets. Further, to the best of our knowledge, we are the first to successfully adopt a network architecture with dense connections for binary networks, which lets us improve the state-of-the-art even further.
Tasks
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01965v1
PDF	http://arxiv.org/pdf/1812.01965v1.pdf
PWC	https://paperswithcode.com/paper/training-competitive-binary-neural-networks
Repo	https://github.com/hpi-xnor/BMXNet-v2
Framework	mxnet

Generative Models from the perspective of Continual Learning


Title	Generative Models from the perspective of Continual Learning
Authors	Timothée Lesort, Hugo Caselles-Dupré, Michael Garcia-Ortiz, Andrei Stoian, David Filliat
Abstract	Which generative model is the most suitable for Continual Learning? This paper aims at evaluating and comparing generative models on disjoint sequential image generation tasks. We investigate how several models learn and forget, considering various strategies: rehearsal, regularization, generative replay and fine-tuning. We used two quantitative metrics to estimate the generation quality and memory ability. We experiment with sequential tasks on three commonly used benchmarks for Continual Learning (MNIST, Fashion MNIST and CIFAR10). We found that among all models, the original GAN performs best and among Continual Learning strategies, generative replay outperforms all other methods. Even if we found satisfactory combinations on MNIST and Fashion MNIST, training generative models sequentially on CIFAR10 is particularly instable, and remains a challenge. Our code is available online at https://github.com/TLESORT/Generative_Continual_Learning.
Tasks	Continual Learning, Image Generation
Published	2018-12-21
URL	http://arxiv.org/abs/1812.09111v1
PDF	http://arxiv.org/pdf/1812.09111v1.pdf
PWC	https://paperswithcode.com/paper/generative-models-from-the-perspective-of
Repo	https://github.com/TLESORT/Generative_Continual_Learning
Framework	pytorch

An Information-theoretic Framework for the Lossy Compression of Link Streams


Title	An Information-theoretic Framework for the Lossy Compression of Link Streams
Authors	Robin Lamarche-Perrin
Abstract	Graph compression is a data analysis technique that consists in the replacement of parts of a graph by more general structural patterns in order to reduce its description length. It notably provides interesting exploration tools for the study of real, large-scale, and complex graphs which cannot be grasped at first glance. This article proposes a framework for the compression of temporal graphs, that is for the compression of graphs that evolve with time. This framework first builds on a simple and limited scheme, exploiting structural equivalence for the lossless compression of static graphs, then generalises it to the lossy compression of link streams, a recent formalism for the study of temporal graphs. Such generalisation relies on the natural extension of (bidimensional) relational data by the addition of a third temporal dimension. Moreover, we introduce an information-theoretic measure to quantify and to control the information that is lost during compression, as well as an algebraic characterisation of the space of possible compression patterns to enhance the expressiveness of the initial compression scheme. These contributions lead to the definition of a combinatorial optimisation problem, that is the Lossy Multistream Compression Problem, for which we provide an exact algorithm.
Tasks
Published	2018-07-18
URL	http://arxiv.org/abs/1807.06874v1
PDF	http://arxiv.org/pdf/1807.06874v1.pdf
PWC	https://paperswithcode.com/paper/an-information-theoretic-framework-for-the
Repo	https://github.com/Lamarche-Perrin/greedy-graph-compression
Framework	none
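
The abstract's starting point, exploiting structural equivalence for the lossless compression of a static graph, can be illustrated by grouping nodes that share exactly the same in- and out-neighbours. This is a minimal sketch of that general idea for a directed graph, not the paper's algorithm or the listed repository's code; the function name `structural_classes` is hypothetical.

```python
from collections import defaultdict


def structural_classes(nodes, edges):
    """Group nodes of a directed graph that have identical neighbourhoods."""
    out_nb = defaultdict(set)
    in_nb = defaultdict(set)
    for u, v in edges:
        out_nb[u].add(v)
        in_nb[v].add(u)
    classes = defaultdict(list)
    for n in nodes:
        # Two nodes are structurally equivalent when both sets coincide.
        key = (frozenset(out_nb[n]), frozenset(in_nb[n]))
        classes[key].append(n)
    return sorted(sorted(group) for group in classes.values())


nodes = ["a", "b", "c", "d"]
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]
groups = structural_classes(nodes, edges)
```

In this example the four edges can be described by a single block edge from {a, b} to {c, d} with no loss of information, which is the sense in which merging equivalent nodes shortens the description of the graph.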

Evaluating Overfit and Underfit in Models of Network Community Structure


Title	Evaluating Overfit and Underfit in Models of Network Community Structure
Authors	Amir Ghasemian, Homa Hosseinmardi, Aaron Clauset
Abstract	A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network’s connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared.
Tasks	Community Detection, Link Prediction
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10582v3
PDF	http://arxiv.org/pdf/1802.10582v3.pdf
PWC	https://paperswithcode.com/paper/evaluating-overfit-and-underfit-in-models-of
Repo	https://github.com/AGhasemian/CommunityFitNet
Framework	none

Forecasting Future Humphrey Visual Fields Using Deep Learning


Title	Forecasting Future Humphrey Visual Fields Using Deep Learning
Authors	Joanne C. Wen, Cecilia S. Lee, Pearse A. Keane, Sa Xiao, Yue Wu, Ariel Rokem, Philip P. Chen, Aaron Y. Lee
Abstract	Purpose: To determine if deep learning networks could be trained to forecast a future 24-2 Humphrey Visual Field (HVF). Participants: All patients who obtained a HVF 24-2 at the University of Washington. Methods: All datapoints from consecutive 24-2 HVFs from 1998 to 2018 were extracted from a University of Washington database. Ten-fold cross validation with a held out test set was used to develop the three main phases of model development: model architecture selection, dataset combination selection, and time-interval model training with transfer learning, to train a deep learning artificial neural network capable of generating a point-wise visual field prediction. Results: More than 1.7 million perimetry points were extracted to the hundredth decibel from 32,443 24-2 HVFs. The best performing model with 20 million trainable parameters, CascadeNet-5, was selected. The overall MAE for the test set was 2.47 dB (95% CI: 2.45 dB to 2.48 dB). The 100 fully trained models were able to successfully predict progressive field loss in glaucomatous eyes up to 5.5 years in the future with a correlation of 0.92 between the MD of predicted and actual future HVF (p < 2.2 × 10^-16) and an average difference of 0.41 dB. Conclusions: Using unfiltered real-world datasets, deep learning networks show an impressive ability to not only learn spatio-temporal HVF changes but also to generate predictions for future HVFs up to 5.5 years, given only a single HVF.
Tasks	Transfer Learning
Published	2018-04-02
URL	http://arxiv.org/abs/1804.04543v1
PDF	http://arxiv.org/pdf/1804.04543v1.pdf
PWC	https://paperswithcode.com/paper/forecasting-future-humphrey-visual-fields
Repo	https://github.com/uw-biomedical-ml/hvfProgression
Framework	tf

Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions


Title	Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Authors	Albert Gatt, Marc Tanti, Adrian Muscat, Patrizia Paggio, Reuben A. Farrugia, Claudia Borg, Kenneth P. Camilleri, Mike Rosner, Lonneke van der Plas
Abstract	The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively-studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, rather than objects and relations. Given that no data exists for this task, we present an ongoing crowdsourcing study to collect a corpus of descriptions of face images taken 'in the wild'. To gain a better understanding of the variation we find in face description and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus. Primarily, we found descriptions to refer to a mixture of attributes, not only physical, but also emotional and inferential, which is bound to create further challenges for current image-to-text methods.
Tasks
Published	2018-03-10
URL	http://arxiv.org/abs/1803.03827v1
PDF	http://arxiv.org/pdf/1803.03827v1.pdf
PWC	https://paperswithcode.com/paper/face2text-collecting-an-annotated-image
Repo	https://github.com/akanimax/T2F
Framework	pytorch