October 20, 2019

Paper Group AWR 207

Self-Imitation Learning. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. Cross-Target Stance Classification with Self-Attention Networks. Appendix - Recommended Statistical Significance Tests for NLP Tasks. Abstractive Summarization of Reddit Posts with Multi-level Memory Networks. Semantic Human …

Self-Imitation Learning

Title Self-Imitation Learning
Authors Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee
Abstract This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent’s past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive to the state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks.
Tasks Atari Games, Imitation Learning
Published 2018-06-14
URL http://arxiv.org/abs/1806.05635v1
PDF http://arxiv.org/pdf/1806.05635v1.pdf
PWC https://paperswithcode.com/paper/self-imitation-learning
Repo https://github.com/rwightman/pytorch-opensim-rl
Framework pytorch
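
The idea of reproducing only past good decisions can be sketched for a single stored transition. This is a minimal illustration based on the clipped-advantage objective described in the SIL paper, not the authors' implementation: the function name `sil_losses`, its arguments, and the default `value_coef` are assumptions made for this example.

```python
import math


def sil_losses(log_prob, value, ret, value_coef=0.01):
    """Self-imitation losses for one stored (state, action, return) sample.

    log_prob   -- log pi(a|s) under the current policy (hypothetical input)
    value      -- current value estimate V(s)
    ret        -- discounted return R observed when the sample was collected
    value_coef -- weight of the value loss (assumed default for illustration)
    """
    # Only imitate the past action when it turned out better than expected.
    clipped_advantage = max(ret - value, 0.0)
    policy_loss = -log_prob * clipped_advantage
    value_loss = 0.5 * clipped_advantage ** 2
    total = policy_loss + value_coef * value_loss
    return total, policy_loss, value_loss


# A past decision whose return (3.0) beat the value estimate (1.0) is imitated.
total, policy_loss, value_loss = sil_losses(math.log(0.5), value=1.0, ret=3.0)

# A past decision that did worse than expected contributes nothing.
ignored = sil_losses(math.log(0.5), value=3.0, ret=1.0)
```

Clipping at zero means that samples with returns below the current value estimate produce no gradient signal, so the agent imitates only those experiences that outperformed its expectation.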

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Title Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Authors Anish Athalye, Nicholas Carlini, David Wagner
Abstract We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.
Tasks Adversarial Attack, Adversarial Defense
Published 2018-02-01
URL http://arxiv.org/abs/1802.00420v4
PDF http://arxiv.org/pdf/1802.00420v4.pdf
PWC https://paperswithcode.com/paper/obfuscated-gradients-give-a-false-sense-of
Repo https://github.com/anishathalye/obfuscated-gradients
Framework tf

Cross-Target Stance Classification with Self-Attention Networks

Title Cross-Target Stance Classification with Self-Attention Networks
Authors Chang Xu, Cecile Paris, Surya Nepal, Ross Sparks
Abstract In stance classification, the target on which the stance is made defines the boundary of the task, and a classifier is usually trained for prediction on the same target. In this work, we explore the potential for generalizing classifiers between different targets, and propose a neural model that can apply what has been learned from a source target to a destination target. We show that our model can find useful information shared between relevant targets which improves generalization in certain scenarios.
Tasks
Published 2018-05-17
URL http://arxiv.org/abs/1805.06593v2
PDF http://arxiv.org/pdf/1805.06593v2.pdf
PWC https://paperswithcode.com/paper/cross-target-stance-classification-with-self
Repo https://github.com/nuaaxc/cross_target_stance_classification
Framework tf

Appendix - Recommended Statistical Significance Tests for NLP Tasks

Title Appendix - Recommended Statistical Significance Tests for NLP Tasks
Authors Rotem Dror, Roi Reichart
Abstract Statistical significance testing plays an important role when drawing conclusions from experimental results in NLP papers. Particularly, it is a valuable tool when one would like to establish the superiority of one algorithm over another. This appendix complements the guide for testing statistical significance in NLP presented in Dror et al. (2018) by proposing valid statistical tests for the common tasks and evaluation measures in the field.
Tasks
Published 2018-09-05
URL http://arxiv.org/abs/1809.01448v1
PDF http://arxiv.org/pdf/1809.01448v1.pdf
PWC https://paperswithcode.com/paper/appendix-recommended-statistical-significance
Repo https://github.com/rtmdrr/testSignificanceNLP
Framework none
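
As an illustration of comparing two systems on the same test items, the sketch below implements a paired sign-flip permutation (approximate randomization) test. It is one commonly used option, not necessarily the test the appendix recommends for a given task and metric; the function name, arguments, and defaults are assumptions made for this example.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided p-value for the difference between paired per-item scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same items")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the two labels are exchangeable,
        # so each per-item difference may have its sign flipped.
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical per-item correctness for two systems on 20 test items.
p_different = paired_permutation_test([1] * 20, [0] * 20)
p_identical = paired_permutation_test([1, 0, 1, 1], [1, 0, 1, 1])
```

The test assumes per-item scores are available for both systems; corpus-level metrics that do not decompose over items need a different resampling scheme.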

Abstractive Summarization of Reddit Posts with Multi-level Memory Networks

Title Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
Authors Byeongchang Kim, Hyunwoo Kim, Gunhee Kim
Abstract We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model. First, we collect the Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit. We use such informal crowd-generated posts as the text source, in contrast with existing datasets that mostly use formal documents such as news articles. Thus, our dataset suffers less from two biases: that key sentences are usually located at the beginning of the text, and that favorable summary candidates already appear inside the text in similar forms. Second, we propose a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store the information of text from different levels of abstraction. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show the Reddit TIFU dataset is highly abstractive and the MMN outperforms the state-of-the-art summarization models.
Tasks Abstractive Text Summarization
Published 2018-11-02
URL http://arxiv.org/abs/1811.00783v2
PDF http://arxiv.org/pdf/1811.00783v2.pdf
PWC https://paperswithcode.com/paper/abstractive-summarization-of-reddit-posts
Repo https://github.com/ctr4si/MMN
Framework none

Semantic Human Matting

Title Semantic Human Matting
Authors Quan Chen, Tiezheng Ge, Yanyu Xu, Zhiqiang Zhang, Xinxin Yang, Kun Gai
Abstract Human matting, high quality extraction of humans from natural images, is crucial for a wide variety of applications. Since the matting problem is severely under-constrained, most previous methods require user interactions to take user designated trimaps or scribbles as constraints. This user-in-the-loop nature makes them difficult to be applied to large scale data or time-sensitive scenarios. In this paper, instead of using explicit user input constraints, we employ implicit semantic constraints learned from data and propose an automatic human matting algorithm (SHM). SHM is the first algorithm that learns to jointly fit both semantic information and high quality details with deep networks. In practice, simultaneously learning both coarse semantics and fine details is challenging. We propose a novel fusion strategy which naturally gives a probabilistic estimation of the alpha matte. We also construct a very large dataset with high quality annotations consisting of 35,513 unique foregrounds to facilitate the learning and evaluation of human matting. Extensive experiments on this dataset and plenty of real images show that SHM achieves comparable results with state-of-the-art interactive matting methods.
Tasks Image Matting
Published 2018-09-05
URL http://arxiv.org/abs/1809.01354v2
PDF http://arxiv.org/pdf/1809.01354v2.pdf
PWC https://paperswithcode.com/paper/semantic-human-matting
Repo https://github.com/pkang2017/image-matting
Framework none

DESlib: A Dynamic ensemble selection library in Python

Title DESlib: A Dynamic ensemble selection library in Python
Authors Rafael M. O. Cruz, Luiz G. Hafemann, Robert Sabourin, George D. C. Cavalcanti
Abstract DESlib is an open-source Python library providing the implementation of several dynamic selection techniques. The library is divided into three modules: (i) dcs, containing the implementation of dynamic classifier selection methods (DCS); (ii) des, containing the implementation of dynamic ensemble selection methods (DES); (iii) static, with the implementation of static ensemble techniques. The library is fully documented (documentation available online on Read the Docs), has a high test coverage (codecov.io) and is part of the scikit-learn-contrib supported projects. Documentation, code and examples can be found on its GitHub page: https://github.com/scikit-learn-contrib/DESlib.
Tasks
Published 2018-02-14
URL http://arxiv.org/abs/1802.04967v3
PDF http://arxiv.org/pdf/1802.04967v3.pdf
PWC https://paperswithcode.com/paper/deslib-a-dynamic-ensemble-selection-library
Repo https://github.com/redavtalab/DES
Framework none

Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System

Title Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System
Authors Pai Liu, Jingwei Gan, Rajan K. Chakrabarty
Abstract We introduce a deep learning method to simulate the motion of particles trapped in a chaotic recirculating flame. The Lagrangian trajectories of particles, captured using a high-speed camera and subsequently reconstructed in 3-dimensional space, were used to train a variational autoencoder (VAE) which comprises multiple layers of convolutional neural networks. We show that the trajectories, which are statistically representative of those determined in experiments, can be generated using the VAE network. The performance of our model is evaluated with respect to the accuracy and generalization of the outputs.
Tasks
Published 2018-11-29
URL http://arxiv.org/abs/1811.11896v2
PDF http://arxiv.org/pdf/1811.11896v2.pdf
PWC https://paperswithcode.com/paper/variational-autoencoding-the-lagrangian
Repo https://github.com/deadzombie2333/Lagrangian_simulation_VAE
Framework tf

Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora

Title Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora
Authors Sajawel Ahmed, Alexander Mehler
Abstract This study improves the performance of neural named entity recognition by a margin of up to 11% in F-score on the example of a low-resource language like German, thereby outperforming existing baselines and establishing a new state-of-the-art on each single open-source dataset. Rather than designing deeper and wider hybrid neural architectures, we gather all available resources and perform a detailed optimization and grammar-dependent morphological processing consisting of lemmatization and part-of-speech tagging prior to exposing the raw data to any training process. We test our approach in a threefold monolingual experimental setup of a) single, b) joint, and c) optimized training and shed light on the dependency of downstream-tasks on the size of corpora used to compute word embeddings.
Tasks Lemmatization, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published 2018-07-26
URL http://arxiv.org/abs/1807.10675v1
PDF http://arxiv.org/pdf/1807.10675v1.pdf
PWC https://paperswithcode.com/paper/resource-size-matters-improving-neural-named
Repo https://github.com/FID-Biodiversity/GermanWordEmbeddings-NER
Framework none

Training Competitive Binary Neural Networks from Scratch

Title Training Competitive Binary Neural Networks from Scratch
Authors Joseph Bethge, Marvin Bornstein, Adrian Loy, Haojin Yang, Christoph Meinel
Abstract Convolutional neural networks have achieved astonishing results in different application areas. Various methods that allow us to use these models on mobile and embedded devices have been proposed. Especially binary neural networks are a promising approach for devices with low computational power. However, training accurate binary models from scratch remains a challenge. Previous work often uses prior knowledge from full-precision models and complex training strategies. In our work, we focus on increasing the performance of binary neural networks without such prior knowledge and a much simpler training strategy. In our experiments we show that we are able to achieve state-of-the-art results on standard benchmark datasets. Further, to the best of our knowledge, we are the first to successfully adopt a network architecture with dense connections for binary networks, which lets us improve the state-of-the-art even further.
Tasks
Published 2018-12-05
URL http://arxiv.org/abs/1812.01965v1
PDF http://arxiv.org/pdf/1812.01965v1.pdf
PWC https://paperswithcode.com/paper/training-competitive-binary-neural-networks
Repo https://github.com/hpi-xnor/BMXNet-v2
Framework mxnet

Generative Models from the perspective of Continual Learning

Title Generative Models from the perspective of Continual Learning
Authors Timothée Lesort, Hugo Caselles-Dupré, Michael Garcia-Ortiz, Andrei Stoian, David Filliat
Abstract Which generative model is the most suitable for Continual Learning? This paper aims at evaluating and comparing generative models on disjoint sequential image generation tasks. We investigate how several models learn and forget, considering various strategies: rehearsal, regularization, generative replay and fine-tuning. We used two quantitative metrics to estimate the generation quality and memory ability. We experiment with sequential tasks on three commonly used benchmarks for Continual Learning (MNIST, Fashion MNIST and CIFAR10). We found that among all models, the original GAN performs best and among Continual Learning strategies, generative replay outperforms all other methods. Even if we found satisfactory combinations on MNIST and Fashion MNIST, training generative models sequentially on CIFAR10 is particularly unstable, and remains a challenge. Our code is available online at https://github.com/TLESORT/Generative_Continual_Learning.
Tasks Continual Learning, Image Generation
Published 2018-12-21
URL http://arxiv.org/abs/1812.09111v1
PDF http://arxiv.org/pdf/1812.09111v1.pdf
PWC https://paperswithcode.com/paper/generative-models-from-the-perspective-of
Repo https://github.com/TLESORT/Generative_Continual_Learning
Framework pytorch

An Information-theoretic Framework for the Lossy Compression of Link Streams

Title An Information-theoretic Framework for the Lossy Compression of Link Streams
Authors Robin Lamarche-Perrin
Abstract Graph compression is a data analysis technique that consists in the replacement of parts of a graph by more general structural patterns in order to reduce its description length. It notably provides interesting exploration tools for the study of real, large-scale, and complex graphs which cannot be grasped at first glance. This article proposes a framework for the compression of temporal graphs, that is for the compression of graphs that evolve with time. This framework first builds on a simple and limited scheme, exploiting structural equivalence for the lossless compression of static graphs, then generalises it to the lossy compression of link streams, a recent formalism for the study of temporal graphs. Such generalisation relies on the natural extension of (bidimensional) relational data by the addition of a third temporal dimension. Moreover, we introduce an information-theoretic measure to quantify and to control the information that is lost during compression, as well as an algebraic characterisation of the space of possible compression patterns to enhance the expressiveness of the initial compression scheme. These contributions lead to the definition of a combinatorial optimisation problem, that is the Lossy Multistream Compression Problem, for which we provide an exact algorithm.
Tasks
Published 2018-07-18
URL http://arxiv.org/abs/1807.06874v1
PDF http://arxiv.org/pdf/1807.06874v1.pdf
PWC https://paperswithcode.com/paper/an-information-theoretic-framework-for-the
Repo https://github.com/Lamarche-Perrin/greedy-graph-compression
Framework none
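
The lossless starting point mentioned in the abstract, exploiting structural equivalence in a static graph, can be sketched as grouping nodes that have exactly the same neighbours. This is a minimal illustration under that textbook definition, not the paper's algorithm; the function name and the edge-list representation are assumptions made for this example.

```python
from collections import defaultdict


def structural_equivalence_classes(nodes, edges):
    """Group nodes of a directed graph that share identical in- and out-neighbours."""
    out_neighbours = {v: set() for v in nodes}
    in_neighbours = {v: set() for v in nodes}
    for source, target in edges:
        out_neighbours[source].add(target)
        in_neighbours[target].add(source)

    classes = defaultdict(list)
    for v in nodes:
        # Nodes with the same signature are interchangeable in the edge set,
        # so each class can be described once instead of once per member.
        signature = (frozenset(out_neighbours[v]), frozenset(in_neighbours[v]))
        classes[signature].append(v)
    return sorted(sorted(members) for members in classes.values())


# Hypothetical graph: both senders link to both receivers.
groups = structural_equivalence_classes(
    nodes=["a", "b", "x", "y"],
    edges=[("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")],
)
```

Here the four edges can be described as a single block between the class {a, b} and the class {x, y}, which is the kind of description-length saving the compression scheme builds on.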

Evaluating Overfit and Underfit in Models of Network Community Structure

Title Evaluating Overfit and Underfit in Models of Network Community Structure
Authors Amir Ghasemian, Homa Hosseinmardi, Aaron Clauset
Abstract A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network’s connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared.
Tasks Community Detection, Link Prediction
Published 2018-02-28
URL http://arxiv.org/abs/1802.10582v3
PDF http://arxiv.org/pdf/1802.10582v3.pdf
PWC https://paperswithcode.com/paper/evaluating-overfit-and-underfit-in-models-of
Repo https://github.com/AGhasemian/CommunityFitNet
Framework none

Forecasting Future Humphrey Visual Fields Using Deep Learning

Title Forecasting Future Humphrey Visual Fields Using Deep Learning
Authors Joanne C. Wen, Cecilia S. Lee, Pearse A. Keane, Sa Xiao, Yue Wu, Ariel Rokem, Philip P. Chen, Aaron Y. Lee
Abstract Purpose: To determine if deep learning networks could be trained to forecast a future 24-2 Humphrey Visual Field (HVF). Participants: All patients who obtained a HVF 24-2 at the University of Washington. Methods: All datapoints from consecutive 24-2 HVFs from 1998 to 2018 were extracted from a University of Washington database. Ten-fold cross validation with a held out test set was used to develop the three main phases of model development: model architecture selection, dataset combination selection, and time-interval model training with transfer learning, to train a deep learning artificial neural network capable of generating a point-wise visual field prediction. Results: More than 1.7 million perimetry points were extracted to the hundredth decibel from 32,443 24-2 HVFs. The best performing model with 20 million trainable parameters, CascadeNet-5, was selected. The overall MAE for the test set was 2.47 dB (95% CI: 2.45 dB to 2.48 dB). The 100 fully trained models were able to successfully predict progressive field loss in glaucomatous eyes up to 5.5 years in the future with a correlation of 0.92 between the MD of predicted and actual future HVF (p < 2.2 × 10^-16) and an average difference of 0.41 dB. Conclusions: Using unfiltered real-world datasets, deep learning networks show an impressive ability to not only learn spatio-temporal HVF changes but also to generate predictions for future HVFs up to 5.5 years, given only a single HVF.
Tasks Transfer Learning
Published 2018-04-02
URL http://arxiv.org/abs/1804.04543v1
PDF http://arxiv.org/pdf/1804.04543v1.pdf
PWC https://paperswithcode.com/paper/forecasting-future-humphrey-visual-fields
Repo https://github.com/uw-biomedical-ml/hvfProgression
Framework tf
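
The point-wise error metric reported in the abstract, mean absolute error (MAE) in decibels, can be computed as in the sketch below. The function name and the three-point example values are hypothetical and are not taken from the paper's data.

```python
def mean_absolute_error(predicted_db, actual_db):
    """Average absolute difference between predicted and measured sensitivities (dB)."""
    if len(predicted_db) != len(actual_db) or not predicted_db:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted_db, actual_db))
    return total / len(predicted_db)


# Hypothetical sensitivities at three test locations of one visual field.
mae = mean_absolute_error([30.0, 28.0, 25.0], [29.0, 30.0, 25.0])
```

Because the errors are averaged in absolute value, over- and under-predictions at different locations do not cancel each other out.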

Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions

Title Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Authors Albert Gatt, Marc Tanti, Adrian Muscat, Patrizia Paggio, Reuben A. Farrugia, Claudia Borg, Kenneth P. Camilleri, Mike Rosner, Lonneke van der Plas
Abstract The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively-studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, rather than objects and relations. Given that no data exists for this task, we present an ongoing crowdsourcing study to collect a corpus of descriptions of face images taken 'in the wild'. To gain a better understanding of the variation we find in face description and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus. Primarily, we found descriptions to refer to a mixture of attributes, not only physical, but also emotional and inferential, which is bound to create further challenges for current image-to-text methods.
Tasks
Published 2018-03-10
URL http://arxiv.org/abs/1803.03827v1
PDF http://arxiv.org/pdf/1803.03827v1.pdf
PWC https://paperswithcode.com/paper/face2text-collecting-an-annotated-image
Repo https://github.com/akanimax/T2F
Framework pytorch