Paper Group AWR 207
Self-Imitation Learning. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. Cross-Target Stance Classification with Self-Attention Networks. Appendix - Recommended Statistical Significance Tests for NLP Tasks. Abstractive Summarization of Reddit Posts with Multi-level Memory Networks. Semantic Human Matting. DESlib: A Dynamic ensemble selection library in Python. Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System. Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora. Training Competitive Binary Neural Networks from Scratch. Generative Models from the perspective of Continual Learning. An Information-theoretic Framework for the Lossy Compression of Link Streams. Evaluating Overfit and Underfit in Models of Network Community Structure. Forecasting Future Humphrey Visual Fields Using Deep Learning. Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions.
Self-Imitation Learning
Title | Self-Imitation Learning |
Authors | Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee |
Abstract | This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent’s past good decisions. The algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard-exploration Atari games and is competitive with state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks. |
Tasks | Atari Games, Imitation Learning |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05635v1 |
PDF | http://arxiv.org/pdf/1806.05635v1.pdf |
PWC | https://paperswithcode.com/paper/self-imitation-learning |
Repo | https://github.com/rwightman/pytorch-opensim-rl |
Framework | pytorch |
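The core of SIL is an auxiliary loss that imitates a stored transition only when its observed return exceeds the current value estimate. Below is a minimal PyTorch sketch of that loss for a discrete-action policy; the tensor shapes and the `beta` value-loss weight are assumptions, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

def sil_loss(logits, values, actions, returns, beta=0.01):
    # Sketch of the self-imitation objective: imitate replayed actions only
    # where the stored return R exceeds the current value estimate V(s).
    advantage = torch.clamp(returns - values.detach(), min=0.0)  # (R - V)+
    log_prob = F.log_softmax(logits, dim=-1).gather(
        1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(log_prob * advantage).mean()  # push pi toward good past actions
    value_loss = 0.5 * torch.clamp(returns - values, min=0.0).pow(2).mean()
    return policy_loss + beta * value_loss
```

In training, this loss is added to the usual A2C (or PPO) update, with transitions and their discounted returns drawn from a prioritized replay buffer.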
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Title | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples |
Authors | Anish Athalye, Nicholas Carlini, David Wagner |
Abstract | We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers. |
Tasks | Adversarial Attack, Adversarial Defense |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00420v4 |
PDF | http://arxiv.org/pdf/1802.00420v4.pdf |
PWC | https://paperswithcode.com/paper/obfuscated-gradients-give-a-false-sense-of |
Repo | https://github.com/anishathalye/obfuscated-gradients |
Framework | tf |
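One of the paper's attack techniques, Backward Pass Differentiable Approximation (BPDA), handles defenses whose preprocessing step has no useful gradient by computing the true forward pass but substituting an approximation (often the identity) on the backward pass. A minimal PyTorch sketch, where `defense_fn` stands in for any non-differentiable preprocessor:

```python
import torch

class BPDAIdentity(torch.autograd.Function):
    """Apply a non-differentiable defense g(x) on the forward pass, but
    treat it as the identity when back-propagating (dg/dx ~ I)."""
    @staticmethod
    def forward(ctx, x, defense_fn):
        return defense_fn(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # gradients flow straight through g

# e.g., inside a PGD loop, against a hypothetical quantization defense:
# logits = model(BPDAIdentity.apply(x_adv, lambda t: (t * 255).round() / 255))
```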
Cross-Target Stance Classification with Self-Attention Networks
Title | Cross-Target Stance Classification with Self-Attention Networks |
Authors | Chang Xu, Cecile Paris, Surya Nepal, Ross Sparks |
Abstract | In stance classification, the target on which the stance is made defines the boundary of the task, and a classifier is usually trained for prediction on the same target. In this work, we explore the potential for generalizing classifiers between different targets, and propose a neural model that can apply what has been learned from a source target to a destination target. We show that our model can find useful information shared between relevant targets which improves generalization in certain scenarios. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06593v2 |
PDF | http://arxiv.org/pdf/1805.06593v2.pdf |
PWC | https://paperswithcode.com/paper/cross-target-stance-classification-with-self |
Repo | https://github.com/nuaaxc/cross_target_stance_classification |
Framework | tf |
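As a point of reference for this model family, here is a generic scaled dot-product self-attention layer over a sentence's hidden states; a minimal sketch of the mechanism, not the paper's exact architecture:

```python
import torch

def self_attention(h):
    # h: (batch, seq_len, dim) hidden states, e.g. from a BiLSTM encoder
    d = h.size(-1)
    scores = torch.matmul(h, h.transpose(1, 2)) / d ** 0.5  # (B, T, T)
    weights = torch.softmax(scores, dim=-1)                 # attention map
    return torch.matmul(weights, h)                         # contextualized states
```

In the cross-target setting, the intuition is that attention features learned on the source target capture target-independent information that carries over to the destination target.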
Appendix - Recommended Statistical Significance Tests for NLP Tasks
Title | Appendix - Recommended Statistical Significance Tests for NLP Tasks |
Authors | Rotem Dror, Roi Reichart |
Abstract | Statistical significance testing plays an important role when drawing conclusions from experimental results in NLP papers. In particular, it is a valuable tool when one would like to establish the superiority of one algorithm over another. This appendix complements the guide for testing statistical significance in NLP presented by Dror et al. (2018) by proposing valid statistical tests for the common tasks and evaluation measures in the field. |
Tasks | |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01448v1 |
PDF | http://arxiv.org/pdf/1809.01448v1.pdf |
PWC | https://paperswithcode.com/paper/appendix-recommended-statistical-significance |
Repo | https://github.com/rtmdrr/testSignificanceNLP |
Framework | none |
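As a concrete illustration of this kind of test, here is a paired bootstrap on the per-example scores of two systems, a recipe commonly recommended for non-normally distributed NLP metrics; it is a generic sketch, not a reproduction of the appendix's task-by-task recommendations:

```python
import numpy as np

def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    # Per-example scores (e.g., per-sentence accuracy) for systems A and B.
    rng = np.random.default_rng(seed)
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    observed = diffs.mean()
    n = len(diffs)
    # p-value ~ fraction of resamples in which A's observed advantage vanishes
    reversals = sum(rng.choice(diffs, size=n, replace=True).mean() <= 0
                    for _ in range(n_resamples))
    return observed, reversals / n_resamples
```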
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
Title | Abstractive Summarization of Reddit Posts with Multi-level Memory Networks |
Authors | Byeongchang Kim, Hyunwoo Kim, Gunhee Kim |
Abstract | We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model. First, we collect the Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit. We use these informal, crowd-generated posts as the text source, in contrast with existing datasets that mostly use formal documents, such as news articles, as the source. Our dataset therefore suffers less from the biases whereby key sentences are usually located at the beginning of the text and favorable summary candidates already appear in the text in similar forms. Second, we propose a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store the information of the text at different levels of abstraction. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show that the Reddit TIFU dataset is highly abstractive and that the MMN outperforms state-of-the-art summarization models. |
Tasks | Abstractive Text Summarization |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00783v2 |
PDF | http://arxiv.org/pdf/1811.00783v2.pdf |
PWC | https://paperswithcode.com/paper/abstractive-summarization-of-reddit-posts |
Repo | https://github.com/ctr4si/MMN |
Framework | none |
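To make the memory mechanism concrete, the sketch below shows an attention read over several memory banks built at different abstraction levels (e.g., convolutional features with growing receptive fields). It is schematic: the shapes, the dot-product scoring, and the mean fusion across levels are assumptions, not the MMN's exact design.

```python
import torch

def multi_level_read(query, memories):
    # query: (B, d); memories: list of (B, T_i, d) banks, one per level
    reads = []
    for mem in memories:
        scores = torch.einsum('bd,btd->bt', query, mem)   # match query to slots
        weights = torch.softmax(scores, dim=-1)
        reads.append(torch.einsum('bt,btd->bd', weights, mem))
    return torch.stack(reads, dim=1).mean(dim=1)          # fuse the levels
```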
Semantic Human Matting
Title | Semantic Human Matting |
Authors | Quan Chen, Tiezheng Ge, Yanyu Xu, Zhiqiang Zhang, Xinxin Yang, Kun Gai |
Abstract | Human matting, the high-quality extraction of humans from natural images, is crucial for a wide variety of applications. Since the matting problem is severely under-constrained, most previous methods require user interaction, taking user-designated trimaps or scribbles as constraints. This user-in-the-loop nature makes them difficult to apply to large-scale data or time-sensitive scenarios. In this paper, instead of using explicit user input as constraints, we employ implicit semantic constraints learned from data and propose an automatic semantic human matting algorithm (SHM). SHM is the first algorithm that learns to jointly fit both semantic information and high-quality details with deep networks. In practice, simultaneously learning both coarse semantics and fine details is challenging. We propose a novel fusion strategy that naturally gives a probabilistic estimation of the alpha matte. We also construct a very large dataset with high-quality annotations, consisting of 35,513 unique foregrounds, to facilitate the learning and evaluation of human matting. Extensive experiments on this dataset and on many real images show that SHM achieves results comparable to state-of-the-art interactive matting methods. |
Tasks | Image Matting |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01354v2 |
PDF | http://arxiv.org/pdf/1809.01354v2.pdf |
PWC | https://paperswithcode.com/paper/semantic-human-matting |
Repo | https://github.com/pkang2017/image-matting |
Framework | none |
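The fusion the abstract refers to combines the semantic branch's trimap probabilities with the matting branch's raw alpha so that the final matte is probabilistic: roughly alpha = F + U * alpha_raw, where F and U are the per-pixel foreground and unknown-region probabilities. A sketch under that reading; the channel ordering is an assumption:

```python
import torch

def fuse_alpha(trimap_logits, alpha_raw):
    # trimap_logits: (N, 3, H, W) from the semantic branch
    # alpha_raw:     (N, 1, H, W) from the detail (matting) branch
    probs = torch.softmax(trimap_logits, dim=1)
    fg, unknown = probs[:, 1:2], probs[:, 2:3]  # assumed order: bg, fg, unknown
    return fg + unknown * alpha_raw             # probabilistic alpha matte
```

Pixels the semantic branch is sure about keep their coarse label, while the fine-detail prediction only matters in the uncertain band around the silhouette.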
DESlib: A Dynamic ensemble selection library in Python
Title | DESlib: A Dynamic ensemble selection library in Python |
Authors | Rafael M. O. Cruz, Luiz G. Hafemann, Robert Sabourin, George D. C. Cavalcanti |
Abstract | DESlib is an open-source Python library providing implementations of several dynamic selection techniques. The library is divided into three modules: (i) *dcs*, containing implementations of dynamic classifier selection (DCS) methods; (ii) *des*, containing implementations of dynamic ensemble selection (DES) methods; (iii) *static*, with implementations of static ensemble techniques. The library is fully documented (documentation available online on Read the Docs), has high test coverage (codecov.io), and is part of the scikit-learn-contrib supported projects. Documentation, code, and examples can be found on its GitHub page: https://github.com/scikit-learn-contrib/DESlib. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04967v3 |
PDF | http://arxiv.org/pdf/1802.04967v3.pdf |
PWC | https://paperswithcode.com/paper/deslib-a-dynamic-ensemble-selection-library |
Repo | https://github.com/redavtalab/DES |
Framework | none |
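A minimal end-to-end use of the *des* module, following the library's scikit-learn-style API (the dataset here is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from deslib.des.knora_e import KNORAE

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# hold out a DSEL set, used at prediction time to estimate local competence
X_train, X_dsel, y_train, y_dsel = train_test_split(X_train, y_train,
                                                    random_state=42)

pool = BaggingClassifier(n_estimators=10, random_state=42).fit(X_train, y_train)
des = KNORAE(pool)        # KNORA-Eliminate, one of the DES techniques
des.fit(X_dsel, y_dsel)
print('test accuracy:', des.score(X_test, y_test))
```

Swapping in a DCS method is a one-line change, e.g. `from deslib.dcs.ola import OLA`.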
Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System
Title | Variational Autoencoding the Lagrangian Trajectories of Particles in a Combustion System |
Authors | Pai Liu, Jingwei Gan, Rajan K. Chakrabarty |
Abstract | We introduce a deep learning method to simulate the motion of particles trapped in a chaotic recirculating flame. The Lagrangian trajectories of particles, captured using a high-speed camera and subsequently reconstructed in 3-dimensional space, were used to train a variational autoencoder (VAE) which comprises multiple layers of convolutional neural networks. We show that the trajectories, which are statistically representative of those determined in experiments, can be generated using the VAE network. The performance of our model is evaluated with respect to the accuracy and generalization of the outputs. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.11896v2 |
PDF | http://arxiv.org/pdf/1811.11896v2.pdf |
PWC | https://paperswithcode.com/paper/variational-autoencoding-the-lagrangian |
Repo | https://github.com/deadzombie2333/Lagrangian_simulation_VAE |
Framework | tf |
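For reference, the two pieces that make any VAE trainable end to end — the reparameterization trick and the ELBO-style loss — in a minimal PyTorch sketch (a generic formulation, not the authors' convolutional architecture):

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    # z = mu + sigma * eps keeps the sampling step differentiable
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps

def vae_loss(x, x_recon, mu, logvar):
    # reconstruction term + KL divergence from the unit-Gaussian prior
    recon = F.mse_loss(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

Once trained, sampling z ~ N(0, I) and decoding yields new trajectories statistically similar to the measured ones.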
Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora
Title | Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora |
Authors | Sajawel Ahmed, Alexander Mehler |
Abstract | This study improves the performance of neural named entity recognition by a margin of up to 11% in F-score, taking German, a comparatively low-resource language, as an example, thereby outperforming existing baselines and establishing a new state of the art on every single open-source dataset. Rather than designing deeper and wider hybrid neural architectures, we gather all available resources and perform detailed optimization and grammar-dependent morphological processing, consisting of lemmatization and part-of-speech tagging, prior to exposing the raw data to any training process. We test our approach in a threefold monolingual experimental setup of a) single, b) joint, and c) optimized training, and shed light on the dependency of downstream tasks on the size of the corpora used to compute word embeddings. |
Tasks | Lemmatization, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10675v1 |
PDF | http://arxiv.org/pdf/1807.10675v1.pdf |
PWC | https://paperswithcode.com/paper/resource-size-matters-improving-neural-named |
Repo | https://github.com/FID-Biodiversity/GermanWordEmbeddings-NER |
Framework | none |
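The key step the abstract describes — lemmatizing and POS-tagging the raw corpus before any embedding training — might look like the following sketch; spaCy's German model and gensim stand in for whatever tools the authors actually used, and the corpus line is a placeholder:

```python
import spacy
from gensim.models import Word2Vec

nlp = spacy.load('de_core_news_sm', disable=['parser', 'ner'])

def preprocess(lines):
    # lemmatize so German's rich morphology collapses onto shared base forms
    for doc in nlp.pipe(lines):
        yield [tok.lemma_ for tok in doc if not tok.is_space]

corpus = ['Der Biber baut einen Damm im Naturschutzgebiet .']  # placeholder
vectors = Word2Vec(sentences=list(preprocess(corpus)),
                   vector_size=300, min_count=1)
```

The resulting embeddings then feed a standard neural NER tagger; the paper's point is that this corpus-side optimization matters more than a deeper architecture.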
Training Competitive Binary Neural Networks from Scratch
Title | Training Competitive Binary Neural Networks from Scratch |
Authors | Joseph Bethge, Marvin Bornstein, Adrian Loy, Haojin Yang, Christoph Meinel |
Abstract | Convolutional neural networks have achieved astonishing results in different application areas. Various methods that allow us to use these models on mobile and embedded devices have been proposed. Binary neural networks in particular are a promising approach for devices with low computational power. However, training accurate binary models from scratch remains a challenge. Previous work often uses prior knowledge from full-precision models and complex training strategies. In our work, we focus on increasing the performance of binary neural networks without such prior knowledge and with a much simpler training strategy. In our experiments we show that we are able to achieve state-of-the-art results on standard benchmark datasets. Further, to the best of our knowledge, we are the first to successfully adopt a network architecture with dense connections for binary networks, which lets us improve the state of the art even further. |
Tasks | |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01965v1 |
PDF | http://arxiv.org/pdf/1812.01965v1.pdf |
PWC | https://paperswithcode.com/paper/training-competitive-binary-neural-networks |
Repo | https://github.com/hpi-xnor/BMXNet-v2 |
Framework | mxnet |
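The standard building block for training such networks from scratch is sign binarization with a straight-through estimator, so the non-differentiable sign function still admits a usable gradient. A sketch (the clipping window is the usual convention, not necessarily the paper's exact choice):

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # binarize weights/activations on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # straight-through estimator: pass gradients only where |x| <= 1
        return grad_output * (x.abs() <= 1).float()
```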
Generative Models from the perspective of Continual Learning
Title | Generative Models from the perspective of Continual Learning |
Authors | Timothée Lesort, Hugo Caselles-Dupré, Michael Garcia-Ortiz, Andrei Stoian, David Filliat |
Abstract | Which generative model is the most suitable for Continual Learning? This paper aims at evaluating and comparing generative models on disjoint sequential image generation tasks. We investigate how several models learn and forget, considering various strategies: rehearsal, regularization, generative replay, and fine-tuning. We used two quantitative metrics to estimate the generation quality and memory ability. We experiment with sequential tasks on three commonly used benchmarks for Continual Learning (MNIST, Fashion MNIST, and CIFAR10). We found that, among all models, the original GAN performs best and, among Continual Learning strategies, generative replay outperforms all other methods. Although we found satisfactory combinations on MNIST and Fashion MNIST, training generative models sequentially on CIFAR10 is particularly unstable and remains a challenge. Our code is available online at https://github.com/TLESORT/Generative_Continual_Learning. |
Tasks | Continual Learning, Image Generation |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1812.09111v1 |
PDF | http://arxiv.org/pdf/1812.09111v1.pdf |
PWC | https://paperswithcode.com/paper/generative-models-from-the-perspective-of |
Repo | https://github.com/TLESORT/Generative_Continual_Learning |
Framework | pytorch |
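Generative replay, the winning strategy here, interleaves the current task's real data with samples drawn from a frozen copy of the previous generator, so earlier distributions keep being rehearsed. A schematic sketch; all of the hooks (`sample`, `train_on`) are hypothetical names for illustration:

```python
def generative_replay_step(prev_generator, generator, classifier,
                           real_batch, replay_ratio=0.5):
    # replay: synthesize data that stands in for earlier tasks
    n_replay = int(len(real_batch) * replay_ratio)
    replay_batch = prev_generator.sample(n_replay)  # frozen old generator
    mixed = real_batch + replay_batch
    generator.train_on(mixed)   # hypothetical one-step training hooks
    classifier.train_on(mixed)
```

After each task, the current generator is snapshotted and becomes `prev_generator` for the next one.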
An Information-theoretic Framework for the Lossy Compression of Link Streams
Title | An Information-theoretic Framework for the Lossy Compression of Link Streams |
Authors | Robin Lamarche-Perrin |
Abstract | Graph compression is a data analysis technique that consists in replacing parts of a graph with more general structural patterns in order to reduce its description length. It notably provides interesting exploration tools for the study of real, large-scale, and complex graphs which cannot be grasped at first glance. This article proposes a framework for the compression of temporal graphs, that is, for the compression of graphs that evolve with time. This framework first builds on a simple and limited scheme, exploiting structural equivalence for the lossless compression of static graphs, then generalises it to the lossy compression of link streams, a recent formalism for the study of temporal graphs. This generalisation relies on the natural extension of (bidimensional) relational data by the addition of a third, temporal dimension. Moreover, we introduce an information-theoretic measure to quantify and control the information that is lost during compression, as well as an algebraic characterisation of the space of possible compression patterns to enhance the expressiveness of the initial compression scheme. These contributions lead to the definition of a combinatorial optimisation problem, that is, the Lossy Multistream Compression Problem, for which we provide an exact algorithm. |
Tasks | |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06874v1 |
PDF | http://arxiv.org/pdf/1807.06874v1.pdf |
PWC | https://paperswithcode.com/paper/an-information-theoretic-framework-for-the |
Repo | https://github.com/Lamarche-Perrin/greedy-graph-compression |
Framework | none |
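The lossless starting point of the framework groups structurally equivalent nodes — nodes with identical neighbourhoods — which can then be replaced by a single aggregated node at no information cost. A small Python sketch for undirected graphs:

```python
from collections import defaultdict

def structural_equivalence_classes(edges):
    neigh = defaultdict(set)
    for u, v in edges:
        neigh[u].add(v)
        neigh[v].add(u)
    classes = defaultdict(list)
    for node, ns in neigh.items():
        classes[frozenset(ns)].append(node)  # same neighbour set, same class
    return list(classes.values())

# structural_equivalence_classes([(1, 3), (2, 3), (1, 4), (2, 4)])
# -> [[1, 2], [3, 4]]: each pair is interchangeable in the graph
```

The lossy generalisation then relaxes this exact-equality criterion, using the information-theoretic measure to control what merging approximately equivalent rows, columns, or time slices throws away.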
Evaluating Overfit and Underfit in Models of Network Community Structure
Title | Evaluating Overfit and Underfit in Models of Network Community Structure |
Authors | Amir Ghasemian, Homa Hosseinmardi, Aaron Clauset |
Abstract | A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network’s connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over- or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over- and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely, both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum-description-length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluating over- and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared. |
Tasks | Community Detection, Link Prediction |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10582v3 |
PDF | http://arxiv.org/pdf/1802.10582v3.pdf |
PWC | https://paperswithcode.com/paper/evaluating-overfit-and-underfit-in-models-of |
Repo | https://github.com/AGhasemian/CommunityFitNet |
Framework | none |
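The link-prediction diagnostic can be made concrete as an AUC-style probe: a fitted community model is asked, pair by pair, whether it scores a held-out true edge above a random non-edge. A generic sketch of that probe (not the paper's full protocol), where `score(u, v)` is any model-derived edge probability:

```python
import random

def link_prediction_auc(score, held_out_edges, non_edges,
                        n_trials=10_000, seed=0):
    rng = random.Random(seed)
    wins = 0.0
    for _ in range(n_trials):
        s_true = score(*rng.choice(held_out_edges))
        s_false = score(*rng.choice(non_edges))
        wins += 1.0 if s_true > s_false else (0.5 if s_true == s_false else 0.0)
    return wins / n_trials  # 0.5 ~ chance, 1.0 ~ perfect ranking
```

Intuitively, an overfit model memorizes the observed edges and ranks held-out ones poorly, while an underfit one is too coarse to separate edges from non-edges at all.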
Forecasting Future Humphrey Visual Fields Using Deep Learning
Title | Forecasting Future Humphrey Visual Fields Using Deep Learning |
Authors | Joanne C. Wen, Cecilia S. Lee, Pearse A. Keane, Sa Xiao, Yue Wu, Ariel Rokem, Philip P. Chen, Aaron Y. Lee |
Abstract | Purpose: To determine whether deep learning networks could be trained to forecast a future 24-2 Humphrey Visual Field (HVF). Participants: All patients who obtained an HVF 24-2 at the University of Washington. Methods: All datapoints from consecutive 24-2 HVFs from 1998 to 2018 were extracted from a University of Washington database. Ten-fold cross-validation with a held-out test set was used to develop the three main phases of model development: model architecture selection, dataset combination selection, and time-interval model training with transfer learning, to train a deep learning artificial neural network capable of generating a point-wise visual field prediction. Results: More than 1.7 million perimetry points were extracted to the hundredth decibel from 32,443 24-2 HVFs. The best-performing model with 20 million trainable parameters, CascadeNet-5, was selected. The overall MAE for the test set was 2.47 dB (95% CI: 2.45 dB to 2.48 dB). The 100 fully trained models were able to successfully predict progressive field loss in glaucomatous eyes up to 5.5 years in the future, with a correlation of 0.92 between the MD of predicted and actual future HVFs (p < 2.2 × 10^-16) and an average difference of 0.41 dB. Conclusions: Using unfiltered real-world datasets, deep learning networks show an impressive ability not only to learn spatio-temporal HVF changes but also to generate predictions for future HVFs up to 5.5 years ahead, given only a single HVF. |
Tasks | Transfer Learning |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.04543v1 |
PDF | http://arxiv.org/pdf/1804.04543v1.pdf |
PWC | https://paperswithcode.com/paper/forecasting-future-humphrey-visual-fields |
Repo | https://github.com/uw-biomedical-ml/hvfProgression |
Framework | tf |
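The headline metric, point-wise MAE in decibels between predicted and actual fields, is straightforward to compute; a small sketch (the array layout is an assumption — a 24-2 field has 54 test points, 52 excluding the blind spot):

```python
import numpy as np

def pointwise_mae(pred_fields, true_fields):
    # (n_fields, n_points) arrays of sensitivities in dB
    pred = np.asarray(pred_fields, dtype=float)
    true = np.asarray(true_fields, dtype=float)
    return float(np.abs(pred - true).mean())
```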
Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Title | Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions |
Authors | Albert Gatt, Marc Tanti, Adrian Muscat, Patrizia Paggio, Reuben A. Farrugia, Claudia Borg, Kenneth P. Camilleri, Mike Rosner, Lonneke van der Plas |
Abstract | The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, rather than objects and relations. Given that no data exists for this task, we present an ongoing crowdsourcing study to collect a corpus of descriptions of face images taken 'in the wild'. To gain a better understanding of the variation we find in face description and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus. Primarily, we found descriptions to refer to a mixture of attributes, not only physical, but also emotional and inferential, which is bound to create further challenges for current image-to-text methods. |
Tasks | |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03827v1 |
PDF | http://arxiv.org/pdf/1803.03827v1.pdf |
PWC | https://paperswithcode.com/paper/face2text-collecting-an-annotated-image |
Repo | https://github.com/akanimax/T2F |
Framework | pytorch |