May 7, 2019

2937 words 14 mins read

Paper Group AWR 50


Expert Gate: Lifelong Learning with a Network of Experts

Title Expert Gate: Lifelong Learning with a Network of Experts
Authors Rahaf Aljundi, Punarjay Chakravarty, Tinne Tuytelaars
Abstract In this paper we introduce a model of lifelong learning based on a Network of Experts. New tasks / experts are learned and added to the model sequentially, building on what was learned before. To ensure scalability of this process, data from previous tasks cannot be stored and hence is not available when learning a new task. A critical issue in this context, not addressed in the literature so far, is deciding which expert to deploy at test time. We introduce a set of gating autoencoders that learn a representation for the task at hand and, at test time, automatically forward the test sample to the relevant expert. This also brings memory efficiency, as only one expert network has to be loaded into memory at any given time. Further, the autoencoders inherently capture the relatedness of one task to another, which can be used to select the most relevant prior model for training a new expert, whether by fine-tuning or learning-without-forgetting. We evaluate our method on image classification and video prediction problems.
Tasks Image Classification, Video Prediction
Published 2016-11-18
URL http://arxiv.org/abs/1611.06194v2
PDF http://arxiv.org/pdf/1611.06194v2.pdf
PWC https://paperswithcode.com/paper/expert-gate-lifelong-learning-with-a-network
Repo https://github.com/wannabeOG/ExpertNet-Pytorch
Framework pytorch
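
As an illustration of the gating mechanism described in the abstract, here is a minimal PyTorch sketch (not the authors' code; dimensions and architecture are placeholder assumptions): one small undercomplete autoencoder per task, with the test sample routed to the expert whose autoencoder reconstructs it best.

```python
import torch
import torch.nn as nn

class GateAutoencoder(nn.Module):
    """One shallow, undercomplete autoencoder per task (sizes are assumptions)."""
    def __init__(self, dim=4096, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def select_expert(x, autoencoders):
    """Route feature vector x to the task whose gate autoencoder
    reconstructs it with the lowest error."""
    with torch.no_grad():
        errors = [nn.functional.mse_loss(ae(x), x) for ae in autoencoders]
    return int(torch.argmin(torch.stack(errors)))

autoencoders = [GateAutoencoder() for _ in range(3)]   # one per learned task
x = torch.randn(1, 4096)                               # e.g. a pretrained-CNN feature
expert_id = select_expert(x, autoencoders)             # only this expert is loaded
```

Because only the winning expert then needs to be in memory, the footprint stays constant in the number of tasks.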

Adversarial examples in the physical world

Title Adversarial examples in the physical world
Authors Alexey Kurakin, Ian Goodfellow, Samy Bengio
Abstract Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data which has been modified very slightly in a way that is intended to cause a machine learning classifier to misclassify it. In many cases, these modifications can be so subtle that a human observer does not even notice them, yet the classifier still makes a mistake. Adversarial examples pose security concerns because they could be used to attack machine learning systems, even if the adversary has no access to the underlying model. Up to now, all previous work has assumed a threat model in which the adversary can feed data directly into the machine learning classifier. This is not always the case for systems operating in the physical world, for example those using signals from cameras and other sensors as input. This paper shows that even in such physical-world scenarios, machine learning systems are vulnerable to adversarial examples. We demonstrate this by feeding adversarial images obtained from a cell-phone camera to an ImageNet Inception classifier and measuring the classification accuracy of the system. We find that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera.
Tasks
Published 2016-07-08
URL http://arxiv.org/abs/1607.02533v4
PDF http://arxiv.org/pdf/1607.02533v4.pdf
PWC https://paperswithcode.com/paper/adversarial-examples-in-the-physical-world
Repo https://github.com/1Konny/FGSM
Framework pytorch
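
The attack underlying the linked repo is the fast gradient sign method and its iterative variant. A minimal PyTorch sketch of the iterative form, assuming a model that returns logits and images scaled to [0, 1] (the hyperparameters are illustrative, not the paper's):

```python
import torch
import torch.nn.functional as F

def iterative_fgsm(model, x, y, eps=0.03, alpha=0.005, steps=10):
    """Iterative fast gradient sign method: repeatedly step in the sign of the
    input gradient, staying within an eps-ball of the original image."""
    x = x.detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)              # keep a valid pixel range
    return x_adv
```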

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

Title InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
Authors Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel
Abstract This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound to the mutual information objective that can be optimized efficiently, and show that our training procedure can be interpreted as a variation of the Wake-Sleep algorithm. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing fully supervised methods.
Tasks Image Generation, Representation Learning, Unsupervised Image Classification, Unsupervised MNIST
Published 2016-06-12
URL http://arxiv.org/abs/1606.03657v1
PDF http://arxiv.org/pdf/1606.03657v1.pdf
PWC https://paperswithcode.com/paper/infogan-interpretable-representation-learning
Repo https://github.com/sidneyp/bidirectional
Framework tf
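
The key addition over a standard GAN is the mutual-information term. A hedged sketch, assuming a generator G that takes the concatenated noise and code, and a recognition network Q that predicts the categorical code from a generated image (all names and dimensions are placeholders):

```python
import torch
import torch.nn.functional as F

def info_loss(G, Q, batch_size=32, noise_dim=62, n_categories=10):
    """Lower bound on I(c; G(z, c)): train Q to recover the categorical
    code c from the generated sample (up to the constant entropy H(c))."""
    z = torch.randn(batch_size, noise_dim)
    c = torch.randint(n_categories, (batch_size,))
    c_onehot = F.one_hot(c, n_categories).float()
    fake = G(torch.cat([z, c_onehot], dim=1))   # generator input: noise + code
    logits = Q(fake)                            # recognition network predicts c
    return F.cross_entropy(logits, c)           # = -E[log Q(c | G(z, c))]
```

Adding this term to both the generator and Q objectives is what encourages each code dimension to capture one interpretable factor of variation.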

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning

Title Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Authors Jiasen Lu, Caiming Xiong, Devi Parikh, Richard Socher
Abstract Attention-based neural encoder-decoder frameworks have been widely adopted for image captioning. Most methods force visual attention to be active for every generated word. However, the decoder likely requires little to no visual information from the image to predict non-visual words such as “the” and “of”. Other words that may seem visual can often be predicted reliably from the language model alone, e.g., “sign” after “behind a red stop” or “phone” following “talking on a cell”. In this paper, we propose a novel adaptive attention model with a visual sentinel. At each time step, our model decides whether to attend to the image (and if so, to which regions) or to the visual sentinel, in order to extract meaningful information for sequential word generation. We test our method on the COCO image captioning 2015 challenge dataset and Flickr30K. Our approach sets the new state of the art by a significant margin.
Tasks Image Captioning, Language Modelling
Published 2016-12-06
URL http://arxiv.org/abs/1612.01887v2
PDF http://arxiv.org/pdf/1612.01887v2.pdf
PWC https://paperswithcode.com/paper/knowing-when-to-look-adaptive-attention-via-a
Repo https://github.com/miroblog/AdaptiveAttention
Framework pytorch
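
The sentinel mechanism reduces to a learned convex combination between the attended image features and a latent "language-only" vector. A minimal sketch of the adaptive context vector, with tensor shapes stated in the comments (names are placeholders, not the authors' code):

```python
import torch

def adaptive_context(att_weights, values, beta, sentinel):
    """c_hat = beta * sentinel + (1 - beta) * sum_k alpha_k * v_k.
    att_weights: [B, K], values: [B, K, D], beta: [B, 1], sentinel: [B, D].
    beta near 1 means "don't look at the image for this word"."""
    spatial = (att_weights.unsqueeze(-1) * values).sum(dim=1)   # [B, D]
    return beta * sentinel + (1.0 - beta) * spatial
```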

COCO: A Platform for Comparing Continuous Optimizers in a Black-Box Setting

Title COCO: A Platform for Comparing Continuous Optimizers in a Black-Box Setting
Authors Nikolaus Hansen, Anne Auger, Olaf Mersmann, Tea Tusar, Dimo Brockhoff
Abstract COCO is a platform for Comparing Continuous Optimizers in a black-box setting. It aims at automating the tedious and repetitive task of benchmarking numerical optimization algorithms to the greatest possible extent. We present the rationale behind the development of the platform as a general proposal for better benchmarking practice. We detail underlying fundamental concepts of COCO such as its definition of a problem, the idea of instances, the relevance of target values, and runtime as the central performance measure. Finally, we give a quick overview of the basic code structure and the available test suites.
Tasks
Published 2016-03-29
URL http://arxiv.org/abs/1603.08785v3
PDF http://arxiv.org/pdf/1603.08785v3.pdf
PWC https://paperswithcode.com/paper/coco-a-platform-for-comparing-continuous
Repo https://github.com/numbbo/coco
Framework none
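
For orientation, benchmarking an optimizer with the platform's Python interface looks roughly like the sketch below, following the example in the numbbo/coco repository (treat the exact names as indicative and consult the repo for the current API):

```python
import cocoex                 # COCO experimentation module
import scipy.optimize

suite = cocoex.Suite("bbob", "", "")                               # single-objective suite
observer = cocoex.Observer("bbob", "result_folder: my-optimizer")
for problem in suite:         # each problem is a callable f: R^n -> R
    problem.observe_with(observer)                                 # log runtimes vs. targets
    scipy.optimize.fmin(problem, problem.initial_solution, disp=False)
```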

Conditional Image Generation with PixelCNN Decoders

Title Conditional Image Generation with PixelCNN Decoders
Authors Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu
Abstract This work explores conditional image generation with a new image density model based on the PixelCNN architecture. The model can be conditioned on any vector, including descriptive labels or tags, or latent embeddings created by other networks. When conditioned on class labels from the ImageNet database, the model is able to generate diverse, realistic scenes representing distinct animals, objects, landscapes and structures. When conditioned on an embedding produced by a convolutional network given a single image of an unseen face, it generates a variety of new portraits of the same person with different facial expressions, poses and lighting conditions. We also show that conditional PixelCNN can serve as a powerful decoder in an image autoencoder. Additionally, the gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.
Tasks Conditional Image Generation, Image Generation
Published 2016-06-16
URL http://arxiv.org/abs/1606.05328v2
PDF http://arxiv.org/pdf/1606.05328v2.pdf
PWC https://paperswithcode.com/paper/conditional-image-generation-with-pixelcnn
Repo https://github.com/openai/pixel-cnn
Framework tf
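
The gated convolutional layer mentioned at the end of the abstract combines a tanh path and a sigmoid gate, each shifted by a projection of the conditioning vector h. A minimal PyTorch sketch (autoregressive masking omitted; names and sizes are assumptions):

```python
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    """y = tanh(W_f * x + V_f h) * sigmoid(W_g * x + V_g h), with h the
    conditioning vector. Autoregressive masking is omitted in this sketch."""
    def __init__(self, channels=64, cond_dim=10):
        super().__init__()
        self.conv = nn.Conv2d(channels, 2 * channels, kernel_size=3, padding=1)
        self.cond = nn.Linear(cond_dim, 2 * channels)

    def forward(self, x, h):
        out = self.conv(x) + self.cond(h)[:, :, None, None]   # broadcast over H, W
        a, b = out.chunk(2, dim=1)                            # split tanh / gate paths
        return torch.tanh(a) * torch.sigmoid(b)
```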

Deep Visual Foresight for Planning Robot Motion

Title Deep Visual Foresight for Planning Robot Motion
Authors Chelsea Finn, Sergey Levine
Abstract A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback. Model-based reinforcement learning holds the promise of enabling an agent to learn to predict the effects of its actions, which could provide flexible predictive models for a wide range of tasks and environments without detailed human supervision. We develop a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data. Our approach does not require a calibrated camera, an instrumented training set-up, or precise sensing and actuation. Our results show that our method enables a real robot to perform nonprehensile manipulation – pushing objects – and can handle novel objects not seen during training.
Tasks Video Prediction
Published 2016-10-03
URL http://arxiv.org/abs/1610.00696v2
PDF http://arxiv.org/pdf/1610.00696v2.pdf
PWC https://paperswithcode.com/paper/deep-visual-foresight-for-planning-robot
Repo https://github.com/m-serra/action-inference-for-video-prediction-benchmarking
Framework tf
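
The control loop is model-predictive control on top of the learned video-prediction model. A hedged sketch of sampling-based planning, where predict_video and cost_fn are assumed callables standing in for the learned model and the task objective:

```python
import torch

def plan_action(predict_video, cost_fn, obs, horizon=5, n_samples=100, act_dim=4):
    """Sampling-based MPC: imagine futures for random action sequences under the
    learned video-prediction model, execute the first action of the cheapest one."""
    actions = torch.randn(n_samples, horizon, act_dim)   # candidate action sequences
    futures = predict_video(obs, actions)                # predicted frames per sequence
    costs = cost_fn(futures)                             # e.g. predicted object-to-goal distance
    return actions[costs.argmin(), 0]                    # best first action; replan next step
```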

Invertible Conditional GANs for image editing

Title Invertible Conditional GANs for image editing
Authors Guim Perarnau, Joost van de Weijer, Bogdan Raducanu, Jose M. Álvarez
Abstract Generative Adversarial Networks (GANs) have recently been shown to approximate complex data distributions successfully. A relevant extension of this model is conditional GANs (cGANs), where the introduction of external information allows specific representations of the generated images to be determined. In this work, we evaluate encoders that invert the mapping of a cGAN, i.e., map a real image into a latent space and a conditional representation. This allows, for example, reconstructing and modifying real images of faces conditioned on arbitrary attributes. Additionally, we evaluate the design of cGANs. The combination of an encoder with a cGAN, which we call Invertible cGAN (IcGAN), enables re-generating real images with deterministic complex modifications.
Tasks Conditional Image Generation, Image-to-Image Translation
Published 2016-11-19
URL http://arxiv.org/abs/1611.06355v1
PDF http://arxiv.org/pdf/1611.06355v1.pdf
PWC https://paperswithcode.com/paper/invertible-conditional-gans-for-image-editing
Repo https://github.com/AZHARTHEGEEK/GAN_s
Framework none
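
The editing pipeline is encode, swap attributes, regenerate. A minimal sketch under assumed encoder/generator callables (E_z, E_y, and G are placeholders for the trained networks, not the paper's code):

```python
import torch

def edit_image(E_z, E_y, G, image, new_attributes):
    """IcGAN-style editing: encode into latent z and attribute vector y,
    then regenerate with modified attributes."""
    with torch.no_grad():
        z = E_z(image)                    # latent representation of the real image
        y = E_y(image)                    # inferred conditional (attribute) vector
        reconstruction = G(z, y)          # should approximate the input image
        edited = G(z, new_attributes)     # same identity, modified attributes
    return reconstruction, edited
```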

Improving Sampling from Generative Autoencoders with Markov Chains

Title Improving Sampling from Generative Autoencoders with Markov Chains
Authors Antonia Creswell, Kai Arulkumaran, Anil Anthony Bharath
Abstract We focus on generative autoencoders, such as variational or adversarial autoencoders, which jointly learn a generative model alongside an inference model. Generative autoencoders are those which are trained to softly enforce a prior on the latent distribution learned by the inference model. We call the distribution to which the inference model maps observed samples the learned latent distribution; it may not be consistent with the prior. We formulate a Markov chain Monte Carlo (MCMC) sampling process, equivalent to iteratively decoding and encoding, which allows us to sample from the learned latent distribution. Since the generative model learns to map from the learned latent distribution rather than the prior, we may use MCMC to improve the quality of samples drawn from the generative model, especially when the learned latent distribution is far from the prior. Using MCMC sampling, we are able to reveal previously unseen differences between generative autoencoders trained either with or without a denoising criterion.
Tasks
Published 2016-10-28
URL http://arxiv.org/abs/1610.09296v3
PDF http://arxiv.org/pdf/1610.09296v3.pdf
PWC https://paperswithcode.com/paper/improving-sampling-from-generative
Repo https://github.com/Kaixhin/Autoencoders
Framework torch
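
The sampling process itself is just an alternating decode-encode chain. A minimal sketch, where decoder and encoder are the trained generative and inference models (for a VAE, encoder would sample from q(z|x)):

```python
def mcmc_sample(decoder, encoder, z0, steps=10):
    """Alternate decoding and encoding: the chain moves z from the prior
    toward the learned latent distribution before the final decode."""
    z = z0
    for _ in range(steps):
        x = decoder(z)    # generate an observation from the current code
        z = encoder(x)    # re-encode it (for a VAE, sample from q(z|x))
    return decoder(z)
```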

A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images

Title A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images
Authors David Vázquez, Jorge Bernal, F. Javier Sánchez, Gloria Fernández-Esparrach, Antonio M. López, Adriana Romero, Michal Drozdzal, Aaron Courville
Abstract Colorectal cancer (CRC) is the third leading cause of cancer death worldwide. Currently, the standard approach to reducing CRC-related mortality is to perform regular screening in search of polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are the polyp miss-rate and the inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) that help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy images, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. We provide new baselines on this dataset by training standard fully convolutional networks (FCNs) for semantic segmentation, significantly outperforming prior results in endoluminal scene segmentation without any further post-processing.
Tasks Scene Segmentation, Semantic Segmentation
Published 2016-12-02
URL http://arxiv.org/abs/1612.00799v1
PDF http://arxiv.org/pdf/1612.00799v1.pdf
PWC https://paperswithcode.com/paper/a-benchmark-for-endoluminal-scene
Repo https://github.com/guilhermesantos/Semantic-Image-Segmentation
Framework pytorch
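
The baselines are standard FCNs trained with per-pixel cross-entropy. A minimal sketch of one training step (the network, optimizer, and data loading are placeholders for whatever FCN variant and benchmark split is used):

```python
import torch.nn.functional as F

def train_step(fcn, optimizer, images, masks):
    """One step of the baseline recipe: per-pixel cross-entropy on FCN logits.
    images: [B, 3, H, W]; masks: [B, H, W] with integer class ids."""
    logits = fcn(images)                     # [B, n_classes, H, W]
    loss = F.cross_entropy(logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```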

Chained Gaussian Processes

Title Chained Gaussian Processes
Authors Alan D. Saul, James Hensman, Aki Vehtari, Neil D. Lawrence
Abstract Gaussian process models are flexible, Bayesian non-parametric approaches to regression. Properties of multivariate Gaussians mean that they can be combined linearly in the manner of additive models and via a link function (as in generalized linear models) to handle non-Gaussian data. However, the link-function formalism is restrictive: link functions are always invertible and must convert a parameter of interest to a linear combination of the underlying processes. There are many likelihoods and models where a non-linear combination is more appropriate. We term these more general models Chained Gaussian Processes: the transformation of the GPs to the likelihood parameters will not generally be invertible, which implies that linearisation would only be possible with multiple (localized) links, i.e. a chain. We develop an approximate inference procedure for Chained GPs that is scalable and applicable to any factorized likelihood. We demonstrate the approximation on a range of likelihood functions.
Tasks Gaussian Processes
Published 2016-04-18
URL http://arxiv.org/abs/1604.05263v1
PDF http://arxiv.org/pdf/1604.05263v1.pdf
PWC https://paperswithcode.com/paper/chained-gaussian-processes
Repo https://github.com/SheffieldML/ChainedGP
Framework none
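
A concrete member of this family is heteroscedastic regression, where two latent GPs drive the mean and log-variance of a Gaussian likelihood. A small numerical sketch of that likelihood, with f1 and f2 standing for samples of the two latent functions at the inputs:

```python
import numpy as np

def chained_loglik(y, f1, f2):
    """Heteroscedastic Gaussian likelihood y ~ N(f1(x), exp(f2(x))):
    a non-linear, non-invertible combination of two latent GPs that
    a single invertible link function cannot express."""
    var = np.exp(f2)
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (y - f1) ** 2 / var)
```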

Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

Title Known Unknowns: Uncertainty Quality in Bayesian Neural Networks
Authors Ramon Oliveira, Pedro Tabacof, Eduardo Valle
Abstract We evaluate the uncertainty quality in neural networks using anomaly detection. We extract uncertainty measures (e.g. entropy) from the predictions of candidate models, use those measures as features for an anomaly detector, and gauge how well the detector differentiates known from unknown classes. We assign higher uncertainty quality to candidate models that lead to better detectors. We also propose a novel method for sampling a variational approximation of a Bayesian neural network, called One-Sample Bayesian Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We compare the following candidate neural network models: Maximum Likelihood, Bayesian Dropout, OSBA, and — for MNIST — the standard variational approximation. We show that Bayesian Dropout and OSBA provide better uncertainty information than Maximum Likelihood, and are essentially equivalent to the standard variational approximation, but much faster.
Tasks Anomaly Detection
Published 2016-12-05
URL http://arxiv.org/abs/1612.01251v2
PDF http://arxiv.org/pdf/1612.01251v2.pdf
PWC https://paperswithcode.com/paper/known-unknowns-uncertainty-quality-in
Repo https://github.com/ramon-oliveira/deepstats
Framework none
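
The uncertainty measures are extracted from repeated stochastic forward passes. A minimal sketch of predictive entropy under MC dropout, one of the candidate models compared in the paper (names are placeholders):

```python
import torch

def predictive_entropy(model, x, n_samples=50):
    """Average n stochastic forward passes (dropout left active at test time)
    and score each input by the entropy of the mean predictive distribution;
    high entropy flags inputs from unknown classes."""
    model.train()   # keep dropout stochastic (MC dropout)
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1)
                             for _ in range(n_samples)])
    p = probs.mean(dim=0)
    return -(p * torch.log(p + 1e-12)).sum(dim=-1)
```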

Lost in Space: Geolocation in Event Data

Title Lost in Space: Geolocation in Event Data
Authors Sophie J. Lee, Howard Liu, Michael D. Ward
Abstract Extracting the “correct” location information from text data, i.e., determining the place of an event, has long been a goal for automated text processing. To approximate a human-like coding schema, we introduce a supervised machine learning algorithm that classifies each location word as either correct or incorrect. We use news articles collected from around the world (Integrated Crisis Early Warning System [ICEWS] data and Open Event Data Alliance [OEDA] data) to test our algorithm, which consists of two stages. In the feature selection stage, we extract contextual information from texts, namely, the N-gram patterns for location words, the frequency of mention, and the context of the sentences containing location words. In the classification stage, we use three classifiers to estimate the model parameters on the training set and then predict whether a location word in the test-set news articles is the place of the event. The validation results show that our algorithm improves the accuracy of current dictionary-based geolocation methods by as much as 25%.
Tasks Feature Selection
Published 2016-11-14
URL http://arxiv.org/abs/1611.04837v1
PDF http://arxiv.org/pdf/1611.04837v1.pdf
PWC https://paperswithcode.com/paper/lost-in-space-geolocation-in-event-data
Repo https://github.com/haoliuhoward/LostinSpace-PSRM
Framework none
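
The classification stage amounts to featurizing the context of each location mention and predicting whether it is the event's place. A toy scikit-learn sketch (not the authors' pipeline; the examples and features are illustrative stand-ins):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Context windows around two mentions of "Cairo"; label 1 marks the
# mention that is the actual place of the event.
contexts = ["protesters gathered in downtown Cairo on",
            "flight from Cairo was delayed after"]
labels = [1, 0]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 3)), LogisticRegression())
clf.fit(contexts, labels)
print(clf.predict(["clashes erupted near Cairo yesterday"]))
```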

Tensorial Mixture Models

Title Tensorial Mixture Models
Authors Or Sharir, Ronen Tamari, Nadav Cohen, Amnon Shashua
Abstract Casting neural networks in generative frameworks is a highly sought-after endeavor these days. Contemporary methods, such as Generative Adversarial Networks, capture some of the generative capabilities, but not all. In particular, they lack the ability of tractable marginalization, and thus are not suitable for many tasks. Other methods, based on arithmetic circuits and sum-product networks, do allow tractable marginalization, but their performance is challenged by the need to learn the structure of a circuit. Building on the tractability of arithmetic circuits, we leverage concepts from tensor analysis, and derive a family of generative models we call Tensorial Mixture Models (TMMs). TMMs assume a simple convolutional network structure, and in addition, lend themselves to theoretical analyses that allow comprehensive understanding of the relation between their structure and their expressive properties. We thus obtain a generative model that is tractable on one hand, and on the other hand, allows effective representation of rich distributions in an easily controlled manner. These two capabilities are brought together in the task of classification under missing data, where TMMs deliver state-of-the-art accuracies with seamless implementation and design.
Tasks
Published 2016-10-13
URL http://arxiv.org/abs/1610.04167v5
PDF http://arxiv.org/pdf/1610.04167v5.pdf
PWC https://paperswithcode.com/paper/tensorial-mixture-models
Repo https://github.com/HUJI-Deep/caffe-simnets
Framework none
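
The tractable-marginalization property TMMs exploit can be illustrated on a much simpler model: in any mixture with factorized components, marginalizing out missing coordinates just drops their factors. A sketch on a diagonal-Gaussian mixture (a loose analogue of a TMM, not the TMM itself):

```python
import numpy as np

def mixture_loglik_observed(x, observed, weights, means, variances):
    """log p(x_observed) for a diagonal-Gaussian mixture: marginalizing the
    missing coordinates simply drops their factors from each component.
    observed: boolean mask over dimensions; means/variances: [K, D]."""
    comp_ll = np.array([
        -0.5 * np.sum(np.log(2 * np.pi * v[observed])
                      + (x[observed] - m[observed]) ** 2 / v[observed])
        for m, v in zip(means, variances)
    ])
    return np.log(np.sum(weights * np.exp(comp_ll)))
```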

End-to-End Instance Segmentation with Recurrent Attention

Title End-to-End Instance Segmentation with Recurrent Attention
Authors Mengye Ren, Richard S. Zemel
Abstract While convolutional neural networks have recently achieved impressive success in solving structured prediction problems such as semantic segmentation, it remains a challenge to differentiate individual object instances in the scene. Instance segmentation is very important in a variety of applications, such as autonomous driving, image captioning, and visual question answering. Techniques that combine large graphical models with low-level vision have been proposed to address this problem; in contrast, we propose an end-to-end recurrent neural network (RNN) architecture with an attention mechanism that models a human-like counting process and produces detailed instance segmentations. The network is jointly trained to sequentially produce regions of interest as well as a dominant object segmentation within each region. The proposed model achieves competitive results on the CVPPP, KITTI, and Cityscapes datasets.
Tasks Autonomous Driving, Image Captioning, Instance Segmentation, Question Answering, Semantic Segmentation, Structured Prediction, Visual Question Answering
Published 2016-05-30
URL http://arxiv.org/abs/1605.09410v5
PDF http://arxiv.org/pdf/1605.09410v5.pdf
PWC https://paperswithcode.com/paper/end-to-end-instance-segmentation-with
Repo https://github.com/renmengye/rec-attend-public
Framework tf
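
The decoding loop sketched below captures the recurrent structure: attend to one region per step, segment the dominant object inside it, and stop when the confidence score drops. All callables are assumed placeholders for the paper's sub-networks, not its actual interfaces:

```python
def segment_instances(features, attend, segment, init_state, max_instances=8):
    """Recurrent decoding loop: attend to one region of interest per step,
    segment the dominant object inside it, stop when confidence drops."""
    state, masks = init_state, []
    for _ in range(max_instances):
        box, score, state = attend(features, state)   # region proposal + stop score
        if score < 0.5:                               # counting process says "done"
            break
        masks.append(segment(features, box))          # one instance mask per step
    return masks
```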