April 1, 2020

Paper Group NANR 86

Meta-Learning Initializations for Image Segmentation. Measuring the Reliability of Reinforcement Learning Algorithms. How many weights are enough : can tensor factorization learn efficient policies ?. Regularization Matters in Policy Optimization. Deceptive Opponent Modeling with Proactive Network Interdiction for Stochastic Goal Recognition Contro …

Meta-Learning Initializations for Image Segmentation

Title Meta-Learning Initializations for Image Segmentation
Authors Anonymous
Abstract While meta-learning approaches that utilize neural network representations have made progress in few-shot image classification, reinforcement learning, and, more recently, image semantic segmentation, the training algorithms and model architectures have become increasingly specialized to the few-shot domain. A natural question that arises is how to develop learning systems that scale from few-shot to many-shot settings while yielding human-level performance in both. One scalable potential approach, which requires neither ensembling many models nor the computational costs of relation networks, is to meta-learn an initialization. In this work, we study first-order meta-learning of initializations for deep neural networks that must produce dense, structured predictions given an arbitrary amount of training data for a new task. Our primary contributions include (1) an extension and experimental analysis of first-order model agnostic meta-learning algorithms (including FOMAML and Reptile) to image segmentation, (2) a formalization of the generalization error of episodic meta-learning algorithms, which we leverage to decrease error on unseen tasks, (3) a novel neural network architecture built for parameter efficiency which we call EfficientLab, and (4) an empirical study of how meta-learned initializations compare to ImageNet initializations as the training set size increases. We show that meta-learned initializations for image segmentation smoothly transition from canonical few-shot learning problems to larger datasets, outperforming random and ImageNet-trained initializations. Finally, we show both theoretically and empirically that a key limitation of MAML-type algorithms is that when adapting to new tasks, a single update procedure is used that is not conditioned on the data. We find that our network, with an empirically estimated optimal update procedure, yields state-of-the-art results on the FSS-1000 dataset, while only requiring one forward pass through a single model at evaluation time.
Tasks Few-Shot Image Classification, Few-Shot Learning, Image Classification, Meta-Learning, Semantic Segmentation
Published 2020-01-01
URL https://openreview.net/forum?id=SJgdpxHFvH
PDF https://openreview.net/pdf?id=SJgdpxHFvH
PWC https://paperswithcode.com/paper/meta-learning-initializations-for-image
Repo
Framework
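
As a rough illustration of what "meta-learning an initialization" with a first-order method such as Reptile means, here is a minimal sketch on a toy problem. It is not the paper's segmentation setup: the scalar model, the tasks `y = slope * x`, the inputs `XS`, and all hyperparameters are assumptions chosen only to keep the example small.

```python
import random

XS = [-1.0, -0.5, 0.5, 1.0]  # toy inputs shared by every task (hypothetical)

def sgd_adapt(w, slope, xs=XS, lr=0.1, epochs=5):
    """Adapt the scalar weight w to the task y = slope * x with plain SGD."""
    for _ in range(epochs):
        for x in xs:
            grad = 2.0 * (w * x - slope * x) * x  # d/dw of (w*x - y)**2
            w -= lr * grad
    return w

def reptile(task_slopes, meta_steps=200, meta_lr=0.1, seed=0):
    """Meta-learn an initial weight using the Reptile outer update."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(meta_steps):
        slope = rng.choice(task_slopes)  # sample one task per episode
        w_task = sgd_adapt(w, slope)     # inner loop: adapt to that task
        w += meta_lr * (w_task - w)      # outer loop: move init toward adapted weight
    return w

w_init = reptile([1.0, 2.0, 3.0])  # meta-learned initialization
w_new = sgd_adapt(w_init, 2.0)     # adaptation to one task from that init
```

The outer update uses only the difference between adapted and initial weights, so no second-order gradients are needed, which is what makes this family of methods first-order.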

Measuring the Reliability of Reinforcement Learning Algorithms

Title Measuring the Reliability of Reinforcement Learning Algorithms
Authors Anonymous
Abstract Inadequate reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production users with the evaluation and improvement of reliability, we propose a novel set of metrics that quantitatively measure different aspects of reliability. In this work, we address variability and risk, both during training and after learning (on a fixed policy). We designed these metrics to be general-purpose, and we also designed complementary statistical tests to enable rigorous comparisons on these metrics. In this paper, we first describe the desired properties of the metrics and their design, the aspects of reliability that they measure, and their applicability to different scenarios. We then describe the statistical tests and make additional practical recommendations for reporting results. Finally, we apply our metrics to a set of common RL algorithms and environments, compare them, and analyze the results.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SJlpYJBKvH
PDF https://openreview.net/pdf?id=SJlpYJBKvH
PWC https://paperswithcode.com/paper/measuring-the-reliability-of-reinforcement
Repo
Framework

How many weights are enough : can tensor factorization learn efficient policies ?

Title How many weights are enough : can tensor factorization learn efficient policies ?
Authors Anonymous
Abstract Deep reinforcement learning requires a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation. In this work, we employ tensor factorization in order to learn more compact representations for reinforcement learning policies. We show empirically that in the low-data regime, it is possible to learn online policies with 2 to 10 times fewer total coefficients, with little to no loss of performance. We also leverage progress in second order optimization, and use the theory of wavelet scattering to further reduce the number of learned coefficients, by foregoing learning the topmost convolutional layer filters altogether. We evaluate our results on the Atari suite against recent baseline algorithms that represent the state-of-the-art in data efficiency, and get comparable results with an order of magnitude gain in weight parsimony.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=B1l3M64KwB
PDF https://openreview.net/pdf?id=B1l3M64KwB
PWC https://paperswithcode.com/paper/how-many-weights-are-enough-can-tensor
Repo
Framework

Regularization Matters in Policy Optimization

Title Regularization Matters in Policy Optimization
Authors Anonymous
Abstract Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques in training neural networks (e.g., $L_2$ regularization, dropout) have been largely ignored in RL methods, possibly because agents are typically trained and evaluated in the same environment. In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks. Interestingly, we find conventional regularization techniques on the policy networks can often bring large improvement on the task performance, and the improvement is typically more significant when the task is more difficult. We also compare with the widely used entropy regularization and find $L_2$ regularization is generally better. Our findings are further confirmed to be robust against the choice of training hyperparameters. We also study the effects of regularizing different components and find that only regularizing the policy network is typically enough. We hope our study provides guidance for future practices in regularizing policy optimization algorithms.
Tasks Continuous Control
Published 2020-01-01
URL https://openreview.net/forum?id=B1lqDertwr
PDF https://openreview.net/pdf?id=B1lqDertwr
PWC https://paperswithcode.com/paper/regularization-matters-in-policy-optimization
Repo
Framework

Deceptive Opponent Modeling with Proactive Network Interdiction for Stochastic Goal Recognition Control

Title Deceptive Opponent Modeling with Proactive Network Interdiction for Stochastic Goal Recognition Control
Authors Anonymous
Abstract Goal recognition based on observations of behaviors collected online has been used to model some potential applications. The newly formulated problem of goal recognition design aims at facilitating the online goal recognition process by performing offline redesign of the underlying environment with hard action removal. In this paper, we propose the stochastic goal recognition control (S-GRC) problem with two main stages: (1) deceptive opponent modeling based on maximum entropy regularized Markov decision processes (MDPs) and (2) goal recognition control under proactively static interdiction. For the purpose of evaluation, we propose to use the worst case distinctiveness (wcd) as a measure of the non-distinctive path without revealing the true goals. The task of S-GRC is to interdict a set of actions that improve or reduce the wcd. We empirically demonstrate that our proposed approach controls the goal recognition process based on the opponent’s deceptive behavior.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJgqalBKvH
PDF https://openreview.net/pdf?id=rJgqalBKvH
PWC https://paperswithcode.com/paper/deceptive-opponent-modeling-with-proactive
Repo
Framework

Learning The Difference That Makes A Difference With Counterfactually-Augmented Data

Title Learning The Difference That Makes A Difference With Counterfactually-Augmented Data
Authors Anonymous
Abstract Despite alarm over the reliance of machine learning systems on so-called spurious patterns in training data, the term lacks coherent meaning in standard statistical frameworks. However, the language of causality offers clarity: spurious associations are those due to a common cause (confounding) vs direct or indirect effects. In this paper, we focus on NLP, introducing methods and resources for training models insensitive to spurious patterns. Given documents and their initial labels, we task humans with revising each document to accord with a counterfactual target label, asking that the revised documents be internally coherent while avoiding any gratuitous changes. Interestingly, on sentiment analysis and natural language inference tasks, classifiers trained on original data fail on their counterfactually-revised counterparts and vice versa. Classifiers trained on combined datasets perform remarkably well, just shy of those specialized to either domain. While classifiers trained on either original or manipulated data alone are sensitive to spurious features (e.g., mentions of genre), models trained on the combined data are insensitive to this signal. We will publicly release both datasets.
Tasks Natural Language Inference, Sentiment Analysis
Published 2020-01-01
URL https://openreview.net/forum?id=Sklgs0NFvr
PDF https://openreview.net/pdf?id=Sklgs0NFvr
PWC https://paperswithcode.com/paper/learning-the-difference-that-makes-a-1
Repo
Framework

Fast is better than free: Revisiting adversarial training

Title Fast is better than free: Revisiting adversarial training
Authors Anonymous
Abstract Adversarial training, a method for learning robust deep networks, is typically assumed to be more expensive than traditional training due to the necessity of constructing adversarial examples via a first-order method like projected gradient descent (PGD). In this paper, we make the surprising discovery that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training in practice. Specifically, we show that adversarial training with the fast gradient sign method (FGSM), when combined with random initialization, is as effective as PGD-based training but has significantly lower cost. Furthermore, we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy at epsilon=8/255 in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at epsilon=2/255 in 12 hours, in comparison to past work based on “free” adversarial training which took 10 and 50 hours to reach the same respective thresholds.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BJx040EFvH
PDF https://openreview.net/pdf?id=BJx040EFvH
PWC https://paperswithcode.com/paper/fast-is-better-than-free-revisiting
Repo
Framework
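
The phrase "FGSM, when combined with random initialization" can be made concrete with a small sketch. The one below builds a single adversarial example for a logistic-regression classifier in plain Python. The model, the example values, and the step size `alpha = 1.25 * eps` are assumptions for illustration; the abstract itself only fixes the idea of a random start followed by one signed-gradient step inside the epsilon ball.

```python
import math
import random

def logistic_loss(w, b, x, y):
    """Binary cross-entropy of a linear classifier on one example, y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def loss_grad_x(w, b, x, y):
    """Gradient of the loss above with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_random_init(w, b, x, y, eps, alpha, rng):
    """One FGSM step from a random start, projected back into the eps-ball."""
    delta = [rng.uniform(-eps, eps) for _ in x]       # random initialization
    x_start = [xi + di for xi, di in zip(x, delta)]
    grad = loss_grad_x(w, b, x_start, y)
    delta = [max(-eps, min(eps, di + alpha * sign(gi)))  # signed step, then clip
             for di, gi in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]

w, b = [1.0, -2.0, 0.5], 0.1                 # hypothetical fixed classifier
x, y = [0.2, 0.4, -0.3], 1                   # hypothetical clean example
eps = 8 / 255
x_adv = fgsm_random_init(w, b, x, y, eps, 1.25 * eps, random.Random(0))
```

In adversarial training, `x_adv` would replace `x` in the parameter update for that step; the sketch stops at constructing the perturbed input.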

Disentangling neural mechanisms for perceptual grouping

Title Disentangling neural mechanisms for perceptual grouping
Authors Anonymous
Abstract Forming perceptual groups and individuating objects in visual scenes is an essential step towards visual intelligence. This ability is thought to arise in the brain from computations implemented by bottom-up, horizontal, and top-down connections between neurons. However, the relative contributions of these connections to perceptual grouping are poorly understood. We address this question by systematically evaluating neural network architectures featuring combinations of these connections on two synthetic visual tasks, which stress low-level “Gestalt” vs. high-level object cues for perceptual grouping. We show that increasing the difficulty of either task strains learning for networks that rely solely on bottom-up processing. Horizontal connections resolve this limitation on tasks with Gestalt cues by supporting incremental spatial propagation of activities, whereas top-down connections rescue learning on tasks with high-level object cues by modifying coarse predictions about the position of the target object. Our findings dissociate the computational roles of bottom-up, horizontal and top-down connectivity, and demonstrate how a model featuring all of these interactions can more flexibly learn to form perceptual groups.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJxrVA4FDS
PDF https://openreview.net/pdf?id=HJxrVA4FDS
PWC https://paperswithcode.com/paper/disentangling-neural-mechanisms-for-1
Repo
Framework

Variational Constrained Reinforcement Learning with Application to Planning at Roundabout

Title Variational Constrained Reinforcement Learning with Application to Planning at Roundabout
Authors Anonymous
Abstract Planning at roundabouts is crucial for autonomous driving in urban and rural environments. Reinforcement learning is promising not only for dealing with complicated environments but also for taking safety constraints into account as a constrained Markov Decision Process. However, the safety constraints must be explicitly formulated mathematically, which is challenging for planning at roundabouts due to the unpredictable dynamic behavior of the obstacles. It is therefore desirable to discriminate the obstacles’ states as either safe or unsafe, which is known as situation awareness modeling. In this paper, we combine variational learning and constrained reinforcement learning to simultaneously learn a Conditional Representation Model (CRM) that encodes the states into safe and unsafe distributions, respectively, and the corresponding safe policy. Our approach is evaluated using the Simulation of Urban Mobility (SUMO) traffic simulator, and it can generalize to various traffic flows.
Tasks Autonomous Driving
Published 2020-01-01
URL https://openreview.net/forum?id=H1e3HlSFDr
PDF https://openreview.net/pdf?id=H1e3HlSFDr
PWC https://paperswithcode.com/paper/variational-constrained-reinforcement
Repo
Framework

Massively Multilingual Sparse Word Representations

Title Massively Multilingual Sparse Word Representations
Authors Anonymous
Abstract In this paper, we introduce Mamus for constructing multilingual sparse word representations. Our algorithm operates by determining a shared set of semantic units which get reutilized across languages, providing it a competitive edge both in terms of speed and evaluation performance. We demonstrate that our proposed algorithm behaves competitively to strong baselines through a series of rigorous experiments performed on downstream applications spanning dependency parsing, document classification and natural language inference. Additionally, our experiments relying on the QVEC-CCA evaluation score suggest that the proposed sparse word representations convey increased interpretability as opposed to alternative approaches. Finally, we are releasing our multilingual sparse word representations for the set of 27 typologically diverse languages on which we conducted our experiments.
Tasks Dependency Parsing, Document Classification, Natural Language Inference
Published 2020-01-01
URL https://openreview.net/forum?id=HyeYTgrFPB
PDF https://openreview.net/pdf?id=HyeYTgrFPB
PWC https://paperswithcode.com/paper/massively-multilingual-sparse-word
Repo
Framework

RaPP: Novelty Detection with Reconstruction along Projection Pathway

Title RaPP: Novelty Detection with Reconstruction along Projection Pathway
Authors Anonymous
Abstract We propose RaPP, a new methodology for novelty detection by utilizing hidden space activation values obtained from a deep autoencoder. Precisely, RaPP compares input and its autoencoder reconstruction not only in the input space but also in the hidden spaces. We show that if we feed a reconstructed input to the same autoencoder again, its activated values in a hidden space are equivalent to the corresponding reconstruction in that hidden space given the original input. In order to aggregate the hidden space activation values, we propose two metrics, which enhance the novelty detection performance. Through extensive experiments using diverse datasets, we validate that RaPP improves novelty detection performances of autoencoder-based approaches. Besides, we show that RaPP outperforms recent novelty detection methods evaluated on popular benchmarks.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HkgeGeBYDB
PDF https://openreview.net/pdf?id=HkgeGeBYDB
PWC https://paperswithcode.com/paper/rapp-novelty-detection-with-reconstruction
Repo
Framework

Noise Regularization for Conditional Density Estimation

Title Noise Regularization for Conditional Density Estimation
Authors Anonymous
Abstract Modelling statistical relationships beyond the conditional mean is crucial in many settings. Conditional density estimation (CDE) aims to learn the full conditional probability density from data. Though highly expressive, neural network based CDE models can suffer from severe over-fitting when trained with the maximum likelihood objective. Due to the inherent structure of such models, classical regularization approaches in the parameter space are rendered ineffective. To address this issue, we develop a model-agnostic noise regularization method for CDE that adds random perturbations to the data during training. We demonstrate that the proposed approach corresponds to a smoothness regularization and prove its asymptotic consistency. In our experiments, noise regularization significantly and consistently outperforms other regularization methods across seven data sets and three CDE models. The effectiveness of noise regularization makes neural network based CDE the preferable method over previous non- and semi-parametric approaches, even when training data is scarce.
Tasks Density Estimation
Published 2020-01-01
URL https://openreview.net/forum?id=rygtPhVtDS
PDF https://openreview.net/pdf?id=rygtPhVtDS
PWC https://paperswithcode.com/paper/noise-regularization-for-conditional-density-1
Repo
Framework
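
The noise regularization described in this abstract, adding random perturbations to the data during training, can be sketched in a few lines. This is a minimal illustration under stated assumptions: Gaussian noise, the standard deviations `std_x` and `std_y`, and the toy batch are all hypothetical, and the density model itself is left out.

```python
import random

def perturb_batch(xs, ys, std_x, std_y, rng):
    """Return noisy copies of inputs xs (lists of floats) and targets ys (floats)."""
    noisy_xs = [[v + rng.gauss(0.0, std_x) for v in x] for x in xs]
    noisy_ys = [y + rng.gauss(0.0, std_y) for y in ys]
    return noisy_xs, noisy_ys

rng = random.Random(0)
xs = [[0.0, 1.0], [2.0, 3.0]]  # hypothetical conditioning variables
ys = [0.5, -1.0]               # hypothetical targets

for epoch in range(3):
    # Fresh noise is drawn every epoch; the clean data is never modified.
    noisy_xs, noisy_ys = perturb_batch(xs, ys, std_x=0.1, std_y=0.1, rng=rng)
    # A maximum-likelihood update of any CDE model on (noisy_xs, noisy_ys)
    # would go here, which is what makes the scheme model-agnostic.
```

Because the perturbation touches only the data, the same function can wrap any conditional density estimator without changes to its parameters or loss.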

A Deep Recurrent Neural Network via Unfolding Reweighted l1-l1 Minimization

Title A Deep Recurrent Neural Network via Unfolding Reweighted l1-l1 Minimization
Authors Anonymous
Abstract Deep unfolding methods design deep neural networks as learned variations of optimization methods. These networks have been shown to achieve faster convergence and higher accuracy than the original optimization methods. In this line of research, this paper develops a novel deep recurrent neural network (coined reweighted-RNN) by unfolding a reweighted l1-l1 minimization algorithm and applies it to the task of sequential signal reconstruction. To the best of our knowledge, this is the first deep unfolding method that explores reweighted minimization. Due to the underlying reweighted minimization model, our RNN has a different soft-thresholding function (alias, different activation function) for each hidden unit in each layer. Furthermore, it has higher network expressivity than existing deep unfolding RNN models due to the over-parameterizing weights. Moreover, we establish theoretical generalization error bounds for the proposed reweighted-RNN model by means of Rademacher complexity. The bounds reveal that the parameterization of the proposed reweighted-RNN ensures good generalization. We apply the proposed reweighted-RNN to the problem of video-frame reconstruction from low-dimensional measurements, that is, sequential frame reconstruction. The experimental results on the moving MNIST dataset demonstrate that the proposed deep reweighted-RNN significantly outperforms existing RNN models.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=B1eYlgBYPH
PDF https://openreview.net/pdf?id=B1eYlgBYPH
PWC https://paperswithcode.com/paper/a-deep-recurrent-neural-network-via-unfolding
Repo
Framework
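
To show the general idea of deep unfolding with a per-unit soft-thresholding activation, the sketch below unrolls plain ISTA for an l1-regularized least-squares problem into a fixed number of layers. This swaps in a simpler algorithm than the paper's reweighted l1-l1 minimization, and the matrix `A`, the step size, and the thresholds are assumed values; in a learned network they would be trainable parameters.

```python
import math

def soft_threshold(values, thresholds):
    """Shrink each value toward zero by its own threshold (per-unit activation)."""
    return [math.copysign(max(abs(v) - t, 0.0), v)
            for v, t in zip(values, thresholds)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def ista_layer(h, y, A, step, thresholds):
    """One unrolled iteration: gradient step on ||A h - y||^2 / 2, then shrinkage."""
    residual = [p - q for p, q in zip(matvec(A, h), y)]
    grad = matvec(transpose(A), residual)
    return soft_threshold([hi - step * gi for hi, gi in zip(h, grad)], thresholds)

def unfolded_ista(y, A, step, thresholds, depth):
    """Stack `depth` identical layers, starting from the zero vector."""
    h = [0.0] * len(A[0])
    for _ in range(depth):
        h = ista_layer(h, y, A, step, thresholds)
    return h

A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # toy measurement matrix
y = [3.0, -0.5, -2.0]
h = unfolded_ista(y, A, step=1.0, thresholds=[1.0, 1.0, 0.5], depth=3)
```

With the identity as measurement matrix, each entry of `y` is shrunk by its own threshold, and the small middle entry is set to zero, which is the sparsifying behavior the activation is meant to provide.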

Continual learning with hypernetworks

Title Continual learning with hypernetworks
Authors Anonymous
Abstract Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key feature: instead of recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing task-specific weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving state-of-the-art performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display a very large capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, when the number of trainable weights is comparable to or smaller than the target network size. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.
Tasks Continual Learning, Transfer Learning
Published 2020-01-01
URL https://openreview.net/forum?id=SJgwNerKvB
PDF https://openreview.net/pdf?id=SJgwNerKvB
PWC https://paperswithcode.com/paper/continual-learning-with-hypernetworks
Repo
Framework

SVQN: Sequential Variational Soft Q-Learning Networks

Title SVQN: Sequential Variational Soft Q-Learning Networks
Authors Shiyu Huang, Hang Su, Jun Zhu, Ting Chen
Abstract Partially Observable Markov Decision Processes (POMDPs) are popular and flexible models for real-world decision-making applications that demand information from past observations to make optimal decisions. Standard reinforcement learning algorithms for solving Markov Decision Process (MDP) tasks are not applicable, as they cannot infer the unobserved states. In this paper, we propose a novel algorithm for POMDPs, named sequential variational soft Q-learning networks (SVQNs), which formalizes the inference of hidden states and maximum entropy reinforcement learning (MERL) under a unified graphical model and optimizes the two modules jointly. We further design a deep recurrent neural network to reduce the computational complexity of the algorithm. Experimental results show that SVQNs can utilize past information to help decision making for efficient inference, and outperform other baselines on several challenging tasks. Our ablation study shows that SVQNs generalize over time and are robust to disturbances in the observations.
Tasks Decision Making, Q-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=r1xPh2VtPB
PDF https://openreview.net/pdf?id=r1xPh2VtPB
PWC https://paperswithcode.com/paper/svqn-sequential-variational-soft-q-learning
Repo
Framework