April 1, 2020

3234 words 16 mins read

Paper Group NANR 87

PROGRESSIVE LEARNING AND DISENTANGLEMENT OF HIERARCHICAL REPRESENTATIONS. Learning transport cost from subset correspondence. On Robustness of Neural Ordinary Differential Equations. Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?. Sign Bits Are All You Need for Black-Box Attacks. Biologically inspired sleep algorithm for inc …

PROGRESSIVE LEARNING AND DISENTANGLEMENT OF HIERARCHICAL REPRESENTATIONS

Title PROGRESSIVE LEARNING AND DISENTANGLEMENT OF HIERARCHICAL REPRESENTATIONS
Authors Anonymous
Abstract Learning rich representations from data is an important task for deep generative models such as the variational auto-encoder (VAE). However, by extracting high-level abstractions in the bottom-up inference process, the goal of preserving all factors of variation for top-down generation is compromised. Motivated by the concept of “starting small”, we present a strategy to progressively learn independent hierarchical representations from high to low levels of abstraction. The model starts by learning the most abstract representation and then progressively grows the network architecture to introduce new representations at different levels of abstraction. We quantitatively demonstrate the ability of the presented model to improve disentanglement over existing works on two benchmark datasets using three disentanglement metrics, including a new metric we propose to complement the previously presented metric of mutual information gap. We further present both qualitative and quantitative evidence on how the progression of learning improves the disentangling of hierarchical representations. By drawing on the respective advantages of hierarchical representation learning and progressive learning, this is to our knowledge the first attempt to improve disentanglement by progressively growing the capacity of a VAE to learn hierarchical representations.
Tasks Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=SJxpsxrYPS
PDF https://openreview.net/pdf?id=SJxpsxrYPS
PWC https://paperswithcode.com/paper/progressive-learning-and-disentanglement-of
Repo
Framework
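
To make the progressive-growing idea concrete, here is a minimal PyTorch sketch of a VAE whose latent groups are faded in one at a time, from the most abstract group onward. The architecture, group sizes, and fade-in schedule below are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class ProgressiveLatentVAE(nn.Module):
    """Toy VAE with several latent groups (high- to low-level) that are switched on
    progressively during training: a group contributes to the decoder input and to
    the KL term only once its fade-in coefficient alpha > 0."""
    def __init__(self, x_dim=784, z_dims=(4, 8, 16), hidden=256):
        super().__init__()
        self.z_dims = z_dims
        z_total = sum(z_dims)
        self.enc = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2 * z_total))
        self.dec = nn.Sequential(nn.Linear(z_total, hidden), nn.ReLU(), nn.Linear(hidden, x_dim))

    def forward(self, x, alphas):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # Fade each latent group in or out according to its schedule.
        mask = torch.cat([torch.full((d,), a) for d, a in zip(self.z_dims, alphas)]).to(x)
        recon = self.dec(z * mask)
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp())
        recon_loss = ((recon - x) ** 2).sum(-1).mean()
        return recon_loss + (kl * mask).sum(-1).mean()

def fade_in_schedule(step, group, steps_per_group=5000, ramp=1000):
    """alpha for a latent group: 0 before its phase, ramping linearly to 1 during it."""
    return float(min(max((step - group * steps_per_group) / ramp, 0.0), 1.0))

model = ProgressiveLatentVAE()
x = torch.rand(32, 784)
loss = model(x, alphas=[fade_in_schedule(step=6000, group=g) for g in range(3)])
loss.backward()
```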

Learning transport cost from subset correspondence

Title Learning transport cost from subset correspondence
Authors Anonymous
Abstract Learning to align multiple datasets is an important problem with many applications, and it is especially useful when we need to integrate multiple experiments or correct for confounding. Optimal transport (OT) is a principled approach to align datasets, but a key challenge in applying OT is that we need to specify a cost function that accurately captures how the two datasets are related. Reliable cost functions are typically not available, and practitioners often resort to a hand-crafted or Euclidean cost even when it may not be appropriate. In this work, we investigate how to learn the cost function using a small amount of side information which is often available. The side information we consider captures subset correspondence—i.e., certain subsets of points in the two datasets are known to be related. For example, we may have some images labeled as cars in both datasets, or we may have a common annotated cell type in single-cell data from two batches. We develop an end-to-end optimizer (OT-SI) that differentiates through the Sinkhorn algorithm and effectively learns a suitable cost function from side information. In systematic experiments on images, marriage matching, and single-cell RNA-seq, our method substantially outperforms state-of-the-art benchmarks.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SJlRUkrFPS
PDF https://openreview.net/pdf?id=SJlRUkrFPS
PWC https://paperswithcode.com/paper/learning-transport-cost-from-subset
Repo
Framework
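
A rough PyTorch sketch of the core mechanism: a differentiable log-domain Sinkhorn solver whose cost is parameterized by a learnable matrix, trained so that transport mass concentrates on the known corresponding subsets. The quadratic cost parameterization `A` and the subset loss below are illustrative assumptions, not the authors' implementation.

```python
import math
import torch

def sinkhorn_plan(C, eps=0.1, iters=100):
    """Differentiable log-domain Sinkhorn: entropic OT plan for cost matrix C (n x m)
    with uniform marginals; gradients flow through the iterations."""
    n, m = C.shape
    log_a = torch.full((n,), -math.log(n))
    log_b = torch.full((m,), -math.log(m))
    f, g = torch.zeros(n), torch.zeros(m)
    for _ in range(iters):
        f = -eps * torch.logsumexp((g[None, :] - C) / eps + log_b[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - C) / eps + log_a[:, None], dim=0)
    return torch.exp((f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :])

# Learnable Mahalanobis-style cost c(x, y) = ||A(x - y)||^2 (an illustrative parameterization).
A = torch.nn.Parameter(torch.eye(2))
opt = torch.optim.Adam([A], lr=1e-2)
X, Y = torch.randn(64, 2), torch.randn(64, 2)   # two datasets to align
sx, sy = torch.arange(16), torch.arange(16)     # indices of a known corresponding subset

for step in range(200):
    diff = (X[:, None, :] - Y[None, :, :]) @ A.T
    C = (diff ** 2).sum(-1)
    P = sinkhorn_plan(C)
    # Side-information loss: maximize the transport mass between the matched subsets.
    loss = -torch.log(P[sx][:, sy].sum() + 1e-9)
    opt.zero_grad()
    loss.backward()
    opt.step()
```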

On Robustness of Neural Ordinary Differential Equations

Title On Robustness of Neural Ordinary Differential Equations
Authors Anonymous
Abstract Neural ordinary differential equations (ODEs) have recently been attracting increasing attention in various research domains. There has been some work studying the optimization issues and approximation capabilities of neural ODEs, but their robustness is still unclear. In this work, we fill this important gap by exploring the robustness properties of neural ODEs both empirically and theoretically. We first present an empirical study of the robustness of neural ODE-based networks (ODENets) by exposing them to inputs with various types of perturbations and subsequently investigating how the corresponding outputs change. In contrast to conventional convolutional neural networks (CNNs), we find that ODENets are more robust against both random Gaussian perturbations and adversarial examples. We then provide an insightful understanding of this phenomenon by exploiting a desirable property of the flow of a continuous-time ODE, namely that integral curves are non-intersecting. Our work suggests that, due to their intrinsic robustness, it is promising to use neural ODEs as a basic block for building robust deep network models. To further enhance the robustness of vanilla neural ODEs, we propose the time-invariant steady neural ODE (TisODE), which regularizes the flow on perturbed data via the time-invariant property and the imposition of a steady-state constraint. We show that the TisODE method outperforms vanilla neural ODEs and can also work in conjunction with other state-of-the-art architectural methods to build more robust deep networks.
Tasks Adversarial Attack
Published 2020-01-01
URL https://openreview.net/forum?id=B1e9Y2NYvS
PDF https://openreview.net/pdf?id=B1e9Y2NYvS
PWC https://paperswithcode.com/paper/on-robustness-of-neural-ordinary-differential-1
Repo
Framework
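
A minimal sketch of the two ingredients named in the abstract, a time-invariant vector field and a steady-state penalty, using a hand-rolled Euler integrator; the network sizes, horizon, and penalty weight are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class TimeInvariantODEBlock(nn.Module):
    """ODE block whose vector field f(x) does not depend on t (time-invariant)."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, x, T=1.0, steps=20):
        h = T / steps
        for _ in range(steps):          # fixed-step Euler integration of dx/dt = f(x)
            x = x + h * self.f(x)
        return x

def steady_state_penalty(block, x_T, extra_T=1.0, steps=20):
    """Penalize further drift of the state beyond the integration horizon,
    encouraging trajectories to settle into a steady state."""
    x_end = block(x_T, T=extra_T, steps=steps)
    return ((x_end - x_T) ** 2).mean()

block = TimeInvariantODEBlock(dim=8)
x0 = torch.randn(32, 8)
x_T = block(x0)
loss = x_T.pow(2).mean() + 0.1 * steady_state_penalty(block, x_T)  # task loss is a placeholder
loss.backward()
```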

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?

Title Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
Authors Anonymous
Abstract Hierarchical reinforcement learning has demonstrated significant success at solving difficult reinforcement learning (RL) tasks. Previous works have motivated the use of hierarchy by appealing to a number of intuitive benefits, including learning over temporally extended transitions, exploring over temporally extended periods, and training and exploring in a more semantically meaningful action space, among others. However, in fully observed, Markovian settings, it is not immediately clear why hierarchical RL should provide benefits over standard “shallow” RL architectures. In this work, we isolate and evaluate the claimed benefits of hierarchical RL on a suite of tasks encompassing locomotion, navigation, and manipulation. Surprisingly, we find that most of the observed benefits of hierarchy can be attributed to improved exploration, as opposed to easier policy learning or imposed hierarchical structures. Given this insight, we present exploration techniques inspired by hierarchy that achieve performance competitive with hierarchical RL while at the same time being much simpler to use and implement.
Tasks Hierarchical Reinforcement Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rJgSk04tDH
PDF https://openreview.net/pdf?id=rJgSk04tDH
PWC https://paperswithcode.com/paper/why-does-hierarchy-sometimes-work-so-well-in-1
Repo
Framework

Sign Bits Are All You Need for Black-Box Attacks

Title Sign Bits Are All You Need for Black-Box Attacks
Authors Anonymous
Abstract We present a novel black-box adversarial attack algorithm with state-of-the-art model evasion rates for query efficiency under $\ell_\infty$ and $\ell_2$ metrics. It exploits a \textit{sign-based}, rather than magnitude-based, gradient estimation approach that shifts the gradient estimation from continuous to binary black-box optimization. It adaptively constructs queries to estimate the gradient, each query relying upon the previous one, rather than re-estimating the gradient at each step with random query construction. Its reliance on sign bits yields a smaller memory footprint, and it requires neither hyperparameter tuning nor dimensionality reduction. Further, its theoretical performance is guaranteed, and it can characterize adversarial subspaces better than white-box gradient-aligned subspaces. On two public black-box attack challenges and a model robustly trained against transfer attacks, the algorithm’s evasion rates surpass all submitted attacks. For a suite of published models, the algorithm is $3.8\times$ less failure-prone while spending $2.5\times$ fewer queries than the best combination of state-of-the-art algorithms. For example, it evades a standard MNIST model using just $12$ queries on average. Similar performance is observed on a standard ImageNet model with an average of $579$ queries.
Tasks Adversarial Attack, Dimensionality Reduction
Published 2020-01-01
URL https://openreview.net/forum?id=SygW0TEFwH
PDF https://openreview.net/pdf?id=SygW0TEFwH
PWC https://paperswithcode.com/paper/sign-bits-are-all-you-need-for-black-box
Repo
Framework
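
The sign-based estimation can be sketched as a greedy divide-and-conquer search over sign flips. The block-splitting scheme below is a simplification of the paper's algorithm, shown only to convey the idea of spending queries on sign bits rather than on gradient magnitudes.

```python
import numpy as np

def estimate_gradient_sign(loss_fn, x, delta=0.05, max_queries=200):
    """Greedy sign estimation: start from an all-ones sign vector, try flipping
    progressively smaller blocks of coordinates, and keep a flip whenever it
    increases the finite-difference loss gain. Simplified sketch, not the paper's
    exact algorithm."""
    n = x.size
    s = np.ones(n)
    base = loss_fn(x)
    best = loss_fn(x + delta * s.reshape(x.shape)) - base
    queries = 2
    blocks = [np.arange(n)]
    while blocks and queries < max_queries:
        next_blocks = []
        for idx in blocks:
            if queries >= max_queries:
                break
            s[idx] *= -1                                            # tentatively flip this block
            gain = loss_fn(x + delta * s.reshape(x.shape)) - base
            queries += 1
            if gain > best:
                best = gain                                         # keep the flip
            else:
                s[idx] *= -1                                        # revert
            if len(idx) > 1:
                half = len(idx) // 2
                next_blocks += [idx[:half], idx[half:]]
        blocks = next_blocks
    return s.reshape(x.shape)

# The estimated sign vector can then drive a single FGSM-style step:
# x_adv = np.clip(x + eps * estimate_gradient_sign(loss_fn, x), 0, 1)
```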

Biologically inspired sleep algorithm for increased generalization and adversarial robustness in deep neural networks

Title Biologically inspired sleep algorithm for increased generalization and adversarial robustness in deep neural networks
Authors Anonymous
Abstract Current artificial neural networks (ANNs) can excel at a variety of tasks ranging from image classification to spam detection through training on large datasets of labeled data. While the trained network usually performs well on testing data similar to the training data, certain inputs that differ even slightly from the training data may trigger unpredictable behavior. Due to this limitation, it is possible to generate inputs with very small designed perturbations that result in misclassification. These adversarial attacks present a security risk to deployed ANNs and indicate a divergence between how ANNs and humans perform classification. Humans behave robustly in the presence of noise and are capable of correctly classifying objects that are occluded, blurred, or otherwise distorted. It has been hypothesized that sleep promotes generalization and improves robustness against noise in animals and humans. In this work, we utilize a biologically inspired sleep phase in ANNs and demonstrate the benefit of sleep in defending against adversarial attacks as well as in increasing ANN classification robustness. We compare the sleep algorithm’s performance on various robustness tasks with two previously proposed adversarial defenses, defensive distillation and fine-tuning. We report an increase in robustness after sleep to adversarial attacks as well as to general image distortions for three datasets: MNIST, CUB200, and a toy dataset. Overall, these results demonstrate the potential of biologically inspired solutions to solve existing problems in ANNs and guide the development of more robust, human-like ANNs.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=r1xGnA4Kvr
PDF https://openreview.net/pdf?id=r1xGnA4Kvr
PWC https://paperswithcode.com/paper/biologically-inspired-sleep-algorithm-for-1
Repo
Framework

Depth creates no more spurious local minima in linear networks

Title Depth creates no more spurious local minima in linear networks
Authors Anonymous
Abstract We show that for any convex differentiable loss, a deep linear network has no spurious local minima as long as the same holds for the two-layer case. This reduction greatly simplifies the study of the existence of spurious local minima in deep linear networks. When applied to the quadratic loss, our result immediately implies the powerful result of Kawaguchi (2016). Further, with the recent work by Zhou & Liang (2018), we can remove all the assumptions in (Kawaguchi, 2016). This property also holds for more general “multi-tower” linear networks. Our proof builds on the work of (Laurent & von Brecht, 2018) and develops a new perturbation argument to show that any spurious local minimum must have full rank, a structural property which can be useful more generally.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Bkx5XyrtPS
PDF https://openreview.net/pdf?id=Bkx5XyrtPS
PWC https://paperswithcode.com/paper/depth-creates-no-more-spurious-local-minima-1
Repo
Framework

Implementation Matters in Deep RL: A Case Study on PPO and TRPO

Title Implementation Matters in Deep RL: A Case Study on PPO and TRPO
Authors Anonymous
Abstract We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms, Proximal Policy Optimization and Trust Region Policy Optimization. We investigate the consequences of “code-level optimizations”: algorithm augmentations found only in implementations or described as auxiliary details to the core algorithm. Seemingly of secondary importance, such optimizations have a major impact on agent behavior. Our results show that they (a) are responsible for most of PPO’s gain in cumulative reward over TRPO, and (b) fundamentally change how RL methods function. These insights show the difficulty, and importance, of attributing performance gains in deep reinforcement learning.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=r1etN1rtPB
PDF https://openreview.net/pdf?id=r1etN1rtPB
PWC https://paperswithcode.com/paper/implementation-matters-in-deep-rl-a-case
Repo
Framework
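
One frequently cited example of such a code-level optimization in PPO implementations is value-function clipping, sketched below. This is an illustrative snippet, not code from the paper's case study.

```python
import torch

def clipped_value_loss(values, old_values, returns, clip_eps=0.2):
    """Value-function clipping, a typical 'code-level optimization' in PPO
    implementations: the value loss is the element-wise max of the unclipped and
    clipped squared errors, mirroring the clipped policy objective."""
    values_clipped = old_values + (values - old_values).clamp(-clip_eps, clip_eps)
    loss_unclipped = (values - returns) ** 2
    loss_clipped = (values_clipped - returns) ** 2
    return 0.5 * torch.max(loss_unclipped, loss_clipped).mean()
```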

SNODE: Spectral Discretization of Neural ODEs for System Identification

Title SNODE: Spectral Discretization of Neural ODEs for System Identification
Authors Anonymous
Abstract This paper proposes the use of spectral element methods \citep{canuto_spectral_1988} for fast and accurate training of Neural Ordinary Differential Equations (ODE-Nets; \citealp{Chen2018NeuralOD}) for system identification. This is achieved by expressing their dynamics as a truncated series of Legendre polynomials. The series coefficients, as well as the network weights, are computed by minimizing the weighted sum of the loss function and the violation of the ODE-Net dynamics. The problem is solved by coordinate descent that alternately minimizes, with respect to the coefficients and the weights, two unconstrained sub-problems using standard backpropagation and gradient methods. The resulting optimization scheme is fully time-parallel and results in a low memory footprint. Experimental comparison to standard methods, such as backpropagation through explicit solvers and the adjoint technique \citep{Chen2018NeuralOD}, on training surrogate models of small and medium-scale dynamical systems shows that it is at least one order of magnitude faster at reaching a comparable value of the loss function. The corresponding testing MSE is one order of magnitude smaller as well, suggesting improved generalization capabilities.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Sye0XkBKvS
PDF https://openreview.net/pdf?id=Sye0XkBKvS
PWC https://paperswithcode.com/paper/snode-spectral-discretization-of-neural-odes
Repo
Framework
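
A small sketch of the spectral idea: represent the trajectory as a truncated Legendre series and fit its coefficients by penalizing the violation of a known ODE plus an initial-condition term. In the paper the ODE-Net weights are optimized jointly with the coefficients; this toy example fixes the vector field to dx/dt = -x to keep it self-contained.

```python
import torch

def legendre_basis(t, degree):
    """Legendre polynomials P_0..P_degree and their derivatives at the 1-D tensor t,
    via (n+1) P_{n+1} = (2n+1) t P_n - n P_{n-1} and P'_{n+1} = P'_{n-1} + (2n+1) P_n."""
    P, dP = [torch.ones_like(t), t], [torch.zeros_like(t), torch.ones_like(t)]
    for n in range(1, degree):
        P.append(((2 * n + 1) * t * P[n] - n * P[n - 1]) / (n + 1))
        dP.append(dP[n - 1] + (2 * n + 1) * P[n])
    return torch.stack(P, dim=1), torch.stack(dP, dim=1)

# Trajectory x(t) = B @ c as a truncated Legendre series on [-1, 1]; fit c by penalizing
# the residual of dx/dt = -x at collocation points, plus the initial condition x(-1) = 1.
t = torch.linspace(-1, 1, 64)
B, dB = legendre_basis(t, degree=12)
c = torch.zeros(13, requires_grad=True)
opt = torch.optim.Adam([c], lr=0.05)
for _ in range(2000):
    x = B @ c
    violation = ((dB @ c + x) ** 2).mean()   # ODE residual at the collocation points
    init = (x[0] - 1.0) ** 2                 # initial-condition penalty
    loss = violation + 10.0 * init
    opt.zero_grad()
    loss.backward()
    opt.step()

print(((B @ c).detach() - torch.exp(-(t + 1))).abs().max())  # error vs. the exact solution e^{-(t+1)}
```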

Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings

Title Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings
Authors Anonymous
Abstract Answering complex logical queries on large-scale incomplete knowledge graphs (KGs) is a fundamental yet challenging task. Recently, a promising approach to this problem has been to embed KG entities as well as the query into a vector space such that entities that answer the query are embedded close to the query. However, prior work models queries as single points in the vector space, which is problematic because a complex query represents a potentially large set of its answer entities, but it is unclear how such a set can be represented as a single point. Furthermore, prior work can only handle queries that use conjunctions ($\wedge$) and existential quantifiers ($\exists$). Handling queries with logical disjunctions ($\vee$) remains an open problem. Here we propose query2box, an embedding-based framework for reasoning over arbitrary queries with $\wedge$, $\vee$, and $\exists$ operators in massive and incomplete KGs. Our main insight is that queries can be embedded as boxes (i.e., hyper-rectangles), where a set of points inside the box corresponds to a set of answer entities of the query. We show that conjunctions can be naturally represented as intersections of boxes and also prove a negative result that handling disjunctions would require embedding with dimension proportional to the number of KG entities. However, we show that by transforming queries into a Disjunctive Normal Form, query2box is capable of handling arbitrary logical queries with $\wedge$, $\vee$, $\exists$ in a scalable manner. We demonstrate the effectiveness of query2box on two large KGs and show that query2box achieves up to 25% relative improvement over the state of the art.
Tasks Knowledge Graphs
Published 2020-01-01
URL https://openreview.net/forum?id=BJgr4kSFDS
PDF https://openreview.net/pdf?id=BJgr4kSFDS
PWC https://paperswithcode.com/paper/query2box-reasoning-over-knowledge-graphs-in
Repo
Framework
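
A simplified sketch of the box machinery: boxes are (center, offset) pairs, relations translate and grow them, conjunctions intersect them, and entities are scored by their distance to the final box. The intersection rule and the weighting `alpha` below are simplifications of the paper's learned operators.

```python
import torch

def box_distance(entity, center, offset, alpha=0.2):
    """Distance from an entity embedding to a box with the given center and
    (non-negative) offset: an L1 term for how far the point lies outside the box,
    plus a down-weighted L1 term for where it sits inside."""
    q_max, q_min = center + offset, center - offset
    outside = (torch.relu(entity - q_max) + torch.relu(q_min - entity)).sum(-1)
    inside = (center - torch.maximum(torch.minimum(entity, q_max), q_min)).abs().sum(-1)
    return outside + alpha * inside

def project(center, offset, rel_center, rel_offset):
    """Relation projection: translate the box center and grow the offset."""
    return center + rel_center, offset + rel_offset

def intersect(centers, offsets):
    """Simplified conjunction: mean of centers and element-wise minimum of offsets
    (the paper learns an attention-based center; this is an illustrative stand-in)."""
    return centers.mean(dim=0), offsets.min(dim=0).values

# Toy query: which entities lie in the intersection of two relation-projected boxes?
d = 8
anchor = torch.zeros(d)
c1, o1 = project(anchor, torch.zeros(d), torch.randn(d), torch.rand(d))
c2, o2 = project(anchor, torch.zeros(d), torch.randn(d), torch.rand(d))
qc, qo = intersect(torch.stack([c1, c2]), torch.stack([o1, o2]))
entities = torch.randn(100, d)
scores = -box_distance(entities, qc, qo)   # higher score = more likely an answer entity
```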

Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees

Title Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees
Authors Anonymous
Abstract We propose a meta path planning algorithm named \emph{Neural Exploration-Exploitation Trees~(NEXT)} that learns from prior experience to solve new path planning problems in high-dimensional continuous state and action spaces. Compared to more classical sampling-based methods like RRT, our approach achieves much better sample efficiency in high dimensions and can benefit from prior experience of planning in similar environments. More specifically, NEXT exploits a novel neural architecture which can learn promising search directions from problem structures. The learned prior is then integrated into a UCB-type algorithm to achieve an online balance between \emph{exploration} and \emph{exploitation} when solving a new problem. We conduct thorough experiments to show that NEXT accomplishes new planning problems with more compact search trees and significantly outperforms state-of-the-art methods on several benchmarks.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJgJDAVKvB
PDF https://openreview.net/pdf?id=rJgJDAVKvB
PWC https://paperswithcode.com/paper/learning-to-plan-in-high-dimensions-via
Repo
Framework
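
The exploration-exploitation balance described above can be sketched as a UCB rule over search-tree nodes, where a learned neural prior supplies the exploitation term. The scoring function below is schematic, not the paper's planner.

```python
import math

def select_node(nodes, visits, learned_scores, c=1.0):
    """UCB-style node selection for tree expansion: exploit the learned prior
    (e.g., a neural estimate of how promising a node's region is) while adding
    an exploration bonus that shrinks with the node's visit count."""
    total = sum(visits.values()) + 1
    def ucb(n):
        return learned_scores[n] + c * math.sqrt(math.log(total) / (1 + visits[n]))
    return max(nodes, key=ucb)

# Example with three hypothetical tree nodes and scores from a learned model.
nodes = ["a", "b", "c"]
visits = {"a": 5, "b": 1, "c": 0}
learned_scores = {"a": 0.9, "b": 0.4, "c": 0.5}
print(select_node(nodes, visits, learned_scores))  # unvisited/under-visited nodes get a bonus
```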

Implicit competitive regularization in GANs

Title Implicit competitive regularization in GANs
Authors Anonymous
Abstract Generative adversarial networks (GANs) are capable of producing high-quality samples, but they suffer from numerous issues such as instability and mode collapse during training. To combat this, we propose to model the generator and discriminator as agents acting under local information, uncertainty, and awareness of their opponent. By doing so we achieve stable convergence, even when the underlying game has no Nash equilibria. We call this mechanism \emph{implicit competitive regularization} (ICR) and show that it is present in the recently proposed \emph{competitive gradient descent} (CGD). When comparing CGD to Adam using a variety of loss functions and regularizers on CIFAR-10, CGD shows much more consistent performance, which we attribute to ICR. In our experiments, we achieve the highest inception score when using the WGAN loss (without gradient penalty or weight clipping) together with CGD. This can be interpreted as minimizing a form of integral probability metric based on ICR.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SkxaueHFPB
PDF https://openreview.net/pdf?id=SkxaueHFPB
PWC https://paperswithcode.com/paper/implicit-competitive-regularization-in-gans
Repo
Framework
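
A toy sketch of a competitive-gradient-descent-style update on a zero-sum game, approximating the equilibrium of the local bilinear game with a short fixed-point iteration instead of the exact matrix inverse used by CGD; purely illustrative, and it assumes the objective couples the two players.

```python
import torch

def cgd_step(f, x, y, eta=0.1, inner_iters=5):
    """One competitive-gradient-descent-style update for min_x max_y f(x, y).
    The local bilinear game's equilibrium satisfies
        dx = -eta * (grad_x f + D_xy f @ dy),   dy = +eta * (grad_y f + D_yx f @ dx),
    approximated here by fixed-point iteration with Hessian-vector products."""
    gx, gy = torch.autograd.grad(f(x, y), (x, y), create_graph=True)
    dx, dy = torch.zeros_like(x), torch.zeros_like(y)
    for _ in range(inner_iters):
        dxy_dy = torch.autograd.grad(gy, x, grad_outputs=dy, retain_graph=True)[0]  # D_xy f @ dy
        dyx_dx = torch.autograd.grad(gx, y, grad_outputs=dx, retain_graph=True)[0]  # D_yx f @ dx
        dx = -eta * (gx + dxy_dy)
        dy = eta * (gy + dyx_dx)
    with torch.no_grad():
        x += dx
        y += dy

# Toy bilinear game f(x, y) = x * y, where plain simultaneous gradient descent cycles.
x = torch.tensor([1.0], requires_grad=True)
y = torch.tensor([1.0], requires_grad=True)
for _ in range(1000):
    cgd_step(lambda a, b: (a * b).sum(), x, y)
print(x.item(), y.item())  # both decay toward the Nash equilibrium at (0, 0)
```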

And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Title And the Bit Goes Down: Revisiting the Quantization of Neural Networks
Authors Anonymous
Abstract In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than of its weights. The principle of our approach is to minimize the reconstruction error of the outputs for in-domain inputs. Our method only requires a set of unlabelled data at quantization time and allows for efficient inference on CPU by using byte-aligned codebooks to store the compressed weights. We validate our approach by quantizing a high-performing ResNet-50 model to a memory size of 5 MB (a 20x compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification, and by compressing a Mask R-CNN with a 26x factor.
Tasks Object Classification, Quantization
Published 2020-01-01
URL https://openreview.net/forum?id=rJehVyrKwH
PDF https://openreview.net/pdf?id=rJehVyrKwH
PWC https://paperswithcode.com/paper/and-the-bit-goes-down-revisiting-the-1
Repo
Framework
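
The output-aware objective can be illustrated with a whole-column k-means variant: assignments are chosen by the error they induce on the layer's outputs over a batch of unlabelled inputs, rather than on the weights themselves. The paper quantizes sub-vectors with weighted k-means and byte-aligned codebooks; this sketch only conveys the objective.

```python
import numpy as np

def output_aware_kmeans(W, X, k=32, iters=25, seed=0):
    """Quantize the columns of a weight matrix W (d_in x d_out) with k codewords,
    measuring distortion on the layer outputs X @ W rather than on W itself
    (X is a batch of unlabelled in-domain inputs)."""
    rng = np.random.default_rng(seed)
    codebook = W[:, rng.choice(W.shape[1], k, replace=False)].copy()  # init from real columns
    for _ in range(iters):
        # E-step: assign each column to the codeword with the smallest *output* error.
        errs = ((X @ W)[:, :, None] - (X @ codebook)[:, None, :]) ** 2  # (n, d_out, k)
        assign = errs.sum(axis=0).argmin(axis=1)                        # (d_out,)
        # M-step: the minimizer of sum_j ||X w_j - X c||^2 over assigned columns is their mean.
        for c in range(k):
            cols = np.where(assign == c)[0]
            if len(cols) > 0:
                codebook[:, c] = W[:, cols].mean(axis=1)
    W_q = codebook[:, assign]   # compressed layer: store the codebook plus per-column indices
    return W_q, codebook, assign

# Example: a 64 -> 128 layer quantized with 32 codewords using 256 unlabelled inputs.
W = np.random.randn(64, 128)
X = np.random.randn(256, 64)
W_q, codebook, assign = output_aware_kmeans(W, X)
print(np.mean((X @ W - X @ W_q) ** 2))   # output reconstruction error after quantization
```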

Robust Local Features for Improving the Generalization of Adversarial Training

Title Robust Local Features for Improving the Generalization of Adversarial Training
Authors Anonymous
Abstract Adversarial training has been demonstrated as one of the most effective methods for training robust models to defend against adversarial examples. However, adversarially trained models often lack adversarially robust generalization on unseen testing data. Recent works show that adversarially trained models are more biased towards global structure features. Instead, in this work, we investigate the relationship between the generalization of adversarial training and robust local features, as robust local features generalize well to unseen shape variation. To learn the robust local features, we develop a Random Block Shuffle (RBS) transformation to break up the global structure features on normal adversarial examples. We then propose a new approach called Robust Local Features for Adversarial Training (RLFAT), which first learns the robust local features by adversarial training on the RBS-transformed adversarial examples, and then transfers the robust local features into the training on normal adversarial examples. Finally, we implement RLFAT in two current state-of-the-art adversarial training frameworks. Extensive experiments on the STL-10, CIFAR-10, and CIFAR-100 datasets show that RLFAT significantly improves both the adversarially robust generalization and the standard generalization of adversarial training. Additionally, we demonstrate that our models capture more local features of the objects in the images, aligning better with human perception.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1lZJpVFvr
PDF https://openreview.net/pdf?id=H1lZJpVFvr
PWC https://paperswithcode.com/paper/robust-local-features-for-improving-the
Repo
Framework
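
The Random Block Shuffle transformation itself is easy to sketch: cut each image into a k x k grid of blocks and permute them, destroying global structure while leaving local features intact. The sketch below shares one permutation across the batch for brevity; that choice is an assumption, not necessarily the paper's.

```python
import torch

def random_block_shuffle(images, k=2):
    """Random Block Shuffle (RBS): split each image into a k x k grid of blocks and
    randomly permute the blocks. Height and width are assumed divisible by k."""
    n, c, h, w = images.shape
    bh, bw = h // k, w // k
    # (n, c, h, w) -> (n, c, k, bh, k, bw) -> (n, c, k*k, bh, bw)
    blocks = images.reshape(n, c, k, bh, k, bw).permute(0, 1, 2, 4, 3, 5).reshape(n, c, k * k, bh, bw)
    perm = torch.randperm(k * k)
    blocks = blocks[:, :, perm]
    # invert the reshaping to recover full images with shuffled blocks
    return blocks.reshape(n, c, k, k, bh, bw).permute(0, 1, 2, 4, 3, 5).reshape(n, c, h, w)

x = torch.randn(8, 3, 32, 32)
x_rbs = random_block_shuffle(x, k=2)   # adversarial examples would be transformed the same way
```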

LocalGAN: Modeling Local Distributions for Adversarial Response Generation

Title LocalGAN: Modeling Local Distributions for Adversarial Response Generation
Authors Anonymous
Abstract This paper presents a new methodology for modeling the local semantic distribution of responses to a given query in a human-conversation corpus and, on this basis, explores a specified adversarial learning mechanism for training Neural Response Generation (NRG) models to build conversational agents. The proposed mechanism aims to address the training instability problem and improve the quality of the generated results of Generative Adversarial Nets (GANs) when they are applied to the response generation scenario. Our investigation begins with a thorough discussion of the objective function that general GAN architectures impose on NRG models, and the training instability problem is shown to stem from the special local distributions of conversational corpora. Consequently, an energy function is employed to estimate the status of a local area restricted by the query and its responses in the semantic space, and a mathematical approximation of this energy-based distribution is derived. Building on this foundation, a local-distribution-oriented objective is proposed and combined with the original objective, serving as a hybrid loss for the adversarial training of response generation models, named LocalGAN. Our experimental results demonstrate that reasonable local distribution modeling of the query-response corpus is of great importance to adversarial NRG, and that our proposed LocalGAN is promising for improving both the training stability and the quality of the generated results.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=B1liraVYwr
PDF https://openreview.net/pdf?id=B1liraVYwr
PWC https://paperswithcode.com/paper/localgan-modeling-local-distributions-for
Repo
Framework