Paper Group NANR 64
A Simple Randomization Technique for Generalization in Deep Reinforcement Learning. Deep 3D Pan via Local adaptive “t-shaped” convolutions with global and local adaptive dilations. SPREAD DIVERGENCE. Simplicial Complex Networks. Identifying through Flows for Recovering Latent Representations. Adaptive network sparsification with dependent variation …
A Simple Randomization Technique for Generalization in Deep Reinforcement Learning
Title | A Simple Randomization Technique for Generalization in Deep Reinforcement Learning |
Authors | Anonymous |
Abstract | Deep reinforcement learning (RL) agents often fail to generalize to unseen environments (yet semantically similar to the trained ones), particularly when they are trained on high-dimensional state spaces such as images. In this paper, we propose a simple technique to improve the generalization ability of deep RL agents by introducing a randomized (convolutional) neural network that randomly perturbs input observations. It enables trained agents to adapt to new domains by learning robust features that are invariant across varied and randomized input observations. Furthermore, we propose an inference method based on Monte Carlo approximation to reduce the variance induced by this randomization. The proposed method significantly outperforms conventional techniques, including various regularization and data augmentation techniques, across 2D CoinRun, 3D DeepMind Lab exploration, and 3D robotics control tasks. |
Tasks | Data Augmentation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HJgcvJBFvB |
https://openreview.net/pdf?id=HJgcvJBFvB | |
PWC | https://paperswithcode.com/paper/a-simple-randomization-technique-for |
Repo | |
Framework | |
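A minimal sketch of the randomized-input idea described above, assuming a PyTorch setup; the layer shape, re-initialization schedule, and function names are illustrative rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class RandomizedInput(nn.Module):
    """Perturb observations with a randomly re-initialized conv layer so the
    agent learns features invariant to the induced visual variation."""
    def __init__(self, channels: int = 3, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.reset()

    def reset(self):
        # Re-draw random weights (e.g. once per training iteration).
        nn.init.xavier_normal_(self.conv.weight)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():           # the random layer itself is never trained
            return self.conv(obs)

def mc_action_distribution(policy, obs, rand_layer, n_samples: int = 10):
    """Monte Carlo inference: average policy outputs over several
    independently randomized views of the same observation."""
    outs = []
    for _ in range(n_samples):
        rand_layer.reset()
        outs.append(policy(rand_layer(obs)))
    return torch.stack(outs).mean(dim=0)
```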
Deep 3D Pan via Local adaptive “t-shaped” convolutions with global and local adaptive dilations
Title | Deep 3D Pan via Local adaptive “t-shaped” convolutions with global and local adaptive dilations |
Authors | Anonymous |
Abstract | Recent advances in deep learning have shown promising results in many low-level vision tasks. However, solving single-image-based view synthesis is still an open problem. In particular, the generation of new images at parallel camera views given a single input image is of great interest, as it enables 3D visualization of the 2D input scenery. We propose a novel network architecture to perform stereoscopic view synthesis at arbitrary camera positions along the X-axis, or Deep 3D Pan, with “t-shaped” adaptive kernels equipped with globally and locally adaptive dilations. Our proposed network architecture, the monster-net, is devised with a novel t-shaped adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate the global camera shift and handle the local 3D geometries of the target image’s pixels for the synthesis of naturally looking 3D panned views given a 2D input image. Extensive experiments were performed on the KITTI, CityScapes and our VXXLXX_STEREO indoors dataset to prove the efficacy of our method. Our monster-net significantly outperforms the state-of-the-art (SOTA) method by a large margin in all metrics of RMSE, PSNR, and SSIM. Our proposed monster-net is capable of reconstructing more reliable image structures in synthesized images with coherent geometry. Moreover, the disparity information that can be extracted from the “t-shaped” kernel is much more reliable than that of the SOTA for the unsupervised monocular depth estimation task, confirming the effectiveness of our method. |
Tasks | Depth Estimation, Monocular Depth Estimation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=B1gF56VYPH |
https://openreview.net/pdf?id=B1gF56VYPH | |
PWC | https://paperswithcode.com/paper/deep-3d-pan-via-local-adaptive-t-shaped |
Repo | |
Framework | |
SPREAD DIVERGENCE
Title | SPREAD DIVERGENCE |
Authors | Anonymous |
Abstract | For distributions $p$ and $q$ with different supports, the divergence $\mathrm{D}(p\,\|\,q)$ may not exist. We define a spread divergence $\tilde{\mathrm{D}}(p\,\|\,q)$ on modified $p$ and $q$ and describe sufficient conditions for the existence of such a divergence. We demonstrate how to maximize the discriminatory power of a given divergence by parameterizing and learning the spread. We also give examples of using a spread divergence to train and improve implicit generative models, including linear models (Independent Components Analysis) and non-linear models (Deep Generative Networks). |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJeWHlSYDB |
https://openreview.net/pdf?id=SJeWHlSYDB | |
PWC | https://paperswithcode.com/paper/spread-divergence |
Repo | |
Framework | |
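One way to read the construction, as a sketch in generic notation (the noise kernel $K$ and the choice of base divergence are assumptions, not the paper's exact formulation): both distributions are smoothed by a common noise kernel before an ordinary divergence is taken.

```latex
% Spread both distributions with a shared noise kernel K(y|x),
% then take an ordinary divergence of the smoothed pair.
\tilde{p}(y) = \int K(y \mid x)\, p(x)\, \mathrm{d}x, \qquad
\tilde{q}(y) = \int K(y \mid x)\, q(x)\, \mathrm{d}x,
\qquad
\tilde{\mathrm{D}}(p \,\|\, q) \;\equiv\; \mathrm{D}(\tilde{p} \,\|\, \tilde{q}).
```

With, for instance, a Gaussian kernel, $\tilde{p}$ and $\tilde{q}$ share full support, so the divergence on the right-hand side is well defined even when $p$ and $q$ have disjoint supports.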
Simplicial Complex Networks
Title | Simplicial Complex Networks |
Authors | Mohammad Firouzi, Sadra Boreiri, Hamed Firouzi |
Abstract | The universal approximation property of neural networks is one of the motivations for using these models in various real-world problems. However, this property is not the only characteristic that makes neural networks unique, as there is a wide range of other approaches with a similar property. Another characteristic that makes these models interesting is that they can be trained with the backpropagation algorithm, which allows efficient gradient computation and gives these universal approximators the ability to efficiently learn complex manifolds from large amounts of data in different domains. Despite their abundant use in practice, neural networks are still not well understood, and a broad range of ongoing research studies the interpretability of neural networks. On the other hand, topological data analysis (TDA) relies on the strong theoretical framework of (algebraic) topology along with other mathematical tools for analyzing possibly complex datasets. In this work, we leverage a universal approximation theorem originating from algebraic topology to build a connection between TDA and the common neural network training framework. We introduce the notion of automatic subdivisioning and devise a particular type of neural network for regression tasks: Simplicial Complex Networks (SCNs). An SCN’s architecture is defined by a set of bias functions along with a particular policy during the forward pass, which departs from the common architecture search framework in neural networks. We believe the SCN view can be used as a step towards building interpretable deep learning models. Finally, we verify its performance on a set of regression problems. |
Tasks | Topological Data Analysis |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJlRDCVtwr |
https://openreview.net/pdf?id=SJlRDCVtwr | |
PWC | https://paperswithcode.com/paper/simplicial-complex-networks |
Repo | |
Framework | |
Identifying through Flows for Recovering Latent Representations
Title | Identifying through Flows for Recovering Latent Representations |
Authors | Anonymous |
Abstract | Identifiability, or recovery of the true latent representations from which the observed data originates, is a fundamental goal of representation learning. However, most deep generative models do not address the question of identifiability, and cannot recover the true latent sources that generate the observations. Recent work proposed identifiable generative modelling using variational autoencoders (iVAE) with a theory of identifiability. However, due to the intractability of the KL divergence between the variational approximate posterior and the true posterior, iVAE has to maximize the evidence lower bound of the marginal likelihood, leading to suboptimal solutions in both theory and practice. In contrast, we propose an identifiable framework for estimating latent representations using a flow-based model (iFlow). Our approach directly maximizes the marginal likelihood, allowing for theoretical guarantees on identifiability, without the need for variational approximations. We derive its learning objective in analytical form, making it possible to train iFlow in an end-to-end manner. Simulations on synthetic data validate the correctness and effectiveness of our proposed method and demonstrate its practical advantages over other existing methods. |
Tasks | Representation Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SklOUpEYvB |
https://openreview.net/pdf?id=SklOUpEYvB | |
PWC | https://paperswithcode.com/paper/identifying-through-flows-for-recovering-1 |
Repo | |
Framework | |
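A minimal illustration of the exact flow likelihood that such a model can maximize directly, via the change-of-variables formula; this toy affine flow is a generic sketch, not the iFlow architecture.

```python
import torch
import torch.nn as nn

class AffineFlow(nn.Module):
    """Toy invertible map z = (x - b) * exp(-s) with an exact log-likelihood."""
    def __init__(self, dim: int):
        super().__init__()
        self.s = nn.Parameter(torch.zeros(dim))   # log-scale
        self.b = nn.Parameter(torch.zeros(dim))   # shift

    def log_prob(self, x: torch.Tensor) -> torch.Tensor:
        z = (x - self.b) * torch.exp(-self.s)
        base = torch.distributions.Normal(0.0, 1.0)
        # change of variables: log p(x) = log p(z) + log |det dz/dx|
        return base.log_prob(z).sum(-1) - self.s.sum()

flow = AffineFlow(dim=4)
x = torch.randn(8, 4)
loss = -flow.log_prob(x).mean()   # maximize the marginal likelihood directly
loss.backward()
```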
Adaptive network sparsification with dependent variational beta-Bernoulli dropout
Title | Adaptive network sparsification with dependent variational beta-Bernoulli dropout |
Authors | Anonymous |
Abstract | While variational dropout approaches have been shown to be effective for network sparsification, they are still suboptimal in the sense that they set the dropout rate for each neuron without consideration of the input data. With such input-independent dropout, each neuron evolves to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss. To overcome this limitation, we propose adaptive variational dropout whose probabilities are drawn from a sparsity-inducing beta-Bernoulli prior. It allows each neuron to evolve either to be generic or to be specific to certain inputs, or to be dropped altogether. Such input-adaptive sparsity-inducing dropout allows the resulting network to tolerate a larger degree of sparsity without losing its expressive power, by removing redundancies among features. We validate our dependent variational beta-Bernoulli dropout on multiple public datasets, on which it obtains significantly more compact networks than baseline methods, with consistent accuracy improvements over the base networks. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rylfl6VFDH |
https://openreview.net/pdf?id=rylfl6VFDH | |
PWC | https://paperswithcode.com/paper/adaptive-network-sparsification-with-1 |
Repo | |
Framework | |
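A rough sketch of input-dependent stochastic gating in the spirit of the abstract; the gate network, the relaxed-Bernoulli sampling, and the penalty below are illustrative stand-ins, not the paper's beta-Bernoulli parameterization.

```python
import torch
import torch.nn as nn

class InputAdaptiveDropout(nn.Module):
    """Per-neuron keep probabilities predicted from the input, so a neuron can
    stay generic, become input-specific, or be dropped altogether."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, dim)   # predicts keep-probability logits

    def forward(self, h: torch.Tensor, temperature: float = 0.1):
        probs = torch.sigmoid(self.gate(h))
        if self.training:
            # Relaxed Bernoulli keeps the gates differentiable during training.
            z = torch.distributions.RelaxedBernoulli(temperature, probs=probs).rsample()
        else:
            z = (probs > 0.5).float()     # hard sparsification at test time
        return h * z, probs

layer = InputAdaptiveDropout(dim=64)
out, keep_probs = layer(torch.randn(32, 64))
sparsity_penalty = keep_probs.mean()      # stand-in for the beta-Bernoulli KL term
```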
Label Cleaning with Likelihood Ratio Test
Title | Label Cleaning with Likelihood Ratio Test |
Authors | Anonymous |
Abstract | To collect large-scale annotated data, it is inevitable to introduce label noise, i.e., incorrect class labels. A major challenge is to develop robust deep learning models that achieve high test performance despite training-set label noise. We introduce a novel approach that directly cleans labels in order to train a high-quality model. Our method leverages statistical principles to correct data labels and has a theoretical guarantee of correctness. In particular, we use a likelihood ratio test (LRT) to flip the labels of training data. We prove that our LRT label correction algorithm is guaranteed to flip the label so that it is consistent with the true Bayesian optimal decision rule with high probability. We incorporate our label correction algorithm into the training of deep neural networks and train models that achieve superior testing performance on multiple public datasets. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJeF_h4FwB |
https://openreview.net/pdf?id=SJeF_h4FwB | |
PWC | https://paperswithcode.com/paper/label-cleaning-with-likelihood-ratio-test |
Repo | |
Framework | |
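A hedged sketch of the flipping rule implied by a likelihood ratio test; treating softmax outputs as class likelihoods and the threshold value are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def lrt_clean_labels(probs: np.ndarray, labels: np.ndarray, threshold: float = 2.0):
    """Flip a training label when the estimated likelihood of the most probable
    class exceeds that of the given label by more than `threshold`.

    probs:  (N, C) predicted class probabilities
    labels: (N,)   possibly noisy labels
    """
    idx = np.arange(len(labels))
    pred = probs.argmax(axis=1)
    ratio = probs[idx, pred] / np.maximum(probs[idx, labels], 1e-12)
    flip = ratio > threshold
    return np.where(flip, pred, labels), flip

probs = np.array([[0.9, 0.1], [0.4, 0.6]])
labels = np.array([1, 1])
cleaned, flipped = lrt_clean_labels(probs, labels)   # first label is flipped to 0
```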
Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders
Title | Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders |
Authors | Anonymous |
Abstract | Probabilistic models with hierarchical-latent-variable structures provide state-of-the-art results amongst non-autoregressive, unsupervised density-based models. However, the most common approach to training such models, based on Variational Autoencoders, often fails to leverage deep-latent hierarchies; successful approaches require complex inference and optimisation schemes. Optimal Transport is an alternative, non-likelihood-based framework for training generative models with appealing theoretical properties, in principle allowing easier convergence between distributions during training. In this work we propose a novel approach to training models with deep-latent hierarchies based on Optimal Transport, without the need for highly bespoke models and inference networks. We show that our method enables the generative model to fully leverage its deep-latent hierarchy, and that in so doing, it is more effective than the original Wasserstein Autoencoder with the Maximum Mean Discrepancy divergence. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=ByeadyrtPB |
https://openreview.net/pdf?id=ByeadyrtPB | |
PWC | https://paperswithcode.com/paper/learning-deep-latent-hierarchies-by-stacking |
Repo | |
Framework | |
Graph Neural Networks for Soft Semi-Supervised Learning on Hypergraphs
Title | Graph Neural Networks for Soft Semi-Supervised Learning on Hypergraphs |
Authors | Anonymous |
Abstract | Graph-based semi-supervised learning (SSL) assigns labels to initially unlabelled vertices in a graph. Graph neural networks (GNNs), especially graph convolutional networks (GCNs), inspired the current state-of-the-art models for graph-based SSL problems. GCNs inherently assume that the labels of interest are numerical or categorical variables. However, in many real-world applications such as co-authorship networks, recommendation networks, etc., vertex labels can be naturally represented by probability distributions or histograms. Moreover, real-world network datasets have complex relationships going beyond pairwise associations. These relationships can be modelled naturally and flexibly by hypergraphs. In this paper, we explore GNNs for graph-based SSL of histograms. Motivated by complex relationships (those going beyond pairwise) in real-world networks, we propose a novel method for directed hypergraphs. Our work builds upon existing works on graph-based SSL of histograms derived from the theory of optimal transportation. A key contribution of this paper is to establish generalisation error bounds for a one-layer GNN within the framework of algorithmic stability. We also demonstrate our proposed methods’ effectiveness through detailed experimentation on real-world data. We have made the code available. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=ryestJBKPB |
https://openreview.net/pdf?id=ryestJBKPB | |
PWC | https://paperswithcode.com/paper/graph-neural-networks-for-soft-semi |
Repo | |
Framework | |
Neural Machine Translation with Universal Visual Representation
Title | Neural Machine Translation with Universal Visual Representation |
Authors | Anonymous |
Abstract | Though visual information has been introduced for enhancing neural machine translation (NMT), its effectiveness strongly relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations. In this paper, we present a universal visual representation learned over monolingual corpora with image annotations, which overcomes the lack of large-scale bilingual sentence-image pairs, thereby extending image applicability in NMT. In detail, a group of images with topics similar to the source sentence is retrieved from a light topic-image lookup table learned over the existing sentence-image pairs, and then encoded as image representations by a pre-trained ResNet. An attention layer with gated weighting is used to fuse the visual information and text information as input to the decoder for predicting target translations. In particular, the proposed method enables the visual information to be integrated into large-scale text-only NMT in addition to multimodal NMT. Experiments on four widely used translation datasets, including WMT’16 English-to-Romanian, WMT’14 English-to-German, WMT’14 English-to-French, and Multi30K, show that the proposed approach achieves significant improvements over strong baselines. |
Tasks | Machine Translation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Byl8hhNYPS |
https://openreview.net/pdf?id=Byl8hhNYPS | |
PWC | https://paperswithcode.com/paper/neural-machine-translation-with-universal |
Repo | |
Framework | |
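A small sketch of the gated fusion step described in the abstract; the retrieval of topic images and the pre-trained ResNet encoder are out of scope here, and the layer shapes and gating form are assumptions.

```python
import torch
import torch.nn as nn

class GatedVisualFusion(nn.Module):
    """Attend from source-sentence states to retrieved image features and fuse
    them with a learned gate before decoding."""
    def __init__(self, d_model: int, d_image: int):
        super().__init__()
        self.img_proj = nn.Linear(d_image, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, text: torch.Tensor, images: torch.Tensor) -> torch.Tensor:
        # text:   (B, T, d_model) source-sentence states
        # images: (B, K, d_image) features of the K retrieved topic images
        img = self.img_proj(images)
        visual, _ = self.attn(query=text, key=img, value=img)
        g = torch.sigmoid(self.gate(torch.cat([text, visual], dim=-1)))
        return text + g * visual          # gated residual fusion

fuse = GatedVisualFusion(d_model=512, d_image=2048)
out = fuse(torch.randn(2, 20, 512), torch.randn(2, 5, 2048))
```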
AugMix: A Simple Method to Improve Robustness and Uncertainty under Data Shift
Title | AugMix: A Simple Method to Improve Robustness and Uncertainty under Data Shift |
Authors | Anonymous |
Abstract | Modern deep neural networks can achieve high accuracy when the training distribution and test distribution are identically distributed, but this assumption is frequently violated in practice. When the train and test distributions are mismatched, accuracy can plummet. Currently there are few techniques that improve robustness to data shift. In this work, we propose a technique to improve the robustness and uncertainty estimates of image classifiers. We propose AugMix, a data processing technique that is simple to implement, adds limited computational overhead, and helps models withstand data shift. AugMix significantly improves robustness and uncertainty measures on challenging image classification benchmarks, closing the gap between previous methods and the best possible performance in some cases by more than half. |
Tasks | Image Classification |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=S1gmrxHFvB |
https://openreview.net/pdf?id=S1gmrxHFvB | |
PWC | https://paperswithcode.com/paper/augmix-a-simple-method-to-improve-robustness |
Repo | |
Framework | |
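A compact sketch of the mixing scheme the abstract refers to: several random augmentation chains are combined with Dirichlet weights and blended back with the original image via a Beta-distributed coefficient. The operation list below is a placeholder, and the consistency loss AugMix trains with is omitted.

```python
import numpy as np
from PIL import Image, ImageOps

# Placeholder ops; the method uses a wider set of label-preserving transforms.
AUGMENTATIONS = [ImageOps.autocontrast, ImageOps.equalize,
                 lambda img: img.rotate(np.random.uniform(-30, 30))]

def augmix(image: Image.Image, width: int = 3, depth: int = 2, alpha: float = 1.0):
    """Mix `width` augmentation chains (Dirichlet weights), then blend the
    result with the original image (Beta weight)."""
    ws = np.random.dirichlet([alpha] * width)
    m = np.random.beta(alpha, alpha)
    x = np.asarray(image, dtype=np.float32)
    mix = np.zeros_like(x)
    for i in range(width):
        aug = image.copy()
        for _ in range(depth):
            aug = np.random.choice(AUGMENTATIONS)(aug)
        mix += ws[i] * np.asarray(aug, dtype=np.float32)
    mixed = (1.0 - m) * x + m * mix
    return Image.fromarray(np.uint8(np.clip(mixed, 0, 255)))
```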
Rethinking Neural Network Quantization
Title | Rethinking Neural Network Quantization |
Authors | Anonymous |
Abstract | Quantization reduces the computation costs of neural networks but suffers from performance degradation. Is this accuracy drop due to the reduced capacity, or to inefficient training during the quantization procedure? After looking into the gradient propagation process of neural networks by viewing the weights and intermediate activations as random variables, we discover two critical rules for efficient training. Recent quantization approaches violate the two rules and result in degraded convergence. To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), to comply with the discovered rules and facilitate efficient training. We also analyze the quantization error introduced in calculating the gradient in the popular parameterized clipping activation (PACT) technique. Through SAT together with gradient-calibrated PACT, quantized models obtain comparable or even better performance than their full-precision counterparts, achieving state-of-the-art accuracy with consistent improvement over previous quantization methods on a wide spectrum of models including MobileNet-V1/V2 and PreResNet-50. |
Tasks | Quantization |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HygQ7TNtPr |
https://openreview.net/pdf?id=HygQ7TNtPr | |
PWC | https://paperswithcode.com/paper/rethinking-neural-network-quantization |
Repo | |
Framework | |
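The abstract builds on PACT-style clipped activation quantization; the following is a generic sketch of that quantizer with a straight-through estimator for rounding (the scale-adjusted training rules themselves are not reproduced here, and the bit width and initial clip value are arbitrary).

```python
import torch
import torch.nn as nn

class PACTQuant(nn.Module):
    """Clip activations to a learnable range [0, alpha], quantize to k bits,
    and pass gradients through the rounding with a straight-through estimator."""
    def __init__(self, bits: int = 4, alpha: float = 6.0):
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(alpha))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        levels = 2 ** self.bits - 1
        # PACT clipping: y = 0.5*(|x| - |x - alpha| + alpha) lies in [0, alpha]
        y = 0.5 * (x.abs() - (x - self.alpha).abs() + self.alpha)
        q = torch.round(y * levels / self.alpha) * self.alpha / levels
        return y + (q - y).detach()       # straight-through estimator

quant = PACTQuant(bits=4)
act = quant(torch.randn(8, 16) * 3)
```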
Learning relevant features for statistical inference
Title | Learning relevant features for statistical inference |
Authors | Anonymous |
Abstract | We introduce a new technique to learn correlations between two types of data. The learned representation can be used to directly compute the expectations of functions over one type of data conditioned on the other, such as Bayesian estimators and their standard deviations. Specifically, our loss function teaches two neural nets to extract features representing the probability vectors of highest singular value for the stochastic map (the set of conditional probabilities) implied by the joint dataset, relative to the inner product defined by the Fisher information metrics evaluated at the marginals. We test the approach using a synthetic dataset, analytical calculations, and inference on occluded MNIST images. Surprisingly, when applied to supervised learning (one dataset consists of labels), this approach automatically provides regularization and faster convergence compared to the cross-entropy objective. We also explore using this approach to discover salient independent features of a single dataset. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJeS16EKPr |
https://openreview.net/pdf?id=SJeS16EKPr | |
PWC | https://paperswithcode.com/paper/learning-relevant-features-for-statistical |
Repo | |
Framework | |
Defense against Adversarial Examples by Encoder-Assisted Search in the Latent Coding Space
Title | Defense against Adversarial Examples by Encoder-Assisted Search in the Latent Coding Space |
Authors | Anonymous |
Abstract | Deep neural networks have been shown to be vulnerable to crafted adversarial perturbations, which raises serious safety concerns. To address this problem, we propose $\text{AE-GAN}_\text{+sr}$, a framework for purifying input images by searching for the closest natural reconstruction with little computation. We first build a reconstruction network, AE-GAN, which adapts an auto-encoder by introducing an adversarial loss into the objective function. In this way, we can enhance the generative ability of the decoder and preserve the abstraction ability of the encoder to form a self-organized latent space. At inference time, given an input, we start a search process in the latent space that aims to find the reconstruction closest to the given image on the distribution of normal data. The encoder provides a good starting point for the search, which saves much computation cost. Experiments show that our method is robust against various attacks and can reach comparable or even better performance than similar methods with far less computation. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Hyg53gSYPB |
https://openreview.net/pdf?id=Hyg53gSYPB | |
PWC | https://paperswithcode.com/paper/defense-against-adversarial-examples-by |
Repo | |
Framework | |
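A hedged sketch of the encoder-assisted latent search: the encoder supplies the initialization and gradient descent refines the code toward the closest on-manifold reconstruction. The optimizer, step count, and reconstruction loss are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def purify(x, encoder, decoder, steps: int = 20, lr: float = 0.1):
    """Search the latent space for the reconstruction closest to x, starting
    from the encoder's output so the search stays cheap."""
    z = encoder(x).detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(decoder(z), x)
        loss.backward()
        opt.step()
    return decoder(z).detach()   # purified input passed to the downstream classifier
```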
Graph-based motion planning networks
Title | Graph-based motion planning networks |
Authors | Anonymous |
Abstract | The differentiable planning network architecture has been shown to be powerful in solving transfer planning tasks while remaining simple to train end-to-end. Many planning architectures proposed later in the literature are inspired by this design principle, in which a recursive network architecture is applied to emulate the backup operations of a value iteration algorithm. However, existing frameworks can only learn and plan effectively on domains with a lattice structure, i.e. regular graphs embedded in a certain Euclidean space. In this paper, we propose a general planning network, called Graph-based Motion Planning Networks (GrMPN), that is able to i) learn and plan on general irregular graphs and hence ii) render existing planning network architectures special cases. The proposed GrMPN framework is invariant to task graph permutation, i.e. graph isomorphism. As a result, GrMPN possesses strong generalization and data efficiency. We demonstrate the performance of the proposed GrMPN method against other baselines on three domains: 2D mazes (regular graphs), path planning on irregular graphs, and motion planning (an irregular graph of robot configurations). |
Tasks | Motion Planning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HkxLiJSKwB |
https://openreview.net/pdf?id=HkxLiJSKwB | |
PWC | https://paperswithcode.com/paper/graph-based-motion-planning-networks |
Repo | |
Framework | |
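A small sketch of value iteration expressed over an arbitrary graph, the backup operation that planning networks of this kind emulate with learned message passing; the reward and transition encoding here are illustrative.

```python
import numpy as np

def graph_value_iteration(adj: np.ndarray, reward: np.ndarray,
                          gamma: float = 0.95, iters: int = 50) -> np.ndarray:
    """Each node repeatedly takes its reward plus the best discounted value
    among its neighbours.

    adj:    (N, N) adjacency matrix (non-zero where an edge exists)
    reward: (N,)   per-node reward
    """
    v = np.zeros(adj.shape[0])
    for _ in range(iters):
        # Mask non-neighbours with -inf so the max ranges only over edges.
        neighbour_vals = np.where(adj > 0, v[None, :], -np.inf)
        v = reward + gamma * np.max(neighbour_vals, axis=1)
    return v

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
reward = np.array([0.0, 0.0, 1.0])
values = graph_value_iteration(adj, reward)
```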