April 1, 2020


Paper Group NANR 14



Phase Transitions for the Information Bottleneck in Representation Learning

Title Phase Transitions for the Information Bottleneck in Representation Learning
Authors Anonymous
Abstract In the Information Bottleneck (IB) (Tishby et al., 2000), when tuning the relative strength between compression and prediction terms, how do the two terms behave, and what’s their relationship with the dataset and the learned representation? In this paper, we set out to answer this question by studying multiple phase transitions in the IB objective: $\mathrm{IB}_\beta[p(z|x)] = I(X;Z) - \beta I(Y;Z)$, where sudden jumps of $dI(Y;Z)/d\beta$ and prediction accuracy are observed with increasing $\beta$. We introduce a definition for IB phase transitions as a qualitative change of the IB loss landscape, and show that the transitions correspond to the onset of learning new classes. Using second-order calculus of variations, we derive a formula that provides the condition for IB phase transitions, and draw its connection with the Fisher information matrix for parameterized models. We provide two perspectives to understand the formula, revealing that each IB phase transition is finding a component of maximum (nonlinear) correlation between X and Y orthogonal to the learned representation, in close analogy with canonical-correlation analysis (CCA) in linear settings. Based on the theory, we present an algorithm for discovering phase transition points. Finally, we verify that our theory and algorithm accurately predict phase transitions in categorical datasets, predict the onset of learning new classes and class difficulty in MNIST, and predict prominent phase transitions in CIFAR10 experiments.
Tasks Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HJloElBYvB
PDF https://openreview.net/pdf?id=HJloElBYvB
PWC https://paperswithcode.com/paper/phase-transitions-for-the-information
Repo
Framework
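
As a rough illustration of the objective in the abstract above, the following sketch evaluates IB_β[p(z|x)] = I(X;Z) − βI(Y;Z) for small discrete distributions. It is not the paper's algorithm; the function names `mutual_information` and `ib_objective`, the nested-list representation of the distributions, and the toy data are assumptions made for this example.

```python
import math


def mutual_information(joint):
    """I(A;B) in nats for a joint distribution given as joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_a[a] * p_b[b]))
    return mi


def ib_objective(p_xy, p_z_given_x, beta):
    """IB_beta[p(z|x)] = I(X;Z) - beta * I(Y;Z) for discrete X, Y, Z.

    p_xy[x][y] is the joint of X and Y; p_z_given_x[x][z] is the encoder.
    """
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(p_z_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # p(x, z) = p(x) p(z|x)
    p_xz = [[p_x[x] * p_z_given_x[x][z] for z in range(n_z)]
            for x in range(n_x)]
    # p(y, z) = sum_x p(x, y) p(z|x), since Z depends on Y only through X
    p_yz = [[sum(p_xy[x][y] * p_z_given_x[x][z] for x in range(n_x))
             for z in range(n_z)] for y in range(n_y)]
    return mutual_information(p_xz) - beta * mutual_information(p_yz)


# Toy case (assumed): Y is a copy of a fair binary X.
p_xy = [[0.5, 0.0], [0.0, 0.5]]
identity_encoder = [[1.0, 0.0], [0.0, 1.0]]   # Z = X
constant_encoder = [[1.0, 0.0], [1.0, 0.0]]   # Z ignores X
for beta in (0.5, 2.0):
    print(beta,
          ib_objective(p_xy, identity_encoder, beta),
          ib_objective(p_xy, constant_encoder, beta))
```

In this toy case the constant encoder always scores 0, while the identity encoder scores (1 − β)·ln 2, so the informative encoder only attains a lower objective once β exceeds 1.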

DO-AutoEncoder: Learning and Intervening Bivariate Causal Mechanisms in Images

Title DO-AutoEncoder: Learning and Intervening Bivariate Causal Mechanisms in Images
Authors Anonymous
Abstract Some fundamental limitations of deep learning have been exposed, such as poor generalizability and vulnerability to adversarial attacks. In response, researchers have recognized that causation is much more stable than association in data. In this paper, we propose a new framework called do-calculus AutoEncoder (DO-AE) for deep representation learning that fully captures bivariate causal relationships in images, which allows us to intervene in the image generation process. DO-AE consists of two key ingredients: causal relationship mining in images and intervention-enabling deep causal structured representation learning. The goal here is to learn deep representations that correspond to the concepts in the physical world as well as their causal structure. To verify the proposed method, we create a dataset named PHY2D, which contains abstract graphic descriptions in accordance with the laws of physics. Our experiments demonstrate that our method is able to correctly identify the bivariate causal relationship between concepts in images, and that the learned representation enables do-calculus manipulation of images, which generates artificial images that may break physical laws depending on where we intervene in the causal system.
Tasks Adversarial Attack, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=r1e7NgrYvH
PDF https://openreview.net/pdf?id=r1e7NgrYvH
PWC https://paperswithcode.com/paper/do-autoencoder-learning-and-intervening
Repo
Framework

Task-Mediated Representation Learning

Title Task-Mediated Representation Learning
Authors Anonymous
Abstract Traditionally, unsupervised representation learning is used to discover underlying regularities from raw sensory data without relying on labeled data. A great number of algorithms in this field resort to proxy objectives to facilitate learning. Further, learning how to act upon these regularities is left to a separate algorithm. Neural encoding in biological systems, on the other hand, is optimized to represent behaviorally relevant features of the environment in order to make inferences that guide successful behavior. Evidence suggests that neural encoding in biological systems is shaped by such behavioral objectives. In our work, we propose a model of inference-driven representation learning. Rather than following some auxiliary, a priori objective (e.g. minimization of reconstruction error, maximization of the fidelity of a generative model, etc.) and indiscriminately encoding information present in an observation, our model learns to build representations that support accurate inferences. Given a set of observations, our model encodes underlying regularities that de facto are necessary to solve the inference problem at hand. Rather than labeling the observations and learning representations that portray corresponding labels, or learning representations in a self-supervised manner and learning explicit features of the input observations, we propose a model that learns representations that are implicitly shaped by the goal of correct inference.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HklJ4gSFPS
PDF https://openreview.net/pdf?id=HklJ4gSFPS
PWC https://paperswithcode.com/paper/task-mediated-representation-learning
Repo
Framework

Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds

Title Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
Authors Anonymous
Abstract We design a new algorithm for batch active learning with deep neural network models. Our algorithm, Batch Active learning by Diverse Gradient Embeddings (BADGE), samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between diversity and uncertainty without requiring any hand-tuned hyperparameters. While other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a useful option for real world active learning problems.
Tasks Active Learning
Published 2020-01-01
URL https://openreview.net/forum?id=ryghZJBKPS
PDF https://openreview.net/pdf?id=ryghZJBKPS
PWC https://paperswithcode.com/paper/deep-batch-active-learning-by-diverse-1
Repo
Framework
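
The selection idea described in the abstract above (points that are both high-magnitude and mutually distant in a hallucinated gradient space) can be sketched as follows. The abstract does not spell out the sampling procedure, so this sketch assumes a k-means++-style seeding over last-layer gradient embeddings computed with the model's own predicted label; `gradient_embedding`, `select_batch`, and the toy inputs are hypothetical names and data for illustration only.

```python
import random


def gradient_embedding(probs, features):
    """Gradient of the cross-entropy loss w.r.t. last-layer weights,
    using the predicted class as a stand-in ("hallucinated") label.
    Assumes a linear softmax output layer on top of `features`."""
    y_hat = max(range(len(probs)), key=probs.__getitem__)
    return [(p - (1.0 if c == y_hat else 0.0)) * h
            for c, p in enumerate(probs) for h in features]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_batch(embeddings, k, rng):
    """k-means++-style seeding: start from the largest-norm embedding, then
    sample each further point with probability proportional to its squared
    distance from the nearest point already chosen."""
    origin = [0.0] * len(embeddings[0])
    first = max(range(len(embeddings)),
                key=lambda i: sq_dist(embeddings[i], origin))
    chosen = [first]
    d2 = [sq_dist(e, embeddings[first]) for e in embeddings]
    while len(chosen) < k and sum(d2) > 0:
        # Already-chosen points have distance 0 and therefore weight 0.
        nxt = rng.choices(range(len(embeddings)), weights=d2, k=1)[0]
        chosen.append(nxt)
        d2 = [min(d, sq_dist(e, embeddings[nxt]))
              for d, e in zip(d2, embeddings)]
    return chosen


# An uncertain prediction yields a larger embedding than a confident one.
confident = gradient_embedding([0.9, 0.1], [1.0, 2.0])
uncertain = gradient_embedding([0.5, 0.5], [1.0, 2.0])
print(sum(g * g for g in confident), sum(g * g for g in uncertain))

toy_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
print(select_batch(toy_embeddings, 3, random.Random(0)))
```

Embedding norm acts as the uncertainty signal and distance-proportional sampling as the diversity signal, which is why no separate trade-off hyperparameter appears in this sketch.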

Learning Multi-facet Embeddings of Phrases and Sentences using Sparse Coding for Unsupervised Semantic Applications

Title Learning Multi-facet Embeddings of Phrases and Sentences using Sparse Coding for Unsupervised Semantic Applications
Authors Anonymous
Abstract Most deep learning for NLP represents each word with a single point or single-mode region in semantic space, while the existing multi-mode word embeddings cannot represent longer word sequences like phrases or sentences. We introduce a phrase representation (also applicable to sentences) where each phrase has a distinct set of multi-mode codebook embeddings to capture different semantic facets of the phrase’s meaning. The codebook embeddings can be viewed as the cluster centers which summarize the distribution of possibly co-occurring words in a pre-trained word embedding space. We propose an end-to-end trainable neural model that directly predicts the set of cluster centers from the input text sequence (e.g., a phrase or a sentence) during test time. We find that the per-phrase/sentence codebook embeddings not only provide a more interpretable semantic representation but also outperform strong baselines (by a large margin in some tasks) on benchmark datasets for unsupervised phrase similarity, sentence similarity, hypernym detection, and extractive summarization.
Tasks Word Embeddings
Published 2020-01-01
URL https://openreview.net/forum?id=HkebMlrFPS
PDF https://openreview.net/pdf?id=HkebMlrFPS
PWC https://paperswithcode.com/paper/learning-multi-facet-embeddings-of-phrases
Repo
Framework

Neural Subgraph Isomorphism Counting

Title Neural Subgraph Isomorphism Counting
Authors Anonymous
Abstract In this paper, we study a new graph learning problem: learning to count subgraph isomorphisms. Although the learning based approach is inexact, we are able to generalize to count large patterns and data graphs in polynomial time compared to the exponential time of the original NP-complete problem. Different from other traditional graph learning problems such as node classification and link prediction, subgraph isomorphism counting requires more global inference to oversee the whole graph. To tackle this problem, we propose a dynamic intermedium attention memory network (DIAMNet) which augments different representation learning architectures and iteratively attends pattern and target data graphs to memorize different subgraph isomorphisms for the global counting. We develop both small graphs (<= 1,024 subgraph isomorphisms in each) and large graphs (<= 4,096 subgraph isomorphisms in each) sets to evaluate different models. Experimental results show that learning based subgraph isomorphism counting can help reduce the time complexity with acceptable accuracy. Our DIAMNet can further improve existing representation learning models for this more global problem.
Tasks Link Prediction, Node Classification, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HJx-akSKPS
PDF https://openreview.net/pdf?id=HJx-akSKPS
PWC https://paperswithcode.com/paper/neural-subgraph-isomorphism-counting
Repo
Framework
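
For orientation, the quantity that the abstract above proposes to approximate can be computed exactly on tiny inputs by brute force. The sketch below is not DIAMNet; it assumes unlabeled, undirected graphs and counts every injective, edge-preserving mapping separately (so automorphic copies are counted more than once). The function name and the toy graphs are assumptions for this example.

```python
from itertools import permutations


def count_subgraph_isomorphisms(pattern_nodes, pattern_edges,
                                graph_nodes, graph_edges):
    """Count injective maps from pattern nodes to graph nodes under which
    every pattern edge lands on a graph edge. Exponential time."""
    edge_set = {frozenset(e) for e in graph_edges}
    count = 0
    for image in permutations(graph_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        if all(frozenset((mapping[u], mapping[v])) in edge_set
               for u, v in pattern_edges):
            count += 1
    return count


# Triangle pattern inside the complete graph on four nodes.
triangle_nodes = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
k4_nodes = [0, 1, 2, 3]
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_subgraph_isomorphisms(triangle_nodes, triangle_edges,
                                  k4_nodes, k4_edges))
```

The loop visits every ordered selection of graph nodes, which is what makes exact counting impractical for larger patterns and motivates a learned, approximate counter.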

Composable Semi-parametric Modelling for Long-range Motion Generation

Title Composable Semi-parametric Modelling for Long-range Motion Generation
Authors Anonymous
Abstract Learning diverse and natural behaviors is one of the longstanding goals for creating intelligent characters in the animated world. In this paper, we propose “COmposable Semi-parametric MOdelling” (COSMO), a method for generating long-range, diverse, and distinctive behaviors to reach a specific goal location. Our proposed method learns to model human motion by combining the complementary strengths of both non-parametric techniques and parametric ones. Given the starting and ending state, a memory bank is used to retrieve motion references that are provided as source material to a deep network. The synthesis is performed by a deep network that controls the style of the provided motion material and modifies it to become natural. On skeleton datasets with diverse motion, we show that the proposed method outperforms existing parametric and non-parametric baselines. We also demonstrate that the generated sequences are useful as subgoals for actual physical execution in the animated world.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rkl44TEtwH
PDF https://openreview.net/pdf?id=rkl44TEtwH
PWC https://paperswithcode.com/paper/composable-semi-parametric-modelling-for-long
Repo
Framework

A Bayes-Optimal View on Adversarial Examples

Title A Bayes-Optimal View on Adversarial Examples
Authors Anonymous
Abstract Adversarial attacks on CNN classifiers can make an imperceptible change to an input image and alter the classification result. The source of these failures is still poorly understood, and many explanations invoke the “unreasonably linear extrapolation” used by CNNs along with the geometry of high dimensions. In this paper we show that similar attacks can be used against the Bayes-Optimal classifier for certain class distributions, while for others the optimal classifier is robust to such attacks. We present analytical results showing conditions on the data distribution under which all points can be made arbitrarily close to the optimal decision boundary and show that this can happen even when the classes are easy to separate, when the ideal classifier has a smooth decision surface and when the data lies in low dimensions. We introduce new datasets of realistic images of faces and digits where the Bayes-Optimal classifier can be calculated efficiently and show that for some of these datasets the optimal classifier is robust and for others it is vulnerable to adversarial examples. In systematic experiments with many such datasets, we find that standard CNN training consistently finds a vulnerable classifier even when the optimal classifier is robust while large-margin methods often find a robust classifier with the exact same training data. Our results suggest that adversarial vulnerability is not an unavoidable consequence of machine learning in high dimensions, and may often be a result of suboptimal training methods used in current practice.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1l3s6NtvH
PDF https://openreview.net/pdf?id=H1l3s6NtvH
PWC https://paperswithcode.com/paper/a-bayes-optimal-view-on-adversarial-examples
Repo
Framework

Multi-agent Reinforcement Learning for Networked System Control

Title Multi-agent Reinforcement Learning for Networked System Control
Authors Tianshu Chu, Sandeep Chinchali, Sachin Katti
Abstract This paper considers multi-agent reinforcement learning (MARL) in networked system control. Specifically, each agent learns a decentralized control policy based on local observations and messages from connected neighbors. We formulate such a networked MARL (NMARL) problem as a spatiotemporal Markov decision process and introduce a spatial discount factor to stabilize the training of each local agent. Further, we propose a new differentiable communication protocol, called NeurComm, to reduce information loss and non-stationarity in NMARL. Based on experiments in realistic NMARL scenarios of adaptive traffic signal control and cooperative adaptive cruise control, an appropriate spatial discount factor effectively enhances the learning curves of non-communicative MARL algorithms, while NeurComm outperforms existing communication protocols in both learning efficiency and control performance.
Tasks Multi-agent Reinforcement Learning
Published 2020-01-01
URL https://openreview.net/forum?id=Syx7A3NFvH
PDF https://openreview.net/pdf?id=Syx7A3NFvH
PWC https://paperswithcode.com/paper/multi-agent-reinforcement-learning-for
Repo
Framework
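
The abstract above introduces a spatial discount factor without defining it. One plausible reading, assumed here purely for illustration, is that each agent's training reward sums the rewards of all agents, weighted by α raised to the hop distance in the communication graph. The names `hop_distances` and `spatially_discounted_rewards` and the three-agent example are hypothetical.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: hop distance from `source` to every
    reachable agent in the communication graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def spatially_discounted_rewards(adjacency, rewards, alpha):
    """For each agent i, sum alpha**d(i, j) * rewards[j] over reachable j.
    This weighting is an assumption, not taken from the abstract."""
    result = {}
    for i in adjacency:
        dist = hop_distances(adjacency, i)
        result[i] = sum((alpha ** d) * rewards[j] for j, d in dist.items())
    return result


# Three agents on a line: 0 - 1 - 2.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
rewards = {0: 1.0, 1: 2.0, 2: 4.0}
print(spatially_discounted_rewards(adjacency, rewards, 0.5))
```

Under this assumed form, α = 0 leaves each agent with only its own reward and α = 1 gives every agent the global sum, so intermediate values interpolate between fully local and fully shared objectives.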

Task-Relevant Adversarial Imitation Learning

Title Task-Relevant Adversarial Imitation Learning
Authors Anonymous
Abstract We show that a critical problem in adversarial imitation from high-dimensional sensory data is the tendency of discriminator networks to distinguish agent and expert behaviour using task-irrelevant features beyond the control of the agent. We analyze this problem in detail and propose a solution as well as several baselines that outperform standard Generative Adversarial Imitation Learning (GAIL). Our proposed solution, Task-Relevant Adversarial Imitation Learning (TRAIL), uses a constrained optimization objective to overcome task-irrelevant features. Comprehensive experiments show that TRAIL can solve challenging manipulation tasks from pixels by imitating human operators, where other agents such as behaviour cloning (BC), standard GAIL, improved GAIL variants including our newly proposed baselines, and Deterministic Policy Gradients from Demonstrations (DPGfD) fail to find solutions, even when the other agents have access to task reward.
Tasks Imitation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=S1x2PCNKDB
PDF https://openreview.net/pdf?id=S1x2PCNKDB
PWC https://paperswithcode.com/paper/task-relevant-adversarial-imitation-learning-1
Repo
Framework

Retrospection: Leveraging the Past for Efficient Training of Deep Neural Networks

Title Retrospection: Leveraging the Past for Efficient Training of Deep Neural Networks
Authors Anonymous
Abstract Deep neural networks are powerful learning machines that have enabled breakthroughs in several domains. In this work, we introduce retrospection loss to improve the performance of neural networks by utilizing prior experiences during training. Minimizing the retrospection loss pushes the parameter state at the current training step towards the optimal parameter state while pulling it away from the parameter state at a previous training step. We conduct extensive experiments to show that the proposed retrospection loss results in improved performance across multiple tasks, input types and network architectures.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1eY00VFDB
PDF https://openreview.net/pdf?id=H1eY00VFDB
PWC https://paperswithcode.com/paper/retrospection-leveraging-the-past-for
Repo
Framework

Deep Multiple Instance Learning with Gaussian Weighting

Title Deep Multiple Instance Learning with Gaussian Weighting
Authors Anonymous
Abstract In this paper we present a deep Multiple Instance Learning (MIL) method that can be trained end-to-end to perform classification from weak supervision. Our MIL method is implemented as a two-stream neural network, specialized in the tasks of instance classification and weighting. Our instance weighting stream makes use of a Gaussian radial basis function to normalize the instance weights by comparing instances locally within the bag and globally across bags. The final classification score of the bag is an aggregate of all instance classification scores. The instance representation is shared by both instance classification and weighting streams. The Gaussian instance weighting allows us to regularize the representation learning of instances such that all positive instances are closer to each other w.r.t. the instance weighting function. We evaluate our method on five standard MIL datasets and show that our method outperforms other MIL methods. We also evaluate our model on two datasets where all models are trained end-to-end. Our method obtains better bag-classification and instance-classification results on these datasets. We conduct extensive experiments to investigate the robustness of the proposed model and obtain interesting insights.
Tasks Multiple Instance Learning, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=Bklrea4KwS
PDF https://openreview.net/pdf?id=Bklrea4KwS
PWC https://paperswithcode.com/paper/deep-multiple-instance-learning-with-gaussian
Repo
Framework

Variational Autoencoders for Highly Multivariate Spatial Point Processes Intensities

Title Variational Autoencoders for Highly Multivariate Spatial Point Processes Intensities
Authors Anonymous
Abstract Multivariate spatial point process models can describe heterotopic data over space. However, highly multivariate intensities are computationally challenging due to the curse of dimensionality. To bridge this gap, we introduce a declustering based hidden variable model that leads to an efficient inference procedure via a variational autoencoder (VAE). We also prove that this model is a generalization of the VAE-based model for collaborative filtering. This leads to an interesting application of spatial point process models to recommender systems. Experimental results show the method’s utility on both synthetic data and real-world data sets.
Tasks Point Processes, Recommendation Systems
Published 2020-01-01
URL https://openreview.net/forum?id=B1lj20NFDS
PDF https://openreview.net/pdf?id=B1lj20NFDS
PWC https://paperswithcode.com/paper/variational-autoencoders-for-highly
Repo
Framework

$\ell_1$ Adversarial Robustness Certificates: a Randomized Smoothing Approach

Title $\ell_1$ Adversarial Robustness Certificates: a Randomized Smoothing Approach
Authors Anonymous
Abstract Robustness is an important property to guarantee the security of machine learning models. It has recently been demonstrated that strong robustness certificates can be obtained on ensemble classifiers generated by input randomization. However, tight robustness certificates are only known for symmetric norms including $\ell_0$ and $\ell_2$, while for asymmetric norms like $\ell_1$, the existing techniques do not apply. By converting the likelihood ratio into a one-dimensional mixed random variable, we derive the first tight $\ell_1$ robustness certificate under isotropic Laplace distributions. Empirically, the deep networks smoothed by Laplace distributions yield the state-of-the-art certified robustness in $\ell_1$ norm on CIFAR-10 and ImageNet.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1lQIgrFDS
PDF https://openreview.net/pdf?id=H1lQIgrFDS
PWC https://paperswithcode.com/paper/ell_1-adversarial-robustness-certificates-a
Repo
Framework
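
The input-randomization step that the abstract above builds on can be sketched as a Monte Carlo majority vote under isotropic Laplace noise. This covers only the smoothing step; the paper's ℓ1 certificate is not reproduced here, and the empirical vote share below is not a certified bound. `smoothed_predict`, the toy classifier, and all parameter values are assumptions for this example.

```python
import random
from collections import Counter


def smoothed_predict(classifier, x, scale, n_samples, rng):
    """Majority vote of `classifier` over copies of `x` perturbed with
    independent Laplace(0, scale) noise on every coordinate."""
    votes = Counter()
    for _ in range(n_samples):
        # The difference of two i.i.d. exponentials with mean `scale`
        # is Laplace-distributed with location 0 and scale `scale`.
        noisy = [xi + rng.expovariate(1.0 / scale)
                 - rng.expovariate(1.0 / scale) for xi in x]
        votes[classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples


def toy_classifier(v):
    """Hypothetical base classifier: sign of the coordinate sum."""
    return int(sum(v) > 0)


label, vote_share = smoothed_predict(
    toy_classifier, [5.0, 5.0], scale=0.1, n_samples=200,
    rng=random.Random(0))
print(label, vote_share)
```

A certification procedure would replace the raw vote share with a statistical lower bound on the top-class probability before converting it into a radius.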

Sample-Based Point Cloud Decoder Networks

Title Sample-Based Point Cloud Decoder Networks
Authors Anonymous
Abstract Point clouds are a flexible and ubiquitous way to represent 3D objects with arbitrary resolution and precision. Previous work has shown that adapting encoder networks to match the semantics of their input point clouds can significantly improve their effectiveness over naive feedforward alternatives. However, the vast majority of work on point-cloud decoders is still based on fully-connected networks that map shape representations to a fixed number of output points. In this work, we investigate decoder architectures that more closely match the semantics of variable-sized point clouds. Specifically, we study sample-based point-cloud decoders that map a shape representation to a point feature distribution, allowing an arbitrary number of sampled features to be transformed into individual output points. We develop three sample-based decoder architectures, compare their performance to each other, and show their improved effectiveness over feedforward architectures. In addition, we investigate the learned distributions to gain insight into the output transformation. Our work is available as an extensible software platform to reproduce these results and serve as a baseline for future work.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SklVI1HKvH
PDF https://openreview.net/pdf?id=SklVI1HKvH
PWC https://paperswithcode.com/paper/sample-based-point-cloud-decoder-networks
Repo
Framework