Paper Group NANR 47
State2vec: Off-Policy Successor Feature Approximators
Title | State2vec: Off-Policy Successor Feature Approximators |
Authors | Anonymous |
Abstract | A major challenge in reinforcement learning (RL) is how to design agents that generalize across tasks sharing common dynamics. A viable solution is meta-reinforcement learning, which identifies common structure among past tasks and generalizes it to new tasks (meta-test). In meta-training, the RL agent learns state representations that encode prior information from a set of tasks and are used to generalize the value function approximation. This has been proposed in the literature as successor representation approximators. While promising, these methods do not generalize well across optimal policies, leading to sample inefficiency during the meta-test phase. In this paper, we propose state2vec, an efficient and low-complexity framework for learning successor features which (i) generalize across policies and (ii) ensure sample efficiency during meta-test. Representing each RL task as a graph, we extend the well-known node2vec framework to learn graph embeddings that capture the discounted future state transitions in RL. The proposed off-policy state2vec captures the geometry of the underlying state space, yielding good basis functions for linear value function approximation. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HklOjkHKDr |
https://openreview.net/pdf?id=HklOjkHKDr | |
PWC | https://paperswithcode.com/paper/state2vec-off-policy-successor-feature |
Repo | |
Framework | |
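The authors' code is not linked, so the following is a minimal, hypothetical sketch of the stated idea only: node2vec-style skip-gram training with negative sampling, where the "context" of a state is its future states along a trajectory, weighted by the RL discount `gamma**k`. All names (`state2vec_sketch`, the toy chain MDP) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def state2vec_sketch(trajectories, n_states, dim=16, gamma=0.9,
                     lr=0.05, n_neg=5, horizon=10, epochs=5):
    """Skip-gram with negative sampling over trajectory states, where a
    state k steps in the future acts as context with weight gamma**k."""
    U = rng.normal(scale=0.1, size=(n_states, dim))  # state embeddings
    V = rng.normal(scale=0.1, size=(n_states, dim))  # context vectors
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    for _ in range(epochs):
        for traj in trajectories:
            for t, s in enumerate(traj[:-1]):
                for k, c in enumerate(traj[t + 1:t + 1 + horizon], start=1):
                    w = gamma ** k                   # discounted future state
                    g = lr * w * (1.0 - sig(U[s] @ V[c]))
                    U[s], V[c] = U[s] + g * V[c], V[c] + g * U[s]
                    for neg in rng.integers(0, n_states, n_neg):
                        g = lr * w * sig(U[s] @ V[neg])
                        U[s], V[neg] = U[s] - g * V[neg], V[neg] - g * U[s]
    return U

# toy chain MDP: 50 random walks over 10 states
trajs = [list(np.clip(np.cumsum(rng.choice([-1, 1], 30)) + 5, 0, 9))
         for _ in range(50)]
print(state2vec_sketch(trajs, n_states=10).shape)    # (10, 16)
```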
RTC-VAE: HARNESSING THE PECULIARITY OF TOTAL CORRELATION IN LEARNING DISENTANGLED REPRESENTATIONS
Title | RTC-VAE: HARNESSING THE PECULIARITY OF TOTAL CORRELATION IN LEARNING DISENTANGLED REPRESENTATIONS |
Authors | Anonymous |
Abstract | In the problem of unsupervised learning of disentangled representations, one promising method is to penalize the total correlation of the sampled latent variables. Unfortunately, this well-motivated strategy often fails to achieve disentanglement due to a problematic difference between the sampled latent representation and its corresponding mean representation. We provide a theoretical explanation of why a low total correlation of the sample distribution cannot guarantee a low total correlation of the mean representation: we prove that for mean representations of arbitrarily high total correlation, there exist latent variable distributions with bounded total correlation. Nevertheless, we still believe that total correlation can be a key to disentanglement in unsupervised representation learning, and we propose a remedy, RTC-VAE, which rectifies the total correlation penalty. Experiments show that our model has a more reasonable distribution of the mean representation compared with baseline models, e.g., β-TCVAE and FactorVAE. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SkeuipVKDH |
https://openreview.net/pdf?id=SkeuipVKDH | |
PWC | https://paperswithcode.com/paper/rtc-vae-harnessing-the-peculiarity-of-total |
Repo | |
Framework | |
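A small numeric illustration of the stated theorem: the sampled representation can have near-zero total correlation while the mean representation's total correlation is large. The Gaussian TC formula and the toy "encoder" (perfectly correlated means plus large independent noise) are assumptions for illustration, not the paper's construction.

```python
import numpy as np

def gaussian_tc(cov):
    """Total correlation of a Gaussian: 0.5 * (sum_i log var_i - log det cov)."""
    return 0.5 * (np.sum(np.log(np.diag(cov))) - np.linalg.slogdet(cov)[1])

rng = np.random.default_rng(0)
mu = rng.normal(size=100_000)
# nearly perfectly correlated encoder means ...
means = np.stack([mu, mu + 0.01 * rng.normal(size=mu.size)], axis=1)
# ... swamped by large *independent* encoder noise when sampling
samples = means + 5.0 * rng.normal(size=means.shape)

print("TC of mean representation   :", gaussian_tc(np.cov(means.T)))    # large (~4.6)
print("TC of sampled representation:", gaussian_tc(np.cov(samples.T)))  # near zero
```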
Pragmatic Evaluation of Adversarial Examples in Natural Language
Title | Pragmatic Evaluation of Adversarial Examples in Natural Language |
Authors | Anonymous |
Abstract | Attacks on natural language models are difficult to compare due to their differing definitions of what constitutes a successful attack. We present a taxonomy of constraints to categorize these attacks. For each constraint, we present a real-world use case and a way to measure how well generated samples enforce the constraint. We then employ our framework to evaluate two state-of-the-art attacks that fool models with synonym substitution. These attacks claim their adversarial perturbations preserve the semantics and syntactic correctness of the inputs, but our analysis shows these constraints are not strongly enforced. For a significant portion of these adversarial examples, a grammar checker detects an increase in errors. Additionally, human studies indicate that many of these adversarial examples diverge in semantic meaning from the input or do not appear to be human-written. Finally, we highlight the need for standardized evaluation of attacks that share constraints. Without shared evaluation metrics, it is up to researchers to set thresholds that determine the trade-off between attack quality and attack success. We recommend well-designed human studies to determine the best threshold to approximate human judgement. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkxmKgHtwH |
https://openreview.net/pdf?id=BkxmKgHtwH | |
PWC | https://paperswithcode.com/paper/pragmatic-evaluation-of-adversarial-examples |
Repo | |
Framework | |
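A minimal sketch of one constraint check from the taxonomy: flag an adversarial example if it introduces more grammar errors than the original input. `language_tool_python` is a real wrapper around the LanguageTool checker (it requires Java), but whether the paper used this exact tool and threshold is an assumption.

```python
import language_tool_python

tool = language_tool_python.LanguageTool('en-US')

def grammar_error_count(text: str) -> int:
    """Number of issues LanguageTool reports for `text`."""
    return len(tool.check(text))

def violates_grammaticality(original: str, adversarial: str) -> bool:
    """Constraint: a valid attack should not increase grammar errors."""
    return grammar_error_count(adversarial) > grammar_error_count(original)

print(violates_grammaticality(
    "The movie was wonderful and moving.",
    "The movie was wondrous and motional."))
```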
Preventing Imitation Learning with Adversarial Policy Ensembles
Title | Preventing Imitation Learning with Adversarial Policy Ensembles |
Authors | Anonymous |
Abstract | Imitation learning can reproduce policies by observing experts, which poses a problem of policy propriety: policies, whether human or deployed on robots, can be cloned without their owners' consent. How can we protect our proprietary policies from cloning by an external observer? To answer this question we introduce a new reinforcement learning framework, in which we train an ensemble of optimal policies whose demonstrations are guaranteed to be useless to an external observer. We formulate this idea as a constrained optimization problem, where the objective is to improve the proprietary policies while simultaneously deteriorating the virtual policy of an eventual external observer. We design a tractable algorithm to solve this new optimization problem by modifying the standard policy gradient algorithm. This problem formulation admits plausible interpretations in terms of confidentiality and adversarial behaviour, which open a broader perspective on this work. We explicitly demonstrate the existence of such ‘non-clonable’ ensembles, providing a solution to the above optimization problem, computed by our modified policy gradient algorithm. To our knowledge, this is the first work regarding the protection and privacy of policies in reinforcement learning. |
Tasks | Imitation Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=ryxOBgBFPH |
https://openreview.net/pdf?id=ryxOBgBFPH | |
PWC | https://paperswithcode.com/paper/preventing-imitation-learning-with |
Repo | |
Framework | |
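A hedged sketch of the modified policy-gradient objective as the abstract describes it: improve each ensemble member's expected return while degrading the return of a virtual observer that imitates the ensemble's demonstrations. The function name, the REINFORCE form, and the weighting `alpha` are illustrative assumptions; a real pipeline would supply the log-probabilities and returns.

```python
import torch

def ape_loss(logp_ensemble, returns_ensemble, logp_observer, returns_observer,
             alpha=1.0):
    """logp_*: log-probs of taken actions; returns_*: (discounted) returns.
    The observer term enters with a negative sign, so minimizing this loss
    ascends the ensemble's objective while descending the observer's."""
    pg_ensemble = (logp_ensemble * returns_ensemble).mean()  # standard REINFORCE term
    pg_observer = (logp_observer * returns_observer).mean()  # virtual observer's objective
    return -(pg_ensemble - alpha * pg_observer)

# toy check that gradients flow to both terms
logp_e = torch.randn(32, requires_grad=True)
logp_o = torch.randn(32, requires_grad=True)
ape_loss(logp_e, torch.randn(32), logp_o, torch.randn(32)).backward()
print(logp_e.grad.shape, logp_o.grad.shape)
```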
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
Title | Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games |
Authors | Anonymous |
Abstract | We study discrete-time mean-field Markov games with an infinite number of agents, where each agent aims to minimize its ergodic cost. We consider the setting where the agents have identical linear state transitions and quadratic cost functions, while the aggregated effect of the agents is captured by the population mean of their states, namely, the mean-field state. For such a game, based on the Nash certainty equivalence principle, we provide sufficient conditions for the existence and uniqueness of its Nash equilibrium. Moreover, to find the Nash equilibrium, we propose a mean-field actor-critic algorithm with linear function approximation, which does not require knowing the model of the dynamics. Specifically, at each iteration of our algorithm, we use the single-agent actor-critic algorithm to approximately obtain the optimal policy of each agent given the current mean-field state, and then update the mean-field state. In particular, we prove that our algorithm converges to the Nash equilibrium at a linear rate. To the best of our knowledge, this is the first successful application of model-free reinforcement learning with function approximation to discrete-time mean-field Markov games with provable non-asymptotic global convergence guarantees. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=H1lhqpEYPr |
https://openreview.net/pdf?id=H1lhqpEYPr | |
PWC | https://paperswithcode.com/paper/actor-critic-provably-finds-nash-equilibria-1 |
Repo | |
Framework | |
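A schematic, hedged reconstruction of the algorithm's outer loop only: given the current mean-field state z, compute a (near-)optimal single-agent policy, then replace z by the stationary population mean induced by that policy, iterating to a fixed point (the Nash certainty equivalence idea). `best_response` and `stationary_mean` are hypothetical stand-ins for the paper's model-free actor-critic and its mean-field update; the scalar LQ dynamics are a toy assumption.

```python
import numpy as np

def best_response(z):
    # toy LQ best response: steer the state toward a fraction of z
    return lambda x: -0.6 * x + 0.3 * z

def stationary_mean(policy, A=0.9, B=1.0, n=10_000, T=200,
                    rng=np.random.default_rng(0)):
    """Simulate a large agent population under `policy`, return its mean."""
    x = rng.normal(size=n)
    for _ in range(T):
        x = A * x + B * policy(x) + 0.1 * rng.normal(size=n)
    return x.mean()

z = 1.0
for _ in range(20):                      # fixed-point iteration on z
    z = stationary_mean(best_response(z))
print("approximate mean-field equilibrium:", z)
```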
On the Tunability of Optimizers in Deep Learning
Title | On the Tunability of Optimizers in Deep Learning |
Authors | Anonymous |
Abstract | There is not yet a consensus on whether adaptive gradient methods like Adam are easier to use than non-adaptive optimization methods like SGD. In this work, we make the important yet ambiguous concept of ‘ease-of-use’ precise by defining an optimizer’s tunability: how easy is it to find good hyperparameter configurations using automatic random hyperparameter search? We propose a practical and universal quantitative measure for optimizer tunability that can form the basis for a fair optimizer benchmark. Evaluating a variety of optimizers on an extensive set of standard datasets and architectures, we find that Adam is the most tunable for the majority of problems, especially with a low budget for hyperparameter tuning. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=H1gEP6NFwr |
https://openreview.net/pdf?id=H1gEP6NFwr | |
PWC | https://paperswithcode.com/paper/on-the-tunability-of-optimizers-in-deep-1 |
Repo | |
Framework | |
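One standard way to quantify tunability under random search is the expected best validation score after k trials, estimated in closed form from N completed trials via the empirical CDF. Whether this matches the paper's exact measure is an assumption; the fictitious score distributions below are purely illustrative.

```python
import numpy as np

def expected_max_at_budget(scores, k):
    """E[max of k i.i.d. draws] from the empirical distribution of `scores`:
    sum_i s_(i) * [(i/N)^k - ((i-1)/N)^k] with s sorted ascending."""
    s = np.sort(np.asarray(scores))
    N = len(s)
    i = np.arange(1, N + 1)
    w = (i / N) ** k - ((i - 1) / N) ** k
    return float(np.sum(w * s))

rng = np.random.default_rng(0)
adam_like = rng.beta(8, 2, size=200)  # fictitious accuracies over random configs
sgd_like = rng.beta(4, 2, size=200)
for k in (1, 4, 16, 64):              # search budgets
    print(k, expected_max_at_budget(adam_like, k),
          expected_max_at_budget(sgd_like, k))
```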
Variational pSOM: Deep Probabilistic Clustering with Self-Organizing Maps
Title | Variational pSOM: Deep Probabilistic Clustering with Self-Organizing Maps |
Authors | Anonymous |
Abstract | Generating visualizations and interpretations from high-dimensional data is a common problem in many fields. Two key approaches for tackling this problem are clustering and representation learning: on the one hand there are highly performant deep clustering models; on the other, interpretable representation learning techniques that often rely on latent topological structures such as self-organizing maps. However, current methods do not yet successfully combine these two approaches. We present a new deep architecture for probabilistic clustering, VarPSOM, and its extension to time series data, VarTPSOM, composed of VarPSOM modules connected by LSTM cells. We show that they achieve superior clustering performance compared to current deep clustering methods on static MNIST/Fashion-MNIST data as well as medical time series, while inducing an interpretable representation. Moreover, on the medical time series, VarTPSOM successfully predicts future trajectories in the original data space. |
Tasks | Representation Learning, Time Series |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HJxJdp4YvS |
https://openreview.net/pdf?id=HJxJdp4YvS | |
PWC | https://paperswithcode.com/paper/variational-psom-deep-probabilistic-1 |
Repo | |
Framework | |
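A hedged sketch of the SOM-style ingredient as described: besides a clustering objective, pull each latent point toward the 2D-grid neighbors of its closest centroid, so centroids inherit a topological ordering. The exact objective, neighborhood, and weighting in VarPSOM are not specified in the abstract; everything below is an illustrative assumption.

```python
import torch

def som_neighbor_loss(z, centroids, grid_h, grid_w):
    """z: (B, D) latents; centroids: (grid_h*grid_w, D) SOM nodes on a 2D grid.
    Penalizes distance to the grid neighbors of each point's best-matching unit."""
    d = torch.cdist(z, centroids)                      # (B, K) distances
    bmu = d.argmin(dim=1)                              # best-matching unit per point
    r, c = bmu // grid_w, bmu % grid_w
    loss = 0.0
    for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):  # 4-neighborhood on the grid
        nr = (r + dr).clamp(0, grid_h - 1)
        nc = (c + dc).clamp(0, grid_w - 1)
        loss = loss + d[torch.arange(len(z)), nr * grid_w + nc].mean()
    return loss

z = torch.randn(32, 10)
cent = torch.randn(64, 10, requires_grad=True)         # an 8x8 SOM grid
som_neighbor_loss(z, cent, 8, 8).backward()
print(cent.grad.shape)
```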
Embodied Multimodal Multitask Learning
Title | Embodied Multimodal Multitask Learning |
Authors | Anonymous |
Abstract | Visually-grounded embodied language learning models have recently been shown to be effective at learning multiple multimodal tasks, such as following navigational instructions and answering questions. In this paper, we address two key limitations of these models: (a) the inability to transfer the grounded knowledge across different tasks and (b) the inability to transfer to new words and concepts not seen during training using only a few examples. We propose a multitask model which facilitates knowledge transfer across tasks by disentangling the knowledge of words and visual attributes in the intermediate representations. We create scenarios and datasets to quantify cross-task knowledge transfer and show that the proposed model outperforms a range of baselines in simulated 3D environments. We also show that this disentanglement of representations makes our model modular and interpretable, which allows for transfer to instructions containing new concepts. |
Tasks | Transfer Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1lQQeHYPr |
https://openreview.net/pdf?id=r1lQQeHYPr | |
PWC | https://paperswithcode.com/paper/embodied-multimodal-multitask-learning-1 |
Repo | |
Framework | |
On Universal Equivariant Set Networks
Title | On Universal Equivariant Set Networks |
Authors | Anonymous |
Abstract | Using deep neural networks that are either invariant or equivariant to permutations in order to learn functions on unordered sets has become prevalent. The most popular, basic models are DeepSets (Zaheer et al. 2017) and PointNet (Qi et al. 2017). While known to be universal for approximating invariant functions, DeepSets and PointNet are not known to be universal when approximating equivariant set functions. On the other hand, several recent equivariant set architectures have been proven equivariant universal (Sannai et al. 2019, Keriven and Peyre 2019); however, these models either use layers that are not permutation equivariant (in the standard sense) and/or use higher-order tensor variables, which are less practical. There is, therefore, a gap in understanding the universality of popular equivariant set models versus theoretical ones. In this paper we close this gap by proving that: (i) PointNet is not equivariant universal; and (ii) adding a single linear transmission layer makes PointNet universal. We call this architecture PointNetST and argue it is the simplest permutation equivariant universal model known to date. Another consequence is that DeepSets is universal, and also PointNetSeg, a popular point cloud segmentation network (used e.g., in Qi et al. 2017), is universal. The key theoretical tool used to prove the above results is an explicit characterization of all permutation equivariant polynomial layers. Lastly, we provide numerical experiments validating the theoretical results and comparing different permutation equivariant models. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HkxTwkrKDB |
https://openreview.net/pdf?id=HkxTwkrKDB | |
PWC | https://paperswithcode.com/paper/on-universal-equivariant-set-networks |
Repo | |
Framework | |
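The "single linear transmission layer" has a well-known form: every linear permutation-equivariant layer can be written as X ↦ XA + (11ᵀ/n)XB + 1cᵀ, i.e. a pointwise map plus a broadcast of the set mean. A minimal sketch of such a layer, with a numerical equivariance check (the class name and dimensions are assumptions):

```python
import torch
import torch.nn as nn

class LinearTransmission(nn.Module):
    """Pointwise linear map plus a 'transmission' of the mean over the set."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.pointwise = nn.Linear(d_in, d_out)             # X A + 1 c^T
        self.transmit = nn.Linear(d_in, d_out, bias=False)  # (mean X) B

    def forward(self, x):                                   # x: (batch, n, d_in)
        return self.pointwise(x) + self.transmit(x.mean(dim=1, keepdim=True))

# permutation-equivariance check: permuting inputs permutes outputs
layer = LinearTransmission(3, 5)
x = torch.randn(2, 10, 3)
perm = torch.randperm(10)
print(torch.allclose(layer(x)[:, perm], layer(x[:, perm]), atol=1e-6))  # True
```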
Cyclic Graph Dynamic Multilayer Perceptron for Periodic Signals
Title | Cyclic Graph Dynamic Multilayer Perceptron for Periodic Signals |
Authors | Anonymous |
Abstract | We propose a feature extraction method for periodic signals. Virtually every mechanized transportation vehicle, power generation facility, industrial machine, and robotic system contains rotating shafts, and it is possible to collect data about periodicity by measuring a shaft’s rotation. However, it is difficult to perfectly control the collection timing of the measurements, and imprecise timing creates phase shifts in the resulting data. Although a phase shift does not materially affect any given data point collected, it does alter the order in which all of the points are collected. It is difficult for classical methods, such as the multilayer perceptron, to identify or quantify these alterations because they depend on the order of the input vectors’ components. This paper proposes a robust method for extracting features from phase-shifted data by adding a graph structure to each data point and constructing a machine learning architecture suitable for graph data with cyclic permutation. Simulation and experimental results illustrate its effectiveness. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=S1xSzyrYDB |
https://openreview.net/pdf?id=S1xSzyrYDB | |
PWC | https://paperswithcode.com/paper/cyclic-graph-dynamic-multilayer-perceptron |
Repo | |
Framework | |
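A hedged sketch of the core idea as stated: make the feature extractor respect cyclic permutations of the sampled waveform, so a phase shift in data collection does not change the features. Averaging a shared pointwise map over all cyclic shifts is one simple such construction (an assumption, not the paper's exact architecture).

```python
import numpy as np

def cyclic_invariant_features(x, W, b):
    """x: (n,) one period of a signal; W, b: weights of a small shared layer.
    Returns features invariant to np.roll(x, k) for any shift k."""
    shifts = np.stack([np.roll(x, k) for k in range(len(x))])  # all cyclic shifts
    h = np.maximum(shifts @ W + b, 0.0)                        # shared ReLU layer
    return h.mean(axis=0)                                      # average over the cyclic group

rng = np.random.default_rng(0)
n, d = 32, 8
W, b = rng.normal(size=(n, d)), rng.normal(size=d)
x = np.sin(2 * np.pi * np.arange(n) / n)
f1 = cyclic_invariant_features(x, W, b)
f2 = cyclic_invariant_features(np.roll(x, 5), W, b)            # phase-shifted copy
print(np.allclose(f1, f2))                                     # True
```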
One-Shot Neural Architecture Search via Compressive Sensing
Title | One-Shot Neural Architecture Search via Compressive Sensing |
Authors | Anonymous |
Abstract | Neural architecture search (NAS), or automated design of neural network models, remains a very challenging meta-learning problem. Several recent works (called “one-shot” approaches) have focused on dramatically reducing NAS running time by leveraging proxy models that still provide architectures with competitive performance. In our work, we propose a new meta-learning algorithm that we call CoNAS, or Compressive sensing-based Neural Architecture Search. Our approach merges ideas from one-shot NAS approaches with iterative techniques for learning low-degree sparse Boolean polynomial functions. We validate our approach on several standard test datasets, discover novel architectures hitherto unreported, and achieve competitive (or better) results in both performance and search time compared to existing NAS approaches. Further, we provide theoretical analysis via upper bounds on the number of validation error measurements needed to perform reliable meta-learning; to our knowledge, these analysis tools are novel to the NAS literature and may be of independent interest. |
Tasks | Compressive Sensing, Meta-Learning, Neural Architecture Search |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=B1lsXREYvr |
https://openreview.net/pdf?id=B1lsXREYvr | |
PWC | https://paperswithcode.com/paper/one-shot-neural-architecture-search-via-1 |
Repo | |
Framework | |
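A hedged sketch of the measurement-and-recovery idea: treat validation error as an (approximately) sparse, low-degree Boolean polynomial of which architecture choices are switched on, take a few random measurements, and recover the influential terms by sparse regression. The "architecture" below is a toy synthetic function, and Lasso stands in for whatever sparse-recovery routine CoNAS actually uses; both are assumptions.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 12                                    # binary architecture choices, a in {-1,+1}^n

def val_error(a):                         # hidden sparse low-degree polynomial + noise
    return 0.5 - 0.3 * a[2] + 0.2 * a[5] * a[7] + 0.01 * rng.normal()

def parity_features(A, degree=2):
    """Monomials chi_S(a) = prod_{i in S} a_i for 1 <= |S| <= degree."""
    cols, names = [], []
    for d in range(1, degree + 1):
        for S in combinations(range(A.shape[1]), d):
            cols.append(np.prod(A[:, list(S)], axis=1))
            names.append(S)
    return np.stack(cols, axis=1), names

m = 60                                    # measurements << 78 candidate monomials
A = rng.choice([-1.0, 1.0], size=(m, n))
y = np.array([val_error(a) for a in A])
X, names = parity_features(A)
coef = Lasso(alpha=0.02).fit(X, y).coef_
print([(S, round(w, 3)) for S, w in zip(names, coef) if abs(w) > 0.05])
# expected: the influential terms (2,) and (5, 7) are recovered
```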
Generalized Clustering by Learning to Optimize Expected Normalized Cuts
Title | Generalized Clustering by Learning to Optimize Expected Normalized Cuts |
Authors | Anonymous |
Abstract | We introduce a novel end-to-end approach for learning to cluster in the absence of labeled examples. Our clustering objective is based on optimizing normalized cuts, a criterion which measures both intra-cluster similarity and inter-cluster dissimilarity. We define a differentiable loss function equivalent to the expected normalized cuts. Unlike much of the work in unsupervised deep learning, our trained model directly outputs final cluster assignments, rather than embeddings that need further processing to be usable. Our approach generalizes to unseen datasets across a wide variety of domains, including text and images. Specifically, we achieve state-of-the-art results on popular unsupervised clustering benchmarks (e.g., MNIST, Reuters, CIFAR-10, and CIFAR-100), outperforming the strongest baselines by up to 10.9%. Our generalization results are superior (by up to 21.9%) to the recent top-performing clustering approach with the ability to generalize. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BklLVAEKvH |
https://openreview.net/pdf?id=BklLVAEKvH | |
PWC | https://paperswithcode.com/paper/generalized-clustering-by-learning-to-1 |
Repo | |
Framework | |
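A hedged sketch of a differentiable soft normalized-cut loss in the spirit of the abstract: with soft assignments Y (rows sum to 1) and affinity matrix W, Ncut = Σₖ Yₖᵀ W (1−Yₖ) / (Yₖᵀ W 1), minimized when similar points share a cluster. The exact expectation and normalization the paper uses are assumptions; this is the generic construction.

```python
import torch

def soft_ncut_loss(Y, W):
    """Y: (n, k) soft assignments (rows sum to 1); W: (n, n) affinities.
    Ncut = sum_k Y_k^T W (1 - Y_k) / (Y_k^T W 1)."""
    cut = ((Y.t() @ W) * (1 - Y).t()).sum(dim=1)   # per-cluster cut
    vol = Y.t() @ W.sum(dim=1)                     # per-cluster volume
    return (cut / vol.clamp_min(1e-9)).sum()

# toy usage: two well-separated blobs, soft assignments from free logits
torch.manual_seed(0)
x = torch.cat([torch.randn(20, 2), torch.randn(20, 2) + 8.0])
W = torch.exp(-torch.cdist(x, x) ** 2 / 8.0)       # Gaussian affinity
logits = torch.randn(40, 2, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    soft_ncut_loss(torch.softmax(logits, dim=1), W).backward()
    opt.step()
print(torch.softmax(logits, dim=1).argmax(1))      # typically splits the two blobs
```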
Expected Information Maximization: Using the I-Projection for Mixture Density Estimation
Title | Expected Information Maximization: Using the I-Projection for Mixture Density Estimation |
Authors | Anonymous |
Abstract | Modelling highly multi-modal data is a challenging problem in machine learning. Most algorithms are based on maximizing the likelihood, which corresponds to the M(oment)-projection of the data distribution onto the model distribution. The M-projection forces the model to average over modes that cannot be represented by the model. In contrast, the I(nformation)-projection ignores such modes in the data and concentrates on the modes the model can represent. Such behavior is appealing whenever we deal with highly multi-modal data where it is more important to model single modes correctly than to cover all the modes. Despite this advantage, the I-projection is rarely used in practice due to the lack of algorithms that can efficiently optimize it based on data. In this work, we present a new algorithm called Expected Information Maximization (EIM) for computing the I-projection solely from samples for general latent variable models, where we focus on Gaussian mixture models and Gaussian mixtures of experts. Our approach applies a variational upper bound to the I-projection objective, which decomposes the original objective into single objectives for each mixture component as well as for the coefficients, allowing an efficient optimization. Similar to GANs, our approach also employs discriminators, but uses a more stable optimization procedure that optimizes a tight upper bound. We show that our algorithm is much more effective in computing the I-projection than recent GAN approaches, and we illustrate the effectiveness of our approach for modelling multi-modal behavior on two pedestrian and traffic prediction datasets. |
Tasks | Density Estimation, Latent Variable Models, Traffic Prediction |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=ByglLlHFDS |
https://openreview.net/pdf?id=ByglLlHFDS | |
PWC | https://paperswithcode.com/paper/expected-information-maximization-using-the-i |
Repo | |
Framework | |
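A small numeric illustration of the key contrast the abstract relies on (not the EIM algorithm itself): fitting a single Gaussian to a bimodal target. The M-projection (maximum likelihood / forward KL) averages over both modes, while the I-projection (reverse KL) locks onto one mode. Grid resolution and the optimizer seed are arbitrary assumptions.

```python
import numpy as np
from scipy.optimize import minimize

xs = np.linspace(-8.0, 8.0, 4001)
dx = xs[1] - xs[0]
p = np.exp(np.logaddexp(-0.5 * ((xs + 3) / 0.5) ** 2,
                        -0.5 * ((xs - 3) / 0.5) ** 2))
p /= p.sum() * dx                               # bimodal target density

def reverse_kl(params):                         # KL(q || p), the I-projection objective
    mu, log_s = params
    s = np.exp(log_s)
    q = np.exp(-0.5 * ((xs - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    return np.sum(q * (np.log(q + 1e-300) - np.log(p + 1e-300))) * dx

mu_M = np.sum(xs * p) * dx                      # M-projection = moment matching
mu_I = minimize(reverse_kl, x0=[1.0, 0.0]).x[0] # I-projection, seeded near a mode
print(f"M-projection mean: {mu_M:.2f}")         # ~0.0 -> averages the modes
print(f"I-projection mean: {mu_I:.2f}")         # ~+3.0 -> picks a single mode
```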
Demystifying Inter-Class Disentanglement
Title | Demystifying Inter-Class Disentanglement |
Authors | Anonymous |
Abstract | Learning to disentangle the hidden factors of variation within a set of observations is a key task for artificial intelligence. We present a unified formulation for class and content disentanglement and use it to illustrate the limitations of current methods. We therefore introduce LORD, a novel method based on Latent Optimization for Representation Disentanglement. We find that latent optimization, along with an asymmetric noise regularization, is superior to amortized inference for achieving disentangled representations. In extensive experiments, our method is shown to achieve better disentanglement performance than both adversarial and non-adversarial methods that use the same level of supervision. We further introduce a clustering-based approach to extend our method to settings that exhibit in-class variation, with promising results on the task of domain translation. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Hyl9xxHYPr |
https://openreview.net/pdf?id=Hyl9xxHYPr | |
PWC | https://paperswithcode.com/paper/demystifying-inter-class-disentanglement |
Repo | |
Framework | |
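A hedged sketch of latent optimization for class/content disentanglement: instead of an encoder, keep a learnable code per class and per image, add noise only to the per-image (content) codes as the asymmetric regularizer, and train codes and generator jointly by reconstruction. The tiny generator, dimensions, and noise scale are illustrative assumptions, not LORD's actual architecture.

```python
import torch
import torch.nn as nn

n_imgs, n_classes, d_class, d_content = 256, 10, 16, 8
class_emb = nn.Embedding(n_classes, d_class)    # one shared code per class
content_emb = nn.Embedding(n_imgs, d_content)   # one optimized latent per image
gen = nn.Sequential(nn.Linear(d_class + d_content, 64), nn.ReLU(),
                    nn.Linear(64, 32))

imgs = torch.randn(n_imgs, 32)                  # stand-in "images"
labels = torch.randint(0, n_classes, (n_imgs,))
opt = torch.optim.Adam(list(gen.parameters()) + list(class_emb.parameters())
                       + list(content_emb.parameters()), lr=1e-2)
for step in range(200):
    idx = torch.randint(0, n_imgs, (64,))
    # asymmetric noise regularization: perturb content codes only
    content = content_emb(idx) + 0.3 * torch.randn(64, d_content)
    recon = gen(torch.cat([class_emb(labels[idx]), content], dim=1))
    loss = ((recon - imgs[idx]) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```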
Learning robust visual representations using data augmentation invariance
Title | Learning robust visual representations using data augmentation invariance |
Authors | Anonymous |
Abstract | Deep convolutional neural networks trained for image object categorization have shown remarkable similarities with representations found across the primate ventral visual stream. Yet, artificial and biological networks still exhibit important differences. Here we investigate one such property: increasing invariance to identity-preserving image transformations found along the ventral stream. Despite theoretical evidence that invariance should emerge naturally from the optimization process, we present empirical evidence that the activations of convolutional neural networks trained for object categorization are not robust to identity-preserving image transformations commonly used in data augmentation. As a solution, we propose data augmentation invariance, an unsupervised learning objective which improves the robustness of the learned representations by promoting similarity between the activations of augmented image samples. Our results show that this approach is a simple yet effective and efficient way (only a ~10% increase in training time) of increasing the invariance of the models while obtaining similar categorization performance. |
Tasks | Data Augmentation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=B1elqkrKPH |
https://openreview.net/pdf?id=B1elqkrKPH | |
PWC | https://paperswithcode.com/paper/learning-robust-visual-representations-using-1 |
Repo | |
Framework | |
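A minimal sketch of the invariance objective as described: a term, added to the task loss, that pulls together a layer's activations for different augmentations of the same image. The mean-centered squared-distance form and the weighting are assumptions; the abstract specifies the idea, not these exact choices.

```python
import torch

def invariance_loss(acts):
    """acts: (n_aug, batch, d) activations of augmented copies of a batch.
    Penalize each copy's distance to the mean activation over copies."""
    mean = acts.mean(dim=0, keepdim=True)
    return ((acts - mean) ** 2).mean()

# toy usage: activations for 4 augmentations of a batch of 32 images
acts = torch.randn(4, 32, 128, requires_grad=True)
total = invariance_loss(acts)            # would be added to the task loss
total.backward()
print(total.item(), acts.grad.shape)
```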