April 1, 2020

Paper Group NANR 99

Privacy-preserving Representation Learning by Disentanglement

Title Privacy-preserving Representation Learning by Disentanglement
Authors Anonymous
Abstract Deep learning and the latest machine learning technology have heralded an era of success in data analysis. Accompanying the ever-increasing performance, which reaches super-human levels in many areas, is the requirement of amassing more and more data to train these models. Often ignored or underestimated, big data curation is associated with the risk of privacy leakage. The proposed approach seeks to mitigate these privacy issues. In order to sanitize data of sensitive content, we propose to learn a privacy-preserving data representation by disentangling it into a public and a private part, with the public part being shareable without privacy infringement. The proposed approach deals with the setting where the private features are not explicit and are estimated through the course of learning. This is particularly appealing when the notion of a sensitive attribute is "fuzzy". We showcase feasibility in terms of classification of facial attributes and identity on the CelebA dataset. The results suggest that the private component can be removed both in the case where the downstream task is known a priori (i.e., "supervised") and in the case where it is not known a priori (i.e., "weakly-supervised").
Tasks Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rkewaxrtvr
PDF https://openreview.net/pdf?id=rkewaxrtvr
PWC https://paperswithcode.com/paper/privacy-preserving-representation-learning-by
Repo
Framework

Projection Based Constrained Policy Optimization

Title Projection Based Constrained Policy Optimization
Authors Anonymous
Abstract In this paper, we consider the problem of learning control policies that optimize a reward function while satisfying constraints due to considerations of safety, fairness, or other costs. We propose a new algorithm, Projection Based Constrained Policy Optimization (PCPO), an iterative method for optimizing policies in a two-step process: the first step performs an unconstrained update, while the second step reconciles the constraint violation by projecting the policy back onto the constraint set. We theoretically analyze PCPO and provide a lower bound on reward improvement, as well as an upper bound on constraint violation for each policy update. We further characterize the convergence of PCPO with projection based on two different metrics: L2 norm and Kullback-Leibler divergence. Our empirical results over several control tasks demonstrate that our algorithm achieves superior performance, averaging more than 3.5 times less constraint violation and around 15% higher reward compared to state-of-the-art methods.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rke3TJrtPS
PDF https://openreview.net/pdf?id=rke3TJrtPS
PWC https://paperswithcode.com/paper/projection-based-constrained-policy
Repo
Framework
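
As an illustration of the two-step update described in the PCPO abstract above, here is a minimal sketch, assuming a linearized cost constraint and an L2 (Euclidean) projection. It is not the authors' implementation: the function names, the plain gradient-ascent step standing in for the unconstrained update, and the parameters `cost_grad` and `cost_offset` are hypothetical choices made for this example.

```python
def project_onto_halfspace(theta, theta_k, a, b):
    """Euclidean projection of theta onto {x : a . (x - theta_k) + b <= 0}."""
    violation = sum(ai * (ti - tki) for ai, ti, tki in zip(a, theta, theta_k)) + b
    if violation <= 0.0:
        # Already inside the (linearized) constraint set: nothing to do.
        return list(theta)
    a_sq = sum(ai * ai for ai in a)
    if a_sq == 0.0:
        # Degenerate constraint gradient: no direction to project along.
        return list(theta)
    scale = violation / a_sq
    return [ti - scale * ai for ti, ai in zip(theta, a)]


def two_step_update(theta_k, reward_grad, cost_grad, cost_offset, step_size):
    """One illustrative update: unconstrained step, then projection.

    cost_offset is assumed to be (current cost - cost limit), so a negative
    value means the current policy satisfies the constraint with some slack.
    """
    # Step 1: unconstrained improvement (plain gradient ascent, an assumption).
    theta_mid = [t + step_size * g for t, g in zip(theta_k, reward_grad)]
    # Step 2: project back onto the linearized constraint set.
    return project_onto_halfspace(theta_mid, theta_k, cost_grad, cost_offset)


if __name__ == "__main__":
    print(two_step_update([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], -0.5, 1.0))
```

The projection only moves the parameters along the constraint gradient, so the component of the reward step that does not affect the cost is left untouched. The abstract also mentions a projection under Kullback-Leibler divergence, which this sketch does not cover.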

Single episode transfer for differing environmental dynamics in reinforcement learning

Title Single episode transfer for differing environmental dynamics in reinforcement learning
Authors Anonymous
Abstract Transfer and adaptation to new unknown environmental dynamics is a key challenge for reinforcement learning (RL). An even greater challenge is performing near-optimally in a single attempt at test time, possibly without access to dense rewards, which is not addressed by current methods that require multiple experience rollouts for adaptation. To achieve single episode transfer in a family of environments with related dynamics, we propose a general algorithm that optimizes a probe and an inference model to rapidly estimate underlying latent variables of test dynamics, which are then immediately used as input to a universal control policy. This modular approach enables integration of state-of-the-art algorithms for variational inference or RL. Moreover, our approach does not require access to rewards at test time, allowing it to perform in settings where existing adaptive approaches cannot. In diverse experimental domains with a single episode test constraint, our method significantly outperforms existing adaptive approaches and shows favorable performance against baselines for robust transfer.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJeQoCNYDS
PDF https://openreview.net/pdf?id=rJeQoCNYDS
PWC https://paperswithcode.com/paper/single-episode-transfer-for-differing
Repo
Framework

SELF: Learning to Filter Noisy Labels with Self-Ensembling

Title SELF: Learning to Filter Noisy Labels with Self-Ensembling
Authors Anonymous
Abstract Deep neural networks (DNNs) have been shown to over-fit a dataset when trained with noisy labels for a long enough time. To overcome this problem, we present a simple and effective method, self-ensemble label filtering (SELF), to progressively filter out the wrong labels during training. Our method improves the task performance by gradually allowing supervision only from the potentially non-noisy (clean) labels and stops learning on the filtered noisy labels. For the filtering, we form running averages of predictions over the entire training dataset using the network output at different training epochs. We show that these ensemble estimates yield more accurate identification of inconsistent predictions throughout training than the single estimates of the network at the most recent training epoch. While filtered samples are removed entirely from the supervised training loss, we dynamically leverage them via semi-supervised learning in the unsupervised loss. We demonstrate the positive effect of such an approach on various image classification tasks under both symmetric and asymmetric label noise and at different noise ratios. It substantially outperforms all previous works on noise-aware learning across different datasets and can be applied to a broad set of network architectures.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=HkgsPhNYPS
PDF https://openreview.net/pdf?id=HkgsPhNYPS
PWC https://paperswithcode.com/paper/self-learning-to-filter-noisy-labels-with-1
Repo
Framework
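
The filtering idea in the SELF abstract above (running averages of per-sample predictions across epochs, used to flag labels that disagree with the ensemble) can be sketched as follows. This is a hedged illustration, not the authors' code: the exponential moving average, the `momentum` value, the argmax agreement rule, and both function names are assumptions made for this example.

```python
def update_ensemble(ensemble, predictions, momentum=0.9):
    """Blend this epoch's class probabilities into the running average.

    ensemble and predictions are lists of per-sample probability lists.
    """
    return [
        [momentum * e + (1.0 - momentum) * p for e, p in zip(e_row, p_row)]
        for e_row, p_row in zip(ensemble, predictions)
    ]


def keep_mask(ensemble, labels):
    """True where the ensemble's most likely class agrees with the given label."""
    mask = []
    for row, label in zip(ensemble, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        mask.append(predicted == label)
    return mask


if __name__ == "__main__":
    ensemble = [[0.5, 0.5], [0.5, 0.5]]
    epoch_predictions = [[0.9, 0.1], [0.2, 0.8]]
    ensemble = update_ensemble(ensemble, epoch_predictions, momentum=0.5)
    # Both samples carry label 0; the second disagrees with the ensemble.
    print(keep_mask(ensemble, [0, 0]))
```

Samples whose mask entry is False would be dropped from the supervised loss; per the abstract, they can still contribute through an unsupervised loss, which is not shown here.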

Efficient Probabilistic Logic Reasoning with Graph Neural Networks

Title Efficient Probabilistic Logic Reasoning with Graph Neural Networks
Authors Anonymous
Abstract Markov Logic Networks (MLNs), which elegantly combine logic rules and probabilistic graphical models, can be used to address many knowledge graph problems. However, inference in MLN is computationally intensive, making the industrial-scale application of MLN very difficult. In recent years, graph neural networks (GNNs) have emerged as efficient and effective tools for large-scale graph problems. Nevertheless, GNNs do not explicitly incorporate prior logic rules into the models, and may require many labeled examples for a target task. In this paper, we explore the combination of MLNs and GNNs, and use graph neural networks for variational inference in MLN. We propose a GNN variant, named ExpressGNN, which strikes a nice balance between the representation power and the simplicity of the model. Our extensive experiments on several benchmark datasets demonstrate that ExpressGNN leads to effective and efficient probabilistic logic reasoning.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJg76kStwH
PDF https://openreview.net/pdf?id=rJg76kStwH
PWC https://paperswithcode.com/paper/efficient-probabilistic-logic-reasoning-with
Repo
Framework

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning

Title Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Authors Anonymous
Abstract Off-policy estimation for long-horizon problems is important in many real-life applications such as healthcare and robotics, where high-fidelity simulators may not be available and on-policy evaluation is expensive or impossible. Recently, Liu et al. (2018) proposed an approach that avoids the curse of horizon suffered by typical importance-sampling-based methods. While showing promising results, this approach is limited in practice as it requires data being collected by a known behavior policy. In this work, we propose a novel approach that eliminates such limitations. In particular, we formulate the problem as solving for the fixed point of a “backward flow” operator and show that the fixed point solution gives the desired importance ratios of stationary distributions between the target and behavior policies. We analyze its asymptotic consistency and finite-sample generalization. Experiments on benchmarks verify the effectiveness of our proposed approach.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=S1ltg1rFDS
PDF https://openreview.net/pdf?id=S1ltg1rFDS
PWC https://paperswithcode.com/paper/black-box-off-policy-estimation-for-infinite
Repo
Framework

On the Equivalence between Node Embeddings and Structural Graph Representations

Title On the Equivalence between Node Embeddings and Structural Graph Representations
Authors Anonymous
Abstract This work provides the first unifying theoretical framework for node embeddings and structural graph representations, bridging methods like matrix factorization and graph neural networks. Using invariant theory, we show that the relationship between structural representations and node embeddings is analogous to that of a distribution and its samples. We prove that all tasks that can be performed by node embeddings can also be performed by structural representations and vice versa. We also show that the concept of transductive and inductive learning is unrelated to node embeddings and graph representations, clearing another source of confusion in the literature. Finally, we introduce new practical guidelines for generating and using node embeddings, which further augment the standard operating procedures used today.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SJxzFySKwH
PDF https://openreview.net/pdf?id=SJxzFySKwH
PWC https://paperswithcode.com/paper/on-the-equivalence-between-node-embeddings
Repo
Framework

Emergent Systematic Generalization In a Situated Agent

Title Emergent Systematic Generalization In a Situated Agent
Authors Anonymous
Abstract The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we demonstrate strong emergent systematic generalisation in a neural network agent and isolate the factors that support this ability. In environments ranging from a grid-world to a rich interactive 3D Unity room, we show that an agent can correctly exploit the compositional nature of a symbolic language to interpret never-seen-before instructions. We observe this capacity not only when instructions refer to object properties (colors and shapes) but also verb-like motor skills (lifting and putting) and abstract modifying operations (negation). We identify three factors that can contribute to this facility for systematic generalisation: (a) the number of object/word experiences in the training set; (b) the invariances afforded by a first-person, egocentric perspective; and (c) the variety of visual input experienced by an agent that perceives the world actively over time. Thus, while neural nets trained in idealised or reduced situations may fail to exhibit a compositional or systematic understanding of their experience, this competence can readily emerge when, like human learners, they have access to many examples of richly varying, multi-modal observations as they learn.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SklGryBtwr
PDF https://openreview.net/pdf?id=SklGryBtwr
PWC https://paperswithcode.com/paper/emergent-systematic-generalization-in-a
Repo
Framework

Learning Efficient Parameter Server Synchronization Policies for Distributed SGD

Title Learning Efficient Parameter Server Synchronization Policies for Distributed SGD
Authors Anonymous
Abstract We apply a reinforcement learning (RL) based approach to learning optimal synchronization policies used for Parameter Server-based distributed training of machine learning models with Stochastic Gradient Descent (SGD). Utilizing a formal synchronization policy description in the PS-setting, we are able to derive a suitable and compact description of states and actions, allowing us to efficiently use the standard off-the-shelf deep Q-learning algorithm. As a result, we are able to learn synchronization policies which generalize to different cluster environments, different training datasets and small model variations and (most importantly) lead to considerable decreases in training time when compared to standard policies such as bulk synchronous parallel (BSP), asynchronous parallel (ASP), or stale synchronous parallel (SSP). To support our claims we present extensive numerical results obtained from experiments performed in simulated cluster environments. In our experiments training time is reduced by 44 on average and learned policies generalize to multiple unseen circumstances.
Tasks Q-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rJxX8T4Kvr
PDF https://openreview.net/pdf?id=rJxX8T4Kvr
PWC https://paperswithcode.com/paper/learning-efficient-parameter-server
Repo
Framework

Unsupervised Clustering using Pseudo-semi-supervised Learning

Title Unsupervised Clustering using Pseudo-semi-supervised Learning
Authors Anonymous
Abstract In this paper, we propose a framework that leverages semi-supervised models to improve unsupervised clustering performance. To leverage semi-supervised models, we first need to automatically generate labels, called pseudo-labels. We find that prior approaches for generating pseudo-labels hurt clustering performance because of their low accuracy. Instead, we use an ensemble of deep networks to construct a similarity graph, from which we extract high-accuracy pseudo-labels. The approach of finding high-quality pseudo-labels using ensembles and training the semi-supervised model is iterated, yielding continued improvement. We show that our approach outperforms state-of-the-art clustering results for multiple image and text datasets. For example, we achieve 54.6% accuracy for CIFAR-10 and 43.9% for 20news, outperforming the state of the art by 8-12% in absolute terms.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJlnxkSYPS
PDF https://openreview.net/pdf?id=rJlnxkSYPS
PWC https://paperswithcode.com/paper/unsupervised-clustering-using-pseudo-semi
Repo
Framework

The Ingredients of Real World Robotic Reinforcement Learning

Title The Ingredients of Real World Robotic Reinforcement Learning
Authors Anonymous
Abstract The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning. In this work, we discuss the required elements of a robotic system that can continually and autonomously improve with data collected in the real world, and propose a particular instantiation of such a system. Subsequently, we investigate a number of challenges of learning without instrumentation – including the lack of episodic resets, state estimation, and hand-engineered rewards – and propose simple, scalable solutions to these challenges. We demonstrate the efficacy of our proposed system on dexterous robotic manipulation tasks in simulation and the real world, and also provide an insightful analysis and ablation study of the challenges associated with this learning paradigm.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJe2syrtvS
PDF https://openreview.net/pdf?id=rJe2syrtvS
PWC https://paperswithcode.com/paper/the-ingredients-of-real-world-robotic
Repo
Framework

Distance-based Composable Representations with Neural Networks

Title Distance-based Composable Representations with Neural Networks
Authors Anonymous
Abstract We introduce a new deep learning technique that builds individual and class representations based on distance estimates to randomly generated contextual dimensions for different modalities. Recent works have demonstrated advantages to creating representations from probability distributions over their contexts rather than single points in a low-dimensional Euclidean vector space. These methods, however, rely on pre-existing features and are limited to textual information. In this work, we obtain generic template representations that are vectors containing the average distance of a class to randomly generated contextual information. These representations have the benefit of being both interpretable and composable. They are initially learned by estimating the Wasserstein distance for different data subsets with deep neural networks. Individual samples or instances can then be compared to the generic class representations, which we call templates, to determine their similarity and thus class membership. We show that this technique, which we call WDVec, delivers good results for multi-label image classification. Additionally, we illustrate the benefit of templates and their composability by performing retrieval with complex queries where we modify the information content in the representations. Our method can be used in conjunction with any existing neural network and can create feature maps that are, in theory, infinitely large.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=HJgb7lSFwS
PDF https://openreview.net/pdf?id=HJgb7lSFwS
PWC https://paperswithcode.com/paper/distance-based-composable-representations
Repo
Framework

Jacobian Adversarially Regularized Networks for Robustness

Title Jacobian Adversarially Regularized Networks for Robustness
Authors Anonymous
Abstract Adversarial examples are crafted with imperceptible perturbations with the intent to fool neural networks. Against such attacks, adversarial training and its variants stand as the strongest defense to date. Previous studies have pointed out that robust models that have undergone adversarial training tend to produce more salient and interpretable Jacobian matrices than their non-robust counterparts. A natural question is whether a model trained with an objective to produce salient Jacobian can result in better robustness. This paper answers this question with affirmative empirical results. We propose Jacobian Adversarially Regularized Networks (JARN) as a method to optimize the saliency of a classifier’s Jacobian by adversarially regularizing the model’s Jacobian to resemble natural training images. Image classifiers trained with JARN show improved robust accuracy compared to standard models on the MNIST, SVHN and CIFAR-10 datasets, uncovering a new angle to boost robustness without using adversarial training.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Hke0V1rKPS
PDF https://openreview.net/pdf?id=Hke0V1rKPS
PWC https://paperswithcode.com/paper/jacobian-adversarially-regularized-networks
Repo
Framework

Duration-of-Stay Storage Assignment under Uncertainty

Title Duration-of-Stay Storage Assignment under Uncertainty
Authors Anonymous
Abstract Storage assignment, the act of choosing what goods are placed in what locations in a warehouse, is a central problem of supply chain logistics. Past literature has shown that the optimal method to assign pallets is to arrange them in increasing duration of stay in the warehouse (the Duration-of-Stay, or DoS, method), but the methodology requires perfect prior knowledge of DoS for each pallet, which is unknown and uncertain under realistic conditions. Attempts to predict DoS have largely been unfruitful due to its multi-valued nature (every shipment contains multiple identical pallets with different DoS) and the data sparsity induced by a lack of matching historical conditions. In this paper, we introduce a new framework for storage assignment that provides a solution to the DoS prediction problem through a distributional reformulation and a novel neural network, ParallelNet. Through collaboration with a world-leading cold storage company, we show that the system is able to predict DoS with a MAPE of 29%, a decrease of ~30% compared to a CNN-LSTM model, and suffers less performance decay into the future. The framework is then integrated into a first-of-its-kind Storage Assignment system, which is being deployed in warehouses across the United States, with initial results showing up to 21% in labor savings. We also release the first publicly available set of warehousing records to facilitate research into this central problem.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Hkx7xRVYDr
PDF https://openreview.net/pdf?id=Hkx7xRVYDr
PWC https://paperswithcode.com/paper/duration-of-stay-storage-assignment-under
Repo
Framework
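
The DoS abstract above reports prediction quality as MAPE (mean absolute percentage error). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition; the function name and the sample numbers are illustrative only and are not taken from the paper or its dataset.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Averages |actual - predicted| / |actual| over all samples and scales by 100.
    Undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up durations of stay (in days) and made-up predictions.
    print(mape([10.0, 20.0], [8.0, 25.0]))
```

Because each error is divided by the true value, the metric weights a one-day miss on a short stay more heavily than the same miss on a long stay.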

Weakly Supervised Disentanglement with Guarantees

Title Weakly Supervised Disentanglement with Guarantees
Authors Anonymous
Abstract Learning disentangled representations that correspond to factors of variation in real-world data is critical to interpretable and human-controllable machine learning. Recently, concerns about the viability of learning disentangled representations in a purely unsupervised manner have spurred a shift toward the incorporation of weak supervision. However, there is currently no formalism that identifies when and how weak supervision will guarantee disentanglement. To address this issue, we provide a theoretical framework, including a calculus of disentanglement, to assist in analyzing the disentanglement guarantees (or lack thereof) conferred by weak supervision when coupled with learning algorithms based on distribution matching. We empirically verify the guarantees and limitations of several weak supervision methods (restricted labeling, match-pairing, and rank-pairing), demonstrating the predictive power and usefulness of our theoretical framework.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJgSwyBKvr
PDF https://openreview.net/pdf?id=HJgSwyBKvr
PWC https://paperswithcode.com/paper/weakly-supervised-disentanglement-with
Repo
Framework