April 1, 2020


Paper Group NANR 99

Privacy-preserving Representation Learning by Disentanglement. Projection Based Constrained Policy Optimization. Single episode transfer for differing environmental dynamics in reinforcement learning. SELF: Learning to Filter Noisy Labels with Self-Ensembling. Efficient Probabilistic Logic Reasoning with Graph Neural Networks. Black-box Off-policy …

Privacy-preserving Representation Learning by Disentanglement


Title	Privacy-preserving Representation Learning by Disentanglement
Authors	Anonymous
Abstract	Deep learning and the latest machine learning technology have heralded an era of success in data analysis. Accompanying the ever-increasing performance, which reaches super-human levels in many areas, is the requirement of amassing more and more data to train these models. Often ignored or underestimated, big data curation is associated with the risk of privacy leakage. The proposed approach seeks to mitigate these privacy issues. In order to sanitize data of sensitive content, we propose to learn a privacy-preserving data representation by disentangling it into a public and a private part, with the public part being shareable without privacy infringement. The proposed approach deals with the setting where the private features are not explicit and are estimated through the course of learning. This is particularly appealing when the notion of a sensitive attribute is "fuzzy". We showcase feasibility in terms of classification of facial attributes and identity on the CelebA dataset. The results suggest that the private component can be removed both in the case where the downstream task is known a priori (i.e., "supervised") and in the case where it is not known a priori (i.e., "weakly-supervised").
Tasks	Representation Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=rkewaxrtvr
PDF	https://openreview.net/pdf?id=rkewaxrtvr
PWC	https://paperswithcode.com/paper/privacy-preserving-representation-learning-by
Repo
Framework

Projection Based Constrained Policy Optimization


Title	Projection Based Constrained Policy Optimization
Authors	Anonymous
Abstract	In this paper, we consider the problem of learning control policies that optimize a reward function while satisfying constraints due to considerations of safety, fairness, or other costs. We propose a new algorithm, Projection Based Constrained Policy Optimization (PCPO), an iterative method for optimizing policies in a two-step process: the first step performs an unconstrained update, while the second step reconciles the constraint violation by projecting the policy back onto the constraint set. We theoretically analyze PCPO and provide a lower bound on reward improvement, as well as an upper bound on constraint violation, for each policy update. We further characterize the convergence of PCPO with projection based on two different metrics: L2 norm and Kullback-Leibler divergence. Our empirical results over several control tasks demonstrate that our algorithm achieves superior performance, averaging more than 3.5 times less constraint violation and around 15% higher reward compared to state-of-the-art methods.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=rke3TJrtPS
PDF	https://openreview.net/pdf?id=rke3TJrtPS
PWC	https://paperswithcode.com/paper/projection-based-constrained-policy
Repo
Framework
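
The two-step structure described in the abstract above can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a plain parameter vector, a single linear constraint `a . theta <= b` with nonzero `a`, and an L2 projection. The names `project_onto_halfspace` and `two_step_update` are hypothetical and used only for illustration.

```python
def project_onto_halfspace(theta, a, b):
    """Euclidean (L2) projection of theta onto the set {x : a . x <= b}.

    Assumes `a` is a nonzero vector (an assumption of this sketch).
    """
    violation = sum(ai * ti for ai, ti in zip(a, theta)) - b
    if violation <= 0:
        # Constraint already satisfied: the projection is the point itself.
        return list(theta)
    norm_sq = sum(ai * ai for ai in a)
    # Move back along the constraint normal just far enough to reach the boundary.
    return [ti - violation * ai / norm_sq for ti, ai in zip(theta, a)]


def two_step_update(theta, reward_grad, a, b, step_size=0.1):
    """One toy iteration: unconstrained ascent step, then projection."""
    # Step 1: unconstrained update that follows the reward gradient.
    intermediate = [t + step_size * g for t, g in zip(theta, reward_grad)]
    # Step 2: reconcile any constraint violation by projecting back.
    return project_onto_halfspace(intermediate, a, b)


if __name__ == "__main__":
    new_theta = two_step_update([0.0, 0.0], [1.0, 1.0], a=[1.0, 0.0], b=0.05)
    print(new_theta)
```

In this toy setting the first coordinate is clipped back to the constraint boundary while the second coordinate keeps its full unconstrained step.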

Single episode transfer for differing environmental dynamics in reinforcement learning


Title	Single episode transfer for differing environmental dynamics in reinforcement learning
Authors	Anonymous
Abstract	Transfer and adaptation to new unknown environmental dynamics is a key challenge for reinforcement learning (RL). An even greater challenge is performing near-optimally in a single attempt at test time, possibly without access to dense rewards, which is not addressed by current methods that require multiple experience rollouts for adaptation. To achieve single episode transfer in a family of environments with related dynamics, we propose a general algorithm that optimizes a probe and an inference model to rapidly estimate underlying latent variables of test dynamics, which are then immediately used as input to a universal control policy. This modular approach enables integration of state-of-the-art algorithms for variational inference or RL. Moreover, our approach does not require access to rewards at test time, allowing it to perform in settings where existing adaptive approaches cannot. In diverse experimental domains with a single episode test constraint, our method significantly outperforms existing adaptive approaches and shows favorable performance against baselines for robust transfer.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=rJeQoCNYDS
PDF	https://openreview.net/pdf?id=rJeQoCNYDS
PWC	https://paperswithcode.com/paper/single-episode-transfer-for-differing
Repo
Framework

SELF: Learning to Filter Noisy Labels with Self-Ensembling


Title	SELF: Learning to Filter Noisy Labels with Self-Ensembling
Authors	Anonymous
Abstract	Deep neural networks (DNNs) have been shown to over-fit a dataset when trained with noisy labels for a long enough time. To overcome this problem, we present a simple and effective method, self-ensemble label filtering (SELF), to progressively filter out the wrong labels during training. Our method improves the task performance by gradually allowing supervision only from the potentially non-noisy (clean) labels and stops learning on the filtered noisy labels. For the filtering, we form running averages of predictions over the entire training dataset using the network output at different training epochs. We show that these ensemble estimates yield more accurate identification of inconsistent predictions throughout training than the single estimates of the network at the most recent training epoch. While filtered samples are removed entirely from the supervised training loss, we dynamically leverage them via semi-supervised learning in the unsupervised loss. We demonstrate the positive effect of such an approach on various image classification tasks under both symmetric and asymmetric label noise and at different noise ratios. It substantially outperforms all previous works on noise-aware learning across different datasets and can be applied to a broad set of network architectures.
Tasks	Image Classification
Published	2020-01-01
URL	https://openreview.net/forum?id=HkgsPhNYPS
PDF	https://openreview.net/pdf?id=HkgsPhNYPS
PWC	https://paperswithcode.com/paper/self-learning-to-filter-noisy-labels-with-1
Repo
Framework
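
The filtering idea in the abstract above (running averages of per-sample predictions across epochs, used to flag labels that disagree with the ensemble) might look roughly like the sketch below. The exponential moving average, the `momentum` value, and the argmax agreement rule are assumptions made for illustration; the abstract does not specify them, and the function names are hypothetical.

```python
def update_ensemble(ensemble, predictions, momentum=0.9):
    """Blend this epoch's class probabilities into the running average.

    `ensemble` and `predictions` are lists of per-sample probability lists.
    """
    return [
        [momentum * e + (1.0 - momentum) * p for e, p in zip(e_row, p_row)]
        for e_row, p_row in zip(ensemble, predictions)
    ]


def keep_mask(ensemble, labels):
    """True where the ensemble's most likely class agrees with the given label."""
    mask = []
    for row, label in zip(ensemble, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        mask.append(predicted == label)
    return mask


if __name__ == "__main__":
    ensemble = [[0.5, 0.5], [0.5, 0.5]]
    epoch_predictions = [[0.9, 0.1], [0.8, 0.2]]
    ensemble = update_ensemble(ensemble, epoch_predictions, momentum=0.5)
    # The second sample carries label 1, but the ensemble favours class 0,
    # so it would be dropped from the supervised loss in this sketch.
    print(keep_mask(ensemble, labels=[0, 1]))
```

Samples whose mask entry is False would be excluded from the supervised loss and, following the abstract, could still be used as unlabeled data.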

Efficient Probabilistic Logic Reasoning with Graph Neural Networks


Title	Efficient Probabilistic Logic Reasoning with Graph Neural Networks
Authors	Anonymous
Abstract	Markov Logic Networks (MLNs), which elegantly combine logic rules and probabilistic graphical models, can be used to address many knowledge graph problems. However, inference in MLN is computationally intensive, making the industrial-scale application of MLN very difficult. In recent years, graph neural networks (GNNs) have emerged as efficient and effective tools for large-scale graph problems. Nevertheless, GNNs do not explicitly incorporate prior logic rules into the models, and may require many labeled examples for a target task. In this paper, we explore the combination of MLNs and GNNs, and use graph neural networks for variational inference in MLN. We propose a GNN variant, named ExpressGNN, which strikes a nice balance between the representation power and the simplicity of the model. Our extensive experiments on several benchmark datasets demonstrate that ExpressGNN leads to effective and efficient probabilistic logic reasoning.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=rJg76kStwH
PDF	https://openreview.net/pdf?id=rJg76kStwH
PWC	https://paperswithcode.com/paper/efficient-probabilistic-logic-reasoning-with
Repo
Framework

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning


Title	Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Authors	Anonymous
Abstract	Off-policy estimation for long-horizon problems is important in many real-life applications such as healthcare and robotics, where high-fidelity simulators may not be available and on-policy evaluation is expensive or impossible. Recently, Liu et al. (2018) proposed an approach that avoids the curse of horizon suffered by typical importance-sampling-based methods. While showing promising results, this approach is limited in practice as it requires data being collected by a known behavior policy. In this work, we propose a novel approach that eliminates such limitations. In particular, we formulate the problem as solving for the fixed point of a “backward flow” operator and show that the fixed point solution gives the desired importance ratios of stationary distributions between the target and behavior policies. We analyze its asymptotic consistency and finite-sample generalization. Experiments on benchmarks verify the effectiveness of our proposed approach.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=S1ltg1rFDS
PDF	https://openreview.net/pdf?id=S1ltg1rFDS
PWC	https://paperswithcode.com/paper/black-box-off-policy-estimation-for-infinite
Repo
Framework

On the Equivalence between Node Embeddings and Structural Graph Representations


Title	On the Equivalence between Node Embeddings and Structural Graph Representations
Authors	Anonymous
Abstract	This work provides the first unifying theoretical framework for node embeddings and structural graph representations, bridging methods like matrix factorization and graph neural networks. Using invariant theory, we show that the relationship between structural representations and node embeddings is analogous to that of a distribution and its samples. We prove that all tasks that can be performed by node embeddings can also be performed by structural representations and vice-versa. We also show that the concept of transductive and inductive learning is unrelated to node embeddings and graph representations, clearing another source of confusion in the literature. Finally, we introduce new practical guidelines for generating and using node embeddings, which further augment the standard operating procedures used today.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=SJxzFySKwH
PDF	https://openreview.net/pdf?id=SJxzFySKwH
PWC	https://paperswithcode.com/paper/on-the-equivalence-between-node-embeddings
Repo
Framework

Emergent Systematic Generalization In a Situated Agent


Title	Emergent Systematic Generalization In a Situated Agent
Authors	Anonymous
Abstract	The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we demonstrate strong emergent systematic generalisation in a neural network agent and isolate the factors that support this ability. In environments ranging from a grid-world to a rich interactive 3D Unity room, we show that an agent can correctly exploit the compositional nature of a symbolic language to interpret never-seen-before instructions. We observe this capacity not only when instructions refer to object properties (colors and shapes) but also verb-like motor skills (lifting and putting) and abstract modifying operations (negation). We identify three factors that can contribute to this facility for systematic generalisation: (a) the number of object/word experiences in the training set; (b) the invariances afforded by a first-person, egocentric perspective; and (c) the variety of visual input experienced by an agent that perceives the world actively over time. Thus, while neural nets trained in idealised or reduced situations may fail to exhibit a compositional or systematic understanding of their experience, this competence can readily emerge when, like human learners, they have access to many examples of richly varying, multi-modal observations as they learn.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=SklGryBtwr
PDF	https://openreview.net/pdf?id=SklGryBtwr
PWC	https://paperswithcode.com/paper/emergent-systematic-generalization-in-a
Repo
Framework

Learning Efficient Parameter Server Synchronization Policies for Distributed SGD


Title	Learning Efficient Parameter Server Synchronization Policies for Distributed SGD
Authors	Anonymous
Abstract	We apply a reinforcement learning (RL) based approach to learning optimal synchronization policies used for Parameter Server-based distributed training of machine learning models with Stochastic Gradient Descent (SGD). Utilizing a formal synchronization policy description in the PS-setting, we are able to derive a suitable and compact description of states and actions, allowing us to efficiently use the standard off-the-shelf deep Q-learning algorithm. As a result, we are able to learn synchronization policies which generalize to different cluster environments, different training datasets and small model variations and (most importantly) lead to considerable decreases in training time when compared to standard policies such as bulk synchronous parallel (BSP), asynchronous parallel (ASP), or stale synchronous parallel (SSP). To support our claims, we present extensive numerical results obtained from experiments performed in simulated cluster environments. In our experiments, training time is reduced by 44% on average and learned policies generalize to multiple unseen circumstances.
Tasks	Q-Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=rJxX8T4Kvr
PDF	https://openreview.net/pdf?id=rJxX8T4Kvr
PWC	https://paperswithcode.com/paper/learning-efficient-parameter-server
Repo
Framework

Unsupervised Clustering using Pseudo-semi-supervised Learning


Title	Unsupervised Clustering using Pseudo-semi-supervised Learning
Authors	Anonymous
Abstract	In this paper, we propose a framework that leverages semi-supervised models to improve unsupervised clustering performance. To leverage semi-supervised models, we first need to automatically generate labels, called pseudo-labels. We find that prior approaches for generating pseudo-labels hurt clustering performance because of their low accuracy. Instead, we use an ensemble of deep networks to construct a similarity graph, from which we extract high-accuracy pseudo-labels. The approach of finding high-quality pseudo-labels using ensembles and training the semi-supervised model is iterated, yielding continued improvement. We show that our approach outperforms state-of-the-art clustering results for multiple image and text datasets. For example, we achieve 54.6% accuracy for CIFAR-10 and 43.9% for 20news, outperforming the state of the art by 8-12% in absolute terms.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=rJlnxkSYPS
PDF	https://openreview.net/pdf?id=rJlnxkSYPS
PWC	https://paperswithcode.com/paper/unsupervised-clustering-using-pseudo-semi
Repo
Framework
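
One way to picture the ensemble-based similarity graph from the abstract above is the sketch below. It assumes each ensemble member has already produced a hard cluster assignment, links two samples only when every member puts them in the same cluster, and treats connected components of at least `min_size` samples as pseudo-labeled groups. Both rules and both function names are assumptions for illustration, not the paper's stated procedure.

```python
from itertools import combinations


def agreement_graph(assignments):
    """Edges (i, j) for pairs that every ensemble member clusters together.

    `assignments` is a list of per-model cluster-id lists of equal length.
    """
    n = len(assignments[0])
    edges = set()
    for i, j in combinations(range(n), 2):
        if all(model[i] == model[j] for model in assignments):
            edges.add((i, j))
    return edges


def pseudo_labels(n, edges, min_size=2):
    """Label connected components with at least `min_size` members; others get None."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    labels = [None] * n
    next_label = 0
    for members in sorted(groups.values()):
        if len(members) >= min_size:
            for i in members:
                labels[i] = next_label
            next_label += 1
    return labels


if __name__ == "__main__":
    # Two hypothetical models; cluster ids need not match across models.
    assignments = [[0, 0, 1, 1, 2], [1, 1, 0, 0, 0]]
    edges = agreement_graph(assignments)
    print(pseudo_labels(5, edges))
```

Samples left as None have no confident pseudo-label and would be handled as unlabeled data by the semi-supervised model.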

The Ingredients of Real World Robotic Reinforcement Learning


Title	The Ingredients of Real World Robotic Reinforcement Learning
Authors	Anonymous
Abstract	The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning. In this work, we discuss the required elements of a robotic system that can continually and autonomously improve with data collected in the real world, and propose a particular instantiation of such a system. Subsequently, we investigate a number of challenges of learning without instrumentation – including the lack of episodic resets, state estimation, and hand-engineered rewards – and propose simple, scalable solutions to these challenges. We demonstrate the efficacy of our proposed system on dexterous robotic manipulation tasks in simulation and the real world, and also provide an insightful analysis and ablation study of the challenges associated with this learning paradigm.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=rJe2syrtvS
PDF	https://openreview.net/pdf?id=rJe2syrtvS
PWC	https://paperswithcode.com/paper/the-ingredients-of-real-world-robotic
Repo
Framework

Distance-based Composable Representations with Neural Networks


Title	Distance-based Composable Representations with Neural Networks
Authors	Anonymous
Abstract	We introduce a new deep learning technique that builds individual and class representations based on distance estimates to randomly generated contextual dimensions for different modalities. Recent works have demonstrated advantages to creating representations from probability distributions over their contexts rather than single points in a low-dimensional Euclidean vector space. These methods, however, rely on pre-existing features and are limited to textual information. In this work, we obtain generic template representations that are vectors containing the average distance of a class to randomly generated contextual information. These representations have the benefit of being both interpretable and composable. They are initially learned by estimating the Wasserstein distance for different data subsets with deep neural networks. Individual samples or instances can then be compared to the generic class representations, which we call templates, to determine their similarity and thus class membership. We show that this technique, which we call WDVec, delivers good results for multi-label image classification. Additionally, we illustrate the benefit of templates and their composability by performing retrieval with complex queries where we modify the information content in the representations. Our method can be used in conjunction with any existing neural network and create theoretically infinitely large feature maps.
Tasks	Image Classification
Published	2020-01-01
URL	https://openreview.net/forum?id=HJgb7lSFwS
PDF	https://openreview.net/pdf?id=HJgb7lSFwS
PWC	https://paperswithcode.com/paper/distance-based-composable-representations
Repo
Framework

Jacobian Adversarially Regularized Networks for Robustness


Title	Jacobian Adversarially Regularized Networks for Robustness
Authors	Anonymous
Abstract	Adversarial examples are crafted with imperceptible perturbations with the intent to fool neural networks. Against such attacks, adversarial training and its variants stand as the strongest defense to date. Previous studies have pointed out that robust models that have undergone adversarial training tend to produce more salient and interpretable Jacobian matrices than their non-robust counterparts. A natural question is whether a model trained with an objective to produce salient Jacobian can result in better robustness. This paper answers this question with affirmative empirical results. We propose Jacobian Adversarially Regularized Networks (JARN) as a method to optimize the saliency of a classifier’s Jacobian by adversarially regularizing the model’s Jacobian to resemble natural training images. Image classifiers trained with JARN show improved robust accuracy compared to standard models on the MNIST, SVHN and CIFAR-10 datasets, uncovering a new angle to boost robustness without using adversarial training.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=Hke0V1rKPS
PDF	https://openreview.net/pdf?id=Hke0V1rKPS
PWC	https://paperswithcode.com/paper/jacobian-adversarially-regularized-networks
Repo
Framework

Duration-of-Stay Storage Assignment under Uncertainty


Title	Duration-of-Stay Storage Assignment under Uncertainty
Authors	Anonymous
Abstract	Storage assignment, the act of choosing what goods are placed in what locations in a warehouse, is a central problem of supply chain logistics. Past literature has shown that the optimal method to assign pallets is to arrange them in increasing duration of stay in the warehouse (the Duration-of-Stay, or DoS, method), but the methodology requires perfect prior knowledge of DoS for each pallet, which is unknown and uncertain under realistic conditions. Attempts to predict DoS have largely been unfruitful due to its multi-valued nature (every shipment contains multiple identical pallets with different DoS) and data sparsity induced by a lack of matching historical conditions. In this paper, we introduce a new framework for storage assignment that provides a solution to the DoS prediction problem through a distributional reformulation and a novel neural network, ParallelNet. Through collaboration with a world-leading cold storage company, we show that the system is able to predict DoS with a MAPE of 29%, a decrease of ~30% compared to a CNN-LSTM model, and suffers less performance decay into the future. The framework is then integrated into a first-of-its-kind Storage Assignment system, which is being deployed in warehouses across the United States, with initial results showing up to 21% in labor savings. We also release the first publicly available set of warehousing records to facilitate research into this central problem.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=Hkx7xRVYDr
PDF	https://openreview.net/pdf?id=Hkx7xRVYDr
PWC	https://paperswithcode.com/paper/duration-of-stay-storage-assignment-under
Repo
Framework
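
For readers unfamiliar with the MAPE figure quoted in the abstract above, the sketch below computes mean absolute percentage error using its standard definition. The sample durations are made up for illustration and are not data from the paper; the function name is hypothetical.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent: the mean of |actual - predicted| / |actual|, times 100.

    Assumes no actual value is zero, since the ratio is undefined there.
    """
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical durations of stay in days, purely illustrative.
    actual_days = [10.0, 20.0]
    predicted_days = [8.0, 25.0]
    print(mean_absolute_percentage_error(actual_days, predicted_days))
```

Because each error is scaled by the true value, short stays that are mispredicted by a few days weigh as heavily as long stays mispredicted by many.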

Weakly Supervised Disentanglement with Guarantees


Title	Weakly Supervised Disentanglement with Guarantees
Authors	Anonymous
Abstract	Learning disentangled representations that correspond to factors of variation in real-world data is critical to interpretable and human-controllable machine learning. Recently, concerns about the viability of learning disentangled representations in a purely unsupervised manner have spurred a shift toward the incorporation of weak supervision. However, there is currently no formalism that identifies when and how weak supervision will guarantee disentanglement. To address this issue, we provide a theoretical framework, including a calculus of disentanglement, to assist in analyzing the disentanglement guarantees (or lack thereof) conferred by weak supervision when coupled with learning algorithms based on distribution matching. We empirically verify the guarantees and limitations of several weak supervision methods (restricted labeling, match-pairing, and rank-pairing), demonstrating the predictive power and usefulness of our theoretical framework.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=HJgSwyBKvr
PDF	https://openreview.net/pdf?id=HJgSwyBKvr
PWC	https://paperswithcode.com/paper/weakly-supervised-disentanglement-with
Repo
Framework