April 1, 2020

2995 words 15 mins read

Paper Group NANR 92

Discrepancy Ratio: Evaluating Model Performance When Even Experts Disagree on the Truth

Title Discrepancy Ratio: Evaluating Model Performance When Even Experts Disagree on the Truth
Authors Anonymous
Abstract In most machine learning tasks unambiguous ground truth labels can easily be acquired. However, this luxury is often not afforded to many high-stakes, real-world scenarios such as medical image interpretation, where even expert human annotators typically exhibit very high levels of disagreement with one another. While prior works have focused on overcoming noisy labels during training, the question of how to evaluate models when annotators disagree about ground truth has remained largely unexplored. To address this, we propose the discrepancy ratio: a novel, task-independent and principled framework for validating machine learning models in the presence of high label noise. Conceptually, our approach evaluates a model by comparing its predictions to those of human annotators, taking into account the degree to which annotators disagree with one another. While our approach is entirely general, we show that in the special case of binary classification, our proposed metric can be evaluated in terms of simple, closed-form expressions that depend only on aggregate statistics of the labels and not on any individual label. Finally, we demonstrate how this framework can be used effectively to validate machine learning models that we trained on two real-world tasks from medical imaging. The discrepancy ratio metric reveals what conventional metrics do not: that our models not only vastly exceed the average human performance, but even exceed the performance of the best human experts in our datasets.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Byg-wJSYDS
PDF https://openreview.net/pdf?id=Byg-wJSYDS
PWC https://paperswithcode.com/paper/discrepancy-ratio-evaluating-model
Repo
Framework
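
A minimal numpy sketch of the general idea as the abstract states it: score a model by comparing its average disagreement with the annotators to the annotators' average disagreement with one another. The function name, the choice of disagreement measure, and the toy data are illustrative assumptions, not the paper's exact closed-form metric.

```python
import numpy as np
from itertools import combinations

def discrepancy_ratio(model_preds, annotator_labels, disagree=lambda a, b: np.mean(a != b)):
    """Sketch of a discrepancy-ratio style metric (not the paper's exact formula).

    model_preds:      (n_examples,) model predictions
    annotator_labels: (n_annotators, n_examples) labels from each annotator
    Values below 1.0 mean the model disagrees with the annotators less than the
    annotators disagree with one another.
    """
    model_vs_human = np.mean([disagree(model_preds, ann) for ann in annotator_labels])
    human_vs_human = np.mean([disagree(a, b) for a, b in combinations(annotator_labels, 2)])
    return model_vs_human / human_vs_human

# Toy usage: 3 annotators, 6 binary labels.
annotators = np.array([[1, 0, 1, 1, 0, 1],
                       [1, 0, 0, 1, 0, 1],
                       [1, 1, 1, 1, 0, 0]])
model = np.array([1, 0, 1, 1, 0, 1])
print(discrepancy_ratio(model, annotators))
```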

Combining Q-Learning and Search with Amortized Value Estimates

Title Combining Q-Learning and Search with Amortized Value Estimates
Authors Anonymous
Abstract We introduce “Search with Amortized Value Estimates” (SAVE), an approach for combining model-free Q-learning with model-based Monte-Carlo Tree Search (MCTS). In SAVE, a learned prior over state-action values is used to guide MCTS, which estimates an improved set of state-action values. The new Q-estimates are then used in combination with real experience to update the prior. This effectively amortizes the value computation performed by MCTS, resulting in a cooperative relationship between model-free learning and model-based search. SAVE can be implemented on top of any Q-learning agent with access to a model, which we demonstrate by incorporating it into agents that perform challenging physical reasoning tasks and Atari. SAVE consistently achieves higher rewards with fewer training steps, and—in contrast to typical model-based search approaches—yields strong performance with very small search budgets. By combining real experience with information computed during search, SAVE demonstrates that it is possible to improve on both the performance of model-free learning and the computational cost of planning.
Tasks Q-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=SkeAaJrKDS
PDF https://openreview.net/pdf?id=SkeAaJrKDS
PWC https://paperswithcode.com/paper/combining-q-learning-and-search-with
Repo
Framework
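
Two pieces of the abstract made concrete in a short numpy sketch, with my own notation and loss form (the paper's exact search policy and amortization loss may differ): the learned Q-prior guides action selection inside MCTS before in-tree estimates are reliable, and the Q-network is trained toward both a TD target and the search-improved Q-estimates, which amortizes the value computation done by search.

```python
import numpy as np

def prior_guided_selection(q_prior, q_tree, visits, c=1.0):
    """Pick an action at a search node: blend the amortized prior with the
    in-tree estimate as visit counts grow, plus an exploration bonus."""
    total = visits.sum() + 1
    blend = visits / (visits + 1)
    score = blend * q_tree + (1 - blend) * q_prior + c * np.sqrt(np.log(total) / (visits + 1))
    return int(np.argmax(score))

def save_loss(q_net, q_mcts, action, td_target, beta=1.0):
    """q_net, q_mcts: (n_actions,) network and search value estimates for one state.
    TD error on the taken action plus an amortization term pulling the network
    toward the search-derived estimates for every action."""
    td_loss = (q_net[action] - td_target) ** 2
    amortization = np.mean((q_net - q_mcts) ** 2)
    return td_loss + beta * amortization

q_prior = np.array([0.2, 0.5, 0.1])   # network output for one state
q_tree = np.array([0.0, 0.4, 0.9])    # running MCTS value estimates
visits = np.array([0, 3, 5])
print(prior_guided_selection(q_prior, q_tree, visits))
print(save_loss(q_prior, q_tree, action=2, td_target=0.8))
```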

Improved memory in recurrent neural networks with sequential non-normal dynamics

Title Improved memory in recurrent neural networks with sequential non-normal dynamics
Authors Anonymous
Abstract Training recurrent neural networks (RNNs) is a hard problem due to degeneracies in the optimization landscape, a problem also known as the vanishing/exploding gradients problem. Short of designing new RNN architectures, various methods that have been proposed for dealing with this problem usually boil down to orthogonalization of the recurrent dynamics, either at initialization or during the entire training period. The basic motivation behind these methods is that orthogonal transformations are isometries of the Euclidean space, hence they preserve (Euclidean) norms and effectively deal with the vanishing/exploding gradients problem. However, this idea ignores the crucial effects of non-linearity and noise. In the presence of a non-linearity, orthogonal transformations no longer preserve norms, suggesting that alternative transformations might be better suited to non-linear networks. Moreover, in the presence of noise, norm preservation itself ceases to be the ideal objective. A more sensible objective is maximizing the signal-to-noise ratio (SNR) of the propagated signal instead. Previous work has shown that in the linear case, recurrent networks that maximize the SNR display strongly non-normal, sequential dynamics and orthogonal networks are highly suboptimal by this measure. Motivated by this finding, we investigate the potential of non-normal RNNs, i.e. RNNs with a non-normal recurrent connectivity matrix, in sequential processing tasks. Our experimental results show that non-normal RNNs outperform their orthogonal counterparts in a diverse range of benchmarks. We also find evidence for increased non-normality and hidden chain-like feedforward structures in trained RNNs initialized with orthogonal recurrent connectivity matrices.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=ryx1wRNFvB
PDF https://openreview.net/pdf?id=ryx1wRNFvB
PWC https://paperswithcode.com/paper/improved-memory-in-recurrent-neural-networks-1
Repo
Framework
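
A small numpy illustration of the orthogonal vs. non-normal distinction the abstract turns on. The chain (shift) matrix below is a stand-in for the sequential, feedforward-like hidden structure the paper discusses, not its exact initialization scheme.

```python
import numpy as np

n = 5
# Orthogonal recurrent matrix: preserves Euclidean norms and is normal
# (it commutes with its transpose).
W_orth, _ = np.linalg.qr(np.random.randn(n, n))

# Chain-like (shift) matrix: strictly non-normal and purely feedforward in the
# hidden units, so signals propagate along a hidden chain instead of mixing.
W_chain = np.diag(np.ones(n - 1), k=-1)

def non_normality(W):
    """Commutator norm ||W W^T - W^T W||_F; zero iff W is normal."""
    return np.linalg.norm(W @ W.T - W.T @ W)

print(non_normality(W_orth))   # ~0: orthogonal matrices are normal
print(non_normality(W_chain))  # > 0: the chain matrix is non-normal
```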

Differentiable Reasoning over a Virtual Knowledge Base

Title Differentiable Reasoning over a Virtual Knowledge Base
Authors Anonymous
Abstract We consider the task of answering complex multi-hop questions using a corpus as a virtual knowledge base (KB). In particular, we describe a neural module, DrKIT, that traverses textual data like a virtual KB, softly following paths of relations between mentions of entities in the corpus. At each step the operation uses a combination of sparse-matrix TFIDF indices and maximum inner product search (MIPS) on a special index of contextual representations. This module is differentiable, so the full system can be trained completely end-to-end using gradient-based methods, starting from natural language inputs. We also describe a pretraining scheme for the index mention encoder by generating hard negative examples using existing knowledge bases. We show that DrKIT improves accuracy by 9 points on 3-hop questions in the MetaQA dataset, cutting the gap between text-based and KB-based state-of-the-art by 70%. DrKIT is also very efficient, processing up to 10x more queries per second than existing state-of-the-art QA systems.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SJxstlHFPH
PDF https://openreview.net/pdf?id=SJxstlHFPH
PWC https://paperswithcode.com/paper/differentiable-reasoning-over-a-virtual
Repo
Framework

Hamiltonian Generative Networks

Title Hamiltonian Generative Networks
Authors Anonymous
Abstract The Hamiltonian formalism plays a central role in classical and quantum physics. Hamiltonians are the main tool for modelling the continuous time evolution of systems with conserved quantities, and they come equipped with many useful properties, like time reversibility and smooth interpolation in time. These properties are important for many machine learning problems - from sequence prediction to reinforcement learning and density modelling - but are not typically provided out of the box by standard tools such as recurrent neural networks. In this paper, we introduce the Hamiltonian Generative Network (HGN), the first approach capable of consistently learning Hamiltonian dynamics from high-dimensional observations (such as images) without restrictive domain assumptions. Once trained, we can use HGN to sample new trajectories, perform rollouts both forward and backward in time, and even speed up or slow down the learned dynamics. We demonstrate how a simple modification of the network architecture turns HGN into a powerful normalising flow model, called Neural Hamiltonian Flow (NHF), that uses Hamiltonian dynamics to model expressive densities. Hence, we hope that our work serves as a first practical demonstration of the value that the Hamiltonian formalism can bring to machine learning. More results and video evaluations are available at: http://tiny.cc/hgn
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJenn6VFvB
PDF https://openreview.net/pdf?id=HJenn6VFvB
PWC https://paperswithcode.com/paper/hamiltonian-generative-networks-1
Repo
Framework
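
To make the dynamics part concrete, here is a hedged PyTorch sketch: latent phase-space coordinates (q, p) are rolled forward under a learned Hamiltonian with a leapfrog integrator, and rollouts can be run backward by flipping the momenta. The network size, integrator, and step counts are illustrative assumptions; the full HGN additionally learns an encoder from images to (q, p) and a decoder back to pixels, both omitted here.

```python
import torch
import torch.nn as nn

class Hamiltonian(nn.Module):
    """Scalar energy H(q, p) parameterized by a small MLP."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 64), nn.Softplus(), nn.Linear(64, 1))

    def forward(self, q, p):
        return self.net(torch.cat([q, p], dim=-1)).sum()

def leapfrog(H, q, p, dt=0.1, steps=10):
    """Integrate Hamilton's equations dq/dt = dH/dp, dp/dt = -dH/dq."""
    q = q.detach().clone().requires_grad_(True)
    p = p.detach().clone().requires_grad_(True)
    for _ in range(steps):
        dHdq, = torch.autograd.grad(H(q, p), q, create_graph=True)
        p = p - 0.5 * dt * dHdq                  # half "kick" on the momentum
        dHdp, = torch.autograd.grad(H(q, p), p, create_graph=True)
        q = q + dt * dHdp                        # "drift" on the position
        dHdq, = torch.autograd.grad(H(q, p), q, create_graph=True)
        p = p - 0.5 * dt * dHdq                  # second half kick
    return q, p

dim = 4
H = Hamiltonian(dim)
q0, p0 = torch.randn(1, dim), torch.randn(1, dim)
qT, pT = leapfrog(H, q0, p0)          # forward rollout in the latent space
q_rev, p_rev = leapfrog(H, qT, -pT)   # reverse-time rollout by flipping momenta
```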

GenDICE: Generalized Offline Estimation of Stationary Values

Title GenDICE: Generalized Offline Estimation of Stationary Values
Authors Anonymous
Abstract An important problem that arises in reinforcement learning and Monte Carlo methods is estimating quantities defined by the stationary distribution of a Markov chain. In many real-world applications, access to the underlying transition operator is limited to a fixed set of data that has already been collected, without additional interaction with the environment being available. We show that consistent estimation remains possible in this scenario, and that effective estimation can still be achieved in important applications. Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions, derived from fundamental properties of the stationary distribution, and exploiting constraint reformulations based on variational divergence minimization. The resulting algorithm, GenDICE, is straightforward and effective. We prove the consistency of the method under general conditions, provide a detailed error analysis, and demonstrate strong empirical performance on benchmark tasks, including off-line PageRank and off-policy policy evaluation.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HkxlcnVFwB
PDF https://openreview.net/pdf?id=HkxlcnVFwB
PWC https://paperswithcode.com/paper/gendice-generalized-offline-estimation-of
Repo
Framework

Extreme Classification via Adversarial Softmax Approximation

Title Extreme Classification via Adversarial Softmax Approximation
Authors Anonymous
Abstract Training a classifier over a large number of classes, known as ‘extreme classification’, has become a topic of major interest with applications in technology, science, and e-commerce. Traditional softmax regression induces a gradient cost proportional to the number of classes C, which often is prohibitively expensive. A popular scalable softmax approximation relies on uniform negative sampling, which suffers from slow convergence due to a poor signal-to-noise ratio. In this paper, we propose a simple training method for drastically enhancing the gradient signal by drawing negative samples from an adversarial model that mimics the data distribution. Our contributions are three-fold: (i) an adversarial sampling mechanism that produces negative samples at a cost only logarithmic in C, thus still resulting in cheap gradient updates; (ii) a mathematical proof that this adversarial sampling minimizes the gradient variance while any bias due to non-uniform sampling can be removed; (iii) experimental results on large scale data sets that show a reduction of the training time by an order of magnitude relative to several competitive baselines.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rJxe3xSYDS
PDF https://openreview.net/pdf?id=rJxe3xSYDS
PWC https://paperswithcode.com/paper/extreme-classification-via-adversarial
Repo
Framework
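
A sketch of the sampling-based objective, with the adversarial model abstracted as an arbitrary proposal distribution over classes. Subtracting the log proposal probability from the sampled logits is the standard sampled-softmax correction for non-uniform negative sampling; the paper's exact estimator and its adversarial sampler are not reproduced here, and all names below are my own.

```python
import numpy as np

def sampled_softmax_loss(logits_fn, x, y_true, proposal_probs, n_neg=5, rng=np.random):
    """logits_fn(x, classes) -> unnormalized scores for the given classes.
    Negatives are drawn from proposal_probs (a stand-in for the adversarial
    model) instead of uniformly, and logits are corrected by -log q(class)
    so the non-uniform sampling does not bias the estimator."""
    C = len(proposal_probs)
    negatives = rng.choice(C, size=n_neg, replace=False, p=proposal_probs)
    # (A real implementation would exclude the target class from the negatives.)
    classes = np.concatenate([[y_true], negatives])
    corrected = logits_fn(x, classes) - np.log(proposal_probs[classes])
    corrected -= corrected.max()
    # Softmax cross-entropy over the positive plus the sampled negatives only:
    return -corrected[0] + np.log(np.exp(corrected).sum())

# Toy usage with a linear scorer over 1000 classes.
C, d = 1000, 16
W = np.random.randn(C, d) * 0.01
proposal = np.random.dirichlet(np.ones(C))   # placeholder adversarial sampler
x, y = np.random.randn(d), 42
print(sampled_softmax_loss(lambda x, cls: W[cls] @ x, x, y, proposal))
```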

Leveraging inductive bias of neural networks for learning without explicit human annotations

Title Leveraging inductive bias of neural networks for learning without explicit human annotations
Authors Anonymous
Abstract Classification problems today are typically solved by first collecting examples along with candidate labels, second obtaining clean labels from workers, and third training a large, overparameterized deep neural network on the clean examples. The second, labeling step is often the most expensive one as it requires manually going through all examples. In this paper we skip the labeling step entirely and propose to directly train the deep neural network on the noisy raw labels and early stop the training to avoid overfitting. With this procedure we exploit an intriguing property of large overparameterized neural networks: While they are capable of perfectly fitting the noisy data, gradient descent fits clean labels much faster than the noisy ones, thus early stopping resembles training on the clean labels. Our results show that early stopping the training of standard deep networks such as ResNet-18 on part of the Tiny Images dataset, which does not involve any human labeled data, and of which only about half of the labels are correct, gives a significantly higher test performance than when trained on the clean CIFAR-10 training dataset, which is a labeled version of the Tiny Images dataset, for the same classification problem. In addition, our results show that the noise generated through the label collection process is not nearly as adversarial for learning as the noise generated by randomly flipping labels, which is the noise most prevalent in works demonstrating noise robustness of neural networks.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJeIX6EKvr
PDF https://openreview.net/pdf?id=HJeIX6EKvr
PWC https://paperswithcode.com/paper/leveraging-inductive-bias-of-neural-networks-1
Repo
Framework
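
The procedure itself is simple enough to sketch end to end on synthetic data: train directly on noisy labels and keep an early checkpoint, relying on the observation that gradient descent fits the clean labels before it memorizes the noisy ones. The toy task, noise rate, and stopping epoch below are illustrative assumptions, not the paper's ResNet-18 / Tiny Images setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, noise_rate, stop_epoch = 2000, 20, 0.4, 30

# Synthetic binary task with 40% of the training labels flipped.
w_true = torch.randn(d)
X = torch.randn(n, d)
y_clean = (X @ w_true > 0).long()
flip = torch.rand(n) < noise_rate
y_noisy = torch.where(flip, 1 - y_clean, y_clean)

X_test = torch.randn(n, d)
y_test = (X_test @ w_true > 0).long()

model = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

early_acc = None
for epoch in range(200):
    opt.zero_grad()
    loss_fn(model(X), y_noisy).backward()   # train on the raw, noisy labels
    opt.step()
    with torch.no_grad():
        test_acc = (model(X_test).argmax(1) == y_test).float().mean().item()
    if epoch == stop_epoch:                 # keep the early-stopped checkpoint
        early_acc = test_acc
print(f"test accuracy at early stop: {early_acc:.3f}, after full training: {test_acc:.3f}")
```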

Skew-Fit: State-Covering Self-Supervised Reinforcement Learning

Title Skew-Fit: State-Covering Self-Supervised Reinforcement Learning
Authors Anonymous
Abstract Autonomous agents that must exhibit flexible and broad capabilities will need to be equipped with large repertoires of skills. Defining each skill with a manually-designed reward function limits this repertoire and imposes a manual engineering burden. Self-supervised agents that set their own goals can automate this process, but designing appropriate goal setting objectives can be difficult, and often involves heuristic design decisions. In this paper, we propose a formal exploration objective for goal-reaching policies that maximizes state coverage. We show that this objective is equivalent to maximizing the entropy of the goal distribution together with goal reaching performance, where goals correspond to full state observations. To instantiate this principle, we present an algorithm called Skew-Fit for learning a maximum-entropy goal distribution. Skew-Fit enables self-supervised agents to autonomously choose and practice reaching diverse goals. We show that, under certain regularity conditions, our method converges to a uniform distribution over the set of valid states, even when we do not know this set beforehand. Our experiments show that it can learn a variety of manipulation tasks from images, including opening a door with a real robot, entirely from scratch and without any manually-designed reward function.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=r1gIdySFPH
PDF https://openreview.net/pdf?id=r1gIdySFPH
PWC https://paperswithcode.com/paper/skew-fit-state-covering-self-supervised-1
Repo
Framework
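
The core reweighting step can be isolated in a few lines: candidate goals are weighted by their estimated density raised to a negative power, which skews the goal-sampling distribution toward uniform over visited states. The density values and exponent below are placeholders; Skew-Fit proper plugs in densities from a learned generative model of states.

```python
import numpy as np

def skewed_goal_weights(densities, alpha=-1.0):
    """densities: estimated p(g) for each candidate goal under the current model.
    alpha < 0 up-weights rarely visited states, pushing the goal distribution
    toward maximum entropy over the visited-state set."""
    w = densities ** alpha
    return w / w.sum()

replay_goal_densities = np.array([0.50, 0.30, 0.15, 0.05])   # common vs. rare states
weights = skewed_goal_weights(replay_goal_densities, alpha=-1.0)
print(weights)   # rare states are sampled (and practiced) far more often
```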

Dynamics-Aware Embeddings

Title Dynamics-Aware Embeddings
Authors Anonymous
Abstract In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL). We propose a forward prediction objective for simultaneously learning embeddings of states and actions. These embeddings capture the structure of the environment’s dynamics, enabling efficient policy learning. We demonstrate that our action embeddings alone improve the sample efficiency and peak performance of model-free RL on control from low-dimensional states. By combining state and action embeddings, we achieve efficient learning of high-quality policies on goal-conditioned continuous control from pixel observations in only 1-2 million environment steps.
Tasks Continuous Control, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=BJgZGeHFPH
PDF https://openreview.net/pdf?id=BJgZGeHFPH
PWC https://paperswithcode.com/paper/dynamics-aware-embeddings-1
Repo
Framework
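
A hedged PyTorch sketch of a forward-prediction objective of this kind: embed the state and the action, predict the next state's embedding, and train all three modules on the prediction error. The architecture sizes and the squared-error loss are my assumptions, not the paper's exact objective.

```python
import torch
import torch.nn as nn

state_dim, action_dim, z_dim = 10, 3, 8
enc_s = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
enc_a = nn.Sequential(nn.Linear(action_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
forward_model = nn.Sequential(nn.Linear(2 * z_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))

params = list(enc_s.parameters()) + list(enc_a.parameters()) + list(forward_model.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

def embedding_loss(s, a, s_next):
    """Predict the next state's embedding from the current state and action embeddings."""
    z_pred = forward_model(torch.cat([enc_s(s), enc_a(a)], dim=-1))
    return ((z_pred - enc_s(s_next)) ** 2).mean()

# One update on a random batch of (s, a, s') transitions.
s, a, s_next = torch.randn(32, state_dim), torch.randn(32, action_dim), torch.randn(32, state_dim)
loss = embedding_loss(s, a, s_next)
opt.zero_grad()
loss.backward()
opt.step()
```

As written, the objective admits a trivially collapsed solution (all embeddings constant), so in practice some additional term, such as a contrastive loss or a stop-gradient on the target embedding, is needed; the paper should be consulted for how it avoids this.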

Extreme Tensoring for Low-Memory Preconditioning

Title Extreme Tensoring for Low-Memory Preconditioning
Authors Anonymous
Abstract State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption. This has created a recent demand for memory-efficient optimizers. To this end, we investigate the limits and performance tradeoffs of memory-efficient adaptively preconditioned gradient methods. We propose \emph{extreme tensoring} for high-dimensional stochastic optimization, showing that an optimizer needs very little memory to benefit from adaptive preconditioning. Our technique applies to arbitrary models (not necessarily with tensor-shaped parameters), and is accompanied by regret and convergence guarantees, which shed light on the tradeoffs between preconditioner quality and expressivity. On a large-scale NLP model, we reduce the optimizer memory overhead by three orders of magnitude, without degrading performance.
Tasks Stochastic Optimization
Published 2020-01-01
URL https://openreview.net/forum?id=SklKcRNYDH
PDF https://openreview.net/pdf?id=SklKcRNYDH
PWC https://paperswithcode.com/paper/extreme-tensoring-for-low-memory-1
Repo
Framework
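
To illustrate the memory/expressivity trade-off, here is a factored-preconditioner sketch for a single matrix-shaped parameter, using an Adafactor-style rank-one factorization of the Adagrad accumulator (O(m + n) memory instead of O(mn)). This is a stand-in for the general idea, not the paper's exact update rule.

```python
import numpy as np

m, n, lr, eps = 4, 6, 0.1, 1e-8
W = np.random.randn(m, n)
row_acc, col_acc = np.zeros(m), np.zeros(n)   # O(m) + O(n) second-moment statistics

def factored_adagrad_step(W, grad):
    global row_acc, col_acc
    row_acc += (grad ** 2).sum(axis=1)
    col_acc += (grad ** 2).sum(axis=0)
    # Rank-one surrogate for the full O(mn) elementwise accumulator:
    v_hat = np.outer(row_acc, col_acc) / row_acc.sum()
    return W - lr * grad / (np.sqrt(v_hat) + eps)

for _ in range(20):
    grad = W                        # gradient of the toy objective 0.5 * ||W||^2
    W = factored_adagrad_step(W, grad)
print(np.linalg.norm(W))            # the norm decreases toward the minimum at W = 0
```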

Compositional languages emerge in a neural iterated learning model

Title Compositional languages emerge in a neural iterated learning model
Authors Anonymous
Abstract The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary. If compositionality is indeed a natural property of language, we may expect it to appear in communication protocols that are created by neural agents via grounded language learning. Inspired by the iterated learning framework, which simulates the process of language evolution, we propose an effective neural iterated learning algorithm that, when applied to interacting neural agents, facilitates the emergence of a more structured type of language. Indeed, these languages provide specific advantages to neural agents during training, which translates into a larger posterior probability that is then incrementally amplified via the iterated learning procedure. Our experiments confirm our analysis, and also demonstrate that the emergent languages substantially improve the generalization of neural agent communication.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HkePNpVKPB
PDF https://openreview.net/pdf?id=HkePNpVKPB
PWC https://paperswithcode.com/paper/compositional-languages-emerge-in-a-neural
Repo
Framework

Knowledge Consistency between Neural Networks and Beyond

Title Knowledge Consistency between Neural Networks and Beyond
Authors Anonymous
Abstract This paper aims to analyze knowledge consistency between pre-trained deep neural networks. We propose a generic definition for knowledge consistency between neural networks at different fuzziness levels. A task-agnostic method is designed to disentangle feature components, which represent the consistent knowledge, from raw intermediate-layer features of each neural network. As a generic tool, our method can be broadly used for different applications. In preliminary experiments, we have used knowledge consistency as a tool to diagnose knowledge representations of neural networks. Knowledge consistency provides new insights to explain the success of existing deep-learning techniques, such as knowledge distillation and network compression. More crucially, knowledge consistency can also be used to refine pre-trained networks and boost performance.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BJeS62EtwH
PDF https://openreview.net/pdf?id=BJeS62EtwH
PWC https://paperswithcode.com/paper/knowledge-consistency-between-neural-networks
Repo
Framework
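
As a rough illustration of the disentangling step, the sketch below treats the part of one network's intermediate features that can be linearly predicted from another network's features as the "consistent" knowledge, and the residual as inconsistent. The paper's actual method uses learned transforms at multiple fuzziness levels; the linear least-squares version here is my simplification.

```python
import numpy as np

def split_consistent(feats_a, feats_b):
    """feats_a, feats_b: (n_samples, d_a) and (n_samples, d_b) intermediate
    features of two networks on the same inputs. Returns the component of
    feats_b explained by feats_a and the unexplained residual."""
    M, *_ = np.linalg.lstsq(feats_a, feats_b, rcond=None)
    consistent = feats_a @ M
    return consistent, feats_b - consistent

# Toy features that share a common 16-dimensional signal plus small noise.
n, d_a, d_b = 500, 32, 24
shared = np.random.randn(n, 16)
feats_a = shared @ np.random.randn(16, d_a) + 0.1 * np.random.randn(n, d_a)
feats_b = shared @ np.random.randn(16, d_b) + 0.1 * np.random.randn(n, d_b)
consistent, residual = split_consistent(feats_a, feats_b)
print(np.linalg.norm(consistent) / np.linalg.norm(feats_b))   # most of B is "consistent" with A
```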

Convolutional Tensor-Train LSTM for Long-Term Video Prediction

Title Convolutional Tensor-Train LSTM for Long-Term Video Prediction
Authors Anonymous
Abstract Long-term video prediction is highly challenging since it entails simultaneously capturing spatial and temporal information across a long range of image frames. Standard recurrent models are ineffective since they are prone to error propagation and cannot effectively capture higher-order correlations. A potential solution is to extend to higher-order spatio-temporal recurrent models. However, such a model requires a large number of parameters and operations, making it intractable to learn in practice and prone to overfitting. In this work, we propose convolutional tensor-train LSTM (Conv-TT-LSTM), which learns higher-order Convolutional LSTM (ConvLSTM) efficiently using convolutional tensor-train decomposition (CTTD). Our proposed model naturally incorporates higher-order spatio-temporal information at a small cost of memory and computation by using efficient low-rank tensor representations. We evaluate our model on Moving-MNIST and KTH datasets and show improvements over standard ConvLSTM and better/comparable results to other ConvLSTM-based approaches, but with far fewer parameters.
Tasks Video Prediction
Published 2020-01-01
URL https://openreview.net/forum?id=Hkee1JBKwB
PDF https://openreview.net/pdf?id=Hkee1JBKwB
PWC https://paperswithcode.com/paper/convolutional-tensor-train-lstm-for-long-term
Repo
Framework

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

Title VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation
Authors Anonymous
Abstract Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models that can represent uncertain futures, such models are either extremely expensive computationally as in the case of pixel-level autoregressive models, or do not directly optimize the likelihood of the data. To our knowledge, our work is the first to propose multi-frame video prediction with normalizing flows, which allows for direct optimization of the data likelihood, and produces high-quality stochastic predictions. We describe an approach for modeling the latent space dynamics, and demonstrate that flow-based generative models offer a viable and competitive approach to generative modeling of video.
Tasks Video Generation, Video Prediction
Published 2020-01-01
URL https://openreview.net/forum?id=rJgUfTEYvH
PDF https://openreview.net/pdf?id=rJgUfTEYvH
PWC https://paperswithcode.com/paper/videoflow-a-conditional-flow-based-model-for
Repo
Framework