April 1, 2020

2522 words 12 mins read

Paper Group NANR 90

SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning. Transferable Perturbations of Deep Feature Distributions. Mutual Information Maximization for Robust Plannable Representations. Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation. Non-Sequential Melody Generation. Smart Ternary Quantization …

SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning

Title SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning
Authors Anonymous
Abstract SNOW is a novel learning method that improves training/serving throughput as well as accuracy for transfer and lifelong learning, based on knowledge subscription. SNOW selects the top-K most useful intermediate feature maps for a target task from a pre-trained, frozen source model through a novel channel pooling scheme and utilizes them in a task-specific delta model. The source model is responsible for generating a large number of generic feature maps, and the delta model fuses the subscribed feature maps (obtained through channel pooling) with its own local ones to deliver high accuracy on the target task. Since the source model participates in both training and serving of all target tasks in an inference-only mode, one source model can serve multiple delta models, enabling significant computation sharing. Each delta model is only a fraction of the size of the source model, so SNOW also provides model-size efficiency. Our experimental results show that SNOW offers a better balance between accuracy and training/inference speed on various image classification tasks than existing transfer and lifelong learning practices.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=rJxtgJBKDr
PDF https://openreview.net/pdf?id=rJxtgJBKDr
PWC https://paperswithcode.com/paper/snow-subscribing-to-knowledge-via-channel
Repo
Framework
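
The channel pooling idea lends itself to a compact illustration. Below is a minimal PyTorch sketch, not the authors' implementation: a frozen source backbone emits generic feature maps, a learned per-channel gate keeps the top-K of them, and a small task-specific delta head consumes the selection. All module names, sizes, and the gating mechanism are hypothetical assumptions.

```python
import torch
import torch.nn as nn

class ChannelPool(nn.Module):
    """Keep the K highest-gated channels of a frozen feature tensor."""
    def __init__(self, num_channels: int, k: int):
        super().__init__()
        self.scores = nn.Parameter(torch.zeros(num_channels))
        self.k = k

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        gate = torch.sigmoid(self.scores)        # learnable channel gates
        topk = torch.topk(gate, self.k).indices  # subscribe to top-K maps
        # multiply by the gate so the scores receive gradients
        return feats[:, topk] * gate[topk].view(1, -1, 1, 1)

# Frozen source model: generates generic feature maps, shared by all tasks.
source = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
for p in source.parameters():
    p.requires_grad_(False)

# Task-specific delta head: small relative to the source model.
pool = ChannelPool(num_channels=64, k=16)
head = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))

x = torch.randn(2, 3, 32, 32)
with torch.no_grad():
    generic = source(x)          # inference-only source pass
logits = head(pool(generic))
```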

Transferable Perturbations of Deep Feature Distributions

Title Transferable Perturbations of Deep Feature Distributions
Authors Nathan Inkawhich, Kevin Liang, Lawrence Carin, Yiran Chen
Abstract Almost all current adversarial attacks on CNN classifiers rely on information derived from the output layer of the network. This work presents a new adversarial attack based on modeling and exploiting class-wise and layer-wise deep feature distributions. We achieve state-of-the-art targeted black-box transfer-based attack results for undefended ImageNet models. Further, we place a priority on the explainability and interpretability of the attacking process. Our methodology affords an analysis of how adversarial attacks change the intermediate feature distributions of CNNs, as well as a measure of layer-wise and class-wise feature distributional separability/entanglement. We also conceptualize a transition from task/data-specific to model-specific features within a CNN architecture that directly impacts the transferability of adversarial examples.
Tasks Adversarial Attack
Published 2020-01-01
URL https://openreview.net/forum?id=rJxAo2VYwr
PDF https://openreview.net/pdf?id=rJxAo2VYwr
PWC https://paperswithcode.com/paper/transferable-perturbations-of-deep-feature
Repo
Framework
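
To make the feature-distribution idea concrete, here is a hedged sketch of a targeted feature-space attack: perturb the input so its activations at a chosen layer move toward a precomputed class-conditional mean, under an L∞ budget. The paper models full layer-wise distributions; the mean-matching loss, the step sizes, and `model_trunk` below are simplifying assumptions, not the authors' exact method.

```python
import torch
import torch.nn.functional as F

def feature_attack(model_trunk, x, target_mean, steps=50,
                   eps=8 / 255, alpha=1 / 255):
    """model_trunk maps images to activations at the chosen layer;
    target_mean is the mean activation of the target class there."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        feats = model_trunk(x_adv)
        loss = F.mse_loss(feats, target_mean.expand_as(feats))
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()       # move toward target class
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # stay inside the L-inf ball
            x_adv = x_adv.clamp(0, 1)                 # valid image range
    return x_adv.detach()
```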

Mutual Information Maximization for Robust Plannable Representations

Title Mutual Information Maximization for Robust Plannable Representations
Authors Anonymous
Abstract Extending the capabilities of robotics to complex, unstructured real-world environments requires better perception systems while maintaining low sample complexity. When dealing with high-dimensional state spaces, current methods are either model-free, or model-based with reconstruction-based objectives. The sample inefficiency of the former constitutes a major barrier to applying them in the real world. While the latter have low sample complexity, they learn latent spaces that must reconstruct every single detail of the scene. Real-world environments are unstructured and cluttered with objects, and capturing all of this variability in the latent representation harms its applicability to downstream tasks. In this work, we present mutual information maximization for robust plannable representations (MIRO), an information-theoretic representation-learning objective for model-based reinforcement learning. Our objective optimizes for a latent space that maximizes the mutual information with future observations and emphasizes the relevant aspects of the dynamics, which allows it to capture all the information needed for planning. We show that our approach learns a latent representation that, in cluttered scenes, focuses on task-relevant features and ignores irrelevant aspects, while state-of-the-art methods with reconstruction objectives are unable to learn in such environments.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Syx9Q1rYvH
PDF https://openreview.net/pdf?id=Syx9Q1rYvH
PWC https://paperswithcode.com/paper/mutual-information-maximization-for-robust
Repo
Framework
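
The mutual-information objective can be illustrated with a standard InfoNCE lower bound: maximize agreement between the current latent and an embedding of the matching future observation, against in-batch negatives. This is a generic sketch of that family of objectives, not MIRO's exact loss; the encoders producing `z_t` and `z_future` are assumed.

```python
import torch
import torch.nn.functional as F

def info_nce(z_t: torch.Tensor, z_future: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """z_t: (B, D) current latents; z_future: (B, D) embeddings of the
    matching future observations. Row i of z_future is the positive for
    row i of z_t; the other rows in the batch act as negatives."""
    z_t = F.normalize(z_t, dim=1)
    z_future = F.normalize(z_future, dim=1)
    logits = z_t @ z_future.t() / temperature              # (B, B) similarities
    labels = torch.arange(z_t.size(0), device=z_t.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)
```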

Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation

Title Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
Authors Anonymous
Abstract Open-domain dialogue generation has gained increasing attention in Natural Language Processing. Comparing these methods requires a holistic means of dialogue evaluation. Human ratings are deemed the gold standard, but as human evaluation is inefficient and costly, an automated substitute is desirable. In this paper, we propose holistic evaluation metrics that capture both the quality and diversity of dialogues. Our metrics consist of (1) GPT-2 based context coherence between sentences in a dialogue, (2) GPT-2 based fluency in phrasing, and (3) $n$-gram based diversity in responses to augmented queries. The empirical validity of our metrics is demonstrated by strong correlation with human judgments. We provide the associated code, datasets and human ratings.
Tasks Dialogue Generation
Published 2020-01-01
URL https://openreview.net/forum?id=BJg_FgBtPH
PDF https://openreview.net/pdf?id=BJg_FgBtPH
PWC https://paperswithcode.com/paper/towards-holistic-and-automatic-evaluation-of
Repo
Framework
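
Two of the three metric families are easy to sketch with off-the-shelf tools: GPT-2 negative log-likelihood as a fluency proxy, and a distinct-n ratio for diversity. This uses the Hugging Face `transformers` API and is an illustration in the spirit of the abstract, not the paper's released code.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def fluency_nll(text: str) -> float:
    """Mean token negative log-likelihood under GPT-2 (lower = more fluent)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # average NLL over tokens
    return loss.item()

def distinct_n(responses, n=2) -> float:
    """Ratio of unique n-grams to total n-grams across a set of responses."""
    grams = []
    for r in responses:
        toks = r.split()
        grams += [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(grams)) / max(len(grams), 1)
```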

Non-Sequential Melody Generation

Title Non-Sequential Melody Generation
Authors Mitchell Billard, Robert Bishop, Moustafa Elsisy, Laura Graves, Antonina Kolokolova, Vineel Nagisetty, Zachary Northcott, Heather Patey
Abstract In this paper we present a method for algorithmic melody generation using a generative adversarial network without recurrent components. Music generation has been done successfully using recurrent neural networks, where the model learns sequence information that helps create authentic-sounding melodies. Here, we use a DCGAN architecture with dilated convolutions and towers to capture sequential information as spatial image information, and learn long-range dependencies in fixed-length melody forms such as the Irish traditional reel.
Tasks Music Generation
Published 2020-01-01
URL https://openreview.net/forum?id=HkePOCNtPH
PDF https://openreview.net/pdf?id=HkePOCNtPH
PWC https://paperswithcode.com/paper/non-sequential-melody-generation
Repo
Framework
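
The abstract's key device, capturing sequential structure with dilated convolutions over a fixed-length form, can be sketched in a few lines. The layer widths, melody length, and dilation schedule below are arbitrary assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

melody_len = 256  # a fixed-length form, e.g. one pass through a reel
net = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=3, padding=1, dilation=1),   # sees 3 steps
    nn.ReLU(),
    nn.Conv1d(32, 32, kernel_size=3, padding=2, dilation=2),  # sees 7 steps
    nn.ReLU(),
    nn.Conv1d(32, 32, kernel_size=3, padding=4, dilation=4),  # sees 15 steps
    nn.ReLU(),
)
x = torch.randn(1, 1, melody_len)
print(net(x).shape)  # torch.Size([1, 32, 256]): length preserved,
                     # receptive field grows exponentially with depth
```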

Smart Ternary Quantization

Title Smart Ternary Quantization
Authors Anonymous
Abstract Neural network models are resource hungry. Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviating these resource requirements. Ternary quantization provides a more flexible model and often beats binary quantization in terms of accuracy, but doubles the memory and increases the computation cost. Mixed-quantization-depth models, on the other hand, allow a trade-off between accuracy and memory footprint. In such models, the quantization depth is often chosen manually (a tedious task) or tuned with a separate optimization routine (which requires training a quantized network multiple times). Here, we propose Smart Ternary Quantization (STQ), in which we modify the quantization depth directly through an adaptive regularization function, so that we train a model only once. The method jumps between binary and ternary quantization while training. We show its application to image classification.
Tasks Image Classification, Quantization
Published 2020-01-01
URL https://openreview.net/forum?id=SyxhaxBKPS
PDF https://openreview.net/pdf?id=SyxhaxBKPS
PWC https://paperswithcode.com/paper/smart-ternary-quantization
Repo
Framework
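
One way to realize an adaptive regularizer that "jumps" between binary and ternary attractors is to blend two classic polynomial penalties: (w² − 1)² has minima at ±1 only, while w²(w² − 1)² also admits 0. The learnable per-layer mixing coefficient below is an illustrative assumption; the paper's exact regularization function may differ.

```python
import torch

def quant_regularizer(w: torch.Tensor, beta: torch.Tensor) -> torch.Tensor:
    """Blend a binary and a ternary weight attractor.
    w: layer weights; beta: learnable per-layer mixing logit."""
    binary = (w.pow(2) - 1).pow(2)               # minima at -1, +1
    ternary = w.pow(2) * (w.pow(2) - 1).pow(2)   # minima at -1, 0, +1
    b = torch.sigmoid(beta)                      # keep the mix in (0, 1)
    return ((1 - b) * binary + b * ternary).mean()

# Usage: total_loss = task_loss + lam * quant_regularizer(layer.weight, beta)
```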

Ranking Policy Gradient

Title Ranking Policy Gradient
Authors Anonymous
Abstract Sample inefficiency is a long-standing problem in reinforcement learning (RL). The state of the art derives a policy from an action-value function, which usually involves an extensive search over the state-action space and unstable optimization. Towards sample-efficient RL, we propose ranking policy gradient (RPG), a policy gradient method that learns the optimal rank of a set of discrete actions. To accelerate the learning of policy gradient methods, we establish the equivalence between maximizing a lower bound of the return and imitating a near-optimal policy without accessing any oracles. These results lead to a general off-policy learning framework that preserves optimality, reduces variance, and improves sample efficiency. We conduct extensive experiments showing that, when combined with the off-policy learning framework, RPG substantially reduces sample complexity compared to the state of the art.
Tasks Policy Gradient Methods
Published 2020-01-01
URL https://openreview.net/forum?id=rJld3hEYvS
PDF https://openreview.net/pdf?id=rJld3hEYvS
PWC https://paperswithcode.com/paper/ranking-policy-gradient-1
Repo
Framework
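
The ranking idea can be sketched with a pairwise logistic loss that pushes the logit of a near-optimal action above every other action's logit, so the policy only needs relative order, not calibrated values. This is a generic ranking loss in the spirit of the abstract, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(logits: torch.Tensor,
                          best_action: torch.Tensor) -> torch.Tensor:
    """logits: (B, A) action scores; best_action: (B,) index of the
    action to rank first (e.g. taken from a near-optimal policy)."""
    best = logits.gather(1, best_action.unsqueeze(1))   # (B, 1)
    margins = best - logits                             # (B, A)
    loss = F.softplus(-margins)                         # -log sigmoid(margin)
    mask = torch.ones_like(loss)
    mask.scatter_(1, best_action.unsqueeze(1), 0.0)     # drop self-comparison
    return (loss * mask).sum(dim=1).mean()
```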

Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs

Title Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs
Authors Anonymous
Abstract We present a deep reinforcement learning approach to minimizing the execution cost of neural network computation graphs in an optimizing compiler. Unlike earlier learning-based works that require training the optimizer on the same graph to be optimized, we propose a learning approach that trains an optimizer offline and then generalizes to previously unseen graphs without further training. This allows our approach to produce high-quality execution decisions on real-world TensorFlow graphs in seconds instead of hours. We consider two optimization tasks for computation graphs: minimizing running time and peak memory usage. In comparison to an extensive set of baselines, our approach achieves significant improvements over classical and other learning-based methods on these two tasks.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rkxDoJBYPB
PDF https://openreview.net/pdf?id=rkxDoJBYPB
PWC https://paperswithcode.com/paper/reinforced-genetic-algorithm-learning-for
Repo
Framework

Scalable Deep Neural Networks via Low-Rank Matrix Factorization

Title Scalable Deep Neural Networks via Low-Rank Matrix Factorization
Authors Anonymous
Abstract Compressing deep neural networks (DNNs) is important for real-world applications operating on resource-constrained devices. However, it is difficult to change the model size once training is completed, so re-training is needed to configure models suitable for different devices. In this paper, we propose a novel method that enables DNNs to flexibly change their size after training. We factorize the weight matrices of the DNNs via singular value decomposition (SVD) and change their ranks according to the target size. In contrast with existing methods, we introduce simple criteria that characterize the importance of each basis and layer, which enables effective compression while increasing the error and complexity of the models as little as possible. In experiments on multiple image-classification tasks, our method exhibits favorable performance compared with other methods.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=rygT_JHtDr
PDF https://openreview.net/pdf?id=rygT_JHtDr
PWC https://paperswithcode.com/paper/scalable-deep-neural-networks-via-low-rank
Repo
Framework
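
Post-training rank truncation is simple to demonstrate: factor a weight matrix with SVD, keep the top-r singular directions, and replace one dense layer with two thinner ones. The sketch below shows only this mechanism; the paper's contribution, the importance criteria used to pick per-layer ranks, is not reproduced here.

```python
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace W (out x in) with the rank-r approximation (U_r S_r) V_r^T."""
    U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]  # fold singular values into U
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features,
                       bias=layer.bias is not None)
    first.weight.data = Vh[:rank].clone()   # (rank, in_features)
    second.weight.data = U_r.clone()        # (out_features, rank)
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

# Usage: a 512x512 layer (262k params) becomes two rank-64 factors (~65k).
layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
```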

Task-Agnostic Robust Encodings for Combating Adversarial Typos

Title Task-Agnostic Robust Encodings for Combating Adversarial Typos
Authors Anonymous
Abstract Despite achieving excellent benchmark performance, state-of-the-art NLP models can still be easily fooled by adversarial perturbations such as typos. Previous heuristic defenses cannot guard against the exponentially large number of possible perturbations, and previous certified defenses only work with limited model sizes and simple architectures. In this paper, we construct task-agnostic robust encodings (TARE): sentence representations that improve the robustness of any model for multiple downstream tasks at once, and enable efficient exact computation of robust accuracy (accuracy on worst-case perturbations) for a fixed family of perturbations. The core idea behind TARE is to map sentences through a discrete bottleneck before feeding them to a downstream model. To create robust encodings, we must optimize for two competing goals: the encoding of a sentence must retain enough information about the sentence, but should also map all perturbations of the sentence to the same encoding to ensure invariance to perturbations. Averaged across six tasks from GLUE, a standard suite of NLP tasks, the same encoding leads to robust accuracy of 71.2% when defending against a large family of typos, while a strong baseline that uses a typo corrector achieves only 38.5% accuracy, and training on random typos achieves only 9.9% accuracy.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rkgOuJBtPS
PDF https://openreview.net/pdf?id=rkgOuJBtPS
PWC https://paperswithcode.com/paper/task-agnostic-robust-encodings-for-combating
Repo
Framework
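
A toy discrete bottleneck shows the invariance/fidelity tension the abstract describes: map every token to a coarse bucket so that many typo variants collapse to one code before the downstream model sees them. The first-letter/sorted-interior/last-letter bucketing below is a classic illustration of invariance to interior scrambles, not the paper's learned encoding.

```python
def bucket(token: str) -> str:
    """Collapse interior-character typos into a single bucket."""
    if len(token) <= 3:
        return token.lower()
    t = token.lower()
    return t[0] + "".join(sorted(t[1:-1])) + t[-1]

def encode_sentence(sentence: str) -> tuple:
    return tuple(bucket(tok) for tok in sentence.split())

# "cetrainly" and "certainly" share first/last letters and interior
# characters, so both sentences map to the same discrete code:
assert encode_sentence("this is cetrainly true") == \
       encode_sentence("this is certainly true")
```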

Graph Convolutional Reinforcement Learning

Title Graph Convolutional Reinforcement Learning
Authors Anonymous
Abstract Learning to cooperate is crucially important in multi-agent environments. The key is to understand the mutual interplay between agents. However, multi-agent environments are highly dynamic, which makes it hard to learn abstract representations of their mutual interplay. To tackle these difficulties, we propose graph convolutional reinforcement learning, where graph convolution adapts to the dynamics of the underlying graph of the multi-agent environment, and relation kernels capture the interplay between agents by their relation representations. Latent features produced by convolutional layers from gradually increased receptive fields are exploited to learn cooperation, and cooperation is further boosted by temporal relation regularization for consistency. Empirically, we show that our method substantially outperforms existing methods in a variety of cooperative scenarios.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HkxdQkSYDB
PDF https://openreview.net/pdf?id=HkxdQkSYDB
PWC https://paperswithcode.com/paper/graph-convolutional-reinforcement-learning-1
Repo
Framework
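
A relation kernel can be read as dot-product attention restricted to an agent's neighborhood in the interaction graph. The single-head sketch below is an assumption-laden illustration of that reading, not the paper's exact layer; `adj` is expected to include self-loops so every row has at least one neighbor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationKernel(nn.Module):
    """Attention over graph neighbors as a stand-in for a relation kernel."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        """h: (N, D) agent features; adj: (N, N) 0/1 neighborhood mask
        with self-loops."""
        scores = self.q(h) @ self.k(h).t() / h.size(1) ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)   # relation weights to neighbors
        return attn @ self.v(h)            # integrate neighbors' features
```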

Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep RL

Title Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep RL
Authors Anonymous
Abstract Saliency maps are often used to suggest explanations of the behavior of deep reinforcement learning (RL) agents. However, the explanations derived from saliency maps are often unfalsifiable and can be highly subjective. We introduce an empirical approach grounded in counterfactual reasoning to test the hypotheses generated from saliency maps and show that explanations suggested by saliency maps are often not supported by experiments. Our experiments suggest that saliency maps are best viewed as an exploratory tool rather than an explanatory tool.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rkl3m1BFDB
PDF https://openreview.net/pdf?id=rkl3m1BFDB
PWC https://paperswithcode.com/paper/exploratory-not-explanatory-counterfactual
Repo
Framework
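
The counterfactual methodology reduces to a simple test: if a saliency map claims a region drives the agent's decision, then intervening on that region should change the behavior. The pixel-occlusion version below is a simplified stand-in (the paper intervenes on game-state semantics rather than raw pixels); `policy` and `region_mask` are hypothetical.

```python
import torch

def counterfactual_check(policy, obs: torch.Tensor,
                         region_mask: torch.Tensor) -> bool:
    """Return True if occluding the salient region changes the action,
    i.e. the saliency-based hypothesis survives the intervention."""
    with torch.no_grad():
        action = policy(obs).argmax(dim=-1)
        intervened = obs * (1 - region_mask)   # occlude the salient region
        action_cf = policy(intervened).argmax(dim=-1)
    return bool((action != action_cf).any())
```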

A Generalized Training Approach for Multiagent Learning

Title A Generalized Training Approach for Multiagent Learning
Authors Anonymous
Abstract This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have focused on two-player zero-sum games, a regime in which Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings, computation of Nash equilibria quickly becomes infeasible. Here, we extend the theoretical underpinnings of PSRO by considering an alternative solution concept, α-Rank, which is unique (and thus faces no equilibrium selection issues, unlike Nash) and tractable to compute in general-sum, many-player settings. We establish convergence guarantees in several classes of games, and identify links between Nash equilibria and α-Rank. We demonstrate the competitive performance of α-Rank-based PSRO against an exact Nash solver-based PSRO in 2-player Kuhn and Leduc Poker. We then go beyond the reach of prior PSRO applications by considering 3- to 5-player poker games, yielding instances where α-Rank achieves faster convergence than approximate Nash solvers, thus establishing it as a favorable general-games solver. We also carry out an initial empirical validation in MuJoCo soccer, illustrating the feasibility of the proposed approach in another complex domain.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Bkl5kxrKDr
PDF https://openreview.net/pdf?id=Bkl5kxrKDr
PWC https://paperswithcode.com/paper/a-generalized-training-approach-for
Repo
Framework

Scalable Model Compression by Entropy Penalized Reparameterization

Title Scalable Model Compression by Entropy Penalized Reparameterization
Authors Anonymous
Abstract We describe a simple and general neural network weight compression approach, in which the network parameters (weights and biases) are represented in a “latent” space, amounting to a reparameterization. This space is equipped with a learned probability model, which is used to impose an entropy penalty on the parameter representation during training, and to compress the representation using a simple arithmetic coder after training. Classification accuracy and model compressibility is maximized jointly, with the bitrate–accuracy trade-off specified by a hyperparameter. We evaluate the method on the MNIST, CIFAR-10 and ImageNet classification benchmarks using six distinct model architectures. Our results show that state-of-the-art model compression can be achieved in a scalable and general way without requiring complex procedures such as multi-stage training.
Tasks Model Compression
Published 2020-01-01
URL https://openreview.net/forum?id=HkgxW0EYDS
PDF https://openreview.net/pdf?id=HkgxW0EYDS
PWC https://paperswithcode.com/paper/scalable-model-compression-by-entropy
Repo
Framework
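
The training objective can be sketched as task loss plus a coding-cost penalty on the reparameterized weights. For brevity the sketch treats the latent tensor itself as the weight matrix (an identity decoder) and uses a diagonal Gaussian with a learned scale as the probability model; both are simplifying assumptions relative to the paper's learned reparameterization and density model.

```python
import math
import torch
import torch.nn as nn

class ReparamLinear(nn.Module):
    def __init__(self, in_f: int, out_f: int):
        super().__init__()
        # latent parameters; identity "decoder" for brevity
        self.latent = nn.Parameter(torch.randn(out_f, in_f) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_f))
        self.log_scale = nn.Parameter(torch.zeros(1))  # learned prior scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.latent.t() + self.bias

    def bit_cost(self) -> torch.Tensor:
        """Approximate bits needed to encode the latent under the prior."""
        scale = self.log_scale.exp()
        nll = 0.5 * (self.latent / scale).pow(2) \
              + self.log_scale + 0.5 * math.log(2 * math.pi)
        return nll.sum() / math.log(2)  # nats -> bits

# Per step: loss = task_loss + lam * sum(m.bit_cost() for m in reparam_layers),
# with lam setting the bitrate-accuracy trade-off, as in the abstract.
```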

ES-MAML: Simple Hessian-Free Meta Learning

Title ES-MAML: Simple Hessian-Free Meta Learning
Authors Anonymous
Abstract We introduce ES-MAML, a new framework for solving the model agnostic meta learning (MAML) problem based on Evolution Strategies (ES). Existing algorithms for MAML are based on policy gradients, and incur significant difficulties when attempting to estimate second derivatives using backpropagation on stochastic policies. We show how ES can be applied to MAML to obtain an algorithm which avoids the problem of estimating second derivatives, and is also conceptually simple and easy to implement. Moreover, ES-MAML can handle new types of nonsmooth adaptation operators, and other techniques for improving performance and estimation of ES methods become applicable. We show empirically that ES-MAML is competitive with existing methods and often yields better adaptation with fewer queries.
Tasks Meta-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=S1exA2NtDB
PDF https://openreview.net/pdf?id=S1exA2NtDB
PWC https://paperswithcode.com/paper/es-maml-simple-hessian-free-meta-learning-1
Repo
Framework
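
The ES side of the method is easy to sketch: estimate the meta-gradient with antithetic Gaussian perturbations of the meta-parameters, treating inner adaptation plus evaluation as a black box, so no backpropagation and hence no second derivatives are needed. `adapt_and_evaluate` is a hypothetical stand-in for one-task adaptation followed by return estimation; hyperparameters are illustrative.

```python
import numpy as np

def es_meta_gradient(theta, adapt_and_evaluate, sigma=0.1, num_pairs=16):
    """Antithetic ES estimate of the gradient of the expected
    post-adaptation return with respect to the meta-parameters theta."""
    grad = np.zeros_like(theta)
    for _ in range(num_pairs):
        eps = np.random.randn(*theta.shape)
        r_plus = adapt_and_evaluate(theta + sigma * eps)   # black-box query
        r_minus = adapt_and_evaluate(theta - sigma * eps)  # antithetic pair
        grad += (r_plus - r_minus) * eps
    return grad / (2 * sigma * num_pairs)

# Meta-update: theta <- theta + lr * es_meta_gradient(theta, adapt_and_evaluate)
```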