October 20, 2019


Paper Group AWR 228

Distributed Prioritized Experience Replay. Parallel Architecture and Hyperparameter Search via Successive Halving and Classification. Certified Robustness to Adversarial Examples with Differential Privacy. DATA Agent. Learning Symmetric and Low-energy Locomotion. Full deep neural network training on a pruned weight budget. Implementing Neural Turin …

Distributed Prioritized Experience Replay


Title	Distributed Prioritized Experience Replay
Authors	Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver
Abstract	We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors. Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.
Tasks	Atari Games
Published	2018-03-02
URL	http://arxiv.org/abs/1803.00933v1
PDF	http://arxiv.org/pdf/1803.00933v1.pdf
PWC	https://paperswithcode.com/paper/distributed-prioritized-experience-replay
Repo	https://github.com/Lyusungwon/apex_dqn_pytorch
Framework	pytorch
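The abstract describes actors that feed a shared replay memory and a learner that replays prioritized samples. The following is a minimal single-process sketch of proportional prioritized sampling, not the distributed architecture from the paper. The class name, the `alpha` exponent value, and the oldest-first eviction rule are assumptions made for illustration; priorities are assumed to be strictly positive.

```python
import random


class PrioritizedReplay:
    """Toy proportional prioritized replay buffer (hypothetical names)."""

    def __init__(self, capacity, alpha=0.6, seed=0):
        self.capacity = capacity
        self.alpha = alpha        # assumed exponent: 0 = uniform, 1 = fully proportional
        self.items = []           # stored transitions
        self.priorities = []      # one positive priority per transition
        self.rng = random.Random(seed)

    def add(self, transition, priority):
        # Actors call this; the oldest transition is evicted when full.
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def probabilities(self):
        # Sampling probability of item i is p_i^alpha / sum_j p_j^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size):
        # The learner calls this to draw a batch, with replacement.
        indices = self.rng.choices(
            range(len(self.items)), weights=self.probabilities(), k=batch_size
        )
        return indices, [self.items[i] for i in indices]

    def update(self, indices, new_priorities):
        # The learner writes back fresh priorities after a training step.
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = p
```

In this sketch, actors would call `add` with an initial priority, and the learner would alternate between `sample` and `update`.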

Parallel Architecture and Hyperparameter Search via Successive Halving and Classification


Title	Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
Authors	Manoj Kumar, George E. Dahl, Vijay Vasudevan, Mohammad Norouzi
Abstract	We present a simple and powerful algorithm for parallel black box optimization called Successive Halving and Classification (SHAC). The algorithm operates in $K$ stages of parallel function evaluations and trains a cascade of binary classifiers to iteratively cull the undesirable regions of the search space. SHAC is easy to implement, requires no tuning of its own configuration parameters, is invariant to the scale of the objective function and can be built using any choice of binary classifier. We adopt tree-based classifiers within SHAC and achieve competitive performance against several strong baselines for optimizing synthetic functions, hyperparameters and architectures.
Tasks
Published	2018-05-25
URL	http://arxiv.org/abs/1805.10255v1
PDF	http://arxiv.org/pdf/1805.10255v1.pdf
PWC	https://paperswithcode.com/paper/parallel-architecture-and-hyperparameter
Repo	https://github.com/titu1994/pyshac
Framework	tf
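The staged "evaluate, classify, cull" loop from the abstract can be sketched on a one-dimensional minimization problem. This is an illustration under stated assumptions, not the authors' implementation: the paper uses tree-based classifiers, whereas this sketch swaps in a trivial interval classifier (accept `x` if it lies within the range spanned by the better half of the last batch). The function name and all parameters are hypothetical.

```python
import random


def shac_minimize(objective, low, high, stages=4, batch=20, seed=0):
    """Toy 1-D successive halving and classification (hypothetical sketch)."""
    rng = random.Random(seed)
    cascade = []  # each stand-in classifier is an (lo, hi) interval of promising x
    best_x, best_y = None, float("inf")

    def accepted(x):
        # A point survives only if every classifier in the cascade accepts it.
        return all(lo <= x <= hi for lo, hi in cascade)

    for _ in range(stages):
        # Rejection-sample a batch from the region not yet culled.
        points = []
        while len(points) < batch:
            x = rng.uniform(low, high)
            if accepted(x):
                points.append(x)
        # In a parallel setting these evaluations would run concurrently.
        scored = sorted((objective(x), x) for x in points)
        if scored[0][0] < best_y:
            best_y, best_x = scored[0]
        # Label the better half positive and fit the stand-in classifier to it.
        good = [x for _, x in scored[: batch // 2]]
        cascade.append((min(good), max(good)))
    return best_x, best_y, cascade
```

Because labels come from ranking a batch against its own median, only the ordering of objective values matters, which is how a scheme like this can stay insensitive to the scale of the objective.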

Certified Robustness to Adversarial Examples with Differential Privacy


Title	Certified Robustness to Adversarial Examples with Differential Privacy
Authors	Mathias Lecuyer, Vaggelis Atlidakis, Roxana Geambasu, Daniel Hsu, Suman Jana
Abstract	Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best effort and have been shown to be vulnerable to sophisticated attacks. Recently a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks, but they either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google’s Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired formalism, that provides a rigorous, generic, and flexible foundation for defense.
Tasks
Published	2018-02-09
URL	https://arxiv.org/abs/1802.03471v4
PDF	https://arxiv.org/pdf/1802.03471v4.pdf
PWC	https://paperswithcode.com/paper/certified-robustness-to-adversarial-examples
Repo	https://github.com/locuslab/smoothing
Framework	pytorch

DATA Agent


Title	DATA Agent
Authors	Michael Cerny Green, Gabriella A. B. Barros, Antonios Liapis, Julian Togelius
Abstract	This paper introduces DATA Agent, a system which creates murder mystery adventures from open data. In the game, the player takes on the role of a detective tasked with finding the culprit of a murder. All characters, places, and items in DATA Agent games are generated using open data as source content. The paper discusses the general game design and user interface of DATA Agent, and provides details on the generative algorithms which transform linked data into different game objects. Findings from a user study with 30 participants playing through two games of DATA Agent show that the game is easy and fun to play, and that the mysteries it generates are straightforward to solve.
Tasks
Published	2018-09-28
URL	https://arxiv.org/abs/1810.02251v1
PDF	https://arxiv.org/pdf/1810.02251v1.pdf
PWC	https://paperswithcode.com/paper/data-agent
Repo	https://github.com/michaelbrave/Procedural-Generation-And-Generative-Systems-Resources
Framework	none

Learning Symmetric and Low-energy Locomotion


Title	Learning Symmetric and Low-energy Locomotion
Authors	Wenhao Yu, Greg Turk, C. Karen Liu
Abstract	Learning locomotion skills is a challenging problem. To generate realistic and smooth locomotion, existing methods use motion capture, finite state machines or morphology-specific knowledge to guide the motion generation algorithms. Deep reinforcement learning (DRL) is a promising approach for the automatic creation of locomotion control. Indeed, a standard benchmark for DRL is to automatically create a running controller for a biped character from a simple reward function. Although several different DRL algorithms can successfully create a running controller, the resulting motions usually look nothing like a real runner. This paper takes a minimalist learning approach to the locomotion problem, without the use of motion examples, finite state machines, or morphology-specific knowledge. We introduce two modifications to the DRL approach that, when used together, produce locomotion behaviors that are symmetric, low-energy, and much closer to that of a real person. First, we introduce a new term to the loss function (not the reward function) that encourages symmetric actions. Second, we introduce a new curriculum learning method that provides modulated physical assistance to help the character with left/right balance and forward movement. The algorithm automatically computes appropriate assistance to the character and gradually relaxes this assistance, so that eventually the character learns to move entirely without help. Because our method does not make use of motion capture data, it can be applied to a variety of character morphologies. We demonstrate locomotion controllers for the lower half of a biped, a full humanoid, a quadruped, and a hexapod. Our results show that learned policies are able to produce symmetric, low-energy gaits. In addition, speed-appropriate gait patterns emerge without any guidance from motion examples or contact planning.
Tasks	Motion Capture
Published	2018-01-24
URL	http://arxiv.org/abs/1801.08093v3
PDF	http://arxiv.org/pdf/1801.08093v3.pdf
PWC	https://paperswithcode.com/paper/learning-symmetric-and-low-energy-locomotion
Repo	https://github.com/jyf588/lrle-rl-examples
Framework	tf
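The symmetry term mentioned in the abstract can be illustrated as a penalty on the difference between the action taken in a state and the mirrored action taken in the mirrored state. The sketch below assumes a toy character whose state and action are (left, right) pairs; `mirror_state`, `mirror_action`, and the policies are hypothetical stand-ins, and the exact form and weighting of the term in the paper may differ.

```python
def symmetry_loss(policy, states, mirror_state, mirror_action):
    """Mean squared gap between pi(s) and mirror_action(pi(mirror_state(s)))."""
    total = 0.0
    for s in states:
        action = policy(s)
        mirrored = mirror_action(policy(mirror_state(s)))
        total += sum((a - m) ** 2 for a, m in zip(action, mirrored))
    return total / len(states)


# Toy setup (assumption): mirroring swaps the left and right components.
def swap(pair):
    return (pair[1], pair[0])


def symmetric_policy(state):
    left, right = state
    return (2.0 * left, 2.0 * right)


def asymmetric_policy(state):
    left, right = state
    return (2.0 * left, 3.0 * right)


states = [(1.0, 2.0)]
loss_sym = symmetry_loss(symmetric_policy, states, swap, swap)
loss_asym = symmetry_loss(asymmetric_policy, states, swap, swap)
```

A policy that treats both sides identically incurs zero penalty, while one that favours a side is penalized. In training, a term like this would be added to the policy optimization loss rather than to the reward.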

Full deep neural network training on a pruned weight budget


Title	Full deep neural network training on a pruned weight budget
Authors	Maximilian Golub, Guy Lemieux, Mieszko Lis
Abstract	We introduce a DNN training technique that learns only a fraction of the full parameter set without incurring an accuracy penalty. To do this, our algorithm constrains the total number of weights updated during backpropagation to those with the highest total gradients. The remaining weights are not tracked, and their initial value is regenerated at every access to avoid storing them in memory. This can dramatically reduce the number of off-chip memory accesses during both training and inference, a key component of the energy needs of DNN accelerators. By ensuring that the total weight diffusion remains close to that of baseline unpruned SGD, networks pruned using our technique are able to retain state-of-the-art accuracy across network architectures – including networks previously identified as difficult to compress, such as Densenet and WRN. With ResNet18 on ImageNet, we observe an 11.7$\times$ weight reduction with no accuracy loss, and up to 24.4$\times$ with a small accuracy impact.
Tasks
Published	2018-06-11
URL	https://arxiv.org/abs/1806.06949v2
PDF	https://arxiv.org/pdf/1806.06949v2.pdf
PWC	https://paperswithcode.com/paper/dropback-continuous-pruning-during-training
Repo	https://github.com/snownus/COOP
Framework	none
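The idea of tracking only a budget of weights and regenerating the rest from their initialization can be sketched as follows. This is a simplified illustration, not the authors' algorithm: the per-index seeding scheme, the function names, and the rule of ranking by absolute accumulated update are all assumptions.

```python
import random


def init_weight(index, seed=0):
    # Regenerate the initial value of weight `index` from the seed alone,
    # so untracked weights never need to be stored (assumed seeding scheme).
    return random.Random(seed * 1_000_003 + index).uniform(-0.1, 0.1)


def dropback_step(tracked, grads, budget, lr=0.1):
    """One simplified update.

    tracked: dict mapping weight index -> accumulated update stored so far.
    grads:   list with one gradient per weight.
    Returns the new tracked dict, holding at most `budget` entries.
    """
    # Untracked weights sit at their initial value, so their accumulated
    # update restarts from zero.
    acc = {i: tracked.get(i, 0.0) - lr * g for i, g in enumerate(grads)}
    # Keep only the weights with the largest accumulated updates.
    keep = sorted(acc, key=lambda i: abs(acc[i]), reverse=True)[:budget]
    return {i: acc[i] for i in keep}


def weight(index, tracked, seed=0):
    # Effective weight = regenerated initial value + stored update, if any.
    return init_weight(index, seed) + tracked.get(index, 0.0)
```

With four weights and a budget of two, only two accumulated updates are ever held in memory; every other weight is recomputed on access.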

Implementing Neural Turing Machines


Title	Implementing Neural Turing Machines
Authors	Mark Collier, Joeran Beel
Abstract	Neural Turing Machines (NTMs) are an instance of Memory Augmented Neural Networks, a new class of recurrent neural networks which decouple computation from memory by introducing an external memory unit. NTMs have demonstrated superior performance over Long Short-Term Memory Cells in several sequence learning tasks. A number of open source implementations of NTMs exist but are unstable during training and/or fail to replicate the reported performance of NTMs. This paper presents the details of our successful implementation of a NTM. Our implementation learns to solve three sequential learning tasks from the original NTM paper. We find that the choice of memory contents initialization scheme is crucial in successfully implementing a NTM. Networks with memory contents initialized to small constant values converge on average 2 times faster than the next best memory contents initialization scheme.
Tasks
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08518v3
PDF	http://arxiv.org/pdf/1807.08518v3.pdf
PWC	https://paperswithcode.com/paper/implementing-neural-turing-machines
Repo	https://github.com/MarkPKCollier/NeuralTuringMachine
Framework	tf

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor


Title	Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Authors	Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
Abstract	A platform for Applied Reinforcement Learning (Applied RL)
Tasks	Continuous Control, Decision Making, Q-Learning
Published	2018-01-04
URL	http://arxiv.org/abs/1801.01290v2
PDF	http://arxiv.org/pdf/1801.01290v2.pdf
PWC	https://paperswithcode.com/paper/soft-actor-critic-off-policy-maximum-entropy
Repo	https://github.com/facebookresearch/Horizon
Framework	pytorch

Learning a Representation Map for Robot Navigation using Deep Variational Autoencoder


Title	Learning a Representation Map for Robot Navigation using Deep Variational Autoencoder
Authors	Kaixin Hu, Peter O’Connor
Abstract	The aim of this work is to use Variational Autoencoder (VAE) to learn a representation of an indoor environment that can be used for robot navigation. We use images extracted from a video, in which a camera takes a tour around a house, for training the VAE model with a 4-dimensional latent space. After the model is trained, each real frame has a corresponding representation point on the manifold in the latent space, and each representation point has a corresponding reconstructed image. For the navigation problem, we map the starting image and destination image to the latent space, then optimize a path on the learned manifold connecting the two points, and finally map the path back through the decoder to a sequence of images. The ideal sequence of images should correspond to a route that is spatially continuous - i.e. neighbor images in the route should correspond to neighbor locations in physical space. Such a route could be used for navigation with computer vision techniques, i.e. a robot could follow the image sequence from starting location to destination in the environment step by step. We implement this algorithm, but find in our experimental results that the resulting route is not satisfactory. The route consists of several discontinuous image frames along the ideal routes, so that the route could not be followed by a robot with computer vision techniques in practice. In our evaluation, we propose two reasons for our failure to automatically find continuous routes: (1) The VAE tends to capture global structures, but discards the details; (2) the Euclidean similarity metric used for measuring continuity between house images is sub-optimal. For further work, we propose: trying other generative models like VAE-GANs which may be better at reconstructing the details to learn the representation map, and adjusting the similarity metric in the path selecting algorithm.
Tasks	Robot Navigation
Published	2018-07-05
URL	http://arxiv.org/abs/1807.02401v2
PDF	http://arxiv.org/pdf/1807.02401v2.pdf
PWC	https://paperswithcode.com/paper/learning-a-representation-map-for-robot
Repo	https://github.com/augustkx/VAE_learning-a-representation-for-navigation
Framework	none

Evolution of a Functionally Diverse Swarm via a Novel Decentralised Quality-Diversity Algorithm


Title	Evolution of a Functionally Diverse Swarm via a Novel Decentralised Quality-Diversity Algorithm
Authors	Emma Hart, Andreas S. W. Steyven, Ben Paechter
Abstract	The presence of functional diversity within a group has been demonstrated to lead to greater robustness, higher performance and increased problem-solving ability in a broad range of studies that includes insect groups, human groups and swarm robotics. Evolving group diversity however has proved challenging within Evolutionary Robotics, requiring reproductive isolation and careful attention to population size and selection mechanisms. To tackle this issue, we introduce a novel, decentralised variant of the MAP-Elites illumination algorithm which is hybridised with a well-known distributed evolutionary algorithm (mEDEA). The algorithm simultaneously evolves multiple diverse behaviours for multiple robots, with respect to a simple token-gathering task. Each robot in the swarm maintains a local archive defined by two pre-specified functional traits which is shared with robots it comes into contact with. We investigate four different strategies for sharing, exploiting and combining local archives and compare results to mEDEA. Experimental results show that in contrast to previous claims, it is possible to evolve a functionally diverse swarm without geographical isolation, and that the new method outperforms mEDEA in terms of the diversity, coverage and precision of the evolved swarm.
Tasks
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07655v1
PDF	http://arxiv.org/pdf/1804.07655v1.pdf
PWC	https://paperswithcode.com/paper/evolution-of-a-functionally-diverse-swarm-via
Repo	https://github.com/asteyven/EDQD-GECCO2018
Framework	none
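The per-robot archive described in the abstract, indexed by two functional traits and exchanged on contact, can be sketched as a small grid of elites. The abstract mentions four sharing strategies without detailing them; the merge below ("keep the fitter elite per cell") is one hypothetical strategy chosen for illustration. The bin count and the assumption that traits lie in [0, 1] are also illustrative.

```python
def cell(traits, bins=5):
    # Map two traits in [0, 1] to a discrete archive cell.
    return tuple(min(int(t * bins), bins - 1) for t in traits)


def insert(archive, genome, traits, fitness, bins=5):
    """Store the genome if its cell is empty or it beats the current elite."""
    key = cell(traits, bins)
    if key not in archive or fitness > archive[key][0]:
        archive[key] = (fitness, genome)
        return True
    return False


def share(archive_a, archive_b):
    """Hypothetical contact rule: per cell, keep whichever elite is fitter."""
    merged = dict(archive_a)
    for key, (fitness, genome) in archive_b.items():
        if key not in merged or fitness > merged[key][0]:
            merged[key] = (fitness, genome)
    return merged


robot_a, robot_b = {}, {}
insert(robot_a, "g1", (0.1, 0.9), fitness=3.0)
insert(robot_b, "g2", (0.1, 0.9), fitness=5.0)   # same cell, fitter
insert(robot_b, "g3", (0.7, 0.2), fitness=1.0)   # different cell
combined = share(robot_a, robot_b)
```

After the exchange, the combined archive covers both cells and holds the fitter genome where the two robots overlapped.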

Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators


Title	Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators
Authors	Yulhwa Kim, Hyungjun Kim, Jae-Joon Kim
Abstract	Recently, RRAM-based Binary Neural Network (BNN) hardware has been gaining interest as it requires only a 1-bit sense-amp and eliminates the need for high-resolution ADC and DAC. However, RRAM-based BNN hardware still requires high-resolution ADC for partial sum calculation to implement large-scale neural networks using multiple memory arrays. We propose a neural network-hardware co-design approach to split input to fit each split network on an RRAM array so that the reconstructed BNNs calculate a 1-bit output neuron in each array. As a result, ADC can be completely eliminated from the design even for large-scale neural networks. Simulation results show that the proposed network reconstruction and retraining recover the inference accuracy of the original BNN. The accuracy loss of the proposed scheme in the CIFAR-10 testcase was less than 1.1% compared to the original network. The code for training and running the proposed BNN models is available at: https://github.com/YulhwaKim/RRAMScalable_BNN.
Tasks
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02187v2
PDF	http://arxiv.org/pdf/1811.02187v2.pdf
PWC	https://paperswithcode.com/paper/neural-network-hardware-co-design-for
Repo	https://github.com/YulhwaKim/RRAMScalable_BNN
Framework	none
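The input-splitting idea can be illustrated with plain arithmetic on +1/-1 values: instead of summing multi-bit partial sums across arrays, each array thresholds its own slice to a single bit. The sketch below is a software analogy only; the function names and the `array_size` parameter are hypothetical, and how the per-array bits are consumed by later layers is left out because the abstract does not specify it.

```python
def array_output(inputs, weights):
    # One array: binarize the dot product of +1/-1 inputs and weights,
    # so only a 1-bit result leaves the array.
    partial_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if partial_sum >= 0 else -1


def split_outputs(inputs, weights, array_size):
    # Split the input so each slice fits one array and yields a 1-bit output.
    return [
        array_output(inputs[i:i + array_size], weights[i:i + array_size])
        for i in range(0, len(inputs), array_size)
    ]


inputs = [1, -1, 1, 1, -1, -1]
weights = [1, 1, 1, -1, -1, 1]
bits = split_outputs(inputs, weights, array_size=3)
```

Here the first slice has partial sum 1 and the second has partial sum -1, so the two arrays emit +1 and -1 without any multi-bit value crossing an array boundary.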

The committee machine: Computational to statistical gaps in learning a two-layers neural network


Title	The committee machine: Computational to statistical gaps in learning a two-layers neural network
Authors	Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, Nicolas Macris, Lenka Zdeborová
Abstract	Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows to perform optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases, and unveiling a large computational gap.
Tasks
Published	2018-06-14
URL	https://arxiv.org/abs/1806.05451v2
PDF	https://arxiv.org/pdf/1806.05451v2.pdf
PWC	https://paperswithcode.com/paper/the-committee-machine-computational-to
Repo	https://github.com/benjaminaubin/TheCommitteeMachine
Framework	none

Treepedia 2.0: Applying Deep Learning for Large-scale Quantification of Urban Tree Cover


Title	Treepedia 2.0: Applying Deep Learning for Large-scale Quantification of Urban Tree Cover
Authors	Bill Yang Cai, Xiaojiang Li, Ian Seiferling, Carlo Ratti
Abstract	Recent advances in deep learning have made it possible to quantify urban metrics at fine resolution, and over large extents using street-level images. Here, we focus on measuring urban tree cover using Google Street View (GSV) images. First, we provide a small-scale labelled validation dataset and propose standard metrics to compare the performance of automated estimations of street tree cover using GSV. We apply state-of-the-art deep learning models, and compare their performance to a previously established benchmark of an unsupervised method. Our training procedure for deep learning models is novel; we utilize the abundance of openly available and similarly labelled street-level image datasets to pre-train our model. We then perform additional training on a small training dataset consisting of GSV images. We find that deep learning models significantly outperform the unsupervised benchmark method. Our semantic segmentation model increased mean intersection-over-union (IoU) from 44.10% to 60.42% relative to the unsupervised method and our end-to-end model decreased Mean Absolute Error from 10.04% to 4.67%. We also employ a recently developed method called gradient-weighted class activation map (Grad-CAM) to interpret the features learned by the end-to-end model. This technique confirms that the end-to-end model has accurately learned to identify tree cover area as key features for predicting percentage tree cover. Our paper provides an example of applying advanced deep learning techniques on a large-scale, geo-tagged and image-based dataset to efficiently estimate important urban metrics. The results demonstrate that deep learning models are highly accurate, can be interpretable, and can also be efficient in terms of data-labelling effort and computational resources.
Tasks	Semantic Segmentation
Published	2018-08-14
URL	http://arxiv.org/abs/1808.04754v1
PDF	http://arxiv.org/pdf/1808.04754v1.pdf
PWC	https://paperswithcode.com/paper/treepedia-20-applying-deep-learning-for-large
Repo	https://github.com/billcai/treepedia_dl_public
Framework	tf
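The two metrics quoted in the abstract, intersection-over-union for segmentation masks and mean absolute error for predicted percentage cover, can be computed as in the sketch below. Masks are assumed to be flat lists of 0/1 labels for the vegetation class, and returning 1.0 when both masks are empty is an assumed convention; the values used are made up for illustration and are not from the paper.

```python
def iou(pred_mask, true_mask):
    # Intersection-over-union for one binary mask pair (1 = tree pixel).
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    return intersection / union if union else 1.0  # assumed empty-mask convention


def mean_iou(pred_masks, true_masks):
    # Average IoU over a set of images.
    scores = [iou(p, t) for p, t in zip(pred_masks, true_masks)]
    return sum(scores) / len(scores)


def mean_absolute_error(pred, truth):
    # Average absolute gap between predicted and labelled percentage cover.
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)


example_iou = iou([1, 1, 0, 0], [1, 0, 1, 0])              # 1 shared pixel of 3
example_mae = mean_absolute_error([10.0, 20.0], [12.0, 17.0])
```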

Post-Processing of Word Representations via Variance Normalization and Dynamic Embedding


Title	Post-Processing of Word Representations via Variance Normalization and Dynamic Embedding
Authors	Bin Wang, Fenxiao Chen, Angela Wang, C. -C. Jay Kuo
Abstract	Although embedded vector representations of words offer impressive performance on many natural language processing (NLP) applications, the information of ordered input sequences is lost to some extent if only context-based samples are used in the training. For further performance improvement, two new post-processing techniques, called post-processing via variance normalization (PVN) and post-processing via dynamic embedding (PDE), are proposed in this work. The PVN method normalizes the variance of principal components of word vectors while the PDE method learns orthogonal latent variables from ordered input sequences. The PVN and the PDE methods can be integrated to achieve better performance. We apply these post-processing techniques to two popular word embedding methods (i.e., word2vec and GloVe) to yield their post-processed representations. Extensive experiments are conducted to demonstrate the effectiveness of the proposed post-processing techniques.
Tasks
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06305v3
PDF	http://arxiv.org/pdf/1808.06305v3.pdf
PWC	https://paperswithcode.com/paper/post-processing-of-word-representations-via
Repo	https://github.com/BinWang28/PVN-Post-Processing-of-word-representation-via-variance-normalization
Framework	none

Text-to-Image-to-Text Translation using Cycle Consistent Adversarial Networks


Title	Text-to-Image-to-Text Translation using Cycle Consistent Adversarial Networks
Authors	Satya Krishna Gorti, Jeremy Ma
Abstract	Text-to-Image translation has been an active area of research in the recent past. The ability for a network to learn the meaning of a sentence and generate an accurate image that depicts the sentence shows the ability of the model to think more like humans. Popular methods on text to image translation make use of Generative Adversarial Networks (GANs) to generate high quality images based on text input, but the generated images don’t always reflect the meaning of the sentence given to the model as input. We address this issue by using a captioning network to caption generated images and exploit the distance between ground truth captions and generated captions to improve the network further. We show extensive comparisons between our method and existing methods.
Tasks
Published	2018-08-14
URL	http://arxiv.org/abs/1808.04538v1
PDF	http://arxiv.org/pdf/1808.04538v1.pdf
PWC	https://paperswithcode.com/paper/text-to-image-to-text-translation-using-cycle
Repo	https://github.com/CSC2548/text2image2textGAN
Framework	pytorch