October 20, 2019

2836 words 14 mins read

Paper Group AWR 261

Behavioral Cloning from Observation. Learning to Drive in a Day. DOOBNet: Deep Object Occlusion Boundary Detection from an Image. Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget. Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant. Context is Everything: Finding Meaning Stat …

Behavioral Cloning from Observation

Title Behavioral Cloning from Observation
Authors Faraz Torabi, Garrett Warnell, Peter Stone
Abstract Humans often learn how to perform tasks via imitation: they observe others perform a task, and then very quickly infer the appropriate actions to take based on their observations. While extending this paradigm to autonomous agents is a well-studied problem in general, there are two particular aspects that have largely been overlooked: (1) that the learning is done from observation only (i.e., without explicit action information), and (2) that the learning is typically done very quickly. In this work, we propose a two-phase, autonomous imitation learning technique called behavioral cloning from observation (BCO) that aims to provide improved performance with respect to both of these aspects. First, we allow the agent to acquire experience in a self-supervised fashion. This experience is used to develop a model which is then utilized to learn a particular task by observing an expert perform that task without knowledge of the specific actions taken. We experimentally compare BCO to imitation learning methods, including the state-of-the-art, generative adversarial imitation learning (GAIL) technique, and we show comparable task performance in several different simulation domains while exhibiting increased learning speed after expert trajectories become available.
Tasks Imitation Learning
Published 2018-05-04
URL http://arxiv.org/abs/1805.01954v2
PDF http://arxiv.org/pdf/1805.01954v2.pdf
PWC https://paperswithcode.com/paper/behavioral-cloning-from-observation
Repo https://github.com/montaserFath/BCO
Framework pytorch
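
The linked repo is a PyTorch implementation. As a rough, hedged illustration of the two BCO phases (learn an inverse dynamics model from self-supervised experience, then infer the expert's actions from state-only demonstrations and clone them), the sketch below uses random tensors in place of a real environment; the dimensions, network sizes, and training settings are placeholders, not the paper's.

```python
# Hedged sketch of the two BCO phases (hypothetical dimensions and synthetic data,
# not the paper's architectures or hyperparameters).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2

# Phase 1: inverse dynamics model learned from the agent's own experience.
inv_dyn = nn.Sequential(nn.Linear(2 * STATE_DIM, 64), nn.ReLU(), nn.Linear(64, ACTION_DIM))
# Self-supervised transitions (s_t, a_t, s_{t+1}); random stand-ins here.
s, a, s_next = torch.randn(512, STATE_DIM), torch.randn(512, ACTION_DIM), torch.randn(512, STATE_DIM)
opt = torch.optim.Adam(inv_dyn.parameters(), lr=1e-3)
for _ in range(200):
    pred_a = inv_dyn(torch.cat([s, s_next], dim=1))
    loss = nn.functional.mse_loss(pred_a, a)
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: infer the expert's (unobserved) actions from state-only demonstrations,
# then clone them with a policy network.
expert_s, expert_s_next = torch.randn(256, STATE_DIM), torch.randn(256, STATE_DIM)
with torch.no_grad():
    inferred_a = inv_dyn(torch.cat([expert_s, expert_s_next], dim=1))

policy = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, ACTION_DIM))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(200):
    loss = nn.functional.mse_loss(policy(expert_s), inferred_a)
    opt.zero_grad(); loss.backward(); opt.step()
```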

Learning to Drive in a Day

Title Learning to Drive in a Day
Authors Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, Amar Shah
Abstract We demonstrate the first application of deep reinforcement learning to autonomous driving. From randomly initialised parameters, our model is able to learn a policy for lane following in a handful of training episodes using a single monocular image as input. We provide a general and easy to obtain reward: the distance travelled by the vehicle without the safety driver taking control. We use a continuous, model-free deep reinforcement learning algorithm, with all exploration and optimisation performed on-vehicle. This demonstrates a new framework for autonomous driving which moves away from reliance on defined logical rules, mapping, and direct supervision. We discuss the challenges and opportunities to scale this approach to a broader range of autonomous driving tasks.
Tasks Autonomous Driving
Published 2018-07-01
URL http://arxiv.org/abs/1807.00412v2
PDF http://arxiv.org/pdf/1807.00412v2.pdf
PWC https://paperswithcode.com/paper/learning-to-drive-in-a-day
Repo https://github.com/nautilusPrime/autodrive_ddpg
Framework none
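
The reward the abstract describes is simple to state in code: accumulate distance travelled each step and end the episode when the safety driver takes over. The snippet below is only an illustration of that reward/termination structure; the environment, speed readings, and intervention signal are hypothetical stand-ins, not the on-vehicle setup.

```python
# Illustrative reward/termination logic for "distance travelled without the
# safety driver taking control". Speeds and the takeover flag are placeholders.
import random

def run_episode(policy, max_steps=1000, dt=0.1):
    total_distance = 0.0
    for _ in range(max_steps):
        speed = policy()                      # commanded/measured speed in m/s (stand-in)
        intervened = random.random() < 0.01   # placeholder for a real takeover signal
        if intervened:
            break                             # episode ends, no further reward
        total_distance += speed * dt          # per-step reward = distance covered
    return total_distance

episode_return = run_episode(policy=lambda: 5.0)
print(f"episode return (metres driven): {episode_return:.1f}")
```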

DOOBNet: Deep Object Occlusion Boundary Detection from an Image

Title DOOBNet: Deep Object Occlusion Boundary Detection from an Image
Authors Guoxia Wang, Xiaohui Liang, Frederick W. B. Li
Abstract Object occlusion boundary detection is a fundamental and crucial research problem in computer vision. It is challenging to solve because of the extreme boundary/non-boundary class imbalance encountered when training an object occlusion boundary detector. In this paper, we propose to address this class imbalance by up-weighting the loss contribution of false negative and false positive examples with our novel Attention Loss function. We also propose a unified end-to-end multi-task deep object occlusion boundary detection network (DOOBNet) by sharing convolutional features to simultaneously predict object boundary and occlusion orientation. DOOBNet adopts an encoder-decoder structure with skip connections in order to automatically learn multi-scale and multi-level features. We significantly surpass the state-of-the-art on the PIOD dataset (ODS F-score of .702) and the BSDS ownership dataset (ODS F-score of .555), as well as improving the detection speed to 0.037s per image on the PIOD dataset.
Tasks Boundary Detection
Published 2018-06-11
URL http://arxiv.org/abs/1806.03772v3
PDF http://arxiv.org/pdf/1806.03772v3.pdf
PWC https://paperswithcode.com/paper/doobnet-deep-object-occlusion-boundary
Repo https://github.com/GuoxiaWang/DOOBNet
Framework tf
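
To make the class-imbalance mechanism concrete, the snippet below shows a class-balanced, focal-style binary loss that up-weights hard false positives and false negatives on a sparse boundary map. This is a generic stand-in for that family of losses, not the paper's exact Attention Loss formula; the alpha/gamma values are placeholders.

```python
# Hedged stand-in for loss re-weighting under extreme boundary/non-boundary
# imbalance; NOT the paper's exact Attention Loss, just the general mechanism.
import torch

def imbalance_aware_bce(logits, targets, alpha=0.9, gamma=2.0):
    """logits, targets: tensors of shape (N, 1, H, W); targets in {0, 1}."""
    p = torch.sigmoid(logits)
    pt = targets * p + (1 - targets) * (1 - p)                    # prob. of the true class
    class_weight = targets * alpha + (1 - targets) * (1 - alpha)  # rebalance rare boundary pixels
    focal_weight = (1 - pt) ** gamma                              # emphasise misclassified pixels
    loss = -class_weight * focal_weight * torch.log(pt.clamp(min=1e-6))
    return loss.mean()

logits = torch.randn(2, 1, 64, 64)
targets = (torch.rand(2, 1, 64, 64) > 0.97).float()  # sparse boundary ground truth
print(imbalance_aware_bce(logits, targets).item())
```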

Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget

Title Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget
Authors Jaewoo Lee, Daniel Kifer
Abstract Iterative algorithms, like gradient descent, are common tools for solving a variety of problems, such as model fitting. For this reason, there is interest in creating differentially private versions of them. However, their conversion to differentially private algorithms is often naive. For instance, a fixed number of iterations are chosen, the privacy budget is split evenly among them, and at each iteration, parameters are updated with a noisy gradient. In this paper, we show that gradient-based algorithms can be improved by a more careful allocation of privacy budget per iteration. Intuitively, at the beginning of the optimization, gradients are expected to be large, so that they do not need to be measured as accurately. However, as the parameters approach their optimal values, the gradients decrease and hence need to be measured more accurately. We add a basic line-search capability that helps the algorithm decide when more accurate gradient measurements are necessary. Our gradient descent algorithm works with the recently introduced zCDP version of differential privacy. It outperforms prior algorithms for model fitting and is competitive with the state-of-the-art for $(\epsilon,\delta)$-differential privacy, a strictly weaker definition than zCDP.
Tasks
Published 2018-08-28
URL http://arxiv.org/abs/1808.09501v1
PDF http://arxiv.org/pdf/1808.09501v1.pdf
PWC https://paperswithcode.com/paper/concentrated-differentially-private-gradient
Repo https://github.com/ppmlguy/DP-AGD
Framework none
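
The core idea, spending a zCDP budget per iteration and spending more when the noisy gradient stops making progress, can be sketched compactly. Under zCDP, the Gaussian mechanism with L2-sensitivity Δ and noise scale σ costs ρ = Δ²/(2σ²) per release. The doubling rule and toy objective below are illustrative assumptions, not the paper's exact line-search procedure.

```python
# Minimal sketch of spending a zCDP budget adaptively across gradient steps.
# The "increase accuracy when progress stalls" rule is an illustrative assumption.
import numpy as np

def noisy_gradient(grad, clip, rho):
    grad = grad * min(1.0, clip / (np.linalg.norm(grad) + 1e-12))  # bound L2 sensitivity
    sigma = clip * np.sqrt(1.0 / (2.0 * rho))                      # Gaussian mechanism calibrated to rho
    return grad + np.random.normal(0.0, sigma, size=grad.shape)

def dp_gd(loss, grad_fn, theta, total_rho=1.0, rho_step=0.01, clip=1.0, lr=0.1):
    spent = 0.0
    while spent + rho_step <= total_rho:
        g = noisy_gradient(grad_fn(theta), clip, rho_step)
        spent += rho_step
        if loss(theta - lr * g) < loss(theta):   # noisy step still makes progress
            theta = theta - lr * g
        else:                                    # gradient too noisy: measure more accurately next time
            rho_step *= 2.0
    return theta

# Toy quadratic objective (public, for illustration only).
A = np.diag([1.0, 5.0]); b = np.array([1.0, -2.0])
loss = lambda t: 0.5 * t @ A @ t - b @ t
grad = lambda t: A @ t - b
print(dp_gd(loss, grad, theta=np.zeros(2)))
```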

Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant

Title Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant
Authors Dominik Marek Loroch, Franz-Josef Pfreundt, Norbert Wehn, Janis Keuper
Abstract Deep learning is finding its way into the embedded world with applications such as autonomous driving, smart sensors and augmented reality. However, the computation of deep neural networks is demanding in energy, compute power and memory. Various approaches have been investigated to reduce the necessary resources, one of which is to leverage the sparsity occurring in deep neural networks due to the high levels of redundancy in the network parameters. It has been shown that sparsity can be promoted specifically and the achieved sparsity can be very high. But in many cases the methods are evaluated on rather small topologies, and it is not clear if the results transfer onto deeper topologies. In this paper, the TensorQuant toolbox has been extended to offer a platform to investigate sparsity, especially in deeper models. Several practically relevant topologies for varying classification problem sizes are investigated to show the differences in sparsity for activations, weights and gradients.
Tasks Autonomous Driving
Published 2018-08-27
URL http://arxiv.org/abs/1808.08784v1
PDF http://arxiv.org/pdf/1808.08784v1.pdf
PWC https://paperswithcode.com/paper/sparsity-in-deep-neural-networks-an-empirical
Repo https://github.com/DominikFHG/TensorQuant
Framework tf
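
The kind of measurement the paper reports (how sparse weights, activations, and gradients actually are) is easy to probe in any framework. The snippet below is a framework-agnostic PyTorch illustration with a toy model, not the TensorQuant API.

```python
# Probe the fraction of (near-)zero entries in weights, activations and gradients.
# Generic illustration; not the TensorQuant toolbox interface.
import torch
import torch.nn as nn

def sparsity(t, tol=1e-6):
    return (t.abs() <= tol).float().mean().item()

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))

activations = {}
model[1].register_forward_hook(lambda m, i, o: activations.update(relu=o))  # capture ReLU output

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

for name, p in model.named_parameters():
    print(f"{name}: weight sparsity {sparsity(p):.3f}, grad sparsity {sparsity(p.grad):.3f}")
print(f"ReLU activation sparsity: {sparsity(activations['relu']):.3f}")
```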

Context is Everything: Finding Meaning Statistically in Semantic Spaces

Title Context is Everything: Finding Meaning Statistically in Semantic Spaces
Authors Eric Zelikman
Abstract This paper introduces Contextual Salience (CoSal), a simple and explicit measure of a word’s importance in context which is a more theoretically natural, practically simpler, and more accurate replacement for tf-idf. CoSal supports very small contexts (20 or more sentences), out-of-context words, and is easy to calculate. A word vector space generated with both bigram phrases and unigram tokens reveals that contextually significant words disproportionately define phrases. This relationship is applied to produce simple weighted bag-of-words sentence embeddings. This model outperforms SkipThought and the best models trained on unordered sentences in most tests in Facebook’s SentEval, beats tf-idf on all available tests, and is generally comparable to the state of the art. This paper also applies CoSal to sentence and document summarization and an improved and context-aware cosine distance. Applying the premise that unexpected words are important, CoSal is presented as a replacement for tf-idf and an intuitive measure of contextual word importance.
Tasks Document Summarization, Sentence Embeddings
Published 2018-03-22
URL http://arxiv.org/abs/1803.08493v5
PDF http://arxiv.org/pdf/1803.08493v5.pdf
PWC https://paperswithcode.com/paper/context-is-everything-finding-meaning
Repo https://github.com/ezelikman/Context-Is-Everything
Framework none
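
A simplified reading of the "unexpected words matter" premise: weight each word by how statistically far its vector lies from the context's mean word vector (a Mahalanobis-style distance under the context covariance), then average word vectors with those weights. The sketch below uses random placeholder embeddings and is an illustration of that idea, not the paper's exact CoSal formulation.

```python
# Hedged sketch: contextual salience as statistical unexpectedness of a word
# vector relative to its context, used to weight a bag-of-words embedding.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "a", "of", "quantum", "entanglement", "teleportation"]
emb = {w: rng.normal(size=50) for w in vocab}              # placeholder word vectors

def cosal_weights(context_words):
    X = np.stack([emb[w] for w in context_words])
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-3 * np.eye(X.shape[1])   # regularised context covariance
    cov_inv = np.linalg.inv(cov)
    return {w: float(np.sqrt((emb[w] - mu) @ cov_inv @ (emb[w] - mu)))
            for w in context_words}

def sentence_embedding(sentence, weights):
    return np.stack([weights[w] * emb[w] for w in sentence]).mean(axis=0)

context = vocab * 5                                         # pretend this is a larger context
w = cosal_weights(context)
print(sorted(w.items(), key=lambda kv: -kv[1])[:3])         # most "unexpected" words
print(sentence_embedding(["quantum", "entanglement", "of", "the"], w)[:5])
```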

Entropy and mutual information in models of deep neural networks

Title Entropy and mutual information in models of deep neural networks
Authors Marylou Gabrié, Andre Manoel, Clément Luneau, Jean Barbier, Nicolas Macris, Florent Krzakala, Lenka Zdeborová
Abstract We examine a class of deep learning models with a tractable method to compute information-theoretic quantities. Our contributions are three-fold: (i) We show how entropies and mutual informations can be derived from heuristic statistical physics methods, under the assumption that weight matrices are independent and orthogonally-invariant. (ii) We extend particular cases in which this result is known to be rigorously exact by providing a proof for two-layer networks with Gaussian random weights, using the recently introduced adaptive interpolation method. (iii) We propose an experiment framework with generative models of synthetic datasets, on which we train deep neural networks with a weight constraint designed so that the assumption in (i) is verified during learning. We study the behavior of entropies and mutual informations throughout learning and conclude that, in the proposed setting, the relationship between compression and generalization remains elusive.
Tasks
Published 2018-05-24
URL http://arxiv.org/abs/1805.09785v2
PDF http://arxiv.org/pdf/1805.09785v2.pdf
PWC https://paperswithcode.com/paper/entropy-and-mutual-information-in-models-of
Repo https://github.com/sphinxteam/dnner
Framework none
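
As a warm-up to the quantities the paper tracks, the single-layer linear-Gaussian case is exactly computable with the textbook Gaussian-channel formula: for T = WX + noise with Gaussian input, I(X; T) = ½ log det(I + WWᵀ/σ²). The snippet below evaluates just that special case; it is not the paper's multi-layer replica/adaptive-interpolation computation.

```python
# Exact mutual information for a single linear-Gaussian layer T = W X + xi,
# X ~ N(0, I), xi ~ N(0, sigma^2 I): I(X; T) = 0.5 * logdet(I + W W^T / sigma^2).
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, sigma = 100, 50, 0.5
W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))  # random weights (illustrative)

_, logdet = np.linalg.slogdet(np.eye(n_out) + (W @ W.T) / sigma**2)
mi_nats = 0.5 * logdet
print(f"I(X; T) = {mi_nats:.2f} nats ({mi_nats / np.log(2):.2f} bits)")
```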

Real Time System for Facial Analysis

Title Real Time System for Facial Analysis
Authors Janne Tommola, Pedram Ghazi, Bishwo Adhikari, Heikki Huttunen
Abstract In this paper we describe the anatomy of a real-time facial analysis system. The system recognizes the age, gender and facial expression of users appearing in front of the camera. All components are based on convolutional neural networks, whose accuracy we study on commonly used training and evaluation sets. A key contribution of the work is the description of the interplay between processing threads for frame grabbing, face detection and the three types of recognition. The Python code for executing the system uses common libraries (Keras/TensorFlow, OpenCV and dlib) and is available for download.
Tasks Face Detection
Published 2018-09-14
URL http://arxiv.org/abs/1809.05474v1
PDF http://arxiv.org/pdf/1809.05474v1.pdf
PWC https://paperswithcode.com/paper/real-time-system-for-facial-analysis
Repo https://github.com/mahehu/TUT-live-age-estimator
Framework tf
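
The thread interplay the abstract highlights can be sketched with a grabber thread that keeps only the freshest frame in a size-1 queue, so detection never lags the camera. The face detector below is OpenCV's built-in Haar cascade used as a stand-in for the paper's CNN detectors, and the recognition networks are omitted; the camera index is an assumption.

```python
# Minimal sketch of the frame-grabber / detector thread interplay.
import queue
import threading
import cv2

frames = queue.Queue(maxsize=1)

def grabber(cap):
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        try:
            frames.get_nowait()          # drop the stale frame, keep the newest
        except queue.Empty:
            pass
        frames.put(frame)

def detector(cascade):
    while True:
        frame = frames.get()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        print(f"faces detected: {len(boxes)}")   # age/gender/expression nets would run here

cap = cv2.VideoCapture(0)                # camera index 0 is an assumption
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
threading.Thread(target=grabber, args=(cap,), daemon=True).start()
threading.Thread(target=detector, args=(cascade,), daemon=True).start()
threading.Event().wait()                 # keep the main thread alive while the workers run
```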

2.5D Visual Sound

Title 2.5D Visual Sound
Authors Ruohan Gao, Kristen Grauman
Abstract Binaural audio provides a listener with 3D sound sensation, allowing a rich perceptual experience of the scene. However, binaural recordings are scarcely available and require nontrivial expertise and equipment to obtain. We propose to convert common monaural audio into binaural audio by leveraging video. The key idea is that visual frames reveal significant spatial cues that, while explicitly lacking in the accompanying single-channel audio, are strongly linked to it. Our multi-modal approach recovers this link from unlabeled video. We devise a deep convolutional neural network that learns to decode the monaural (single-channel) soundtrack into its binaural counterpart by injecting visual information about object and scene configurations. We call the resulting output 2.5D visual sound—the visual stream helps “lift” the flat single channel audio into spatialized sound. In addition to sound generation, we show the self-supervised representation learned by our network benefits audio-visual source separation. Our video results: http://vision.cs.utexas.edu/projects/2.5D_visual_sound/
Tasks
Published 2018-12-11
URL http://arxiv.org/abs/1812.04204v4
PDF http://arxiv.org/pdf/1812.04204v4.pdf
PWC https://paperswithcode.com/paper/25d-visual-sound
Repo https://github.com/facebookresearch/FAIR-Play
Framework none
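
A hedged sketch of the mono-to-binaural idea: condition on a visual feature, predict the left/right difference signal, and reconstruct both channels as (mono ± difference) / 2. The real model works on spectrograms with a U-Net-style network; the tiny MLP and random tensors below are illustrative stand-ins only.

```python
# Predict the difference channel from mono audio + visual features (stand-in model).
import torch
import torch.nn as nn

AUDIO_DIM, VISUAL_DIM = 256, 512   # hypothetical feature sizes

predict_diff = nn.Sequential(
    nn.Linear(AUDIO_DIM + VISUAL_DIM, 512), nn.ReLU(), nn.Linear(512, AUDIO_DIM)
)

mono = torch.randn(4, AUDIO_DIM)           # mono audio features (stand-in)
visual = torch.randn(4, VISUAL_DIM)        # visual frame features (stand-in)
left_gt, right_gt = torch.randn(4, AUDIO_DIM), torch.randn(4, AUDIO_DIM)

diff_pred = predict_diff(torch.cat([mono, visual], dim=1))
left = (mono + diff_pred) / 2              # reconstruct the two channels
right = (mono - diff_pred) / 2
loss = nn.functional.mse_loss(diff_pred, left_gt - right_gt)  # supervise the difference channel
loss.backward()
```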

Asynchronous Bidirectional Decoding for Neural Machine Translation

Title Asynchronous Bidirectional Decoding for Neural Machine Translation
Authors Xiangwen Zhang, Jinsong Su, Yue Qin, Yang Liu, Rongrong Ji, Hongji Wang
Abstract The dominant neural machine translation (NMT) models apply unified attentional encoder-decoder neural networks for translation. Traditionally, the NMT decoders adopt recurrent neural networks (RNNs) to perform translation in a left-to-right manner, leaving the target-side contexts generated from right to left unexploited during translation. In this paper, we equip the conventional attentional encoder-decoder NMT framework with a backward decoder, in order to explore bidirectional decoding for NMT. Attending to the hidden state sequence produced by the encoder, our backward decoder first learns to generate the target-side hidden state sequence from right to left. Then, the forward decoder performs translation in the forward direction, while in each translation prediction timestep, it simultaneously applies two attention models to consider the source-side and reverse target-side hidden states, respectively. With this new architecture, our model is able to fully exploit source- and target-side contexts to improve translation quality altogether. Experimental results on NIST Chinese-English and WMT English-German translation tasks demonstrate that our model achieves substantial improvements over the conventional NMT by 3.14 and 1.38 BLEU points, respectively. The source code of this work can be obtained from https://github.com/DeepLearnXMU/ABDNMT.
Tasks Machine Translation
Published 2018-01-16
URL http://arxiv.org/abs/1801.05122v2
PDF http://arxiv.org/pdf/1801.05122v2.pdf
PWC https://paperswithcode.com/paper/asynchronous-bidirectional-decoding-for
Repo https://github.com/DeepLearnXMU/ABD-NMT
Framework none
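
The two-attention forward step can be illustrated compactly: at each timestep the forward decoder attends once over the encoder states and once over the hidden states produced by the backward (right-to-left) decoder, then feeds both contexts into its recurrent cell. Dimensions, the dot-product attention, and the GRU cell below are deliberate simplifications of the paper's architecture.

```python
# Simplified forward-decoder step with attention over encoder states and over
# backward-decoder states (illustrative shapes and modules).
import torch
import torch.nn as nn

H = 64                                     # shared hidden size (assumption)

def attend(query, keys):                   # plain dot-product attention
    scores = torch.softmax(keys @ query, dim=0)
    return (scores.unsqueeze(1) * keys).sum(dim=0)

enc_states = torch.randn(20, H)            # source-side encoder states
bwd_states = torch.randn(15, H)            # target-side states from the backward decoder

rnn_cell = nn.GRUCell(input_size=H, hidden_size=H)
combine = nn.Linear(3 * H, H)

h = torch.zeros(H)                         # forward-decoder hidden state
prev_emb = torch.randn(H)                  # embedding of the previously emitted token
for _ in range(5):                         # a few forward decoding steps
    src_ctx = attend(h, enc_states)        # source-side context
    rev_ctx = attend(h, bwd_states)        # reverse target-side context
    inp = combine(torch.cat([prev_emb, src_ctx, rev_ctx]))
    h = rnn_cell(inp.unsqueeze(0), h.unsqueeze(0)).squeeze(0)
    prev_emb = torch.randn(H)              # next token embedding would come from the output layer
```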

Deep Retinex Decomposition for Low-Light Enhancement

Title Deep Retinex Decomposition for Low-Light Enhancement
Authors Chen Wei, Wenjing Wang, Wenhan Yang, Jiaying Liu
Abstract The Retinex model is an effective tool for low-light image enhancement. It assumes that observed images can be decomposed into reflectance and illumination. Most existing Retinex-based methods have carefully designed hand-crafted constraints and parameters for this highly ill-posed decomposition, which may be limited by model capacity when applied in various scenes. In this paper, we collect a LOw-Light dataset (LOL) containing low/normal-light image pairs and propose a deep Retinex-Net learned on this dataset, including a Decom-Net for decomposition and an Enhance-Net for illumination adjustment. In the training process for Decom-Net, there is no ground truth of decomposed reflectance and illumination. The network is learned with only key constraints including the consistent reflectance shared by paired low/normal-light images, and the smoothness of illumination. Based on the decomposition, subsequent lightness enhancement is conducted on illumination by an enhancement network called Enhance-Net, and for joint denoising there is a denoising operation on reflectance. The Retinex-Net is end-to-end trainable, so that the learned decomposition is by nature good for lightness adjustment. Extensive experiments demonstrate that our method not only achieves visually pleasing quality for low-light enhancement but also provides a good representation of image decomposition.
Tasks Denoising, Image Enhancement, Low-Light Image Enhancement
Published 2018-08-14
URL http://arxiv.org/abs/1808.04560v1
PDF http://arxiv.org/pdf/1808.04560v1.pdf
PWC https://paperswithcode.com/paper/deep-retinex-decomposition-for-low-light
Repo https://github.com/weichen582/RetinexNet
Framework tf
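
The Decom-Net constraints named in the abstract translate into three loss terms: reconstruct each input as reflectance times illumination, force the paired low/normal images to share one reflectance, and keep illumination smooth. The sketch below uses random stand-in tensors and plain gradient penalties; the paper's smoothness term is structure-aware (weighted by reflectance gradients), and the loss weights here are placeholders.

```python
# Hedged sketch of Decom-Net-style training constraints (stand-in tensors).
import torch
import torch.nn.functional as F

def grad_xy(t):                                  # simple horizontal/vertical gradients
    return t[..., :, 1:] - t[..., :, :-1], t[..., 1:, :] - t[..., :-1, :]

def decom_loss(low, normal, R_low, I_low, R_normal, I_normal):
    recon = F.l1_loss(R_low * I_low, low) + F.l1_loss(R_normal * I_normal, normal)
    consistency = F.l1_loss(R_low, R_normal)     # paired images share one reflectance
    smooth = sum(g.abs().mean() for I in (I_low, I_normal) for g in grad_xy(I))
    return recon + 0.01 * consistency + 0.1 * smooth   # weights are placeholders

low, normal = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
R_low, R_normal = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
I_low, I_normal = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(decom_loss(low, normal, R_low, I_low, R_normal, I_normal).item())
```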

Open3D: A Modern Library for 3D Data Processing

Title Open3D: A Modern Library for 3D Data Processing
Authors Qian-Yi Zhou, Jaesik Park, Vladlen Koltun
Abstract Open3D is an open-source library that supports rapid development of software that deals with 3D data. The Open3D frontend exposes a set of carefully selected data structures and algorithms in both C++ and Python. The backend is highly optimized and is set up for parallelization. Open3D was developed from a clean slate with a small and carefully considered set of dependencies. It can be set up on different platforms and compiled from source with minimal effort. The code is clean, consistently styled, and maintained via a clear code review mechanism. Open3D has been used in a number of published research projects and is actively deployed in the cloud. We welcome contributions from the open-source community.
Tasks
Published 2018-01-30
URL http://arxiv.org/abs/1801.09847v1
PDF http://arxiv.org/pdf/1801.09847v1.pdf
PWC https://paperswithcode.com/paper/open3d-a-modern-library-for-3d-data
Repo https://github.com/IntelVCL/Open3D
Framework none
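
A short usage sketch of the Python frontend: read a point cloud, voxel-downsample it, estimate normals, and visualise. The file path is a placeholder, and the call names assume a recent Open3D release (the API has shifted slightly across versions since the paper).

```python
# Typical Open3D point-cloud pipeline (placeholder input path).
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.ply")                  # placeholder path
pcd = pcd.voxel_down_sample(voxel_size=0.02)                # thin out the cloud
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
o3d.visualization.draw_geometries([pcd])                    # interactive viewer
```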

Transferring GANs: generating images from limited data

Title Transferring GANs: generating images from limited data
Authors Yaxing Wang, Chenshen Wu, Luis Herranz, Joost van de Weijer, Abel Gonzalez-Garcia, Bogdan Raducanu
Abstract Transferring the knowledge of pretrained networks to new domains by means of finetuning is a widely used practice for applications based on discriminative models. To the best of our knowledge this practice has not been studied within the context of generative deep networks. Therefore, we study domain adaptation applied to image generation with generative adversarial networks. We evaluate several aspects of domain adaptation, including the impact of target domain size, the relative distance between source and target domain, and the initialization of conditional GANs. Our results show that using knowledge from pretrained networks can shorten the convergence time and can significantly improve the quality of the generated images, especially when the target data is limited. We show that these conclusions can also be drawn for conditional GANs even when the pretrained model was trained without conditioning. Our results also suggest that density may be more important than diversity and a dataset with one or few densely sampled classes may be a better source model than more diverse datasets such as ImageNet or Places.
Tasks Domain Adaptation, Image Generation
Published 2018-05-04
URL http://arxiv.org/abs/1805.01677v2
PDF http://arxiv.org/pdf/1805.01677v2.pdf
PWC https://paperswithcode.com/paper/transferring-gans-generating-images-from
Repo https://github.com/WuChenshen/MeRGAN
Framework tf
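
The transfer recipe the abstract studies amounts to initialising both generator and discriminator from source-domain checkpoints and then continuing ordinary adversarial training on the small target set. The model classes, checkpoint paths, and non-saturating GAN loss below are assumptions for illustration, not the paper's exact setup.

```python
# Illustrative GAN finetuning on a limited target dataset after loading
# pretrained source-domain weights (commented placeholder paths).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

# G.load_state_dict(torch.load("source_G.pt"))   # pretrained source-domain weights
# D.load_state_dict(torch.load("source_D.pt"))   # (paths are placeholders)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

target_data = torch.rand(64, 784)                # the (limited) target dataset, flattened images
for step in range(100):
    fake = G(torch.randn(64, 100))
    # Discriminator update on real target images vs. current fakes.
    d_loss = bce(D(target_data), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator update (non-saturating loss).
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```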

Breaking the Activation Function Bottleneck through Adaptive Parameterization

Title Breaking the Activation Function Bottleneck through Adaptive Parameterization
Authors Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot
Abstract Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly. We present an adaptive LSTM that advances the state of the art for the Penn Treebank and WikiText-2 word-modeling tasks while using fewer parameters and converging in less than half as many iterations.
Tasks
Published 2018-05-22
URL http://arxiv.org/abs/1805.08574v4
PDF http://arxiv.org/pdf/1805.08574v4.pdf
PWC https://paperswithcode.com/paper/breaking-the-activation-function-bottleneck
Repo https://github.com/flennerhag/alstm
Framework pytorch
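
The general idea of input-conditional parameterization can be shown with a feed-forward layer whose effective transformation is modulated per example by a small side network. This illustrates the mechanism only; the paper's adaptive LSTM is considerably more elaborate.

```python
# Sketch of an input-adaptive linear layer: a side network predicts a
# per-example gain and shift that re-parameterize the base transformation.
import torch
import torch.nn as nn

class AdaptiveLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.adapter = nn.Linear(d_in, 2 * d_out)   # predicts per-example (gain, shift)

    def forward(self, x):
        gain, shift = self.adapter(x).chunk(2, dim=-1)
        return (1 + gain) * self.base(x) + shift    # input-dependent re-parameterization

layer = AdaptiveLinear(32, 64)
print(layer(torch.randn(8, 32)).shape)              # torch.Size([8, 64])
```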

Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction

Title Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
Authors Ningyu Zhang, Shumin Deng, Zhanlin Sun, Xi Chen, Wei Zhang, Huajun Chen
Abstract A capsule is a group of neurons, whose activity vector represents the instantiation parameters of a specific type of entity. In this paper, we explore the capsule networks used for relation extraction in a multi-instance multi-label learning framework and propose a novel neural approach based on capsule networks with attention mechanisms. We evaluate our method with different benchmarks, and it is demonstrated that our method improves the precision of the predicted relations. Particularly, we show that capsule networks improve multiple entity pairs relation extraction.
Tasks Multi-Label Learning, Relation Extraction
Published 2018-12-29
URL http://arxiv.org/abs/1812.11321v1
PDF http://arxiv.org/pdf/1812.11321v1.pdf
PWC https://paperswithcode.com/paper/attention-based-capsule-networks-with-dynamic
Repo https://github.com/WHUNLPLab/Papers-to-read
Framework none
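
For readers unfamiliar with capsules, the routing-by-agreement step the paper builds on looks roughly as follows; the shapes and iteration count are illustrative, and the paper's attention over instances is omitted.

```python
# Compact dynamic routing-by-agreement between capsule layers (illustrative shapes).
import torch

def squash(s, dim=-1, eps=1e-8):
    norm2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm2 / (1 + norm2)) * s / torch.sqrt(norm2 + eps)

def dynamic_routing(u_hat, iterations=3):
    """u_hat: prediction vectors of shape (batch, in_caps, out_caps, out_dim)."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)     # routing logits
    for _ in range(iterations):
        c = torch.softmax(b, dim=2)                            # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)               # weighted sum over input capsules
        v = squash(s)                                          # output capsule vectors
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)           # agreement update
    return v

u_hat = torch.randn(4, 32, 10, 16)      # 32 input capsules -> 10 output capsules of dim 16
print(dynamic_routing(u_hat).shape)     # torch.Size([4, 10, 16])
```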