Paper Group AWR 261
Behavioral Cloning from Observation. Learning to Drive in a Day. DOOBNet: Deep Object Occlusion Boundary Detection from an Image. Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget. Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant. Context is Everything: Finding Meaning Statistically in Semantic Spaces. Entropy and mutual information in models of deep neural networks. Real Time System for Facial Analysis. 2.5D Visual Sound. Asynchronous Bidirectional Decoding for Neural Machine Translation. Deep Retinex Decomposition for Low-Light Enhancement. Open3D: A Modern Library for 3D Data Processing. Transferring GANs: generating images from limited data. Breaking the Activation Function Bottleneck through Adaptive Parameterization. Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction.
Behavioral Cloning from Observation
Title | Behavioral Cloning from Observation |
Authors | Faraz Torabi, Garrett Warnell, Peter Stone |
Abstract | Humans often learn how to perform tasks via imitation: they observe others perform a task, and then very quickly infer the appropriate actions to take based on their observations. While extending this paradigm to autonomous agents is a well-studied problem in general, there are two particular aspects that have largely been overlooked: (1) that the learning is done from observation only (i.e., without explicit action information), and (2) that the learning is typically done very quickly. In this work, we propose a two-phase, autonomous imitation learning technique called behavioral cloning from observation (BCO) that aims to provide improved performance with respect to both of these aspects. First, we allow the agent to acquire experience in a self-supervised fashion. This experience is used to develop a model which is then utilized to learn a particular task by observing an expert perform that task, without knowledge of the specific actions taken. We experimentally compare BCO to imitation learning methods, including the state-of-the-art generative adversarial imitation learning (GAIL) technique, and we show comparable task performance in several different simulation domains while exhibiting increased learning speed after expert trajectories become available. |
Tasks | Imitation Learning |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01954v2 |
PDF | http://arxiv.org/pdf/1805.01954v2.pdf
PWC | https://paperswithcode.com/paper/behavioral-cloning-from-observation |
Repo | https://github.com/montaserFath/BCO |
Framework | pytorch |
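The two-phase procedure in the abstract maps naturally onto a short sketch. Below is a minimal, hypothetical rendering (a classic Gym-style `env` with discrete actions, and sklearn MLPs standing in for the paper's networks): phase one learns an inverse dynamics model from self-supervised interaction; phase two labels the expert's state-only transitions with inferred actions and runs ordinary behavioral cloning.

```python
# Hypothetical sketch of BCO's two phases; `env` follows the classic
# Gym API and actions are assumed discrete.
import numpy as np
from sklearn.neural_network import MLPClassifier

def bco(env, expert_states, pre_demo_steps=5000):
    # Phase 1: self-supervised experience -> inverse dynamics model
    # predicting the action that links consecutive states.
    S, A, S2 = [], [], []
    s = env.reset()
    for _ in range(pre_demo_steps):
        a = env.action_space.sample()
        s2, _, done, _ = env.step(a)
        S.append(s); A.append(a); S2.append(s2)
        s = env.reset() if done else s2
    inv_dyn = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
    inv_dyn.fit(np.hstack([S, S2]), A)

    # Phase 2: infer the expert's (unobserved) actions from its state
    # transitions, then do ordinary behavioral cloning on (s, a_hat).
    s_t, s_t1 = expert_states[:-1], expert_states[1:]
    a_hat = inv_dyn.predict(np.hstack([s_t, s_t1]))
    policy = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
    policy.fit(s_t, a_hat)
    return policy
```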
Learning to Drive in a Day
Title | Learning to Drive in a Day |
Authors | Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, Amar Shah |
Abstract | We demonstrate the first application of deep reinforcement learning to autonomous driving. From randomly initialised parameters, our model is able to learn a policy for lane following in a handful of training episodes using a single monocular image as input. We provide a general and easy-to-obtain reward: the distance travelled by the vehicle without the safety driver taking control. We use a continuous, model-free deep reinforcement learning algorithm, with all exploration and optimisation performed on-vehicle. This demonstrates a new framework for autonomous driving which moves away from reliance on defined logical rules, mapping, and direct supervision. We discuss the challenges and opportunities to scale this approach to a broader range of autonomous driving tasks. |
Tasks | Autonomous Driving |
Published | 2018-07-01 |
URL | http://arxiv.org/abs/1807.00412v2 |
PDF | http://arxiv.org/pdf/1807.00412v2.pdf
PWC | https://paperswithcode.com/paper/learning-to-drive-in-a-day |
Repo | https://github.com/nautilusPrime/autodrive_ddpg |
Framework | none |
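The paper's reward is simply the distance travelled before the safety driver intervenes. A schematic episode loop under that reward might look as follows; `vehicle`, `ddpg` and all their methods are hypothetical stand-ins, not the authors' code.

```python
# Illustrative episode loop matching the paper's reward design.
def run_episode(vehicle, ddpg, dt=0.1):
    obs = vehicle.reset()          # single monocular image
    total_distance = 0.0
    while not vehicle.driver_took_control():
        steer, throttle = ddpg.act(obs)          # continuous actions
        next_obs = vehicle.apply(steer, throttle, dt)
        r = vehicle.speed() * dt                 # metres this step
        total_distance += r
        ddpg.store(obs, (steer, throttle), r, next_obs)
        ddpg.update()                            # optimisation on-vehicle
        obs = next_obs
    return total_distance
```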
DOOBNet: Deep Object Occlusion Boundary Detection from an Image
Title | DOOBNet: Deep Object Occlusion Boundary Detection from an Image |
Authors | Guoxia Wang, Xiaohui Liang, Frederick W. B. Li |
Abstract | Object occlusion boundary detection is a fundamental and crucial research problem in computer vision. It is challenging to solve because training an object occlusion boundary detector encounters an extreme boundary/non-boundary class imbalance. In this paper, we propose to address this class imbalance by up-weighting the loss contribution of false negative and false positive examples with our novel Attention Loss function. We also propose a unified end-to-end multi-task deep object occlusion boundary detection network (DOOBNet) by sharing convolutional features to simultaneously predict object boundary and occlusion orientation. DOOBNet adopts an encoder-decoder structure with skip connections in order to automatically learn multi-scale and multi-level features. We significantly surpass the state-of-the-art on the PIOD dataset (ODS F-score of .702) and the BSDS ownership dataset (ODS F-score of .555), as well as improving the detection speed to 0.037s per image on the PIOD dataset. |
Tasks | Boundary Detection |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.03772v3 |
PDF | http://arxiv.org/pdf/1806.03772v3.pdf
PWC | https://paperswithcode.com/paper/doobnet-deep-object-occlusion-boundary |
Repo | https://github.com/GuoxiaWang/DOOBNet |
Framework | tf |
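The exact Attention Loss is defined in the paper; the sketch below only illustrates the mechanism it shares with focal-style losses: a modulating factor that up-weights confidently wrong (false positive / false negative) pixels, plus a class-balance weight for the rare boundary class.

```python
import torch
import torch.nn.functional as F

def hard_example_weighted_bce(logits, targets, alpha=0.9, gamma=2.0):
    """Focal-style illustration of up-weighting hard boundary pixels.

    This is NOT the paper's exact Attention Loss, just the shared
    mechanism: the modulator grows for confidently wrong pixels, and
    `alpha` rebalances the rare boundary class.
    """
    p = torch.sigmoid(logits)
    p_t = torch.where(targets == 1, p, 1 - p)          # prob. of true class
    w_class = torch.where(targets == 1,
                          torch.full_like(p, alpha),
                          torch.full_like(p, 1 - alpha))
    modulator = (1 - p_t) ** gamma                      # large when wrong
    bce = F.binary_cross_entropy_with_logits(logits, targets.float(),
                                             reduction="none")
    return (w_class * modulator * bce).mean()
```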
Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget
Title | Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget |
Authors | Jaewoo Lee, Daniel Kifer |
Abstract | Iterative algorithms, like gradient descent, are common tools for solving a variety of problems, such as model fitting. For this reason, there is interest in creating differentially private versions of them. However, their conversion to differentially private algorithms is often naive. For instance, a fixed number of iterations are chosen, the privacy budget is split evenly among them, and at each iteration, parameters are updated with a noisy gradient. In this paper, we show that gradient-based algorithms can be improved by a more careful allocation of privacy budget per iteration. Intuitively, at the beginning of the optimization, gradients are expected to be large, so that they do not need to be measured as accurately. However, as the parameters approach their optimal values, the gradients decrease and hence need to be measured more accurately. We add a basic line-search capability that helps the algorithm decide when more accurate gradient measurements are necessary. Our gradient descent algorithm works with the recently introduced zCDP version of differential privacy. It outperforms prior algorithms for model fitting and is competitive with the state-of-the-art for $(\epsilon,\delta)$-differential privacy, a strictly weaker definition than zCDP. |
Tasks | |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09501v1 |
PDF | http://arxiv.org/pdf/1808.09501v1.pdf
PWC | https://paperswithcode.com/paper/concentrated-differentially-private-gradient |
Repo | https://github.com/ppmlguy/DP-AGD |
Framework | none |
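Under zCDP, a Gaussian mechanism with sensitivity C and noise scale sigma consumes rho = C^2 / (2 sigma^2) of the budget, and budgets compose additively across iterations; this is what makes per-iteration allocation natural. The sketch below uses a hypothetical spending schedule in place of the paper's line-search-driven adaptive allocation.

```python
import numpy as np

def zcdp_noisy_gd(grad_fn, theta, rho_total, clip=1.0, T=100, lr=0.1):
    """Minimal sketch of per-iteration budget allocation under zCDP.

    The schedule (spending a growing share of the remaining budget as
    gradients shrink) is an illustrative stand-in for the paper's
    line-search-based decisions.
    """
    rho_left = rho_total
    for t in range(T):
        rho_i = min(rho_left * 0.05 * (1 + t / T), rho_left)  # hypothetical
        sigma = clip / np.sqrt(2 * rho_i)     # invert rho = C^2 / (2 s^2)
        g = grad_fn(theta)
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # clip to C
        theta = theta - lr * (g + np.random.normal(0.0, sigma, g.shape))
        rho_left -= rho_i
        if rho_left <= 0:
            break
    return theta
```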
Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant
Title | Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant |
Authors | Dominik Marek Loroch, Franz-Josef Pfreundt, Norbert Wehn, Janis Keuper |
Abstract | Deep learning is finding its way into the embedded world with applications such as autonomous driving, smart sensors and augmented reality. However, the computation of deep neural networks is demanding in energy, compute power and memory. Various approaches have been investigated to reduce the necessary resources, one of which is to leverage the sparsity occurring in deep neural networks due to the high levels of redundancy in the network parameters. It has been shown that sparsity can be promoted specifically and the achieved sparsity can be very high. But in many cases the methods are evaluated on rather small topologies. It is not clear if the results transfer to deeper topologies. In this paper, the TensorQuant toolbox has been extended to offer a platform to investigate sparsity, especially in deeper models. Several practically relevant topologies for varying classification problem sizes are investigated to show the differences in sparsity for activations, weights and gradients. |
Tasks | Autonomous Driving |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08784v1 |
PDF | http://arxiv.org/pdf/1808.08784v1.pdf
PWC | https://paperswithcode.com/paper/sparsity-in-deep-neural-networks-an-empirical |
Repo | https://github.com/DominikFHG/TensorQuant |
Framework | tf |
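At its core, a sparsity measurement of the kind the extended toolbox performs on activations, weights and gradients reduces to counting near-zero entries; a minimal stand-in:

```python
import numpy as np

def sparsity(tensor, threshold=1e-6):
    """Fraction of entries whose magnitude falls below `threshold`."""
    t = np.asarray(tensor)
    return float(np.mean(np.abs(t) < threshold))

# e.g. ReLU activations of standard-normal pre-activations are ~50% zero:
acts = np.maximum(np.random.randn(1000, 256), 0.0)
print(sparsity(acts))
```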
Context is Everything: Finding Meaning Statistically in Semantic Spaces
Title | Context is Everything: Finding Meaning Statistically in Semantic Spaces |
Authors | Eric Zelikman |
Abstract | This paper introduces Contextual Salience (CoSal), a simple and explicit measure of a word’s importance in context which is a more theoretically natural, practically simpler, and more accurate replacement for tf-idf. CoSal supports very small contexts (20 or more sentences), out-of-context words, and is easy to calculate. A word vector space generated with both bigram phrases and unigram tokens reveals that contextually significant words disproportionately define phrases. This relationship is applied to produce simple weighted bag-of-words sentence embeddings. This model outperforms SkipThought and the best models trained on unordered sentences in most tests in Facebook’s SentEval, beats tf-idf on all available tests, and is generally comparable to the state of the art. This paper also applies CoSal to sentence and document summarization and to an improved, context-aware cosine distance. Applying the premise that unexpected words are important, CoSal is presented as a replacement for tf-idf and an intuitive measure of contextual word importance. |
Tasks | Document Summarization, Sentence Embeddings |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08493v5 |
PDF | http://arxiv.org/pdf/1803.08493v5.pdf
PWC | https://paperswithcode.com/paper/context-is-everything-finding-meaning |
Repo | https://github.com/ezelikman/Context-Is-Everything |
Framework | none |
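One simple reading of "statistically unexpected in context" is the Mahalanobis distance of a word's vector from the context distribution; the sketch below uses that reading to build the weighted bag-of-words embedding, without claiming it is the exact CoSal formula.

```python
import numpy as np

def cosal_weights(word_vecs, context_vecs):
    """Illustrative contextual salience: weight each word by how far
    its vector lies from the context mean under the context covariance
    (a Mahalanobis distance). A sketch, not the paper's exact measure.
    """
    mu = context_vecs.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(context_vecs, rowvar=False))
    diffs = word_vecs - mu
    d = np.sqrt(np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs))
    return d / d.sum()

def sentence_embedding(word_vecs, context_vecs):
    w = cosal_weights(word_vecs, context_vecs)
    return (w[:, None] * word_vecs).sum(axis=0)   # weighted bag of words
```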
Entropy and mutual information in models of deep neural networks
Title | Entropy and mutual information in models of deep neural networks |
Authors | Marylou Gabrié, Andre Manoel, Clément Luneau, Jean Barbier, Nicolas Macris, Florent Krzakala, Lenka Zdeborová |
Abstract | We examine a class of deep learning models with a tractable method to compute information-theoretic quantities. Our contributions are three-fold: (i) We show how entropies and mutual informations can be derived from heuristic statistical physics methods, under the assumption that weight matrices are independent and orthogonally-invariant. (ii) We extend particular cases in which this result is known to be rigorously exact by providing a proof for two-layer networks with Gaussian random weights, using the recently introduced adaptive interpolation method. (iii) We propose an experiment framework with generative models of synthetic datasets, on which we train deep neural networks with a weight constraint designed so that the assumption in (i) is verified during learning. We study the behavior of entropies and mutual informations throughout learning and conclude that, in the proposed setting, the relationship between compression and generalization remains elusive. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09785v2 |
PDF | http://arxiv.org/pdf/1805.09785v2.pdf
PWC | https://paperswithcode.com/paper/entropy-and-mutual-information-in-models-of |
Repo | https://github.com/sphinxteam/dnner |
Framework | none |
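For orientation, the one-layer linear-Gaussian case admits a textbook closed form: for x ~ N(0, I) and noise xi ~ N(0, sigma^2 I), I(X; WX + xi) = (1/2) log det(I + W W^T / sigma^2). The paper's contribution is tractable formulas for multi-layer, non-linear models, which this baseline does not cover.

```python
import numpy as np

def linear_gaussian_mi(W, sigma2):
    """Exact mutual information (in nats) for Y = W x + xi,
    x ~ N(0, I), xi ~ N(0, sigma2 * I):
        I(X; Y) = 0.5 * logdet(I + W W^T / sigma2)
    """
    m = W.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(m) + (W @ W.T) / sigma2)
    return 0.5 * logdet
```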
Real Time System for Facial Analysis
Title | Real Time System for Facial Analysis |
Authors | Janne Tommola, Pedram Ghazi, Bishwo Adhikari, Heikki Huttunen |
Abstract | In this paper we describe the anatomy of a real-time facial analysis system. The system recognizes the age, gender and facial expression of users appearing in front of the camera. All components are based on convolutional neural networks, whose accuracy we study on commonly used training and evaluation sets. A key contribution of the work is the description of the interplay between processing threads for frame grabbing, face detection and the three types of recognition. The Python code for executing the system uses common libraries (keras/tensorflow, opencv and dlib) and is available for download. |
Tasks | Face Detection |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05474v1 |
PDF | http://arxiv.org/pdf/1809.05474v1.pdf
PWC | https://paperswithcode.com/paper/real-time-system-for-facial-analysis |
Repo | https://github.com/mahehu/TUT-live-age-estimator |
Framework | tf |
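The thread interplay the abstract describes follows a familiar pattern: a grabber thread keeps only the freshest frame while slower detection and recognition threads consume it. The queue size and the injected `detector`/`recognizers` callables below are illustrative, not the authors' code.

```python
import threading, queue
import cv2

frames = queue.Queue(maxsize=1)    # always hold the latest frame only

def grab(camera_id=0):
    cap = cv2.VideoCapture(camera_id)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frames.full():          # drop the stale frame, keep it fresh
            frames.get_nowait()
        frames.put(frame)

def detect_and_recognize(detector, recognizers):
    while True:
        frame = frames.get()
        for face in detector(frame):
            for model in recognizers:   # age, gender, expression
                model(face)

threading.Thread(target=grab, daemon=True).start()
```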
2.5D Visual Sound
Title | 2.5D Visual Sound |
Authors | Ruohan Gao, Kristen Grauman |
Abstract | Binaural audio provides a listener with 3D sound sensation, allowing a rich perceptual experience of the scene. However, binaural recordings are scarcely available and require nontrivial expertise and equipment to obtain. We propose to convert common monaural audio into binaural audio by leveraging video. The key idea is that visual frames reveal significant spatial cues that, while explicitly lacking in the accompanying single-channel audio, are strongly linked to it. Our multi-modal approach recovers this link from unlabeled video. We devise a deep convolutional neural network that learns to decode the monaural (single-channel) soundtrack into its binaural counterpart by injecting visual information about object and scene configurations. We call the resulting output 2.5D visual sound—the visual stream helps “lift” the flat single channel audio into spatialized sound. In addition to sound generation, we show the self-supervised representation learned by our network benefits audio-visual source separation. Our video results: http://vision.cs.utexas.edu/projects/2.5D_visual_sound/ |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04204v4 |
PDF | http://arxiv.org/pdf/1812.04204v4.pdf
PWC | https://paperswithcode.com/paper/25d-visual-sound |
Repo | https://github.com/facebookresearch/FAIR-Play |
Framework | none |
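If the mono track is taken as the channel average, recovering binaural audio from a predicted difference signal is a two-liner. The actual model predicts that difference in the spectrogram domain conditioned on visual features, which this sketch abstracts away.

```python
import numpy as np

def to_binaural(mono, predicted_diff):
    """Recover left/right channels from mono plus a predicted difference.

    Assumes the conventions mono = (L + R) / 2 and diff = (L - R) / 2;
    `predicted_diff` stands in for the network's output.
    """
    left = mono + predicted_diff
    right = mono - predicted_diff
    return np.stack([left, right], axis=0)
```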
Asynchronous Bidirectional Decoding for Neural Machine Translation
Title | Asynchronous Bidirectional Decoding for Neural Machine Translation |
Authors | Xiangwen Zhang, Jinsong Su, Yue Qin, Yang Liu, Rongrong Ji, Hongji Wang |
Abstract | The dominant neural machine translation (NMT) models apply unified attentional encoder-decoder neural networks for translation. Traditionally, the NMT decoders adopt recurrent neural networks (RNNs) to perform translation in a left-to-right manner, leaving the target-side contexts generated from right to left unexploited during translation. In this paper, we equip the conventional attentional encoder-decoder NMT framework with a backward decoder, in order to explore bidirectional decoding for NMT. Attending to the hidden state sequence produced by the encoder, our backward decoder first learns to generate the target-side hidden state sequence from right to left. Then, the forward decoder performs translation in the forward direction, while in each translation prediction timestep, it simultaneously applies two attention models to consider the source-side and reverse target-side hidden states, respectively. With this new architecture, our model is able to fully exploit source- and target-side contexts to improve translation quality altogether. Experimental results on NIST Chinese-English and WMT English-German translation tasks demonstrate that our model achieves substantial improvements over the conventional NMT by 3.14 and 1.38 BLEU points, respectively. The source code of this work can be obtained from https://github.com/DeepLearnXMU/ABDNMT. |
Tasks | Machine Translation |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05122v2 |
PDF | http://arxiv.org/pdf/1801.05122v2.pdf
PWC | https://paperswithcode.com/paper/asynchronous-bidirectional-decoding-for |
Repo | https://github.com/DeepLearnXMU/ABD-NMT |
Framework | none |
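Schematically, decoding proceeds in two passes. The module names below are hypothetical, but the structure follows the abstract: a backward decoder first produces right-to-left hidden states, then the forward decoder attends to both the encoder states and those backward states at every timestep.

```python
def translate(src, encoder, bwd_decoder, fwd_decoder, max_len=100):
    enc_states = encoder(src)

    # Pass 1: right-to-left hidden state sequence (no words emitted).
    bwd_states = bwd_decoder.roll_out(enc_states, max_len)

    # Pass 2: ordinary left-to-right generation with two attentions.
    out, state = [], fwd_decoder.init_state(enc_states)
    for _ in range(max_len):
        c_src = fwd_decoder.attend(state, enc_states)   # source context
        c_bwd = fwd_decoder.attend(state, bwd_states)   # future context
        word, state = fwd_decoder.step(state, c_src, c_bwd)
        out.append(word)
        if word == "<eos>":
            break
    return out
```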
Deep Retinex Decomposition for Low-Light Enhancement
Title | Deep Retinex Decomposition for Low-Light Enhancement |
Authors | Chen Wei, Wenjing Wang, Wenhan Yang, Jiaying Liu |
Abstract | The Retinex model is an effective tool for low-light image enhancement. It assumes that observed images can be decomposed into reflectance and illumination. Most existing Retinex-based methods have carefully designed hand-crafted constraints and parameters for this highly ill-posed decomposition, which may be limited by model capacity when applied in various scenes. In this paper, we collect a LOw-Light dataset (LOL) containing low/normal-light image pairs and propose a deep Retinex-Net learned on this dataset, including a Decom-Net for decomposition and an Enhance-Net for illumination adjustment. In the training process for Decom-Net, there is no ground truth of decomposed reflectance and illumination. The network is learned with only key constraints including the consistent reflectance shared by paired low/normal-light images, and the smoothness of illumination. Based on the decomposition, subsequent lightness enhancement is conducted on illumination by an enhancement network called Enhance-Net, and for joint denoising there is a denoising operation on reflectance. The Retinex-Net is end-to-end trainable, so that the learned decomposition is by nature good for lightness adjustment. Extensive experiments demonstrate that our method not only achieves visually pleasing quality for low-light enhancement but also provides a good representation of image decomposition. |
Tasks | Denoising, Image Enhancement, Low-Light Image Enhancement |
Published | 2018-08-14 |
URL | http://arxiv.org/abs/1808.04560v1 |
PDF | http://arxiv.org/pdf/1808.04560v1.pdf
PWC | https://paperswithcode.com/paper/deep-retinex-decomposition-for-low-light |
Repo | https://github.com/weichen582/RetinexNet |
Framework | tf |
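The key Decom-Net constraints named in the abstract (reconstruction as reflectance times illumination, one reflectance shared across the low/normal-light pair, smooth illumination) can be sketched as a loss. The weights below are hypothetical, and the paper's smoothness term is structure-aware rather than the plain finite differences used here.

```python
import torch
import torch.nn.functional as F

def decom_loss(R_low, I_low, R_high, I_high, S_low, S_high,
               w_rc=0.01, w_sm=0.1):
    """Sketch of Decom-Net's key constraints (weights hypothetical)."""
    # Each observed image must be reconstructed as R * I.
    recon = (F.l1_loss(R_low * I_low, S_low) +
             F.l1_loss(R_high * I_high, S_high))
    # Paired low/normal-light images share one reflectance.
    consistency = F.l1_loss(R_low, R_high)

    def smooth(I):  # simple total-variation-style illumination smoothness
        return ((I[..., :, 1:] - I[..., :, :-1]).abs().mean() +
                (I[..., 1:, :] - I[..., :-1, :]).abs().mean())

    return recon + w_rc * consistency + w_sm * (smooth(I_low) + smooth(I_high))
```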
Open3D: A Modern Library for 3D Data Processing
Title | Open3D: A Modern Library for 3D Data Processing |
Authors | Qian-Yi Zhou, Jaesik Park, Vladlen Koltun |
Abstract | Open3D is an open-source library that supports rapid development of software that deals with 3D data. The Open3D frontend exposes a set of carefully selected data structures and algorithms in both C++ and Python. The backend is highly optimized and is set up for parallelization. Open3D was developed from a clean slate with a small and carefully considered set of dependencies. It can be set up on different platforms and compiled from source with minimal effort. The code is clean, consistently styled, and maintained via a clear code review mechanism. Open3D has been used in a number of published research projects and is actively deployed in the cloud. We welcome contributions from the open-source community. |
Tasks | |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.09847v1 |
PDF | http://arxiv.org/pdf/1801.09847v1.pdf
PWC | https://paperswithcode.com/paper/open3d-a-modern-library-for-3d-data |
Repo | https://github.com/IntelVCL/Open3D |
Framework | none |
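A flavour of the Python frontend (using the current o3d.io / o3d.geometry namespaces; the 2018 release exposed flat names such as open3d.read_point_cloud):

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("fragment.ply")       # your file here
down = pcd.voxel_down_sample(voxel_size=0.05)       # spatial downsampling
down.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
o3d.visualization.draw_geometries([down])           # interactive viewer
```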
Transferring GANs: generating images from limited data
Title | Transferring GANs: generating images from limited data |
Authors | Yaxing Wang, Chenshen Wu, Luis Herranz, Joost van de Weijer, Abel Gonzalez-Garcia, Bogdan Raducanu |
Abstract | Transferring the knowledge of pretrained networks to new domains by means of finetuning is a widely used practice for applications based on discriminative models. To the best of our knowledge this practice has not been studied within the context of generative deep networks. Therefore, we study domain adaptation applied to image generation with generative adversarial networks. We evaluate several aspects of domain adaptation, including the impact of target domain size, the relative distance between source and target domain, and the initialization of conditional GANs. Our results show that using knowledge from pretrained networks can shorten the convergence time and can significantly improve the quality of the generated images, especially when the target data is limited. We show that these conclusions can also be drawn for conditional GANs even when the pretrained model was trained without conditioning. Our results also suggest that density may be more important than diversity and a dataset with one or few densely sampled classes may be a better source than more diverse datasets such as ImageNet or Places. |
Tasks | Domain Adaptation, Image Generation |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01677v2 |
PDF | http://arxiv.org/pdf/1805.01677v2.pdf
PWC | https://paperswithcode.com/paper/transferring-gans-generating-images-from |
Repo | https://github.com/WuChenshen/MeRGAN |
Framework | tf |
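The central recipe is to initialise both generator and discriminator from a GAN pretrained on a dense source domain and then continue ordinary adversarial training on the small target set. A schematic loop, with `d_step`/`g_step` as placeholders for whatever GAN updates you use:

```python
import torch

def finetune_gan(G, D, target_loader, d_step, g_step, lr=1e-4):
    """Schematic transfer recipe: G and D are assumed to be initialised
    from source-domain weights (e.g. via load_state_dict); training then
    continues as ordinary adversarial optimisation on the target data.
    """
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    for real in target_loader:       # limited target data
        d_step(D, G, real, opt_d)    # discriminator update
        g_step(D, G, opt_g)          # generator update
```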
Breaking the Activation Function Bottleneck through Adaptive Parameterization
Title | Breaking the Activation Function Bottleneck through Adaptive Parameterization |
Authors | Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot |
Abstract | Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly. We present an adaptive LSTM that advances the state of the art for the Penn Treebank and WikiText-2 word-modeling tasks while using fewer parameters and converging in less than half as many iterations. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08574v4 |
PDF | http://arxiv.org/pdf/1805.08574v4.pdf
PWC | https://paperswithcode.com/paper/breaking-the-activation-function-bottleneck |
Repo | https://github.com/flennerhag/alstm |
Framework | pytorch |
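A toy version of input-conditional parameterization: a small policy network produces per-example adaptations of a base linear map. This illustrates the idea of an adaptive feed-forward layer, not the paper's exact factorization or its adaptive LSTM.

```python
import torch
import torch.nn as nn

class AdaptiveLinear(nn.Module):
    """Linear layer whose parameterization adapts to each input."""
    def __init__(self, d_in, d_out, d_policy=32):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        # Small network emitting an input scale and an output shift.
        self.policy = nn.Sequential(
            nn.Linear(d_in, d_policy), nn.Tanh(),
            nn.Linear(d_policy, d_in + d_out))
        self.d_in = d_in

    def forward(self, x):
        p = self.policy(x)
        scale_in, shift_out = p[:, :self.d_in], p[:, self.d_in:]
        return self.base(x * (1 + scale_in)) + shift_out
```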
Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
Title | Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction |
Authors | Ningyu Zhang, Shumin Deng, Zhanlin Sun, Xi Chen, Wei Zhang, Huajun Chen |
Abstract | A capsule is a group of neurons, whose activity vector represents the instantiation parameters of a specific type of entity. In this paper, we explore the capsule networks used for relation extraction in a multi-instance multi-label learning framework and propose a novel neural approach based on capsule networks with attention mechanisms. We evaluate our method with different benchmarks, and it is demonstrated that our method improves the precision of the predicted relations. Particularly, we show that capsule networks improve relation extraction for multiple entity pairs. |
Tasks | Multi-Label Learning, Relation Extraction |
Published | 2018-12-29 |
URL | http://arxiv.org/abs/1812.11321v1 |
PDF | http://arxiv.org/pdf/1812.11321v1.pdf
PWC | https://paperswithcode.com/paper/attention-based-capsule-networks-with-dynamic |
Repo | https://github.com/WHUNLPLab/Papers-to-read |
Framework | none |
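The routing-by-agreement step such capsule models build on (Sabour et al.'s dynamic routing) can be stated compactly; the attention mechanisms the authors add on top are not shown here.

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Shrink vectors to length < 1 while preserving direction.
    n2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (n2 / (1 + n2)) * s / torch.sqrt(n2 + eps)

def dynamic_routing(u_hat, iters=3):
    """Routing-by-agreement between capsule layers.

    u_hat: lower-capsule predictions, shape (batch, n_in, n_out, d_out).
    Returns the upper-layer capsule vectors, shape (batch, n_out, d_out).
    """
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)  # routing logits
    for _ in range(iters):
        c = F.softmax(b, dim=2)                             # couplings
        v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))    # upper capsules
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)        # agreement
    return v
```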