May 7, 2019

1749 words 9 mins read

Paper Group AWR 106

VisualBackProp: efficient visualization of CNNs

Title VisualBackProp: efficient visualization of CNNs
Authors Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry Jackel, Urs Muller, Karol Zieba
Abstract This paper proposes a new method, which we call VisualBackProp, for visualizing which sets of pixels of the input image contribute most to the predictions made by a convolutional neural network (CNN). The method hinges on the intuition that, moving deeper into the network, the feature maps contain less and less information that is irrelevant to the prediction. The technique was developed as a debugging tool for CNN-based systems for steering self-driving cars and is therefore required to run in real time, i.e. it was designed to require fewer computations than a forward pass. This makes the presented visualization method a valuable debugging tool that can easily be used during both training and inference. We furthermore justify our approach with theoretical arguments, confirming that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction. Our theoretical findings agree with the experimental results. The empirical evaluation shows the plausibility of the proposed approach on road video data as well as in other applications, and reveals that it compares favorably to the layer-wise relevance propagation approach: it obtains similar visualization results while achieving order-of-magnitude speed-ups.
Tasks Self-Driving Cars
Published 2016-11-16
URL http://arxiv.org/abs/1611.05418v3
PDF http://arxiv.org/pdf/1611.05418v3.pdf
PWC https://paperswithcode.com/paper/visualbackprop-efficient-visualization-of
Repo https://github.com/AlexeyZhuravlev/visual-backprop
Framework pytorch
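
As a concrete illustration of the propagation scheme the abstract describes, here is a minimal PyTorch sketch: channel-average each post-ReLU feature map, then walk from the deepest map back toward the input, upsampling and multiplying pointwise. The paper implements the upscaling with unit-weight deconvolutions; bilinear interpolation is used below as a simplification, and the function signature is an assumption for illustration, not the repo's API.

```python
import torch.nn.functional as F

def visual_backprop(feature_maps):
    """feature_maps: list of post-ReLU activations, shallowest first,
    each of shape (N, C, H, W). Returns an (N, 1, H0, W0) relevance mask."""
    # Channel-average each feature map.
    avgs = [fm.mean(dim=1, keepdim=True) for fm in feature_maps]
    # Start from the deepest averaged map and walk back toward the input,
    # upsampling to the shallower map's size and multiplying pointwise.
    mask = avgs[-1]
    for avg in reversed(avgs[:-1]):
        mask = F.interpolate(mask, size=avg.shape[-2:],
                             mode='bilinear', align_corners=False)
        mask = mask * avg
    # Normalize per image to [0, 1] for display.
    mask = mask - mask.amin(dim=(2, 3), keepdim=True)
    return mask / (mask.amax(dim=(2, 3), keepdim=True) + 1e-8)
```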

Definition Modeling: Learning to define word embeddings in natural language

Title Definition Modeling: Learning to define word embeddings in natural language
Authors Thanapon Noraset, Chen Liang, Larry Birnbaum, Doug Downey
Abstract Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks. But, these tasks only evaluate lexical semantics indirectly. In this paper, we study whether it is possible to utilize distributed representations to generate dictionary definitions of words, as a more direct and transparent representation of the embeddings’ semantics. We introduce definition modeling, the task of generating a definition for a given word and its embedding. We present several definition model architectures based on recurrent neural networks, and experiment with the models over multiple data sets. Our results show that a model that controls dependencies between the word being defined and the definition words performs significantly better, and that a character-level convolution layer designed to leverage morphology can complement word-level embeddings. Finally, an error analysis suggests that the errors made by a definition model may provide insight into the shortcomings of word embeddings.
Tasks Word Embeddings
Published 2016-12-01
URL http://arxiv.org/abs/1612.00394v1
PDF http://arxiv.org/pdf/1612.00394v1.pdf
PWC https://paperswithcode.com/paper/definition-modeling-learning-to-define-word
Repo https://github.com/websail-nu/torch-defseq
Framework torch
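
To make the setup concrete, below is a minimal PyTorch sketch of a definition model: an LSTM language model whose initial hidden state is seeded from the embedding of the word being defined. The dimensions and the seeding scheme are assumptions for illustration; the paper explores several conditioning architectures, including a character-level convolution over the word's form, which this sketch omits.

```python
import torch
import torch.nn as nn

class DefinitionModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        # Illustrative conditioning: seed the LSTM state from the word embedding.
        self.to_hidden = nn.Linear(emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_emb, definition_tokens):
        # word_emb: (N, emb_dim) embedding of the word being defined.
        # definition_tokens: (N, T) token ids of the definition so far.
        h0 = torch.tanh(self.to_hidden(word_emb)).unsqueeze(0)  # (1, N, H)
        c0 = torch.zeros_like(h0)
        x = self.embed(definition_tokens)                        # (N, T, E)
        out, _ = self.lstm(x, (h0, c0))
        return self.out(out)                                    # next-token logits
```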

Playing FPS Games with Deep Reinforcement Learning

Title Playing FPS Games with Deep Reinforcement Learning
Authors Guillaume Lample, Devendra Singh Chaplot
Abstract Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games, which involve partially observable states. Typically, deep reinforcement learning methods utilize only visual input for training. We present a method to augment these models to exploit game-feature information, such as the presence of enemies or items, during the training phase. Our model is trained to learn these features simultaneously with minimizing a Q-learning objective, which is shown to dramatically improve the training speed and performance of our agent. Our architecture is also modularized to allow different models to be trained independently for different phases of the game. We show that the proposed architecture substantially outperforms built-in AI agents of the game as well as humans in deathmatch scenarios.
Tasks FPS Games, Game of Doom, Q-Learning
Published 2016-09-18
URL http://arxiv.org/abs/1609.05521v2
PDF http://arxiv.org/pdf/1609.05521v2.pdf
PWC https://paperswithcode.com/paper/playing-fps-games-with-deep-reinforcement
Repo https://github.com/RENHANFEI/vizdoom_health_gathering
Framework pytorch
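
The joint objective the abstract describes, learning game features alongside the Q-learning loss, can be sketched as a shared convolutional trunk with two heads. The layer sizes, the pooling, and the loss weighting below are illustrative assumptions, not the paper's exact DRQN architecture.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeatureAugmentedDQN(nn.Module):
    def __init__(self, n_actions, n_game_features):
        super().__init__()
        # Shared convolutional trunk over raw frames.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((7, 7)), nn.Flatten(),
        )
        self.q_head = nn.Linear(64 * 7 * 7, n_actions)            # action values
        self.feat_head = nn.Linear(64 * 7 * 7, n_game_features)   # e.g. "enemy visible"

    def forward(self, frames):
        z = self.trunk(frames)
        return self.q_head(z), self.feat_head(z)

def joint_loss(q_pred, q_target, feat_logits, feat_labels, lam=1.0):
    # TD error plus supervised game-feature prediction, minimized jointly.
    td = F.smooth_l1_loss(q_pred, q_target)
    aux = F.binary_cross_entropy_with_logits(feat_logits, feat_labels)
    return td + lam * aux
```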

Going Deeper with Contextual CNN for Hyperspectral Image Classification

Title Going Deeper with Contextual CNN for Hyperspectral Image Classification
Authors Hyungtae Lee, Heesung Kwon
Abstract In this paper, we describe a novel deep convolutional neural network (CNN) that is deeper and wider than other existing deep networks for hyperspectral image classification. Unlike current state-of-the-art approaches to CNN-based hyperspectral image classification, the proposed network, called contextual deep CNN, can optimally explore local contextual interactions by jointly exploiting the local spatio-spectral relationships of neighboring individual pixel vectors. The joint exploitation of spatio-spectral information is achieved by a multi-scale convolutional filter bank used as the initial component of the proposed CNN pipeline. The initial spatial and spectral feature maps obtained from the multi-scale filter bank are then combined to form a joint spatio-spectral feature map. The joint feature map, representing rich spectral and spatial properties of the hyperspectral image, is then fed through a fully convolutional network that eventually predicts the corresponding label of each pixel vector. The proposed approach is tested on three benchmark datasets: the Indian Pines dataset, the Salinas dataset and the University of Pavia dataset. Performance comparisons show enhanced classification performance for the proposed approach over the current state-of-the-art on all three datasets.
Tasks Hyperspectral Image Classification, Image Classification
Published 2016-04-12
URL http://arxiv.org/abs/1604.03519v3
PDF http://arxiv.org/pdf/1604.03519v3.pdf
PWC https://paperswithcode.com/paper/going-deeper-with-contextual-cnn-for
Repo https://github.com/eecn/Hyperspectral-Classification
Framework pytorch
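
The multi-scale filter bank can be sketched as parallel convolutions of different spatial extents over the same hyperspectral input, concatenated into a joint spatio-spectral feature map. The channel counts below are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MultiScaleFilterBank(nn.Module):
    def __init__(self, n_bands, out_per_branch=42):
        super().__init__()
        # Parallel branches see the same spectral input at different
        # spatial extents; padding keeps the spatial size unchanged.
        self.b1 = nn.Conv2d(n_bands, out_per_branch, kernel_size=1)
        self.b3 = nn.Conv2d(n_bands, out_per_branch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(n_bands, out_per_branch, kernel_size=5, padding=2)

    def forward(self, x):
        # x: (N, n_bands, H, W) hyperspectral patch.
        # Concatenate branches into a joint spatio-spectral feature map.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)
```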

Nonparametric Spherical Topic Modeling with Word Embeddings

Title Nonparametric Spherical Topic Modeling with Word Embeddings
Authors Kayhan Batmanghelich, Ardavan Saeedi, Karthik Narasimhan, Sam Gershman
Abstract Traditional topic models do not account for semantic regularities in language. Recent distributional representations of words exhibit semantic consistency over directional metrics such as cosine similarity. However, neither categorical nor Gaussian observational distributions used in existing topic models are appropriate to leverage such correlations. In this paper, we propose to use the von Mises-Fisher distribution to model the density of words over a unit sphere. Such a representation is well-suited for directional data. We use a Hierarchical Dirichlet Process for our base topic model and propose an efficient inference algorithm based on Stochastic Variational Inference. This model enables us to naturally exploit the semantic structures of word embeddings while flexibly discovering the number of topics. Experiments demonstrate that our method outperforms competitive approaches in terms of topic coherence on two different text corpora while offering efficient inference.
Tasks Topic Models, Word Embeddings
Published 2016-04-01
URL http://arxiv.org/abs/1604.00126v1
PDF http://arxiv.org/pdf/1604.00126v1.pdf
PWC https://paperswithcode.com/paper/nonparametric-spherical-topic-modeling-with
Repo https://github.com/Ardavans/sHDP
Framework none
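
The von Mises-Fisher observational distribution at the heart of the model has density f(x; μ, κ) = C_p(κ) exp(κ μᵀx) for unit vectors x, where C_p(κ) = κ^(p/2−1) / ((2π)^(p/2) I_(p/2−1)(κ)) and I is a modified Bessel function of the first kind. A small NumPy/SciPy helper for the log density, purely as an illustration of the observational model:

```python
import numpy as np
from scipy.special import ive  # exponentially scaled Bessel, for stability

def log_vmf_density(x, mu, kappa):
    """Log density of von Mises-Fisher at unit vector x, mean direction mu."""
    p = x.shape[-1]
    # log C_p(kappa), using ive(v, k) = iv(v, k) * exp(-k) to avoid overflow.
    log_c = ((p / 2 - 1) * np.log(kappa)
             - (p / 2) * np.log(2 * np.pi)
             - (np.log(ive(p / 2 - 1, kappa)) + kappa))
    return log_c + kappa * np.dot(mu, x)
```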

Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation

Title Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation
Authors Mirko Polato, Fabio Aiolli
Abstract The increasing availability of implicit feedback datasets has raised interest in developing effective collaborative filtering techniques able to deal asymmetrically with unambiguous positive feedback and ambiguous negative feedback. In this paper, we propose a principled kernel-based collaborative filtering method for top-N item recommendation with implicit feedback. We present an efficient implementation using the linear kernel, and we show how to generalize it to kernels of the dot-product family while preserving the efficiency. We also investigate the elements that influence the sparsity of a standard cosine kernel. This analysis shows that the sparsity of the kernel strongly depends on the properties of the dataset, in particular on its long-tail distribution. We compare our method with state-of-the-art algorithms, achieving good results in terms of both efficiency and effectiveness.
Tasks
Published 2016-12-17
URL http://arxiv.org/abs/1612.05729v1
PDF http://arxiv.org/pdf/1612.05729v1.pdf
PWC https://paperswithcode.com/paper/exploiting-sparsity-to-build-efficient-kernel
Repo https://github.com/makgyver/pyros
Framework none
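
For intuition, here is a toy NumPy version of the linear-kernel case: a user's implicit positives are propagated through the item-item Gram matrix to score unseen items. This is a bare sketch of the general idea, not the authors' optimized method (the pyros repo above is the reference implementation).

```python
import numpy as np

def topn_linear_kernel(R, n=10):
    """R: (n_users, n_items) binary implicit-feedback matrix."""
    R = np.asarray(R, dtype=float)
    K = R.T @ R                # item-item linear kernel (Gram matrix)
    scores = R @ K             # propagate each user's positives through the kernel
    scores[R > 0] = -np.inf    # never re-recommend already-seen items
    return np.argsort(-scores, axis=1)[:, :n]
```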

Categorical Reparameterization with Gumbel-Softmax

Title Categorical Reparameterization with Gumbel-Softmax
Authors Eric Jang, Shixiang Gu, Ben Poole
Abstract Categorical variables are a natural choice for representing discrete structure in the world. However, stochastic neural networks rarely use categorical latent variables due to the inability to backpropagate through samples. In this work, we present an efficient gradient estimator that replaces the non-differentiable sample from a categorical distribution with a differentiable sample from a novel Gumbel-Softmax distribution. This distribution has the essential property that it can be smoothly annealed into a categorical distribution. We show that our Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.
Tasks
Published 2016-11-03
URL http://arxiv.org/abs/1611.01144v5
PDF http://arxiv.org/pdf/1611.01144v5.pdf
PWC https://paperswithcode.com/paper/categorical-reparameterization-with-gumbel
Repo https://github.com/tensorflow/models
Framework tf
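
The estimator itself is only a few lines. The sketch below draws Gumbel(0, 1) noise by inverse-transform sampling and applies a temperature-controlled softmax; lower temperatures anneal the relaxed sample toward a one-hot categorical draw. (Recent PyTorch versions also ship this as torch.nn.functional.gumbel_softmax.)

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0, eps=1e-20):
    # Gumbel(0, 1) noise via inverse-transform sampling.
    u = torch.rand_like(logits)
    g = -torch.log(-torch.log(u + eps) + eps)
    # Lower tau anneals the relaxed sample toward a one-hot categorical draw.
    return F.softmax((logits + g) / tau, dim=-1)
```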

SIFT: An Algorithm for Extracting Structural Information From Taxonomies

Title SIFT: An Algorithm for Extracting Structural Information From Taxonomies
Authors Jorge Martinez-Gil
Abstract In this work we present SIFT, a 3-step algorithm for the analysis of the structural information represented by means of a taxonomy. The major advantage of this algorithm is its capability to leverage the information inherent in the hierarchical structure of taxonomies to infer correspondences, which can allow the taxonomies to be merged in a later step. This method is particularly relevant in scenarios where taxonomy alignment techniques that exploit textual information from taxonomy nodes cannot operate successfully.
Tasks
Published 2016-02-23
URL http://arxiv.org/abs/1602.07064v1
PDF http://arxiv.org/pdf/1602.07064v1.pdf
PWC https://paperswithcode.com/paper/sift-an-algorithm-for-extracting-structural
Repo https://github.com/jorgemartinezgil/sift
Framework none

Deep TEN: Texture Encoding Network

Title Deep TEN: Texture Encoding Network
Authors Hang Zhang, Jia Xue, Kristin Dana
Abstract We propose a Deep Texture Encoding Network (Deep-TEN) with a novel Encoding Layer integrated on top of convolutional layers, which ports the entire dictionary-learning and encoding pipeline into a single model. Current methods build from distinct components, using standard encoders with separate off-the-shelf features such as SIFT descriptors or pre-trained CNN features for material recognition. Our new approach provides an end-to-end learning framework, where the inherent visual vocabularies are learned directly from the loss function. The features, dictionaries and the encoding representation for the classifier are all learned simultaneously. The representation is orderless and therefore particularly useful for material and texture recognition. The Encoding Layer generalizes robust residual encoders such as VLAD and Fisher Vectors, and has the property of discarding domain-specific information, which makes the learned convolutional features easier to transfer. Additionally, joint training using multiple datasets of varied sizes and class labels is supported, resulting in increased recognition performance. The experimental results show superior performance compared to state-of-the-art methods on gold-standard databases such as MINC-2500, Flickr Material Database, KTH-TIPS-2b, and two recent databases, 4D-Light-Field-Material and GTOS. The source code for the complete system is publicly available.
Tasks Dictionary Learning, Material Recognition
Published 2016-12-08
URL http://arxiv.org/abs/1612.02844v1
PDF http://arxiv.org/pdf/1612.02844v1.pdf
PWC https://paperswithcode.com/paper/deep-ten-texture-encoding-network
Repo https://github.com/kmaninis/pytorch-encoding
Framework pytorch
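
The Encoding Layer admits a compact sketch: learnable codewords, soft-assignment weights computed from scaled negative squared distances, and an aggregation of weighted residuals over all spatial positions into an orderless representation (a generalization of VLAD). The initialization and the omitted normalization below are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EncodingLayer(nn.Module):
    def __init__(self, dim, n_codewords):
        super().__init__()
        self.codewords = nn.Parameter(torch.randn(n_codewords, dim) * 0.1)
        self.scale = nn.Parameter(torch.ones(n_codewords))  # learned smoothing

    def forward(self, x):
        # x: (N, D, H, W) convolutional features -> (N, H*W, D) descriptors.
        n, d, h, w = x.shape
        x = x.view(n, d, h * w).transpose(1, 2)
        # Residual of every descriptor to every codeword: (N, HW, K, D).
        r = x.unsqueeze(2) - self.codewords.unsqueeze(0).unsqueeze(0)
        # Soft-assignment weights from scaled negative squared distances.
        a = F.softmax(-self.scale * r.pow(2).sum(-1), dim=2)   # (N, HW, K)
        # Aggregate weighted residuals over all spatial positions: (N, K, D).
        e = (a.unsqueeze(-1) * r).sum(1)
        return e.view(n, -1)  # orderless encoding
```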