May 7, 2019

1749 words 9 mins read

Paper Group AWR 106

VisualBackProp: efficient visualization of CNNs

Title VisualBackProp: efficient visualization of CNNs
Authors Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry Jackel, Urs Muller, Karol Zieba
Abstract This paper proposes a new method, which we call VisualBackProp, for visualizing which sets of pixels of the input image contribute most to the predictions made by a convolutional neural network (CNN). The method hinges on the intuition that, moving deeper into the network, the feature maps contain less and less information that is irrelevant to the prediction. The technique was developed as a debugging tool for CNN-based systems for steering self-driving cars and is therefore required to run in real time, i.e. it was designed to require fewer computations than a forward pass. This makes the presented visualization method a valuable debugging tool that can easily be used during both training and inference. We furthermore justify our approach with theoretical arguments, confirming that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction. Our theoretical findings agree with the experimental results. The empirical evaluation shows the plausibility of the proposed approach on road video data as well as in other applications, and reveals that it compares favorably to the layer-wise relevance propagation approach: it obtains similar visualization results while achieving order-of-magnitude speed-ups.
Tasks Self-Driving Cars
Published 2016-11-16
URL http://arxiv.org/abs/1611.05418v3
PDF http://arxiv.org/pdf/1611.05418v3.pdf
PWC https://paperswithcode.com/paper/visualbackprop-efficient-visualization-of
Repo https://github.com/AlexeyZhuravlev/visual-backprop
Framework pytorch
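
As a concrete illustration of the propagation scheme the abstract describes, here is a minimal PyTorch sketch: channel-average each post-ReLU feature map, then walk from the deepest map back toward the input, upsampling and multiplying pointwise. The paper implements the upscaling with unit-weight deconvolutions; bilinear interpolation is used below as a simplification, and the function signature is an assumption for illustration, not the repo's API.

```python
import torch.nn.functional as F

def visual_backprop(feature_maps):
    """feature_maps: list of post-ReLU activations, shallowest first,
    each of shape (N, C, H, W). Returns an (N, 1, H0, W0) relevance mask."""
    # Channel-average each feature map.
    avgs = [fm.mean(dim=1, keepdim=True) for fm in feature_maps]
    # Start from the deepest averaged map and walk back toward the input,
    # upsampling to the shallower map's size and multiplying pointwise.
    mask = avgs[-1]
    for avg in reversed(avgs[:-1]):
        mask = F.interpolate(mask, size=avg.shape[-2:],
                             mode='bilinear', align_corners=False)
        mask = mask * avg
    # Normalize per image to [0, 1] for display.
    mask = mask - mask.amin(dim=(2, 3), keepdim=True)
    return mask / (mask.amax(dim=(2, 3), keepdim=True) + 1e-8)
```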

Definition Modeling: Learning to define word embeddings in natural language

Title Definition Modeling: Learning to define word embeddings in natural language
Authors Thanapon Noraset, Chen Liang, Larry Birnbaum, Doug Downey
Abstract Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks. But, these tasks only evaluate lexical semantics indirectly. In this paper, we study whether it is possible to utilize distributed representations to generate dictionary definitions of words, as a more direct and transparent representation of the embeddings’ semantics. We introduce definition modeling, the task of generating a definition for a given word and its embedding. We present several definition model architectures based on recurrent neural networks, and experiment with the models over multiple data sets. Our results show that a model that controls dependencies between the word being defined and the definition words performs significantly better, and that a character-level convolution layer designed to leverage morphology can complement word-level embeddings. Finally, an error analysis suggests that the errors made by a definition model may provide insight into the shortcomings of word embeddings.
Tasks Word Embeddings
Published 2016-12-01
URL http://arxiv.org/abs/1612.00394v1
PDF http://arxiv.org/pdf/1612.00394v1.pdf
PWC https://paperswithcode.com/paper/definition-modeling-learning-to-define-word
Repo https://github.com/websail-nu/torch-defseq
Framework torch
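
To make the setup concrete, below is a minimal PyTorch sketch of a definition model: an LSTM language model whose initial hidden state is seeded from the embedding of the word being defined. The dimensions and the seeding scheme are assumptions for illustration; the paper explores several conditioning architectures, including a character-level convolution over the word's form, which this sketch omits.

```python
import torch
import torch.nn as nn

class DefinitionModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        # Illustrative conditioning: seed the LSTM state from the word embedding.
        self.to_hidden = nn.Linear(emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_emb, definition_tokens):
        # word_emb: (N, emb_dim) embedding of the word being defined.
        # definition_tokens: (N, T) token ids of the definition so far.
        h0 = torch.tanh(self.to_hidden(word_emb)).unsqueeze(0)  # (1, N, H)
        c0 = torch.zeros_like(h0)
        x = self.embed(definition_tokens)                        # (N, T, E)
        out, _ = self.lstm(x, (h0, c0))
        return self.out(out)                                    # next-token logits
```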

Playing FPS Games with Deep Reinforcement Learning

Title Playing FPS Games with Deep Reinforcement Learning
Authors Guillaume Lample, Devendra Singh Chaplot
Abstract Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games, which involve partially observable states. Typically, deep reinforcement learning methods utilize only visual input for training. We present a method to augment these models to exploit game-feature information, such as the presence of enemies or items, during the training phase. Our model is trained to learn these features simultaneously with minimizing a Q-learning objective, which is shown to dramatically improve the training speed and performance of our agent. Our architecture is also modularized to allow different models to be trained independently for different phases of the game. We show that the proposed architecture substantially outperforms built-in AI agents of the game as well as humans in deathmatch scenarios.
Tasks FPS Games, Game of Doom, Q-Learning
Published 2016-09-18
URL http://arxiv.org/abs/1609.05521v2
PDF http://arxiv.org/pdf/1609.05521v2.pdf
PWC https://paperswithcode.com/paper/playing-fps-games-with-deep-reinforcement
Repo https://github.com/RENHANFEI/vizdoom_health_gathering
Framework pytorch
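
The joint objective the abstract describes, learning game features alongside the Q-learning loss, can be sketched as a shared convolutional trunk with two heads. The layer sizes, the pooling, and the loss weighting below are illustrative assumptions, not the paper's exact DRQN architecture.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeatureAugmentedDQN(nn.Module):
    def __init__(self, n_actions, n_game_features):
        super().__init__()
        # Shared convolutional trunk over raw frames.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((7, 7)), nn.Flatten(),
        )
        self.q_head = nn.Linear(64 * 7 * 7, n_actions)            # action values
        self.feat_head = nn.Linear(64 * 7 * 7, n_game_features)   # e.g. "enemy visible"

    def forward(self, frames):
        z = self.trunk(frames)
        return self.q_head(z), self.feat_head(z)

def joint_loss(q_pred, q_target, feat_logits, feat_labels, lam=1.0):
    # TD error plus supervised game-feature prediction, minimized jointly.
    td = F.smooth_l1_loss(q_pred, q_target)
    aux = F.binary_cross_entropy_with_logits(feat_logits, feat_labels)
    return td + lam * aux
```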

Going Deeper with Contextual CNN for Hyperspectral Image Classification

Title Going Deeper with Contextual CNN for Hyperspectral Image Classification
Authors Hyungtae Lee, Heesung Kwon
Abstract In this paper, we describe a novel deep convolutional neural network (CNN) that is deeper and wider than other existing deep networks for hyperspectral image classification. Unlike current state-of-the-art approaches to CNN-based hyperspectral image classification, the proposed network, called contextual deep CNN, can optimally explore local contextual interactions by jointly exploiting the local spatio-spectral relationships of neighboring individual pixel vectors. The joint exploitation of spatio-spectral information is achieved by a multi-scale convolutional filter bank used as the initial component of the proposed CNN pipeline. The initial spatial and spectral feature maps obtained from the multi-scale filter bank are then combined to form a joint spatio-spectral feature map. The joint feature map, representing rich spectral and spatial properties of the hyperspectral image, is then fed through a fully convolutional network that eventually predicts the corresponding label of each pixel vector. The proposed approach is tested on three benchmark datasets: the Indian Pines dataset, the Salinas dataset and the University of Pavia dataset. Performance comparisons show enhanced classification performance for the proposed approach over the current state-of-the-art on all three datasets.
Tasks Hyperspectral Image Classification, Image Classification
Published 2016-04-12
URL http://arxiv.org/abs/1604.03519v3
PDF http://arxiv.org/pdf/1604.03519v3.pdf
PWC https://paperswithcode.com/paper/going-deeper-with-contextual-cnn-for
Repo https://github.com/eecn/Hyperspectral-Classification
Framework pytorch
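
The multi-scale filter bank can be sketched as parallel convolutions of different spatial extents over the same hyperspectral input, concatenated into a joint spatio-spectral feature map. The channel counts below are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MultiScaleFilterBank(nn.Module):
    def __init__(self, n_bands, out_per_branch=42):
        super().__init__()
        # Parallel branches see the same spectral input at different
        # spatial extents; padding keeps the spatial size unchanged.
        self.b1 = nn.Conv2d(n_bands, out_per_branch, kernel_size=1)
        self.b3 = nn.Conv2d(n_bands, out_per_branch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(n_bands, out_per_branch, kernel_size=5, padding=2)

    def forward(self, x):
        # x: (N, n_bands, H, W) hyperspectral patch.
        # Concatenate branches into a joint spatio-spectral feature map.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)
```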

Nonparametric Spherical Topic Modeling with Word Embeddings

Title Nonparametric Spherical Topic Modeling with Word Embeddings
Authors Kayhan Batmanghelich, Ardavan Saeedi, Karthik Narasimhan, Sam Gershman
Abstract Traditional topic models do not account for semantic regularities in language. Recent distributional representations of words exhibit semantic consistency over directional metrics such as cosine similarity. However, neither categorical nor Gaussian observational distributions used in existing topic models are appropriate to leverage such correlations. In this paper, we propose to use the von Mises-Fisher distribution to model the density of words over a unit sphere. Such a representation is well-suited for directional data. We use a Hierarchical Dirichlet Process for our base topic model and propose an efficient inference algorithm based on Stochastic Variational Inference. This model enables us to naturally exploit the semantic structures of word embeddings while flexibly discovering the number of topics. Experiments demonstrate that our method outperforms competitive approaches in terms of topic coherence on two different text corpora while offering efficient inference.
Tasks Topic Models, Word Embeddings
Published 2016-04-01
URL http://arxiv.org/abs/1604.00126v1
PDF http://arxiv.org/pdf/1604.00126v1.pdf
PWC https://paperswithcode.com/paper/nonparametric-spherical-topic-modeling-with
Repo https://github.com/Ardavans/sHDP
Framework none
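
The von Mises-Fisher observational distribution at the heart of the model has density f(x; μ, κ) = C_p(κ) exp(κ μᵀx) for unit vectors x, where C_p(κ) = κ^(p/2−1) / ((2π)^(p/2) I_(p/2−1)(κ)) and I is a modified Bessel function of the first kind. A small NumPy/SciPy helper for the log density, purely as an illustration of the observational model:

```python
import numpy as np
from scipy.special import ive  # exponentially scaled Bessel, for stability

def log_vmf_density(x, mu, kappa):
    """Log density of von Mises-Fisher at unit vector x, mean direction mu."""
    p = x.shape[-1]
    # log C_p(kappa), using ive(v, k) = iv(v, k) * exp(-k) to avoid overflow.
    log_c = ((p / 2 - 1) * np.log(kappa)
             - (p / 2) * np.log(2 * np.pi)
             - (np.log(ive(p / 2 - 1, kappa)) + kappa))
    return log_c + kappa * np.dot(mu, x)
```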

Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation

Title Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation
Authors Mirko Polato, Fabio Aiolli
Abstract The increasing availability of implicit feedback datasets has raised interest in developing effective collaborative filtering techniques able to deal asymmetrically with unambiguous positive feedback and ambiguous negative feedback. In this paper, we propose a principled kernel-based collaborative filtering method for top-N item recommendation with implicit feedback. We present an efficient implementation using the linear kernel, and we show how to generalize it to kernels of the dot-product family while preserving the efficiency. We also investigate the elements that influence the sparsity of a standard cosine kernel. This analysis shows that the sparsity of the kernel strongly depends on the properties of the dataset, in particular on its long-tail distribution. We compare our method with state-of-the-art algorithms, achieving good results in terms of both efficiency and effectiveness.
Tasks
Published 2016-12-17
URL http://arxiv.org/abs/1612.05729v1
PDF http://arxiv.org/pdf/1612.05729v1.pdf
PWC https://paperswithcode.com/paper/exploiting-sparsity-to-build-efficient-kernel
Repo https://github.com/makgyver/pyros
Framework none
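
For intuition, here is a toy NumPy version of the linear-kernel case: a user's implicit positives are propagated through the item-item Gram matrix to score unseen items. This is a bare sketch of the general idea, not the authors' optimized method (the pyros repo above is the reference implementation).

```python
import numpy as np

def topn_linear_kernel(R, n=10):
    """R: (n_users, n_items) binary implicit-feedback matrix."""
    R = np.asarray(R, dtype=float)
    K = R.T @ R                # item-item linear kernel (Gram matrix)
    scores = R @ K             # propagate each user's positives through the kernel
    scores[R > 0] = -np.inf    # never re-recommend already-seen items
    return np.argsort(-scores, axis=1)[:, :n]
```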

Categorical Reparameterization with Gumbel-Softmax

Title Categorical Reparameterization with Gumbel-Softmax
Authors Eric Jang, Shixiang Gu, Ben Poole
Abstract Categorical variables are a natural choice for representing discrete structure in the world. However, stochastic neural networks rarely use categorical latent variables due to the inability to backpropagate through samples. In this work, we present an efficient gradient estimator that replaces the non-differentiable sample from a categorical distribution with a differentiable sample from a novel Gumbel-Softmax distribution. This distribution has the essential property that it can be smoothly annealed into a categorical distribution. We show that our Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.
Tasks
Published 2016-11-03
URL http://arxiv.org/abs/1611.01144v5
PDF http://arxiv.org/pdf/1611.01144v5.pdf
PWC https://paperswithcode.com/paper/categorical-reparameterization-with-gumbel
Repo https://github.com/tensorflow/models
Framework tf
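
The estimator itself is only a few lines. The sketch below draws Gumbel(0, 1) noise by inverse-transform sampling and applies a temperature-controlled softmax; lower temperatures anneal the relaxed sample toward a one-hot categorical draw. (Recent PyTorch versions also ship this as torch.nn.functional.gumbel_softmax.)

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0, eps=1e-20):
    # Gumbel(0, 1) noise via inverse-transform sampling.
    u = torch.rand_like(logits)
    g = -torch.log(-torch.log(u + eps) + eps)
    # Lower tau anneals the relaxed sample toward a one-hot categorical draw.
    return F.softmax((logits + g) / tau, dim=-1)
```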

SIFT: An Algorithm for Extracting Structural Information From Taxonomies

Title SIFT: An Algorithm for Extracting Structural Information From Taxonomies
Authors Jorge Martinez-Gil
Abstract In this work we present SIFT, a 3-step algorithm for the analysis of the structural information represented by means of a taxonomy. The major advantage of this algorithm is its capability to leverage the information inherent in the hierarchical structure of taxonomies to infer correspondences, which can allow the taxonomies to be merged in a later step. This method is particularly relevant in scenarios where taxonomy alignment techniques that exploit textual information from taxonomy nodes cannot operate successfully.
Tasks
Published 2016-02-23
URL http://arxiv.org/abs/1602.07064v1
PDF http://arxiv.org/pdf/1602.07064v1.pdf
PWC https://paperswithcode.com/paper/sift-an-algorithm-for-extracting-structural
Repo https://github.com/jorgemartinezgil/sift
Framework none

Deep TEN: Texture Encoding Network

Title Deep TEN: Texture Encoding Network
Authors Hang Zhang, Jia Xue, Kristin Dana
Abstract We propose a Deep Texture Encoding Network (Deep-TEN) with a novel Encoding Layer integrated on top of convolutional layers, which ports the entire dictionary-learning and encoding pipeline into a single model. Current methods build from distinct components, using standard encoders with separate off-the-shelf features such as SIFT descriptors or pre-trained CNN features for material recognition. Our new approach provides an end-to-end learning framework, where the inherent visual vocabularies are learned directly from the loss function. The features, dictionaries and the encoding representation for the classifier are all learned simultaneously. The representation is orderless and therefore particularly useful for material and texture recognition. The Encoding Layer generalizes robust residual encoders such as VLAD and Fisher Vectors, and has the property of discarding domain-specific information, which makes the learned convolutional features easier to transfer. Additionally, joint training using multiple datasets of varied sizes and class labels is supported, resulting in increased recognition performance. The experimental results show superior performance compared to state-of-the-art methods on gold-standard databases such as MINC-2500, Flickr Material Database, KTH-TIPS-2b, and two recent databases, 4D-Light-Field-Material and GTOS. The source code for the complete system is publicly available.
Tasks Dictionary Learning, Material Recognition
Published 2016-12-08
URL http://arxiv.org/abs/1612.02844v1
PDF http://arxiv.org/pdf/1612.02844v1.pdf
PWC https://paperswithcode.com/paper/deep-ten-texture-encoding-network
Repo https://github.com/kmaninis/pytorch-encoding
Framework pytorch
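
The Encoding Layer admits a compact sketch: learnable codewords, soft-assignment weights computed from scaled negative squared distances, and an aggregation of weighted residuals over all spatial positions into an orderless representation (a generalization of VLAD). The initialization and the omitted normalization below are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EncodingLayer(nn.Module):
    def __init__(self, dim, n_codewords):
        super().__init__()
        self.codewords = nn.Parameter(torch.randn(n_codewords, dim) * 0.1)
        self.scale = nn.Parameter(torch.ones(n_codewords))  # learned smoothing

    def forward(self, x):
        # x: (N, D, H, W) convolutional features -> (N, H*W, D) descriptors.
        n, d, h, w = x.shape
        x = x.view(n, d, h * w).transpose(1, 2)
        # Residual of every descriptor to every codeword: (N, HW, K, D).
        r = x.unsqueeze(2) - self.codewords.unsqueeze(0).unsqueeze(0)
        # Soft-assignment weights from scaled negative squared distances.
        a = F.softmax(-self.scale * r.pow(2).sum(-1), dim=2)   # (N, HW, K)
        # Aggregate weighted residuals over all spatial positions: (N, K, D).
        e = (a.unsqueeze(-1) * r).sum(1)
        return e.view(n, -1)  # orderless encoding
```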