May 7, 2019

Paper Group AWR 1

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues


Title	A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Authors	Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio
Abstract	Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue. In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with recent neural network architectures. We evaluate the model performance through automatic evaluation metrics and by carrying out a human evaluation. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate the generation of long outputs and maintain the context.
Tasks
Published	2016-05-19
URL	http://arxiv.org/abs/1605.06069v3
PDF	http://arxiv.org/pdf/1605.06069v3.pdf
PWC	https://paperswithcode.com/paper/a-hierarchical-latent-variable-encoder
Repo	https://github.com/julianser/hed-dlg-truncated
Framework	none

Artistic style transfer for videos


Title	Artistic style transfer for videos
Authors	Manuel Ruder, Alexey Dosovitskiy, Thomas Brox
Abstract	In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Doing this for a video sequence single-handed was beyond imagination. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. We make use of recent advances in style transfer in still images and propose new initializations and loss functions applicable to videos. This allows us to generate consistent and stable stylized video sequences, even in cases with large motion and strong occlusion. We show that the proposed method clearly outperforms simpler baselines both qualitatively and quantitatively.
Tasks	Style Transfer
Published	2016-04-28
URL	http://arxiv.org/abs/1604.08610v2
PDF	http://arxiv.org/pdf/1604.08610v2.pdf
PWC	https://paperswithcode.com/paper/artistic-style-transfer-for-videos
Repo	https://github.com/manuelruder/artistic-videos
Framework	torch

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge


Title	Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Authors	Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan
Abstract	Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. Finally, given the recent surge of interest in this task, a competition was organized in 2015 using the newly released COCO dataset. We describe and analyze the various improvements we applied to our own baseline and show the resulting performance in the competition, which we won ex-aequo with a team from Microsoft Research, and provide an open source implementation in TensorFlow.
Tasks	Image Captioning
Published	2016-09-21
URL	http://arxiv.org/abs/1609.06647v1
PDF	http://arxiv.org/pdf/1609.06647v1.pdf
PWC	https://paperswithcode.com/paper/show-and-tell-lessons-learned-from-the-2015
Repo	https://github.com/HughKu/Im2txt
Framework	tf

AMOS: An Automated Model Order Selection Algorithm for Spectral Graph Clustering


Title	AMOS: An Automated Model Order Selection Algorithm for Spectral Graph Clustering
Authors	Pin-Yu Chen, Thibaut Gensollen, Alfred O. Hero III
Abstract	One of the longstanding problems in spectral graph clustering (SGC) is the so-called model order selection problem: automated selection of the correct number of clusters. This is equivalent to the problem of finding the number of connected components or communities in an undirected graph. In this paper, we propose AMOS, an automated model order selection algorithm for SGC. Based on a recent analysis of clustering reliability for SGC under the random interconnection model, AMOS works by incrementally increasing the number of clusters, estimating the quality of identified clusters, and providing a series of clustering reliability tests. Consequently, AMOS outputs clusters of minimal model order with statistical clustering reliability guarantees. Compared to three other automated graph clustering methods on real-world datasets, AMOS shows superior performance in terms of multiple external and internal clustering metrics.
Tasks	Graph Clustering, Spectral Graph Clustering
Published	2016-09-21
URL	http://arxiv.org/abs/1609.06457v1
PDF	http://arxiv.org/pdf/1609.06457v1.pdf
PWC	https://paperswithcode.com/paper/amos-an-automated-model-order-selection
Repo	https://github.com/tgensol/AMOS
Framework	none
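
The abstract above ties model order selection to counting connected components in an undirected graph. The following is a minimal sketch of that idealized case only, where clusters share no edges at all; it is not the AMOS algorithm, and the function name `count_components` and the edge-list input format are assumptions made for illustration.

```python
def count_components(num_nodes, edges):
    """Count connected components of an undirected graph with union-find.

    Nodes are assumed to be labelled 0 .. num_nodes - 1 (illustrative choice).
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_nodes
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge the two components
            components -= 1
    return components


# Three groups: {0, 1, 2}, {3, 4} and the isolated node 5.
print(count_components(6, [(0, 1), (1, 2), (3, 4)]))
```

On real networks clusters are connected by noisy edges, so a plain component count returns too few clusters; that gap is what the reliability tests described in the abstract address.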

Phase Transitions and a Model Order Selection Criterion for Spectral Graph Clustering


Title	Phase Transitions and a Model Order Selection Criterion for Spectral Graph Clustering
Authors	Pin-Yu Chen, Alfred O. Hero
Abstract	One of the longstanding open problems in spectral graph clustering (SGC) is the so-called model order selection problem: automated selection of the correct number of clusters. This is equivalent to the problem of finding the number of connected components or communities in an undirected graph. We propose automated model order selection (AMOS), a solution to the SGC model selection problem under a random interconnection model (RIM) using a novel selection criterion that is based on an asymptotic phase transition analysis. AMOS can more generally be applied to discovering hidden block diagonal structure in symmetric non-negative matrices. Numerical experiments on simulated graphs validate the phase transition analysis, and real-world network data is used to validate the performance of the proposed model selection procedure.
Tasks	Graph Clustering, Model Selection, Spectral Graph Clustering
Published	2016-04-11
URL	http://arxiv.org/abs/1604.03159v4
PDF	http://arxiv.org/pdf/1604.03159v4.pdf
PWC	https://paperswithcode.com/paper/phase-transitions-and-a-model-order-selection
Repo	https://github.com/tgensol/AMOS
Framework	none

Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks


Title	Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks
Authors	Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft
Abstract	We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning. The BNNs are trained by minimizing $\alpha$-divergences, allowing us to capture complicated statistical patterns in the transition dynamics, e.g. multi-modality and heteroskedasticity, which are usually missed by other common modeling approaches. We illustrate the performance of our method by solving a challenging benchmark where model-based approaches usually fail and by obtaining promising results in a real-world scenario for controlling a gas turbine.
Tasks	Stochastic Optimization
Published	2016-05-23
URL	http://arxiv.org/abs/1605.07127v3
PDF	http://arxiv.org/pdf/1605.07127v3.pdf
PWC	https://paperswithcode.com/paper/learning-and-policy-search-in-stochastic
Repo	https://github.com/siemens/policy_search_bb-alpha
Framework	none

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images


Title	DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
Authors	Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Xiang Bai, Alan Yuille
Abstract	Object skeletons are useful for object representation and object detection. They are complementary to the object contour, and provide extra information, such as how object scale (thickness) varies among object parts. But object skeleton extraction from natural images is very challenging, because it requires the extractor to be able to capture both local and non-local image context in order to determine the scale of each skeleton pixel. In this paper, we present a novel fully convolutional network with multiple scale-associated side outputs to address this problem. By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network. The network is trained by multi-task learning, where one task is skeleton localization to classify whether a pixel is a skeleton pixel or not, and the other is skeleton scale prediction to regress the scale of each skeleton pixel. Supervision is imposed at different stages by guiding the scale-associated side outputs toward the groundtruth skeletons at the appropriate scales. The responses of the multiple scale-associated side outputs are then fused in a scale-specific way to detect skeleton pixels using multiple scales effectively. Our method achieves promising results on two skeleton extraction datasets, and significantly outperforms other competitors. Additionally, the usefulness of the obtained skeletons and scales (thickness) is verified on two object detection applications: foreground object segmentation and object proposal detection.
Tasks	Multi-Task Learning, Object Detection, Semantic Segmentation
Published	2016-09-13
URL	http://arxiv.org/abs/1609.03659v3
PDF	http://arxiv.org/pdf/1609.03659v3.pdf
PWC	https://paperswithcode.com/paper/deepskeleton-learning-multi-task-scale
Repo	https://github.com/zeakey/skeleton
Framework	none

Multi-fidelity Gaussian Process Bandit Optimisation


Title	Multi-fidelity Gaussian Process Bandit Optimisation
Authors	Kirthevasan Kandasamy, Gautam Dasarathy, Junier B. Oliva, Jeff Schneider, Barnabas Poczos
Abstract	In many scientific and engineering applications, we are tasked with the maximisation of an expensive to evaluate black box function $f$. Traditional settings for this problem assume just the availability of this single function. However, in many cases, cheap approximations to $f$ may be obtainable. For example, the expensive real world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to eliminate low function value regions cheaply and use the expensive evaluations of $f$ in a small but promising region and speedily identify the optimum. We formalise this task as a multi-fidelity bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. In our theoretical analysis we demonstrate that it exhibits precisely the above behaviour, and achieves better regret than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.
Tasks
Published	2016-03-20
URL	http://arxiv.org/abs/1603.06288v4
PDF	http://arxiv.org/pdf/1603.06288v4.pdf
PWC	https://paperswithcode.com/paper/multi-fidelity-gaussian-process-bandit
Repo	https://github.com/kirthevasank/mf-gp-ucb
Framework	none

Product-based Neural Networks for User Response Prediction


Title	Product-based Neural Networks for User Response Prediction
Authors	Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, Jun Wang
Abstract	Predicting user responses, such as clicks and conversions, is of great importance and has found its usage in many Web applications including recommender systems, web search and online advertising. The data in those applications is mostly categorical and contains multiple fields; a typical representation is to transform it into a high-dimensional sparse binary feature representation via one-hot encoding. Faced with extreme sparsity, traditional models may limit their capacity of mining shallow patterns from the data, i.e. low-order feature combinations. Deep models like deep neural networks, on the other hand, cannot be directly applied to the high-dimensional input because of the huge feature space. In this paper, we propose Product-based Neural Networks (PNNs) with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics.
Tasks	Click-Through Rate Prediction, Recommendation Systems
Published	2016-11-01
URL	http://arxiv.org/abs/1611.00144v1
PDF	http://arxiv.org/pdf/1611.00144v1.pdf
PWC	https://paperswithcode.com/paper/product-based-neural-networks-for-user
Repo	https://github.com/Atomu2014/product-nets
Framework	tf
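
The product layer described in the abstract above can be illustrated with a small sketch. It assumes that each field has already been mapped to an embedding vector and that the interaction between two fields is their inner product; the abstract does not fix the product operation, so this choice, the function name `inner_product_layer` and the toy embeddings are all illustrative assumptions rather than the authors' implementation.

```python
from itertools import combinations


def inner_product_layer(field_embeddings):
    """Return the inner product of every pair of distinct field embeddings.

    field_embeddings: list of equal-length lists, one vector per field.
    The output has one entry per unordered field pair, in the order
    produced by itertools.combinations.
    """
    return [
        sum(x * y for x, y in zip(u, v))
        for u, v in combinations(field_embeddings, 2)
    ]


# Three hypothetical fields (e.g. user, ad, site), each embedded in 2 dimensions.
embeddings = [[1.0, 0.0], [0.5, 2.0], [3.0, -1.0]]
print(inner_product_layer(embeddings))  # -> [0.5, 3.0, -0.5]
```

With N fields this produces N(N-1)/2 interaction features, which would then be fed, together with the embeddings themselves, into the fully connected layers.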

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks


Title	LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
Authors	Xiang Li, Tao Qin, Jian Yang, Tie-Yan Liu
Abstract	Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector. Depending on its position in the table, a word is jointly represented by two components: a row vector and a column vector. Since the words in the same row share the row vector and the words in the same column share the column vector, we only need $2 \sqrt{V}$ vectors to represent a vocabulary of $V$ unique words, far fewer than the $V$ vectors required by existing approaches. Based on the 2-Component shared embedding, we design a new RNN algorithm and evaluate it using the language modeling task on several benchmark datasets. The results show that our algorithm significantly reduces the model size and speeds up the training process, without sacrificing accuracy (it achieves similar, if not better, perplexity as compared to state-of-the-art language models). Remarkably, on the One-Billion-Word benchmark dataset, our algorithm achieves comparable perplexity to previous language models, whilst reducing the model size by a factor of 40-100, and speeding up the training process by a factor of 2. We name our proposed algorithm LightRNN to reflect its very small model size and very high training speed.
Tasks	Language Modelling, Machine Translation
Published	2016-10-31
URL	http://arxiv.org/abs/1610.09893v1
PDF	http://arxiv.org/pdf/1610.09893v1.pdf
PWC	https://paperswithcode.com/paper/lightrnn-memory-and-computation-efficient
Repo	https://github.com/liuruoruo/lightrnn
Framework	tf
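
To make the $2 \sqrt{V}$ count in the abstract above concrete, here is a minimal sketch of the table lookup. It assumes a square table filled in row-major order by word id; the abstract does not say how words are allocated to cells, so that layout and the names `table_side`, `table_position` and `shared_vector_count` are illustrative assumptions.

```python
import math
from typing import Tuple


def table_side(vocab_size: int) -> int:
    """Side length of the smallest square table holding vocab_size words."""
    return math.isqrt(vocab_size - 1) + 1  # equals ceil(sqrt(V)) for V >= 1


def table_position(word_id: int, vocab_size: int) -> Tuple[int, int]:
    """Map a word id to its (row, column) cell under a row-major layout."""
    return divmod(word_id, table_side(vocab_size))


def shared_vector_count(vocab_size: int) -> int:
    """Number of vectors needed: one per row plus one per column."""
    return 2 * table_side(vocab_size)


V = 10_000
row, col = table_position(4_217, V)
print(row, col)                # -> 42 17
print(shared_vector_count(V))  # -> 200, versus 10000 one-per-word vectors
```

A word's representation is then the pair (row vector at index `row`, column vector at index `col`), so two words in the same row share their first component.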

How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?


Title	How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?
Authors	Nicolas Goix
Abstract	When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However, in many situations, few or no data are labeled. This calls for alternative criteria one can compute on non-labeled data. In this paper, two criteria that do not require labels are empirically shown to discriminate accurately (w.r.t. ROC or PR based criteria) between algorithms. These criteria are based on existing Excess-Mass (EM) and Mass-Volume (MV) curves, which generally cannot be well estimated in large dimension. A methodology based on feature sub-sampling and aggregating is also described and tested, extending the use of these criteria to high-dimensional datasets and solving major drawbacks inherent to standard EM and MV curves.
Tasks	Anomaly Detection, Unsupervised Anomaly Detection
Published	2016-07-05
URL	http://arxiv.org/abs/1607.01152v1
PDF	http://arxiv.org/pdf/1607.01152v1.pdf
PWC	https://paperswithcode.com/paper/how-to-evaluate-the-quality-of-unsupervised
Repo	https://github.com/bstienen/unsupervised-learning-metrics
Framework	none

gvnn: Neural Network Library for Geometric Computer Vision


Title	gvnn: Neural Network Library for Geometric Computer Vision
Authors	Ankur Handa, Michael Bloesch, Viorica Patraucean, Simon Stent, John McCormac, Andrew Davison
Abstract	We introduce gvnn, a neural network library in Torch aimed towards bridging the gap between classic geometric computer vision and deep learning. Inspired by the recent success of Spatial Transformer Networks, we propose several new layers which are often used as parametric transformations on the data in geometric computer vision. These layers can be inserted within a neural network much in the spirit of the original spatial transformers and allow backpropagation to enable end-to-end learning of a network involving any domain knowledge in geometric computer vision. This opens up applications in learning invariance to 3D geometric transformation for place recognition, end-to-end visual odometry, depth estimation and unsupervised learning through warping with a parametric transformation for image reconstruction error.
Tasks	Image Reconstruction, Visual Odometry
Published	2016-07-25
URL	http://arxiv.org/abs/1607.07405v3
PDF	http://arxiv.org/pdf/1607.07405v3.pdf
PWC	https://paperswithcode.com/paper/gvnn-neural-network-library-for-geometric
Repo	https://github.com/ankurhanda/gvnn
Framework	torch

Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations


Title	Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations
Authors	Alexandre Salle, Marco Idiart, Aline Villavicencio
Abstract	In this paper, we propose LexVec, a new method for generating distributed word representations that uses low-rank, weighted factorization of the Positive Point-wise Mutual Information matrix via stochastic gradient descent, employing a weighting scheme that assigns heavier penalties for errors on frequent co-occurrences while still accounting for negative co-occurrence. Evaluation on word similarity and analogy tasks shows that LexVec matches and often outperforms state-of-the-art methods on many of these tasks.
Tasks
Published	2016-06-02
URL	http://arxiv.org/abs/1606.00819v2
PDF	http://arxiv.org/pdf/1606.00819v2.pdf
PWC	https://paperswithcode.com/paper/matrix-factorization-using-window-sampling
Repo	https://github.com/alexandres/lexvec
Framework	none
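
The abstract above builds on the Positive Point-wise Mutual Information (PPMI) matrix. The sketch below shows only how PPMI values can be computed from raw co-occurrence counts; it does not perform the weighted factorization or the sampling that LexVec adds on top, and the function name `ppmi` and the toy counts are assumptions for illustration.

```python
import math
from collections import Counter


def ppmi(pair_counts):
    """Compute PPMI for every observed (word, context) pair.

    pair_counts: dict mapping (word, context) -> co-occurrence count.
    PMI(w, c) = log( P(w, c) / (P(w) * P(c)) ), and PPMI = max(PMI, 0).
    Unobserved pairs are omitted; their PPMI is 0 by definition.
    """
    total = sum(pair_counts.values())
    word_totals = Counter()
    context_totals = Counter()
    for (word, context), count in pair_counts.items():
        word_totals[word] += count
        context_totals[context] += count

    scores = {}
    for (word, context), count in pair_counts.items():
        pmi = math.log(count * total / (word_totals[word] * context_totals[context]))
        scores[(word, context)] = max(pmi, 0.0)
    return scores


# Hypothetical counts: "the" co-occurs with everything, "purr" only with "cat".
counts = {("cat", "purr"): 4, ("cat", "the"): 4, ("dog", "the"): 8}
for pair, score in ppmi(counts).items():
    print(pair, round(score, 3))
```

In this toy example the pair ("cat", "the") gets a negative PMI and is clipped to zero, while ("cat", "purr") keeps a positive score of log 2.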

TRex: A Tomography Reconstruction Proximal Framework for Robust Sparse View X-Ray Applications


Title	TRex: A Tomography Reconstruction Proximal Framework for Robust Sparse View X-Ray Applications
Authors	Mohamed Aly, Guangming Zang, Wolfgang Heidrich, Peter Wonka
Abstract	We present TRex, a flexible and robust Tomographic Reconstruction framework using proximal algorithms. We provide an overview and perform an experimental comparison between the well-known iterative reconstruction methods in terms of reconstruction quality in sparse view situations. We then derive the proximal operators for the four best methods. We show the flexibility of our framework by deriving solvers for two noise models: Gaussian and Poisson; and by plugging in three powerful regularizers. We compare our framework to state-of-the-art methods, and show superior quality on both synthetic and real datasets.
Tasks
Published	2016-06-11
URL	http://arxiv.org/abs/1606.03601v1
PDF	http://arxiv.org/pdf/1606.03601v1.pdf
PWC	https://paperswithcode.com/paper/trex-a-tomography-reconstruction-proximal
Repo	https://github.com/mohamedadaly/TRex
Framework	none

Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1


Title	Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
Authors	Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
Abstract	We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time. At training time, the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substantially improve power-efficiency. To validate the effectiveness of BNNs we conduct two sets of experiments on the Torch7 and Theano frameworks. On both, BNNs achieved nearly state-of-the-art results over the MNIST, CIFAR-10 and SVHN datasets. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available on-line.
Tasks
Published	2016-02-09
URL	http://arxiv.org/abs/1602.02830v3
PDF	http://arxiv.org/pdf/1602.02830v3.pdf
PWC	https://paperswithcode.com/paper/binarized-neural-networks-training-deep
Repo	https://github.com/hpi-xnor/BMXNet
Framework	mxnet
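
The abstract above says that BNNs replace most arithmetic with bit-wise operations. The following sketch shows one common way a dot product of +1/-1 vectors can be computed with XNOR and a population count. Thresholding at zero for binarization, the bit-packing layout and the helper names `binarize`, `pack_bits` and `binary_dot` are assumptions for illustration, not the paper's GPU kernel.

```python
def binarize(values):
    """Sign binarization (assumed rule): +1 for values >= 0, otherwise -1."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an int; bit i is set when signs[i] == +1."""
    word = 0
    for i, sign in enumerate(signs):
        if sign == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, length):
    """Dot product of two packed +1/-1 vectors via XNOR and popcount."""
    mask = (1 << length) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs match
    matches = bin(agree).count("1")     # population count
    return 2 * matches - length         # matches minus mismatches


weights = binarize([0.3, -1.2, 0.0, 2.5])       # [1, -1, 1, 1]
activations = binarize([-0.7, -0.1, 0.9, 1.1])  # [-1, -1, 1, 1]
print(binary_dot(pack_bits(weights), pack_bits(activations), 4))  # -> 2
```

The result matches the ordinary dot product of the two sign vectors, because each matching position contributes +1 and each mismatch contributes -1.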