Paper Group AWR 1
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Title | A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues |
Authors | Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio |
Abstract | Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as those found between the utterances in a dialogue. In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with recent neural network architectures. We evaluate the model performance through automatic evaluation metrics and by carrying out a human evaluation. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate the generation of long outputs and maintain the context. |
Tasks | |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.06069v3
PDF | http://arxiv.org/pdf/1605.06069v3.pdf
PWC | https://paperswithcode.com/paper/a-hierarchical-latent-variable-encoder |
Repo | https://github.com/julianser/hed-dlg-truncated |
Framework | none |
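The core generative idea — a per-utterance latent variable, conditioned on the dialogue context, that spans every step of the decoded response — can be summarized in a few lines. Below is a minimal NumPy sketch of that sampling process, not the authors' code; the weights, sizes, and the toy `argmax`-as-word readout are hypothetical stand-ins for the trained GRU networks.

```python
import numpy as np

rng = np.random.default_rng(0)
H, Z = 8, 4  # hypothetical hidden and latent sizes

def context_step(h, x, W):  # stand-in for the dialogue-level GRU
    return np.tanh(W @ np.concatenate([h, x]))

# Hypothetical parameters; the real model learns these by backprop.
W_ctx = rng.normal(size=(H, 2 * H))      # context RNN over utterances
W_prior = rng.normal(size=(2 * Z, H))    # context -> prior mean / log-variance
W_dec = rng.normal(size=(H, H + Z + H))  # decoder conditions on context and z

def generate_utterance(context_h, utt_len=5):
    # 1) Prior over the per-utterance latent variable, given the context.
    mu, logvar = np.split(W_prior @ context_h, 2)
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=Z)  # reparameterized sample
    # 2) The decoder spans the whole utterance with the same z
    #    (the "variable number of time steps" in the abstract).
    h, tokens = np.zeros(H), []
    for _ in range(utt_len):
        h = np.tanh(W_dec @ np.concatenate([h, z, context_h]))
        tokens.append(int(np.argmax(h)))  # toy "word" = index of max unit
    return tokens, h

context = np.zeros(H)
for turn in range(3):
    utt, enc = generate_utterance(context)
    context = context_step(context, enc, W_ctx)  # update dialogue-level state
    print("turn", turn, "->", utt)
```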
Artistic style transfer for videos
Title | Artistic style transfer for videos |
Authors | Manuel Ruder, Alexey Dosovitskiy, Thomas Brox |
Abstract | In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Doing this for a video sequence single-handed was beyond imagination. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. We make use of recent advances in style transfer in still images and propose new initializations and loss functions applicable to videos. This allows us to generate consistent and stable stylized video sequences, even in cases with large motion and strong occlusion. We show that the proposed method clearly outperforms simpler baselines both qualitatively and quantitatively. |
Tasks | Style Transfer |
Published | 2016-04-28 |
URL | http://arxiv.org/abs/1604.08610v2
PDF | http://arxiv.org/pdf/1604.08610v2.pdf
PWC | https://paperswithcode.com/paper/artistic-style-transfer-for-videos |
Repo | https://github.com/manuelruder/artistic-videos |
Framework | torch |
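A key ingredient behind the "consistent and stable" sequences the abstract claims is a temporal consistency loss: the current stylized frame is penalized for deviating from the previous stylized frame warped by optical flow, with per-pixel weights switching the penalty off at occlusions and motion boundaries. A minimal sketch of such a loss (shapes are hypothetical; flow estimation and warping are assumed to happen elsewhere):

```python
import numpy as np

def temporal_loss(stylized_t, stylized_prev_warped, per_pixel_weights):
    """Short-term temporal consistency penalty in the spirit of the paper:
    squared deviation from the flow-warped previous stylized frame, counted
    only where the flow is reliable (weights ~0 at occlusions)."""
    diff = stylized_t - stylized_prev_warped
    return np.mean(per_pixel_weights[..., None] * diff ** 2)

# Toy usage with random arrays standing in for frames and flow weights.
rng = np.random.default_rng(0)
frame_t = rng.random((64, 64, 3))
warped_prev = rng.random((64, 64, 3))
weights = (rng.random((64, 64)) > 0.1).astype(float)  # 1 = flow reliable
print(temporal_loss(frame_t, warped_prev, weights))
```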
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Title | Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge |
Authors | Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan |
Abstract | Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. Finally, given the recent surge of interest in this task, a competition was organized in 2015 using the newly released COCO dataset. We describe and analyze the various improvements we applied to our own baseline and show the resulting performance in the competition, which we won ex-aequo with a team from Microsoft Research, and provide an open source implementation in TensorFlow. |
Tasks | Image Captioning |
Published | 2016-09-21 |
URL | http://arxiv.org/abs/1609.06647v1
PDF | http://arxiv.org/pdf/1609.06647v1.pdf
PWC | https://paperswithcode.com/paper/show-and-tell-lessons-learned-from-the-2015 |
Repo | https://github.com/HughKu/Im2txt |
Framework | tf |
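The model's structure — a CNN image feature seeding a recurrent decoder that emits one word at a time, trained to maximize the caption's likelihood — lends itself to a compact sketch. This toy NumPy version substitutes a plain tanh cell for the paper's LSTM and greedy decoding for beam search; all weights and the vocabulary are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 10, 16  # hypothetical vocabulary and hidden sizes

W_img = rng.normal(size=(H, 32))   # projects a CNN image feature to h0
W_h = rng.normal(size=(H, H + V))  # recurrent cell (stand-in for the LSTM)
W_out = rng.normal(size=(V, H))    # hidden state -> word logits
BOS, EOS = 0, 1

def greedy_caption(image_feature, max_len=10):
    h = np.tanh(W_img @ image_feature)  # the image only conditions the start
    word, caption = BOS, []
    for _ in range(max_len):
        x = np.eye(V)[word]                        # one-hot previous word
        h = np.tanh(W_h @ np.concatenate([h, x]))
        word = int(np.argmax(W_out @ h))           # greedy; the paper beam-searches
        if word == EOS:
            break
        caption.append(word)
    return caption

print(greedy_caption(rng.normal(size=32)))
```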
AMOS: An Automated Model Order Selection Algorithm for Spectral Graph Clustering
Title | AMOS: An Automated Model Order Selection Algorithm for Spectral Graph Clustering |
Authors | Pin-Yu Chen, Thibaut Gensollen, Alfred O. Hero III |
Abstract | One of the longstanding problems in spectral graph clustering (SGC) is the so-called model order selection problem: automated selection of the correct number of clusters. This is equivalent to the problem of finding the number of connected components or communities in an undirected graph. In this paper, we propose AMOS, an automated model order selection algorithm for SGC. Based on a recent analysis of clustering reliability for SGC under the random interconnection model, AMOS works by incrementally increasing the number of clusters, estimating the quality of identified clusters, and providing a series of clustering reliability tests. Consequently, AMOS outputs clusters of minimal model order with statistical clustering reliability guarantees. Compared with three other automated graph clustering methods on real-world datasets, AMOS shows superior performance in terms of multiple external and internal clustering metrics. |
Tasks | Graph Clustering, Spectral Graph Clustering |
Published | 2016-09-21 |
URL | http://arxiv.org/abs/1609.06457v1
PDF | http://arxiv.org/pdf/1609.06457v1.pdf
PWC | https://paperswithcode.com/paper/amos-an-automated-model-order-selection |
Repo | https://github.com/tgensol/AMOS |
Framework | none |
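The abstract describes an incremental loop: grow the number of clusters, cluster, run a reliability test, and stop at the first model order that passes. A skeleton of that control flow is sketched below; the RIM-based statistical tests themselves are not reproduced, so `reliable` is a caller-supplied placeholder, and the spectral step is the textbook unnormalized-Laplacian embedding:

```python
import numpy as np

def spectral_embed(A, k):
    """Bottom-k eigenvectors of the graph Laplacian (standard SGC step)."""
    L = np.diag(A.sum(1)) - A
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels

def amos_like(A, k_max, reliable):
    """Skeleton of AMOS's loop: increase the model order until a clustering
    passes the (placeholder) reliability test."""
    for k in range(2, k_max + 1):
        labels = kmeans(spectral_embed(A, k), k)
        if reliable(A, labels, k):
            return k, labels
    return k_max, labels

# Toy usage with a trivial always-pass test on a 4-node graph.
A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]], float)
print(amos_like(A, 3, lambda A, labels, k: True))
```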
Phase Transitions and a Model Order Selection Criterion for Spectral Graph Clustering
Title | Phase Transitions and a Model Order Selection Criterion for Spectral Graph Clustering |
Authors | Pin-Yu Chen, Alfred O. Hero |
Abstract | One of the longstanding open problems in spectral graph clustering (SGC) is the so-called model order selection problem: automated selection of the correct number of clusters. This is equivalent to the problem of finding the number of connected components or communities in an undirected graph. We propose automated model order selection (AMOS), a solution to the SGC model selection problem under a random interconnection model (RIM) using a novel selection criterion that is based on an asymptotic phase transition analysis. AMOS can more generally be applied to discovering hidden block diagonal structure in symmetric non-negative matrices. Numerical experiments on simulated graphs validate the phase transition analysis, and real-world network data is used to validate the performance of the proposed model selection procedure. |
Tasks | Graph Clustering, Model Selection, Spectral Graph Clustering |
Published | 2016-04-11 |
URL | http://arxiv.org/abs/1604.03159v4
PDF | http://arxiv.org/pdf/1604.03159v4.pdf
PWC | https://paperswithcode.com/paper/phase-transitions-and-a-model-order-selection |
Repo | https://github.com/tgensol/AMOS |
Framework | none |
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks
Title | Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks |
Authors | Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft |
Abstract | We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning. The BNNs are trained by minimizing $\alpha$-divergences, allowing us to capture complicated statistical patterns in the transition dynamics, e.g. multi-modality and heteroskedasticity, which are usually missed by other common modeling approaches. We illustrate the performance of our method by solving a challenging benchmark where model-based approaches usually fail and by obtaining promising results in a real-world scenario for controlling a gas turbine. |
Tasks | Stochastic Optimization |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.07127v3
PDF | http://arxiv.org/pdf/1605.07127v3.pdf
PWC | https://paperswithcode.com/paper/learning-and-policy-search-in-stochastic |
Repo | https://github.com/siemens/policy_search_bb-alpha |
Framework | none |
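The evaluation pattern — draw a plausible dynamics function from the (approximate) posterior, roll the policy out, and average returns — is easy to sketch. Here the trained BNN posterior is replaced by a hand-made toy sampler with input-dependent noise to mimic the heteroskedasticity the abstract mentions; everything below is an assumption, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dynamics():
    """Stand-in for drawing one transition function from the BNN posterior
    (the paper trains the BNN by alpha-divergence minimization; here we just
    perturb the parameters of a toy linear model)."""
    a, b = 0.9 + 0.05 * rng.normal(), 1.0 + 0.1 * rng.normal()
    scale = 0.05 * (1 + rng.random())
    # Input-dependent noise => heteroskedastic transitions.
    return lambda s, u: a * s + b * u + scale * (1 + abs(s)) * rng.normal()

def expected_return(policy, horizon=20, n_rollouts=100):
    """Model-based policy evaluation by averaging random roll-outs."""
    total = 0.0
    for _ in range(n_rollouts):
        f, s = sample_dynamics(), 1.0
        for _ in range(horizon):
            s = f(s, policy(s))
            total -= s ** 2  # toy cost: keep the state near zero
    return total / n_rollouts

print(expected_return(lambda s: -0.5 * s))  # evaluate a simple linear policy
```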
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
Title | DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images |
Authors | Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Xiang Bai, Alan Yuille |
Abstract | Object skeletons are useful for object representation and object detection. They are complementary to the object contour, and provide extra information, such as how object scale (thickness) varies among object parts. But object skeleton extraction from natural images is very challenging, because it requires the extractor to be able to capture both local and non-local image context in order to determine the scale of each skeleton pixel. In this paper, we present a novel fully convolutional network with multiple scale-associated side outputs to address this problem. By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network. The network is trained by multi-task learning, where one task is skeleton localization to classify whether a pixel is a skeleton pixel or not, and the other is skeleton scale prediction to regress the scale of each skeleton pixel. Supervision is imposed at different stages by guiding the scale-associated side outputs toward the ground-truth skeletons at the appropriate scales. The responses of the multiple scale-associated side outputs are then fused in a scale-specific way to detect skeleton pixels using multiple scales effectively. Our method achieves promising results on two skeleton extraction datasets, and significantly outperforms other competitors. Additionally, the usefulness of the obtained skeletons and scales (thickness) is verified on two object detection applications: foreground object segmentation and object proposal detection. |
Tasks | Multi-Task Learning, Object Detection, Semantic Segmentation |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03659v3
PDF | http://arxiv.org/pdf/1609.03659v3.pdf
PWC | https://paperswithcode.com/paper/deepskeleton-learning-multi-task-scale |
Repo | https://github.com/zeakey/skeleton |
Framework | none |
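The multi-task objective in the abstract combines per-pixel skeleton classification with scale regression supervised only at skeleton pixels. A hedged sketch of such a loss — the shapes, the weighting `lam`, and the toy data are assumptions, not the paper's exact formulation:

```python
import numpy as np

def skeleton_multitask_loss(logits, scale_pred, is_skel, scale_gt, lam=1.0):
    """Per-pixel skeleton/non-skeleton cross-entropy plus a scale regression
    term counted only at ground-truth skeleton pixels."""
    p = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    cls = -np.mean(is_skel * np.log(p + 1e-9)
                   + (1 - is_skel) * np.log(1 - p + 1e-9))
    reg = np.sum(is_skel * (scale_pred - scale_gt) ** 2) / max(is_skel.sum(), 1)
    return cls + lam * reg

# Toy usage: random 32x32 maps with ~10% skeleton pixels.
rng = np.random.default_rng(0)
mask = (rng.random((32, 32)) > 0.9).astype(float)
print(skeleton_multitask_loss(rng.normal(size=(32, 32)),
                              rng.random((32, 32)) * 5, mask,
                              rng.random((32, 32)) * 5))
```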
Multi-fidelity Gaussian Process Bandit Optimisation
Title | Multi-fidelity Gaussian Process Bandit Optimisation |
Authors | Kirthevasan Kandasamy, Gautam Dasarathy, Junier B. Oliva, Jeff Schneider, Barnabas Poczos |
Abstract | In many scientific and engineering applications, we are tasked with the maximisation of an expensive-to-evaluate black-box function $f$. Traditional settings for this problem assume just the availability of this single function. However, in many cases, cheap approximations to $f$ may be obtainable. For example, the expensive real-world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to cheaply eliminate low-function-value regions, concentrate the expensive evaluations of $f$ in a small but promising region, and speedily identify the optimum. We formalise this task as a \emph{multi-fidelity} bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. In our theoretical analysis we demonstrate that it exhibits precisely the above behaviour, and achieves better regret than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments. |
Tasks | |
Published | 2016-03-20 |
URL | http://arxiv.org/abs/1603.06288v4
PDF | http://arxiv.org/pdf/1603.06288v4.pdf
PWC | https://paperswithcode.com/paper/multi-fidelity-gaussian-process-bandit |
Repo | https://github.com/kirthevasank/mf-gp-ucb |
Framework | none |
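The method's two moving parts are (i) a combined upper confidence bound, taken as the minimum over per-fidelity bounds with a bias allowance for cheap fidelities, and (ii) a rule that queries a cheap fidelity while its uncertainty at the candidate point is still large. A 1-D NumPy sketch in that spirit — the kernel, `beta`, `zeta`, `gamma`, and the toy functions are all assumptions:

```python
import numpy as np

def gp_posterior(X, y, Xq, ls=0.3, noise=1e-4):
    """Plain GP regression with an RBF kernel; one GP per fidelity."""
    k = lambda A, B: np.exp(-((A[:, None] - B[None, :]) ** 2) / (2 * ls ** 2))
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xq, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

# Hypothetical problem: f2 is the expensive target, f1 a cheap
# approximation assumed to stay within zeta of f2.
beta, zeta, gamma = 2.0, 0.2, 0.1
f1 = lambda x: np.sin(6 * x)
f2 = lambda x: np.sin(6 * x) + 0.1 * x
X1 = np.array([0.1, 0.4, 0.6, 0.9]); y1 = f1(X1)  # many cheap evaluations
X2 = np.array([0.3, 0.7]);           y2 = f2(X2)  # few expensive ones

Xq = np.linspace(0, 1, 201)
mu1, s1 = gp_posterior(X1, y1, Xq)
mu2, s2 = gp_posterior(X2, y2, Xq)
# Combined upper bound on f2: min over fidelities, slack zeta for the cheap one.
ucb = np.minimum(mu1 + beta * s1 + zeta, mu2 + beta * s2)
i = int(np.argmax(ucb))
# Fidelity rule: use the cheap fidelity while it is still informative
# (large uncertainty at the candidate), otherwise pay for the target.
m_next = 1 if beta * s1[i] > gamma else 2
print(Xq[i], m_next)
```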
Product-based Neural Networks for User Response Prediction
Title | Product-based Neural Networks for User Response Prediction |
Authors | Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, Jun Wang |
Abstract | Predicting user responses, such as clicks and conversions, is of great importance and has found use in many Web applications including recommender systems, web search and online advertising. The data in those applications is mostly categorical and contains multiple fields; a typical representation is to transform it into a high-dimensional sparse binary feature representation via one-hot encoding. Faced with this extreme sparsity, traditional models may be limited in their capacity to mine shallow patterns from the data, i.e. low-order feature combinations. Deep models like deep neural networks, on the other hand, cannot be directly applied to the high-dimensional input because of the huge feature space. In this paper, we propose Product-based Neural Networks (PNN) with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics. |
Tasks | Click-Through Rate Prediction, Recommendation Systems |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00144v1
PDF | http://arxiv.org/pdf/1611.00144v1.pdf
PWC | https://paperswithcode.com/paper/product-based-neural-networks-for-user |
Repo | https://github.com/Atomu2014/product-nets |
Framework | tf |
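The architecture is concrete enough to sketch: per-field embeddings, a product layer that joins the embeddings themselves (linear part) with pairwise inner products, and fully connected layers on top. A minimal inner-product (IPNN-style) forward pass with hypothetical sizes and random weights:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_fields, d = 4, 8  # hypothetical: 4 categorical fields, embedding size 8

# One embedding table per field; the one-hot input just picks a row.
tables = [rng.normal(size=(10, d)) for _ in range(n_fields)]
ids = [3, 1, 7, 2]  # observed category index in each field
emb = np.stack([t[i] for t, i in zip(tables, ids)])  # (n_fields, d)

# Product layer: linear signals plus inner products between every field pair.
linear_part = emb.reshape(-1)
inner_products = np.array([emb[i] @ emb[j]
                           for i, j in combinations(range(n_fields), 2)])
z = np.concatenate([linear_part, inner_products])

# Fully connected layers explore higher-order feature interactions.
W1 = rng.normal(size=(16, z.size)); W2 = rng.normal(size=(1, 16))
h = np.maximum(0, W1 @ z)              # ReLU
ctr = 1 / (1 + np.exp(-(W2 @ h)[0]))   # predicted click probability
print(round(ctr, 4))
```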
LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
Title | LightRNN: Memory and Computation-Efficient Recurrent Neural Networks |
Authors | Xiang Li, Tao Qin, Jian Yang, Tie-Yan Liu |
Abstract | Recurrent neural networks (RNNs) have achieved state-of-the-art performance in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model becomes very big (e.g., possibly beyond the memory capacity of a GPU device) and its training becomes very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use a 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector. Depending on its position in the table, a word is jointly represented by two components: a row vector and a column vector. Since the words in the same row share the row vector and the words in the same column share the column vector, we only need $2\sqrt{V}$ vectors to represent a vocabulary of $V$ unique words, far fewer than the $V$ vectors required by existing approaches. Based on the 2-Component shared embedding, we design a new RNN algorithm and evaluate it using the language modeling task on several benchmark datasets. The results show that our algorithm significantly reduces the model size and speeds up the training process, without sacrificing accuracy (it achieves similar, if not better, perplexity compared to state-of-the-art language models). Remarkably, on the One-Billion-Word benchmark dataset, our algorithm achieves comparable perplexity to previous language models whilst reducing the model size by a factor of 40-100 and speeding up training by a factor of 2. We name our proposed algorithm \emph{LightRNN} to reflect its very small model size and very high training speed. |
Tasks | Language Modelling, Machine Translation |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.09893v1
PDF | http://arxiv.org/pdf/1610.09893v1.pdf
PWC | https://paperswithcode.com/paper/lightrnn-memory-and-computation-efficient |
Repo | https://github.com/liuruoruo/lightrnn |
Framework | tf |
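The 2-Component shared embedding is the heart of the paper and fits in a dozen lines: place the $V$ words in a roughly $\sqrt{V} \times \sqrt{V}$ table and store only one vector per row and one per column. A sketch (composition by concatenation here, for simplicity; the paper instead feeds the two components to the RNN in two half-steps):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10000, 64
side = int(np.ceil(np.sqrt(V)))  # word table is side x side

# Only 2*side vectors are stored instead of V: one per row, one per column.
row_emb = rng.normal(size=(side, d))
col_emb = rng.normal(size=(side, d))

def embed(word_id):
    """A word's representation is composed from its row and column vectors
    according to its position in the table; words sharing a row (or column)
    share that component."""
    r, c = divmod(word_id, side)
    return np.concatenate([row_emb[r], col_emb[c]])

print(embed(4321).shape)                  # (128,)
print(2 * side, "vectors instead of", V)  # 200 vs 10000
```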
How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?
Title | How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms? |
Authors | Nicolas Goix |
Abstract | When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However, in many situations, few or no data are labeled. This calls for alternative criteria that can be computed on unlabeled data. In this paper, two criteria that do not require labels are empirically shown to discriminate accurately (w.r.t. ROC- or PR-based criteria) between algorithms. These criteria are based on existing Excess-Mass (EM) and Mass-Volume (MV) curves, which generally cannot be well estimated in high dimensions. A methodology based on feature sub-sampling and aggregating is also described and tested, extending the use of these criteria to high-dimensional datasets and solving major drawbacks inherent to standard EM and MV curves. |
Tasks | Anomaly Detection, Unsupervised Anomaly Detection |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01152v1
PDF | http://arxiv.org/pdf/1607.01152v1.pdf
PWC | https://paperswithcode.com/paper/how-to-evaluate-the-quality-of-unsupervised |
Repo | https://github.com/bstienen/unsupervised-learning-metrics |
Framework | none |
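The Mass-Volume criterion can be estimated without labels from a scoring function alone: for each mass level $\alpha$, find the score threshold whose level set holds mass $\alpha$ of the data, then estimate that set's Lebesgue volume by uniform Monte Carlo sampling over the data's bounding box (lower curves indicate better scoring functions). A sketch under those assumptions; the EM curve and the paper's feature sub-sampling/aggregation step are omitted:

```python
import numpy as np

def mass_volume_curve(score, X, alphas, n_mc=100000, seed=0):
    """Monte Carlo estimate of the Mass-Volume curve of a scoring function."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(0), X.max(0)
    U = lo + rng.random((n_mc, X.shape[1])) * (hi - lo)  # uniform over the box
    box_vol = np.prod(hi - lo)
    s_data, s_unif = score(X), score(U)
    # Threshold t_alpha such that P(score >= t_alpha) ~ alpha on the data.
    thresholds = np.quantile(s_data, 1 - np.asarray(alphas))
    return np.array([(s_unif >= t).mean() * box_vol for t in thresholds])

# Toy usage: a Gaussian-density score on 2-D data.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 2))
score = lambda Z: -np.sum(Z ** 2, axis=1)  # higher score = more "normal"
print(mass_volume_curve(score, X, [0.5, 0.9, 0.99]))
```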
gvnn: Neural Network Library for Geometric Computer Vision
Title | gvnn: Neural Network Library for Geometric Computer Vision |
Authors | Ankur Handa, Michael Bloesch, Viorica Patraucean, Simon Stent, John McCormac, Andrew Davison |
Abstract | We introduce gvnn, a neural network library in Torch aimed towards bridging the gap between classic geometric computer vision and deep learning. Inspired by the recent success of Spatial Transformer Networks, we propose several new layers which are often used as parametric transformations on the data in geometric computer vision. These layers can be inserted within a neural network much in the spirit of the original spatial transformers and allow backpropagation to enable end-to-end learning of a network involving any domain knowledge in geometric computer vision. This opens up applications in learning invariance to 3D geometric transformation for place recognition, end-to-end visual odometry, depth estimation and unsupervised learning through warping with a parametric transformation for image reconstruction error. |
Tasks | Image Reconstruction, Visual Odometry |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07405v3
PDF | http://arxiv.org/pdf/1607.07405v3.pdf
PWC | https://paperswithcode.com/paper/gvnn-neural-network-library-for-geometric |
Repo | https://github.com/ankurhanda/gvnn |
Framework | torch |
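A representative example of the library's geometric layers is a 3D rotation parameterized by an axis-angle vector, i.e. the SO(3) exponential map. The math below (Rodrigues' formula) is standard; gvnn's contribution is exposing such transformations as differentiable Torch modules with backpropagation, which this plain-NumPy sketch does not attempt:

```python
import numpy as np

def so3_exp(omega):
    """Rodrigues' formula: map an axis-angle vector to a rotation matrix."""
    theta = np.linalg.norm(omega)
    if theta < 1e-8:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])  # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))        # 90 degrees about z
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 3))   # -> [0, 1, 0]
```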
Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations
Title | Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations |
Authors | Alexandre Salle, Marco Idiart, Aline Villavicencio |
Abstract | In this paper, we propose LexVec, a new method for generating distributed word representations that uses low-rank, weighted factorization of the Positive Point-wise Mutual Information matrix via stochastic gradient descent, employing a weighting scheme that assigns heavier penalties for errors on frequent co-occurrences while still accounting for negative co-occurrence. Evaluation on word similarity and analogy tasks shows that LexVec matches and often outperforms state-of-the-art methods on many of these tasks. |
Tasks | |
Published | 2016-06-02 |
URL | http://arxiv.org/abs/1606.00819v2
PDF | http://arxiv.org/pdf/1606.00819v2.pdf
PWC | https://paperswithcode.com/paper/matrix-factorization-using-window-sampling |
Repo | https://github.com/alexandres/lexvec |
Framework | none |
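The recipe — build a PPMI matrix, then factorize it by SGD, visiting frequent co-occurrences more often (window sampling) and mixing in random pairs (negative sampling) — can be sketched directly. Everything here is an assumption: toy counts, the learning rate, one negative sample per step, and a plain squared error standing in for the paper's frequency-weighted loss:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 50, 10
counts = rng.poisson(1.0, size=(V, V)) + np.eye(V, dtype=int) * 5  # toy counts

# PPMI matrix from co-occurrence counts.
total = counts.sum()
pw = counts.sum(1, keepdims=True) / total
pc = counts.sum(0, keepdims=True) / total
pmi = np.log(np.maximum(counts / total, 1e-12) / (pw * pc))
ppmi = np.maximum(pmi, 0)

# SGD factorization ppmi[w, c] ~ W[w] @ C[c]. Count-proportional draws
# simulate window sampling (frequent pairs get heavier training), and each
# step also penalizes one random (w, c) pair: negative sampling.
W = 0.1 * rng.normal(size=(V, d)); C = 0.1 * rng.normal(size=(V, d))
probs = (counts / total).reshape(-1)
lr = 0.05
for _ in range(20000):
    wc = rng.choice(V * V, p=probs); w, c = divmod(int(wc), V)  # window sample
    for cc in [c, int(rng.integers(V))]:                        # + 1 negative
        err = W[w] @ C[cc] - ppmi[w, cc]
        W[w], C[cc] = W[w] - lr * err * C[cc], C[cc] - lr * err * W[w]
print(np.abs(W @ C.T - ppmi).mean())  # reconstruction error after training
```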
TRex: A Tomography Reconstruction Proximal Framework for Robust Sparse View X-Ray Applications
Title | TRex: A Tomography Reconstruction Proximal Framework for Robust Sparse View X-Ray Applications |
Authors | Mohamed Aly, Guangming Zang, Wolfgang Heidrich, Peter Wonka |
Abstract | We present TRex, a flexible and robust Tomographic Reconstruction framework using proximal algorithms. We provide an overview and perform an experimental comparison of well-known iterative reconstruction methods in terms of reconstruction quality in sparse-view situations. We then derive the proximal operators for the four best methods. We show the flexibility of our framework by deriving solvers for two noise models, Gaussian and Poisson, and by plugging in three powerful regularizers. We compare our framework to state-of-the-art methods and show superior quality on both synthetic and real datasets. |
Tasks | |
Published | 2016-06-11 |
URL | http://arxiv.org/abs/1606.03601v1
PDF | http://arxiv.org/pdf/1606.03601v1.pdf
PWC | https://paperswithcode.com/paper/trex-a-tomography-reconstruction-proximal |
Repo | https://github.com/mohamedadaly/TRex |
Framework | none |
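The proximal pattern such frameworks build on alternates a gradient step on the data-fidelity term with the regularizer's proximal operator. A minimal ISTA sketch for a Gaussian noise model with an L1 prior — the paper plugs in stronger regularizers and tomography-specific operators, and the random matrix below merely stands in for a sparse-view projector:

```python
import numpy as np

def ista(A, b, lam=0.1, step=None, iters=200):
    """Proximal gradient: a gradient step on ||Ax - b||^2 followed by the
    proximal operator of lam*||x||_1 (soft-thresholding)."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)                   # data-term gradient
        z = x - step * g
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)  # prox step
    return x

# Toy sparse-view setup: far fewer measurements than unknowns.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))  # stand-in for the projection matrix
x_true = np.zeros(100); x_true[rng.choice(100, 5, replace=False)] = 1.0
b = A @ x_true + 0.01 * rng.normal(size=40)
print(np.round(ista(A, b)[np.nonzero(x_true)], 2))  # recovered spikes
```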
Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
Title | Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 |
Authors | Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio |
Abstract | We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time. At training time, the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substantially improve power efficiency. To validate the effectiveness of BNNs, we conduct two sets of experiments on the Torch7 and Theano frameworks. On both, BNNs achieved nearly state-of-the-art results on the MNIST, CIFAR-10 and SVHN datasets. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available online. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.02830v3
PDF | http://arxiv.org/pdf/1602.02830v3.pdf
PWC | https://paperswithcode.com/paper/binarized-neural-networks-training-deep |
Repo | https://github.com/hpi-xnor/BMXNet |
Framework | mxnet |
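The training pattern in the abstract — binary weights and activations in the forward pass, real-valued weights accumulating the updates, gradients passed "straight through" the sign function — is easy to sketch. A toy NumPy version with a made-up error signal; the paper's actual training runs proper backprop through deep networks:

```python
import numpy as np

def binarize(w):
    """Deterministic binarization used at run-time and in the forward pass."""
    return np.where(w >= 0, 1.0, -1.0)

rng = np.random.default_rng(0)
w_real = 0.1 * rng.normal(size=(4, 8))  # real weights accumulate updates
x = rng.normal(size=8)
target = np.array([1.0, -1.0, 1.0, -1.0])
for _ in range(100):
    wb = binarize(w_real)                # forward pass uses only +1/-1
    y = np.sign(wb @ x)                  # binary activations too
    grad_y = y - target                  # toy error signal
    grad_w = np.outer(grad_y, x)         # straight-through estimator:
    w_real -= 0.01 * grad_w              # ignore sign()'s zero derivative
    w_real = np.clip(w_real, -1, 1)      # keep real weights in [-1, 1]
print(binarize(w_real) @ x)
```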