Paper Group AWR 13
Coresets for Scalable Bayesian Logistic Regression
Title | Coresets for Scalable Bayesian Logistic Regression |
Authors | Jonathan H. Huggins, Trevor Campbell, Tamara Broderick |
Abstract | The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification. In this paper, we develop an efficient coreset construction algorithm for Bayesian logistic regression models. We provide theoretical guarantees on the size and approximation quality of the coreset – both for fixed, known datasets, and in expectation for a wide class of data generative models. Crucially, the proposed approach also permits efficient construction of the coreset in both streaming and parallel settings, with minimal additional effort. We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size. Furthermore, constructing the coreset takes a negligible amount of time compared to that required to run MCMC on it. |
Tasks | Bayesian Inference |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06423v3 |
http://arxiv.org/pdf/1605.06423v3.pdf | |
PWC | https://paperswithcode.com/paper/coresets-for-scalable-bayesian-logistic |
Repo | https://github.com/trevorcampbell/bayesian-coresets |
Framework | none |
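The coreset idea described in the abstract above can be illustrated with a rough sketch: draw a small sample of points with probability proportional to an importance score, then weight each draw by the inverse of that probability so the weighted log-likelihood estimates the full-data log-likelihood without bias. The score used here (one plus the feature norm) and the function names are hypothetical placeholders, not the construction from the paper.

```python
import math
import random

def log_lik(theta, x, y):
    """Logistic log-likelihood of one point; y is +1 or -1."""
    margin = y * sum(t * xi for t, xi in zip(theta, x))
    # log sigmoid(margin), computed in a numerically stable way
    if margin >= 0:
        return -math.log1p(math.exp(-margin))
    return margin - math.log1p(math.exp(margin))

def build_coreset(xs, ys, size, rng):
    """Importance-sample `size` points; return (x, y, weight) triples.

    The score below is a hypothetical placeholder, not the paper's
    sensitivity bound.
    """
    scores = [1.0 + math.sqrt(sum(v * v for v in x)) for x in xs]
    total = sum(scores)
    probs = [s / total for s in scores]
    picks = rng.choices(range(len(xs)), weights=probs, k=size)
    # Weight 1 / (size * p_i) makes the weighted sum unbiased.
    return [(xs[i], ys[i], 1.0 / (size * probs[i])) for i in picks]

def weighted_log_lik(theta, coreset):
    """Coreset approximation of the full-data log-likelihood."""
    return sum(w * log_lik(theta, x, y) for x, y, w in coreset)
```

Any inference routine that accepts per-point weights could then be run on the returned triples in place of the full dataset.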
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
Title | Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis |
Authors | Chuan Li, Michael Wand |
Abstract | This paper studies a combination of generative Markov random field (MRF) models and discriminatively trained deep convolutional neural networks (dCNNs) for synthesizing 2D images. The generative MRF acts on higher levels of a dCNN feature pyramid, controlling the image layout at an abstract level. We apply the method to both photographic and non-photo-realistic (artwork) synthesis tasks. The MRF regularizer prevents over-excitation artifacts and reduces implausible feature mixtures common to previous dCNN inversion approaches, permitting the synthesis of photographic content with increased visual plausibility. Unlike standard MRF-based texture synthesis, the combined system can both match and adapt local features with considerable variability, yielding results far out of reach of classic generative MRF methods. |
Tasks | Image Generation, Texture Synthesis |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04589v1 |
http://arxiv.org/pdf/1601.04589v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-markov-random-fields-and |
Repo | https://github.com/paulwarkentin/pytorch-neural-doodle |
Framework | pytorch |
Hierarchical Character-Word Models for Language Identification
Title | Hierarchical Character-Word Models for Language Identification |
Authors | Aaron Jaech, George Mulcaire, Shobhit Hathi, Mari Ostendorf, Noah A. Smith |
Abstract | Social media messages’ brevity and unconventional spelling pose a challenge to language identification. We introduce a hierarchical model that learns character and contextualized word-level representations for language identification. Our method performs well against strong baselines, and can also reveal code-switching. |
Tasks | Language Identification |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03030v1 |
http://arxiv.org/pdf/1608.03030v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-character-word-models-for |
Repo | https://github.com/ajaech/twitter_langid |
Framework | tf |
Detecting state of aggression in sentences using CNN
Title | Detecting state of aggression in sentences using CNN |
Authors | Rodmonga Potapova, Denis Gordeev |
Abstract | In this article we study the verbal expression of aggression and its detection using machine learning and neural network methods. We test our results using our corpora of messages from anonymous imageboards. We also compare a random forest classifier with a convolutional neural network on the “Movie reviews with one sentence per review” corpus. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06650v1 |
http://arxiv.org/pdf/1604.06650v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-state-of-aggression-in-sentences |
Repo | https://github.com/jee1213/twitter_sentiment |
Framework | none |
Steerable CNNs
Title | Steerable CNNs |
Authors | Taco S. Cohen, Max Welling |
Abstract | It has long been recognized that the invariance and equivariance properties of a representation are critically important for success in many vision tasks. In this paper we present Steerable Convolutional Neural Networks, an efficient and flexible class of equivariant convolutional networks. We show that steerable CNNs achieve state of the art results on the CIFAR image classification benchmark. The mathematical theory of steerable representations reveals a type system in which any steerable representation is a composition of elementary feature types, each one associated with a particular kind of symmetry. We show how the parameter cost of a steerable filter bank depends on the types of the input and output features, and show how to use this knowledge to construct CNNs that utilize parameters effectively. |
Tasks | Image Classification |
Published | 2016-12-27 |
URL | http://arxiv.org/abs/1612.08498v1 |
http://arxiv.org/pdf/1612.08498v1.pdf | |
PWC | https://paperswithcode.com/paper/steerable-cnns |
Repo | https://github.com/QUVA-Lab/e2cnn |
Framework | pytorch |
Universal adversarial perturbations
Title | Universal adversarial perturbations |
Authors | Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard |
Abstract | Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye. We further empirically analyze these universal perturbations and show, in particular, that they generalize very well across neural networks. The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers. It further outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images. |
Tasks | |
Published | 2016-10-26 |
URL | http://arxiv.org/abs/1610.08401v3 |
http://arxiv.org/pdf/1610.08401v3.pdf | |
PWC | https://paperswithcode.com/paper/universal-adversarial-perturbations |
Repo | https://github.com/LTS4/universal |
Framework | tf |
Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping
Title | Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping |
Authors | Wojciech Jaśkowski |
Abstract | 2048 is an engaging single-player, nondeterministic video puzzle game, which, thanks to the simple rules and hard-to-master gameplay, has gained massive popularity in recent years. As 2048 can be conveniently embedded into the discrete-state Markov decision processes framework, we treat it as a testbed for evaluating existing and new methods in reinforcement learning. With the aim of developing a strong 2048 playing program, we employ temporal difference learning with systematic n-tuple networks. We show that this basic method can be significantly improved with temporal coherence learning, a multi-stage function approximator with weight promotion, carousel shaping, and redundant encoding. In addition, we demonstrate how to take advantage of the characteristics of the n-tuple network to improve the algorithmic effectiveness of the learning process by i) delaying the (decayed) update and ii) applying lock-free optimistic parallelism to effortlessly take advantage of multiple CPU cores. This way, we were able to develop the best known 2048 playing program to date, which confirms the effectiveness of the introduced methods for discrete-state Markov decision problems. |
Tasks | |
Published | 2016-04-18 |
URL | http://arxiv.org/abs/1604.05085v3 |
http://arxiv.org/pdf/1604.05085v3.pdf | |
PWC | https://paperswithcode.com/paper/mastering-2048-with-delayed-temporal |
Repo | https://github.com/aszczepanski/2048 |
Framework | none |
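As a minimal sketch of the baseline named in the abstract above (temporal difference learning with an n-tuple network), the value of a board can be taken as the sum of lookup-table entries indexed by the tiles seen at fixed groups of cells, and a TD(0) step spreads the error equally over the active entries. The tuple layout, the class and method names, and the learning rate are illustrative assumptions; none of the paper's improvements (temporal coherence, weight promotion, carousel shaping, redundant encoding) are included.

```python
from collections import defaultdict

# Hypothetical layout: each tuple lists cell indices of a 4x4 board
# stored as a flat list of 16 tile exponents (0 = empty).
TUPLES = [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)]

class NTupleNetwork:
    def __init__(self, tuples=TUPLES):
        self.tuples = tuples
        # One lookup table per tuple, keyed by the tile values it sees.
        self.tables = [defaultdict(float) for _ in tuples]

    def _keys(self, board):
        return [tuple(board[i] for i in t) for t in self.tuples]

    def value(self, board):
        """Sum the table entries activated by this board."""
        return sum(tab[k] for tab, k in zip(self.tables, self._keys(board)))

    def td_update(self, board, reward, next_board, alpha=0.1):
        """TD(0): move V(board) toward reward + V(next_board)."""
        error = reward + self.value(next_board) - self.value(board)
        step = alpha * error / len(self.tuples)
        for tab, k in zip(self.tables, self._keys(board)):
            tab[k] += step
        return error
```

Because each update touches only a handful of table entries, updates of this kind are cheap, which is the property the abstract's delayed-update and parallelism ideas build on.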
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Title | Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text |
Authors | Subhashini Venugopalan, Lisa Anne Hendricks, Raymond Mooney, Kate Saenko |
Abstract | This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. We evaluate our approach on a collection of Youtube videos as well as two large movie description datasets showing significant improvements in grammaticality while modestly improving descriptive quality. |
Tasks | Language Modelling, Video Description |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01729v2 |
http://arxiv.org/pdf/1604.01729v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-lstm-based-video-description-with |
Repo | https://github.com/beerzyp/ECAC-Bimodal_audiovisual_recognition |
Framework | none |
Image-embodied Knowledge Representation Learning
Title | Image-embodied Knowledge Representation Learning |
Authors | Ruobing Xie, Zhiyuan Liu, Huanbo Luan, Maosong Sun |
Abstract | Entity images could provide significant visual information for knowledge representation learning. Most conventional methods learn knowledge representations merely from structured triples, ignoring rich visual information extracted from entity images. In this paper, we propose a novel Image-embodied Knowledge Representation Learning model (IKRL), where knowledge representations are learned with both triple facts and images. More specifically, we first construct representations for all images of an entity with a neural image encoder. These image representations are then integrated into an aggregated image-based representation via an attention-based method. We evaluate our IKRL models on knowledge graph completion and triple classification. Experimental results demonstrate that our models outperform all baselines on both tasks, which indicates the significance of visual information for knowledge representations and the capability of our models in learning knowledge representations with images. |
Tasks | Knowledge Graph Completion, Representation Learning |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07028v2 |
http://arxiv.org/pdf/1609.07028v2.pdf | |
PWC | https://paperswithcode.com/paper/image-embodied-knowledge-representation |
Repo | https://github.com/thunlp/IKRL |
Framework | none |
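The attention-based aggregation step mentioned in the abstract above could look roughly like the following: each image vector of an entity receives a softmax weight, and the aggregated representation is the weighted average. Scoring an image by its dot product with a query vector is an assumption made for illustration, and the function name is hypothetical.

```python
import math

def attention_aggregate(image_vecs, query):
    """Softmax-weighted average of image vectors.

    Scoring each image by its dot product with `query` is an
    illustrative assumption.
    """
    scores = [sum(a * b for a, b in zip(v, query)) for v in image_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(query)
    agg = [sum(w * v[d] for w, v in zip(weights, image_vecs))
           for d in range(dim)]
    return agg, weights
```

Images whose vectors align more closely with the query contribute more to the aggregate, so uninformative images are down-weighted rather than discarded.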
Geometric adaptive Monte Carlo in random environment
Title | Geometric adaptive Monte Carlo in random environment |
Authors | Theodore Papamarkou, Alexey Lindo, Eric B. Ford |
Abstract | Manifold Markov chain Monte Carlo algorithms have been introduced to sample more effectively from challenging target densities exhibiting multiple modes or strong correlations. Such algorithms exploit the local geometry of the parameter space, thus enabling chains to achieve a faster convergence rate when measured in number of steps. However, acquiring local geometric information can often increase computational complexity per step to the extent that sampling from high-dimensional targets becomes inefficient in terms of total computational time. This paper analyzes the computational complexity of manifold Langevin Monte Carlo and proposes a geometric adaptive Monte Carlo sampler aimed at balancing the benefits of exploiting local geometry with computational cost to achieve a high effective sample size for a given computational cost. The suggested sampler is a discrete-time stochastic process in random environment. The random environment allows switching between local geometric and adaptive proposal kernels with the help of a schedule. An exponential schedule is put forward that enables more frequent use of geometric information in early transient phases of the chain, while saving computational time in late stationary phases. The average complexity can be manually set depending on the need for geometric exploitation posed by the underlying model. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.07986v3 |
http://arxiv.org/pdf/1608.07986v3.pdf | |
PWC | https://paperswithcode.com/paper/geometric-adaptive-monte-carlo-in-random |
Repo | https://github.com/scidom/MAMALASampler.jl |
Framework | none |
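The scheduling idea in the abstract above can be sketched as follows: at each step a random draw decides whether the expensive geometric kernel or the cheaper adaptive kernel is used, with the probability of the geometric kernel decaying exponentially in the step index. The exact form exp(-rate * step), the default rate, and all function names are assumptions for illustration; the proposal kernels themselves are not implemented.

```python
import math
import random

def geometric_prob(step, rate=0.01):
    """Probability of using the geometric kernel at a given step.

    The form exp(-rate * step) and the default rate are assumptions.
    """
    return math.exp(-rate * step)

def choose_kernels(n_steps, rate, rng):
    """Draw the random environment: one kernel label per step."""
    return ["geometric" if rng.random() < geometric_prob(k, rate) else "adaptive"
            for k in range(n_steps)]

def expected_geometric_fraction(n_steps, rate):
    """Average share of steps expected to use the geometric kernel."""
    return sum(geometric_prob(k, rate) for k in range(n_steps)) / n_steps
```

Under this sketch, raising `rate` lowers the expected share of geometric steps, which is one way to trade average per-step cost against use of geometric information.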
Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks
Title | Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks |
Authors | Yanghao Li, Cuiling Lan, Junliang Xing, Wenjun Zeng, Chunfeng Yuan, Jiaying Liu |
Abstract | Human action recognition from well-segmented 3D skeleton data has been intensively studied and has been attracting increasing attention. Online action detection goes one step further and is more challenging: it identifies the action type and localizes the action positions on the fly from untrimmed stream data. In this paper, we study the problem of online action detection from streaming skeleton data. We propose a multi-task end-to-end Joint Classification-Regression Recurrent Neural Network to better explore the action type and temporal localization information. By employing a joint classification and regression optimization objective, this network is capable of automatically localizing the start and end points of actions more accurately. Specifically, by leveraging the merits of the deep Long Short-Term Memory (LSTM) subnetwork, the proposed model automatically captures the complex long-range temporal dynamics, which naturally avoids the typical sliding window design and thus ensures high computational efficiency. Furthermore, the subtask of regression optimization provides the ability to forecast the action prior to its occurrence. To evaluate our proposed model, we build a large streaming video dataset with annotations. Experimental results on our dataset and the public G3D dataset both demonstrate very promising performance of our scheme. |
Tasks | Action Detection, Temporal Action Localization, Temporal Localization |
Published | 2016-04-19 |
URL | http://arxiv.org/abs/1604.05633v2 |
http://arxiv.org/pdf/1604.05633v2.pdf | |
PWC | https://paperswithcode.com/paper/online-human-action-detection-using-joint |
Repo | https://github.com/seanmcgovern21/Machine-Learning-CS539 |
Framework | pytorch |
Long-term Temporal Convolutions for Action Recognition
Title | Long-term Temporal Convolutions for Action Recognition |
Authors | Gül Varol, Ivan Laptev, Cordelia Schmid |
Abstract | Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames, failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields, and demonstrate the importance of high-quality optical flow estimation for learning accurate action models. We report state-of-the-art results on two challenging benchmarks for human action recognition: UCF101 (92.7%) and HMDB51 (67.2%). |
Tasks | Optical Flow Estimation, Temporal Action Localization |
Published | 2016-04-15 |
URL | http://arxiv.org/abs/1604.04494v2 |
http://arxiv.org/pdf/1604.04494v2.pdf | |
PWC | https://paperswithcode.com/paper/long-term-temporal-convolutions-for-action |
Repo | https://github.com/gulvarol/ltc |
Framework | torch |
Simple Online and Realtime Tracking
Title | Simple Online and Realtime Tracking |
Authors | Alex Bewley, Zongyuan Ge, Lionel Ott, Fabio Ramos, Ben Upcroft |
Abstract | This paper explores a pragmatic approach to multiple object tracking where the main focus is to associate objects efficiently for online and realtime applications. To this end, detection quality is identified as a key factor influencing tracking performance, where changing the detector can improve tracking by up to 18.9%. Despite only using a rudimentary combination of familiar techniques such as the Kalman Filter and Hungarian algorithm for the tracking components, this approach achieves an accuracy comparable to state-of-the-art online trackers. Furthermore, due to the simplicity of our tracking method, the tracker updates at a rate of 260 Hz which is over 20x faster than other state-of-the-art trackers. |
Tasks | Multiple Object Tracking, Object Tracking |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.00763v2 |
http://arxiv.org/pdf/1602.00763v2.pdf | |
PWC | https://paperswithcode.com/paper/simple-online-and-realtime-tracking |
Repo | https://github.com/cfotache/pytorch_objectdetecttrack |
Framework | pytorch |
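The association step described in the abstract above, matching existing tracks to new detections, can be sketched with an overlap score and an optimal assignment. In the sketch below the cost is intersection over union (IoU) between boxes, and a brute-force search over permutations stands in for the Hungarian algorithm, so it is only practical for a handful of boxes. The `min_iou` threshold and the function names are illustrative assumptions, and the Kalman filter that would predict track positions is omitted.

```python
from itertools import permutations

def iou(a, b):
    """Intersection over union of boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def associate(tracks, detections, min_iou=0.3):
    """Return (track_idx, det_idx) pairs maximising total IoU.

    Brute force over permutations stands in for the Hungarian algorithm.
    """
    if len(tracks) > len(detections):
        # Swap roles so the shorter list is always assigned into the longer.
        flipped = associate(detections, tracks, min_iou)
        return sorted((t, d) for d, t in flipped)
    best, best_score = (), -1.0
    for perm in permutations(range(len(detections)), len(tracks)):
        score = sum(iou(tracks[t], detections[d]) for t, d in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    # Drop pairs that overlap too little to be the same object.
    return [(t, d) for t, d in enumerate(best)
            if iou(tracks[t], detections[d]) >= min_iou]
```

Detections left unmatched would typically start new tracks, and tracks left unmatched would eventually be dropped; that bookkeeping is outside this sketch.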
Theano: A Python framework for fast computation of mathematical expressions
Title | Theano: A Python framework for fast computation of mathematical expressions |
Authors | The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang |
Abstract | Theano is a Python library that allows one to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it. |
Tasks | Dimensionality Reduction |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02688v1 |
http://arxiv.org/pdf/1605.02688v1.pdf | |
PWC | https://paperswithcode.com/paper/theano-a-python-framework-for-fast |
Repo | https://github.com/leonoverweel/bibtex-python-package-citations |
Framework | tf |
Low-rank passthrough neural networks
Title | Low-rank passthrough neural networks |
Authors | Antonio Valerio Miceli Barone |
Abstract | Various common deep learning architectures, such as LSTMs, GRUs, Resnets and Highway Networks, employ state passthrough connections that support training with high feed-forward depth or recurrence over many time steps. These “Passthrough Networks” architectures also enable the decoupling of the network state size from the number of parameters of the network, a possibility that has been studied by Sak et al. (2014) with their low-rank parametrization of the LSTM. In this work we extend this line of research, proposing effective, low-rank and low-rank plus diagonal matrix parametrizations for Passthrough Networks which exploit this decoupling property, reducing the data complexity and memory requirements of the network while preserving its memory capacity. This is particularly beneficial in low-resource settings as it supports expressive models with a compact parametrization less susceptible to overfitting. We present competitive experimental results on several tasks, including language modeling and a near state of the art result on sequential randomly-permuted MNIST classification, a hard task on natural data. |
Tasks | Language Modelling |
Published | 2016-03-10 |
URL | http://arxiv.org/abs/1603.03116v3 |
http://arxiv.org/pdf/1603.03116v3.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-passthrough-neural-networks |
Repo | https://github.com/Avmb/lowrank-lstm |
Framework | torch |
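To make the low-rank plus diagonal parametrization from the abstract above concrete, the following sketch replaces an n x n weight matrix W by U V + diag(D), with U of shape n x d and V of shape d x n, and applies it to a vector without ever forming W. The function names and the plain-list representation are assumptions made for illustration; this is not the code from the linked repository.

```python
def lowrank_diag_matvec(U, V, diag, x):
    """Compute (U @ V + diag(diag)) @ x without forming the n x n matrix.

    U is n x d, V is d x n, diag and x have length n (plain lists).
    """
    # Project down to d dimensions: z = V x
    z = [sum(v_row[j] * x[j] for j in range(len(x))) for v_row in V]
    # Project back up and add the diagonal term: y = U z + diag * x
    return [sum(U[i][k] * z[k] for k in range(len(z))) + diag[i] * x[i]
            for i in range(len(x))]

def param_counts(n, d):
    """Parameters of a full n x n matrix versus the compact forms."""
    return {
        "full": n * n,
        "lowrank": 2 * n * d,
        "lowrank_plus_diag": 2 * n * d + n,
    }
```

For a state size of n = 1000 and rank d = 10, `param_counts` gives 1,000,000 parameters for the full matrix against 21,000 for the low-rank plus diagonal form, which shows how the state size is decoupled from the parameter count.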