May 7, 2019


Paper Group AWR 13

Coresets for Scalable Bayesian Logistic Regression. Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis. Hierarchical Character-Word Models for Language Identification. Detecting state of aggression in sentences using CNN. Steerable CNNs. Universal adversarial perturbations. Mastering 2048 with Delayed Temporal Cohe …

Coresets for Scalable Bayesian Logistic Regression


Title	Coresets for Scalable Bayesian Logistic Regression
Authors	Jonathan H. Huggins, Trevor Campbell, Tamara Broderick
Abstract	The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification. In this paper, we develop an efficient coreset construction algorithm for Bayesian logistic regression models. We provide theoretical guarantees on the size and approximation quality of the coreset – both for fixed, known datasets, and in expectation for a wide class of data generative models. Crucially, the proposed approach also permits efficient construction of the coreset in both streaming and parallel settings, with minimal additional effort. We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size. Furthermore, constructing the coreset takes a negligible amount of time compared to that required to run MCMC on it.
Tasks	Bayesian Inference
Published	2016-05-20
URL	http://arxiv.org/abs/1605.06423v3
PDF	http://arxiv.org/pdf/1605.06423v3.pdf
PWC	https://paperswithcode.com/paper/coresets-for-scalable-bayesian-logistic
Repo	https://github.com/trevorcampbell/bayesian-coresets
Framework	none
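
As a rough illustration of the weighted-subset idea in the abstract above, the sketch below evaluates a logistic regression log-likelihood once on a full synthetic dataset and once on a small weighted subset. It uses plain uniform subsampling with weights N/M as a stand-in. This is not the paper's coreset construction, and the names `log_lik` and `uniform_weighted_subset` are hypothetical.

```python
import math
import random


def log_lik(theta, data, weights=None):
    """Weighted logistic-regression log-likelihood.

    data: list of (x, y) pairs, x a list of floats, y in {-1, +1}.
    weights: one non-negative weight per pair (defaults to all ones).
    """
    if weights is None:
        weights = [1.0] * len(data)
    total = 0.0
    for (x, y), w in zip(data, weights):
        margin = y * sum(t * xi for t, xi in zip(theta, x))
        # log sigmoid(margin), computed in a numerically stable way
        if margin >= 0:
            total += -w * math.log1p(math.exp(-margin))
        else:
            total += w * (margin - math.log1p(math.exp(margin)))
    return total


def uniform_weighted_subset(data, m, rng):
    """Pick m points uniformly without replacement, each weighted N/m.

    Assumption: a simple stand-in for a coreset, not the construction
    described in the paper.
    """
    n = len(data)
    subset = rng.sample(data, m)
    return subset, [n / m] * m


# Synthetic data (illustrative only): two Gaussian features, noisy linear labels.
rng = random.Random(0)
data = []
for _ in range(2000):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    y = 1 if x[0] + 0.5 * x[1] + rng.gauss(0, 0.5) > 0 else -1
    data.append((x, y))

theta = [1.0, 0.5]
subset, weights = uniform_weighted_subset(data, 200, rng)
full = log_lik(theta, data)                # uses all 2000 points
approx = log_lik(theta, subset, weights)   # uses 200 weighted points
```

Because the weights sum to the original dataset size, `approx` is an unbiased estimate of `full`. A real coreset would choose points and weights so that the approximation holds uniformly over parameter values, and the weighted subset could then be passed unchanged to an existing inference routine.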

Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis


Title	Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
Authors	Chuan Li, Michael Wand
Abstract	This paper studies a combination of generative Markov random field (MRF) models and discriminatively trained deep convolutional neural networks (dCNNs) for synthesizing 2D images. The generative MRF acts on higher levels of a dCNN feature pyramid, controlling the image layout at an abstract level. We apply the method to both photographic and non-photo-realistic (artwork) synthesis tasks. The MRF regularizer prevents over-excitation artifacts and reduces implausible feature mixtures common to previous dCNN inversion approaches, permitting the synthesis of photographic content with increased visual plausibility. Unlike standard MRF-based texture synthesis, the combined system can both match and adapt local features with considerable variability, yielding results far out of reach of classic generative MRF methods.
Tasks	Image Generation, Texture Synthesis
Published	2016-01-18
URL	http://arxiv.org/abs/1601.04589v1
PDF	http://arxiv.org/pdf/1601.04589v1.pdf
PWC	https://paperswithcode.com/paper/combining-markov-random-fields-and
Repo	https://github.com/paulwarkentin/pytorch-neural-doodle
Framework	pytorch

Hierarchical Character-Word Models for Language Identification


Title	Hierarchical Character-Word Models for Language Identification
Authors	Aaron Jaech, George Mulcaire, Shobhit Hathi, Mari Ostendorf, Noah A. Smith
Abstract	Social media messages’ brevity and unconventional spelling pose a challenge to language identification. We introduce a hierarchical model that learns character and contextualized word-level representations for language identification. Our method performs well against strong baselines, and can also reveal code-switching.
Tasks	Language Identification
Published	2016-08-10
URL	http://arxiv.org/abs/1608.03030v1
PDF	http://arxiv.org/pdf/1608.03030v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-character-word-models-for
Repo	https://github.com/ajaech/twitter_langid
Framework	tf

Detecting state of aggression in sentences using CNN


Title	Detecting state of aggression in sentences using CNN
Authors	Rodmonga Potapova, Denis Gordeev
Abstract	In this article we study the verbal expression of aggression and its detection using machine learning and neural network methods. We test our results using our corpora of messages from anonymous imageboards. We also compare a random forest classifier with a convolutional neural network on the “Movie reviews with one sentence per review” corpus.
Tasks
Published	2016-04-22
URL	http://arxiv.org/abs/1604.06650v1
PDF	http://arxiv.org/pdf/1604.06650v1.pdf
PWC	https://paperswithcode.com/paper/detecting-state-of-aggression-in-sentences
Repo	https://github.com/jee1213/twitter_sentiment
Framework	none

Steerable CNNs


Title	Steerable CNNs
Authors	Taco S. Cohen, Max Welling
Abstract	It has long been recognized that the invariance and equivariance properties of a representation are critically important for success in many vision tasks. In this paper we present Steerable Convolutional Neural Networks, an efficient and flexible class of equivariant convolutional networks. We show that steerable CNNs achieve state of the art results on the CIFAR image classification benchmark. The mathematical theory of steerable representations reveals a type system in which any steerable representation is a composition of elementary feature types, each one associated with a particular kind of symmetry. We show how the parameter cost of a steerable filter bank depends on the types of the input and output features, and show how to use this knowledge to construct CNNs that utilize parameters effectively.
Tasks	Image Classification
Published	2016-12-27
URL	http://arxiv.org/abs/1612.08498v1
PDF	http://arxiv.org/pdf/1612.08498v1.pdf
PWC	https://paperswithcode.com/paper/steerable-cnns
Repo	https://github.com/QUVA-Lab/e2cnn
Framework	pytorch

Universal adversarial perturbations


Title	Universal adversarial perturbations
Authors	Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard
Abstract	Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye. We further empirically analyze these universal perturbations and show, in particular, that they generalize very well across neural networks. The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers. It further outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images.
Tasks
Published	2016-10-26
URL	http://arxiv.org/abs/1610.08401v3
PDF	http://arxiv.org/pdf/1610.08401v3.pdf
PWC	https://paperswithcode.com/paper/universal-adversarial-perturbations
Repo	https://github.com/LTS4/universal
Framework	tf

Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping


Title	Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping
Authors	Wojciech Jaśkowski
Abstract	2048 is an engaging single-player, nondeterministic video puzzle game, which, thanks to the simple rules and hard-to-master gameplay, has gained massive popularity in recent years. As 2048 can be conveniently embedded into the discrete-state Markov decision processes framework, we treat it as a testbed for evaluating existing and new methods in reinforcement learning. With the aim to develop a strong 2048 playing program, we employ temporal difference learning with systematic n-tuple networks. We show that this basic method can be significantly improved with temporal coherence learning, multi-stage function approximator with weight promotion, carousel shaping, and redundant encoding. In addition, we demonstrate how to take advantage of the characteristics of the n-tuple network to improve the algorithmic effectiveness of the learning process by i) delaying the (decayed) update and ii) applying lock-free optimistic parallelism to effortlessly take advantage of multiple CPU cores. This way, we were able to develop the best known 2048 playing program to date, which confirms the effectiveness of the introduced methods for discrete-state Markov decision problems.
Tasks
Published	2016-04-18
URL	http://arxiv.org/abs/1604.05085v3
PDF	http://arxiv.org/pdf/1604.05085v3.pdf
PWC	https://paperswithcode.com/paper/mastering-2048-with-delayed-temporal
Repo	https://github.com/aszczepanski/2048
Framework	none
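
To make the "temporal difference learning with n-tuple networks" baseline from the abstract above more concrete, here is a minimal sketch of an undiscounted TD(0) update on an n-tuple lookup-table value function. The board encoding (a flat list of 16 tile exponents), the two example tuples, and the class name `NTupleNetwork` are assumptions for illustration. None of the paper's refinements (temporal coherence, multi-stage weight promotion, carousel shaping, redundant encoding, delayed updates) are included.

```python
from collections import defaultdict

# Each n-tuple is a fixed list of board cell indices (the board is a flat list
# of 16 tile exponents, 0 meaning empty). These two 4-tuples are illustrative.
TUPLES = [(0, 1, 2, 3), (0, 4, 8, 12)]


class NTupleNetwork:
    def __init__(self, tuples):
        self.tuples = tuples
        # One lookup table per tuple, keyed by the tile values at its cells.
        self.tables = [defaultdict(float) for _ in tuples]

    def _keys(self, board):
        return [tuple(board[i] for i in cells) for cells in self.tuples]

    def value(self, board):
        """State value = sum of the active lookup-table entries."""
        return sum(t[k] for t, k in zip(self.tables, self._keys(board)))

    def td_update(self, board, reward, next_board, alpha=0.1, terminal=False):
        """One TD(0) step: move V(board) toward reward + V(next_board)."""
        target = reward + (0.0 if terminal else self.value(next_board))
        delta = target - self.value(board)
        step = alpha * delta / len(self.tuples)  # split the step across tables
        for table, key in zip(self.tables, self._keys(board)):
            table[key] += step
        return delta


net = NTupleNetwork(TUPLES)
board = [1, 1, 0, 0] + [0] * 12        # two "2" tiles in the top row
next_board = [2, 0, 0, 0] + [0] * 12   # merged into one "4" tile
delta = net.td_update(board, reward=4.0, next_board=next_board,
                      alpha=0.5, terminal=True)
```

Only the table entries indexed by the current board are touched in each update, so the cost per step depends on the number of tuples rather than on the size of the tables.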

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text


Title	Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Authors	Subhashini Venugopalan, Lisa Anne Hendricks, Raymond Mooney, Kate Saenko
Abstract	This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. We evaluate our approach on a collection of YouTube videos as well as two large movie description datasets, showing significant improvements in grammaticality while modestly improving descriptive quality.
Tasks	Language Modelling, Video Description
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01729v2
PDF	http://arxiv.org/pdf/1604.01729v2.pdf
PWC	https://paperswithcode.com/paper/improving-lstm-based-video-description-with
Repo	https://github.com/beerzyp/ECAC-Bimodal_audiovisual_recognition
Framework	none

Image-embodied Knowledge Representation Learning


Title	Image-embodied Knowledge Representation Learning
Authors	Ruobing Xie, Zhiyuan Liu, Huanbo Luan, Maosong Sun
Abstract	Entity images could provide significant visual information for knowledge representation learning. Most conventional methods learn knowledge representations merely from structured triples, ignoring rich visual information extracted from entity images. In this paper, we propose a novel Image-embodied Knowledge Representation Learning model (IKRL), where knowledge representations are learned with both triple facts and images. More specifically, we first construct representations for all images of an entity with a neural image encoder. These image representations are then integrated into an aggregated image-based representation via an attention-based method. We evaluate our IKRL models on knowledge graph completion and triple classification. Experimental results demonstrate that our models outperform all baselines on both tasks, which indicates the significance of visual information for knowledge representations and the capability of our models in learning knowledge representations with images.
Tasks	Knowledge Graph Completion, Representation Learning
Published	2016-09-22
URL	http://arxiv.org/abs/1609.07028v2
PDF	http://arxiv.org/pdf/1609.07028v2.pdf
PWC	https://paperswithcode.com/paper/image-embodied-knowledge-representation
Repo	https://github.com/thunlp/IKRL
Framework	none

Geometric adaptive Monte Carlo in random environment


Title	Geometric adaptive Monte Carlo in random environment
Authors	Theodore Papamarkou, Alexey Lindo, Eric B. Ford
Abstract	Manifold Markov chain Monte Carlo algorithms have been introduced to sample more effectively from challenging target densities exhibiting multiple modes or strong correlations. Such algorithms exploit the local geometry of the parameter space, thus enabling chains to achieve a faster convergence rate when measured in number of steps. However, acquiring local geometric information can often increase computational complexity per step to the extent that sampling from high-dimensional targets becomes inefficient in terms of total computational time. This paper analyzes the computational complexity of manifold Langevin Monte Carlo and proposes a geometric adaptive Monte Carlo sampler aimed at balancing the benefits of exploiting local geometry with computational cost to achieve a high effective sample size for a given computational cost. The suggested sampler is a discrete-time stochastic process in random environment. The random environment allows switching between local geometric and adaptive proposal kernels with the help of a schedule. An exponential schedule is put forward that enables more frequent use of geometric information in early transient phases of the chain, while saving computational time in late stationary phases. The average complexity can be manually set depending on the need for geometric exploitation posed by the underlying model.
Tasks
Published	2016-08-29
URL	http://arxiv.org/abs/1608.07986v3
PDF	http://arxiv.org/pdf/1608.07986v3.pdf
PWC	https://paperswithcode.com/paper/geometric-adaptive-monte-carlo-in-random
Repo	https://github.com/scidom/MAMALASampler.jl
Framework	none

Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks


Title	Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks
Authors	Yanghao Li, Cuiling Lan, Junliang Xing, Wenjun Zeng, Chunfeng Yuan, Jiaying Liu
Abstract	Human action recognition from well-segmented 3D skeleton data has been intensively studied and has attracted increasing attention. Online action detection goes one step further and is more challenging: it identifies the action type and localizes the action positions on the fly from the untrimmed stream data. In this paper, we study the problem of online action detection from streaming skeleton data. We propose a multi-task end-to-end Joint Classification-Regression Recurrent Neural Network to better explore the action type and temporal localization information. By employing a joint classification and regression optimization objective, this network is capable of automatically localizing the start and end points of actions more accurately. Specifically, by leveraging the merits of the deep Long Short-Term Memory (LSTM) subnetwork, the proposed model automatically captures the complex long-range temporal dynamics, which naturally avoids the typical sliding window design and thus ensures high computational efficiency. Furthermore, the subtask of regression optimization provides the ability to forecast the action prior to its occurrence. To evaluate our proposed model, we build a large streaming video dataset with annotations. Experimental results on our dataset and the public G3D dataset both demonstrate very promising performance of our scheme.
Tasks	Action Detection, Temporal Action Localization, Temporal Localization
Published	2016-04-19
URL	http://arxiv.org/abs/1604.05633v2
PDF	http://arxiv.org/pdf/1604.05633v2.pdf
PWC	https://paperswithcode.com/paper/online-human-action-detection-using-joint
Repo	https://github.com/seanmcgovern21/Machine-Learning-CS539
Framework	pytorch

Long-term Temporal Convolutions for Action Recognition


Title	Long-term Temporal Convolutions for Action Recognition
Authors	Gül Varol, Ivan Laptev, Cordelia Schmid
Abstract	Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames, failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields, and demonstrate the importance of high-quality optical flow estimation for learning accurate action models. We report state-of-the-art results on two challenging benchmarks for human action recognition: UCF101 (92.7%) and HMDB51 (67.2%).
Tasks	Optical Flow Estimation, Temporal Action Localization
Published	2016-04-15
URL	http://arxiv.org/abs/1604.04494v2
PDF	http://arxiv.org/pdf/1604.04494v2.pdf
PWC	https://paperswithcode.com/paper/long-term-temporal-convolutions-for-action
Repo	https://github.com/gulvarol/ltc
Framework	torch

Simple Online and Realtime Tracking


Title	Simple Online and Realtime Tracking
Authors	Alex Bewley, Zongyuan Ge, Lionel Ott, Fabio Ramos, Ben Upcroft
Abstract	This paper explores a pragmatic approach to multiple object tracking where the main focus is to associate objects efficiently for online and realtime applications. To this end, detection quality is identified as a key factor influencing tracking performance, where changing the detector can improve tracking by up to 18.9%. Despite only using a rudimentary combination of familiar techniques such as the Kalman Filter and Hungarian algorithm for the tracking components, this approach achieves an accuracy comparable to state-of-the-art online trackers. Furthermore, due to the simplicity of our tracking method, the tracker updates at a rate of 260 Hz which is over 20x faster than other state-of-the-art trackers.
Tasks	Multiple Object Tracking, Object Tracking
Published	2016-02-02
URL	http://arxiv.org/abs/1602.00763v2
PDF	http://arxiv.org/pdf/1602.00763v2.pdf
PWC	https://paperswithcode.com/paper/simple-online-and-realtime-tracking
Repo	https://github.com/cfotache/pytorch_objectdetecttrack
Framework	pytorch
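
The data-association step mentioned in the abstract above can be sketched as follows: score every track/detection pair by box overlap (IoU), pick the assignment with the best total score, and drop weak matches. This sketch searches all assignments by brute force instead of running the Hungarian algorithm (both reach the same optimum, but brute force is only practical for a handful of boxes), and it omits the Kalman filter prediction entirely. The function names and the default `min_iou` threshold are assumptions.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Match track boxes to detection boxes by maximising total IoU.

    Brute-force search over assignments; assumes len(tracks) <= len(detections)
    and only a few boxes. Returns (track_index, detection_index) pairs whose
    IoU is at least min_iou.
    """
    best, best_score = (), -1.0
    for perm in permutations(range(len(detections)), len(tracks)):
        score = sum(iou(tracks[t], detections[d]) for t, d in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return [(t, d) for t, d in enumerate(best)
            if iou(tracks[t], detections[d]) >= min_iou]


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches = associate(tracks, detections)
```

Here track 0 pairs with detection 1 and track 1 with detection 0, while the far-away third detection stays unmatched and would typically start a new track.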

Theano: A Python framework for fast computation of mathematical expressions


Title	Theano: A Python framework for fast computation of mathematical expressions
Authors	The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang
Abstract	Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
Tasks	Dimensionality Reduction
Published	2016-05-09
URL	http://arxiv.org/abs/1605.02688v1
PDF	http://arxiv.org/pdf/1605.02688v1.pdf
PWC	https://paperswithcode.com/paper/theano-a-python-framework-for-fast
Repo	https://github.com/leonoverweel/bibtex-python-package-citations
Framework	tf

Low-rank passthrough neural networks


Title	Low-rank passthrough neural networks
Authors	Antonio Valerio Miceli Barone
Abstract	Various common deep learning architectures, such as LSTMs, GRUs, Resnets and Highway Networks, employ state passthrough connections that support training with high feed-forward depth or recurrence over many time steps. These “Passthrough Networks” architectures also enable the decoupling of the network state size from the number of parameters of the network, a possibility that has been studied by Sak et al. (2014) with their low-rank parametrization of the LSTM. In this work we extend this line of research, proposing effective, low-rank and low-rank plus diagonal matrix parametrizations for Passthrough Networks which exploit this decoupling property, reducing the data complexity and memory requirements of the network while preserving its memory capacity. This is particularly beneficial in low-resource settings as it supports expressive models with a compact parametrization less susceptible to overfitting. We present competitive experimental results on several tasks, including language modeling and a near state of the art result on sequential randomly-permuted MNIST classification, a hard task on natural data.
Tasks	Language Modelling
Published	2016-03-10
URL	http://arxiv.org/abs/1603.03116v3
PDF	http://arxiv.org/pdf/1603.03116v3.pdf
PWC	https://paperswithcode.com/paper/low-rank-passthrough-neural-networks
Repo	https://github.com/Avmb/lowrank-lstm
Framework	torch
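
The decoupling of state size from parameter count described in the abstract above can be illustrated with a small sketch. It assumes an n x n weight matrix is replaced by W = U V + diag(d), with U of shape n x r and V of shape r x n, so the parameter count drops from n^2 to 2nr + n. The function names are hypothetical and this is not the paper's implementation.

```python
def lowrank_diag_matvec(U, V, d, x):
    """Compute (U @ V + diag(d)) @ x without forming the n x n matrix.

    U: n rows of length r, V: r rows of length n, d and x: length n.
    """
    # Project x down to r dimensions: z = V x
    z = [sum(v_ij * x_j for v_ij, x_j in zip(row, x)) for row in V]
    # Expand back to n dimensions and add the diagonal term: y = U z + d * x
    return [sum(u_ik * z_k for u_ik, z_k in zip(row, z)) + d_i * x_i
            for row, d_i, x_i in zip(U, d, x)]


def param_counts(n, r):
    """Return (full, low_rank_plus_diagonal) parameter counts for an n x n map."""
    return n * n, 2 * n * r + n


# Tiny example with n = 2 and r = 1.
U = [[1.0], [2.0]]
V = [[3.0, 4.0]]
d = [10.0, 20.0]
y = lowrank_diag_matvec(U, V, d, [1.0, 1.0])

full, compact = param_counts(1000, 32)
```

With a state size of 1000 and rank 32, the compact form uses 65,000 parameters instead of 1,000,000, while the state vector itself keeps its full length.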