May 7, 2019

Paper Group AWR 13

Coresets for Scalable Bayesian Logistic Regression. Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis. Hierarchical Character-Word Models for Language Identification. Detecting state of aggression in sentences using CNN. Steerable CNNs. Universal adversarial perturbations. Mastering 2048 with Delayed Temporal Cohe …

Coresets for Scalable Bayesian Logistic Regression

Title Coresets for Scalable Bayesian Logistic Regression
Authors Jonathan H. Huggins, Trevor Campbell, Tamara Broderick
Abstract The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification. In this paper, we develop an efficient coreset construction algorithm for Bayesian logistic regression models. We provide theoretical guarantees on the size and approximation quality of the coreset – both for fixed, known datasets, and in expectation for a wide class of data generative models. Crucially, the proposed approach also permits efficient construction of the coreset in both streaming and parallel settings, with minimal additional effort. We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size. Furthermore, constructing the coreset takes a negligible amount of time compared to that required to run MCMC on it.
Tasks Bayesian Inference
Published 2016-05-20
URL http://arxiv.org/abs/1605.06423v3
PDF http://arxiv.org/pdf/1605.06423v3.pdf
PWC https://paperswithcode.com/paper/coresets-for-scalable-bayesian-logistic
Repo https://github.com/trevorcampbell/bayesian-coresets
Framework none
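
The idea of running inference on a small weighted subset instead of the full dataset can be illustrated with a short sketch. It assumes uniform subsampling with equal weights N / M, a deliberately simple stand-in and not the coreset construction developed in the paper. The function names are hypothetical, and labels are assumed to lie in {-1, +1}.

```python
import math
import random


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def weighted_log_likelihood(theta, points, weights):
    """Weighted logistic log-likelihood: sum_i w_i * log sigmoid(y_i * <x_i, theta>).

    Each point is a pair (x, y) with x a tuple of features and y in {-1, +1}.
    """
    total = 0.0
    for (x, y), w in zip(points, weights):
        margin = y * sum(xj * tj for xj, tj in zip(x, theta))
        total += w * log_sigmoid(margin)
    return total


def uniform_coreset(points, size, rng):
    """Pick `size` points uniformly without replacement; weight each by N / size.

    Assumption: this is only a placeholder for a real coreset construction.
    """
    chosen = rng.sample(points, size)
    weight = len(points) / size
    return chosen, [weight] * size


if __name__ == "__main__":
    rng = random.Random(0)
    data = [((rng.gauss(0, 1), 1.0), rng.choice([-1, 1])) for _ in range(1000)]
    theta = [0.5, -0.2]
    full = weighted_log_likelihood(theta, data, [1.0] * len(data))
    core, w = uniform_coreset(data, 100, rng)
    approx = weighted_log_likelihood(theta, core, w)
    print("full:", full, "weighted subset:", approx)
```

Any sampler that accepts per-point weights in its log-likelihood could then be run on the subset unchanged, which is the property the abstract emphasizes.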

Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis

Title Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
Authors Chuan Li, Michael Wand
Abstract This paper studies a combination of generative Markov random field (MRF) models and discriminatively trained deep convolutional neural networks (dCNNs) for synthesizing 2D images. The generative MRF acts on higher levels of a dCNN feature pyramid, controlling the image layout at an abstract level. We apply the method to both photographic and non-photo-realistic (artwork) synthesis tasks. The MRF regularizer prevents over-excitation artifacts and reduces implausible feature mixtures common to previous dCNN inversion approaches, permitting synthesizing photographic content with increased visual plausibility. Unlike standard MRF-based texture synthesis, the combined system can both match and adapt local features with considerable variability, yielding results far out of reach of classic generative MRF methods.
Tasks Image Generation, Texture Synthesis
Published 2016-01-18
URL http://arxiv.org/abs/1601.04589v1
PDF http://arxiv.org/pdf/1601.04589v1.pdf
PWC https://paperswithcode.com/paper/combining-markov-random-fields-and
Repo https://github.com/paulwarkentin/pytorch-neural-doodle
Framework pytorch

Hierarchical Character-Word Models for Language Identification

Title Hierarchical Character-Word Models for Language Identification
Authors Aaron Jaech, George Mulcaire, Shobhit Hathi, Mari Ostendorf, Noah A. Smith
Abstract Social media messages’ brevity and unconventional spelling pose a challenge to language identification. We introduce a hierarchical model that learns character and contextualized word-level representations for language identification. Our method performs well against strong baselines, and can also reveal code-switching.
Tasks Language Identification
Published 2016-08-10
URL http://arxiv.org/abs/1608.03030v1
PDF http://arxiv.org/pdf/1608.03030v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-character-word-models-for
Repo https://github.com/ajaech/twitter_langid
Framework tf

Detecting state of aggression in sentences using CNN

Title Detecting state of aggression in sentences using CNN
Authors Rodmonga Potapova, Denis Gordeev
Abstract In this article we study the verbal expression of aggression and its detection using machine learning and neural network methods. We test our results using our corpora of messages from anonymous imageboards. We also compare a random forest classifier with a convolutional neural network on the “Movie reviews with one sentence per review” corpus.
Tasks
Published 2016-04-22
URL http://arxiv.org/abs/1604.06650v1
PDF http://arxiv.org/pdf/1604.06650v1.pdf
PWC https://paperswithcode.com/paper/detecting-state-of-aggression-in-sentences
Repo https://github.com/jee1213/twitter_sentiment
Framework none

Steerable CNNs

Title Steerable CNNs
Authors Taco S. Cohen, Max Welling
Abstract It has long been recognized that the invariance and equivariance properties of a representation are critically important for success in many vision tasks. In this paper we present Steerable Convolutional Neural Networks, an efficient and flexible class of equivariant convolutional networks. We show that steerable CNNs achieve state of the art results on the CIFAR image classification benchmark. The mathematical theory of steerable representations reveals a type system in which any steerable representation is a composition of elementary feature types, each one associated with a particular kind of symmetry. We show how the parameter cost of a steerable filter bank depends on the types of the input and output features, and show how to use this knowledge to construct CNNs that utilize parameters effectively.
Tasks Image Classification
Published 2016-12-27
URL http://arxiv.org/abs/1612.08498v1
PDF http://arxiv.org/pdf/1612.08498v1.pdf
PWC https://paperswithcode.com/paper/steerable-cnns
Repo https://github.com/QUVA-Lab/e2cnn
Framework pytorch

Universal adversarial perturbations

Title Universal adversarial perturbations
Authors Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard
Abstract Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye. We further empirically analyze these universal perturbations and show, in particular, that they generalize very well across neural networks. The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers. It further outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images.
Tasks
Published 2016-10-26
URL http://arxiv.org/abs/1610.08401v3
PDF http://arxiv.org/pdf/1610.08401v3.pdf
PWC https://paperswithcode.com/paper/universal-adversarial-perturbations
Repo https://github.com/LTS4/universal
Framework tf
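
Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping

Title Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping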
Authors Wojciech Jaśkowski
Abstract 2048 is an engaging single-player, nondeterministic video puzzle game, which, thanks to the simple rules and hard-to-master gameplay, has gained massive popularity in recent years. As 2048 can be conveniently embedded into the discrete-state Markov decision processes framework, we treat it as a testbed for evaluating existing and new methods in reinforcement learning. With the aim to develop a strong 2048 playing program, we employ temporal difference learning with systematic n-tuple networks. We show that this basic method can be significantly improved with temporal coherence learning, multi-stage function approximator with weight promotion, carousel shaping, and redundant encoding. In addition, we demonstrate how to take advantage of the characteristics of the n-tuple network, to improve the algorithmic effectiveness of the learning process by i) delaying the (decayed) update and ii) applying lock-free optimistic parallelism to effortlessly take advantage of multiple CPU cores. This way, we were able to develop the best known 2048 playing program to date, which confirms the effectiveness of the introduced methods for discrete-state Markov decision problems.
Tasks
Published 2016-04-18
URL http://arxiv.org/abs/1604.05085v3
PDF http://arxiv.org/pdf/1604.05085v3.pdf
PWC https://paperswithcode.com/paper/mastering-2048-with-delayed-temporal
Repo https://github.com/aszczepanski/2048
Framework none
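
The baseline named in the abstract, temporal difference learning with an n-tuple network, can be sketched roughly as follows. This is plain TD(0) only; temporal coherence learning, weight promotion, carousel shaping, redundant encoding, the delayed update and the parallelism are not shown. The class name, the tuple shapes and the board encoding (a flat list of 16 tile exponents, 0 for an empty cell) are assumptions made for illustration.

```python
from collections import defaultdict


class NTupleNetwork:
    """Value function V(board) = sum of one lookup-table entry per tuple."""

    def __init__(self, tuples):
        # Each tuple is a sequence of cell indices (0..15) on the 4x4 board.
        self.tuples = [tuple(t) for t in tuples]
        # One sparse lookup table per tuple; unseen patterns have weight 0.0.
        self.tables = [defaultdict(float) for _ in self.tuples]

    def _keys(self, board):
        # The key for a tuple is the pattern of tile exponents at its cells.
        return [tuple(board[i] for i in cells) for cells in self.tuples]

    def value(self, board):
        return sum(table[key] for table, key in zip(self.tables, self._keys(board)))

    def td_update(self, board, reward, next_board, alpha, terminal=False):
        """One TD(0) step: move V(board) toward reward + V(next_board)."""
        target = reward + (0.0 if terminal else self.value(next_board))
        error = target - self.value(board)
        # Split the step equally across the tuples active for `board`.
        step = alpha * error / len(self.tuples)
        for table, key in zip(self.tables, self._keys(board)):
            table[key] += step
        return error


if __name__ == "__main__":
    rows = [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)]
    net = NTupleNetwork(rows)
    empty = [0] * 16
    net.td_update(empty, reward=4.0, next_board=empty, alpha=0.1)
    print(net.value(empty))
```

Because each tuple only touches its own table entry, an update costs one lookup per tuple regardless of how many board patterns have been seen.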

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text

Title Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Authors Subhashini Venugopalan, Lisa Anne Hendricks, Raymond Mooney, Kate Saenko
Abstract This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. We evaluate our approach on a collection of Youtube videos as well as two large movie description datasets showing significant improvements in grammaticality while modestly improving descriptive quality.
Tasks Language Modelling, Video Description
Published 2016-04-06
URL http://arxiv.org/abs/1604.01729v2
PDF http://arxiv.org/pdf/1604.01729v2.pdf
PWC https://paperswithcode.com/paper/improving-lstm-based-video-description-with
Repo https://github.com/beerzyp/ECAC-Bimodal_audiovisual_recognition
Framework none

Image-embodied Knowledge Representation Learning

Title Image-embodied Knowledge Representation Learning
Authors Ruobing Xie, Zhiyuan Liu, Huanbo Luan, Maosong Sun
Abstract Entity images could provide significant visual information for knowledge representation learning. Most conventional methods learn knowledge representations merely from structured triples, ignoring rich visual information extracted from entity images. In this paper, we propose a novel Image-embodied Knowledge Representation Learning model (IKRL), where knowledge representations are learned with both triple facts and images. More specifically, we first construct representations for all images of an entity with a neural image encoder. These image representations are then integrated into an aggregated image-based representation via an attention-based method. We evaluate our IKRL models on knowledge graph completion and triple classification. Experimental results demonstrate that our models outperform all baselines on both tasks, which indicates the significance of visual information for knowledge representations and the capability of our models in learning knowledge representations with images.
Tasks Knowledge Graph Completion, Representation Learning
Published 2016-09-22
URL http://arxiv.org/abs/1609.07028v2
PDF http://arxiv.org/pdf/1609.07028v2.pdf
PWC https://paperswithcode.com/paper/image-embodied-knowledge-representation
Repo https://github.com/thunlp/IKRL
Framework none
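
The attention-based aggregation of several image representations into one entity-level vector might look like the sketch below. It assumes dot-product scores against a query vector followed by a softmax; the abstract does not specify the scoring function, so both the scoring rule and the name `attention_aggregate` are hypothetical.

```python
import math


def attention_aggregate(image_vectors, query):
    """Softmax-weighted average of image vectors.

    Scores are dot products with `query` (an assumed choice of scoring rule).
    Returns the aggregated vector and the attention weights.
    """
    scores = [sum(a * b for a, b in zip(vec, query)) for vec in image_vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(query)
    aggregated = [
        sum(w * vec[j] for w, vec in zip(weights, image_vectors)) for j in range(dim)
    ]
    return aggregated, weights


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    vector, weights = attention_aggregate(images, query=[1.0, 0.0])
    print(vector, weights)
```

Images whose representation aligns with the query receive larger weights, so uninformative images contribute less to the aggregate.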

Geometric adaptive Monte Carlo in random environment

Title Geometric adaptive Monte Carlo in random environment
Authors Theodore Papamarkou, Alexey Lindo, Eric B. Ford
Abstract Manifold Markov chain Monte Carlo algorithms have been introduced to sample more effectively from challenging target densities exhibiting multiple modes or strong correlations. Such algorithms exploit the local geometry of the parameter space, thus enabling chains to achieve a faster convergence rate when measured in number of steps. However, acquiring local geometric information can often increase computational complexity per step to the extent that sampling from high-dimensional targets becomes inefficient in terms of total computational time. This paper analyzes the computational complexity of manifold Langevin Monte Carlo and proposes a geometric adaptive Monte Carlo sampler aimed at balancing the benefits of exploiting local geometry with computational cost to achieve a high effective sample size for a given computational cost. The suggested sampler is a discrete-time stochastic process in random environment. The random environment allows switching between local geometric and adaptive proposal kernels with the help of a schedule. An exponential schedule is put forward that enables more frequent use of geometric information in early transient phases of the chain, while saving computational time in late stationary phases. The average complexity can be manually set depending on the need for geometric exploitation posed by the underlying model.
Tasks
Published 2016-08-29
URL http://arxiv.org/abs/1608.07986v3
PDF http://arxiv.org/pdf/1608.07986v3.pdf
PWC https://paperswithcode.com/paper/geometric-adaptive-monte-carlo-in-random
Repo https://github.com/scidom/MAMALASampler.jl
Framework none

Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks

Title Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks
Authors Yanghao Li, Cuiling Lan, Junliang Xing, Wenjun Zeng, Chunfeng Yuan, Jiaying Liu
Abstract Human action recognition from well-segmented 3D skeleton data has been intensively studied and has been attracting increasing attention. Online action detection goes one step further and is more challenging: it identifies the action type and localizes the action positions on the fly from the untrimmed stream data. In this paper, we study the problem of online action detection from streaming skeleton data. We propose a multi-task end-to-end Joint Classification-Regression Recurrent Neural Network to better explore the action type and temporal localization information. By employing a joint classification and regression optimization objective, this network is capable of automatically localizing the start and end points of actions more accurately. Specifically, by leveraging the merits of the deep Long Short-Term Memory (LSTM) subnetwork, the proposed model automatically captures the complex long-range temporal dynamics, which naturally avoids the typical sliding window design and thus ensures high computational efficiency. Furthermore, the subtask of regression optimization provides the ability to forecast the action prior to its occurrence. To evaluate our proposed model, we build a large streaming video dataset with annotations. Experimental results on our dataset and the public G3D dataset both demonstrate very promising performance of our scheme.
Tasks Action Detection, Temporal Action Localization, Temporal Localization
Published 2016-04-19
URL http://arxiv.org/abs/1604.05633v2
PDF http://arxiv.org/pdf/1604.05633v2.pdf
PWC https://paperswithcode.com/paper/online-human-action-detection-using-joint
Repo https://github.com/seanmcgovern21/Machine-Learning-CS539
Framework pytorch

Long-term Temporal Convolutions for Action Recognition

Title Long-term Temporal Convolutions for Action Recognition
Authors Gül Varol, Ivan Laptev, Cordelia Schmid
Abstract Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames, failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields, and demonstrate the importance of high-quality optical flow estimation for learning accurate action models. We report state-of-the-art results on two challenging benchmarks for human action recognition: UCF101 (92.7%) and HMDB51 (67.2%).
Tasks Optical Flow Estimation, Temporal Action Localization
Published 2016-04-15
URL http://arxiv.org/abs/1604.04494v2
PDF http://arxiv.org/pdf/1604.04494v2.pdf
PWC https://paperswithcode.com/paper/long-term-temporal-convolutions-for-action
Repo https://github.com/gulvarol/ltc
Framework torch

Simple Online and Realtime Tracking

Title Simple Online and Realtime Tracking
Authors Alex Bewley, Zongyuan Ge, Lionel Ott, Fabio Ramos, Ben Upcroft
Abstract This paper explores a pragmatic approach to multiple object tracking where the main focus is to associate objects efficiently for online and realtime applications. To this end, detection quality is identified as a key factor influencing tracking performance, where changing the detector can improve tracking by up to 18.9%. Despite only using a rudimentary combination of familiar techniques such as the Kalman Filter and Hungarian algorithm for the tracking components, this approach achieves an accuracy comparable to state-of-the-art online trackers. Furthermore, due to the simplicity of our tracking method, the tracker updates at a rate of 260 Hz which is over 20x faster than other state-of-the-art trackers.
Tasks Multiple Object Tracking, Object Tracking
Published 2016-02-02
URL http://arxiv.org/abs/1602.00763v2
PDF http://arxiv.org/pdf/1602.00763v2.pdf
PWC https://paperswithcode.com/paper/simple-online-and-realtime-tracking
Repo https://github.com/cfotache/pytorch_objectdetecttrack
Framework pytorch
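
The association step described in the abstract, matching predicted track boxes to new detections, can be sketched as below. The sketch assumes intersection over union (IoU) as the matching score and a hypothetical `min_iou` threshold. An exhaustive search over assignments stands in for the Hungarian algorithm here: it also finds an optimal assignment, but is only practical for a handful of boxes. The Kalman filter prediction step is not shown.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Match track boxes to detection boxes by maximizing the total IoU.

    Brute force over assignments (a stand-in for the Hungarian algorithm).
    Returns a list of (track_index, detection_index) pairs.
    """
    n_t, n_d = len(tracks), len(detections)
    if n_t == 0 or n_d == 0:
        return []
    if n_t <= n_d:
        candidates = (list(zip(range(n_t), p)) for p in permutations(range(n_d), n_t))
    else:
        candidates = (list(zip(p, range(n_d))) for p in permutations(range(n_t), n_d))
    best_pairs, best_score = [], -1.0
    for pairs in candidates:
        score = sum(iou(tracks[t], detections[d]) for t, d in pairs)
        if score > best_score:
            best_pairs, best_score = pairs, score
    # Reject matches whose overlap is too small to be the same object.
    return [(t, d) for t, d in best_pairs if iou(tracks[t], detections[d]) >= min_iou]


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
    print(associate(tracks, detections))
```

Detections left unmatched would typically start new tracks, and tracks left unmatched would be candidates for deletion; that bookkeeping is omitted.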

Theano: A Python framework for fast computation of mathematical expressions

Title Theano: A Python framework for fast computation of mathematical expressions
Authors The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang
Abstract Theano is a Python library that allows one to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
Tasks Dimensionality Reduction
Published 2016-05-09
URL http://arxiv.org/abs/1605.02688v1
PDF http://arxiv.org/pdf/1605.02688v1.pdf
PWC https://paperswithcode.com/paper/theano-a-python-framework-for-fast
Repo https://github.com/leonoverweel/bibtex-python-package-citations
Framework tf

Low-rank passthrough neural networks

Title Low-rank passthrough neural networks
Authors Antonio Valerio Miceli Barone
Abstract Various common deep learning architectures, such as LSTMs, GRUs, Resnets and Highway Networks, employ state passthrough connections that support training with high feed-forward depth or recurrence over many time steps. These “Passthrough Networks” architectures also enable the decoupling of the network state size from the number of parameters of the network, a possibility that has been studied by Sak et al. (2014) with their low-rank parametrization of the LSTM. In this work we extend this line of research, proposing effective, low-rank and low-rank plus diagonal matrix parametrizations for Passthrough Networks which exploit this decoupling property, reducing the data complexity and memory requirements of the network while preserving its memory capacity. This is particularly beneficial in low-resource settings as it supports expressive models with a compact parametrization less susceptible to overfitting. We present competitive experimental results on several tasks, including language modeling and a near state of the art result on sequential randomly-permuted MNIST classification, a hard task on natural data.
Tasks Language Modelling
Published 2016-03-10
URL http://arxiv.org/abs/1603.03116v3
PDF http://arxiv.org/pdf/1603.03116v3.pdf
PWC https://paperswithcode.com/paper/low-rank-passthrough-neural-networks
Repo https://github.com/Avmb/lowrank-lstm
Framework torch
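
The decoupling of state size from parameter count can be made concrete with a small sketch. It assumes a single square n x n weight matrix replaced by U V + diag(d), with U of shape n x r and V of shape r x n; how the paper applies this inside particular gates or layers is not reproduced, and the function names are hypothetical.

```python
def param_counts(n, r):
    """Parameters needed for one n x n matrix under each parametrization."""
    return {
        "full": n * n,
        "low_rank": 2 * n * r,               # U (n x r) and V (r x n)
        "low_rank_plus_diag": 2 * n * r + n,  # plus the diagonal d
    }


def low_rank_plus_diag_matvec(U, V, d, x):
    """Compute (U V + diag(d)) x without forming the n x n matrix."""
    # First project x down to r dimensions: vx = V x.
    vx = [sum(v_kj * x_j for v_kj, x_j in zip(row, x)) for row in V]
    # Then expand back to n dimensions and add the diagonal term.
    return [
        sum(u_ik * vx_k for u_ik, vx_k in zip(U[i], vx)) + d[i] * x[i]
        for i in range(len(x))
    ]


if __name__ == "__main__":
    print(param_counts(n=256, r=16))
    U = [[1.0], [2.0]]
    V = [[3.0, 4.0]]
    d = [5.0, 6.0]
    print(low_rank_plus_diag_matvec(U, V, d, [1.0, 1.0]))
```

With the state size n fixed, the rank r becomes the knob that controls the number of parameters, which is the decoupling property the abstract refers to.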