Paper Group ANR 468
Online Monotone Optimization
Title | Online Monotone Optimization |
Authors | Ian Gemp, Sridhar Mahadevan |
Abstract | This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems. The proposed framework generalizes the popular online convex optimization framework and extends it to its natural limit, allowing it to capture a notion of regret that is intuitive for more general problems such as those encountered in game theory and variational inequalities. The framework hinges on a special choice of a system-wide loss function we have developed. Using this framework, we prove that a simple update scheme provides a no-regret algorithm for monotone systems. While previous results in game theory prove that individual agents can enjoy unilateral no-regret guarantees, our result proves that monotonicity is sufficient for guaranteeing no-regret when the strategies of multiple agents are adjusted in parallel. Furthermore, to our knowledge, this is the first framework to provide a suitable notion of regret for variational inequalities. Most importantly, our proposed framework establishes monotonicity as a sufficient condition for employing multiple online learners safely in parallel. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.07888v1 |
http://arxiv.org/pdf/1608.07888v1.pdf | |
PWC | https://paperswithcode.com/paper/online-monotone-optimization |
Repo | |
Framework | |
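The abstract's "simple update scheme" for monotone systems is not spelled out in this listing; a minimal sketch of a standard projected-gradient-style update for a monotone operator `F` (a common scheme for variational inequalities, offered as an illustrative assumption rather than the paper's exact algorithm) might look like this:

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto an L2 ball (the feasible set here is an assumption)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def monotone_update(F, x0, steps=100, eta=0.1):
    """Iterate x_{t+1} = Proj(x_t - eta * F(x_t)) for a monotone operator F.

    A generic projection method for variational inequalities, not
    necessarily the update analyzed in the paper.
    """
    x = x0
    for _ in range(steps):
        x = project_ball(x - eta * F(x))
    return x

# Example: F is the gradient of a convex quadratic, hence monotone.
F = lambda x: 2.0 * x
print(monotone_update(F, np.array([0.9, -0.4])))  # converges toward the solution 0
```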
A Fast Factorization-based Approach to Robust PCA
Title | A Fast Factorization-based Approach to Robust PCA |
Authors | Chong Peng, Zhao Kang, Qiang Chen |
Abstract | Robust principal component analysis (RPCA) has been widely used for recovering low-rank matrices in many data mining and machine learning problems. It separates a data matrix into a low-rank part and a sparse part. The convex approach has been well studied in the literature. However, state-of-the-art algorithms for the convex approach usually have relatively high complexity due to the need of solving (partial) singular value decompositions of large matrices. A non-convex approach, AltProj, has also been proposed with lighter complexity and better scalability. Given the true rank $r$ of the underlying low rank matrix, AltProj has a complexity of $O(r^2dn)$, where $d\times n$ is the size of the data matrix. In this paper, we propose a novel factorization-based model of RPCA, which has a complexity of $O(kdn)$, where $k$ is an upper bound of the true rank. Our method does not need the precise value of the true rank. From extensive experiments, we observe that AltProj can work only when $r$ is precisely known in advance; when the rank parameter $r$ is set to a value different from the true rank, AltProj cannot fully separate the two parts, while our method succeeds. Even when both work, our method is about 4 times faster than AltProj. Our method can be used as a lightweight, scalable tool for RPCA in the absence of the precise value of the true rank. |
Tasks | |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08677v1 |
http://arxiv.org/pdf/1609.08677v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fast-factorization-based-approach-to-robust |
Repo | |
Framework | |
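The factorization idea (low-rank part written as `U @ V` with `k` columns, sparse part obtained by soft-thresholding) can be sketched as below. This is a generic alternating scheme under those assumptions, not the paper's exact algorithm; `lam` is a hypothetical sparsity weight.

```python
import numpy as np

def soft_threshold(X, tau):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def factored_rpca(M, k, lam=0.1, iters=50):
    """Split M into a rank-<=k part U @ V and a sparse part S.

    Alternates least-squares updates of the factors (U, V) with
    soft-thresholding of the residual to get S. k only upper-bounds the
    true rank, matching the abstract's O(kdn) per-iteration cost.
    """
    d, n = M.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((d, k))
    V = rng.standard_normal((k, n))
    S = np.zeros_like(M)
    for _ in range(iters):
        R = M - S
        U = R @ np.linalg.pinv(V)        # least-squares update of U
        V = np.linalg.pinv(U) @ R        # least-squares update of V
        S = soft_threshold(M - U @ V, lam)
    return U @ V, S
```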
Most Likely Separation of Intensity and Warping Effects in Image Registration
Title | Most Likely Separation of Intensity and Warping Effects in Image Registration |
Authors | Line Kühnel, Stefan Sommer, Akshay Pai, Lars Lau Raket |
Abstract | This paper introduces a class of mixed-effects models for joint modeling of spatially correlated intensity variation and warping variation in 2D images. Spatially correlated intensity variation and warp variation are modeled as random effects, resulting in a nonlinear mixed-effects model that enables simultaneous estimation of template and model parameters by optimization of the likelihood function. We propose an algorithm for fitting the model which alternates estimation of variance parameters and image registration. This approach avoids the potential estimation bias in the template estimate that arises when treating registration as a preprocessing step. We apply the model to datasets of facial images and 2D brain magnetic resonance images to illustrate the simultaneous estimation and prediction of intensity and warp effects. |
Tasks | Image Registration |
Published | 2016-04-18 |
URL | http://arxiv.org/abs/1604.05027v2 |
http://arxiv.org/pdf/1604.05027v2.pdf | |
PWC | https://paperswithcode.com/paper/most-likely-separation-of-intensity-and |
Repo | |
Framework | |
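The alternating fitting scheme (registration step, then variance and template estimation) can be illustrated on toy 1-D signals with integer-shift warps. This is only an analogue of the alternation described in the abstract, not the actual nonlinear mixed-effects likelihood optimization.

```python
import numpy as np

def register_shift(signal, template, max_shift=5):
    """Estimate the integer shift aligning a signal to the template
    (a toy stand-in for the registration step; real warps are nonlinear 2-D)."""
    shifts = np.arange(-max_shift, max_shift + 1)
    corr = [np.dot(np.roll(signal, -s), template) for s in shifts]
    return shifts[int(np.argmax(corr))]

def fit_template(signals, iters=5):
    """Alternate registration with variance/template estimation, so the
    template is estimated jointly rather than after a fixed preprocessing
    registration (the bias the abstract warns about)."""
    template = signals.mean(axis=0)
    for _ in range(iters):
        shifts = [register_shift(s, template) for s in signals]
        aligned = np.array([np.roll(s, -sh) for s, sh in zip(signals, shifts)])
        sigma2 = aligned.var(axis=0).mean()   # intensity-variance estimate
        template = aligned.mean(axis=0)       # template update
    return template, sigma2
```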
Stochastic Heavy Ball
Title | Stochastic Heavy Ball |
Authors | Sébastien Gadat, Fabien Panloup, Sofiane Saadane |
Abstract | This paper deals with a natural stochastic optimization procedure derived from the so-called heavy-ball differential equation, introduced by Polyak in the 1960s with his seminal contribution [Pol64]. The heavy-ball method is a second-order dynamics investigated to minimize convex functions $f$. The family of second-order methods has received a large amount of attention, notably since the famous contribution of Nesterov [Nes83], which accompanied the explosion of large-scale optimization problems. This work provides an in-depth description of the stochastic heavy-ball method, an adaptation of the deterministic one to the setting where only unbiased evaluations of the gradient are available and used throughout the iterations of the algorithm. We first describe some almost sure convergence results in the case of general non-convex coercive functions $f$. We then examine the situation of convex and strongly convex potentials and derive some non-asymptotic results about the stochastic heavy-ball method. We end our study with limit theorems on several rescaled algorithms. |
Tasks | Stochastic Optimization |
Published | 2016-09-14 |
URL | http://arxiv.org/abs/1609.04228v2 |
http://arxiv.org/pdf/1609.04228v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-heavy-ball |
Repo | |
Framework | |
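The heavy-ball recursion with a noisy gradient oracle is simple to state; a minimal sketch with constant momentum and step size follows (the paper studies more general, typically decreasing, parameter sequences, so this is illustrative only).

```python
import numpy as np

def stochastic_heavy_ball(grad, x0, steps=1000, alpha=0.05, beta=0.9, seed=0):
    """x_{k+1} = x_k + beta * (x_k - x_{k-1}) - alpha * g_k,
    where g_k is an unbiased noisy evaluation of the gradient."""
    rng = np.random.default_rng(seed)
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        g = grad(x) + 0.1 * rng.standard_normal(x.shape)  # unbiased gradient noise
        x, x_prev = x + beta * (x - x_prev) - alpha * g, x
    return x

# Minimize f(x) = ||x||^2 / 2 from a noisy gradient oracle.
print(stochastic_heavy_ball(lambda x: x, np.array([3.0, -2.0])))
```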
An Attentive Neural Architecture for Fine-grained Entity Type Classification
Title | An Attentive Neural Architecture for Fine-grained Entity Type Classification |
Authors | Sonse Shimaoka, Pontus Stenetorp, Kentaro Inui, Sebastian Riedel |
Abstract | In this work we propose a novel attention-based neural network model for the task of fine-grained entity type classification that, unlike previously proposed models, recursively composes representations of entity mention contexts. Our model achieves state-of-the-art performance with 74.94% loose micro F1-score on the well-established FIGER dataset, a relative improvement of 2.59%. We also investigate the behavior of the attention mechanism of our model and observe that it can learn contextual linguistic expressions that indicate the fine-grained category memberships of an entity. |
Tasks | |
Published | 2016-04-19 |
URL | http://arxiv.org/abs/1604.05525v1 |
http://arxiv.org/pdf/1604.05525v1.pdf | |
PWC | https://paperswithcode.com/paper/an-attentive-neural-architecture-for-fine |
Repo | |
Framework | |
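The core attention operation (score each context token, softmax the scores, take the weighted sum) can be sketched generically as below. This is a single-layer attention sketch of the idea, not the paper's exact recursive/bi-LSTM architecture; `w` is a hypothetical learned scoring vector.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attentive_context(context_vecs, w):
    """Return an attention-weighted context representation and the weights.

    context_vecs: (num_tokens, dim) embeddings of the mention's context;
    w: learned scoring vector (an assumption for this sketch).
    """
    scores = context_vecs @ w              # one scalar score per token
    alphas = softmax(scores)               # attention weights over tokens
    return alphas @ context_vecs, alphas   # weighted sum feeds the classifier

rng = np.random.default_rng(0)
ctx = rng.standard_normal((6, 4))          # 6 context tokens, 4-dim embeddings
rep, alphas = attentive_context(ctx, rng.standard_normal(4))
```

Inspecting `alphas` is what lets one check which contextual expressions the model attends to, as the abstract describes.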
Towards Information-Seeking Agents
Title | Towards Information-Seeking Agents |
Authors | Philip Bachman, Alessandro Sordoni, Adam Trischler |
Abstract | We develop a general problem setting for training and testing the ability of agents to gather information efficiently. Specifically, we present a collection of tasks in which success requires searching through a partially observed environment for fragments of information that can be pieced together to accomplish various goals. We combine deep architectures with techniques from reinforcement learning to develop agents that solve our tasks. We shape the behavior of these agents by combining extrinsic and intrinsic rewards. We empirically demonstrate that these agents learn to search actively and intelligently for new information to reduce their uncertainty, and to exploit information they have already acquired. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02605v1 |
http://arxiv.org/pdf/1612.02605v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-information-seeking-agents |
Repo | |
Framework | |
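The reward-shaping idea (extrinsic task reward plus an intrinsic bonus for reducing uncertainty) can be written down in one line. The paper's exact intrinsic term is not specified in this listing; the information-gain form below is a common choice and is offered only as an assumption, with `beta` a hypothetical mixing weight.

```python
def shaped_reward(extrinsic, uncertainty_before, uncertainty_after, beta=0.1):
    """Combine the extrinsic task reward with an intrinsic bonus equal to the
    agent's reduction in uncertainty (e.g., entropy of its belief state)."""
    intrinsic = uncertainty_before - uncertainty_after
    return extrinsic + beta * intrinsic

# An agent that uncovers information (uncertainty drops 2.0 -> 1.2) gets a bonus
# even before the extrinsic goal is reached.
print(shaped_reward(extrinsic=0.0, uncertainty_before=2.0, uncertainty_after=1.2))
```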
Egocentric Activity Recognition with Multimodal Fisher Vector
Title | Egocentric Activity Recognition with Multimodal Fisher Vector |
Authors | Sibo Song, Ngai-Man Cheung, Vijay Chandrasekhar, Bappaditya Mandal, Jie Lin |
Abstract | With the increasing availability of wearable devices, research on egocentric activity recognition has received much attention recently. In this paper, we build a Multimodal Egocentric Activity dataset which includes egocentric videos and sensor data of 20 fine-grained and diverse activity categories. We present a novel strategy to extract temporal trajectory-like features from sensor data. We propose to apply the Fisher Kernel framework to fuse video and temporally enhanced sensor features. Experimental results show that, with careful design of the feature extraction and fusion algorithms, sensor data can enhance information-rich video data. We make the Multimodal Egocentric Activity dataset publicly available to facilitate future research. |
Tasks | Activity Recognition, Egocentric Activity Recognition |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06603v1 |
http://arxiv.org/pdf/1601.06603v1.pdf | |
PWC | https://paperswithcode.com/paper/egocentric-activity-recognition-with |
Repo | |
Framework | |
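A simplified Fisher vector (gradients of a GMM log-likelihood with respect to the means, with the improved-FV normalizations omitted) can be computed per modality and the resulting vectors concatenated for fusion. A minimal sketch using scikit-learn, offered under those simplifying assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(X, gmm):
    """Mean-gradient Fisher vector for descriptors X under a diagonal GMM.

    Concatenating the FVs of video descriptors and sensor descriptors gives
    one multimodal representation per clip (the fusion idea in the abstract).
    """
    Q = gmm.predict_proba(X)                            # soft assignments (N, K)
    fv = []
    for k in range(gmm.n_components):
        diff = (X - gmm.means_[k]) / np.sqrt(gmm.covariances_[k])
        grad = (Q[:, k:k + 1] * diff).sum(axis=0)
        fv.append(grad / (X.shape[0] * np.sqrt(gmm.weights_[k])))
    return np.concatenate(fv)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))                       # toy local descriptors
gmm = GaussianMixture(n_components=4, covariance_type="diag",
                      random_state=0).fit(X)
print(fisher_vector(X, gmm).shape)                      # (4 * 8,)
```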
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
Title | Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning |
Authors | Mehdi Sajjadi, Mehran Javanmardi, Tolga Tasdizen |
Abstract | Effective convolutional neural networks are trained on large sets of labeled data. However, creating large labeled datasets is a very costly and time-consuming task. Semi-supervised learning uses unlabeled data to train a model with higher accuracy when there is a limited set of labeled data available. In this paper, we consider the problem of semi-supervised learning with convolutional neural networks. Techniques such as randomized data augmentation, dropout and random max-pooling provide better generalization and stability for classifiers that are trained using gradient descent. Multiple passes of an individual sample through the network might lead to different predictions due to the non-deterministic behavior of these techniques. We propose an unsupervised loss function that takes advantage of the stochastic nature of these methods and minimizes the difference between the predictions of multiple passes of a training sample through the network. We evaluate the proposed method on several benchmark datasets. |
Tasks | Data Augmentation, Semi-Supervised Image Classification |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04586v1 |
http://arxiv.org/pdf/1606.04586v1.pdf | |
PWC | https://paperswithcode.com/paper/regularization-with-stochastic |
Repo | |
Framework | |
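The unsupervised loss is easy to sketch: pass the same unlabeled sample through the stochastic network several times (random augmentation, dropout, random pooling active) and penalize pairwise disagreement between the predictions. A minimal PyTorch sketch, where `augment` stands for any stochastic transform (an assumption of this sketch):

```python
import torch
import torch.nn.functional as F

def transformation_stability_loss(model, x, augment, n_passes=3):
    """Mean squared pairwise difference between predictions of multiple
    stochastic passes of the same unlabeled batch x.

    The model should be in train mode so dropout/random pooling stay active.
    """
    preds = [F.softmax(model(augment(x)), dim=1) for _ in range(n_passes)]
    loss = 0.0
    for i in range(n_passes):
        for j in range(i + 1, n_passes):
            loss = loss + ((preds[i] - preds[j]) ** 2).sum(dim=1).mean()
    return loss / (n_passes * (n_passes - 1) / 2)
```

In training, this term would be added to the usual supervised cross-entropy on the labeled subset.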
Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets
Title | Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets |
Authors | Manuel Amthor, Erik Rodner, Joachim Denzler |
Abstract | We propose Impatient Deep Neural Networks (DNNs) which deal with dynamic time budgets during application. They allow for individual budgets given a priori for each test example and for anytime prediction, i.e., a possible interruption at multiple stages during inference while still providing output estimates. Our approach can therefore tackle the computational costs and energy demands of DNNs in an adaptive manner, a property essential for real-time applications. Our Impatient DNNs are based on a new general framework of learning dynamic budget predictors using risk minimization, which can be applied to current DNN architectures by adding early prediction and additional loss layers. A key aspect of our method is that all of the intermediate predictors are learned jointly. In experiments, we evaluate our approach for different budget distributions, architectures, and datasets. Our results show a significant gain in expected accuracy compared to common baselines. |
Tasks | |
Published | 2016-10-10 |
URL | http://arxiv.org/abs/1610.02850v1 |
http://arxiv.org/pdf/1610.02850v1.pdf | |
PWC | https://paperswithcode.com/paper/impatient-dnns-deep-neural-networks-with |
Repo | |
Framework | |
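The anytime-prediction structure (an early prediction head after each stage, all heads trained jointly) can be sketched generically in PyTorch. This is an illustrative early-exit network under simple assumptions (linear stages, a stage-count `budget`), not the paper's exact architecture or budget-predictor framework.

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Backbone with a prediction head per stage, so inference can be
    interrupted at any stage and still return class scores."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stages = nn.ModuleList(
            [nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(3)])
        self.exits = nn.ModuleList(
            [nn.Linear(32, num_classes) for _ in range(3)])

    def forward(self, x, budget=None):
        outputs = []
        for i, (stage, head) in enumerate(zip(self.stages, self.exits)):
            x = stage(x)
            outputs.append(head(x))
            if budget is not None and i + 1 >= budget:
                break                     # time budget exhausted: stop early
        return outputs

# Joint training: sum the losses of all intermediate predictors.
net = EarlyExitNet()
logits = net(torch.randn(4, 32))
targets = torch.tensor([1, 2, 3, 0])
loss = sum(nn.functional.cross_entropy(o, targets) for o in logits)
```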
Adversarial Ladder Networks
Title | Adversarial Ladder Networks |
Authors | Juan Maroñas Molano, Alberto Albiol Colomer, Roberto Paredes Palacios |
Abstract | The use of unsupervised data in addition to supervised data in training discriminative neural networks has improved the performance of this classification scheme. However, the best results were achieved with a training process divided into two parts: first, an unsupervised pre-training step initializes the weights of the network, and afterwards these weights are refined using supervised data. On the other hand, adversarial noise has improved the results of classical supervised learning. Recently, a new neural network topology called the Ladder Network, whose key idea is based on properties of hierarchical latent variable models, has been proposed as a technique to train a neural network using supervised and unsupervised data at the same time, in what is called semi-supervised learning. This technique has reached state-of-the-art classification performance. In this work we add adversarial noise to the Ladder Network and obtain state-of-the-art classification, along with several important conclusions on how adversarial noise can help and new possible lines of investigation. We also propose an alternative way to add adversarial noise to unsupervised data. |
Tasks | Latent Variable Models |
Published | 2016-11-07 |
URL | http://arxiv.org/abs/1611.02320v3 |
http://arxiv.org/pdf/1611.02320v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-ladder-networks |
Repo | |
Framework | |
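One common way to generate adversarial noise is the fast gradient sign method (FGSM); a minimal PyTorch sketch follows. The paper's exact noise scheme, and its proposed variant for unsupervised data, may differ from this.

```python
import torch

def fgsm_perturb(model, x, target, loss_fn, eps=0.03):
    """Return x plus an eps-scaled sign-of-gradient adversarial perturbation."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), target)
    loss.backward()                               # gradient of the loss w.r.t. x
    return (x + eps * x.grad.sign()).detach()     # adversarially perturbed input
```

In a ladder-network setting, such perturbed inputs would be fed through the noisy encoder path alongside the usual Gaussian corruption.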
Grounding Recursive Aggregates: Preliminary Report
Title | Grounding Recursive Aggregates: Preliminary Report |
Authors | Martin Gebser, Roland Kaminski, Torsten Schaub |
Abstract | Problem solving in Answer Set Programming consists of two steps, a first grounding phase, systematically replacing all variables by terms, and a second solving phase computing the stable models of the obtained ground program. An intricate part of both phases is the treatment of aggregates, which are popular language constructs that allow for expressing properties over sets. In this paper, we elaborate upon the treatment of aggregates during grounding in Gringo series 4. Consequently, our approach is applicable to grounding based on semi-naive database evaluation techniques. In particular, we provide a series of algorithms detailing the treatment of recursive aggregates and illustrate this by a running example. |
Tasks | |
Published | 2016-03-12 |
URL | http://arxiv.org/abs/1603.03884v1 |
http://arxiv.org/pdf/1603.03884v1.pdf | |
PWC | https://paperswithcode.com/paper/grounding-recursive-aggregates-preliminary |
Repo | |
Framework | |
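The semi-naive evaluation technique that Gringo's grounding builds on can be illustrated on the classic transitive-closure example: each round joins only the newly derived facts (the "delta") against the base relation, instead of recomputing everything. Aggregate handling, the paper's actual subject, is omitted from this sketch.

```python
def semi_naive_tc(edges):
    """Semi-naive bottom-up evaluation of transitive closure over edge pairs."""
    total = set(edges)
    delta = set(edges)                 # facts derived in the latest round
    while delta:
        new = {(a, c) for (a, b) in delta
                      for (b2, c) in edges if b == b2}
        delta = new - total            # keep only genuinely new facts
        total |= delta
    return total

print(sorted(semi_naive_tc({(1, 2), (2, 3), (3, 4)})))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```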
Generalizing to Unseen Entities and Entity Pairs with Row-less Universal Schema
Title | Generalizing to Unseen Entities and Entity Pairs with Row-less Universal Schema |
Authors | Patrick Verga, Arvind Neelakantan, Andrew McCallum |
Abstract | Universal schema predicts the types of entities and relations in a knowledge base (KB) by jointly embedding the union of all available schema types—not only types from multiple structured databases (such as Freebase or Wikipedia infoboxes), but also types expressed as textual patterns from raw text. This prediction is typically modeled as a matrix completion problem, with one type per column, and either one or two entities per row (in the case of entity types or binary relation types, respectively). Factorizing this sparsely observed matrix yields a learned vector embedding for each row and each column. In this paper we explore the problem of making predictions for entities or entity-pairs unseen at training time (and hence without a pre-learned row embedding). We propose an approach having no per-row parameters at all; rather we produce a row vector on the fly using a learned aggregation function of the vectors of the observed columns for that row. We experiment with various aggregation functions, including neural network attention models. Our approach can be understood as a natural language database, in that questions about KB entities are answered by attending to textual or database evidence. In experiments predicting both relations and entity types, we demonstrate that despite having an order of magnitude fewer parameters than traditional universal schema, we can match the accuracy of the traditional model, and more importantly, we can now make predictions about unseen rows with nearly the same accuracy as rows available at training time. |
Tasks | Matrix Completion |
Published | 2016-06-18 |
URL | http://arxiv.org/abs/1606.05804v2 |
http://arxiv.org/pdf/1606.05804v2.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-to-unseen-entities-and-entity |
Repo | |
Framework | |
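The row-less idea (no per-row parameters; build the row vector on the fly by aggregating the embeddings of the row's observed columns) admits a compact sketch. Mean pooling and a simple attention variant are shown; the paper compares several such aggregation functions, and `w` here is a hypothetical attention parameter.

```python
import numpy as np

def row_vector(observed_column_vecs, mode="mean", w=None):
    """Aggregate observed column embeddings into a row (entity-pair) vector."""
    C = np.asarray(observed_column_vecs)
    if mode == "mean":
        return C.mean(axis=0)
    scores = C @ w                         # attention score per observed column
    alphas = np.exp(scores - scores.max())
    alphas /= alphas.sum()
    return alphas @ C                      # attention-weighted aggregation

# Score a candidate relation type against an unseen row built on the fly.
rng = np.random.default_rng(0)
cols = rng.standard_normal((5, 16))        # embeddings of 5 observed columns
query_type = rng.standard_normal(16)       # embedding of the queried type
print(row_vector(cols) @ query_type)
```

Because the row vector is a function of column embeddings, predictions for rows unseen at training time need no retraining, which is the point of the abstract's final claim.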
Does V-NIR based Image Enhancement Come with Better Features?
Title | Does V-NIR based Image Enhancement Come with Better Features? |
Authors | Vivek Sharma, Luc Van Gool |
Abstract | Image enhancement using the visible (V) and near-infrared (NIR) bands usually enhances useful image details. The enhanced images are typically evaluated by observers' perception rather than by quantitative feature evaluation. Thus, can we say that images enhanced using NIR information have better features than those computed directly in the red, green, and blue color channels? In this work, we present a new method to enhance visible images using NIR information via edge-preserving filters, and also investigate which method performs best from an image-features standpoint. We then show that our proposed enhancement method produces more stable features than existing state-of-the-art methods. |
Tasks | Image Enhancement |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06521v2 |
http://arxiv.org/pdf/1608.06521v2.pdf | |
PWC | https://paperswithcode.com/paper/does-v-nir-based-image-enhancement-come-with |
Repo | |
Framework | |
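A generic edge-preserving V-NIR fusion can be sketched with a bilateral base/detail decomposition: smooth the NIR channel, treat the residual as detail, and add it to the visible image. This is a common fusion pattern offered as an assumption, not the paper's exact filter pipeline.

```python
import cv2
import numpy as np

def nir_enhance(visible, nir, d=9, sigma_color=75, sigma_space=75):
    """Add NIR high-frequency detail to a visible image.

    visible: (H, W, 3) uint8 RGB image; nir: (H, W) NIR channel.
    """
    nir = nir.astype(np.float32)
    base = cv2.bilateralFilter(nir, d, sigma_color, sigma_space)
    detail = nir - base                                  # edge-preserved NIR detail
    out = visible.astype(np.float32) + detail[..., None]  # broadcast over channels
    return np.clip(out, 0, 255).astype(np.uint8)
```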
Learning Diverse Image Colorization
Title | Learning Diverse Image Colorization |
Authors | Aditya Deshpande, Jiajun Lu, Mao-Chuang Yeh, Min Jin Chong, David Forsyth |
Abstract | Colorization is an ambiguous problem, with multiple viable colorizations for a single grey-level image. However, previous methods only produce the single most probable colorization. Our goal is to model the diversity intrinsic to the problem of colorization and produce multiple colorizations that display long-scale spatial coordination. We learn a low dimensional embedding of color fields using a variational autoencoder (VAE). We construct loss terms for the VAE decoder that avoid blurry outputs and take into account the uneven distribution of pixel colors. Finally, we build a conditional model for the multi-modal distribution between the grey-level image and the color field embeddings. Samples from this conditional model result in diverse colorizations. We demonstrate that our method obtains better diverse colorizations than a standard conditional variational autoencoder (CVAE) model, as well as a recently proposed conditional generative adversarial network (cGAN). |
Tasks | Colorization |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01958v2 |
http://arxiv.org/pdf/1612.01958v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-diverse-image-colorization |
Repo | |
Framework | |
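The sampling pipeline (draw several embeddings from the conditional model, decode each into a color field) can be sketched in PyTorch. The decoder and the Gaussian stand-in for the conditional model below are toy assumptions, not the paper's trained networks.

```python
import torch
import torch.nn as nn

class ColorDecoder(nn.Module):
    """Toy VAE decoder: low-dimensional embedding z -> ab color field."""
    def __init__(self, z_dim=8, hw=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * hw * hw))
        self.hw = hw

    def forward(self, z):
        return self.net(z).view(-1, 2, self.hw, self.hw)  # ab channels

decoder = ColorDecoder()
# Stand-in for the conditional model p(z | grey image): a diagonal Gaussian.
mu, sigma = torch.zeros(8), torch.ones(8)
zs = mu + sigma * torch.randn(5, 8)      # 5 samples -> 5 diverse colorizations
colorizations = decoder(zs)              # (5, 2, 32, 32) ab fields
```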
Smoothing Effects of Bagging: Von Mises Expansions of Bagged Statistical Functionals
Title | Smoothing Effects of Bagging: Von Mises Expansions of Bagged Statistical Functionals |
Authors | Andreas Buja, Werner Stuetzle |
Abstract | Bagging is a device intended for reducing the prediction error of learning algorithms. In its simplest form, bagging draws bootstrap samples from the training sample, applies the learning algorithm to each bootstrap sample, and then averages the resulting prediction rules. We extend the definition of bagging from statistics to statistical functionals and study the von Mises expansion of bagged statistical functionals. We show that the expansion is related to the Efron-Stein ANOVA expansion of the raw (unbagged) functional. The basic observation is that a bagged functional is always smooth in the sense that its von Mises expansion exists and is finite, of length $1 + M$, where $M$ is the resample size. This holds even if the raw functional is rough or unstable. The resample size $M$ acts as a smoothing parameter, where a smaller $M$ means more smoothing. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02528v1 |
http://arxiv.org/pdf/1612.02528v1.pdf | |
PWC | https://paperswithcode.com/paper/smoothing-effects-of-bagging-von-mises |
Repo | |
Framework | |
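Bagging a statistical functional in its simplest form is a two-line computation: average the statistic over bootstrap resamples of size `M`. A minimal sketch, which also makes the abstract's smoothing-parameter reading of `M` concrete (smaller `M`, more smoothing):

```python
import numpy as np

def bagged_functional(sample, stat=np.median, M=50, B=2000, seed=0):
    """Average a (possibly rough) statistic over B bootstrap resamples of size M."""
    rng = np.random.default_rng(seed)
    draws = rng.choice(sample, size=(B, M), replace=True)  # B resamples of size M
    return stat(draws, axis=1).mean()

x = np.random.default_rng(1).standard_normal(100)
# The raw median vs. its bagged versions at two smoothing levels.
print(np.median(x), bagged_functional(x, M=20), bagged_functional(x, M=80))
```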