July 26, 2019

Paper Group ANR 786

Optimism-Based Adaptive Regulation of Linear-Quadratic Systems


Title	Optimism-Based Adaptive Regulation of Linear-Quadratic Systems
Authors	Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis
Abstract	The main challenge for adaptive regulation of linear-quadratic systems is the trade-off between identification and control. An adaptive policy needs to address both the estimation of unknown dynamics parameters (exploration) and the regulation of the underlying system (exploitation). To this end, optimism-based methods, which bias the identification in favor of optimistic approximations of the true parameter, are employed in the literature. A number of asymptotic results have been established, but their finite-time counterparts are few and come with important restrictions. This study establishes results for the worst-case regret of optimism-based adaptive policies. The presented high-probability upper bounds are optimal up to logarithmic factors. The non-asymptotic analysis of this work requires only mild assumptions: (i) stabilizability of the system’s dynamics, and (ii) a limit on the degree of heaviness of the noise distribution. To establish such bounds, certain novel techniques are developed to comprehensively address the probabilistic behavior of dependent random matrices with heavy-tailed distributions.
Tasks
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07230v3
PDF	http://arxiv.org/pdf/1711.07230v3.pdf
PWC	https://paperswithcode.com/paper/regret-analysis-for-adaptive-linear-quadratic
Repo
Framework

Analogy Mining for Specific Design Needs


Title	Analogy Mining for Specific Design Needs
Authors	Karni Gilon, Felicia Y Ng, Joel Chan, Hila Lifshitz Assaf, Aniket Kittur, Dafna Shahaf
Abstract	Finding analogical inspirations in distant domains is a powerful way of solving problems. However, as the number of inspirations that could be matched and the dimensions on which that matching could occur grow, it becomes challenging for designers to find inspirations relevant to their needs. Furthermore, designers are often interested in exploring specific aspects of a product; for example, one designer might be interested in improving the brewing capability of an outdoor coffee maker, while another might wish to optimize for portability. In this paper we introduce a novel system for targeting analogical search for specific needs. Specifically, we contribute a novel analogical search engine for expressing and abstracting specific design needs that returns more distant yet relevant inspirations than alternative approaches.
Tasks
Published	2017-12-19
URL	http://arxiv.org/abs/1712.06880v1
PDF	http://arxiv.org/pdf/1712.06880v1.pdf
PWC	https://paperswithcode.com/paper/analogy-mining-for-specific-design-needs
Repo
Framework

Improving Accuracy of Nonparametric Transfer Learning via Vector Segmentation


Title	Improving Accuracy of Nonparametric Transfer Learning via Vector Segmentation
Authors	Vincent Gripon, Ghouthi B. Hacene, Matthias Löwe, Franck Vermet
Abstract	Transfer learning using deep neural networks as feature extractors has become increasingly popular over the past few years. It allows one to obtain state-of-the-art accuracy on datasets too small to train a deep neural network on their own, and it provides cutting-edge descriptors that, combined with nonparametric learning methods, allow rapid and flexible deployment of high-performing solutions in computationally restricted settings. In this paper, we are interested in showing that the features extracted using deep neural networks have specific properties which can be used to improve the accuracy of downstream nonparametric learning methods. Namely, we demonstrate that for some distributions where information is embedded in a few coordinates, segmenting feature vectors can lead to better accuracy. We show how this model can be applied to real datasets by performing experiments using three mainstream deep neural network feature extractors and four databases, in vision and audio.
Tasks	Transfer Learning
Published	2017-10-24
URL	http://arxiv.org/abs/1710.08637v1
PDF	http://arxiv.org/pdf/1710.08637v1.pdf
PWC	https://paperswithcode.com/paper/improving-accuracy-of-nonparametric-transfer
Repo
Framework
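
The segmentation idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes equal-length contiguous segments, a 1-nearest-neighbour search per segment, and a majority vote to combine the per-segment answers. The names `split`, `sq_dist`, and `segmented_1nn` are hypothetical, and the toy vectors stand in for features extracted by a deep network.

```python
from collections import Counter


def split(vec, n_segments):
    """Cut a feature vector into n_segments contiguous, equal-length pieces."""
    size = len(vec) // n_segments
    if size * n_segments != len(vec):
        raise ValueError("vector length must be divisible by n_segments")
    return [vec[i * size:(i + 1) * size] for i in range(n_segments)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def segmented_1nn(query, train_vecs, train_labels, n_segments):
    """Run 1-NN independently on each segment, then take a majority vote.

    The voting rule is an assumption made for this sketch.
    """
    query_segs = split(query, n_segments)
    train_segs = [split(v, n_segments) for v in train_vecs]
    votes = []
    for s, q in enumerate(query_segs):
        # Index of the training vector whose s-th segment is closest to q.
        best = min(range(len(train_vecs)),
                   key=lambda i: sq_dist(q, train_segs[i][s]))
        votes.append(train_labels[best])
    return Counter(votes).most_common(1)[0][0]


# Toy data: two training vectors, one per class.
train = [[0.0] * 6, [1.0] * 6]
labels = ["a", "b"]
# Segments 1 and 3 of the query are close to class "b", segment 2 to class "a".
query = [0.9, 0.8, 0.1, 0.2, 1.0, 0.9]
prediction = segmented_1nn(query, train, labels, n_segments=3)
```

With three segments and two classes the vote cannot tie; a real deployment would need an explicit tie-breaking rule.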

Modular Continual Learning in a Unified Visual Environment


Title	Modular Continual Learning in a Unified Visual Environment
Authors	Kevin T. Feigelis, Blue Sheffer, Daniel L. K. Yamins
Abstract	A core aspect of human intelligence is the ability to learn new tasks quickly and switch between them flexibly. Here, we describe a modular continual reinforcement learning paradigm inspired by these abilities. We first introduce a visual interaction environment that allows many types of tasks to be unified in a single framework. We then describe a reward map prediction scheme that learns new tasks robustly in the very large state and action spaces required by such an environment. We investigate how properties of module architecture influence efficiency of task learning, showing that a module motif incorporating specific design principles (e.g. early bottlenecks, low-order polynomial nonlinearities, and symmetry) significantly outperforms more standard neural network motifs, needing fewer training examples and fewer neurons to achieve high levels of performance. Finally, we present a meta-controller architecture for task switching based on a dynamic neural voting scheme, which allows new modules to use information learned from previously-seen tasks to substantially improve their own learning efficiency.
Tasks	Continual Learning
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07425v2
PDF	http://arxiv.org/pdf/1711.07425v2.pdf
PWC	https://paperswithcode.com/paper/modular-continual-learning-in-a-unified
Repo
Framework

Atypicality for Heart Rate Variability Using a Pattern-Tree Weighting Method


Title	Atypicality for Heart Rate Variability Using a Pattern-Tree Weighting Method
Authors	Elyas Sabeti, Anders Høst-Madsen
Abstract	Heart rate variability (HRV) is a vital measure of autonomic nervous system functionality and a key indicator of cardiovascular condition. This paper proposes a novel method, called the pattern tree, which is an extension of Willems’ context tree to real-valued data, to investigate HRV via an atypicality framework. In a previous paper, atypicality was developed as a method for mining and discovery in “Big Data,” which requires a universal approach. Using the proposed pattern tree as a universal source coder in this framework led to the discovery of arrhythmias and unknown patterns in HRV Holter monitoring.
Tasks	Heart Rate Variability
Published	2017-10-12
URL	http://arxiv.org/abs/1710.07319v1
PDF	http://arxiv.org/pdf/1710.07319v1.pdf
PWC	https://paperswithcode.com/paper/atypicality-for-heart-rate-variability-using
Repo
Framework

When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data


Title	When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data
Authors	Jacopo Cavazza, Pietro Morerio, Vittorio Murino
Abstract	Human action recognition from skeletal data is a hot research topic and important in many open-domain applications of computer vision, thanks to recently introduced 3D sensors. In the literature, naive methods simply transfer off-the-shelf techniques from video to the skeletal representation. However, the current state of the art is contested between two different paradigms: kernel-based methods and feature learning with (recurrent) neural networks. Both approaches show strong performance, yet they exhibit heavy, but complementary, drawbacks. Motivated by this fact, our work aims at combining the best of the two paradigms by proposing an approach where a shallow network is fed with a covariance representation. Our intuition is that, as long as the dynamics are effectively modeled, there is no need for the classification network to be deep or recurrent in order to score favorably. We validate this hypothesis in a broad experimental analysis over 6 publicly available datasets.
Tasks	Temporal Action Localization
Published	2017-08-03
URL	http://arxiv.org/abs/1708.01022v1
PDF	http://arxiv.org/pdf/1708.01022v1.pdf
PWC	https://paperswithcode.com/paper/when-kernel-methods-meet-feature-learning-log
Repo
Framework

Counterfactual Multi-Agent Policy Gradients


Title	Counterfactual Multi-Agent Policy Gradients
Authors	Jakob Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson
Abstract	Cooperative multi-agent systems can be naturally used to model many real world problems, such as network packet routing and the coordination of autonomous vehicles. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents’ policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent’s action, while keeping the other agents’ actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.
Tasks	Autonomous Vehicles, Starcraft
Published	2017-05-24
URL	http://arxiv.org/abs/1705.08926v2
PDF	http://arxiv.org/pdf/1705.08926v2.pdf
PWC	https://paperswithcode.com/paper/counterfactual-multi-agent-policy-gradients
Repo
Framework
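
The counterfactual baseline described in the abstract above marginalises out one agent's action while the other agents' actions stay fixed. A minimal sketch of that computation for a single agent follows. It assumes the centralised critic has already produced one Q-value per candidate action of this agent; the function name `counterfactual_advantage` and the numbers are illustrative, not taken from the paper's code.

```python
def counterfactual_advantage(q_values, policy, taken_action):
    """Advantage of one agent's action relative to a counterfactual baseline.

    q_values[k]: critic estimate of the joint-action value when this agent
                 plays action k and all other agents' actions are held fixed.
    policy[k]:   probability the agent's current policy assigns to action k.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Baseline: expected Q under the agent's own policy, others held fixed.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[taken_action] - baseline


# Hypothetical critic outputs for three candidate actions of one agent.
q = [1.0, 3.0, 2.0]
pi = [0.5, 0.25, 0.25]
advantage = counterfactual_advantage(q, pi, taken_action=1)  # 3.0 - 1.75 = 1.25
```

Because the baseline is the policy-weighted mean of the same Q-values, the advantage averages to zero under the agent's own policy, so subtracting it does not change the expected gradient.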

Sparse Stochastic Bandits


Title	Sparse Stochastic Bandits
Authors	Joon Kwon, Vianney Perchet, Claire Vernade
Abstract	In the classical multi-armed bandit problem, d arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward. Guarantees can be obtained on a relative quantity called regret, which scales linearly with d (or with sqrt(d) in the minimax sense). We here consider the sparse case of this classical problem in the sense that only a small number of arms, namely s < d, have a positive expected reward. We are able to leverage this additional assumption to provide an algorithm whose regret scales with s instead of d. Moreover, we prove that this algorithm is optimal by providing a matching lower bound - at least for a wide and pertinent range of parameters that we determine - and by evaluating its performance on simulated data.
Tasks
Published	2017-06-05
URL	http://arxiv.org/abs/1706.01383v1
PDF	http://arxiv.org/pdf/1706.01383v1.pdf
PWC	https://paperswithcode.com/paper/sparse-stochastic-bandits
Repo
Framework

Tracking in Aerial Hyperspectral Videos using Deep Kernelized Correlation Filters


Title	Tracking in Aerial Hyperspectral Videos using Deep Kernelized Correlation Filters
Authors	Burak Uzkent, Aneesh Rangnekar, Matthew J. Hoffman
Abstract	Hyperspectral imaging holds enormous potential to improve the state of the art in aerial vehicle tracking with low spatial and temporal resolutions. Recently, adaptive multi-modal hyperspectral sensors have attracted growing interest due to their ability to record extended data quickly from aerial platforms. In this study, we apply popular concepts from traditional object tracking, namely (1) Kernelized Correlation Filters (KCF) and (2) Deep Convolutional Neural Network (CNN) features, to aerial tracking in the hyperspectral domain. We propose the Deep Hyperspectral Kernelized Correlation Filter based tracker (DeepHKCF) to efficiently track aerial vehicles using an adaptive multi-modal hyperspectral sensor. We address low temporal resolution by designing a single KCF-in-multiple Regions-of-Interest (ROIs) approach to cover a reasonably large area. To increase the speed of deep convolutional feature extraction from multiple ROIs, we design an effective ROI mapping strategy. The proposed tracker also provides the flexibility to couple with more advanced correlation filter trackers. The DeepHKCF tracker performs exceptionally well with deep features set up in a synthetic hyperspectral video generated by the Digital Imaging and Remote Sensing Image Generation (DIRSIG) software. Additionally, we generate a large, synthetic, single-channel dataset using DIRSIG to perform vehicle classification in the Wide Area Motion Imagery (WAMI) platform. In this way, the high fidelity of the DIRSIG software is demonstrated and a large-scale aerial vehicle classification dataset is released to support studies on vehicle detection and tracking in the WAMI platform.
Tasks	Image Generation, Object Tracking
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07235v3
PDF	http://arxiv.org/pdf/1711.07235v3.pdf
PWC	https://paperswithcode.com/paper/tracking-in-aerial-hyperspectral-videos-using
Repo
Framework

Accelerated Reinforcement Learning


Title	Accelerated Reinforcement Learning
Authors	K. Lakshmanan
Abstract	Policy gradient methods are widely used in reinforcement learning algorithms to search for better policies in the parameterized policy space. They perform gradient search in the policy space and are known to converge very slowly. Nesterov developed an accelerated gradient search algorithm for convex optimization problems. This has recently been extended to non-convex and stochastic optimization. We use Nesterov’s acceleration for policy gradient search in the well-known actor-critic algorithm and show convergence using the ODE method. We tested this algorithm on a scheduling problem, in which an incoming job is scheduled into one of four queues based on the queue lengths. The experimental results show that the algorithm using Nesterov’s acceleration performs significantly better than the algorithm that does not use acceleration. To the best of our knowledge, this is the first time Nesterov’s acceleration has been used with the actor-critic algorithm.
Tasks	Policy Gradient Methods, Stochastic Optimization
Published	2017-10-23
URL	http://arxiv.org/abs/1710.08070v2
PDF	http://arxiv.org/pdf/1710.08070v2.pdf
PWC	https://paperswithcode.com/paper/accelerated-reinforcement-learning
Repo
Framework
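
For readers unfamiliar with Nesterov's acceleration, the sketch below shows the generic look-ahead update on a deterministic convex quadratic and compares it with plain gradient descent. It is not the paper's actor-critic algorithm: the objective, the step size, the constant momentum value, and all function names are assumptions chosen for illustration.

```python
def objective(x, curvatures):
    """f(x) = 0.5 * sum_i a_i * x_i^2, a simple convex test function."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(curvatures, x))


def grad(x, curvatures):
    """Gradient of the quadratic objective."""
    return [a * xi for a, xi in zip(curvatures, x)]


def gradient_descent(x0, curvatures, step, iters):
    """Plain gradient descent with a fixed step size."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x, curvatures)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def nesterov(x0, curvatures, step, momentum, iters):
    """Nesterov's method: take the gradient at a look-ahead point."""
    x = list(x0)
    x_prev = list(x0)
    for _ in range(iters):
        # Extrapolate along the previous displacement.
        y = [xi + momentum * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = grad(y, curvatures)
        x_prev = x
        # Gradient step from the look-ahead point, not from x itself.
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x


# Ill-conditioned quadratic: curvatures 1 and 100 (condition number 100).
curv = [1.0, 100.0]
start = [1.0, 1.0]
step = 1.0 / 100.0      # 1 / largest curvature
momentum = 9.0 / 11.0   # (sqrt(100) - 1) / (sqrt(100) + 1)

f_gd = objective(gradient_descent(start, curv, step, 100), curv)
f_nag = objective(nesterov(start, curv, step, momentum, 100), curv)
```

On this problem the accelerated iterate reaches a much lower objective value than plain gradient descent after the same number of steps. The abstract's setting is stochastic and non-convex, so this only conveys the update rule, not the paper's convergence result.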

Face Parsing via Recurrent Propagation


Title	Face Parsing via Recurrent Propagation
Authors	Sifei Liu, Jianping Shi, Ji Liang, Ming-Hsuan Yang
Abstract	Face parsing is an important problem in computer vision that finds numerous applications, including recognition and editing. Recently, deep convolutional neural networks (CNNs) have been applied to image parsing and segmentation with state-of-the-art performance. In this paper, we propose a face parsing algorithm that combines hierarchical representations learned by a CNN with accurate label propagation achieved by a spatially variant recurrent neural network (RNN). The RNN-based propagation approach enables efficient inference over a global space with the guidance of semantic edges generated by a local convolutional model. Since the convolutional architecture can be shallow and the spatial RNN can have few parameters, the framework is much faster and more lightweight than state-of-the-art CNNs for the same task. We apply the proposed model to coarse-grained and fine-grained face parsing. For fine-grained face parsing, we develop a two-stage approach that first identifies the main regions and then segments the detail components, which achieves better performance in terms of accuracy and efficiency. With a single GPU, the proposed algorithm parses face images accurately at 300 frames per second, which facilitates real-time applications.
Tasks
Published	2017-08-06
URL	http://arxiv.org/abs/1708.01936v1
PDF	http://arxiv.org/pdf/1708.01936v1.pdf
PWC	https://paperswithcode.com/paper/face-parsing-via-recurrent-propagation
Repo
Framework

Ultimate Intelligence Part III: Measures of Intelligence, Perception and Intelligent Agents


Title	Ultimate Intelligence Part III: Measures of Intelligence, Perception and Intelligent Agents
Authors	Eray Özkural
Abstract	We propose that operator induction serves as an adequate model of perception. We explain how to reduce universal agent models to operator induction. We propose a universal measure of operator induction fitness, and show how it can be used in a reinforcement learning model and a homeostasis (self-preserving) agent based on the free energy principle. We show that the action of the homeostasis agent can be explained by the operator induction model.
Tasks
Published	2017-09-08
URL	http://arxiv.org/abs/1709.03879v1
PDF	http://arxiv.org/pdf/1709.03879v1.pdf
PWC	https://paperswithcode.com/paper/ultimate-intelligence-part-iii-measures-of
Repo
Framework

Colour Terms: a Categorisation Model Inspired by Visual Cortex Neurons


Title	Colour Terms: a Categorisation Model Inspired by Visual Cortex Neurons
Authors	Arash Akbarinia, C. Alejandro Parraga
Abstract	Although it seems counter-intuitive, categorical colours do not exist as external physical entities but are very much the product of our brains. Our cortical machinery segments the world and associates objects with specific colour terms, which is not only convenient for communication but also increases the efficiency of visual processing by reducing the dimensionality of input scenes. Although the neural substrate for this phenomenon is unknown, a recent study of cortical colour processing has discovered a set of neurons that are isoresponsive to stimuli in the shape of 3D-ellipsoidal surfaces in colour-opponent space. We hypothesise that these neurons might help explain the underlying mechanisms of colour naming in the visual cortex. Following this, we propose a biologically-inspired colour naming model where each colour term (e.g. red, green, blue, yellow) is represented through an ellipsoid in 3D colour-opponent space. This paradigm is also supported by previous psychophysical colour categorisation experiments whose results resemble such shapes. The “belongingness” of each pixel to different colour categories is computed by a non-linear sigmoidal logistic function. The final colour term for a given pixel is calculated by a maximum pooling mechanism. The simplicity of our model allows its parameters to be learnt from a handful of segmented images. It also offers a straightforward extension to include further colour terms. Additionally, the ellipsoids of the proposed model can adapt to image contents, offering a dynamic solution to address the phenomenon of colour constancy. Our results on the Munsell chart and two datasets of real-world images show an overall improvement compared to state-of-the-art algorithms.
Tasks
Published	2017-09-19
URL	http://arxiv.org/abs/1709.06300v1
PDF	http://arxiv.org/pdf/1709.06300v1.pdf
PWC	https://paperswithcode.com/paper/colour-terms-a-categorisation-model-inspired
Repo
Framework
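
The pipeline in the abstract above (one ellipsoid per colour term, a sigmoidal belongingness score, then a maximum over terms) can be sketched as follows. Everything specific here is an assumption made for illustration: the ellipsoids are axis-aligned, the centres and semi-axes in `ELLIPSOIDS` are invented rather than learnt, and the particular logistic form and `steepness` value are not taken from the paper.

```python
import math


def belongingness(pixel, centre, semi_axes, steepness=5.0):
    """Logistic score: near 1 deep inside the ellipsoid, 0.5 on its surface."""
    # Normalised distance: equals 1.0 exactly on the ellipsoid surface.
    d = math.sqrt(sum(((p - c) / r) ** 2
                      for p, c, r in zip(pixel, centre, semi_axes)))
    return 1.0 / (1.0 + math.exp(steepness * (d - 1.0)))


def name_colour(pixel, ellipsoids):
    """Score the pixel against every colour term and keep the maximum."""
    scores = {term: belongingness(pixel, centre, axes)
              for term, (centre, axes) in ellipsoids.items()}
    return max(scores, key=scores.get), scores


# Hypothetical ellipsoids in a 3D colour-opponent space: (centre, semi-axes).
ELLIPSOIDS = {
    "red": ((0.8, 0.0, 0.5), (0.3, 0.3, 0.4)),
    "green": ((-0.8, 0.0, 0.5), (0.3, 0.3, 0.4)),
    "blue": ((0.0, -0.8, 0.5), (0.3, 0.3, 0.4)),
    "yellow": ((0.0, 0.8, 0.5), (0.3, 0.3, 0.4)),
}

term, scores = name_colour((0.7, 0.1, 0.5), ELLIPSOIDS)
```

Adding a colour term in this sketch means adding one dictionary entry, which mirrors the straightforward extension the abstract mentions.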

Auto-Differentiating Linear Algebra


Title	Auto-Differentiating Linear Algebra
Authors	Matthias Seeger, Asmus Hetzel, Zhenwen Dai, Eric Meissner, Neil D. Lawrence
Abstract	Development systems for deep learning (DL), such as Theano, Torch, TensorFlow, or MXNet, are easy-to-use tools for creating complex neural network models. Since gradient computations are automatically baked in, and execution is mapped to high performance hardware, these models can be trained end-to-end on large amounts of data. However, it is currently not easy to implement many basic machine learning primitives in these systems (such as Gaussian processes, least squares estimation, principal components analysis, Kalman smoothing), mainly because they lack efficient support of linear algebra primitives as differentiable operators. We detail how a number of matrix decompositions (Cholesky, LQ, symmetric eigen) can be implemented as differentiable operators. We have implemented these primitives in MXNet, running on CPU and GPU in single and double precision. We sketch use cases of these new operators, learning Gaussian process and Bayesian linear regression models, where we demonstrate very substantial reductions in implementation complexity and running time compared to previous codes. Our MXNet extension allows end-to-end learning of hybrid models, which combine deep neural networks (DNNs) with Bayesian concepts, with applications in advanced Gaussian process models, scalable Bayesian optimization, and Bayesian active learning.
Tasks	Active Learning, Gaussian Processes
Published	2017-10-24
URL	https://arxiv.org/abs/1710.08717v5
PDF	https://arxiv.org/pdf/1710.08717v5.pdf
PWC	https://paperswithcode.com/paper/auto-differentiating-linear-algebra
Repo
Framework

Learning Solving Procedure for Artificial Neural Network


Title	Learning Solving Procedure for Artificial Neural Network
Authors	Ju-Hong Lee, Moon-Ju Kang, Bumghi Choi
Abstract	It is expected that progress toward true artificial intelligence will be achieved through the emergence of a system that integrates representation learning and complex reasoning (LeCun et al. 2015). In response to this prediction, research has been conducted on implementing the symbolic reasoning of a von Neumann computer in an artificial neural network (Graves et al. 2016; Graves et al. 2014; Reed et al. 2015). However, these studies have many limitations in realizing neural-symbolic integration (Jaeger 2016). Here, we present a new learning paradigm: a learning solving procedure (LSP) that learns the procedure for solving complex problems. This is accomplished not merely by learning input-output data, but by learning algorithms through a solving procedure that obtains the output as a sequence of tasks for a given input problem. The LSP neural network system learns not only simple problems of addition and multiplication, but also the algorithms of complicated problems, such as complex arithmetic expressions, sorting, and the Tower of Hanoi. To realize this, the LSP neural network structure consists of a deep neural network and long short-term memory, which are recursively combined. Through experimentation, we demonstrate the efficiency and scalability of LSP and its validity as a mechanism of complex reasoning.
Tasks	Representation Learning
Published	2017-11-06
URL	http://arxiv.org/abs/1711.01754v1
PDF	http://arxiv.org/pdf/1711.01754v1.pdf
PWC	https://paperswithcode.com/paper/learning-solving-procedure-for-artificial
Repo
Framework