October 16, 2019


Paper Group ANR 1066


Multi-Strategy Coevolving Aging Particle Optimization


Title	Multi-Strategy Coevolving Aging Particle Optimization
Authors	Giovanni Iacca, Fabio Caraffini, Ferrante Neri
Abstract	We propose Multi-Strategy Coevolving Aging Particles (MS-CAP), a novel population-based algorithm for black-box optimization. In a memetic fashion, MS-CAP combines two components with complementary algorithm logics. In the first stage, each particle is perturbed independently along each dimension with a progressively shrinking (decaying) radius, and attracted towards the current best solution with an increasing force. In the second phase, the particles are mutated and recombined according to a multi-strategy approach in the fashion of the ensemble of mutation strategies in Differential Evolution. The proposed algorithm is tested, at different dimensionalities, on two complete black-box optimization benchmarks proposed at the Congress on Evolutionary Computation 2010 and 2013. To demonstrate the applicability of the approach, we also test MS-CAP to train a Feedforward Neural Network modelling the kinematics of an 8-link robot manipulator. The numerical results show that MS-CAP, for the setting considered in this study, tends to outperform the state-of-the-art optimization algorithms on a large set of problems, thus resulting in a robust and versatile optimizer.
Tasks
Published	2018-10-11
URL	http://arxiv.org/abs/1810.05018v1
PDF	http://arxiv.org/pdf/1810.05018v1.pdf
PWC	https://paperswithcode.com/paper/multi-strategy-coevolving-aging-particle
Repo
Framework

Latent Space Non-Linear Statistics


Title	Latent Space Non-Linear Statistics
Authors	Line Kuhnel, Tom Fletcher, Sarang Joshi, Stefan Sommer
Abstract	Given data, deep generative models, such as variational autoencoders (VAE) and generative adversarial networks (GAN), train a lower dimensional latent representation of the data space. The linear Euclidean geometry of data space pulls back to a nonlinear Riemannian geometry on the latent space. The latent space thus provides a low-dimensional nonlinear representation of data and classical linear statistical techniques are no longer applicable. In this paper we show how statistics of data in their latent space representation can be performed using techniques from the field of nonlinear manifold statistics. Nonlinear manifold statistics provide generalizations of Euclidean statistical notions including means, principal component analysis, and maximum likelihood fits of parametric probability distributions. We develop new techniques for maximum likelihood inference in latent space, and address the computational complexity of using geometric algorithms with high-dimensional data by training a separate neural network to approximate the Riemannian metric and cometric tensor capturing the shape of the learned data manifold.
Tasks
Published	2018-05-19
URL	http://arxiv.org/abs/1805.07632v1
PDF	http://arxiv.org/pdf/1805.07632v1.pdf
PWC	https://paperswithcode.com/paper/latent-space-non-linear-statistics
Repo
Framework

CactusNets: Layer Applicability as a Metric for Transfer Learning


Title	CactusNets: Layer Applicability as a Metric for Transfer Learning
Authors	Edward Collier, Robert DiBiano, Supratik Mukhopadhyay
Abstract	Deep neural networks trained over large datasets learn features that are both generic to the whole dataset, and specific to individual classes in the dataset. Learned features tend towards generic in the lower layers and specific in the higher layers of a network. Methods like fine-tuning are made possible because of the ability of one filter to apply to multiple target classes. Much like the human brain, this behavior can also be used to cluster and separate classes. However, to the best of our knowledge there is no metric for how applicable learned features are to specific classes. In this paper we propose a definition and metric for measuring the applicability of learned features to individual classes, and use this applicability metric to estimate input applicability and produce a new method of unsupervised learning we call the CactusNet.
Tasks	Transfer Learning
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07846v1
PDF	http://arxiv.org/pdf/1804.07846v1.pdf
PWC	https://paperswithcode.com/paper/cactusnets-layer-applicability-as-a-metric
Repo
Framework

Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity


Title	Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity
Authors	Lin Chen, Christopher Harshaw, Hamed Hassani, Amin Karbasi
Abstract	Online optimization has been a successful framework for solving large-scale problems under computational constraints and partial information. Current methods for online convex optimization require either a projection or exact gradient computation at each step, both of which can be prohibitively expensive for large-scale applications. At the same time, there is a growing trend of non-convex optimization in the machine learning community and a need for online methods. Continuous DR-submodular functions, which exhibit a natural diminishing returns condition, have recently been proposed as a broad class of non-convex functions which may be efficiently optimized. Although online methods have been introduced, they suffer from similar problems. In this work, we propose Meta-Frank-Wolfe, the first online projection-free algorithm that uses stochastic gradient estimates. The algorithm relies on a careful sampling of gradients in each round and achieves the optimal $O(\sqrt{T})$ adversarial regret bounds for convex and continuous submodular optimization. We also propose One-Shot Frank-Wolfe, a simpler algorithm which requires only a single stochastic gradient estimate in each round and achieves an $O(T^{2/3})$ stochastic regret bound for convex and continuous submodular optimization. We apply our methods to develop a novel “lifting” framework for the online discrete submodular maximization and also see that they outperform current state-of-the-art techniques on various experiments.
Tasks
Published	2018-02-22
URL	http://arxiv.org/abs/1802.08183v4
PDF	http://arxiv.org/pdf/1802.08183v4.pdf
PWC	https://paperswithcode.com/paper/projection-free-online-optimization-with
Repo
Framework
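
To make "projection-free" concrete, here is a minimal sketch of the classic offline Frank-Wolfe iteration over the probability simplex. It is not Meta-Frank-Wolfe or One-Shot Frank-Wolfe from the paper: the exact-gradient setting, the quadratic objective, the `2 / (t + 2)` step size, and all names (`frank_wolfe_simplex`, `target`) are assumptions chosen for illustration. The point it shows is that each step only needs a linear minimization over the feasible set (here, picking the vertex with the smallest gradient coordinate), never a projection.

```python
def frank_wolfe_simplex(grad, dim, steps=2000):
    """Minimize a smooth convex function over the probability simplex.

    grad: callable returning the gradient (a list) at a point x.
    Every iterate is a convex combination of simplex vertices,
    so it stays feasible without any projection step.
    """
    x = [1.0 / dim] * dim  # start at the barycenter
    for t in range(steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, v> is the vertex e_i with the smallest gradient coordinate.
        i = min(range(dim), key=lambda k: g[k])
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        x = [(1.0 - gamma) * xk for xk in x]
        x[i] += gamma
    return x


# Hypothetical objective: f(x) = ||x - target||^2 with target inside the simplex.
target = [0.7, 0.2, 0.1]
grad = lambda x: [2.0 * (xk - tk) for xk, tk in zip(x, target)]
x_fw = frank_wolfe_simplex(grad, dim=3)
```

Replacing `grad` with a noisy estimate would give a stochastic variant, but the averaging schemes that make that work with regret guarantees are the subject of the paper and are not reproduced here.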

Graph Convolutions on Spectral Embeddings: Learning of Cortical Surface Data


Title	Graph Convolutions on Spectral Embeddings: Learning of Cortical Surface Data
Authors	Karthik Gopinath, Christian Desrosiers, Herve Lombaert
Abstract	Neuronal cell bodies mostly reside in the cerebral cortex. The study of this thin and highly convoluted surface is essential for understanding how the brain works. The analysis of surface data is, however, challenging due to the high variability of the cortical geometry. This paper presents a novel approach for learning and exploiting surface data directly across surface domains. Current approaches rely on geometrical simplifications, such as spherical inflations, a popular but costly process. For instance, the widely used FreeSurfer takes about 3 hours to parcellate brain surfaces on a standard machine. Direct learning of surface data via graph convolutions would provide a new family of fast algorithms for processing brain surfaces. However, the current limitation of existing state-of-the-art approaches is their inability to compare surface data across different surface domains. Surface bases are indeed incompatible between brain geometries. This paper leverages recent advances in spectral graph matching to transfer surface data across aligned spectral domains. This novel approach enables a direct learning of surface data across compatible surface bases. It exploits spectral filters over intrinsic representations of surface neighborhoods. We illustrate the benefits of this approach with an application to brain parcellation. We validate the algorithm over 101 manually labeled brain surfaces. The results show a significant improvement in labeling accuracy over recent Euclidean approaches, while gaining a drastic speed improvement over conventional methods.
Tasks	Graph Matching
Published	2018-03-27
URL	http://arxiv.org/abs/1803.10336v1
PDF	http://arxiv.org/pdf/1803.10336v1.pdf
PWC	https://paperswithcode.com/paper/graph-convolutions-on-spectral-embeddings
Repo
Framework

Deceptive Games


Title	Deceptive Games
Authors	Damien Anderson, Matthew Stephenson, Julian Togelius, Christian Salge, John Levine, Jochen Renz
Abstract	Deceptive games are games where the reward structure or other aspects of the game are designed to lead the agent away from a globally optimal policy. While many games are already deceptive to some extent, we designed a series of games in the Video Game Description Language (VGDL) implementing specific types of deception, classified by the cognitive biases they exploit. VGDL games can be run in the General Video Game Artificial Intelligence (GVGAI) Framework, making it possible to test a variety of existing AI agents that have been submitted to the GVGAI Competition on these deceptive games. Our results show that all tested agents are vulnerable to several kinds of deception, but that different agents have different weaknesses. This suggests that we can use deception to understand the capabilities of a game-playing algorithm, and game-playing algorithms to characterize the deception displayed by a game.
Tasks
Published	2018-01-31
URL	http://arxiv.org/abs/1802.00048v2
PDF	http://arxiv.org/pdf/1802.00048v2.pdf
PWC	https://paperswithcode.com/paper/deceptive-games
Repo
Framework

Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks


Title	Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks
Authors	Stephen James, Paul Wohlhart, Mrinal Kalakrishnan, Dmitry Kalashnikov, Alex Irpan, Julian Ibarz, Sergey Levine, Raia Hadsell, Konstantinos Bousmalis
Abstract	Real world data, especially in the domain of robotics, is notoriously costly to collect. One way to circumvent this can be to leverage the power of simulation to produce large amounts of labelled data. However, training models on simulated images does not readily transfer to real-world ones. Using domain adaptation methods to cross this “reality gap” requires a large amount of unlabelled real-world data, whilst domain randomization alone can waste modeling power. In this paper, we present Randomized-to-Canonical Adaptation Networks (RCANs), a novel approach to crossing the visual reality gap that uses no real-world data. Our method learns to translate randomized rendered images into their equivalent non-randomized, canonical versions. This in turn allows for real images to also be translated into canonical sim images. We demonstrate the effectiveness of this sim-to-real approach by training a vision-based closed-loop grasping reinforcement learning agent in simulation, and then transferring it to the real world to attain 70% zero-shot grasp success on unseen objects, a result that almost doubles the success of learning the same task directly on domain randomization alone. Additionally, by joint finetuning in the real-world with only 5,000 real-world grasps, our method achieves 91%, attaining comparable performance to a state-of-the-art system trained with 580,000 real-world grasps, resulting in a reduction of real-world data by more than 99%.
Tasks	Domain Adaptation, Robotic Grasping
Published	2018-12-18
URL	https://arxiv.org/abs/1812.07252v3
PDF	https://arxiv.org/pdf/1812.07252v3.pdf
PWC	https://paperswithcode.com/paper/sim-to-real-via-sim-to-sim-data-efficient
Repo
Framework

Matched Filters for Noisy Induced Subgraph Detection


Title	Matched Filters for Noisy Induced Subgraph Detection
Authors	Daniel L. Sussman, Youngser Park, Carey E. Priebe, Vince Lyzinski
Abstract	The problem of finding the vertex correspondence between two noisy graphs with different numbers of vertices where the smaller graph is still large has many applications in social networks, neuroscience, and computer vision. We propose a solution to this problem via a graph matching matched filter: centering and padding the smaller adjacency matrix and applying graph matching methods to align it to the larger network. The centering and padding schemes can be incorporated into any algorithm that matches using adjacency matrices. Under a statistical model for correlated pairs of graphs, which yields a noisy copy of the small graph within the larger graph, the resulting optimization problem can be guaranteed to recover the true vertex correspondence between the networks. However, there are currently no efficient algorithms for solving this problem. To illustrate the possibilities and challenges of such problems, we use an algorithm that can exploit a partially known correspondence and show via varied simulations and applications to Drosophila and human connectomes that this approach can achieve good performance.
Tasks	Graph Matching
Published	2018-03-06
URL	https://arxiv.org/abs/1803.02423v3
PDF	https://arxiv.org/pdf/1803.02423v3.pdf
PWC	https://paperswithcode.com/paper/matched-filters-for-noisy-induced-subgraph
Repo
Framework
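
The centering-and-padding idea can be sketched on a toy example. The specific centering used below (mapping off-diagonal 0/1 entries to -1/+1) is an assumption for illustration and may differ from the schemes analyzed in the paper; the brute-force search over permutations is only feasible for tiny graphs and stands in for a real graph matching solver. All function and variable names are hypothetical.

```python
from itertools import permutations


def center(adj):
    """Map off-diagonal entries 0/1 to -1/+1; keep the diagonal at 0 (assumed scheme)."""
    n = len(adj)
    return [[0 if i == j else 2 * adj[i][j] - 1 for j in range(n)] for i in range(n)]


def pad(mat, size):
    """Append zero rows and columns until the matrix is size x size."""
    extra = size - len(mat)
    return [row + [0] * extra for row in mat] + [[0] * size for _ in range(extra)]


def alignment_score(small, large, perm):
    """Sum of small[i][j] * large[perm[i]][perm[j]] over all vertex pairs."""
    n = len(small)
    return sum(small[i][j] * large[perm[i]][perm[j]] for i in range(n) for j in range(n))


# Toy data: a triangle hidden among vertices 1, 2, 3 of a 4-vertex graph.
small_graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
large_graph = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]]

small_c = pad(center(small_graph), len(large_graph))
large_c = center(large_graph)

# Exhaustive search over vertex permutations (exponential; illustration only).
best = max(permutations(range(len(large_graph))),
           key=lambda p: alignment_score(small_c, large_c, p))
best_score = alignment_score(small_c, large_c, best)
```

Because the padded rows and columns are zero, only the first three entries of `best` affect the score; they identify which vertices of the larger graph the small graph is matched to. With centered entries, matching an edge to a non-edge is penalized rather than ignored.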

Sparse PCA from Sparse Linear Regression


Title	Sparse PCA from Sparse Linear Regression
Authors	Guy Bresler, Sung Min Park, Madalina Persu
Abstract	Sparse Principal Component Analysis (SPCA) and Sparse Linear Regression (SLR) have a wide range of applications and have attracted a tremendous amount of attention in the last two decades as canonical examples of statistical problems in high dimension. A variety of algorithms have been proposed for both SPCA and SLR, but an explicit connection between the two had not been made. We show how to efficiently transform a black-box solver for SLR into an algorithm for SPCA: assuming the SLR solver satisfies prediction error guarantees achieved by existing efficient algorithms such as those based on the Lasso, the SPCA algorithm derived from it achieves near state-of-the-art guarantees for testing and for support recovery for the single spiked covariance model as obtained by the current best polynomial-time algorithms. Our reduction not only highlights the inherent similarity between the two problems, but also, from a practical standpoint, allows one to obtain a collection of algorithms for SPCA directly from known algorithms for SLR. We provide experimental results on simulated data comparing our proposed framework to other algorithms for SPCA.
Tasks
Published	2018-11-25
URL	http://arxiv.org/abs/1811.10106v1
PDF	http://arxiv.org/pdf/1811.10106v1.pdf
PWC	https://paperswithcode.com/paper/sparse-pca-from-sparse-linear-regression
Repo
Framework

Cross-Resolution Person Re-identification with Deep Antithetical Learning


Title	Cross-Resolution Person Re-identification with Deep Antithetical Learning
Authors	Zijie Zhuang, Haizhou Ai, Long Chen, Chong Shang
Abstract	Images with different resolutions are ubiquitous in public person re-identification (ReID) datasets and real-world scenes; it is thus crucial for a person ReID model to handle the image resolution variations for improving its generalization ability. However, most existing person ReID methods pay little attention to this resolution discrepancy problem. One paradigm to deal with this problem is to use some complicated methods for mapping all images into an artificial image space, which however will disrupt the natural image distribution and requires heavy image preprocessing. In this paper, we analyze the deficiencies of several widely-used objective functions handling image resolution discrepancies and propose a new framework called deep antithetical learning that directly learns from the natural image space rather than creating an arbitrary one. We first quantify and categorize original training images according to their resolutions. Then we create an antithetical training set and make sure that original training images have counterparts with antithetical resolutions in this new set. At last, a novel Contrastive Center Loss (CCL) is proposed to learn from images with different resolutions without interference from their resolution discrepancies. Extensive experimental analyses and evaluations indicate that the proposed framework, even using a vanilla deep ReID network, exhibits remarkable performance improvements. Without bells and whistles, our approach outperforms previous state-of-the-art methods by a large margin.
Tasks	Person Re-Identification
Published	2018-10-24
URL	http://arxiv.org/abs/1810.10221v1
PDF	http://arxiv.org/pdf/1810.10221v1.pdf
PWC	https://paperswithcode.com/paper/cross-resolution-person-re-identification
Repo
Framework

Efficient Hierarchical Graph-Based Segmentation of RGBD Videos


Title	Efficient Hierarchical Graph-Based Segmentation of RGBD Videos
Authors	Steven Hickson, Stan Birchfield, Irfan Essa, Henrik Christensen
Abstract	We present an efficient and scalable algorithm for segmenting 3D RGBD point clouds by combining depth, color, and temporal information using a multistage, hierarchical graph-based approach. Our algorithm processes a moving window over several point clouds to group similar regions over a graph, resulting in an initial over-segmentation. These regions are then merged to yield a dendrogram using agglomerative clustering via a minimum spanning tree algorithm. Bipartite graph matching at a given level of the hierarchical tree yields the final segmentation of the point clouds by maintaining region identities over arbitrarily long periods of time. We show that a multistage segmentation with depth then color yields better results than a linear combination of depth and color. Due to its incremental processing, our algorithm can process videos of any length and in a streaming pipeline. The algorithm’s ability to produce robust, efficient segmentation is demonstrated with numerous experimental results on challenging sequences from our own as well as public RGBD data sets.
Tasks	Graph Matching
Published	2018-01-26
URL	http://arxiv.org/abs/1801.08981v1
PDF	http://arxiv.org/pdf/1801.08981v1.pdf
PWC	https://paperswithcode.com/paper/efficient-hierarchical-graph-based
Repo
Framework
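
The step "merged to yield a dendrogram using agglomerative clustering via a minimum spanning tree algorithm" can be illustrated with a small Kruskal-style sketch. This is a generic single-linkage construction, not the authors' implementation: the region graph, edge weights, and function names below are hypothetical, and the real system derives weights from depth, color, and temporal cues.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_merges(num_nodes, edges):
    """Kruskal's algorithm: return MST edges (weight, u, v) in merge order.

    Read in order, the merges form a single-linkage dendrogram:
    each accepted edge joins two previously separate clusters.
    """
    parent = list(range(num_nodes))
    merges = []
    for w, u, v in sorted(edges):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[rv] = ru
            merges.append((w, u, v))
    return merges


def cut_dendrogram(num_nodes, merges, threshold):
    """Apply only merges with weight <= threshold; return a cluster label per node."""
    parent = list(range(num_nodes))
    for w, u, v in merges:
        if w <= threshold:
            parent[find(parent, v)] = find(parent, u)
    return [find(parent, i) for i in range(num_nodes)]


# Hypothetical region graph: (dissimilarity, region_a, region_b).
edges = [(0.1, 0, 1), (0.2, 2, 3), (0.9, 1, 2), (1.0, 0, 3)]
merges = mst_merges(4, edges)
labels = cut_dendrogram(4, merges, threshold=0.5)
```

Cutting at different thresholds corresponds to choosing a level of the hierarchical tree, which is where the abstract says the final segmentation is read off.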

Calibrating Uncertainties in Object Localization Task


Title	Calibrating Uncertainties in Object Localization Task
Authors	Buu Phan, Rick Salay, Krzysztof Czarnecki, Vahdat Abdelzad, Taylor Denouden, Sachin Vernekar
Abstract	In many safety-critical applications such as autonomous driving and surgical robots, it is desirable to obtain prediction uncertainties from object detection modules to help support safe decision-making. Specifically, such modules need to estimate the probability of each predicted object in a given region and the confidence interval for its bounding box. While recent Bayesian deep learning methods provide a principled way to estimate this uncertainty, the estimates for the bounding boxes obtained using these methods are uncalibrated. In this paper, we address this problem for the single-object localization task by adapting an existing technique for calibrating regression models. We show, experimentally, that the resulting calibrated model obtains more reliable uncertainty estimates.
Tasks	Autonomous Driving, Decision Making, Object Detection, Object Localization
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11210v1
PDF	http://arxiv.org/pdf/1811.11210v1.pdf
PWC	https://paperswithcode.com/paper/calibrating-uncertainties-in-object
Repo
Framework
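
What "uncalibrated" means for interval estimates can be checked with a simple diagnostic: a calibrated 90% interval should contain the true value about 90% of the time. The sketch below computes that empirical coverage. It is only a diagnostic under assumed inputs, not the calibration technique the paper adapts, and the function name and sample numbers are hypothetical.

```python
def empirical_coverage(lowers, uppers, targets):
    """Fraction of true values that fall inside their predicted interval."""
    inside = sum(1 for lo, hi, y in zip(lowers, uppers, targets) if lo <= y <= hi)
    return inside / len(targets)


# Hypothetical predicted intervals for one bounding-box coordinate.
lowers = [0.0, 0.0, 0.0, 0.0]
uppers = [1.0, 1.0, 1.0, 1.0]
targets = [0.5, 2.0, 0.1, 0.9]
coverage = empirical_coverage(lowers, uppers, targets)
```

Comparing `coverage` against the nominal confidence level across several levels gives a calibration curve; a large gap at any level indicates over- or under-confident intervals.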

Take a Look Around: Using Street View and Satellite Images to Estimate House Prices


Title	Take a Look Around: Using Street View and Satellite Images to Estimate House Prices
Authors	Stephen Law, Brooks Paige, Chris Russell
Abstract	When an individual purchases a home, they simultaneously purchase its structural features, its accessibility to work, and the neighborhood amenities. Some amenities, such as air quality, are measurable while others, such as the prestige or the visual impression of a neighborhood, are difficult to quantify. Despite the well-known impacts intangible housing features have on house prices, limited attention has been given to systematically quantifying these difficult-to-measure amenities. Two issues have led to this neglect. Not only do few quantitative methods exist that can measure the urban environment, but the collection of such data is also both costly and subjective. We show that street image and satellite image data can capture these urban qualities and improve the estimation of house prices. We propose a pipeline that uses a deep neural network model to automatically extract visual features from images to estimate house prices in London, UK. We make use of traditional housing features such as age, size, and accessibility as well as visual features from Google Street View images and Bing aerial images in estimating the house price model. We find encouraging results where learning to characterize the urban quality of a neighborhood improves house price prediction, even when generalizing to previously unseen London boroughs. We explore the use of non-linear vs. linear methods to fuse these cues with conventional models of house pricing, and show how the interpretability of linear models allows us to directly extract proxy variables for visual desirability of neighborhoods that are both of interest in their own right, and could be used as inputs to other econometric methods. This is particularly valuable as once the network has been trained with the training data, it can be applied elsewhere, allowing us to generate vivid dense maps of the visual appeal of London streets.
Tasks
Published	2018-07-18
URL	https://arxiv.org/abs/1807.07155v2
PDF	https://arxiv.org/pdf/1807.07155v2.pdf
PWC	https://paperswithcode.com/paper/take-a-look-around-using-street-view-and
Repo
Framework

MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description


Title	MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description
Authors	Oliver Nina, Washington Garcia, Scott Clouse, Alper Yilmaz
Abstract	Learning visual feature representations for video analysis is a daunting task that requires a large amount of training samples and a proper generalization framework. Many of the current state-of-the-art methods for video captioning and movie description rely on simple encoding mechanisms through recurrent neural networks to encode temporal visual information extracted from video data. In this paper, we introduce a novel multitask encoder-decoder framework for automatic semantic description and captioning of video sequences. In contrast to current approaches, our method relies on distinct decoders that train a visual encoder in a multitask fashion. Our system does not depend solely on multiple labels and allows for a lack of training data, working even with datasets where only one single annotation is viable per video. Our method shows improved performance over current state-of-the-art methods in several metrics on multi-caption and single-caption datasets. To the best of our knowledge, our method is the first method to use a multitask approach for encoding video features. Our method demonstrates its robustness on the Large Scale Movie Description Challenge (LSMDC) 2017 where our method won the movie description task and its results were ranked among other competitors as the most helpful for the visually impaired.
Tasks	Video Captioning
Published	2018-09-19
URL	http://arxiv.org/abs/1809.07257v1
PDF	http://arxiv.org/pdf/1809.07257v1.pdf
PWC	https://paperswithcode.com/paper/mtle-a-multitask-learning-encoder-of-visual
Repo
Framework

Mean Local Group Average Precision (mLGAP): A New Performance Metric for Hashing-based Retrieval


Title	Mean Local Group Average Precision (mLGAP): A New Performance Metric for Hashing-based Retrieval
Authors	Pak Lun Kevin Ding, Yikang Li, Baoxin Li
Abstract	The research on hashing techniques for visual data is gaining increased attention in recent years due to the need for compact representations supporting efficient search/retrieval in large-scale databases such as online images. Among many possibilities, Mean Average Precision (mAP) has emerged as the dominant performance metric for hashing-based retrieval. One glaring shortcoming of mAP is its inability to balance retrieval accuracy and utilization of hash codes: pushing a system to attain higher mAP will inevitably lead to poorer utilization of the hash codes. Poor utilization of the hash codes hinders good retrieval because of increased collision of samples in the hash space. This means that a model giving a higher mAP value does not necessarily do a better job in retrieval. In this paper, we introduce a new metric named Mean Local Group Average Precision (mLGAP) for better evaluation of the performance of hashing-based retrieval. The new metric provides a retrieval performance measure that also reconciles the utilization of hash codes, leading to a more practically meaningful performance metric than conventional ones like mAP. To this end, we start by mathematical analysis of the deficiencies of mAP for hashing-based retrieval. We then propose mLGAP and show why it is more appropriate for hashing-based retrieval. Experiments on image retrieval are used to demonstrate the effectiveness of the proposed metric.
Tasks	Image Retrieval
Published	2018-11-24
URL	http://arxiv.org/abs/1811.09763v1
PDF	http://arxiv.org/pdf/1811.09763v1.pdf
PWC	https://paperswithcode.com/paper/mean-local-group-average-precision-mlgap-a
Repo
Framework
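
For reference, the conventional mAP that the abstract critiques can be sketched in a few lines. This is the standard metric, not mLGAP (whose definition is not given in the abstract). The normalization by the number of relevant items actually retrieved is one common convention and is an assumption here, as are the function names and toy rankings.

```python
def average_precision(relevance):
    """AP for one ranked result list.

    relevance[i] is truthy if the result at rank i + 1 is relevant.
    Precision is accumulated at each rank where a relevant item appears.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of per-query AP values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two hypothetical queries, each with a ranked list of relevance flags.
rankings = [[1, 0], [0, 1]]
map_score = mean_average_precision(rankings)
```

Note that this score depends only on the order of relevant and irrelevant results, which is consistent with the abstract's point that mAP by itself says nothing about how well the hash codes are utilized.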