May 7, 2019

Paper Group ANR 150

To Fall Or Not To Fall: A Visual Approach to Physical Stability Prediction

Title To Fall Or Not To Fall: A Visual Approach to Physical Stability Prediction
Authors Wenbin Li, Seyedmajid Azimi, Aleš Leonardis, Mario Fritz
Abstract Understanding physical phenomena is a key competence that enables humans and animals to act and interact under uncertain perception in previously unseen environments containing novel objects and their configurations. Developmental psychology has shown that such skills are acquired by infants from observations at a very early stage. In this paper, we contrast a more traditional approach of taking a model-based route with explicit 3D representations and physical simulation with an end-to-end approach that directly predicts stability and related quantities from appearance. We ask whether, and to what extent and quality, such a skill can be acquired directly in a data-driven way, bypassing the need for an explicit simulation. We present a learning-based approach based on simulated data that predicts the stability of towers composed of wooden blocks under different conditions, as well as quantities related to the potential fall of the towers. The evaluation is carried out on synthetic data and compared to human judgments on the same stimuli.
Tasks
Published 2016-03-31
URL http://arxiv.org/abs/1604.00066v1
PDF http://arxiv.org/pdf/1604.00066v1.pdf
PWC https://paperswithcode.com/paper/to-fall-or-not-to-fall-a-visual-approach-to
Repo
Framework

Low-rank tensor completion: a Riemannian manifold preconditioning approach

Title Low-rank tensor completion: a Riemannian manifold preconditioning approach
Authors Hiroyuki Kasai, Bamdev Mishra
Abstract We propose a novel Riemannian manifold preconditioning approach for the tensor completion problem with rank constraint. A novel Riemannian metric or inner product is proposed that exploits the least-squares structure of the cost function and takes into account the structured symmetry that exists in Tucker decomposition. The specific metric allows the use of the versatile framework of Riemannian optimization on quotient manifolds to develop preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms for batch and online setups, respectively. Concrete matrix representations of various optimization-related ingredients are listed. Numerical comparisons suggest that our proposed algorithms robustly outperform state-of-the-art algorithms across different synthetic and real-world datasets.
Tasks
Published 2016-05-26
URL http://arxiv.org/abs/1605.08257v1
PDF http://arxiv.org/pdf/1605.08257v1.pdf
PWC https://paperswithcode.com/paper/low-rank-tensor-completion-a-riemannian
Repo
Framework

Gambler’s Ruin Bandit Problem

Title Gambler’s Ruin Bandit Problem
Authors Nima Akbarzadeh, Cem Tekin
Abstract In this paper, we propose a new multi-armed bandit problem called the Gambler’s Ruin Bandit Problem (GRBP). In the GRBP, the learner proceeds in a sequence of rounds, where each round is a Markov Decision Process (MDP) with two actions (arms): a continuation action that moves the learner randomly over the state space around the current state; and a terminal action that moves the learner directly into one of the two terminal states (goal and dead-end state). The current round ends when a terminal state is reached, and the learner incurs a positive reward only when the goal state is reached. The objective of the learner is to maximize its long-term reward (expected number of times the goal state is reached), without having any prior knowledge on the state transition probabilities. We first prove a result on the form of the optimal policy for the GRBP. Then, we define the regret of the learner with respect to an omnipotent oracle, which acts optimally in each round, and prove that it increases logarithmically over rounds. We also identify a condition under which the learner’s regret is bounded. A potential application of the GRBP is optimal medical treatment assignment, in which the continuation action corresponds to a conservative treatment and the terminal action corresponds to a risky treatment such as surgery.
Tasks
Published 2016-05-21
URL http://arxiv.org/abs/1605.06651v3
PDF http://arxiv.org/pdf/1605.06651v3.pdf
PWC https://paperswithcode.com/paper/gamblers-ruin-bandit-problem
Repo
Framework
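
The trade-off between the continuation and terminal actions described in the abstract above can be illustrated with the classical gambler's ruin formula. The sketch below is not the paper's algorithm: it assumes, purely for illustration, that the continuation action is a simple random walk on states 0..goal (dead-end at 0, goal at `goal`) that moves up with a known probability `p_up`, and that the terminal action reaches the goal with a known probability `p_terminal`. The function names and these parameters are hypothetical; in the GRBP itself the transition probabilities are unknown and must be learned.

```python
def goal_probability(state: int, goal: int, p_up: float) -> float:
    """Chance that a simple random walk from `state` hits `goal` before 0.

    Assumption: each step moves +1 with probability p_up and -1 otherwise.
    """
    if not 0 <= state <= goal:
        raise ValueError("state must lie between 0 and goal")
    if not 0.0 < p_up < 1.0:
        raise ValueError("p_up must be strictly between 0 and 1")
    if abs(p_up - 0.5) < 1e-12:
        # Fair walk: the hitting probability is linear in the start state.
        return state / goal
    ratio = (1.0 - p_up) / p_up
    return (1.0 - ratio ** state) / (1.0 - ratio ** goal)


def prefer_continuation(state: int, goal: int, p_up: float, p_terminal: float) -> bool:
    """Compare 'always continue' with 'take the terminal action now'.

    This is a one-shot comparison for illustration only, not the optimal
    policy characterized in the paper.
    """
    return goal_probability(state, goal, p_up) >= p_terminal
```

For example, `prefer_continuation(1, 2, 0.75, 0.5)` compares a hitting probability of 0.75 against a terminal success probability of 0.5, so under these assumptions continuing looks better from that state.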

Image-level Classification in Hyperspectral Images using Feature Descriptors, with Application to Face Recognition

Title Image-level Classification in Hyperspectral Images using Feature Descriptors, with Application to Face Recognition
Authors Vivek Sharma, Luc Van Gool
Abstract In this paper, we propose a novel pipeline for image-level classification in hyperspectral images. With it, we show that the discriminative spectral information in image-level features leads to significantly improved performance in a face recognition task. We also explore the potential of traditional feature descriptors in hyperspectral images. From our evaluations, we observe that SIFT features outperform the state-of-the-art hyperspectral face recognition methods, as well as the other descriptors. With the increasing deployment of hyperspectral sensors in a multitude of applications, we believe that our approach can effectively exploit the spectral information in hyperspectral images and is thus beneficial to more accurate classification.
Tasks Face Recognition
Published 2016-05-11
URL http://arxiv.org/abs/1605.03428v1
PDF http://arxiv.org/pdf/1605.03428v1.pdf
PWC https://paperswithcode.com/paper/image-level-classification-in-hyperspectral
Repo
Framework

How Useful is Region-based Classification of Remote Sensing Images in a Deep Learning Framework?

Title How Useful is Region-based Classification of Remote Sensing Images in a Deep Learning Framework?
Authors Nicolas Audebert, Bertrand Le Saux, Sébastien Lefèvre
Abstract In this paper, we investigate the impact of segmentation algorithms as a preprocessing step for classification of remote sensing images in a deep learning framework. In particular, we address the issue of segmenting the image into regions to be classified using pre-trained deep neural networks as feature extractors for an SVM-based classifier. An efficient segmentation as a preprocessing step helps learning by adding a spatially coherent structure to the data. Therefore, we compare algorithms producing superpixels with more traditional remote sensing segmentation algorithms and measure the variation in terms of classification accuracy. We establish that superpixel algorithms allow for better classification accuracy, as a homogeneous and compact segmentation favors better generalization of the training samples.
Tasks
Published 2016-09-22
URL http://arxiv.org/abs/1609.06861v1
PDF http://arxiv.org/pdf/1609.06861v1.pdf
PWC https://paperswithcode.com/paper/how-useful-is-region-based-classification-of
Repo
Framework

Sparsity-driven weighted ensemble classifier

Title Sparsity-driven weighted ensemble classifier
Authors Atilla Ozgur, Hamit Erdem, Fatih Nar
Abstract In this study, a novel sparsity-driven weighted ensemble classifier (SDWEC) that improves classification accuracy and minimizes the number of classifiers is proposed. Using pre-trained classifiers, an ensemble in which base classifiers vote according to assigned weights is formed. These assigned weights directly affect classifier accuracy. In the proposed method, the problem of finding the ensemble weights is modeled as a cost function with the following terms: (a) a data fidelity term aiming to decrease the misclassification rate, (b) a sparsity term aiming to decrease the number of classifiers, and (c) a non-negativity constraint on the weights of the classifiers. As the proposed cost function is non-convex and thus hard to solve, convex relaxation techniques and novel approximations are employed to obtain a numerically efficient solution. The sparsity term of the cost function allows a trade-off between accuracy and testing time when needed. The efficiency of SDWEC was tested on 11 datasets and compared with state-of-the-art classifier ensemble methods. The results show that SDWEC provides better or similar accuracy levels using fewer classifiers and reduces testing time for the ensemble.
Tasks
Published 2016-10-02
URL https://arxiv.org/abs/1610.00270v3
PDF https://arxiv.org/pdf/1610.00270v3.pdf
PWC https://paperswithcode.com/paper/sparsity-driven-weighted-ensemble-classifier
Repo
Framework
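
The three ingredients listed in the abstract above (data fidelity, sparsity, non-negativity) can be sketched with a simple convex stand-in. The code below is not the SDWEC cost function or solver: it assumes a squared-error fidelity term, an L1 penalty (which equals the sum of the weights once they are non-negative), and plain projected gradient descent. The function names and the parameters `lam`, `step`, and `iters` are hypothetical choices for illustration.

```python
def fit_sparse_weights(preds, labels, lam=0.1, step=0.1, iters=2000):
    """Minimize (1/2n)*||P w - y||^2 + lam*sum(w) subject to w >= 0.

    preds[i][j] is the vote (+1 or -1) of classifier j on sample i;
    labels[i] is the true label (+1 or -1).
    """
    n, m = len(labels), len(preds[0])
    w = [1.0 / m] * m  # start from uniform voting weights
    for _ in range(iters):
        residual = [
            sum(w[j] * preds[i][j] for j in range(m)) - labels[i]
            for i in range(n)
        ]
        # Gradient of the fidelity term plus the constant L1 slope lam.
        grad = [
            sum(residual[i] * preds[i][j] for i in range(n)) / n + lam
            for j in range(m)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[j] - step * grad[j]) for j in range(m)]
    return w


def ensemble_predict(votes, w):
    """Weighted vote of the base classifiers for a single sample."""
    score = sum(wj * vj for wj, vj in zip(w, votes))
    return 1 if score >= 0 else -1


# Toy data: classifier 0 is always right, classifiers 1 and 2 are uninformative.
labels = [1, -1, 1, -1]
preds = [[1, 1, 1], [-1, 1, 1], [1, -1, 1], [-1, -1, 1]]
weights = fit_sparse_weights(preds, labels)
```

On this toy data the penalty drives the weights of the two uninformative classifiers to zero, so only one classifier needs to be evaluated at test time. That pruning effect is the accuracy versus testing-time trade-off the abstract refers to, shown here under the simplified objective.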

Fuzzy Logic in Narrow Sense with Hedges

Title Fuzzy Logic in Narrow Sense with Hedges
Authors Van Hung Le
Abstract Classical logic has a serious limitation in that it cannot cope with the issues of vagueness and uncertainty into which fall most modes of human reasoning. In order to provide a foundation for human knowledge representation and reasoning in the presence of vagueness, imprecision, and uncertainty, fuzzy logic should have the ability to deal with linguistic hedges, which play a very important role in the modification of fuzzy predicates. In this paper, we extend fuzzy logic in narrow sense with graded syntax, introduced by Novak et al., with many hedge connectives. In one case, no hedge has a dual. In the other case, each hedge can have its own dual. The resulting logics are shown to also have Pavelka-style completeness.
Tasks
Published 2016-08-29
URL http://arxiv.org/abs/1608.08033v1
PDF http://arxiv.org/pdf/1608.08033v1.pdf
PWC https://paperswithcode.com/paper/fuzzy-logic-in-narrow-sense-with-hedges
Repo
Framework

Robust Active Perception via Data-association aware Belief Space planning

Title Robust Active Perception via Data-association aware Belief Space planning
Authors Shashank Pathak, Antony Thomas, Asaf Feniger, Vadim Indelman
Abstract We develop a belief space planning (BSP) approach that advances the state of the art by incorporating reasoning about data association (DA) within planning, while considering additional sources of uncertainty. Existing BSP approaches typically assume data association is given and perfect, an assumption that can be harder to justify while operating, in the presence of localization uncertainty, in ambiguous and perceptually aliased environments. In contrast, our data association aware belief space planning (DA-BSP) approach explicitly reasons about DA within belief evolution, and as such can better accommodate these challenging real-world scenarios. In particular, we show that due to perceptual aliasing, the posterior belief becomes a mixture of probability distribution functions, and design cost functions that measure the expected level of ambiguity and posterior uncertainty. Using these and standard costs (e.g., control penalty, distance to goal) within the objective function yields a general framework that reliably represents action impact and, in particular, is capable of active disambiguation. Our approach is thus applicable to robust active perception and autonomous navigation in perceptually aliased environments. We demonstrate key aspects in basic and realistic simulations.
Tasks Autonomous Navigation
Published 2016-06-16
URL http://arxiv.org/abs/1606.05124v1
PDF http://arxiv.org/pdf/1606.05124v1.pdf
PWC https://paperswithcode.com/paper/robust-active-perception-via-data-association
Repo
Framework
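
The statement in the abstract above that perceptual aliasing turns the posterior belief into a mixture can be illustrated with a one-dimensional toy. The sketch below is not the DA-BSP formulation: it assumes a scalar observation, Gaussian measurement noise, and a finite list of candidate associations, each with a predicted measurement and a prior. The function names and parameters are hypothetical.

```python
import math


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a 1D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def association_weights(observation, predicted, priors, noise_std):
    """Posterior weight of each data-association hypothesis (Bayes' rule).

    predicted[k] is the measurement expected if hypothesis k is the true
    association; priors[k] is its prior probability.
    """
    unnormalized = [
        prior * gaussian_pdf(observation, mean, noise_std)
        for mean, prior in zip(predicted, priors)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under every hypothesis")
    return [u / total for u in unnormalized]


# Two landmarks that look identical from the current pose (perfect aliasing).
aliased = association_weights(1.0, [1.0, 1.0], [0.5, 0.5], noise_std=1.0)
# Two landmarks whose predicted measurements are far apart.
distinct = association_weights(0.0, [0.0, 10.0], [0.5, 0.5], noise_std=1.0)
```

With perfectly aliased landmarks the weights stay at the prior, so the posterior remains a two-component mixture. When the predicted measurements differ strongly, nearly all of the weight moves to one hypothesis, which is the kind of disambiguation an action can be chosen to provoke.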

Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of Neurons

Title Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of Neurons
Authors Lingxi Xie, Qi Tian, John Flynn, Jingdong Wang, Alan Yuille
Abstract Deep Convolutional Neural Networks (CNNs) are playing important roles in state-of-the-art visual recognition. This paper focuses on modeling the spatial co-occurrence of neuron responses, which has been less studied in previous work. For this, we consider the neurons in the hidden layer as neural words, and construct a set of geometric neural phrases on top of them. The idea of grouping neural words into neural phrases is borrowed from the Bag-of-Visual-Words (BoVW) model. Next, the Geometric Neural Phrase Pooling (GNPP) algorithm is proposed to efficiently encode these neural phrases. GNPP acts as a new type of hidden layer, which punishes the isolated neuron responses after convolution, and can be inserted into a CNN model with little extra computational overhead. Experimental results show that GNPP produces significant and consistent accuracy gain in image classification.
Tasks Image Classification
Published 2016-07-21
URL http://arxiv.org/abs/1607.06514v1
PDF http://arxiv.org/pdf/1607.06514v1.pdf
PWC https://paperswithcode.com/paper/geometric-neural-phrase-pooling-modeling-the
Repo
Framework

Toward a general, scaleable framework for Bayesian teaching with applications to topic models

Title Toward a general, scaleable framework for Bayesian teaching with applications to topic models
Authors Baxter S. Eaves Jr, Patrick Shafto
Abstract Machines, not humans, are the world’s dominant knowledge accumulators, but humans remain the dominant decision makers. Interpreting and disseminating the knowledge accumulated by machines requires expertise and time, and is prone to failure. The problem of how best to convey accumulated knowledge from computers to humans is a critical bottleneck in the broader application of machine learning. We propose an approach based on human teaching where the problem is formalized as selecting a small subset of the data that will, with high probability, lead the human user to the correct inference. This approach, though successful for modeling human learning in simple laboratory experiments, has failed to achieve broader relevance due to challenges in formulating general and scalable algorithms. We propose general-purpose teaching via pseudo-marginal sampling and demonstrate the algorithm by teaching topic models. Simulation results show that our sampling-based approach effectively approximates the probability where ground truth is available via enumeration, results in data that are markedly different from those expected by random sampling, and speeds learning, especially for small amounts of data. Application to movie synopsis data illustrates differences between teaching and random sampling for teaching distributions and specific topics, and demonstrates gains in scalability and applicability to real-world problems.
Tasks Topic Models
Published 2016-05-25
URL http://arxiv.org/abs/1605.07999v1
PDF http://arxiv.org/pdf/1605.07999v1.pdf
PWC https://paperswithcode.com/paper/toward-a-general-scaleable-framework-for
Repo
Framework

Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach

Title Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach
Authors Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid
Abstract Convolutional neural networks (CNNs) have recently received a lot of attention due to their ability to model local stationary structures in natural images in a multi-scale fashion, when learning all model parameters with supervision. While excellent performance was achieved for image classification when large amounts of labeled visual data are available, their success for unsupervised tasks such as image retrieval has been moderate so far. Our paper focuses on this latter setting and explores several methods for learning patch descriptors without supervision, with application to matching and instance-level retrieval. To that effect, we propose a new family of convolutional descriptors for patch representation, based on the recently introduced convolutional kernel networks. We show that our descriptor, named Patch-CKN, performs better than SIFT as well as other convolutional networks learned by artificially introducing supervision, and is significantly faster to train. To demonstrate its effectiveness, we perform an extensive evaluation on standard benchmarks for patch and image retrieval, where we obtain state-of-the-art results. We also introduce a new dataset called RomePatches, which allows descriptor performance to be studied simultaneously for patch and image retrieval.
Tasks Image Classification, Image Retrieval
Published 2016-03-01
URL http://arxiv.org/abs/1603.00438v1
PDF http://arxiv.org/pdf/1603.00438v1.pdf
PWC https://paperswithcode.com/paper/convolutional-patch-representations-for-image
Repo
Framework

Exploiting Structure Sparsity for Covariance-based Visual Representation

Title Exploiting Structure Sparsity for Covariance-based Visual Representation
Authors Jianjia Zhang, Lei Wang, Luping Zhou, Wanqing Li
Abstract The past few years have witnessed increasing research interest in covariance-based feature representation. A variety of methods have been proposed to boost its efficacy, with some recent ones resorting to nonlinear kernel techniques. Noting that the essence of this feature representation is to characterise the underlying structure of visual features, this paper argues that an equally, if not more, important approach to boosting its efficacy is to improve the quality of this characterisation. Following this idea, we propose to exploit the structure sparsity of visual features in skeletal human action recognition, and compute the sparse inverse covariance estimate (SICE) as the feature representation. We discuss the advantages of this new representation in dealing with small sample size, high dimensionality, and modelling capability. Furthermore, utilising the monotonicity property of SICE, we efficiently generate a hierarchy of SICE matrices to characterise the structure of visual features at different sparsity levels, and two discriminative learning algorithms are then developed to adaptively integrate them to perform recognition. As demonstrated by extensive experiments, the proposed representation leads to significantly improved recognition performance over comparable state-of-the-art methods. In particular, as a method fully based on linear techniques, it is comparable to or even better than those employing nonlinear kernel techniques. This result demonstrates the value of exploiting structure sparsity for covariance-based feature representation.
Tasks Temporal Action Localization
Published 2016-10-27
URL http://arxiv.org/abs/1610.08619v2
PDF http://arxiv.org/pdf/1610.08619v2.pdf
PWC https://paperswithcode.com/paper/exploiting-structure-sparsity-for-covariance
Repo
Framework

Efficient Likelihood Bayesian Constrained Local Model

Title Efficient Likelihood Bayesian Constrained Local Model
Authors Hailiang Li, Kin-Man Lam, Man-Yau Chiu, Kangheng Wu, Zhibin Lei
Abstract The constrained local model (CLM) proposes a paradigm in which the locations of a set of local landmark detectors are constrained to lie in a subspace spanned by a shape point distribution model (PDM). Fitting the model to an object involves two steps. A response map, which represents the likelihood of the location of a landmark, is first computed for each landmark using local-texture detectors. Then, an optimal PDM is determined by jointly maximizing all the response maps, with a global shape constraint. This global optimization can be considered as a Bayesian inference problem, where the posterior distribution of the shape parameters, as well as the pose parameters, can be inferred using maximum a posteriori (MAP). In this paper, we present a cascaded face-alignment approach, which employs random-forest regressors to estimate the positions of each landmark, as a likelihood term, efficiently in the CLM model. Interpreted within the CLM framework, this algorithm is named the efficient likelihood Bayesian constrained local model (elBCLM). Furthermore, in each stage of the regressors, the PDM non-rigid parameters of the previous stage can serve as shape clues for training that stage's regressors. Experimental results on benchmarks show that our approach achieves about a 3 to 5 times speed-up compared with CLM models and improves fitting quality by around 10% compared with regression models in the same setting.
Tasks Bayesian Inference, Face Alignment
Published 2016-11-30
URL http://arxiv.org/abs/1611.09956v1
PDF http://arxiv.org/pdf/1611.09956v1.pdf
PWC https://paperswithcode.com/paper/efficient-likelihood-bayesian-constrained
Repo
Framework

Cascaded Face Alignment via Intimacy Definition Feature

Title Cascaded Face Alignment via Intimacy Definition Feature
Authors Hailiang Li, Kin-Man Lam, Edmond M. Y. Chiu, Kangheng Wu, Zhibin Lei
Abstract In this paper, we present a random-forest based fast cascaded regression model for face alignment, via a novel local feature. Our proposed local lightweight feature, namely intimacy definition feature (IDF), is more discriminative than landmark pose-indexed feature, more efficient than histogram of oriented gradients (HOG) feature and scale-invariant feature transform (SIFT) feature, and more compact than the local binary feature (LBF). Experimental results show that our approach achieves state-of-the-art performance when tested on the most challenging datasets. Compared with an LBF-based algorithm, our method can achieve about two times the speed-up and more than 20% improvement, in terms of alignment accuracy measurement, and save an order of magnitude of memory requirement.
Tasks Face Alignment
Published 2016-11-21
URL http://arxiv.org/abs/1611.06642v2
PDF http://arxiv.org/pdf/1611.06642v2.pdf
PWC https://paperswithcode.com/paper/cascaded-face-alignment-via-intimacy
Repo
Framework

Mask-off: Synthesizing Face Images in the Presence of Head-mounted Displays

Title Mask-off: Synthesizing Face Images in the Presence of Head-mounted Displays
Authors Yajie Zhao, Qingguo Xu, Xinyu Huang, Ruigang Yang
Abstract A head-mounted display (HMD) could be an important component of an augmented reality system. However, as the upper face region is seriously occluded by the device, the user experience could be affected in applications such as telecommunication and multi-player video games. In this paper, we first present a novel experimental setup that consists of two near-infrared (NIR) cameras pointed at the eye regions and one visible-light RGB camera that captures the visible face region. The main purpose of this paper is to synthesize realistic face images without occlusions based on the images captured by these cameras. To this end, we propose a novel synthesis framework that contains four modules: 3D head reconstruction, face alignment and tracking, face synthesis, and eye synthesis. In face synthesis, we propose a novel algorithm that can robustly align and track a personalized 3D head model given a face that is severely occluded by the HMD. In eye synthesis, in order to generate accurate eye movements and dynamic wrinkle variations around eye regions, we propose another novel algorithm to colorize the NIR eye images and further remove the “red eye” effects caused by the colorization. Results show that both the hardware setup and the system framework are robust enough to synthesize realistic face images in video sequences.
Tasks Colorization, Face Alignment, Face Generation
Published 2016-10-26
URL http://arxiv.org/abs/1610.08481v2
PDF http://arxiv.org/pdf/1610.08481v2.pdf
PWC https://paperswithcode.com/paper/mask-off-synthesizing-face-images-in-the
Repo
Framework