May 6, 2019

3247 words 16 mins read

Paper Group ANR 307

Network Maximal Correlation. Improved Image Boundaries for Better Video Segmentation. Latent Constrained Correlation Filters for Object Localization. Refining Geometry from Depth Sensors using IR Shading Images. A Unified Tensor-based Active Appearance Face Model. Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic. …

Network Maximal Correlation

Title Network Maximal Correlation
Authors Soheil Feizi, Ali Makhdoumi, Ken Duffy, Muriel Medard, Manolis Kellis
Abstract We introduce Network Maximal Correlation (NMC) as a multivariate measure of nonlinear association among random variables. NMC is defined via an optimization that infers transformations of variables by maximizing aggregate inner products between transformed variables. For finite discrete and jointly Gaussian random variables, we characterize a solution of the NMC optimization using expansions over appropriate basis functions. For finite discrete variables, we propose an algorithm based on alternating conditional expectations to compute NMC. Moreover, we propose a distributed algorithm to compute an approximation of NMC for large and dense graphs using graph partitioning. For finite discrete variables, we show that the probability that NMC computed from empirical distributions deviates from the true NMC by more than any given level decays exponentially fast as the sample size grows. For jointly Gaussian variables, we show that under some conditions the NMC optimization is an instance of the Max-Cut problem. We then illustrate an application of NMC to inference of graphical models for bijective functions of jointly Gaussian variables. Finally, we show NMC’s utility in a data application: learning nonlinear dependencies among genes in a cancer dataset.
Tasks graph partitioning
Published 2016-06-15
URL http://arxiv.org/abs/1606.04789v2
PDF http://arxiv.org/pdf/1606.04789v2.pdf
PWC https://paperswithcode.com/paper/network-maximal-correlation
Repo
Framework
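
For intuition, the alternating-conditional-expectations step at the heart of the discrete-variable algorithm can be sketched for just two variables. This is a hedged toy illustration of bivariate maximal correlation, not the authors’ network-wide NMC solver; the integer coding and toy data below are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def ace_maximal_correlation(x, y, iters=100):
    # Bivariate maximal correlation via alternating conditional expectations.
    # Assumes x and y are integer-coded with contiguous symbols 0..K-1.
    g = rng.standard_normal(y.max() + 1)            # current transform g(y)
    for _ in range(iters):
        # f(v) = E[g(Y) | X = v], standardized over the empirical sample
        f = np.array([g[y[x == v]].mean() for v in range(x.max() + 1)])
        f = (f - f[x].mean()) / f[x].std()
        # g(w) = E[f(X) | Y = w], standardized likewise
        g = np.array([f[x[y == w]].mean() for w in range(y.max() + 1)])
        g = (g - g[y].mean()) / g[y].std()
    return np.corrcoef(f[x], g[y])[0, 1]

# Toy data: y depends on x only through x's parity, so Pearson correlation
# is near zero while the maximal correlation is substantial (~0.7).
x = rng.integers(0, 5, 5000)
y = x % 2 + rng.integers(0, 2, 5000)
print(np.corrcoef(x, y)[0, 1], ace_maximal_correlation(x, y))
```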

Improved Image Boundaries for Better Video Segmentation

Title Improved Image Boundaries for Better Video Segmentation
Authors Anna Khoreva, Rodrigo Benenson, Fabio Galasso, Matthias Hein, Bernt Schiele
Abstract Graph-based video segmentation methods rely on superpixels as a starting point. While most previous work has focused on constructing the graph edges and weights, and on solving the graph partitioning problem, this paper focuses on better superpixels for video segmentation. We demonstrate through a comparative analysis that superpixels extracted from boundaries perform best, and show that boundary estimation can be significantly improved via image- and time-domain cues. With superpixels generated from our improved boundaries, we observe consistent improvements for two video segmentation methods on two different datasets.
Tasks graph partitioning, Video Semantic Segmentation
Published 2016-05-12
URL http://arxiv.org/abs/1605.03718v2
PDF http://arxiv.org/pdf/1605.03718v2.pdf
PWC https://paperswithcode.com/paper/improved-image-boundaries-for-better-video
Repo
Framework
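
As a rough illustration of the superpixels-from-boundaries idea, one standard recipe is a seeded watershed on a boundary-strength map. This sketch uses scikit-image and a synthetic boundary map; it is not the paper’s boundary detector or its video pipeline.

```python
import numpy as np
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def superpixels_from_boundaries(boundary_map, min_distance=10):
    # Seeds sit at local minima of boundary strength (region interiors);
    # watershed then grows regions whose borders align with strong boundaries.
    seeds = peak_local_max(-boundary_map, min_distance=min_distance)
    markers = np.zeros(boundary_map.shape, dtype=int)
    markers[tuple(seeds.T)] = np.arange(1, len(seeds) + 1)
    return watershed(boundary_map, markers)

# Synthetic boundary-strength map: a vertical ridge splitting the image;
# the resulting superpixel borders follow the ridge.
b = np.zeros((100, 100))
b[:, 48:52] = 1.0
labels = superpixels_from_boundaries(b)
print(np.unique(labels).size, "superpixels")
```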

Latent Constrained Correlation Filters for Object Localization

Title Latent Constrained Correlation Filters for Object Localization
Authors Shangzhen Luan, Baochang Zhang, Jungong Han, Chen Chen, Ling Shao, Alessandro Perina, Linlin Shen
Abstract A neglected fact in traditional machine learning methods is that data sampling actually leads to solution sampling. We consider this observation important because having the solution sampling available makes variable distribution estimation, a problem in many learning-related applications, more tractable. In this paper, we implement this idea on correlation filters, which have attracted much attention in the past few years due to their high performance at low computational cost. More specifically, we propose a new method, named latent constrained correlation filters (LCCF), that maps the correlation filters to a given latent subspace, in which we establish a new learning framework that embeds distribution-related constraints into the original problem. We further introduce a subspace-based alternating direction method of multipliers (SADMM) to efficiently solve the optimization problem, which is proven to converge at the saddle point. Our approach is successfully applied to two different tasks: eye localization and car detection. Extensive experiments demonstrate that LCCF outperforms state-of-the-art methods when samples suffer from noise and occlusion.
Tasks Object Localization
Published 2016-06-07
URL http://arxiv.org/abs/1606.02170v2
PDF http://arxiv.org/pdf/1606.02170v2.pdf
PWC https://paperswithcode.com/paper/latent-constrained-correlation-filters-for
Repo
Framework
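
For context on the machinery LCCF builds on, here is a minimal single-template correlation filter in the Fourier domain (a MOSSE-style least-squares fit). It omits the latent subspace and the SADMM solver entirely; everything below is a generic sketch, not the authors’ method.

```python
import numpy as np

def train_filter(patches, sigma=2.0, lam=1e-2):
    # MOSSE-style correlation filter: least-squares fit in the Fourier
    # domain mapping each patch to a Gaussian response at the centre.
    h, w = patches[0].shape
    yy, xx = np.mgrid[:h, :w]
    g = np.exp(-((yy - h // 2) ** 2 + (xx - w // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(g)
    A = np.zeros((h, w), complex)
    B = np.zeros((h, w), complex)
    for p in patches:
        F = np.fft.fft2(p)
        A += G * np.conj(F)            # numerator:   sum G F*
        B += F * np.conj(F)            # denominator: sum F F* (+ reg.)
    return A / (B + lam)

def respond(H, patch):
    return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))

# Train on a patch with the target at the centre, then localize it in a
# shifted copy: the response peak moves with the target.
rng = np.random.default_rng(1)
p = 0.1 * rng.standard_normal((64, 64))
p[30:34, 30:34] += 3.0
H = train_filter([p])
r = respond(H, np.roll(p, (7, -5), axis=(0, 1)))
print(np.unravel_index(r.argmax(), r.shape))   # ~ (39, 27) = centre + shift
```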

Refining Geometry from Depth Sensors using IR Shading Images

Title Refining Geometry from Depth Sensors using IR Shading Images
Authors Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon
Abstract We propose a method to refine the geometry of 3D meshes from a consumer-level depth camera, e.g., the Kinect, by exploiting shading cues captured from an infrared (IR) camera. A major benefit of using an IR camera instead of an RGB camera is that the IR images captured are narrow-band images that filter out most undesired ambient light, which makes our system robust against natural indoor illumination. Moreover, for many natural objects with colorful textures in the visible spectrum, the subjects appear to have a uniform albedo in the IR spectrum. Based on our analyses of the IR projector light of the Kinect, we define a near-light-source IR shading model that describes the captured intensity as a function of surface normals, albedo, lighting direction, and distance between light source and surface points. To resolve the ambiguity in our model between the normals and distances, we utilize an initial 3D mesh from KinectFusion and multi-view information to reliably estimate surface details that were not captured and reconstructed by KinectFusion. Our approach directly operates on the mesh model for geometry refinement. We ran experiments on geometries captured by both the Kinect I and Kinect II, as the depth acquisition in the Kinect I is based on a structured-light technique while that of the Kinect II is based on time-of-flight (ToF) technology. The effectiveness of our approach is demonstrated through several challenging real-world examples. We have also performed a user study to evaluate the quality of the mesh models before and after our refinements.
Tasks
Published 2016-08-18
URL http://arxiv.org/abs/1608.05204v1
PDF http://arxiv.org/pdf/1608.05204v1.pdf
PWC https://paperswithcode.com/paper/refining-geometry-from-depth-sensors-using-ir
Repo
Framework
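
The near-light shading model can be made concrete with a small numerical example. Below is a generic near-light Lambertian intensity in the spirit of the abstract, with albedo, normal, lighting direction and light-to-surface distance as inputs; the exact functional form fitted in the paper may differ.

```python
import numpy as np

def ir_intensity(albedo, normal, surface_pt, light_pos):
    # Near-light Lambertian shading: intensity falls off with the squared
    # distance to the IR projector and with the cosine of incidence.
    # (A generic near-light model in the spirit of the paper, not its fit.)
    to_light = light_pos - surface_pt
    d = np.linalg.norm(to_light)
    l = to_light / d
    n = normal / np.linalg.norm(normal)
    return albedo * max(0.0, float(n @ l)) / d ** 2

# Same surface point, twice as far from the projector -> 1/4 the intensity
n = np.array([0.0, 0.0, 1.0])
print(ir_intensity(0.8, n, np.array([0, 0, 0.0]), np.array([0, 0, 1.0])))  # 0.8
print(ir_intensity(0.8, n, np.array([0, 0, 0.0]), np.array([0, 0, 2.0])))  # 0.2
```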

A Unified Tensor-based Active Appearance Face Model

Title A Unified Tensor-based Active Appearance Face Model
Authors Zhen-Hua Feng, Josef Kittler, William Christmas, Xiao-Jun Wu
Abstract Appearance variations cause many difficulties in face image analysis. To deal with this challenge, we present a Unified Tensor-based Active Appearance Model (UT-AAM) for jointly modelling the geometry and texture information of 2D faces. For each type of face information, namely shape and texture, we construct a unified tensor model capturing all relevant appearance variations. This contrasts with the variation-specific models of the classical tensor AAM. To achieve unification across pose variations, we propose a strategy for dealing with self-occluded faces that obtains consistent shape and texture representations of pose-varied faces. In addition, our UT-AAM is capable of constructing the model from an incomplete training dataset using tensor completion methods. Lastly, we use an effective cascaded-regression-based method for UT-AAM fitting. With these advancements, the utility of UT-AAM in practice is considerably enhanced. As an example, we demonstrate the improvements in training facial landmark detectors through the use of UT-AAM to synthesise a large number of virtual samples. Experimental results obtained using the Multi-PIE and 300-W face datasets demonstrate the merits of the proposed approach.
Tasks
Published 2016-12-30
URL http://arxiv.org/abs/1612.09548v2
PDF http://arxiv.org/pdf/1612.09548v2.pdf
PWC https://paperswithcode.com/paper/a-unified-tensor-based-active-appearance-face
Repo
Framework
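
A unified tensor model of this kind rests on multilinear decompositions. The sketch below shows a generic truncated higher-order SVD (Tucker/HOSVD) in NumPy; the tensor sizes and mode semantics are invented for illustration and are not the paper’s actual model construction.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: move `mode` to the front and flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    # Truncated higher-order SVD: one factor matrix per mode plus a core,
    # the generic multilinear decomposition tensor AAMs build on.
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):   # core = T x_1 U1' x_2 U2' x_3 U3'
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

# Hypothetical "faces" tensor: identities x poses x landmark coordinates
rng = np.random.default_rng(2)
T = rng.standard_normal((10, 5, 136))
core, factors = hosvd(T, ranks=(4, 3, 20))
print(core.shape, [U.shape for U in factors])  # (4, 3, 20) and per-mode bases
```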

Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic

Title Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic
Authors Somak Aditya, Yezhou Yang, Chitta Baral, Yiannis Aloimonos
Abstract In this work, we explore a genre of puzzles (“image riddles”) which involves a set of images and a question. Answering these puzzles requires capabilities in both visual detection (including object and activity recognition) and knowledge-based or commonsense reasoning. We compile a dataset of over 3k riddles, where each riddle consists of 4 images and a ground-truth answer. The annotations are validated using crowd-sourced evaluation. We also define an automatic evaluation metric to track future progress. Our task bears similarity to commonly known IQ tasks such as analogy solving and sequence filling that are often used to test intelligence. We develop a probabilistic-reasoning-based approach that utilizes probabilistic commonsense knowledge to answer these riddles with reasonable accuracy. We demonstrate the results of our approach using both automatic and human evaluations. Our approach achieves promising results on these riddles and provides a strong baseline for future attempts. We make the entire dataset and related materials publicly available on the ImageRiddle website (http://bit.ly/22f9Ala).
Tasks Activity Recognition, Question Answering
Published 2016-11-17
URL http://arxiv.org/abs/1611.05896v1
PDF http://arxiv.org/pdf/1611.05896v1.pdf
PWC https://paperswithcode.com/paper/answering-image-riddles-using-vision-and
Repo
Framework
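
Probabilistic Soft Logic grounds weighted first-order rules into hinge penalties over [0, 1] truth values via the Łukasiewicz relaxation. The snippet below shows that mechanic on a single hypothetical riddle-style rule; the predicates, truth values, and weight are made up for illustration, not taken from the paper.

```python
# Łukasiewicz relaxations used by Probabilistic Soft Logic (PSL):
# truth values live in [0, 1]; a rule incurs a hinge penalty when violated.
def l_and(a, b):  return max(0.0, a + b - 1.0)
def l_or(a, b):   return min(1.0, a + b)
def l_not(a):     return 1.0 - a

def rule_penalty(body, head, weight=1.0):
    # Distance to satisfaction of 'body -> head' is max(0, body - head);
    # inference minimizes the weighted sum of these penalties.
    return weight * max(0.0, body - head)

# Hypothetical grounding: if an image depicts 'nest' and 'nest' is
# semantically related to 'bird', then the answer 'bird' should score high.
depicts_nest, related, answer_bird = 0.9, 0.8, 0.4
print(rule_penalty(l_and(depicts_nest, related), answer_bird, weight=2.0))  # 0.6
```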

Network structure, metadata and the prediction of missing nodes and annotations

Title Network structure, metadata and the prediction of missing nodes and annotations
Authors Darko Hric, Tiago P. Peixoto, Santo Fortunato
Abstract The empirical validation of community detection methods is often based on available annotations on the nodes that serve as putative indicators of the large-scale network structure. Most often, the suitability of the annotations as topological descriptors is itself not assessed, and without this it is not possible to ultimately distinguish between actual shortcomings of the community detection algorithms on the one hand, and the incompleteness, inaccuracy or structured nature of the data annotations themselves on the other. In this work we present a principled method to assess both aspects simultaneously. We construct a joint generative model for the data and metadata, and a nonparametric Bayesian framework to infer its parameters from annotated datasets. We assess the quality of the metadata not according to its direct alignment with the network communities, but rather according to its capacity to predict the placement of edges in the network. We also show how this feature can be used to predict the connections to missing nodes when only the metadata is available, as well as missing metadata. By investigating a wide range of datasets, we show that while there are seldom exact agreements between metadata tokens and the inferred data groups, the metadata is often informative of the network structure nevertheless, and can improve the prediction of missing nodes. This shows that the method uncovers meaningful patterns in both the data and metadata, without requiring or expecting a perfect agreement between the two.
Tasks Community Detection
Published 2016-04-01
URL http://arxiv.org/abs/1604.00255v2
PDF http://arxiv.org/pdf/1604.00255v2.pdf
PWC https://paperswithcode.com/paper/network-structure-metadata-and-the-prediction
Repo
Framework

Measuring and modeling the perception of natural and unconstrained gaze in humans and machines

Title Measuring and modeling the perception of natural and unconstrained gaze in humans and machines
Authors Daniel Harari, Tao Gao, Nancy Kanwisher, Joshua Tenenbaum, Shimon Ullman
Abstract Humans are remarkably adept at interpreting the gaze direction of other individuals in their surroundings. This skill is at the core of the ability to engage in joint visual attention, which is essential for establishing social interactions. How accurate are humans in determining the gaze direction of others in lifelike scenes, when they can move their heads and eyes freely, and what are the sources of information for the underlying perceptual processes? These questions pose a challenge from both empirical and computational perspectives, due to the complexity of the visual input in real-life situations. Here we empirically measure human accuracy in perceiving the gaze direction of others in lifelike scenes, and computationally study the sources of information and representations underlying this cognitive capacity. We show that humans perform better in face-to-face conditions compared with recorded conditions, and that this advantage is not due to the availability of input dynamics. We further show that humans still perform well when only the eye region is visible, rather than the whole face. We develop a computational model that replicates the pattern of human performance, including the finding that the eye region contains, on its own, the information required for estimating both head orientation and direction of gaze. Consistent with neurophysiological findings on task-specific face regions in the brain, the learned computational representations reproduce perceptual effects such as the Wollaston illusion when trained to estimate direction of gaze, but not when trained to recognize objects or faces.
Tasks
Published 2016-11-29
URL http://arxiv.org/abs/1611.09819v1
PDF http://arxiv.org/pdf/1611.09819v1.pdf
PWC https://paperswithcode.com/paper/measuring-and-modeling-the-perception-of
Repo
Framework

Learning Sparse Additive Models with Interactions in High Dimensions

Title Learning Sparse Additive Models with Interactions in High Dimensions
Authors Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner, Andreas Krause
Abstract A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM) if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}\phi_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming the $\phi_l$’s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM, allowing for second-order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, the function $f$ is assumed to be of the form: $$f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}\phi_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}\phi_{(l,l^{\prime})} (x_{l},x_{l^{\prime}}).$$ Assuming $\phi_{p},\phi_{(l,l^{\prime})}$, $\mathcal{S}_1$ and $\mathcal{S}_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $\mathcal{S}_1,\mathcal{S}_2$. Consequently, this also enables us to estimate the underlying $\phi_p, \phi_{(l,l^{\prime})}$. We derive sample complexity bounds for our scheme and also extend our analysis to the situation where the queries are corrupted with noise, either stochastic, or arbitrary but bounded. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
Tasks
Published 2016-04-18
URL http://arxiv.org/abs/1604.05307v1
PDF http://arxiv.org/pdf/1604.05307v1.pdf
PWC https://paperswithcode.com/paper/learning-sparse-additive-models-with
Repo
Framework
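
The model class itself is easy to state in code: a handful of univariate components plus a few pairwise interactions. The components below are arbitrary stand-ins of ours; the paper’s actual contribution, the query-based recovery of $\mathcal{S}_1$ and $\mathcal{S}_2$, is not reproduced here.

```python
import numpy as np

# A generalized SPAM as in the abstract: univariate components on S1 plus
# second-order interaction terms on S2 (toy components of our choosing).
phi = {0: np.sin, 3: lambda t: t ** 2}        # S1 = {0, 3}
phi2 = {(1, 4): lambda u, v: u * v}           # S2 = {(1, 4)}

def f(x):
    return (sum(g(x[p]) for p, g in phi.items())
            + sum(g(x[l], x[m]) for (l, m), g in phi2.items()))

x = np.array([0.5, 1.0, -2.0, 0.3, 2.0])
print(f(x))   # sin(0.5) + 0.3**2 + 1.0 * 2.0 = 2.5694...
```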

Universal probability-free prediction

Title Universal probability-free prediction
Authors Vladimir Vovk, Dusko Pavlovic
Abstract We construct universal prediction systems in the spirit of Popper’s falsifiability and Kolmogorov complexity and randomness. These prediction systems do not depend on any statistical assumptions (but under the IID assumption they dominate, to within the usual accuracy, conformal prediction). Our constructions give rise to a theory of algorithmic complexity and randomness of time containing analogues of several notions and results of the classical theory of Kolmogorov complexity and randomness.
Tasks
Published 2016-03-14
URL http://arxiv.org/abs/1603.04283v2
PDF http://arxiv.org/pdf/1603.04283v2.pdf
PWC https://paperswithcode.com/paper/universal-probability-free-prediction
Repo
Framework

Reweighted Low-Rank Tensor Decomposition based on t-SVD and its Applications in Video Denoising

Title Reweighted Low-Rank Tensor Decomposition based on t-SVD and its Applications in Video Denoising
Authors M. Baburaj, Sudhish N. George
Abstract The t-SVD-based Tensor Robust Principal Component Analysis (TRPCA) decomposes a low-rank multi-linear signal corrupted by gross errors into low-multi-rank and sparse components by simultaneously minimizing the tensor nuclear norm and the $l_1$ norm. But if the multi-rank of the signal is considerably large and/or a large amount of noise is present, the performance of TRPCA deteriorates. To overcome this problem, this paper proposes a new efficient iterative reweighted tensor decomposition scheme based on t-SVD, which significantly improves the tensor multi-rank minimization in TRPCA. Further, the sparse component of the tensor is also recovered via a reweighted $l_1$ norm, which enhances the accuracy of the decomposition. The effectiveness of the proposed method is established by applying it to the video denoising problem, and the experimental results reveal that the proposed algorithm outperforms its counterparts.
Tasks Denoising, Video Denoising
Published 2016-11-18
URL http://arxiv.org/abs/1611.05963v4
PDF http://arxiv.org/pdf/1611.05963v4.pdf
PWC https://paperswithcode.com/paper/reweighted-low-rank-tensor-decomposition
Repo
Framework
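
The t-SVD underlying TRPCA reduces to matrix SVDs of the frontal slices in the Fourier domain along the third mode. Below is a minimal truncated t-SVD sketch in NumPy, without the reweighting or the sparse-error recovery proposed in the paper.

```python
import numpy as np

def tsvd(X, rank):
    # Truncated t-SVD: FFT along the third mode, per-slice matrix SVD with
    # truncation, then inverse FFT. A sketch of the decomposition TRPCA
    # uses, not the paper's reweighted algorithm.
    Xf = np.fft.fft(X, axis=2)
    Yf = np.empty_like(Xf)
    for k in range(X.shape[2]):                    # each frontal slice
        U, s, Vt = np.linalg.svd(Xf[:, :, k], full_matrices=False)
        s[rank:] = 0.0                             # truncate the multi-rank
        Yf[:, :, k] = (U * s) @ Vt
    return np.real(np.fft.ifft(Yf, axis=2))

# Random "video" tensor (height x width x frames)
rng = np.random.default_rng(3)
L = rng.standard_normal((20, 20, 8))
low = tsvd(L, rank=3)
print(np.linalg.norm(low - tsvd(low, rank=3)))     # ~0: projection is idempotent
```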

Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression

Title Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression
Authors Ning Xu, Jian Hong, Timothy C. G. Fisher
Abstract We study model evaluation and model selection from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. We believe that GA is one way to formally address concerns about the external validity of a model. The GA of a model estimated on a sample can be measured by its empirical out-of-sample errors, called the generalization errors (GE). We derive upper bounds for the GE, which depend on sample sizes, model complexity and the distribution of the loss function. The upper bounds can be used to evaluate the GA of a model, ex ante. We propose using generalization error minimization (GEM) as a framework for model selection. Using GEM, we are able to unify a large class of penalized regression estimators, including lasso, ridge and bridge, under the same set of assumptions. We establish finite-sample and asymptotic properties (including $\mathcal{L}_2$-consistency) of the GEM estimator for both the $n \geqslant p$ and the $n < p$ cases. We also derive the $\mathcal{L}_2$-distance between the penalized and corresponding unpenalized regression estimates. In practice, GEM can be implemented by validation or cross-validation. We show that the GE bounds can be used to select the optimal number of folds in $K$-fold cross-validation. We propose a variant of $R^2$, the $GR^2$, as a measure of GA, which considers both in-sample and out-of-sample goodness of fit. Simulations are used to demonstrate our key results.
Tasks Model Selection
Published 2016-10-18
URL http://arxiv.org/abs/1610.05448v1
PDF http://arxiv.org/pdf/1610.05448v1.pdf
PWC https://paperswithcode.com/paper/generalization-error-minimization-a-new
Repo
Framework
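
As the abstract notes, GEM can be implemented by cross-validation in practice. A minimal sketch, assuming plain ridge regression and squared loss: estimate the generalization error of each penalty by $K$-fold cross-validation and keep the minimizer. The data and penalty grid are toy choices of ours.

```python
import numpy as np

def kfold_generalization_error(X, y, lam, K=5):
    # Estimate the out-of-sample (generalization) error of ridge
    # regression by K-fold cross-validation.
    n, p = X.shape
    idx = np.arange(n)
    errs = []
    for fold in np.array_split(idx, K):
        tr = np.setdiff1d(idx, fold)
        A = X[tr].T @ X[tr] + lam * np.eye(p)      # ridge normal equations
        beta = np.linalg.solve(A, X[tr].T @ y[tr])
        errs.append(np.mean((y[fold] - X[fold] @ beta) ** 2))
    return np.mean(errs)

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 10))
y = X @ rng.standard_normal(10) + 0.5 * rng.standard_normal(200)
# GEM in practice: pick the penalty with the smallest estimated GE
lams = [0.01, 0.1, 1.0, 10.0]
print(min(lams, key=lambda l: kfold_generalization_error(X, y, l)))
```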

Learning Binary Codes and Binary Weights for Efficient Classification

Title Learning Binary Codes and Binary Weights for Efficient Classification
Authors Fumin Shen, Yadong Mu, Wei Liu, Yang Yang, Heng Tao Shen
Abstract This paper proposes a generic formulation that significantly expedites the training and deployment of image classification models, particularly under scenarios with many image categories and high feature dimensions. As a defining property, our method represents both the images and the learned classifiers using binary hash codes, which are simultaneously learned from the training data. Classifying an image thereby reduces to computing the Hamming distance between the binary codes of the image and the classifiers, and selecting the class with minimal Hamming distance. Conventionally, compact hash codes are primarily used for accelerating image search. Our work is the first of its kind to represent classifiers using binary codes. Specifically, we formulate multi-class image classification as an optimization problem over binary variables. The optimization alternately proceeds over the binary classifiers and the image hash codes. Profiting from the special properties of binary codes, we show that the sub-problems can be efficiently solved through either a binary quadratic program (BQP) or a linear program. In particular, for attacking the BQP problem, we propose a novel bit-flipping procedure that enjoys high efficacy and a local optimality guarantee. Our formulation supports a large family of empirical loss functions and is here instantiated with exponential and hinge losses. Comprehensive evaluations are conducted on several representative image benchmarks. The experiments consistently show reduced complexity of model training and deployment without sacrificing accuracy.
Tasks Image Classification, Image Retrieval
Published 2016-03-14
URL http://arxiv.org/abs/1603.04116v1
PDF http://arxiv.org/pdf/1603.04116v1.pdf
PWC https://paperswithcode.com/paper/learning-binary-codes-and-binary-weights-for
Repo
Framework
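
The inference rule — classification by minimal Hamming distance between binary codes — is simple to demonstrate. In the sketch below the codes are random stand-ins rather than codes learned by the paper’s alternating optimization.

```python
import numpy as np

def hamming_classify(image_codes, class_codes):
    # Classify by minimal Hamming distance between an image's binary code
    # and each class's binary code: XOR the packed uint8 codes, then
    # popcount via np.unpackbits.
    dists = np.unpackbits(image_codes[:, None, :] ^ class_codes[None, :, :],
                          axis=2).sum(axis=2)
    return dists.argmin(axis=1)

rng = np.random.default_rng(5)
class_codes = rng.integers(0, 256, (10, 8), dtype=np.uint8)  # 10 classes, 64 bits
images = class_codes[[3, 7, 0]].copy()
images[0, 0] ^= 0b1010                # flip a couple of bits: still nearest
print(hamming_classify(images, class_codes))                 # [3 7 0]
```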

Multi-agent evolutionary systems for the generation of complex virtual worlds

Title Multi-agent evolutionary systems for the generation of complex virtual worlds
Authors Jan Kruse, Andy M. Connor
Abstract Modern films, games and virtual reality applications depend on convincing computer graphics. Highly complex models are a requirement for the successful delivery of many scenes and environments. While workflows such as rendering, compositing and animation have been streamlined to accommodate increasing demands, creating complex models is still a laborious task. This paper introduces the computational benefits of an Interactive Genetic Algorithm (IGA) to computer graphics modelling while compensating for the effects of user fatigue, a common issue with Interactive Evolutionary Computation. An intelligent agent is used in conjunction with the IGA, offering the potential to reduce the effects of user fatigue by learning from the choices made by the human designer and directing the search accordingly. This workflow accelerates the layout and distribution of basic elements to form complex models. It captures the designer’s intent through interaction, and encourages playful discovery.
Tasks
Published 2016-04-20
URL http://arxiv.org/abs/1604.05792v1
PDF http://arxiv.org/pdf/1604.05792v1.pdf
PWC https://paperswithcode.com/paper/multi-agent-evolutionary-systems-for-the
Repo
Framework
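
A toy sketch of the IGA-plus-agent loop follows, with the human designer replaced by a simulated preference so the example runs unattended. The agent here is a simple nearest-chosen-design surrogate — our stand-in, not the paper’s agent.

```python
import numpy as np

rng = np.random.default_rng(6)
target = rng.random(8)                 # stands in for the designer's taste

def user_picks(pop, k=3):
    # Simulated interactive step: the "designer" keeps the k candidate
    # layouts closest to their (hidden) preference.
    return pop[np.argsort(((pop - target) ** 2).sum(1))[:k]]

def agent_score(pop, chosen):
    # Intelligent-agent surrogate: score candidates by similarity to
    # everything the designer has chosen so far, so the human is asked
    # less often (the fatigue-reduction idea).
    d = ((pop[:, None, :] - chosen[None, :, :]) ** 2).sum(2).min(1)
    return -d

pop = rng.random((30, 8))              # 8 layout parameters per candidate
chosen = user_picks(pop)
for gen in range(40):
    parents = pop[np.argsort(agent_score(pop, chosen))[-10:]]
    kids = parents[rng.integers(0, 10, 30)] + 0.05 * rng.standard_normal((30, 8))
    pop = np.clip(kids, 0, 1)
    if gen % 10 == 9:                  # occasionally ask the human again
        chosen = np.vstack([chosen, user_picks(pop)])
print(((pop - target) ** 2).sum(1).min())  # population drifts toward the taste
```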

Biobjective Performance Assessment with the COCO Platform

Title Biobjective Performance Assessment with the COCO Platform
Authors Dimo Brockhoff, Tea Tušar, Dejan Tušar, Tobias Wagner, Nikolaus Hansen, Anne Auger
Abstract This document details the rationale behind assessing the performance of numerical black-box optimizers on multi-objective problems within the COCO platform, in particular on the biobjective test suite bbob-biobj. The evaluation is based on the hypervolume of all non-dominated solutions in the archive of candidate solutions, and measures the runtime until the hypervolume value exceeds prescribed target values.
Tasks
Published 2016-05-05
URL http://arxiv.org/abs/1605.01746v1
PDF http://arxiv.org/pdf/1605.01746v1.pdf
PWC https://paperswithcode.com/paper/biobjective-performance-assessment-with-the
Repo
Framework
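
For the biobjective (minimization) case, the hypervolume indicator is just the area dominated by the archive relative to a reference point. A minimal 2-D sketch, not COCO’s actual implementation:

```python
import numpy as np

def hypervolume_2d(points, ref):
    # Area dominated by a set of biobjective minimization solutions,
    # relative to reference point `ref`: sweep by the first objective and
    # accumulate the rectangle contributed by each non-dominated point.
    pts = np.asarray(sorted(points))
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                       # non-dominated step
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

front = [(1.0, 4.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(5.0, 5.0)))   # 4 + 6 + 2 = 12.0
```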