Paper Group ANR 307
![Paper Group ANR 307](/2016/images/pwc/paper-arxiv_hu144ec288a26b3e360d673e256787de3e_28623_900x500_fit_q75_box.jpg)
Network Maximal Correlation. Improved Image Boundaries for Better Video Segmentation. Latent Constrained Correlation Filters for Object Localization. Refining Geometry from Depth Sensors using IR Shading Images. A Unified Tensor-based Active Appearance Face Model. Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic. …
Network Maximal Correlation
Title | Network Maximal Correlation |
Authors | Soheil Feizi, Ali Makhdoumi, Ken Duffy, Muriel Medard, Manolis Kellis |
Abstract | We introduce Network Maximal Correlation (NMC) as a multivariate measure of nonlinear association among random variables. NMC is defined via an optimization that infers transformations of variables by maximizing aggregate inner products between transformed variables. For finite discrete and jointly Gaussian random variables, we characterize a solution of the NMC optimization using basis expansion of functions over appropriate basis functions. For finite discrete variables, we propose an algorithm based on alternating conditional expectation to determine NMC. Moreover, we propose a distributed algorithm to compute an approximation of NMC for large and dense graphs using graph partitioning. For finite discrete variables, we show that the probability of discrepancy greater than any given level between NMC and NMC computed using empirical distributions decays exponentially fast as the sample size grows. For jointly Gaussian variables, we show that under some conditions the NMC optimization is an instance of the Max-Cut problem. We then illustrate an application of NMC in inference of a graphical model for bijective functions of jointly Gaussian variables. Finally, we show NMC's utility in a data application of learning nonlinear dependencies among genes in a cancer dataset. |
Tasks | graph partitioning |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04789v2 |
PDF | http://arxiv.org/pdf/1606.04789v2.pdf |
PWC | https://paperswithcode.com/paper/network-maximal-correlation |
Repo | |
Framework | |
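The abstract's bivariate special case (Hirschfeld–Gebelein–Rényi maximal correlation) has a closed form for finite discrete variables: it equals the second-largest singular value of the normalized joint-distribution matrix. A minimal numpy sketch of that special case only; the paper's ACE and graph-partitioning algorithms are not reproduced here, and `maximal_correlation` is an illustrative name:

```python
import numpy as np

def maximal_correlation(pxy):
    """HGR maximal correlation of a discrete pair: the second-largest
    singular value of Q[i, j] = p(i, j) / sqrt(p(i) p(j))."""
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    Q = pxy / np.sqrt(np.outer(px, py))
    s = np.linalg.svd(Q, compute_uv=False)
    return s[1]  # s[0] == 1 corresponds to the constant functions

# Perfectly dependent pair (Y = X) vs. an independent pair
p_dep = np.array([[0.5, 0.0], [0.0, 0.5]])
p_ind = np.outer([0.5, 0.5], [0.3, 0.7])
```

For the dependent pair the value is 1, and for the independent pair it is 0, as expected of a measure of nonlinear association.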
Improved Image Boundaries for Better Video Segmentation
Title | Improved Image Boundaries for Better Video Segmentation |
Authors | Anna Khoreva, Rodrigo Benenson, Fabio Galasso, Matthias Hein, Bernt Schiele |
Abstract | Graph-based video segmentation methods rely on superpixels as a starting point. While most previous work has focused on the construction of the graph edges and weights as well as solving the graph partitioning problem, this paper focuses on better superpixels for video segmentation. We demonstrate by a comparative analysis that superpixels extracted from boundaries perform best, and show that boundary estimation can be significantly improved via image and time domain cues. With superpixels generated from our better boundaries we observe consistent improvement for two video segmentation methods in two different datasets. |
Tasks | graph partitioning, Video Semantic Segmentation |
Published | 2016-05-12 |
URL | http://arxiv.org/abs/1605.03718v2 |
PDF | http://arxiv.org/pdf/1605.03718v2.pdf |
PWC | https://paperswithcode.com/paper/improved-image-boundaries-for-better-video |
Repo | |
Framework | |
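The idea of turning a boundary map into superpixels can be illustrated with a toy stand-in: threshold the boundary probabilities and flood-fill the enclosed regions. The paper uses stronger boundary detectors and watershed-style machinery; the function below is a hypothetical simplification:

```python
def superpixels_from_boundaries(boundary, thresh=0.5):
    """Label connected regions delimited by a boundary-probability map:
    pixels with boundary < thresh are grouped by 4-connected flood fill.
    Boundary pixels keep the label -1."""
    h, w = len(boundary), len(boundary[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for i in range(h):
        for j in range(w):
            if boundary[i][j] >= thresh or labels[i][j] != -1:
                continue
            stack = [(i, j)]
            labels[i][j] = next_label
            while stack:
                y, x = stack.pop()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and boundary[ny][nx] < thresh):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
            next_label += 1
    return labels, next_label
```

A vertical boundary line splits an image into two superpixels, which is the behaviour the graph construction downstream relies on.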
Latent Constrained Correlation Filters for Object Localization
Title | Latent Constrained Correlation Filters for Object Localization |
Authors | Shangzhen Luan, Baochang Zhang, Jungong Han, Chen Chen, Ling Shao, Alessandro Perina, Linlin Shen |
Abstract | There is a neglected fact in traditional machine learning methods that data sampling can actually lead to solution sampling. We consider this observation to be important because having the solution sampling available makes variable distribution estimation, which is a problem in many learning-related applications, more tractable. In this paper, we implement this idea on correlation filters, which have attracted much attention in the past few years due to their high performance with a low computational cost. More specifically, we propose a new method, named latent constrained correlation filters (LCCF), by mapping the correlation filters to a given latent subspace, in which we establish a new learning framework that embeds distribution-related constraints into the original problem. We further introduce a subspace-based alternating direction method of multipliers (SADMM) to efficiently solve the optimization problem, which is proved to converge at the saddle point. Our approach is successfully applied to two different tasks, including eye localization and car detection. Extensive experiments demonstrate that LCCF outperforms the state-of-the-art methods when samples suffer from noise and occlusion. |
Tasks | Object Localization |
Published | 2016-06-07 |
URL | http://arxiv.org/abs/1606.02170v2 |
PDF | http://arxiv.org/pdf/1606.02170v2.pdf |
PWC | https://paperswithcode.com/paper/latent-constrained-correlation-filters-for |
Repo | |
Framework | |
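LCCF builds on the classical ridge correlation filter, which has a closed form in the Fourier domain: H* = (G ⊙ conj(X)) / (X ⊙ conj(X) + λ). A minimal numpy sketch of that baseline building block, not the authors' latent-subspace method:

```python
import numpy as np

def train_correlation_filter(x, g, lam=1e-4):
    """Closed-form ridge correlation filter in the Fourier domain:
    H* = (G . conj(X)) / (X . conj(X) + lam)."""
    X, G = np.fft.fft2(x), np.fft.fft2(g)
    return (G * np.conj(X)) / (X * np.conj(X) + lam)

def respond(H, z):
    """Correlation response of filter H on image z."""
    return np.real(np.fft.ifft2(np.fft.fft2(z) * H))

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
g = np.zeros((16, 16)); g[5, 7] = 1.0   # desired response: peak at (5, 7)
resp = respond(train_correlation_filter(x, g), x)
```

On the training image, the response peaks exactly at the desired target location; localization then amounts to taking the argmax of the response map.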
Refining Geometry from Depth Sensors using IR Shading Images
Title | Refining Geometry from Depth Sensors using IR Shading Images |
Authors | Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon |
Abstract | We propose a method to refine geometry of 3D meshes from a consumer level depth camera, e.g. Kinect, by exploiting shading cues captured from an infrared (IR) camera. A major benefit to using an IR camera instead of an RGB camera is that the IR images captured are narrow band images that filter out most undesired ambient light, which makes our system robust against natural indoor illumination. Moreover, for many natural objects with colorful textures in the visible spectrum, the subjects appear to have a uniform albedo in the IR spectrum. Based on our analyses on the IR projector light of the Kinect, we define a near light source IR shading model that describes the captured intensity as a function of surface normals, albedo, lighting direction, and distance between light source and surface points. To resolve the ambiguity in our model between the normals and distances, we utilize an initial 3D mesh from the Kinect fusion and multi-view information to reliably estimate surface details that were not captured and reconstructed by the Kinect fusion. Our approach directly operates on the mesh model for geometry refinement. We ran experiments on our algorithm for geometries captured by both the Kinect I and Kinect II, as the depth acquisition in Kinect I is based on a structured-light technique and that of the Kinect II is based on a time-of-flight (ToF) technology. The effectiveness of our approach is demonstrated through several challenging real-world examples. We have also performed a user study to evaluate the quality of the mesh models before and after our refinements. |
Tasks | |
Published | 2016-08-18 |
URL | http://arxiv.org/abs/1608.05204v1 |
PDF | http://arxiv.org/pdf/1608.05204v1.pdf |
PWC | https://paperswithcode.com/paper/refining-geometry-from-depth-sensors-using-ir |
Repo | |
Framework | |
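The near-light IR shading model described in the abstract can be summarized as a Lambertian term attenuated by the squared distance to the light source. A simplified sketch under that assumption (the paper's full model also resolves the normal/distance ambiguity via multi-view cues; `ir_intensity` is an illustrative name):

```python
import numpy as np

def ir_intensity(albedo, normal, point, light_pos):
    """Near-point-light Lambertian shading: intensity as a function of
    albedo, surface normal, lighting direction, and distance between
    the light source and the surface point."""
    v = light_pos - point
    d = np.linalg.norm(v)       # distance to the light
    l = v / d                   # unit lighting direction
    return albedo * max(0.0, float(np.dot(normal, l))) / d ** 2
```

For a frontal surface two units from the light with unit albedo, the model predicts intensity 1/4, showing the inverse-square falloff the refinement exploits.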
A Unified Tensor-based Active Appearance Face Model
Title | A Unified Tensor-based Active Appearance Face Model |
Authors | Zhen-Hua Feng, Josef Kittler, William Christmas, Xiao-Jun Wu |
Abstract | Appearance variations result in many difficulties in face image analysis. To deal with this challenge, we present a Unified Tensor-based Active Appearance Model (UT-AAM) for jointly modelling the geometry and texture information of 2D faces. For each type of face information, namely shape and texture, we construct a unified tensor model capturing all relevant appearance variations. This contrasts with the variation-specific models of the classical tensor AAM. To achieve the unification across pose variations, a strategy for dealing with self-occluded faces is proposed to obtain consistent shape and texture representations of pose-varied faces. In addition, our UT-AAM is capable of constructing the model from an incomplete training dataset, using tensor completion methods. Last, we use an effective cascaded-regression-based method for UT-AAM fitting. With these advancements, the utility of UT-AAM in practice is considerably enhanced. As an example, we demonstrate the improvements in training facial landmark detectors through the use of UT-AAM to synthesise a large number of virtual samples. Experimental results obtained using the Multi-PIE and 300-W face datasets demonstrate the merits of the proposed approach. |
Tasks | |
Published | 2016-12-30 |
URL | http://arxiv.org/abs/1612.09548v2 |
PDF | http://arxiv.org/pdf/1612.09548v2.pdf |
PWC | https://paperswithcode.com/paper/a-unified-tensor-based-active-appearance-face |
Repo | |
Framework | |
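Tensor models such as UT-AAM rest on multilinear algebra primitives, chiefly the mode-n unfolding that turns a variation tensor into a matrix whose columns are mode-n fibers. A generic numpy sketch of that utility, not the authors' code:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move the chosen mode to the front and flatten
    the rest, so mode-n fibers become columns. This is the basic step
    behind Tucker/HOSVD-style multilinear decompositions."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

T = np.arange(24).reshape(2, 3, 4)
```

Mode-0 unfolding of this tensor is just its row-major reshape; other modes permute the fibers first.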
Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic
Title | Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic |
Authors | Somak Aditya, Yezhou Yang, Chitta Baral, Yiannis Aloimonos |
Abstract | In this work, we explore a genre of puzzles (“image riddles”) which involves a set of images and a question. Answering these puzzles requires capabilities in both visual detection (including object and activity recognition) and knowledge-based or commonsense reasoning. We compile a dataset of over 3k riddles where each riddle consists of 4 images and a groundtruth answer. The annotations are validated using crowd-sourced evaluation. We also define an automatic evaluation metric to track future progress. Our task bears similarity to commonly known IQ tasks such as analogy solving and sequence filling that are often used to test intelligence. We develop a Probabilistic Reasoning-based approach that utilizes probabilistic commonsense knowledge to answer these riddles with reasonable accuracy. We demonstrate the results of our approach using both automatic and human evaluations. Our approach achieves promising results for these riddles and provides a strong baseline for future attempts. We make the entire dataset and related materials publicly available to the community on the ImageRiddle website (http://bit.ly/22f9Ala). |
Tasks | Activity Recognition, Question Answering |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05896v1 |
PDF | http://arxiv.org/pdf/1611.05896v1.pdf |
PWC | https://paperswithcode.com/paper/answering-image-riddles-using-vision-and |
Repo | |
Framework | |
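Probabilistic Soft Logic grounds rules into hinge losses over soft truth values in [0, 1] using Łukasiewicz logic. A minimal sketch of the two core operations only; PSL itself performs joint MAP inference over many weighted grounded rules:

```python
def luk_and(a, b):
    """Lukasiewicz t-norm: PSL's soft conjunction of truth values."""
    return max(0.0, a + b - 1.0)

def distance_to_satisfaction(body, head):
    """PSL penalizes a grounded rule body -> head by max(0, body - head):
    zero when the head is at least as true as the body."""
    return max(0.0, body - head)
```

For example, a rule whose grounded body has truth 0.9 but whose head only reaches 0.6 incurs a penalty of 0.3, which inference then tries to drive down.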
Network structure, metadata and the prediction of missing nodes and annotations
Title | Network structure, metadata and the prediction of missing nodes and annotations |
Authors | Darko Hric, Tiago P. Peixoto, Santo Fortunato |
Abstract | The empirical validation of community detection methods is often based on available annotations on the nodes that serve as putative indicators of the large-scale network structure. Most often, the suitability of the annotations as topological descriptors itself is not assessed, and without this it is not possible to ultimately distinguish between actual shortcomings of the community detection algorithms on one hand, and the incompleteness, inaccuracy or structured nature of the data annotations themselves on the other. In this work we present a principled method to assess both aspects simultaneously. We construct a joint generative model for the data and metadata, and a nonparametric Bayesian framework to infer its parameters from annotated datasets. We assess the quality of the metadata not according to its direct alignment with the network communities, but rather by its capacity to predict the placement of edges in the network. We also show how this feature can be used to predict the connections to missing nodes when only the metadata is available, as well as missing metadata. By investigating a wide range of datasets, we show that while there are seldom exact agreements between metadata tokens and the inferred data groups, the metadata is often informative of the network structure nevertheless, and can improve the prediction of missing nodes. This shows that the method uncovers meaningful patterns in both the data and metadata, without requiring or expecting a perfect agreement between the two. |
Tasks | Community Detection |
Published | 2016-04-01 |
URL | http://arxiv.org/abs/1604.00255v2 |
PDF | http://arxiv.org/pdf/1604.00255v2.pdf |
PWC | https://paperswithcode.com/paper/network-structure-metadata-and-the-prediction |
Repo | |
Framework | |
Measuring and modeling the perception of natural and unconstrained gaze in humans and machines
Title | Measuring and modeling the perception of natural and unconstrained gaze in humans and machines |
Authors | Daniel Harari, Tao Gao, Nancy Kanwisher, Joshua Tenenbaum, Shimon Ullman |
Abstract | Humans are remarkably adept at interpreting the gaze direction of other individuals in their surroundings. This skill is at the core of the ability to engage in joint visual attention, which is essential for establishing social interactions. How accurate are humans in determining the gaze direction of others in lifelike scenes, when they can move their heads and eyes freely, and what are the sources of information for the underlying perceptual processes? These questions pose a challenge from both empirical and computational perspectives, due to the complexity of the visual input in real-life situations. Here we measure empirically human accuracy in perceiving the gaze direction of others in lifelike scenes, and study computationally the sources of information and representations underlying this cognitive capacity. We show that humans perform better in face-to-face conditions compared with recorded conditions, and that this advantage is not due to the availability of input dynamics. We further show that humans still perform well when only the eyes-region is visible, rather than the whole face. We develop a computational model, which replicates the pattern of human performance, including the finding that the eyes-region contains, on its own, the required information for estimating both head orientation and direction of gaze. Consistent with neurophysiological findings on task-specific face regions in the brain, the learned computational representations reproduce perceptual effects such as the Wollaston illusion, when trained to estimate direction of gaze, but not when trained to recognize objects or faces. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09819v1 |
PDF | http://arxiv.org/pdf/1611.09819v1.pdf |
PWC | https://paperswithcode.com/paper/measuring-and-modeling-the-perception-of |
Repo | |
Framework | |
Learning Sparse Additive Models with Interactions in High Dimensions
Title | Learning Sparse Additive Models with Interactions in High Dimensions |
Authors | Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner, Andreas Krause |
Abstract | A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}\phi_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $\phi_l$'s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM, allowing for second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, the function $f$ is assumed to be of the form: $$f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}\phi_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}\phi_{(l,l^{\prime})} (x_{l},x_{l^{\prime}}).$$ Assuming $\phi_{p},\phi_{(l,l^{\prime})}$, $\mathcal{S}_1$ and $\mathcal{S}_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $\mathcal{S}_1,\mathcal{S}_2$. Consequently, this also enables us to estimate the underlying $\phi_p, \phi_{(l,l^{\prime})}$. We derive sample complexity bounds for our scheme and also extend our analysis to include the situation where the queries are corrupted with noise – either stochastic, or arbitrary but bounded. Lastly, we provide simulation results on synthetic data, that validate our theoretical findings. |
Tasks | |
Published | 2016-04-18 |
URL | http://arxiv.org/abs/1604.05307v1 |
PDF | http://arxiv.org/pdf/1604.05307v1.pdf |
PWC | https://paperswithcode.com/paper/learning-sparse-additive-models-with |
Repo | |
Framework | |
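The exact-recovery goal can be illustrated with a brute-force variant: for this additive-plus-pairwise form, the mixed second difference of $f$ vanishes for non-interacting pairs, so querying it identifies $\mathcal{S}_2$. A naive sketch under that assumption; the paper's randomized algorithm is far more query-efficient, and `recover_interactions` is a hypothetical helper:

```python
import itertools

def recover_interactions(f, d, h=1.0, tol=1e-9, base=None):
    """Identify interacting pairs via the mixed second difference
    f(x+he_l+he_m) - f(x+he_l) - f(x+he_m) + f(x), which is zero for
    pairs (l, m) that enter f only through additive terms (at a
    generic base point x)."""
    x = [0.0] * d if base is None else list(base)
    pairs = set()
    for l, m in itertools.combinations(range(d), 2):
        def q(dl, dm):
            y = list(x)
            y[l] += dl
            y[m] += dm
            return f(y)
        mixed = q(h, h) - q(h, 0) - q(0, h) + q(0, 0)
        if abs(mixed) > tol:
            pairs.add((l, m))
    return pairs
```

On $f(\mathbf{x}) = x_0^2 + x_1 x_2$ this recovers exactly the interaction pair $(1, 2)$.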
Universal probability-free prediction
Title | Universal probability-free prediction |
Authors | Vladimir Vovk, Dusko Pavlovic |
Abstract | We construct universal prediction systems in the spirit of Popper’s falsifiability and Kolmogorov complexity and randomness. These prediction systems do not depend on any statistical assumptions (but under the IID assumption they dominate, to within the usual accuracy, conformal prediction). Our constructions give rise to a theory of algorithmic complexity and randomness of time containing analogues of several notions and results of the classical theory of Kolmogorov complexity and randomness. |
Tasks | |
Published | 2016-03-14 |
URL | http://arxiv.org/abs/1603.04283v2 |
PDF | http://arxiv.org/pdf/1603.04283v2.pdf |
PWC | https://paperswithcode.com/paper/universal-probability-free-prediction |
Repo | |
Framework | |
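For context, the conformal-prediction baseline mentioned in the abstract admits a very short sketch: under exchangeability, a split-conformal interval's half-width is an order statistic of the calibration residuals. Illustrative only, not the authors' probability-free construction:

```python
import math

def conformal_halfwidth(abs_residuals, alpha=0.1):
    """Split-conformal interval half-width: the k-th smallest calibration
    residual with k = ceil((n + 1) * (1 - alpha)), which gives
    finite-sample 1 - alpha coverage under the IID/exchangeability
    assumption."""
    r = sorted(abs_residuals)
    n = len(r)
    k = math.ceil((n + 1) * (1 - alpha))
    return r[min(k, n) - 1]
```

With ten calibration residuals 1..10, a 50% interval uses the 6th smallest residual and a 90% interval uses the largest.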
Reweighted Low-Rank Tensor Decomposition based on t-SVD and its Applications in Video Denoising
Title | Reweighted Low-Rank Tensor Decomposition based on t-SVD and its Applications in Video Denoising |
Authors | M. Baburaj, Sudhish N. George |
Abstract | The t-SVD based Tensor Robust Principal Component Analysis (TRPCA) decomposes a low-rank multi-linear signal corrupted by gross errors into low multi-rank and sparse components by simultaneously minimizing the tensor nuclear norm and the $\ell_1$ norm. But if the multi-rank of the signal is considerably large and/or a large amount of noise is present, the performance of TRPCA deteriorates. To overcome this problem, this paper proposes a new efficient iterative reweighted tensor decomposition scheme based on t-SVD which significantly improves tensor multi-rank in TRPCA. Further, the sparse component of the tensor is also recovered by a reweighted $\ell_1$ norm which enhances the accuracy of decomposition. The effectiveness of the proposed method is established by applying it to the video denoising problem, and the experimental results reveal that the proposed algorithm outperforms its counterparts. |
Tasks | Denoising, Video Denoising |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.05963v4 |
PDF | http://arxiv.org/pdf/1611.05963v4.pdf |
PWC | https://paperswithcode.com/paper/reweighted-low-rank-tensor-decomposition |
Repo | |
Framework | |
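The t-SVD machinery underlying TRPCA reduces to an FFT along the third mode plus matrix SVDs of the frontal slices. The tensor nuclear norm being minimized can be sketched as follows (conventions, e.g. the 1/n3 scaling, vary across papers):

```python
import numpy as np

def tensor_nuclear_norm(T):
    """t-SVD-based tensor nuclear norm: FFT along the third mode, sum the
    matrix nuclear norms of the frontal slices in the Fourier domain,
    and divide by n3 (one common TRPCA convention)."""
    Tf = np.fft.fft(T, axis=2)
    n3 = T.shape[2]
    total = 0.0
    for k in range(n3):
        total += np.linalg.svd(Tf[:, :, k], compute_uv=False).sum()
    return total / n3

# A tensor with identical frontal slices has TNN equal to the slice's
# matrix nuclear norm (here ||I_2||_* = 2).
A = np.eye(2)
T = np.stack([A, A, A], axis=2)
```

The reweighted scheme in the paper replaces this uniform sum of singular values with iteratively updated weights.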
Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression
Title | Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression |
Authors | Ning Xu, Jian Hong, Timothy C. G. Fisher |
Abstract | We study model evaluation and model selection from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. We believe that GA is one way formally to address concerns about the external validity of a model. The GA of a model estimated on a sample can be measured by its empirical out-of-sample errors, called the generalization errors (GE). We derive upper bounds for the GE, which depend on sample sizes, model complexity and the distribution of the loss function. The upper bounds can be used to evaluate the GA of a model, ex ante. We propose using generalization error minimization (GEM) as a framework for model selection. Using GEM, we are able to unify a broad class of penalized regression estimators, including lasso, ridge and bridge, under the same set of assumptions. We establish finite-sample and asymptotic properties (including $\mathcal{L}_2$-consistency) of the GEM estimator for both the $n \geqslant p$ and the $n < p$ cases. We also derive the $\mathcal{L}_2$-distance between the penalized and corresponding unpenalized regression estimates. In practice, GEM can be implemented by validation or cross-validation. We show that the GE bounds can be used for selecting the optimal number of folds in $K$-fold cross-validation. We propose a variant of $R^2$, the $GR^2$, as a measure of GA, which considers both in-sample and out-of-sample goodness of fit. Simulations are used to demonstrate our key results. |
Tasks | Model Selection |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05448v1 |
PDF | http://arxiv.org/pdf/1610.05448v1.pdf |
PWC | https://paperswithcode.com/paper/generalization-error-minimization-a-new |
Repo | |
Framework | |
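In practice, the empirical generalization error that GEM controls can be estimated by K-fold cross-validation, e.g. for ridge regression. A minimal sketch (function name and setup are illustrative, not from the paper):

```python
import numpy as np

def kfold_cv_error(X, y, lam, K=5, seed=0):
    """Estimate the generalization error of ridge regression by K-fold
    cross-validation: average squared out-of-fold prediction error."""
    n, p = X.shape
    idx = np.random.default_rng(seed).permutation(n)
    err = 0.0
    for fold in np.array_split(idx, K):
        mask = np.ones(n, dtype=bool)
        mask[fold] = False                      # held-out fold
        Xtr, ytr = X[mask], y[mask]
        beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(p), Xtr.T @ ytr)
        err += ((y[fold] - X[fold] @ beta) ** 2).sum()
    return err / n

# Noiseless linear data: a tiny penalty generalizes perfectly,
# a huge penalty over-shrinks and raises the out-of-sample error.
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 3))
y = X @ np.array([1.0, 2.0, 3.0])
```

Selecting the penalty by minimizing this estimate is the cross-validation implementation of GEM that the abstract refers to.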
Learning Binary Codes and Binary Weights for Efficient Classification
Title | Learning Binary Codes and Binary Weights for Efficient Classification |
Authors | Fumin Shen, Yadong Mu, Wei Liu, Yang Yang, Heng Tao Shen |
Abstract | This paper proposes a generic formulation that significantly expedites the training and deployment of image classification models, particularly under the scenarios of many image categories and high feature dimensions. As a defining property, our method represents both the images and learned classifiers using binary hash codes, which are simultaneously learned from the training data. Classifying an image thereby reduces to computing the Hamming distance between the binary codes of the image and classifiers and selecting the class with minimal Hamming distance. Conventionally, compact hash codes are primarily used for accelerating image search. Our work is the first of its kind to represent classifiers using binary codes. Specifically, we formulate multi-class image classification as an optimization problem over binary variables. The optimization proceeds alternately over the binary classifiers and image hash codes. Profiting from the special property of binary codes, we show that the sub-problems can be efficiently solved through either a binary quadratic program (BQP) or linear program. In particular, for attacking the BQP problem, we propose a novel bit-flipping procedure which enjoys high efficacy and a local optimality guarantee. Our formulation supports a large family of empirical loss functions and is here instantiated by exponential / hinge losses. Comprehensive evaluations are conducted on several representative image benchmarks. The experiments consistently show reduced complexity of model training and deployment, without sacrificing accuracy. |
Tasks | Image Classification, Image Retrieval |
Published | 2016-03-14 |
URL | http://arxiv.org/abs/1603.04116v1 |
PDF | http://arxiv.org/pdf/1603.04116v1.pdf |
PWC | https://paperswithcode.com/paper/learning-binary-codes-and-binary-weights-for |
Repo | |
Framework | |
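The test-time rule described in the abstract, classify by minimal Hamming distance between binary codes, is a one-liner. A sketch with 0/1 code vectors (in practice the codes would be bit-packed and compared with hardware popcount):

```python
import numpy as np

def hamming_classify(img_code, class_codes):
    """Return the index of the class whose binary code has minimal
    Hamming distance to the image's binary code."""
    dists = (img_code[None, :] != class_codes).sum(axis=1)
    return int(np.argmin(dists))

# Three hypothetical 4-bit class codes
class_codes = np.array([[0, 0, 1, 1],
                        [1, 1, 0, 0],
                        [1, 0, 1, 0]])
```

An image code of `[1, 1, 0, 1]` is one bit away from class 1 and three bits from the others, so it is assigned to class 1.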
Multi-agent evolutionary systems for the generation of complex virtual worlds
Title | Multi-agent evolutionary systems for the generation of complex virtual worlds |
Authors | Jan Kruse, Andy M. Connor |
Abstract | Modern films, games and virtual reality applications are dependent on convincing computer graphics. Highly complex models are a requirement for the successful delivery of many scenes and environments. While workflows such as rendering, compositing and animation have been streamlined to accommodate increasing demands, building complex models is still a laborious task. This paper brings the computational benefits of an Interactive Genetic Algorithm (IGA) to computer graphics modelling while compensating for the effects of user fatigue, a common issue with Interactive Evolutionary Computation. An intelligent agent is used in conjunction with an IGA that offers the potential to reduce the effects of user fatigue by learning from the choices made by the human designer and directing the search accordingly. This workflow accelerates the layout and distribution of basic elements to form complex models. It captures the designer's intent through interaction, and encourages playful discovery. |
Tasks | |
Published | 2016-04-20 |
URL | http://arxiv.org/abs/1604.05792v1 |
PDF | http://arxiv.org/pdf/1604.05792v1.pdf |
PWC | https://paperswithcode.com/paper/multi-agent-evolutionary-systems-for-the |
Repo | |
Framework | |
Biobjective Performance Assessment with the COCO Platform
Title | Biobjective Performance Assessment with the COCO Platform |
Authors | Dimo Brockhoff, Tea Tušar, Dejan Tušar, Tobias Wagner, Nikolaus Hansen, Anne Auger |
Abstract | This document details the rationale behind assessing the performance of numerical black-box optimizers on multi-objective problems within the COCO platform, and in particular on the biobjective test suite bbob-biobj. The evaluation is based on the hypervolume of all non-dominated solutions in the archive of candidate solutions, and measures the runtime until the hypervolume value exceeds prescribed target values. |
Tasks | |
Published | 2016-05-05 |
URL | http://arxiv.org/abs/1605.01746v1 |
PDF | http://arxiv.org/pdf/1605.01746v1.pdf |
PWC | https://paperswithcode.com/paper/biobjective-performance-assessment-with-the |
Repo | |
Framework | |
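For the biobjective case, the hypervolume indicator used in this kind of assessment has a simple sweep-line form. A sketch for a 2D minimization front (illustrative only; COCO's implementation additionally handles archives, normalization, and target bookkeeping):

```python
def hypervolume_2d(points, ref):
    """Hypervolume of a 2D minimization front w.r.t. a reference point:
    the area dominated by the non-dominated points, computed by sweeping
    the points sorted on the first objective."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(points):
        if f2 < prev_f2:   # dominated points are skipped by the sweep
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv
```

For the front {(1, 2), (2, 1)} with reference point (3, 3), the dominated area is 3; adding a dominated point such as (2.5, 2.5) leaves the value unchanged.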