May 6, 2019

Paper Group ANR 318

SEMBED: Semantic Embedding of Egocentric Action Videos. Funnel-Structured Cascade for Multi-View Face Detection with Alignment-Awareness. Better Conditional Density Estimation for Neural Networks. Solving Combinatorial Optimization problems with Quantum inspired Evolutionary Algorithm Tuned using a Novel Heuristic Method. Kernel functions based on …

SEMBED: Semantic Embedding of Egocentric Action Videos


Title	SEMBED: Semantic Embedding of Egocentric Action Videos
Authors	Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen
Abstract	We present SEMBED, an approach for embedding an egocentric object interaction video in a semantic-visual graph to estimate the probability distribution over its potential semantic labels. When object interactions are annotated using an unbounded choice of verbs, we embrace the wealth and ambiguity of these labels by capturing the semantic relationships as well as the visual similarities over motion and appearance features. We show how SEMBED can interpret a challenging dataset of 1225 freely annotated egocentric videos, outperforming SVM classification by more than 5%.
Tasks
Published	2016-07-28
URL	http://arxiv.org/abs/1607.08414v2
PDF	http://arxiv.org/pdf/1607.08414v2.pdf
PWC	https://paperswithcode.com/paper/sembed-semantic-embedding-of-egocentric
Repo
Framework

Funnel-Structured Cascade for Multi-View Face Detection with Alignment-Awareness


Title	Funnel-Structured Cascade for Multi-View Face Detection with Alignment-Awareness
Authors	Shuzhe Wu, Meina Kan, Zhenliang He, Shiguang Shan, Xilin Chen
Abstract	Multi-view face detection in open environments is a challenging task due to diverse variations of face appearances and shapes. Most multi-view face detectors depend on multiple models and organize them in parallel, pyramid or tree structures, which trade off accuracy against time-cost. Aiming at a more favorable multi-view face detector, we propose a novel funnel-structured cascade (FuSt) detection framework. In a coarse-to-fine fashion, our FuSt consists of, from top to bottom, 1) multiple view-specific fast LAB cascades for extremely quick face proposal, 2) multiple coarse MLP cascades for further candidate window verification, and 3) a unified fine MLP cascade with shape-indexed features for accurate face detection. Compared with other structures, on the one hand, the proposed one uses multiple computationally efficient distributed classifiers to propose a small number of candidate windows but with a high recall of multi-view faces. On the other hand, by using a unified MLP cascade to examine proposals of all views in a centralized style, it provides a favorable solution for multi-view face detection with high accuracy and low time-cost. Besides, the FuSt detector is alignment-aware and performs a coarse facial part prediction which is beneficial for subsequent face alignment. Extensive experiments on two challenging datasets, FDDB and AFW, demonstrate the effectiveness of our FuSt detector in both accuracy and speed.
Tasks	Face Alignment, Face Detection
Published	2016-09-23
URL	http://arxiv.org/abs/1609.07304v1
PDF	http://arxiv.org/pdf/1609.07304v1.pdf
PWC	https://paperswithcode.com/paper/funnel-structured-cascade-for-multi-view-face
Repo
Framework

Better Conditional Density Estimation for Neural Networks


Title	Better Conditional Density Estimation for Neural Networks
Authors	Wesley Tansey, Karl Pichotta, James G. Scott
Abstract	The vast majority of the neural network literature focuses on predicting point values for a given set of response variables, conditioned on a feature vector. In many cases we need to model the full joint conditional distribution over the response variables rather than simply making point predictions. In this paper, we present two novel approaches to such conditional density estimation (CDE): Multiscale Nets (MSNs) and CDE Trend Filtering. Multiscale nets transform the CDE regression task into a hierarchical classification task by decomposing the density into a series of half-spaces and learning boolean probabilities of each split. CDE Trend Filtering applies a k-th order graph trend filtering penalty to the unnormalized logits of a multinomial classifier network, with each edge in the graph corresponding to a neighboring point on a discretized version of the density. We compare both methods against plain multinomial classifier networks and mixture density networks (MDNs) on a simulated dataset and three real-world datasets. The results suggest the two methods are complementary: MSNs work well in a high-data-per-feature regime and CDE-TF is well suited for few-samples-per-feature scenarios where overfitting is a primary concern.
Tasks	Density Estimation
Published	2016-06-07
URL	http://arxiv.org/abs/1606.02321v1
PDF	http://arxiv.org/pdf/1606.02321v1.pdf
PWC	https://paperswithcode.com/paper/better-conditional-density-estimation-for
Repo
Framework
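
The hierarchical idea behind Multiscale Nets in the abstract above can be illustrated with a small sketch. It assumes a balanced binary tree over 2**depth bins and a hypothetical dictionary `split_probs` holding the probability of taking the upper half at each node; in a real network these values would be sigmoid outputs conditioned on the feature vector. The names and the tree layout are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List


def bin_probabilities(split_probs: Dict[str, float], depth: int) -> List[float]:
    """Turn per-node 'upper half' probabilities into a distribution over bins.

    split_probs maps a node's path prefix ("" for the root, then "0", "1",
    "00", ...) to the probability of choosing the upper half at that node.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    probs = []
    for b in range(2 ** depth):
        path = format(b, f"0{depth}b")  # root-to-leaf path as a bit string
        p = 1.0
        for level, bit in enumerate(path):
            p_upper = split_probs[path[:level]]
            p *= p_upper if bit == "1" else 1.0 - p_upper
        probs.append(p)
    return probs


# Three binary decisions describe a density discretized into four bins.
splits = {"": 0.5, "0": 0.2, "1": 0.9}
dist = bin_probabilities(splits, depth=2)
```

Because each node splits its probability mass between its two children, the leaf probabilities sum to one without an explicit softmax over all bins.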

Solving Combinatorial Optimization problems with Quantum inspired Evolutionary Algorithm Tuned using a Novel Heuristic Method


Title	Solving Combinatorial Optimization problems with Quantum inspired Evolutionary Algorithm Tuned using a Novel Heuristic Method
Authors	Nija Mani, Gursaran, Ashish Mani
Abstract	Quantum inspired Evolutionary Algorithms were proposed more than a decade ago and have been employed for solving a wide range of difficult search and optimization problems. A number of changes have been proposed to improve performance of canonical QEA. However, canonical QEA is one of the few evolutionary algorithms that uses a search operator with a relatively large number of parameters. It is well known that the performance of evolutionary algorithms depends on the specific parameter values chosen for a given problem. The advantage of having a large number of parameters in an operator is that the search process can be made more powerful even with a single operator without requiring a combination of other operators for exploration and exploitation. However, the tuning of operators with a large number of parameters is complex and computationally expensive. This paper proposes a novel heuristic method for tuning parameters of canonical QEA. The tuned QEA outperforms canonical QEA on a class of discrete combinatorial optimization problems, which validates the design of the proposed parameter tuning framework. The proposed framework can be used for tuning other algorithms with both large and small numbers of tunable parameters.
Tasks	Combinatorial Optimization
Published	2016-12-23
URL	https://arxiv.org/abs/1612.08109v2
PDF	https://arxiv.org/pdf/1612.08109v2.pdf
PWC	https://paperswithcode.com/paper/solving-combinatorial-optimization-problems
Repo
Framework

Kernel functions based on triplet comparisons


Title	Kernel functions based on triplet comparisons
Authors	Matthäus Kleindessner, Ulrike von Luxburg
Abstract	Given only information in the form of similarity triplets “Object A is more similar to object B than to object C” about a data set, we propose two ways of defining a kernel function on the data set. While previous approaches construct a low-dimensional Euclidean embedding of the data set that reflects the given similarity triplets, we aim at defining kernel functions that correspond to high-dimensional embeddings. These kernel functions can subsequently be used to apply any kernel method to the data set.
Tasks
Published	2016-07-28
URL	http://arxiv.org/abs/1607.08456v2
PDF	http://arxiv.org/pdf/1607.08456v2.pdf
PWC	https://paperswithcode.com/paper/kernel-functions-based-on-triplet-comparisons
Repo
Framework

Face Alignment In-the-Wild: A Survey


Title	Face Alignment In-the-Wild: A Survey
Authors	Xin Jin, Xiaoyang Tan
Abstract	Over the last two decades, face alignment or localizing fiducial facial points has received increasing attention owing to its comprehensive applications in automatic face analysis. However, such a task has proven extremely challenging in unconstrained environments due to many confounding factors, such as pose, occlusions, expression and illumination. While numerous techniques have been developed to address these challenges, this problem is still far from being solved. In this survey, we present an up-to-date critical review of the existing literature on face alignment, focusing on those methods addressing overall difficulties and challenges of this topic under uncontrolled conditions. Specifically, we categorize existing face alignment techniques, present detailed descriptions of the prominent algorithms within each category, and discuss their advantages and disadvantages. Furthermore, we organize special discussions on the practical aspects of face alignment in-the-wild, towards the development of a robust face alignment system. In addition, we show performance statistics of the state of the art, and conclude this paper with several promising directions for future research.
Tasks	Face Alignment, Robust Face Alignment
Published	2016-08-15
URL	http://arxiv.org/abs/1608.04188v1
PDF	http://arxiv.org/pdf/1608.04188v1.pdf
PWC	https://paperswithcode.com/paper/face-alignment-in-the-wild-a-survey
Repo
Framework

Graph-based Predictable Feature Analysis


Title	Graph-based Predictable Feature Analysis
Authors	Björn Weghenkel, Asja Fischer, Laurenz Wiskott
Abstract	We propose graph-based predictable feature analysis (GPFA), a new method for unsupervised learning of predictable features from high-dimensional time series, where high predictability is understood very generically as low variance in the distribution of the next data point given the previous ones. We show how this measure of predictability can be understood in terms of graph embedding as well as how it relates to the information-theoretic measure of predictive information in special cases. We confirm the effectiveness of GPFA on different datasets, comparing it to three existing algorithms with similar objectives—namely slow feature analysis, forecastable component analysis, and predictable feature analysis—to which GPFA shows very competitive results.
Tasks	Graph Embedding, Time Series
Published	2016-02-01
URL	http://arxiv.org/abs/1602.00554v2
PDF	http://arxiv.org/pdf/1602.00554v2.pdf
PWC	https://paperswithcode.com/paper/graph-based-predictable-feature-analysis
Repo
Framework

Learning from Non-Stationary Stream Data in Multiobjective Evolutionary Algorithm


Title	Learning from Non-Stationary Stream Data in Multiobjective Evolutionary Algorithm
Authors	Jianyong Sun, Hu Zhang, Aimin Zhou, Qingfu Zhang
Abstract	Evolutionary algorithms (EAs) have been well acknowledged as a promising paradigm for solving optimisation problems with multiple conflicting objectives in the sense that they are able to locate a set of diverse approximations of Pareto optimal solutions in a single run. EAs drive the search for approximated solutions through maintaining a diverse population of solutions and by recombining promising solutions selected from the population. Combining machine learning techniques has shown great potential since the intrinsic structure of the Pareto optimal solutions of a multiobjective optimisation problem can be learned and used to guide effective recombination. However, existing multiobjective EAs (MOEAs) based on structure learning spend too many computational resources on learning. To address this problem, we propose to use an online learning scheme. Based on the fact that offspring along evolution are streamy, dependent and non-stationary (which implies that the intrinsic structure, if any, is temporal and scale-variant), an online agglomerative clustering algorithm is applied to adaptively discover the intrinsic structure of the Pareto optimal solution set and to guide effective offspring recombination. Experimental results have shown significant improvement over five state-of-the-art MOEAs on a set of well-known benchmark problems with complicated Pareto sets and complex Pareto fronts.
Tasks
Published	2016-06-16
URL	http://arxiv.org/abs/1606.05169v1
PDF	http://arxiv.org/pdf/1606.05169v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-non-stationary-stream-data-in
Repo
Framework

Learning from Maps: Visual Common Sense for Autonomous Driving


Title	Learning from Maps: Visual Common Sense for Autonomous Driving
Authors	Ari Seff, Jianxiong Xiao
Abstract	Today’s autonomous vehicles rely extensively on high-definition 3D maps to navigate the environment. While this approach works well when these maps are completely up-to-date, safe autonomous vehicles must be able to corroborate the map’s information via a real time sensor-based system. Our goal in this work is to develop a model for road layout inference given imagery from on-board cameras, without any reliance on high-definition maps. However, no sufficient dataset for training such a model exists. Here, we leverage the availability of standard navigation maps and corresponding street view images to construct an automatically labeled, large-scale dataset for this complex scene understanding problem. By matching road vectors and metadata from navigation maps with Google Street View images, we can assign ground truth road layout attributes (e.g., distance to an intersection, one-way vs. two-way street) to the images. We then train deep convolutional networks to predict these road layout attributes given a single monocular RGB image. Experimental evaluation demonstrates that our model learns to correctly infer the road attributes using only panoramas captured by car-mounted cameras as input. Additionally, our results indicate that this method may be suitable to the novel application of recommending safety improvements to infrastructure (e.g., suggesting an alternative speed limit for a street).
Tasks	Autonomous Driving, Autonomous Vehicles, Common Sense Reasoning, Scene Understanding
Published	2016-11-25
URL	http://arxiv.org/abs/1611.08583v2
PDF	http://arxiv.org/pdf/1611.08583v2.pdf
PWC	https://paperswithcode.com/paper/learning-from-maps-visual-common-sense-for
Repo
Framework

Region Graph Based Method for Multi-Object Detection and Tracking using Depth Cameras


Title	Region Graph Based Method for Multi-Object Detection and Tracking using Depth Cameras
Authors	Sachin Mehta, Balakrishnan Prabhakaran
Abstract	In this paper, we propose a multi-object detection and tracking method using depth cameras. Depth maps are very noisy and obscure in object detection. We first propose a region-based method to suppress high magnitude noise which cannot be filtered using spatial filters. Second, the proposed method detects Regions of Interest by temporal learning, which are then tracked using a weighted graph-based approach. We demonstrate the performance of the proposed method on standard depth camera datasets with and without object occlusions. Experimental results show that the proposed method is able to suppress high magnitude noise in depth maps and detect/track the objects (with and without occlusion).
Tasks	Object Detection
Published	2016-03-11
URL	http://arxiv.org/abs/1603.03783v1
PDF	http://arxiv.org/pdf/1603.03783v1.pdf
PWC	https://paperswithcode.com/paper/region-graph-based-method-for-multi-object
Repo
Framework

Associative Memories to Accelerate Approximate Nearest Neighbor Search


Title	Associative Memories to Accelerate Approximate Nearest Neighbor Search
Authors	Vincent Gripon, Matthias Löwe, Franck Vermet
Abstract	Nearest neighbor search is a very active field in machine learning, for it appears in many application cases, including classification and object retrieval. In its canonical version, the complexity of the search is linear in both the dimension and the cardinality of the collection of vectors the search is performed in. Recently many works have focused on reducing the dimension of vectors using quantization techniques or hashing, while providing an approximate result. In this paper we focus instead on tackling the cardinality of the collection of vectors. Namely, we introduce a technique that partitions the collection of vectors and stores each part in its own associative memory. When a query vector is given to the system, associative memories are polled to identify which one contains the closest match. Then an exhaustive search is conducted only on the part of vectors stored in the selected associative memory. We study the effectiveness of the system when messages to store are generated from i.i.d. uniform ±1 random variables or 0-1 sparse i.i.d. random variables. We also conduct experiments on both synthetic data and real data and show it is possible to achieve interesting trade-offs between complexity and accuracy.
Tasks	Quantization
Published	2016-11-10
URL	http://arxiv.org/abs/1611.05898v2
PDF	http://arxiv.org/pdf/1611.05898v2.pdf
PWC	https://paperswithcode.com/paper/associative-memories-to-accelerate
Repo
Framework
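
The partition, poll, then scan procedure described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: each associative memory is taken to be the sum of outer products of its stored ±1 vectors, and a memory is scored by the quadratic form q^T M q. Those choices, and the helper names `build_memory`, `score` and `search`, are assumptions made for the example rather than details confirmed by the abstract.

```python
import random
from typing import List, Sequence

Vector = Sequence[int]


def build_memory(part: List[Vector]) -> List[List[int]]:
    """Sum of outer products x x^T over the vectors stored in one part."""
    d = len(part[0])
    memory = [[0] * d for _ in range(d)]
    for x in part:
        for i in range(d):
            for j in range(d):
                memory[i][j] += x[i] * x[j]
    return memory


def score(memory: List[List[int]], q: Vector) -> int:
    """Quadratic form q^T M q, i.e. the sum of squared dot products <x, q>^2."""
    d = len(q)
    return sum(q[i] * memory[i][j] * q[j] for i in range(d) for j in range(d))


def search(parts: List[List[Vector]], memories: List[List[List[int]]], q: Vector) -> Vector:
    """Poll every memory, then scan only the best-scoring part exhaustively."""
    best = max(range(len(parts)), key=lambda k: score(memories[k], q))
    return max(parts[best], key=lambda x: sum(a * b for a, b in zip(x, q)))


# Synthetic data: i.i.d. uniform +/-1 vectors split into four parts.
rng = random.Random(0)
d, n_parts, per_part = 64, 4, 8
parts = [[[rng.choice((-1, 1)) for _ in range(d)] for _ in range(per_part)]
         for _ in range(n_parts)]
memories = [build_memory(p) for p in parts]
query = parts[2][5]
result = search(parts, memories, query)
```

Polling costs one quadratic form per part regardless of how many vectors the part holds, so the exhaustive scan is limited to a single part. The price is that a wrong part can be selected when the query is far from every stored vector.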

Causal Effect Identification in Acyclic Directed Mixed Graphs and Gated Models


Title	Causal Effect Identification in Acyclic Directed Mixed Graphs and Gated Models
Authors	Jose M. Peña, Marcus Bendtsen
Abstract	We introduce a new family of graphical models that consists of graphs with possibly directed, undirected and bidirected edges but without directed cycles. We show that these models are suitable for representing causal models with additive error terms. We provide a set of sufficient graphical criteria for the identification of arbitrary causal effects when the new models contain directed and undirected edges but no bidirected edge. We also provide a necessary and sufficient graphical criterion for the identification of the causal effect of a single variable on the rest of the variables. Moreover, we develop an exact algorithm for learning the new models from observational and interventional data via answer set programming. Finally, we introduce gated models for causal effect identification, a new family of graphical models that exploits context specific independences to identify additional causal effects.
Tasks
Published	2016-12-22
URL	http://arxiv.org/abs/1612.07512v2
PDF	http://arxiv.org/pdf/1612.07512v2.pdf
PWC	https://paperswithcode.com/paper/causal-effect-identification-in-acyclic
Repo
Framework

3D zigzag for multislicing, multiband and video processing


Title	3D zigzag for multislicing, multiband and video processing
Authors	Mario Mastriani
Abstract	We present a 3D zigzag rafter (first in literature) which allows us to obtain the exact sequence of spectral components after application of the 3D Discrete Cosine Transform (DCT-3D) over a cube. Such a cube represents part of a video or a group of images such as multislicing (e.g., Magnetic Resonance or Computed Tomography imaging) and multi or hyperspectral imagery (optical satellites). Besides, we present a new version of the traditional 2D zigzag, including the case of rectangular blocks. Finally, all the attached code is written in MATLAB, and that code handles both blocks of pixels and blocks of blocks.
Tasks
Published	2016-06-16
URL	http://arxiv.org/abs/1606.05255v1
PDF	http://arxiv.org/pdf/1606.05255v1.pdf
PWC	https://paperswithcode.com/paper/3d-zigzag-for-multislicing-multiband-and
Repo
Framework
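
To give a feel for what a 3D zigzag ordering looks like, here is a small Python sketch (the paper's own code is in MATLAB and is not reproduced here). It assumes the simplest generalization of the 2D zigzag: visit coefficient indices (i, j, k) in order of increasing i + j + k, reversing direction on alternate planes. The exact traversal within each plane in the paper may differ, and the function name `zigzag_3d` is hypothetical.

```python
from typing import List, Tuple


def zigzag_3d(n0: int, n1: int, n2: int) -> List[Tuple[int, int, int]]:
    """Order the indices of an n0 x n1 x n2 block by increasing i + j + k."""
    order: List[Tuple[int, int, int]] = []
    for s in range(n0 + n1 + n2 - 2):  # s runs over every possible index sum
        plane = [(i, j, s - i - j)
                 for i in range(n0)
                 for j in range(n1)
                 if 0 <= s - i - j < n2]
        if s % 2:  # alternate direction, as the 2D zigzag does
            plane.reverse()
        order.extend(plane)
    return order


# Works for rectangular blocks as well as cubes.
order = zigzag_3d(2, 3, 4)
```

Ordering by index sum places low-frequency coefficients first, which is the property that makes zigzag scans useful before truncation or entropy coding.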

Learning deep representation from coarse to fine for face alignment


Title	Learning deep representation from coarse to fine for face alignment
Authors	Zhiwen Shao, Shouhong Ding, Yiru Zhao, Qinchuan Zhang, Lizhuang Ma
Abstract	In this paper, we propose a novel face alignment method that trains a deep convolutional network from coarse to fine. It divides the given landmarks into a principal subset and an elaborate subset. We first keep a large weight for the principal subset so that our network primarily predicts their locations while slightly taking the elaborate subset into account. Next, the weight of the principal subset is gradually decreased until the two subsets have equivalent weights. This process helps to learn a good initial model and to search for the optimal model smoothly, avoiding missing fairly good intermediate models in subsequent procedures. On the challenging COFW dataset [1], our method achieves 6.33% mean error with a reduction of 21.37% compared with the best previous result [2].
Tasks	Face Alignment
Published	2016-07-31
URL	http://arxiv.org/abs/1608.00207v1
PDF	http://arxiv.org/pdf/1608.00207v1.pdf
PWC	https://paperswithcode.com/paper/learning-deep-representation-from-coarse-to
Repo
Framework
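
The coarse-to-fine weighting described in the abstract above can be sketched as a loss whose principal-subset weight decays toward 1. The linear schedule, the starting weight of 10 and the squared-error loss are assumptions chosen for illustration; the abstract only states that the weight starts large and is gradually decreased until both subsets are weighted equally.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def principal_weight(step: int, total_steps: int, start: float = 10.0) -> float:
    """Linearly decay the principal-subset weight from `start` down to 1.0."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (1.0 - start) * frac


def weighted_loss(pred: List[Point], target: List[Point],
                  is_principal: List[bool], w_principal: float) -> float:
    """Squared landmark error; elaborate landmarks keep a fixed weight of 1."""
    total = 0.0
    for (px, py), (tx, ty), principal in zip(pred, target, is_principal):
        w = w_principal if principal else 1.0
        total += w * ((px - tx) ** 2 + (py - ty) ** 2)
    return total


# Early in training the principal landmark dominates the loss.
loss_early = weighted_loss([(0.0, 0.0), (1.0, 1.0)], [(1.0, 0.0), (1.0, 3.0)],
                           [True, False], principal_weight(0, 100))
```

By the final step the weight reaches 1.0 and the loss reduces to an ordinary unweighted sum over all landmarks.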

Dense Motion Estimation for Smoke


Title	Dense Motion Estimation for Smoke
Authors	Da Chen, Wenbin Li, Peter Hall
Abstract	Motion estimation for highly dynamic phenomena such as smoke is an open challenge for Computer Vision. Traditional dense motion estimation algorithms have difficulties with non-rigid and large motions, both of which are frequently observed in smoke motion. We propose an algorithm for dense motion estimation of smoke. Our algorithm is robust, fast, and has better performance over different types of smoke compared to other dense motion estimation algorithms, including state of the art and neural network approaches. The key to our contribution is to use skeletal flow, without explicit point matching, to provide a sparse flow. This sparse flow is upgraded to a dense flow. In this paper we describe our algorithm in greater detail, and provide experimental evidence to support our claims.
Tasks	Motion Estimation
Published	2016-09-07
URL	http://arxiv.org/abs/1609.02001v2
PDF	http://arxiv.org/pdf/1609.02001v2.pdf
PWC	https://paperswithcode.com/paper/dense-motion-estimation-for-smoke
Repo
Framework