Paper Group ANR 318
SEMBED: Semantic Embedding of Egocentric Action Videos
Title | SEMBED: Semantic Embedding of Egocentric Action Videos |
Authors | Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen |
Abstract | We present SEMBED, an approach for embedding an egocentric object interaction video in a semantic-visual graph to estimate the probability distribution over its potential semantic labels. When object interactions are annotated using unbounded choice of verbs, we embrace the wealth and ambiguity of these labels by capturing the semantic relationships as well as the visual similarities over motion and appearance features. We show how SEMBED can interpret a challenging dataset of 1225 freely annotated egocentric videos, outperforming SVM classification by more than 5%. |
Tasks | |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08414v2 |
http://arxiv.org/pdf/1607.08414v2.pdf | |
PWC | https://paperswithcode.com/paper/sembed-semantic-embedding-of-egocentric |
Repo | |
Framework | |
Funnel-Structured Cascade for Multi-View Face Detection with Alignment-Awareness
Title | Funnel-Structured Cascade for Multi-View Face Detection with Alignment-Awareness |
Authors | Shuzhe Wu, Meina Kan, Zhenliang He, Shiguang Shan, Xilin Chen |
Abstract | Multi-view face detection in open environments is a challenging task due to diverse variations of face appearances and shapes. Most multi-view face detectors depend on multiple models and organize them in a parallel, pyramid or tree structure, which compromises between accuracy and time cost. Aiming at a more favorable multi-view face detector, we propose a novel funnel-structured cascade (FuSt) detection framework. In a coarse-to-fine fashion, our FuSt consists of, from top to bottom, 1) multiple view-specific fast LAB cascades for extremely quick face proposal, 2) multiple coarse MLP cascades for further candidate window verification, and 3) a unified fine MLP cascade with shape-indexed features for accurate face detection. Compared with other structures, on the one hand, the proposed one uses multiple computationally efficient distributed classifiers to propose a small number of candidate windows but with a high recall of multi-view faces. On the other hand, by using a unified MLP cascade to examine proposals of all views in a centralized style, it provides a favorable solution for multi-view face detection with high accuracy and low time cost. Besides, the FuSt detector is alignment-aware and performs a coarse facial part prediction which is beneficial for subsequent face alignment. Extensive experiments on two challenging datasets, FDDB and AFW, demonstrate the effectiveness of our FuSt detector in both accuracy and speed. |
Tasks | Face Alignment, Face Detection |
Published | 2016-09-23 |
URL | http://arxiv.org/abs/1609.07304v1 |
http://arxiv.org/pdf/1609.07304v1.pdf | |
PWC | https://paperswithcode.com/paper/funnel-structured-cascade-for-multi-view-face |
Repo | |
Framework | |
Better Conditional Density Estimation for Neural Networks
Title | Better Conditional Density Estimation for Neural Networks |
Authors | Wesley Tansey, Karl Pichotta, James G. Scott |
Abstract | The vast majority of the neural network literature focuses on predicting point values for a given set of response variables, conditioned on a feature vector. In many cases we need to model the full joint conditional distribution over the response variables rather than simply making point predictions. In this paper, we present two novel approaches to such conditional density estimation (CDE): Multiscale Nets (MSNs) and CDE Trend Filtering. Multiscale nets transform the CDE regression task into a hierarchical classification task by decomposing the density into a series of half-spaces and learning boolean probabilities of each split. CDE Trend Filtering applies a k-th order graph trend filtering penalty to the unnormalized logits of a multinomial classifier network, with each edge in the graph corresponding to a neighboring point on a discretized version of the density. We compare both methods against plain multinomial classifier networks and mixture density networks (MDNs) on a simulated dataset and three real-world datasets. The results suggest the two methods are complementary: MSNs work well in a high-data-per-feature regime and CDE-TF is well suited for few-samples-per-feature scenarios where overfitting is a primary concern. |
Tasks | Density Estimation |
Published | 2016-06-07 |
URL | http://arxiv.org/abs/1606.02321v1 |
http://arxiv.org/pdf/1606.02321v1.pdf | |
PWC | https://paperswithcode.com/paper/better-conditional-density-estimation-for |
Repo | |
Framework | |
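The abstract describes Multiscale Nets as decomposing a density into a series of binary splits with one learned probability per split. The following is a minimal sketch of how such split probabilities could be combined into a distribution over discretized bins, assuming a balanced binary tree over the bins. The function name `bin_probabilities` and the input layout are hypothetical, and this is not the authors' implementation.

```python
def bin_probabilities(split_probs):
    """Combine per-node 'left half' probabilities into bin probabilities.

    split_probs[level] is assumed to hold one probability per tree node at
    that level (2**level nodes), giving the chance that the response falls
    in the left half of that node's interval. This layout is an assumption
    made for illustration.
    """
    probs = [1.0]  # the root node holds all probability mass
    for level_probs in split_probs:
        if len(level_probs) != len(probs):
            raise ValueError("each level needs one probability per node")
        next_probs = []
        for mass, p_left in zip(probs, level_probs):
            # Split the node's mass between its left and right child.
            next_probs.append(mass * p_left)
            next_probs.append(mass * (1.0 - p_left))
        probs = next_probs
    return probs


# Two levels of splits give a distribution over four bins, ordered left to right.
example = bin_probabilities([[0.5], [0.2, 0.9]])
```

Because every node only divides its own mass, the resulting bin probabilities sum to one by construction, whatever values the individual splits take.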
Solving Combinatorial Optimization problems with Quantum inspired Evolutionary Algorithm Tuned using a Novel Heuristic Method
Title | Solving Combinatorial Optimization problems with Quantum inspired Evolutionary Algorithm Tuned using a Novel Heuristic Method |
Authors | Nija Mani, Gursaran, Ashish Mani |
Abstract | Quantum inspired Evolutionary Algorithms were proposed more than a decade ago and have been employed for solving a wide range of difficult search and optimization problems. A number of changes have been proposed to improve the performance of canonical QEA. However, canonical QEA is one of the few evolutionary algorithms that use a search operator with a relatively large number of parameters. It is well known that the performance of evolutionary algorithms depends on the specific values of parameters for a given problem. The advantage of having a large number of parameters in an operator is that the search process can be made more powerful even with a single operator, without requiring a combination of other operators for exploration and exploitation. However, the tuning of operators with a large number of parameters is complex and computationally expensive. This paper proposes a novel heuristic method for tuning the parameters of canonical QEA. The tuned QEA outperforms canonical QEA on a class of discrete combinatorial optimization problems, which validates the design of the proposed parameter tuning framework. The proposed framework can be used for tuning other algorithms with both large and small numbers of tunable parameters. |
Tasks | Combinatorial Optimization |
Published | 2016-12-23 |
URL | https://arxiv.org/abs/1612.08109v2 |
https://arxiv.org/pdf/1612.08109v2.pdf | |
PWC | https://paperswithcode.com/paper/solving-combinatorial-optimization-problems |
Repo | |
Framework | |
Kernel functions based on triplet comparisons
Title | Kernel functions based on triplet comparisons |
Authors | Matthäus Kleindessner, Ulrike von Luxburg |
Abstract | Given only information in the form of similarity triplets “Object A is more similar to object B than to object C” about a data set, we propose two ways of defining a kernel function on the data set. While previous approaches construct a low-dimensional Euclidean embedding of the data set that reflects the given similarity triplets, we aim at defining kernel functions that correspond to high-dimensional embeddings. These kernel functions can subsequently be used to apply any kernel method to the data set. |
Tasks | |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08456v2 |
http://arxiv.org/pdf/1607.08456v2.pdf | |
PWC | https://paperswithcode.com/paper/kernel-functions-based-on-triplet-comparisons |
Repo | |
Framework | |
Face Alignment In-the-Wild: A Survey
Title | Face Alignment In-the-Wild: A Survey |
Authors | Xin Jin, Xiaoyang Tan |
Abstract | Over the last two decades, face alignment or localizing fiducial facial points has received increasing attention owing to its comprehensive applications in automatic face analysis. However, such a task has proven extremely challenging in unconstrained environments due to many confounding factors, such as pose, occlusions, expression and illumination. While numerous techniques have been developed to address these challenges, this problem is still far from being solved. In this survey, we present an up-to-date critical review of the existing literature on face alignment, focusing on those methods addressing overall difficulties and challenges of this topic under uncontrolled conditions. Specifically, we categorize existing face alignment techniques, present detailed descriptions of the prominent algorithms within each category, and discuss their advantages and disadvantages. Furthermore, we organize special discussions on the practical aspects of face alignment in-the-wild, towards the development of a robust face alignment system. In addition, we show performance statistics of the state of the art, and conclude this paper with several promising directions for future research. |
Tasks | Face Alignment, Robust Face Alignment |
Published | 2016-08-15 |
URL | http://arxiv.org/abs/1608.04188v1 |
http://arxiv.org/pdf/1608.04188v1.pdf | |
PWC | https://paperswithcode.com/paper/face-alignment-in-the-wild-a-survey |
Repo | |
Framework | |
Graph-based Predictable Feature Analysis
Title | Graph-based Predictable Feature Analysis |
Authors | Björn Weghenkel, Asja Fischer, Laurenz Wiskott |
Abstract | We propose graph-based predictable feature analysis (GPFA), a new method for unsupervised learning of predictable features from high-dimensional time series, where high predictability is understood very generically as low variance in the distribution of the next data point given the previous ones. We show how this measure of predictability can be understood in terms of graph embedding as well as how it relates to the information-theoretic measure of predictive information in special cases. We confirm the effectiveness of GPFA on different datasets, comparing it to three existing algorithms with similar objectives—namely slow feature analysis, forecastable component analysis, and predictable feature analysis—to which GPFA shows very competitive results. |
Tasks | Graph Embedding, Time Series |
Published | 2016-02-01 |
URL | http://arxiv.org/abs/1602.00554v2 |
http://arxiv.org/pdf/1602.00554v2.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-predictable-feature-analysis |
Repo | |
Framework | |
Learning from Non-Stationary Stream Data in Multiobjective Evolutionary Algorithm
Title | Learning from Non-Stationary Stream Data in Multiobjective Evolutionary Algorithm |
Authors | Jianyong Sun, Hu Zhang, Aimin Zhou, Qingfu Zhang |
Abstract | Evolutionary algorithms (EAs) have been well acknowledged as a promising paradigm for solving optimisation problems with multiple conflicting objectives in the sense that they are able to locate a set of diverse approximations of Pareto optimal solutions in a single run. EAs drive the search for approximated solutions by maintaining a diverse population of solutions and by recombining promising solutions selected from the population. Combining machine learning techniques has shown great potential since the intrinsic structure of the Pareto optimal solutions of a multiobjective optimisation problem can be learned and used to guide effective recombination. However, existing multiobjective EAs (MOEAs) based on structure learning spend too many computational resources on learning. To address this problem, we propose to use an online learning scheme. Based on the fact that offspring along evolution are streamy, dependent and non-stationary (which implies that the intrinsic structure, if any, is temporal and scale-variant), an online agglomerative clustering algorithm is applied to adaptively discover the intrinsic structure of the Pareto optimal solution set and to guide effective offspring recombination. Experimental results have shown significant improvement over five state-of-the-art MOEAs on a set of well-known benchmark problems with complicated Pareto sets and complex Pareto fronts. |
Tasks | |
Published | 2016-06-16 |
URL | http://arxiv.org/abs/1606.05169v1 |
http://arxiv.org/pdf/1606.05169v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-non-stationary-stream-data-in |
Repo | |
Framework | |
Learning from Maps: Visual Common Sense for Autonomous Driving
Title | Learning from Maps: Visual Common Sense for Autonomous Driving |
Authors | Ari Seff, Jianxiong Xiao |
Abstract | Today’s autonomous vehicles rely extensively on high-definition 3D maps to navigate the environment. While this approach works well when these maps are completely up-to-date, safe autonomous vehicles must be able to corroborate the map’s information via a real time sensor-based system. Our goal in this work is to develop a model for road layout inference given imagery from on-board cameras, without any reliance on high-definition maps. However, no sufficient dataset for training such a model exists. Here, we leverage the availability of standard navigation maps and corresponding street view images to construct an automatically labeled, large-scale dataset for this complex scene understanding problem. By matching road vectors and metadata from navigation maps with Google Street View images, we can assign ground truth road layout attributes (e.g., distance to an intersection, one-way vs. two-way street) to the images. We then train deep convolutional networks to predict these road layout attributes given a single monocular RGB image. Experimental evaluation demonstrates that our model learns to correctly infer the road attributes using only panoramas captured by car-mounted cameras as input. Additionally, our results indicate that this method may be suitable to the novel application of recommending safety improvements to infrastructure (e.g., suggesting an alternative speed limit for a street). |
Tasks | Autonomous Driving, Autonomous Vehicles, Common Sense Reasoning, Scene Understanding |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08583v2 |
http://arxiv.org/pdf/1611.08583v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-maps-visual-common-sense-for |
Repo | |
Framework | |
Region Graph Based Method for Multi-Object Detection and Tracking using Depth Cameras
Title | Region Graph Based Method for Multi-Object Detection and Tracking using Depth Cameras |
Authors | Sachin Mehta, Balakrishnan Prabhakaran |
Abstract | In this paper, we propose a multi-object detection and tracking method using depth cameras. Depth maps are very noisy, which obscures object detection. We first propose a region-based method to suppress high magnitude noise which cannot be filtered using spatial filters. Second, the proposed method detects Regions of Interest by temporal learning, which are then tracked using a weighted graph-based approach. We demonstrate the performance of the proposed method on standard depth camera datasets with and without object occlusions. Experimental results show that the proposed method is able to suppress high magnitude noise in depth maps and detect/track the objects (with and without occlusion). |
Tasks | Object Detection |
Published | 2016-03-11 |
URL | http://arxiv.org/abs/1603.03783v1 |
http://arxiv.org/pdf/1603.03783v1.pdf | |
PWC | https://paperswithcode.com/paper/region-graph-based-method-for-multi-object |
Repo | |
Framework | |
Associative Memories to Accelerate Approximate Nearest Neighbor Search
Title | Associative Memories to Accelerate Approximate Nearest Neighbor Search |
Authors | Vincent Gripon, Matthias Löwe, Franck Vermet |
Abstract | Nearest neighbor search is a very active field in machine learning, for it appears in many application cases, including classification and object retrieval. In its canonical version, the complexity of the search is linear in both the dimension and the cardinality of the collection of vectors the search is performed in. Recently many works have focused on reducing the dimension of vectors using quantization techniques or hashing, while providing an approximate result. In this paper we focus instead on tackling the cardinality of the collection of vectors. Namely, we introduce a technique that partitions the collection of vectors and stores each part in its own associative memory. When a query vector is given to the system, associative memories are polled to identify which one contains the closest match. Then an exhaustive search is conducted only on the part of vectors stored in the selected associative memory. We study the effectiveness of the system when messages to store are generated from i.i.d. uniform $\pm 1$ random variables or 0-1 sparse i.i.d. random variables. We also conduct experiments on both synthetic data and real data and show it is possible to achieve interesting trade-offs between complexity and accuracy. |
Tasks | Quantization |
Published | 2016-11-10 |
URL | http://arxiv.org/abs/1611.05898v2 |
http://arxiv.org/pdf/1611.05898v2.pdf | |
PWC | https://paperswithcode.com/paper/associative-memories-to-accelerate |
Repo | |
Framework | |
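The partition, poll, then search procedure in the abstract can be sketched as follows. This is a minimal illustration, assuming each associative memory is a sum of outer products of its stored $\pm 1$ vectors and is polled with the quadratic form $q^\top W q$; the abstract does not specify the memory model, so that choice and the names `build_memory`, `poll` and `search` are assumptions rather than the authors' code.

```python
def build_memory(vectors):
    """Store one part as W = sum of outer products x x^T (assumed memory model)."""
    dim = len(vectors[0])
    memory = [[0] * dim for _ in range(dim)]
    for x in vectors:
        for i in range(dim):
            for j in range(dim):
                memory[i][j] += x[i] * x[j]
    return memory


def poll(memory, query):
    """Score q^T W q, which equals the sum of squared dot products with stored vectors."""
    dim = len(query)
    return sum(query[i] * memory[i][j] * query[j]
               for i in range(dim) for j in range(dim))


def search(parts, memories, query):
    """Pick the best-scoring memory, then search exhaustively inside that part only."""
    best = max(range(len(parts)), key=lambda k: poll(memories[k], query))
    # For +/-1 vectors the largest dot product identifies the nearest neighbor.
    return max(parts[best], key=lambda x: sum(a * b for a, b in zip(x, query)))


parts = [
    [(1, 1, 1, 1), (1, 1, -1, -1)],
    [(1, -1, 1, -1), (-1, 1, 1, -1)],
]
memories = [build_memory(part) for part in parts]
match = search(parts, memories, (1, -1, 1, -1))
```

Polling costs one quadratic form per part regardless of how many vectors the part holds, so the exhaustive step only touches the vectors of the selected part. The result is approximate: a query whose true neighbor sits in a lower-scoring part will be missed.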
Causal Effect Identification in Acyclic Directed Mixed Graphs and Gated Models
Title | Causal Effect Identification in Acyclic Directed Mixed Graphs and Gated Models |
Authors | Jose M. Peña, Marcus Bendtsen |
Abstract | We introduce a new family of graphical models that consists of graphs with possibly directed, undirected and bidirected edges but without directed cycles. We show that these models are suitable for representing causal models with additive error terms. We provide a set of sufficient graphical criteria for the identification of arbitrary causal effects when the new models contain directed and undirected edges but no bidirected edge. We also provide a necessary and sufficient graphical criterion for the identification of the causal effect of a single variable on the rest of the variables. Moreover, we develop an exact algorithm for learning the new models from observational and interventional data via answer set programming. Finally, we introduce gated models for causal effect identification, a new family of graphical models that exploits context specific independences to identify additional causal effects. |
Tasks | |
Published | 2016-12-22 |
URL | http://arxiv.org/abs/1612.07512v2 |
http://arxiv.org/pdf/1612.07512v2.pdf | |
PWC | https://paperswithcode.com/paper/causal-effect-identification-in-acyclic |
Repo | |
Framework | |
3D zigzag for multislicing, multiband and video processing
Title | 3D zigzag for multislicing, multiband and video processing |
Authors | Mario Mastriani |
Abstract | We present a 3D zigzag rafter (first in literature) which allows us to obtain the exact sequence of spectral components after application of the 3D Discrete Cosine Transform (DCT-3D) over a cube. Such a cube represents part of a video or a group of images such as multislicing (e.g., Magnetic Resonance or Computed Tomography imaging) and multi or hyperspectral imagery (optical satellites). Besides, we present a new version of the traditional 2D zigzag, including the case of rectangular blocks. Finally, all the attached code is written in MATLAB, and that code handles both blocks of pixels and blocks of blocks. |
Tasks | |
Published | 2016-06-16 |
URL | http://arxiv.org/abs/1606.05255v1 |
http://arxiv.org/pdf/1606.05255v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-zigzag-for-multislicing-multiband-and |
Repo | |
Framework | |
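For readers unfamiliar with the traditional 2D zigzag mentioned in the abstract, here is a minimal sketch of the conventional JPEG-style scan order extended to rectangular blocks. It is written in Python for illustration and is not the author's MATLAB code or the paper's 3D ordering; the function name `zigzag_indices` is hypothetical.

```python
def zigzag_indices(rows, cols):
    """Return (row, col) pairs of a rows x cols block in zigzag scan order.

    Coefficients are visited one anti-diagonal at a time (row + col constant),
    alternating the direction of travel on successive diagonals.
    """
    order = []
    for s in range(rows + cols - 1):
        # Cells on this anti-diagonal, listed from top-right to bottom-left.
        diagonal = [(i, s - i) for i in range(rows) if 0 <= s - i < cols]
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            diagonal.reverse()
        order.extend(diagonal)
    return order


# Scan order for a 2 x 3 rectangular block.
scan = zigzag_indices(2, 3)
```

Applying the index list to a block of transform coefficients yields a 1D sequence that moves from low to high spatial frequencies, which is the usual reason for using a zigzag scan after a DCT.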
Learning deep representation from coarse to fine for face alignment
Title | Learning deep representation from coarse to fine for face alignment |
Authors | Zhiwen Shao, Shouhong Ding, Yiru Zhao, Qinchuan Zhang, Lizhuang Ma |
Abstract | In this paper, we propose a novel face alignment method that trains a deep convolutional network from coarse to fine. It divides the given landmarks into a principal subset and an elaborate subset. We first keep a large weight for the principal subset to make our network primarily predict their locations while slightly taking the elaborate subset into account. Next, the weight of the principal subset is gradually decreased until the two subsets have equivalent weights. This process helps to learn a good initial model and to search for the optimal model smoothly, avoiding missing fairly good intermediate models in subsequent procedures. On the challenging COFW dataset [1], our method achieves 6.33% mean error with a reduction of 21.37% compared with the best previous result [2]. |
Tasks | Face Alignment |
Published | 2016-07-31 |
URL | http://arxiv.org/abs/1608.00207v1 |
http://arxiv.org/pdf/1608.00207v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-representation-from-coarse-to |
Repo | |
Framework | |
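The coarse-to-fine weighting in the abstract can be illustrated with a small sketch. The abstract only states that the principal subset starts with a large weight that is gradually decreased until both subsets are weighted equally; the linear decay, the starting weight of 10, the squared-error loss and the names `principal_weight` and `weighted_landmark_loss` are all assumptions made for illustration.

```python
def principal_weight(step, total_steps, start_weight=10.0):
    """Decay the principal-subset weight linearly from start_weight to 1.0.

    The linear schedule and the default start_weight are assumptions; the
    abstract does not give the actual schedule.
    """
    if total_steps <= 1:
        return 1.0
    fraction = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start_weight + (1.0 - start_weight) * fraction


def weighted_landmark_loss(pred, target, principal_ids, weight):
    """Sum of squared landmark errors, with principal landmarks scaled by weight."""
    total = 0.0
    for idx, ((px, py), (tx, ty)) in enumerate(zip(pred, target)):
        w = weight if idx in principal_ids else 1.0
        total += w * ((px - tx) ** 2 + (py - ty) ** 2)
    return total


# Landmark 0 is treated as principal, landmark 1 as elaborate.
loss_early = weighted_landmark_loss(
    [(0.0, 0.0), (1.0, 1.0)], [(1.0, 0.0), (1.0, 3.0)],
    principal_ids={0}, weight=principal_weight(0, 5),
)
```

Early in training the principal landmarks dominate the loss, and by the final step the weight reaches 1.0 so that both subsets contribute equally.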
Dense Motion Estimation for Smoke
Title | Dense Motion Estimation for Smoke |
Authors | Da Chen, Wenbin Li, Peter Hall |
Abstract | Motion estimation for highly dynamic phenomena such as smoke is an open challenge for Computer Vision. Traditional dense motion estimation algorithms have difficulties with non-rigid and large motions, both of which are frequently observed in smoke motion. We propose an algorithm for dense motion estimation of smoke. Our algorithm is robust, fast, and has better performance over different types of smoke compared to other dense motion estimation algorithms, including state of the art and neural network approaches. The key to our contribution is to use skeletal flow, without explicit point matching, to provide a sparse flow. This sparse flow is upgraded to a dense flow. In this paper we describe our algorithm in greater detail, and provide experimental evidence to support our claims. |
Tasks | Motion Estimation |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.02001v2 |
http://arxiv.org/pdf/1609.02001v2.pdf | |
PWC | https://paperswithcode.com/paper/dense-motion-estimation-for-smoke |
Repo | |
Framework | |