July 27, 2019

3045 words 15 mins read

Paper Group ANR 591

Drift Analysis. Supervised Deep Hashing for Hierarchical Labeled Data. Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning. A Real-time and Registration-free Framework for Dynamic Shape Instantiation. Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning. Product Graph-based Highe …

Drift Analysis

Title Drift Analysis
Authors Johannes Lengler
Abstract Drift analysis is one of the major tools for analysing evolutionary algorithms and nature-inspired search heuristics. In this chapter we give an introduction to drift analysis and give some examples of how to use it for the analysis of evolutionary algorithms.
Tasks
Published 2017-12-04
URL http://arxiv.org/abs/1712.00964v2
PDF http://arxiv.org/pdf/1712.00964v2.pdf
PWC https://paperswithcode.com/paper/drift-analysis
Repo
Framework
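
A minimal simulation of the additive drift theorem discussed above, assuming a toy process in which the potential drops by 1 with probability p each step (so the drift is delta = p); the chapter itself covers far more general drift theorems. For this particular process the bound E[T] <= X_0/delta is tight, which makes it a convenient sanity check.

```python
# Toy check of the additive drift bound E[T] <= X_0 / delta (assumed process, not from the chapter).
import random

def hitting_time(x0: int, p: float) -> int:
    """Steps until X reaches 0 when X drops by 1 with probability p per step."""
    x, t = x0, 0
    while x > 0:
        t += 1
        if random.random() < p:
            x -= 1
    return t

x0, p, runs = 50, 0.25, 2000
empirical = sum(hitting_time(x0, p) for _ in range(runs)) / runs
print(f"empirical E[T] ~ {empirical:.1f}, additive drift bound x0/delta = {x0 / p:.1f}")
```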

Supervised Deep Hashing for Hierarchical Labeled Data

Title Supervised Deep Hashing for Hierarchical Labeled Data
Authors Dan Wang, Heyan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao
Abstract Recently, hashing methods have been widely used in large-scale image retrieval. However, most existing hashing methods do not consider the hierarchical relation of labels, which means that they ignore the rich information stored in the hierarchy. Moreover, most previous works treat each bit in a hash code equally, which does not suit the scenario of hierarchical labeled data. In this paper, we propose a novel deep hashing method, called supervised hierarchical deep hashing (SHDH), to perform hash code learning for hierarchical labeled data. Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each layer, and design a deep convolutional neural network to obtain a hash code for each data point. Extensive experiments on several real-world public datasets show that the proposed method outperforms the state-of-the-art baselines in the image retrieval task.
Tasks Image Retrieval
Published 2017-04-07
URL http://arxiv.org/abs/1704.02088v3
PDF http://arxiv.org/pdf/1704.02088v3.pdf
PWC https://paperswithcode.com/paper/supervised-deep-hashing-for-hierarchical
Repo
Framework
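
Editor's sketch of the layer-weighted similarity idea for hierarchical labels; the exponential-decay weights and the normalization are assumptions for illustration, not the exact SHDH formula.

```python
# Hedged sketch: similarity between two hierarchical label paths, weighting each layer (weights assumed).
def hierarchical_similarity(path_a, path_b, decay=0.5):
    """path_a, path_b: label paths from root to leaf, e.g. ('animal', 'dog', 'collie')."""
    depth = min(len(path_a), len(path_b))
    weights = [decay ** level for level in range(depth)]        # deeper layers weigh less (assumption)
    total = sum(weights)
    matched = sum(w for w, a, b in zip(weights, path_a, path_b) if a == b)
    return matched / total                                      # normalized to [0, 1]

print(hierarchical_similarity(("animal", "dog", "collie"),
                              ("animal", "dog", "greyhound")))  # high: coarse labels agree
print(hierarchical_similarity(("animal", "dog", "collie"),
                              ("vehicle", "car", "sedan")))     # zero: disjoint hierarchy
```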

Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning

Title Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning
Authors Kirthevasan Kandasamy, Jeff Schneider, Barnabás Póczos
Abstract A common problem in disciplines of applied Statistics research such as Astrostatistics is that of estimating the posterior distribution of relevant parameters. Typically, the likelihoods for such models are computed via expensive experiments such as cosmological simulations of the universe. An urgent challenge in these research domains is to develop methods that can estimate the posterior with few likelihood evaluations. In this paper, we study active posterior estimation in a Bayesian setting when the likelihood is expensive to evaluate. Existing techniques for posterior estimation are based on generating samples representative of the posterior. Such methods do not consider efficiency in terms of likelihood evaluations. In order to be query efficient, we treat posterior estimation in an active regression framework. We propose two myopic query strategies to choose where to evaluate the likelihood and implement them using Gaussian processes. Via experiments on a series of synthetic and real examples we demonstrate that our approach is significantly more query efficient than existing techniques and other heuristics for posterior estimation.
Tasks Active Learning, Gaussian Processes
Published 2017-02-03
URL http://arxiv.org/abs/1702.01145v1
PDF http://arxiv.org/pdf/1702.01145v1.pdf
PWC https://paperswithcode.com/paper/query-efficient-posterior-estimation-in
Repo
Framework
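
A hedged sketch of query-efficient posterior estimation with a Gaussian process surrogate, assuming a toy 1-D Gaussian log-likelihood and plain uncertainty sampling in place of the paper's two myopic strategies.

```python
# Active posterior estimation sketch: fit a GP to log-likelihood values, query where uncertainty is highest.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_log_likelihood(theta):
    # stand-in for a costly simulation; a real run would call the experiment here
    return -0.5 * ((theta - 1.2) / 0.4) ** 2

grid = np.linspace(-3, 3, 200).reshape(-1, 1)          # candidate query locations
X = np.array([[-2.0], [0.0], [2.0]])                   # small initial design
y = np.array([expensive_log_likelihood(t[0]) for t in X])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
for _ in range(10):                                    # budget of 10 extra likelihood evaluations
    gp.fit(X, y)
    _, std = gp.predict(grid, return_std=True)
    x_next = grid[np.argmax(std)]                      # query where the surrogate is most uncertain
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_log_likelihood(x_next[0]))

gp.fit(X, y)
mean = gp.predict(grid)
posterior = np.exp(mean - mean.max())                  # unnormalized posterior (flat prior assumed)
print("posterior mode near", float(grid[np.argmax(posterior)][0]))
```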

A Real-time and Registration-free Framework for Dynamic Shape Instantiation

Title A Real-time and Registration-free Framework for Dynamic Shape Instantiation
Authors Xiao-Yun Zhou, Guang-Zhong Yang, Su-Lin Lee
Abstract Real-time 3D navigation during minimally invasive procedures is an essential yet challenging task, especially when considerable tissue motion is involved. To balance image acquisition speed and resolution, only 2D images or low-resolution 3D volumes can be used clinically. In this paper, a real-time and registration-free framework for dynamic shape instantiation, generalizable to multiple anatomical applications, is proposed to instantiate high-resolution 3D shapes of an organ from a single 2D image intra-operatively. Firstly, an approximate optimal scan plane was determined by analyzing the pre-operative 3D statistical shape model (SSM) of the anatomy with sparse principal component analysis (SPCA) and considering practical constraints. Secondly, kernel partial least squares regression (KPLSR) was used to learn the relationship between the pre-operative 3D SSM and a synchronized 2D SSM constructed from 2D images obtained at the approximate optimal scan plane. Finally, the derived relationship was applied to the new intra-operative 2D image obtained at the same scan plane to predict the high-resolution 3D shape intra-operatively. A major feature of the proposed framework is that no extra registration between the pre-operative 3D SSM and the synchronized 2D SSM is required. Detailed validation was performed on studies including the liver and right ventricle (RV) of the heart. The derived results (mean accuracy of 2.19 mm on patients and computation speed of 1 ms) demonstrate its potential clinical value for real-time, high-resolution, dynamic and 3D interventional guidance.
Tasks
Published 2017-12-30
URL http://arxiv.org/abs/1801.00182v1
PDF http://arxiv.org/pdf/1801.00182v1.pdf
PWC https://paperswithcode.com/paper/a-real-time-and-registration-free-framework
Repo
Framework
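
A rough sketch of the 2D-to-3D regression step, using RBF-kernelized inputs with linear PLS as a simple stand-in for kernel partial least squares regression (KPLSR); the dimensions and the synthetic data are assumptions, not the paper's SSM pipeline.

```python
# Kernelized-input PLS as a stand-in for KPLSR: predict 3D SSM coefficients from 2D SSM coefficients.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 20))                 # 2D SSM coefficients per frame (assumed dims)
Y_train = X_train @ rng.normal(size=(20, 300))       # corresponding 3D SSM coefficients (toy linear link)

K_train = rbf_kernel(X_train, X_train, gamma=0.05)   # kernelize the 2D shape features
pls = PLSRegression(n_components=10)
pls.fit(K_train, Y_train)

x_new = rng.normal(size=(1, 20))                     # intra-operative 2D shape at the same scan plane
K_new = rbf_kernel(x_new, X_train, gamma=0.05)       # kernel against the training set
y_pred = pls.predict(K_new)                          # instantiated high-resolution 3D shape
print(y_pred.shape)                                  # (1, 300)
```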

Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning

Title Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning
Authors Wenbin Li, Jeannette Bohg, Mario Fritz
Abstract Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. Thereby, we bypass the need to explicitly model physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies which are parametrized by a goal. We validated the model on a toy example of navigating in a grid world with different target positions and on a block stacking task with different target structures for the final tower. In contrast to prior work, our policies show better generalization across different goals.
Tasks
Published 2017-11-01
URL http://arxiv.org/abs/1711.00267v2
PDF http://arxiv.org/pdf/1711.00267v2.pdf
PWC https://paperswithcode.com/paper/acquiring-target-stacking-skills-by-goal
Repo
Framework
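
A tabular sketch of the goal-parameterized idea on a 1-D grid world: the goal is part of the Q-function's input, so one policy serves every target. The paper uses a deep network; this only illustrates the conditioning.

```python
# Goal-conditioned tabular Q-learning: Q is indexed by (state, goal, action), so goals vary per trial.
import random
from collections import defaultdict

N, ACTIONS = 10, (-1, +1)                 # grid size and moves (left, right)
Q = defaultdict(float)                    # Q[(state, goal, action)]
alpha, gamma, eps = 0.5, 0.95, 0.1

for episode in range(5000):
    goal = random.randrange(N)            # a new target for every trial
    s = random.randrange(N)
    for _ in range(50):
        if s == goal:
            break
        a = random.choice(ACTIONS) if random.random() < eps else \
            max(ACTIONS, key=lambda act: Q[(s, goal, act)])
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == goal else -0.01  # sparse goal reward plus a small step cost
        best_next = max(Q[(s2, goal, b)] for b in ACTIONS)
        Q[(s, goal, a)] += alpha * (r + gamma * best_next - Q[(s, goal, a)])
        s = s2

# The learned policy generalizes across goals because the goal is an input, not a constant.
print(max(ACTIONS, key=lambda act: Q[(2, 7, act)]))   # expect +1: move right toward goal 7
```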

Product Graph-based Higher Order Contextual Similarities for Inexact Subgraph Matching

Title Product Graph-based Higher Order Contextual Similarities for Inexact Subgraph Matching
Authors Anjan Dutta, Josep Lladós, Horst Bunke, Umapada Pal
Abstract Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched. Pairwise measurements usually consider local attributes but disregard contextual information involved in graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in TPG. We use contextual similarities to construct an objective function and optimize it with a linear programming approach. Since random walk formulation through TPG takes into account higher order information, it is not a surprise that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results shown on synthetic as well as real benchmarks illustrate that higher order contextual similarities add discriminating power and allow one to find approximate solutions to the subgraph matching problem.
Tasks Graph Matching
Published 2017-02-01
URL http://arxiv.org/abs/1702.00391v1
PDF http://arxiv.org/pdf/1702.00391v1.pdf
PWC https://paperswithcode.com/paper/product-graph-based-higher-order-contextual
Repo
Framework
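
A small sketch of tensor-product-graph contextual similarities: the TPG adjacency is the Kronecker product of the two graphs' adjacency matrices, and context accumulates over decayed walks; the decay factor, walk length and toy graphs are assumptions.

```python
# TPG contextual similarity sketch: sum decayed walks of length 1..K on the Kronecker product graph.
import numpy as np

A1 = np.array([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]], dtype=float)        # graph 1 adjacency (pattern graph, assumed)
A2 = np.array([[0, 1, 0, 1],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [1, 0, 1, 0]], dtype=float)     # graph 2 adjacency (target graph, assumed)

W = np.kron(A1, A2)                            # TPG: node (i, j) pairs node i of A1 with node j of A2
W = W / max(W.sum(axis=1).max(), 1.0)          # normalize so the walk series stays bounded

K, decay = 5, 0.8
context = np.zeros_like(W)
walk = np.eye(W.shape[0])
for _ in range(K):
    walk = walk @ (decay * W)                  # walks one step longer, downweighted
    context += walk

# context[p, q] scores how strongly paired node p supports paired node q via shared structure;
# column sums give a contextual similarity for each (i, j) node pair, usable in subgraph matching.
print(context.sum(axis=0).reshape(len(A1), len(A2)))
```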

Out-of-focus: Learning Depth from Image Bokeh for Robotic Perception

Title Out-of-focus: Learning Depth from Image Bokeh for Robotic Perception
Authors Eric Cristofalo, Zijian Wang
Abstract In this project, we propose a novel approach for estimating depth from RGB images. Traditionally, most work uses a single RGB image to estimate depth, which is inherently difficult and generally results in poor performance, even with thousands of data examples. In this work, we alternatively use multiple RGB images that were captured while changing the focus of the camera’s lens. This method leverages the natural depth information correlated with the different patterns of clarity/blur in the sequence of focal images, which helps distinguish objects at different depths. Since no such data set exists for learning this mapping, we collect our own data set using customized hardware. We then use a convolutional neural network to learn the depth from the stacked focal images. Comparative studies were conducted on both a standard RGBD data set and our own data set (learning from both single and multiple images), and the results verified that stacked focal images yield better depth estimation than using just a single RGB image.
Tasks Depth Estimation
Published 2017-05-02
URL http://arxiv.org/abs/1705.01152v1
PDF http://arxiv.org/pdf/1705.01152v1.pdf
PWC https://paperswithcode.com/paper/out-of-focus-learning-depth-from-image-bokeh
Repo
Framework
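
Not the paper's CNN, but a classical depth-from-focus baseline that shows why a focal stack carries depth information: per pixel, the slice with the highest local sharpness indicates the in-focus (and hence depth) layer.

```python
# Classical depth-from-focus baseline (not the paper's learned model): argmax of sharpness across the stack.
import numpy as np

def sharpness(img):
    gy, gx = np.gradient(img.astype(float))
    return gx ** 2 + gy ** 2                       # squared gradient magnitude as a focus measure

def depth_from_focal_stack(stack):
    """stack: array of shape (num_focal_planes, H, W), grayscale frames."""
    scores = np.stack([sharpness(img) for img in stack])
    return np.argmax(scores, axis=0)               # per-pixel index of the sharpest slice

# Toy usage with a random stack; real input would be frames captured while sweeping focus.
stack = np.random.rand(8, 64, 64)
depth_index = depth_from_focal_stack(stack)
print(depth_index.shape)                           # (64, 64) coarse depth labels
```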

Leveraging Sparsity for Efficient Submodular Data Summarization

Title Leveraging Sparsity for Efficient Submodular Data Summarization
Authors Erik M. Lindgren, Shanshan Wu, Alexandros G. Dimakis
Abstract The facility location problem is widely used for summarizing large datasets and has additional applications in sensor placement, image retrieval, and clustering. One difficulty of this problem is that submodular optimization algorithms require the calculation of pairwise benefits for all items in the dataset. This is infeasible for large problems, so recent work proposed to only calculate nearest neighbor benefits. One limitation is that several strong assumptions were invoked to obtain provable approximation guarantees. In this paper we establish that these extra assumptions are not necessary—solving the sparsified problem will be almost optimal under the standard assumptions of the problem. We then analyze a different method of sparsification that is a better model for methods such as Locality Sensitive Hashing to accelerate the nearest neighbor computations and extend the use of the problem to a broader family of similarities. We validate our approach by demonstrating that it rapidly generates interpretable summaries.
Tasks Data Summarization, Image Retrieval
Published 2017-03-08
URL http://arxiv.org/abs/1703.02690v1
PDF http://arxiv.org/pdf/1703.02690v1.pdf
PWC https://paperswithcode.com/paper/leveraging-sparsity-for-efficient-submodular
Repo
Framework
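
A hedged sketch of greedy facility-location summarization on a sparsified benefit matrix, where only each item's k nearest neighbors keep nonzero benefit; the value of k and the similarity function are assumptions, not the paper's setup.

```python
# Greedy facility location on a k-NN-sparsified benefit matrix (sparsification in the spirit of the paper).
import numpy as np

rng = np.random.default_rng(1)
points = rng.normal(size=(200, 5))
d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
sims = np.exp(-d2)                                    # nonnegative pairwise benefits (assumed RBF similarity)

k = 10
thresh = np.sort(sims, axis=1)[:, -k][:, None]        # k-th largest benefit per row
sparse = np.where(sims >= thresh, sims, 0.0)          # keep only nearest-neighbor benefits

def greedy_facility_location(benefit, budget):
    """Greedily pick exemplars maximizing sum_i max_{j in S} benefit[i, j]."""
    selected, covered = [], np.zeros(benefit.shape[0])
    for _ in range(budget):
        gains = np.maximum(benefit, covered[:, None]).sum(axis=0) - covered.sum()
        best = int(np.argmax(gains))
        selected.append(best)
        covered = np.maximum(covered, benefit[:, best])
    return selected

print(greedy_facility_location(sparse, budget=5))     # indices of 5 exemplar items as the summary
```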

CMU LiveMedQA at TREC 2017 LiveQA: A Consumer Health Question Answering System

Title CMU LiveMedQA at TREC 2017 LiveQA: A Consumer Health Question Answering System
Authors Yuan Yang, Jingcheng Yu, Ye Hu, Xiaoyao Xu, Eric Nyberg
Abstract In this paper, we present LiveMedQA, a question answering system that is optimized for consumer health questions. On top of the general QA system pipeline, we introduce several new features that aim to exploit domain-specific knowledge and entity structures for better performance. These include a question type/focus analyzer based on a deep text classification model, a tree-based knowledge graph for answer generation, and a complementary structure-aware searcher for answer retrieval. The LiveMedQA system is evaluated in the TREC 2017 LiveQA medical subtask, where it received an average score of 0.356 on a 3-point scale. Evaluation results revealed 3 substantial drawbacks in the current LiveMedQA system, based on which we provide a detailed discussion and propose a few solutions that constitute the main focus of our subsequent work.
Tasks Question Answering, Text Classification
Published 2017-11-15
URL http://arxiv.org/abs/1711.05789v1
PDF http://arxiv.org/pdf/1711.05789v1.pdf
PWC https://paperswithcode.com/paper/cmu-livemedqa-at-trec-2017-liveqa-a-consumer
Repo
Framework
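
A stand-in for the question type/focus analyzer: the paper uses a deep text classification model, while this TF-IDF plus logistic-regression pipeline only shows the interface (question text in, question type out); the labels here are invented examples, not the TREC LiveQA taxonomy.

```python
# Hedged sketch of a question-type classifier for consumer health questions (toy labels, toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

questions = [
    "what are the side effects of metformin",
    "how is type 2 diabetes treated",
    "what causes migraine headaches",
    "how do i treat a sprained ankle at home",
]
labels = ["drug_side_effects", "treatment", "cause", "treatment"]   # hypothetical question types

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(questions, labels)
print(clf.predict(["what are the side effects of ibuprofen"]))      # expected: drug_side_effects
```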

SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering

Title SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering
Authors Hsiou-Yuan Liu, Dehong Liu, Hassan Mansour, Petros T. Boufounos, Laura Waller, Ulugbek S. Kamilov
Abstract Multiple scattering of an electromagnetic wave as it passes through an object is a fundamental problem that limits the performance of current imaging systems. In this paper, we describe a new technique, called Series Expansion with Accelerated Gradient Descent on Lippmann-Schwinger Equation (SEAGLE), for robust imaging under multiple scattering based on a combination of a new nonlinear forward model and a total variation (TV) regularizer. The proposed forward model can account for multiple scattering, which makes it advantageous in applications where linear models are inaccurate. Specifically, it corresponds to a series expansion of the scattered wave with an accelerated-gradient method. This expansion guarantees convergence even for strongly scattering objects. One of our key insights is that it is possible to obtain an explicit formula for computing the gradient of our nonlinear forward model with respect to the unknown object, thus enabling fast image reconstruction with the state-of-the-art fast iterative shrinkage/thresholding algorithm (FISTA). The proposed method is validated on both simulated and experimentally measured data.
Tasks Image Reconstruction
Published 2017-05-05
URL http://arxiv.org/abs/1705.04281v1
PDF http://arxiv.org/pdf/1705.04281v1.pdf
PWC https://paperswithcode.com/paper/seagle-sparsity-driven-image-reconstruction
Repo
Framework
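
A generic FISTA loop as a sketch of the reconstruction machinery, with a soft-threshold prox standing in for the TV regularizer and a random linear operator standing in for SEAGLE's nonlinear multiple-scattering forward model; all problem sizes are assumptions.

```python
# FISTA sketch: gradient step on a data-fit term, prox step, Nesterov momentum (stand-in operators).
import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 60
A = rng.normal(size=(m, n)) / np.sqrt(m)             # stand-in forward model (the paper's is nonlinear)
x_true = np.zeros(n); x_true[rng.choice(n, 8, replace=False)] = 1.0
y = A @ x_true + 0.01 * rng.normal(size=m)           # simulated measurements

L = np.linalg.norm(A, 2) ** 2                        # Lipschitz constant of the gradient
lam, steps = 0.02, 200
x = np.zeros(n); z = np.zeros(n); t = 1.0
soft = lambda v, tau: np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)   # prox stand-in for TV

for _ in range(steps):
    grad = A.T @ (A @ z - y)                         # gradient of the data-fit term
    x_new = soft(z - grad / L, lam / L)              # proximal (shrinkage) step
    t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
    z = x_new + ((t - 1) / t_new) * (x_new - x)      # momentum extrapolation
    x, t = x_new, t_new

print(round(float(np.abs(x - x_true).max()), 3))     # reconstruction error of the sketch
```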

VGR-Net: A View Invariant Gait Recognition Network

Title VGR-Net: A View Invariant Gait Recognition Network
Authors Daksh Thapar, Divyansh Aggarwal, Punjal Agarwal, Aditya Nigam
Abstract Biometric identification systems have become immensely popular and important because of their high reliability and efficiency. However, person identification at a distance still remains a challenging problem. Gait can be seen as an essential biometric feature for human recognition and identification. It can be easily acquired from a distance and does not require any user cooperation, thus making it suitable for surveillance. But the task of recognizing an individual using gait can be adversely affected by varying viewpoints, making this task more and more challenging. Our proposed approach tackles this problem by identifying spatio-temporal features and performing extensive experimentation and training. In this paper, we propose a 3-D convolutional deep neural network for person identification using gait under multiple views. It is a 2-stage network, in which a classification network first identifies the viewing angle; another set of networks (one for each angle) is then trained to identify the person under that particular viewing angle. We have tested this network on the publicly available CASIA-B database and have achieved state-of-the-art results. The proposed system is much more efficient in terms of time and space and performs better for almost all angles.
Tasks Gait Recognition, Person Identification
Published 2017-10-13
URL http://arxiv.org/abs/1710.04803v1
PDF http://arxiv.org/pdf/1710.04803v1.pdf
PWC https://paperswithcode.com/paper/vgr-net-a-view-invariant-gait-recognition
Repo
Framework

Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach

Title Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach
Authors Bowen Cheng, Zhangyang Wang, Zhaobin Zhang, Zhu Li, Ding Liu, Jianchao Yang, Shuai Huang, Thomas S. Huang
Abstract Emotion recognition from facial expressions is tremendously useful, especially when coupled with smart devices and wireless multimedia applications. However, inadequate network bandwidth often limits the spatial resolution of the transmitted video, which heavily degrades recognition reliability. We develop a novel framework to achieve robust emotion recognition from low bit rate video. While video frames are downsampled at the encoder side, the decoder is embedded with a deep network model for joint super-resolution (SR) and recognition. Notably, we propose a novel max-mix training strategy, leading to a single “One-for-All” model that is remarkably robust to a vast range of downsampling factors. That makes our framework well adapted to the varied bandwidths of real transmission scenarios, without hampering scalability or efficiency. The proposed framework is evaluated on the AVEC 2016 benchmark and demonstrates significantly better stand-alone recognition performance, as well as rate-distortion (R-D) performance, than either directly recognizing from LR frames or separating SR and recognition.
Tasks Emotion Recognition, Super-Resolution
Published 2017-09-10
URL http://arxiv.org/abs/1709.03126v1
PDF http://arxiv.org/pdf/1709.03126v1.pdf
PWC https://paperswithcode.com/paper/robust-emotion-recognition-from-low-quality
Repo
Framework
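
A sketch of mixing downsampling factors at training time, which is the intuition behind the One-for-All model: every frame is degraded by a randomly chosen factor and resized back, so one model sees a vast range of input resolutions. The exact max-mix schedule in the paper may differ, and the factor set here is assumed.

```python
# Mixed-resolution training-data degradation sketch (assumed factors, nearest-neighbor resampling).
import numpy as np

FACTORS = (1, 2, 4, 8)

def degrade(frame, rng):
    """Nearest-neighbor downsample by a random factor, then upsample back to the original size."""
    f = int(rng.choice(FACTORS))
    low = frame[::f, ::f]                                   # downsample (encoder side)
    up = np.repeat(np.repeat(low, f, axis=0), f, axis=1)    # naive upsample back (decoder input)
    return up[:frame.shape[0], :frame.shape[1]], f

rng = np.random.default_rng(0)
frame = rng.random((96, 96))
batch = [degrade(frame, rng) for _ in range(4)]
print([f for _, f in batch])                                # mixed factors within one batch
```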

Scalable Annotation of Fine-Grained Categories Without Experts

Title Scalable Annotation of Fine-Grained Categories Without Experts
Authors Timnit Gebru, Jonathan Krause, Jia Deng, Li Fei-Fei
Abstract We present a crowdsourcing workflow to collect image annotations for visually similar synthetic categories without requiring experts. In animals, there is a direct link between taxonomy and visual similarity: e.g. a collie (type of dog) looks more similar to other collies (e.g. smooth collie) than to a greyhound (another type of dog). However, in synthetic categories such as cars, objects with similar taxonomy can have very different appearance: e.g. a 2011 Ford F-150 Supercrew-HD looks the same as a 2011 Ford F-150 Supercrew-LL but very different from a 2011 Ford F-150 Supercrew-SVT. We introduce a graph-based crowdsourcing algorithm to automatically group visually indistinguishable objects together. Using our workflow, we label 712,430 images with ~1,000 Amazon Mechanical Turk workers, resulting in the largest fine-grained visual dataset reported to date, with 2,657 categories of cars annotated at 1/20th the cost of hiring experts.
Tasks
Published 2017-09-07
URL http://arxiv.org/abs/1709.02482v1
PDF http://arxiv.org/pdf/1709.02482v1.pdf
PWC https://paperswithcode.com/paper/scalable-annotation-of-fine-grained
Repo
Framework
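
A sketch of the grouping step: classes that workers judge visually indistinguishable are linked, and connected components (found here with union-find) become merged annotation categories; the pair judgments are invented examples, not data from the paper.

```python
# Union-find grouping of visually indistinguishable classes (hypothetical worker judgments).
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

indistinguishable_pairs = [                                 # invented worker judgments
    ("F-150 Supercrew-HD", "F-150 Supercrew-LL"),
    ("F-150 Supercrew-LL", "F-150 Supercrew-XL"),
]
uf = UnionFind()
for a, b in indistinguishable_pairs:
    uf.union(a, b)

groups = {}
for name in {c for pair in indistinguishable_pairs for c in pair}:
    groups.setdefault(uf.find(name), []).append(name)
print(list(groups.values()))                                # one merged category of three trims
```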

Spherical Wards clustering and generalized Voronoi diagrams

Title Spherical Wards clustering and generalized Voronoi diagrams
Authors Marek Śmieja, Jacek Tabor
Abstract The Gaussian mixture model is very useful in many practical problems. Nevertheless, it cannot be directly generalized to non-Euclidean spaces. To overcome this problem we present a spherical Gaussian-based clustering approach for partitioning data sets with respect to an arbitrary dissimilarity measure. The proposed method is a combination of spherical Cross-Entropy Clustering with a generalized Ward's approach. The algorithm finds the optimal number of clusters by automatically removing groups which carry no information. Moreover, it is scale invariant and allows spherically shaped clusters of arbitrary sizes to be formed. In order to graphically represent and interpret the results, the notion of a Voronoi diagram is generalized to non-Euclidean spaces and applied to the introduced clustering method.
Tasks
Published 2017-05-04
URL http://arxiv.org/abs/1705.02232v1
PDF http://arxiv.org/pdf/1705.02232v1.pdf
PWC https://paperswithcode.com/paper/spherical-wards-clustering-and-generalized
Repo
Framework

Efficient Structure from Motion for Oblique UAV Images Based on Maximal Spanning Tree Expansions

Title Efficient Structure from Motion for Oblique UAV Images Based on Maximal Spanning Tree Expansions
Authors San Jiang, Wanshou Jiang
Abstract The primary contribution of this paper is an efficient Structure from Motion (SfM) solution for oblique unmanned aerial vehicle (UAV) images. First, an algorithm, considering spatial relationship constraints between image footprints, is designed for match pair selection with the assistance of UAV flight control data and oblique camera mounting angles. Second, a topological connection network (TCN), represented by an undirected weighted graph, is constructed from initial match pairs, which encodes overlap area and intersection angle into edge weights. Then, an algorithm, termed MST-Expansion, is proposed to extract the match graph from the TCN, where the TCN is first simplified by a maximum spanning tree (MST). By further analysis of the local structure in the MST, expansion operations are performed on the nodes of the MST for match graph enhancement, which is achieved by introducing critical connections in two expansion directions. Finally, guided by the match graph, an efficient SfM solution is proposed, and its validity is verified through comprehensive analysis and comparison using three UAV datasets captured with different oblique multi-camera systems. Experimental results demonstrate that the efficiency of image matching is improved with a speedup ratio ranging from 19 to 35, and competitive orientation accuracy is achieved from both relative bundle adjustment (BA) without GCPs (Ground Control Points) and absolute BA with GCPs. At the same time, images in the three datasets are successfully oriented. For orientation of oblique UAV images, the proposed method can be a more efficient solution.
Tasks
Published 2017-05-09
URL http://arxiv.org/abs/1705.03212v1
PDF http://arxiv.org/pdf/1705.03212v1.pdf
PWC https://paperswithcode.com/paper/efficient-structure-from-motion-for-oblique
Repo
Framework
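
A sketch of the TCN simplification step using networkx: candidate match pairs form a weighted graph and a maximum spanning tree keeps the strongest connections. The edge weights here are invented, whereas the paper derives them from overlap area and intersection angle and then expands the MST with additional critical connections.

```python
# Maximum-spanning-tree simplification of a topological connection network (invented weights).
import networkx as nx

tcn = nx.Graph()
edges = [                                 # (image_a, image_b, weight from overlap/angle; values assumed)
    ("img1", "img2", 0.9), ("img1", "img3", 0.4), ("img2", "img3", 0.7),
    ("img2", "img4", 0.8), ("img3", "img4", 0.5), ("img3", "img5", 0.6),
    ("img4", "img5", 0.3),
]
tcn.add_weighted_edges_from(edges)

mst = nx.maximum_spanning_tree(tcn)        # match graph backbone: the n-1 strongest pairs
print(sorted(mst.edges(data="weight")))    # image pairs that will actually be matched
```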