October 19, 2019

Paper Group ANR 385

Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation. Convex Relaxations of Convolutional Neural Nets. Pixel-Level Alignment of Facial Images for High Accuracy Recognition Using Ensemble of Patches. Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture. Inventory Balancing with Online Learning. Exploring the …

Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation


Title	Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation
Authors	Xiu Li, Yebin Liu, Hanbyul Joo, Qionghai Dai, Yaser Sheikh
Abstract	We present a method to combine markerless motion capture and dense pose feature estimation into a single framework. We demonstrate that dense pose information can help for multiview/single-view motion capture, and multiview motion capture can help the collection of a high-quality dataset for training the dense pose detector. Specifically, we first introduce a novel markerless motion capture method that can take advantage of dense parsing capability provided by the dense pose detector. Thanks to the introduced dense human parsing ability, our method is demonstrated to be much more efficient and accurate than the available state-of-the-art markerless motion capture approach. Second, we improve the performance of the available dense pose detector by using multiview markerless motion capture data. Such a dataset is beneficial to dense pose training because it is denser, more accurate, and more consistent, and can compensate for corner cases such as unusual viewpoints. We quantitatively demonstrate the improved performance of our dense pose detector over the available DensePose. Our dense pose dataset and detector will be made public.
Tasks	Human Parsing, Markerless Motion Capture, Motion Capture, Pose Estimation
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01783v2
PDF	http://arxiv.org/pdf/1812.01783v2.pdf
PWC	https://paperswithcode.com/paper/capture-dense-markerless-motion-capture-meets
Repo
Framework

Convex Relaxations of Convolutional Neural Nets


Title	Convex Relaxations of Convolutional Neural Nets
Authors	Burak Bartan, Mert Pilanci
Abstract	We propose convex relaxations for convolutional neural nets with one hidden layer where the output weights are fixed. For convex activation functions such as rectified linear units, the relaxations are convex second order cone programs which can be solved very efficiently. We prove that the relaxation recovers the global minimum under a planted model assumption, given sufficiently many training samples from a Gaussian distribution. We also identify a phase transition phenomenon in recovering the global minimum for the relaxation.
Tasks
Published	2018-12-31
URL	http://arxiv.org/abs/1901.00035v1
PDF	http://arxiv.org/pdf/1901.00035v1.pdf
PWC	https://paperswithcode.com/paper/convex-relaxations-of-convolutional-neural
Repo
Framework

Pixel-Level Alignment of Facial Images for High Accuracy Recognition Using Ensemble of Patches


Title	Pixel-Level Alignment of Facial Images for High Accuracy Recognition Using Ensemble of Patches
Authors	Hoda Mohammadzade, Amirhossein Sayyafan, Benyamin Ghojogh
Abstract	The variation of pose, illumination and expression still makes face recognition a challenging problem. As a pre-processing step in holistic approaches, faces are usually aligned by eyes. The proposed method tries to perform a pixel alignment rather than eye-alignment by mapping the geometry of faces to a reference face while keeping their own textures. The proposed geometry alignment not only creates a meaningful correspondence among every pixel of all faces, but also removes expression and pose variations effectively. The geometry alignment is performed pixel-wise, i.e., every pixel of the face is mapped to a pixel of the reference face. In the proposed method, the intensity and geometry information of faces are separated properly, trained by separate classifiers, and finally fused together to recognize human faces. Experimental results show a great improvement using the proposed method in comparison to eye-aligned recognition. For instance, at the false acceptance rate of 0.001, the recognition rates are respectively improved by 24% and 33% in Yale and AT&T datasets. In the LFW dataset, which is a challenging big dataset, the improvement is 20% at FAR of 0.1.
Tasks	Face Recognition
Published	2018-02-07
URL	http://arxiv.org/abs/1802.02438v1
PDF	http://arxiv.org/pdf/1802.02438v1.pdf
PWC	https://paperswithcode.com/paper/pixel-level-alignment-of-facial-images-for
Repo
Framework

Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture


Title	Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture
Authors	Micha Livne, Leonid Sigal, Marcus A. Brubaker, David J. Fleet
Abstract	We propose a generative approach to physics-based motion capture. Unlike prior attempts to incorporate physics into tracking that assume the subject and scene geometry are calibrated and known a priori, our approach is automatic and online. This distinction is important since calibration of the environment is often difficult, especially for motions with props, uneven surfaces, or outdoor scenes. The use of physics in this context provides a natural framework to reason about contact and the plausibility of recovered motions. We propose a fast data-driven parametric body model, based on linear-blend skinning, which decouples deformations due to pose, anthropometrics and body shape. Pose (and shape) parameters are estimated using robust ICP optimization with physics-based dynamic priors that incorporate contact. Contact is estimated from torque trajectories and predictions of which contact points were active. To our knowledge, this is the first approach to take physics into account without explicit a priori knowledge of the environment or body dimensions. We demonstrate effective tracking from a noisy single depth camera, improving on state-of-the-art results quantitatively and producing better qualitative results, reducing visual artifacts like foot-skate and jitter.
Tasks	Calibration, Markerless Motion Capture, Motion Capture
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01203v1
PDF	http://arxiv.org/pdf/1812.01203v1.pdf
PWC	https://paperswithcode.com/paper/walking-on-thin-air-environment-free-physics
Repo
Framework

Inventory Balancing with Online Learning


Title	Inventory Balancing with Online Learning
Authors	Wang Chi Cheung, Will Ma, David Simchi-Levi, Xinshang Wang
Abstract	We study a general problem of allocating limited resources to heterogeneous customers over time under model uncertainty. Each type of customer can be serviced using different actions, each of which stochastically consumes some combination of resources, and returns different rewards for the resources consumed. We consider a general model where the resource consumption distribution associated with each (customer type, action)-combination is not known, but is consistent and can be learned over time. In addition, the sequence of customer types to arrive over time is arbitrary and completely unknown. We overcome both the challenges of model uncertainty and customer heterogeneity by judiciously synthesizing two algorithmic frameworks from the literature: inventory balancing, which “reserves” a portion of each resource for high-reward customer types which could later arrive, and online learning, which shows how to “explore” the resource consumption distributions of each customer type under different actions. We define an auxiliary problem, which allows for existing competitive ratio and regret bounds to be seamlessly integrated. Furthermore, we show that the performance guarantee generated by our framework is tight, that is, we provide an information-theoretic lower bound which shows that both the loss from competitive ratio and the loss for regret are relevant in the combined problem. Finally, we demonstrate the efficacy of our algorithms on a publicly available hotel data set. Our framework is highly practical in that it requires no historical data (no fitted customer choice models, nor forecasting of customer arrival patterns) and can be used to initialize allocation strategies in fast-changing environments.
Tasks
Published	2018-10-11
URL	http://arxiv.org/abs/1810.05640v1
PDF	http://arxiv.org/pdf/1810.05640v1.pdf
PWC	https://paperswithcode.com/paper/inventory-balancing-with-online-learning
Repo
Framework
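
The "reserve inventory for later high-reward customers" idea in the abstract above can be illustrated with a small sketch. Everything here is an assumption for illustration: the abstract does not give a formula, so the exponential penalty below is the form commonly used in the inventory-balancing literature, each action is simplified to consume exactly one unit of one resource, and the online-learning part of the paper (estimating unknown consumption distributions) is left out entirely. The names `penalty` and `choose_action` are hypothetical.

```python
import math


def penalty(fraction_remaining):
    """Exponential penalty: 1 when a resource is full, 0 when it is exhausted."""
    return (math.e / (math.e - 1.0)) * (1.0 - math.exp(-fraction_remaining))


def choose_action(actions, remaining, capacity):
    """Pick the action with the highest penalty-discounted reward.

    actions:   dict mapping action name -> (reward, resource name).
               Simplifying assumption: each action uses one unit of one resource.
    remaining: dict mapping resource name -> units left.
    capacity:  dict mapping resource name -> initial units.
    Returns the chosen action name, or None if every resource is exhausted.
    """
    best, best_score = None, 0.0
    for name, (reward, resource) in actions.items():
        if remaining[resource] <= 0:
            continue  # nothing left to allocate for this action
        score = reward * penalty(remaining[resource] / capacity[resource])
        if score > best_score:
            best, best_score = name, score
    return best
```

With two actions of similar reward, this rule prefers the one whose resource is less depleted, which is how a scarce resource ends up held back for customers who may arrive later.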

Exploring the Naturalness of Buggy Code with Recurrent Neural Networks


Title	Exploring the Naturalness of Buggy Code with Recurrent Neural Networks
Authors	Jack Lanchantin, Ji Gao
Abstract	Statistical language models are powerful tools which have been used for many tasks within natural language processing. Recently, they have been used for other sequential data such as source code. Ray et al. (2015) showed that it is possible to train an n-gram source code language model, and use it to predict buggy lines in code by determining “unnatural” lines via entropy with respect to the language model. In this work, we propose using a more advanced language modeling technique, Long Short-term Memory recurrent neural networks, to model source code and classify buggy lines based on entropy. We show that our method slightly outperforms an n-gram model in the buggy line classification task using AUC.
Tasks	Language Modelling
Published	2018-03-21
URL	http://arxiv.org/abs/1803.08793v1
PDF	http://arxiv.org/pdf/1803.08793v1.pdf
PWC	https://paperswithcode.com/paper/exploring-the-naturalness-of-buggy-code-with
Repo
Framework
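
The entropy-based notion of an "unnatural" line in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' method: it uses a bigram model with add-one smoothing as a stand-in for both the n-gram baseline and the LSTM, tokenizes on whitespace, and the function names `train_bigram` and `line_entropy` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(lines):
    """Count bigram contexts and bigrams over whitespace-tokenized lines."""
    contexts, bigrams = Counter(), Counter()
    vocab = set()
    for line in lines:
        tokens = ["<s>"] + line.split()  # "<s>" marks the start of a line
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    # One extra vocabulary slot reserves probability mass for unseen tokens.
    return contexts, bigrams, len(vocab) + 1


def line_entropy(line, model):
    """Average negative log2-probability per token under the bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + line.split()
    if len(tokens) < 2:
        return 0.0  # an empty line carries no tokens to score
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams at a small nonzero probability.
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total -= math.log2(prob)
    return total / (len(tokens) - 1)
```

Lines can then be ranked by `line_entropy`, with the highest-entropy lines flagged as the most likely to be buggy; the ranking is what an AUC evaluation would score.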

Deep Ordinal Hashing with Spatial Attention


Title	Deep Ordinal Hashing with Spatial Attention
Authors	Lu Jin, Xiangbo Shu, Kai Li, Zechao Li, Guo-Jun Qi, Jinhui Tang
Abstract	Hashing has attracted increasing research attention in recent years due to its high efficiency of computation and storage in image retrieval. Recent works have demonstrated the superiority of simultaneous feature representations and hash functions learning with deep neural networks. However, most existing deep hashing methods directly learn the hash functions by encoding the global semantic information, while ignoring the local spatial information of images. The loss of local spatial structure creates a performance bottleneck for hash functions, limiting their application to accurate similarity retrieval. In this work, we propose a novel Deep Ordinal Hashing (DOH) method, which learns ordinal representations by leveraging the ranking structure of feature space from both local and global views. In particular, to effectively build the ranking structure, we propose to learn the rank correlation space by exploiting the local spatial information from Fully Convolutional Network (FCN) and the global semantic information from the Convolutional Neural Network (CNN) simultaneously. More specifically, an effective spatial attention model is designed to capture the local spatial information by selectively learning well-specified locations closely related to target objects. In such a hashing framework, the local spatial and global semantic nature of images are captured in an end-to-end ranking-to-hashing manner. Experimental results conducted on three widely-used datasets demonstrate that the proposed DOH method significantly outperforms the state-of-the-art hashing methods.
Tasks	Image Retrieval
Published	2018-05-07
URL	http://arxiv.org/abs/1805.02459v1
PDF	http://arxiv.org/pdf/1805.02459v1.pdf
PWC	https://paperswithcode.com/paper/deep-ordinal-hashing-with-spatial-attention
Repo
Framework

CyLKs: Unsupervised Cycle Lucas-Kanade Network for Landmark Tracking


Title	CyLKs: Unsupervised Cycle Lucas-Kanade Network for Landmark Tracking
Authors	Xinshuo Weng, Wentao Han
Abstract	Across a majority of modern learning-based tracking systems, expensive annotations are needed to achieve state-of-the-art performance. In contrast, the Lucas-Kanade (LK) algorithm works well without any annotation. However, LK makes a strong assumption of photometric (brightness) consistency on image intensity and drifts easily under large motion, occlusion, and the aperture problem. To relax the assumption and alleviate the drift problem, we propose CyLKs, a data-driven way of training Lucas-Kanade in an unsupervised manner. CyLKs learns a feature transformation through CNNs, transforming the input images to a feature space which is especially favorable to LK tracking. During training, we perform differentiable Lucas-Kanade forward and backward on the convolutional feature maps, and then minimize the re-projection error. During testing, we perform the LK tracking on the learned features. We apply our model to the task of landmark tracking and perform experiments on datasets of THUMOS and 300VW.
Tasks	Landmark Tracking
Published	2018-11-28
URL	https://arxiv.org/abs/1811.11325v4
PDF	https://arxiv.org/pdf/1811.11325v4.pdf
PWC	https://paperswithcode.com/paper/cylks-unsupervised-cycle-lucas-kanade-network
Repo
Framework
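
The forward-backward idea in the abstract above can be shown on a toy case. This sketch makes strong simplifying assumptions that are not from the paper: it works on raw 1-D signals instead of learned convolutional feature maps, models motion as a pure translation, and takes a single Gauss-Newton step per direction. The names `gradient`, `lk_step`, `cycle_error` and `bump` are hypothetical.

```python
import math


def gradient(signal):
    """Central differences in the interior, one-sided at the two ends.

    Assumes the signal has at least two samples.
    """
    n = len(signal)
    grad = [0.0] * n
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grad[i] = (signal[hi] - signal[lo]) / (hi - lo)
    return grad


def lk_step(template, image):
    """One Lucas-Kanade (Gauss-Newton) step for a 1-D translation.

    Returns d such that image(x + d) approximately equals template(x),
    using the linearization image(x + d) ~ image(x) + d * image'(x).
    """
    grad = gradient(image)
    num = sum(g * (t - v) for g, t, v in zip(grad, template, image))
    den = sum(g * g for g in grad)
    return num / den if den > 0 else 0.0


def cycle_error(first, second):
    """Track first -> second, then second -> first; a consistent track sums to ~0."""
    forward = lk_step(first, second)
    backward = lk_step(second, first)
    return abs(forward + backward)


def bump(center, n=100, sigma=10.0):
    """Smooth Gaussian bump used as a synthetic test signal."""
    return [math.exp(-((x - center) ** 2) / (2 * sigma ** 2)) for x in range(n)]
```

Calling `cycle_error(bump(50.0), bump(50.5))` measures how far the forward and backward displacement estimates are from cancelling. In a trainable setting, a quantity of this kind is what would be minimized, since it needs no landmark annotations.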

Convexification of Neural Graph


Title	Convexification of Neural Graph
Authors	Han Xiao
Abstract	Traditionally, most complex intelligence architectures are extremely non-convex, which could not be well performed by convex optimization. However, this paper decomposes complex structures into three types of nodes: operators, algorithms and functions. Iteratively, propagating from node to node along edge, we prove that “regarding the tree-structured neural graph, it is nearly convex in each variable, when the other variables are fixed.” In fact, the non-convex properties stem from circles and functions, which could be transformed to be convex with our proposed scale mechanism. Experimentally, we justify our theoretical analysis by two practical applications.
Tasks
Published	2018-01-09
URL	http://arxiv.org/abs/1801.02901v2
PDF	http://arxiv.org/pdf/1801.02901v2.pdf
PWC	https://paperswithcode.com/paper/convexification-of-neural-graph
Repo
Framework

Attention Boosted Sequential Inference Model


Title	Attention Boosted Sequential Inference Model
Authors	Guanyu Li, Pengfei Zhang, Caiyan Jia
Abstract	Attention mechanisms have been proven effective in natural language processing. This paper proposes an attention boosted natural language inference model named aESIM by adding word attention and adaptive direction-oriented attention mechanisms to the traditional Bi-LSTM layer of natural language inference models, e.g. ESIM. This gives aESIM the ability to effectively learn the representation of words and model the local subsentential inference between pairs of premise and hypothesis. Empirical studies on the SNLI, MultiNLI and Quora benchmarks show that aESIM is superior to the original ESIM model.
Tasks	Natural Language Inference
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01840v2
PDF	http://arxiv.org/pdf/1812.01840v2.pdf
PWC	https://paperswithcode.com/paper/attention-boosted-sequential-inference-model
Repo
Framework

Generating Shared Latent Variables for Robots to Imitate Human Movements and Understand their Physical Limitations


Title	Generating Shared Latent Variables for Robots to Imitate Human Movements and Understand their Physical Limitations
Authors	Maxime Devanne, Sao Mai Nguyen
Abstract	Assistive robotics and particularly robot coaches may be very helpful for rehabilitation healthcare. In this context, we propose a method based on Gaussian Process Latent Variable Model (GP-LVM) to transfer knowledge between a physiotherapist, a robot coach and a patient. Our model is able to map visual human body features to robot data in order to facilitate the robot learning and imitation. In addition, we propose to extend the model to adapt robots’ understanding to patients’ physical limitations during the assessment of rehabilitation exercises. Experimental evaluation demonstrates promising results for both robot imitation and model adaptation according to the patients’ limitations.
Tasks
Published	2018-10-11
URL	http://arxiv.org/abs/1810.04879v2
PDF	http://arxiv.org/pdf/1810.04879v2.pdf
PWC	https://paperswithcode.com/paper/generating-shared-latent-variables-for-robots
Repo
Framework

The global optimum of shallow neural network is attained by ridgelet transform


Title	The global optimum of shallow neural network is attained by ridgelet transform
Authors	Sho Sonoda, Isao Ishikawa, Masahiro Ikeda, Kei Hagihara, Yoshihiro Sawano, Takuo Matsubara, Noboru Murata
Abstract	We prove that the global minimum of the backpropagation (BP) training problem of neural networks with an arbitrary nonlinear activation is given by the ridgelet transform. A series of computational experiments show that there exists an interesting similarity between the scatter plot of hidden parameters in a shallow neural network after the BP training and the spectrum of the ridgelet transform. By introducing a continuous model of neural networks, we reduce the training problem to a convex optimization in an infinite dimensional Hilbert space, and obtain the explicit expression of the global optimizer via the ridgelet transform.
Tasks
Published	2018-05-19
URL	http://arxiv.org/abs/1805.07517v3
PDF	http://arxiv.org/pdf/1805.07517v3.pdf
PWC	https://paperswithcode.com/paper/the-global-optimum-of-shallow-neural-network
Repo
Framework

A Deep Learning Framework for Single-Sided Sound Speed Inversion in Medical Ultrasound


Title	A Deep Learning Framework for Single-Sided Sound Speed Inversion in Medical Ultrasound
Authors	Micha Feigin, Daniel Freedman, Brian W. Anthony
Abstract	Objective: Ultrasound elastography is gaining traction as an accessible and useful diagnostic tool for such things as cancer detection and differentiation and thyroid disease diagnostics. Unfortunately, state of the art shear wave imaging techniques, essential to promote this goal, are limited to high-end ultrasound hardware due to high power requirements, are extremely sensitive to patient and sonographer motion, and generally suffer from low frame rates. Motivated by research and theory showing that longitudinal wave sound speed carries similar diagnostic abilities to shear wave imaging, we present an alternative approach using single sided pressure-wave sound speed measurements from channel data. Methods: In this paper, we present a single-sided sound speed inversion solution using a fully convolutional deep neural network. We use simulations for training, allowing the generation of limitless ground truth data. Results: We show that it is possible to invert for longitudinal sound speed in soft tissue at high frame rates. We validate the method on simulated data. We present highly encouraging results on limited real data. Conclusion: Sound speed inversion on channel data has significant potential, made possible in real time with deep learning technologies. Significance: Specialized shear wave ultrasound systems remain inaccessible in many locations. Longitudinal sound speed and deep learning technologies enable an alternative approach to diagnosis based on tissue elasticity. High frame rates are possible.
Tasks
Published	2018-09-30
URL	https://arxiv.org/abs/1810.00322v4
PDF	https://arxiv.org/pdf/1810.00322v4.pdf
PWC	https://paperswithcode.com/paper/a-deep-learning-framework-for-single-sided
Repo
Framework

Obstacle Detection Quality as a Problem-Oriented Approach to Stereo Vision Algorithms Estimation in Road Situation Analysis


Title	Obstacle Detection Quality as a Problem-Oriented Approach to Stereo Vision Algorithms Estimation in Road Situation Analysis
Authors	A. A. Smagina, D. A. Shepelev, E. I. Ershov, A. S. Grigoryev
Abstract	In this work we present a method for performance evaluation of stereo vision based obstacle detection techniques that takes into account the specifics of road situation analysis to minimize the effort required to prepare a test dataset. This approach has been designed to be implemented in systems such as self-driving cars or driver assistance and can also be used as problem-oriented quality criterion for evaluation of stereo vision algorithms.
Tasks	Self-Driving Cars
Published	2018-09-06
URL	http://arxiv.org/abs/1809.02228v1
PDF	http://arxiv.org/pdf/1809.02228v1.pdf
PWC	https://paperswithcode.com/paper/obstacle-detection-quality-as-a-problem
Repo
Framework

Incorporating Privileged Information to Unsupervised Anomaly Detection


Title	Incorporating Privileged Information to Unsupervised Anomaly Detection
Authors	Shubhranshu Shekhar, Leman Akoglu
Abstract	We introduce a new unsupervised anomaly detection ensemble called SPI which can harness privileged information - data available only for training examples but not for (future) test examples. Our ideas build on the Learning Using Privileged Information (LUPI) paradigm pioneered by Vapnik et al. [19,17], which we extend to unsupervised learning and in particular to anomaly detection. SPI (for Spotting anomalies with Privileged Information) constructs a number of frames/fragments of knowledge (i.e., density estimates) in the privileged space and transfers them to the anomaly scoring space through “imitation” functions that use only the partial information available for test examples. Our generalization of the LUPI paradigm to unsupervised anomaly detection shepherds the field in several key directions, including (i) domain knowledge-augmented detection using expert annotations as PI, (ii) fast detection using computationally-demanding data as PI, and (iii) early detection using “historical future” data as PI. Through extensive experiments on simulated and real datasets, we show that augmenting privileged information to anomaly detection significantly improves detection performance. We also demonstrate the promise of SPI under all three settings (i-iii); with PI capturing expert knowledge, computationally expensive features, and future data on three real world detection tasks.
Tasks	Anomaly Detection, Unsupervised Anomaly Detection
Published	2018-05-06
URL	http://arxiv.org/abs/1805.02269v2
PDF	http://arxiv.org/pdf/1805.02269v2.pdf
PWC	https://paperswithcode.com/paper/incorporating-privileged-information-to
Repo
Framework