October 19, 2019

Paper Group ANR 385

Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation. Convex Relaxations of Convolutional Neural Nets. Pixel-Level Alignment of Facial Images for High Accuracy Recognition Using Ensemble of Patches. Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture. Inventory Balancing with Online Learning. Exploring the …

Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation

Title Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation
Authors Xiu Li, Yebin Liu, Hanbyul Joo, Qionghai Dai, Yaser Sheikh
Abstract We present a method to combine markerless motion capture and dense pose feature estimation into a single framework. We demonstrate that dense pose information can help multiview/single-view motion capture, and that multiview motion capture can help collect a high-quality dataset for training the dense pose detector. Specifically, we first introduce a novel markerless motion capture method that takes advantage of the dense parsing capability provided by the dense pose detector. Thanks to this dense human parsing ability, our method is shown to be much more efficient and accurate than the available state-of-the-art markerless motion capture approach. Second, we improve the performance of the available dense pose detector by using multiview markerless motion capture data. Such a dataset is beneficial to dense pose training because it is denser, more accurate, and more consistent, and can compensate for corner cases such as unusual viewpoints. We quantitatively demonstrate the improved performance of our dense pose detector over the available DensePose. Our dense pose dataset and detector will be made public.
Tasks Human Parsing, Markerless Motion Capture, Motion Capture, Pose Estimation
Published 2018-12-05
URL http://arxiv.org/abs/1812.01783v2
PDF http://arxiv.org/pdf/1812.01783v2.pdf
PWC https://paperswithcode.com/paper/capture-dense-markerless-motion-capture-meets
Repo
Framework

Convex Relaxations of Convolutional Neural Nets

Title Convex Relaxations of Convolutional Neural Nets
Authors Burak Bartan, Mert Pilanci
Abstract We propose convex relaxations for convolutional neural nets with one hidden layer where the output weights are fixed. For convex activation functions such as rectified linear units, the relaxations are convex second order cone programs which can be solved very efficiently. We prove that the relaxation recovers the global minimum under a planted model assumption, given sufficiently many training samples from a Gaussian distribution. We also identify a phase transition phenomenon in recovering the global minimum for the relaxation.
Tasks
Published 2018-12-31
URL http://arxiv.org/abs/1901.00035v1
PDF http://arxiv.org/pdf/1901.00035v1.pdf
PWC https://paperswithcode.com/paper/convex-relaxations-of-convolutional-neural
Repo
Framework

Pixel-Level Alignment of Facial Images for High Accuracy Recognition Using Ensemble of Patches

Title Pixel-Level Alignment of Facial Images for High Accuracy Recognition Using Ensemble of Patches
Authors Hoda Mohammadzade, Amirhossein Sayyafan, Benyamin Ghojogh
Abstract Variations in pose, illumination and expression keep face recognition a challenging problem. As a pre-processing step in holistic approaches, faces are usually aligned by the eyes. The proposed method performs a pixel-level alignment rather than eye alignment by mapping the geometry of each face to a reference face while keeping its own texture. The proposed geometry alignment not only creates a meaningful correspondence among the pixels of all faces, but also removes expression and pose variations effectively. The geometry alignment is performed pixel-wise, i.e., every pixel of the face is mapped to a pixel of the reference face. In the proposed method, the intensity and geometry information of faces are separated, used to train separate classifiers, and finally fused to recognize human faces. Experimental results show a great improvement over eye-aligned recognition. For instance, at a false acceptance rate (FAR) of 0.001, the recognition rates improve by 24% and 33% on the Yale and AT&T datasets, respectively. On the LFW dataset, which is a large and challenging dataset, the improvement is 20% at a FAR of 0.1.
Tasks Face Recognition
Published 2018-02-07
URL http://arxiv.org/abs/1802.02438v1
PDF http://arxiv.org/pdf/1802.02438v1.pdf
PWC https://paperswithcode.com/paper/pixel-level-alignment-of-facial-images-for
Repo
Framework
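
The abstract above reports recognition rates at fixed false acceptance rates. As a rough illustration of what such a figure means, the sketch below computes the fraction of genuine comparisons accepted at the strictest score threshold that keeps the FAR at or below a target. The function name `rate_at_far`, the similarity-score convention (higher means more similar), and the toy scores are assumptions for illustration; the paper's exact evaluation protocol may differ.

```python
import math


def rate_at_far(genuine, impostor, target_far):
    """Fraction of genuine scores accepted at a threshold with FAR <= target_far.

    Assumes higher scores mean "more likely the same person" (an assumption,
    not taken from the paper).
    """
    ranked = sorted(impostor, reverse=True)
    # Number of impostor comparisons we are allowed to accept by mistake.
    allowed = math.floor(target_far * len(ranked) + 1e-9)
    # Accept only scores strictly above the (allowed + 1)-th highest impostor
    # score, so at most `allowed` impostors pass even when scores are tied.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)


# Toy scores, invented for illustration only.
genuine_scores = [0.9, 0.8, 0.55, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.6, 0.05, 0.15, 0.25, 0.35, 0.45]
print(rate_at_far(genuine_scores, impostor_scores, target_far=0.1))
```

Loosening the target FAR lowers the threshold, so the accepted fraction of genuine pairs can only stay the same or grow.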

Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture

Title Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture
Authors Micha Livne, Leonid Sigal, Marcus A. Brubaker, David J. Fleet
Abstract We propose a generative approach to physics-based motion capture. Unlike prior attempts to incorporate physics into tracking that assume the subject and scene geometry are calibrated and known a priori, our approach is automatic and online. This distinction is important since calibration of the environment is often difficult, especially for motions with props, uneven surfaces, or outdoor scenes. The use of physics in this context provides a natural framework to reason about contact and the plausibility of recovered motions. We propose a fast data-driven parametric body model, based on linear-blend skinning, which decouples deformations due to pose, anthropometrics and body shape. Pose (and shape) parameters are estimated using robust ICP optimization with physics-based dynamic priors that incorporate contact. Contact is estimated from torque trajectories and predictions of which contact points were active. To our knowledge, this is the first approach to take physics into account without explicit a priori knowledge of the environment or body dimensions. We demonstrate effective tracking from a noisy single depth camera, improving on state-of-the-art results quantitatively and producing better qualitative results, reducing visual artifacts like foot-skate and jitter.
Tasks Calibration, Markerless Motion Capture, Motion Capture
Published 2018-12-04
URL http://arxiv.org/abs/1812.01203v1
PDF http://arxiv.org/pdf/1812.01203v1.pdf
PWC https://paperswithcode.com/paper/walking-on-thin-air-environment-free-physics
Repo
Framework

Inventory Balancing with Online Learning

Title Inventory Balancing with Online Learning
Authors Wang Chi Cheung, Will Ma, David Simchi-Levi, Xinshang Wang
Abstract We study a general problem of allocating limited resources to heterogeneous customers over time under model uncertainty. Each type of customer can be serviced using different actions, each of which stochastically consumes some combination of resources, and returns different rewards for the resources consumed. We consider a general model where the resource consumption distribution associated with each (customer type, action)-combination is not known, but is consistent and can be learned over time. In addition, the sequence of customer types to arrive over time is arbitrary and completely unknown. We overcome both the challenges of model uncertainty and customer heterogeneity by judiciously synthesizing two algorithmic frameworks from the literature: inventory balancing, which “reserves” a portion of each resource for high-reward customer types which could later arrive, and online learning, which shows how to “explore” the resource consumption distributions of each customer type under different actions. We define an auxiliary problem, which allows for existing competitive ratio and regret bounds to be seamlessly integrated. Furthermore, we show that the performance guarantee generated by our framework is tight, that is, we provide an information-theoretic lower bound which shows that both the loss from competitive ratio and the loss for regret are relevant in the combined problem. Finally, we demonstrate the efficacy of our algorithms on a publicly available hotel data set. Our framework is highly practical in that it requires no historical data (no fitted customer choice models, nor forecasting of customer arrival patterns) and can be used to initialize allocation strategies in fast-changing environments.
Tasks
Published 2018-10-11
URL http://arxiv.org/abs/1810.05640v1
PDF http://arxiv.org/pdf/1810.05640v1.pdf
PWC https://paperswithcode.com/paper/inventory-balancing-with-online-learning
Repo
Framework
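
The "reserving" idea behind inventory balancing, as described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it omits the online-learning component entirely, assumes each action consumes one unit of a single resource, and uses an exponential penalty function of the kind found in the inventory-balancing literature. The names `penalty` and `choose_resource` and the hotel-style data are hypothetical.

```python
import math


def penalty(fraction_remaining):
    """Exponential penalty: 0 when a resource is empty, 1 when it is full.

    The specific form e/(e-1) * (1 - exp(-x)) is an assumption for this sketch.
    """
    return (math.e / (math.e - 1.0)) * (1.0 - math.exp(-fraction_remaining))


def choose_resource(rewards, remaining, capacity):
    """Pick the resource with the highest penalty-discounted reward.

    Returns None when every resource is exhausted.
    """
    best, best_score = None, 0.0
    for name, reward in rewards.items():
        score = reward * penalty(remaining[name] / capacity[name])
        if score > best_score:
            best, best_score = name, score
    return best


# Hypothetical data: two room types with different rewards.
rewards = {"suite": 10.0, "standard": 8.0}
capacity = {"suite": 10, "standard": 10}
print(choose_resource(rewards, {"suite": 10, "standard": 10}, capacity))
print(choose_resource(rewards, {"suite": 1, "standard": 10}, capacity))
```

With full inventory the higher-reward resource wins; once it is nearly depleted its discounted reward drops, which holds the last units back for later arrivals.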

Exploring the Naturalness of Buggy Code with Recurrent Neural Networks

Title Exploring the Naturalness of Buggy Code with Recurrent Neural Networks
Authors Jack Lanchantin, Ji Gao
Abstract Statistical language models are powerful tools which have been used for many tasks within natural language processing. Recently, they have been used for other sequential data such as source code. Ray et al. (2015) showed that it is possible to train an n-gram source code language model and use it to predict buggy lines in code by determining “unnatural” lines via entropy with respect to the language model. In this work, we propose using a more advanced language modeling technique, Long Short-term Memory recurrent neural networks, to model source code and classify buggy lines based on entropy. We show that our method slightly outperforms an n-gram model in the buggy line classification task using AUC.
Tasks Language Modelling
Published 2018-03-21
URL http://arxiv.org/abs/1803.08793v1
PDF http://arxiv.org/pdf/1803.08793v1.pdf
PWC https://paperswithcode.com/paper/exploring-the-naturalness-of-buggy-code-with
Repo
Framework
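
To make the entropy-based notion of "unnatural" lines in the abstract above concrete, here is a minimal sketch of the n-gram baseline idea, not the paper's LSTM model. It trains a bigram model with add-one smoothing on whitespace-tokenized lines and scores a line by its average per-token cross-entropy in bits. The tokenization, the smoothing choice, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(lines):
    """Count context tokens and bigrams over whitespace-tokenized lines."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for line in lines:
        tokens = ["<s>"] + line.split()
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, vocab


def line_entropy(line, model):
    """Average negative log2 probability per token under the bigram model."""
    contexts, bigrams, vocab = model
    tokens = ["<s>"] + line.split()
    if len(tokens) < 2:
        return 0.0
    size = len(vocab) + 1  # one extra slot for unseen tokens
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams at a small non-zero probability.
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + size)
        total -= math.log2(prob)
    return total / (len(tokens) - 1)


corpus = ["for i in range ( n ) :", "x = x + 1", "return x"] * 5
model = train_bigram(corpus)
for candidate in ["for i in range ( n ) :", "for i range in ) n ( :"]:
    print(round(line_entropy(candidate, model), 3), candidate)
```

Lines with higher entropy are the ones the model finds less predictable; ranking lines by this score gives the kind of signal that a buggy-line classifier could then be evaluated on.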

Deep Ordinal Hashing with Spatial Attention

Title Deep Ordinal Hashing with Spatial Attention
Authors Lu Jin, Xiangbo Shu, Kai Li, Zechao Li, Guo-Jun Qi, Jinhui Tang
Abstract Hashing has attracted increasing research attention in recent years due to its high efficiency of computation and storage in image retrieval. Recent works have demonstrated the superiority of simultaneously learning feature representations and hash functions with deep neural networks. However, most existing deep hashing methods directly learn the hash functions by encoding the global semantic information, while ignoring the local spatial information of images. The loss of local spatial structure creates a performance bottleneck for hash functions, limiting their application to accurate similarity retrieval. In this work, we propose a novel Deep Ordinal Hashing (DOH) method, which learns ordinal representations by leveraging the ranking structure of the feature space from both local and global views. In particular, to effectively build the ranking structure, we propose to learn the rank correlation space by exploiting the local spatial information from a Fully Convolutional Network (FCN) and the global semantic information from a Convolutional Neural Network (CNN) simultaneously. More specifically, an effective spatial attention model is designed to capture the local spatial information by selectively learning well-specified locations closely related to target objects. In such a hashing framework, the local spatial and global semantic nature of images is captured in an end-to-end ranking-to-hashing manner. Experimental results on three widely used datasets demonstrate that the proposed DOH method significantly outperforms state-of-the-art hashing methods.
Tasks Image Retrieval
Published 2018-05-07
URL http://arxiv.org/abs/1805.02459v1
PDF http://arxiv.org/pdf/1805.02459v1.pdf
PWC https://paperswithcode.com/paper/deep-ordinal-hashing-with-spatial-attention
Repo
Framework

CyLKs: Unsupervised Cycle Lucas-Kanade Network for Landmark Tracking

Title CyLKs: Unsupervised Cycle Lucas-Kanade Network for Landmark Tracking
Authors Xinshuo Weng, Wentao Han
Abstract Across a majority of modern learning-based tracking systems, expensive annotations are needed to achieve state-of-the-art performance. In contrast, the Lucas-Kanade (LK) algorithm works well without any annotation. However, LK makes a strong assumption of photometric (brightness) consistency on image intensity and drifts easily because of large motion, occlusion, and the aperture problem. To relax the assumption and alleviate the drift problem, we propose CyLKs, a data-driven way of training Lucas-Kanade in an unsupervised manner. CyLKs learns a feature transformation through CNNs, transforming the input images to a feature space which is especially favorable to LK tracking. During training, we perform differentiable Lucas-Kanade forward and backward on the convolutional feature maps, and then minimize the re-projection error. During testing, we perform LK tracking on the learned features. We apply our model to the task of landmark tracking and perform experiments on the THUMOS and 300VW datasets.
Tasks Landmark Tracking
Published 2018-11-28
URL https://arxiv.org/abs/1811.11325v4
PDF https://arxiv.org/pdf/1811.11325v4.pdf
PWC https://paperswithcode.com/paper/cylks-unsupervised-cycle-lucas-kanade-network
Repo
Framework
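
The forward and backward Lucas-Kanade passes mentioned in the abstract above can be illustrated with a deliberately simplified sketch. It runs one classical LK step on raw 1D intensities (translation only), not the paper's differentiable LK on CNN feature maps, and `cycle_error` is a hypothetical stand-in for the re-projection error: it measures how far the forward and backward shift estimates are from cancelling out.

```python
import math


def lk_shift_1d(src, dst):
    """One Lucas-Kanade step: least-squares shift d with dst(x) ~ src(x - d)."""
    num = den = 0.0
    for i in range(1, len(src) - 1):
        grad = (src[i + 1] - src[i - 1]) / 2.0  # central-difference gradient of src
        diff = dst[i] - src[i]                  # intensity change between signals
        num += grad * diff
        den += grad * grad
    if den == 0.0:
        return 0.0  # flat signal: the shift is unobservable
    return -num / den


def cycle_error(a, b):
    """Forward shift plus backward shift; zero for a perfectly consistent cycle."""
    return abs(lk_shift_1d(a, b) + lk_shift_1d(b, a))


def bump(center, width=5.0, length=100):
    """Smooth synthetic signal (a Gaussian bump), used only for this demo."""
    return [math.exp(-((x - center) ** 2) / (2.0 * width ** 2)) for x in range(length)]


first, second = bump(50.0), bump(50.3)
print(lk_shift_1d(first, second), cycle_error(first, second))
```

Because a single LK step relies on a first-order approximation, it is only reliable for small shifts on smooth signals, which is one way to read the abstract's motivation for learning features that suit LK better than raw intensities.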

Convexification of Neural Graph

Title Convexification of Neural Graph
Authors Han Xiao
Abstract Traditionally, most complex intelligence architectures are extremely non-convex and cannot be handled well by convex optimization. However, this paper decomposes complex structures into three types of nodes: operators, algorithms and functions. Iteratively propagating from node to node along edges, we prove that “regarding the tree-structured neural graph, it is nearly convex in each variable, when the other variables are fixed.” In fact, the non-convex properties stem from circles and functions, which can be transformed to be convex with our proposed scale mechanism. Experimentally, we justify our theoretical analysis with two practical applications.
Tasks
Published 2018-01-09
URL http://arxiv.org/abs/1801.02901v2
PDF http://arxiv.org/pdf/1801.02901v2.pdf
PWC https://paperswithcode.com/paper/convexification-of-neural-graph
Repo
Framework

Attention Boosted Sequential Inference Model

Title Attention Boosted Sequential Inference Model
Authors Guanyu Li, Pengfei Zhang, Caiyan Jia
Abstract The attention mechanism has proven effective in natural language processing. This paper proposes an attention-boosted natural language inference model named aESIM, which adds word attention and adaptive direction-oriented attention mechanisms to the traditional Bi-LSTM layer of natural language inference models such as ESIM. This gives aESIM the ability to effectively learn word representations and to model the local subsentential inference between pairs of premise and hypothesis. Empirical studies on the SNLI, MultiNLI and Quora benchmarks show that aESIM is superior to the original ESIM model.
Tasks Natural Language Inference
Published 2018-12-05
URL http://arxiv.org/abs/1812.01840v2
PDF http://arxiv.org/pdf/1812.01840v2.pdf
PWC https://paperswithcode.com/paper/attention-boosted-sequential-inference-model
Repo
Framework

Generating Shared Latent Variables for Robots to Imitate Human Movements and Understand their Physical Limitations

Title Generating Shared Latent Variables for Robots to Imitate Human Movements and Understand their Physical Limitations
Authors Maxime Devanne, Sao Mai Nguyen
Abstract Assistive robotics, and particularly robot coaches, may be very helpful for rehabilitation healthcare. In this context, we propose a method based on the Gaussian Process Latent Variable Model (GP-LVM) to transfer knowledge between a physiotherapist, a robot coach and a patient. Our model is able to map visual human body features to robot data in order to facilitate robot learning and imitation. In addition, we propose to extend the model to adapt the robot's understanding to the patient's physical limitations during the assessment of rehabilitation exercises. Experimental evaluation demonstrates promising results for both robot imitation and model adaptation according to the patients' limitations.
Tasks
Published 2018-10-11
URL http://arxiv.org/abs/1810.04879v2
PDF http://arxiv.org/pdf/1810.04879v2.pdf
PWC https://paperswithcode.com/paper/generating-shared-latent-variables-for-robots
Repo
Framework

The global optimum of shallow neural network is attained by ridgelet transform

Title The global optimum of shallow neural network is attained by ridgelet transform
Authors Sho Sonoda, Isao Ishikawa, Masahiro Ikeda, Kei Hagihara, Yoshihiro Sawano, Takuo Matsubara, Noboru Murata
Abstract We prove that the global minimum of the backpropagation (BP) training problem of neural networks with an arbitrary nonlinear activation is given by the ridgelet transform. A series of computational experiments show that there exists an interesting similarity between the scatter plot of hidden parameters in a shallow neural network after the BP training and the spectrum of the ridgelet transform. By introducing a continuous model of neural networks, we reduce the training problem to a convex optimization in an infinite dimensional Hilbert space, and obtain the explicit expression of the global optimizer via the ridgelet transform.
Tasks
Published 2018-05-19
URL http://arxiv.org/abs/1805.07517v3
PDF http://arxiv.org/pdf/1805.07517v3.pdf
PWC https://paperswithcode.com/paper/the-global-optimum-of-shallow-neural-network
Repo
Framework

A Deep Learning Framework for Single-Sided Sound Speed Inversion in Medical Ultrasound

Title A Deep Learning Framework for Single-Sided Sound Speed Inversion in Medical Ultrasound
Authors Micha Feigin, Daniel Freedman, Brian W. Anthony
Abstract Objective: Ultrasound elastography is gaining traction as an accessible and useful diagnostic tool for such things as cancer detection and differentiation and thyroid disease diagnostics. Unfortunately, state-of-the-art shear wave imaging techniques, essential to promote this goal, are limited to high-end ultrasound hardware due to high power requirements, are extremely sensitive to patient and sonographer motion, and generally suffer from low frame rates. Motivated by research and theory showing that longitudinal wave sound speed carries similar diagnostic abilities to shear wave imaging, we present an alternative approach using single-sided pressure-wave sound speed measurements from channel data. Methods: In this paper, we present a single-sided sound speed inversion solution using a fully convolutional deep neural network. We use simulations for training, allowing the generation of limitless ground truth data. Results: We show that it is possible to invert for longitudinal sound speed in soft tissue at high frame rates. We validate the method on simulated data. We present highly encouraging results on limited real data. Conclusion: Sound speed inversion on channel data has significant potential, made possible in real time with deep learning technologies. Significance: Specialized shear wave ultrasound systems remain inaccessible in many locations. Longitudinal sound speed and deep learning technologies enable an alternative approach to diagnosis based on tissue elasticity. High frame rates are possible.
Tasks
Published 2018-09-30
URL https://arxiv.org/abs/1810.00322v4
PDF https://arxiv.org/pdf/1810.00322v4.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-framework-for-single-sided
Repo
Framework

Obstacle Detection Quality as a Problem-Oriented Approach to Stereo Vision Algorithms Estimation in Road Situation Analysis

Title Obstacle Detection Quality as a Problem-Oriented Approach to Stereo Vision Algorithms Estimation in Road Situation Analysis
Authors A. A. Smagina, D. A. Shepelev, E. I. Ershov, A. S. Grigoryev
Abstract In this work we present a method for evaluating the performance of stereo-vision-based obstacle detection techniques that takes into account the specifics of road situation analysis to minimize the effort required to prepare a test dataset. This approach is designed to be implemented in systems such as self-driving cars or driver assistance, and can also be used as a problem-oriented quality criterion for evaluating stereo vision algorithms.
Tasks Self-Driving Cars
Published 2018-09-06
URL http://arxiv.org/abs/1809.02228v1
PDF http://arxiv.org/pdf/1809.02228v1.pdf
PWC https://paperswithcode.com/paper/obstacle-detection-quality-as-a-problem
Repo
Framework

Incorporating Privileged Information to Unsupervised Anomaly Detection

Title Incorporating Privileged Information to Unsupervised Anomaly Detection
Authors Shubhranshu Shekhar, Leman Akoglu
Abstract We introduce a new unsupervised anomaly detection ensemble called SPI which can harness privileged information - data available only for training examples but not for (future) test examples. Our ideas build on the Learning Using Privileged Information (LUPI) paradigm pioneered by Vapnik et al. [19,17], which we extend to unsupervised learning and in particular to anomaly detection. SPI (for Spotting anomalies with Privileged Information) constructs a number of frames/fragments of knowledge (i.e., density estimates) in the privileged space and transfers them to the anomaly scoring space through “imitation” functions that use only the partial information available for test examples. Our generalization of the LUPI paradigm to unsupervised anomaly detection shepherds the field in several key directions, including (i) domain knowledge-augmented detection using expert annotations as PI, (ii) fast detection using computationally-demanding data as PI, and (iii) early detection using “historical future” data as PI. Through extensive experiments on simulated and real datasets, we show that augmenting privileged information to anomaly detection significantly improves detection performance. We also demonstrate the promise of SPI under all three settings (i-iii); with PI capturing expert knowledge, computationally expensive features, and future data on three real world detection tasks.
Tasks Anomaly Detection, Unsupervised Anomaly Detection
Published 2018-05-06
URL http://arxiv.org/abs/1805.02269v2
PDF http://arxiv.org/pdf/1805.02269v2.pdf
PWC https://paperswithcode.com/paper/incorporating-privileged-information-to
Repo
Framework