July 27, 2019

2991 words 15 mins read

Paper Group ANR 562

ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models

Title ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models
Authors Talal Ahmed, Waheed U. Bajwa
Abstract Statistical inference can be computationally prohibitive in ultrahigh-dimensional linear models. Correlation-based variable screening, in which one leverages marginal correlations for removal of irrelevant variables from the model prior to statistical inference, can be used to overcome this challenge. Prior works on correlation-based variable screening either impose strong statistical priors on the linear model or assume specific post-screening inference methods. This paper first extends the analysis of correlation-based variable screening to arbitrary linear models and post-screening inference techniques. In particular, ($i$) it shows that a condition—termed the screening condition—is sufficient for successful correlation-based screening of linear models, and ($ii$) it provides insights into the dependence of marginal correlation-based screening on different problem parameters. Numerical experiments confirm that these insights are not mere artifacts of analysis; rather, they are reflective of the challenges associated with marginal correlation-based variable screening. Second, the paper explicitly derives the screening condition for two families of linear models, namely, sub-Gaussian linear models and arbitrary (random or deterministic) linear models. In the process, it establishes that—under appropriate conditions—it is possible to reduce the dimension of an ultrahigh-dimensional, arbitrary linear model to almost the sample size even when the number of active variables scales almost linearly with the sample size.
Tasks
Published 2017-08-21
URL http://arxiv.org/abs/1708.06077v1
PDF http://arxiv.org/pdf/1708.06077v1.pdf
PWC https://paperswithcode.com/paper/exsis-extended-sure-independence-screening
Repo
Framework
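The screening step described in the abstract, ranking variables by absolute marginal correlation with the response and keeping only the strongest ones, can be sketched in a few lines. This is a generic toy illustration of marginal correlation screening, not the paper's ExSIS procedure or its screening-condition analysis; the data, dimensions, and function name are made up.

```python
import numpy as np

def marginal_screen(X, y, d):
    """Keep the d variables with the largest absolute marginal
    correlation with the response y (the generic screening step)."""
    # Standardize columns and center y so inner products are
    # proportional to marginal correlations.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / len(y)
    # Indices of the d most correlated variables, in sorted order.
    return np.sort(np.argsort(corr)[-d:])

# Toy ultrahigh-ish model: only variables 0 and 1 are active.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)
kept = marginal_screen(X, y, d=5)
print(kept)  # the active variables 0 and 1 should survive screening
```

The point of the paper is characterizing when this cheap step is safe (the "screening condition"); the sketch only shows the mechanics of the dimension reduction itself.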

On Encoding Temporal Evolution for Real-time Action Prediction

Title On Encoding Temporal Evolution for Real-time Action Prediction
Authors Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis
Abstract Anticipating future actions is a key component of intelligence, particularly for real-time systems such as robots or autonomous cars. While recent works have addressed prediction of raw RGB pixel values, we focus on anticipating the motion evolution in future video frames. To this end, we construct dynamic images (DIs) by summarising moving pixels through a sequence of future frames. We train convolutional LSTMs to predict the next DIs in an unsupervised manner, and then recognise the activity associated with the predicted DI. We demonstrate the effectiveness of our approach on three benchmark action datasets, showing that despite running on videos with complex activities, our approach anticipates the next human action with high accuracy and obtains better results than state-of-the-art methods.
Tasks
Published 2017-09-22
URL http://arxiv.org/abs/1709.07894v3
PDF http://arxiv.org/pdf/1709.07894v3.pdf
PWC https://paperswithcode.com/paper/on-encoding-temporal-evolution-for-real-time
Repo
Framework
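The "dynamic image" construction the abstract mentions is commonly computed by approximate rank pooling (Bilen et al.'s dynamic image work), which reduces a clip to a single weighted sum of frames with weight 2t - T - 1 on frame t. Whether this paper uses exactly these weights is an assumption; the sketch below only illustrates the general DI idea on toy frames.

```python
import numpy as np

def dynamic_image(frames):
    """Summarise a clip into one 'dynamic image' via approximate rank
    pooling: a weighted sum of frames with weights alpha_t = 2t - T - 1,
    t = 1..T, so later frames get positive weight and earlier ones
    negative weight."""
    T = len(frames)
    alphas = 2 * np.arange(1, T + 1) - T - 1
    return np.tensordot(alphas, np.asarray(frames, dtype=float), axes=1)

# Toy clip of four 2x2 frames whose brightness ramps up over time.
frames = [np.full((2, 2), t, dtype=float) for t in range(1, 5)]
di = dynamic_image(frames)
print(di)  # every pixel is positive: brightness increases over the clip
```

Static pixels cancel out (the weights sum to zero), so the DI encodes only temporal evolution, which is what the paper then asks a convolutional LSTM to predict.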

The 2017 Hands in the Million Challenge on 3D Hand Pose Estimation

Title The 2017 Hands in the Million Challenge on 3D Hand Pose Estimation
Authors Shanxin Yuan, Qi Ye, Guillermo Garcia-Hernando, Tae-Kyun Kim
Abstract We present the 2017 Hands in the Million Challenge, a public competition designed for the evaluation of the task of 3D hand pose estimation. The goal of this challenge is to assess how far the state of the art has come in solving 3D hand pose estimation, and to detect major failure and strength modes of both systems and evaluation metrics that can help identify future research directions. The challenge follows up on the recent publication of the BigHand2.2M and First-Person Hand Action datasets, which were designed to exhaustively cover multiple hands, viewpoints, hand articulations, and occlusions. The challenge consists of a standardized dataset, an evaluation protocol for two different tasks, and a public competition. This document describes the different aspects of the challenge; together with the results of the participants, it will be presented at the 3rd International Workshop on Observing and Understanding Hands in Action (HANDS 2017), held with ICCV 2017.
Tasks Hand Pose Estimation, Pose Estimation
Published 2017-07-07
URL http://arxiv.org/abs/1707.02237v1
PDF http://arxiv.org/pdf/1707.02237v1.pdf
PWC https://paperswithcode.com/paper/the-2017-hands-in-the-million-challenge-on-3d
Repo
Framework

Greed is Good: Near-Optimal Submodular Maximization via Greedy Optimization

Title Greed is Good: Near-Optimal Submodular Maximization via Greedy Optimization
Authors Moran Feldman, Christopher Harshaw, Amin Karbasi
Abstract It is known that greedy methods perform well for maximizing monotone submodular functions. At the same time, such methods perform poorly in the face of non-monotonicity. In this paper, we show, perhaps surprisingly, that invoking the classical greedy algorithm $O(\sqrt{k})$ times leads to the (currently) fastest deterministic algorithm, called Repeated Greedy, for maximizing a general submodular function subject to $k$-independent system constraints. Repeated Greedy achieves a $(1 + O(1/\sqrt{k}))k$-approximation using $O(nr\sqrt{k})$ function evaluations (here, $n$ and $r$ denote the size of the ground set and the maximum size of a feasible solution, respectively). We then show that by a careful sampling procedure, we can run the greedy algorithm only once and obtain the (currently) fastest randomized algorithm, called Sample Greedy, for maximizing a submodular function subject to $k$-extendible system constraints (a subclass of $k$-independent system constraints). Sample Greedy achieves a $(k + 3)$-approximation with only $O(nr/k)$ function evaluations. Finally, we derive an almost matching lower bound, showing that no polynomial-time algorithm can have an approximation ratio smaller than $k + 1/2 - \varepsilon$. To further support our theoretical results, we compare the performance of Repeated Greedy and Sample Greedy with prior art in a concrete application (movie recommendation). We consistently observe that while Sample Greedy achieves practically the same utility as the best baseline, it runs at least two orders of magnitude faster.
Tasks
Published 2017-04-05
URL http://arxiv.org/abs/1704.01652v1
PDF http://arxiv.org/pdf/1704.01652v1.pdf
PWC https://paperswithcode.com/paper/greed-is-good-near-optimal-submodular
Repo
Framework
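The classical greedy subroutine that Repeated Greedy invokes $O(\sqrt{k})$ times can be sketched for the simplest constraint, a cardinality bound, on a toy coverage objective. The paper's $k$-independent and $k$-extendible systems are far more general; this is just the base algorithm, with made-up data.

```python
def greedy(ground, f, k):
    """Classical greedy for a submodular set function f under a
    cardinality-k constraint: repeatedly add the element with the
    largest marginal gain, stopping when no gain is positive."""
    S = set()
    for _ in range(k):
        candidates = ground - S
        if not candidates:
            break
        best = max(candidates, key=lambda e: f(S | {e}) - f(S))
        if f(S | {best}) - f(S) <= 0:
            break
        S.add(best)
    return S

# Toy monotone submodular objective: coverage of a universe of items.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}

def cover(S):
    covered = set()
    for e in S:
        covered |= sets[e]
    return len(covered)

S = greedy(set(sets), cover, k=2)
print(S, cover(S))  # two sets that together cover all six items
```

Repeated Greedy reruns this on shrinking ground sets and keeps the best of the resulting solutions, which is how it copes with non-monotonicity.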

On the Reconstruction of Face Images from Deep Face Templates

Title On the Reconstruction of Face Images from Deep Face Templates
Authors Guangcan Mai, Kai Cao, Pong C. Yuen, Anil K. Jain
Abstract State-of-the-art face recognition systems are based on deep (convolutional) neural networks. Therefore, it is imperative to determine to what extent face templates derived from deep networks can be inverted to obtain the original face image. In this paper, we study the vulnerabilities of a state-of-the-art face recognition system via a template reconstruction attack. We propose a neighborly de-convolutional neural network (\textit{NbNet}) to reconstruct face images from their deep templates. In our experiments, we assumed that no knowledge about the target subject or the deep network is available. To train the \textit{NbNet} reconstruction models, we augmented two benchmark face datasets (VGG-Face and Multi-PIE) with a large collection of images synthesized using a face generator. The proposed reconstruction was evaluated using type-I attacks (comparing the reconstructed images against the original face images used to generate the deep templates) and type-II attacks (comparing the reconstructed images against a different face image of the same subject). Given the images reconstructed from \textit{NbNets}, we show that for verification we achieve a TAR of 95.20% (58.05%) on LFW under type-I (type-II) attacks at a FAR of 0.1%. In addition, 96.58% (92.84%) of the images reconstructed from templates of partition \textit{fa} (\textit{fb}) can be identified from partition \textit{fa} in color FERET. Our study demonstrates the need to secure deep templates in face recognition systems.
Tasks Face Recognition
Published 2017-03-02
URL http://arxiv.org/abs/1703.00832v4
PDF http://arxiv.org/pdf/1703.00832v4.pdf
PWC https://paperswithcode.com/paper/on-the-reconstruction-of-face-images-from
Repo
Framework

Automatic Measurement of Pre-aspiration

Title Automatic Measurement of Pre-aspiration
Authors Yaniv Sheena, Míša Hejná, Yossi Adi, Joseph Keshet
Abstract Pre-aspiration is defined as the period of glottal friction occurring in sequences of vocalic/consonantal sonorants and phonetically voiceless obstruents. We propose two machine learning methods for automatic measurement of pre-aspiration duration: a feedforward neural network, which works at the frame level; and a structured prediction model, which relies on manually designed feature functions, and works at the segment level. The input for both algorithms is a speech signal of an arbitrary length containing a single obstruent, and the output is a pair of times which constitutes the pre-aspiration boundaries. We train both models on a set of manually annotated examples. Results suggest that the structured model is superior to the frame-based model as it yields higher accuracy in predicting the boundaries and generalizes to new speakers and new languages. Finally, we demonstrate the applicability of our structured prediction algorithm by replicating linguistic analysis of pre-aspiration in Aberystwyth English with high correlation.
Tasks Structured Prediction
Published 2017-04-05
URL http://arxiv.org/abs/1704.01653v3
PDF http://arxiv.org/pdf/1704.01653v3.pdf
PWC https://paperswithcode.com/paper/automatic-measurement-of-pre-aspiration
Repo
Framework

Word Affect Intensities

Title Word Affect Intensities
Authors Saif M. Mohammad
Abstract Words often convey affect: emotions, feelings, and attitudes. Lexicons of word-affect associations have applications in automatic emotion analysis and natural language generation. However, existing lexicons indicate only coarse categories of affect association. Here, for the first time, we create an affect intensity lexicon with real-valued scores of association. We use a technique called best-worst scaling that improves annotation consistency and obtains reliable fine-grained scores. The lexicon includes terms common in general English as well as terms specific to social media communications. It has close to 6,000 entries for four basic emotions. We will be adding entries for other affect dimensions shortly.
Tasks Emotion Recognition, Text Generation
Published 2017-04-28
URL http://arxiv.org/abs/1704.08798v1
PDF http://arxiv.org/pdf/1704.08798v1.pdf
PWC https://paperswithcode.com/paper/word-affect-intensities
Repo
Framework
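The best-worst scaling technique the abstract mentions has a simple counting core: annotators see small tuples of terms and mark the most and least intense; a term's score is the fraction of times it was chosen best minus the fraction of times it was chosen worst. This sketch shows that counting step only, with invented annotations; the lexicon itself rescales scores and uses the authors' full annotation protocol.

```python
from collections import Counter

def bws_scores(annotations):
    """Best-worst scaling counts: each annotation is (tuple_of_terms,
    best_term, worst_term); a term's score is
    (#chosen best - #chosen worst) / #appearances, in [-1, 1]."""
    best, worst, seen = Counter(), Counter(), Counter()
    for terms, b, w in annotations:
        seen.update(terms)
        best[b] += 1
        worst[w] += 1
    return {t: (best[t] - worst[t]) / seen[t] for t in seen}

# Two hypothetical 4-tuple annotations for anger intensity.
anns = [
    (("furious", "annoyed", "calm", "upset"), "furious", "calm"),
    (("furious", "mad", "calm", "irked"), "furious", "calm"),
]
scores = bws_scores(anns)
print(scores["furious"], scores["calm"])  # 1.0 -1.0
```

Because each judgment is a comparison rather than an absolute rating, the resulting real-valued scores are far more consistent across annotators than direct intensity ratings.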

BigHand2.2M Benchmark: Hand Pose Dataset and State of the Art Analysis

Title BigHand2.2M Benchmark: Hand Pose Dataset and State of the Art Analysis
Authors Shanxin Yuan, Qi Ye, Bjorn Stenger, Siddhant Jain, Tae-Kyun Kim
Abstract In this paper we introduce a large-scale hand pose dataset, collected using a novel capture method. Existing datasets are either generated synthetically or captured using depth sensors: synthetic datasets exhibit a certain level of appearance difference from real depth images, and real datasets are limited in quantity and coverage, mainly due to the difficulty of annotating them. We propose a tracking system with six 6D magnetic sensors and inverse kinematics to automatically obtain 21-joint hand pose annotations of depth maps, captured with minimal restriction on the range of motion. The capture protocol aims to fully cover the natural hand pose space. As shown in embedding plots, the new dataset exhibits a significantly wider and denser range of hand poses than existing benchmarks. Current state-of-the-art methods are evaluated on the dataset, and we demonstrate significant improvements in cross-benchmark performance. We also show significant improvements in egocentric hand pose estimation with a CNN trained on the new dataset.
Tasks Hand Pose Estimation, Pose Estimation
Published 2017-04-09
URL http://arxiv.org/abs/1704.02612v2
PDF http://arxiv.org/pdf/1704.02612v2.pdf
PWC https://paperswithcode.com/paper/bighand22m-benchmark-hand-pose-dataset-and
Repo
Framework

Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor

Title Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor
Authors Franziska Mueller, Dushyant Mehta, Oleksandr Sotnychenko, Srinath Sridhar, Dan Casas, Christian Theobalt
Abstract We present an approach for real-time, robust and accurate hand pose estimation from moving egocentric RGB-D cameras in cluttered real environments. Existing methods typically fail for hand-object interactions in cluttered scenes imaged from egocentric viewpoints, common for virtual or augmented reality applications. Our approach uses two subsequently applied Convolutional Neural Networks (CNNs) to localize the hand and regress 3D joint locations. Hand localization is achieved by using a CNN to estimate the 2D position of the hand center in the input, even in the presence of clutter and occlusions. The localized hand position, together with the corresponding input depth value, is used to generate a normalized cropped image that is fed into a second CNN to regress relative 3D hand joint locations in real time. For added accuracy, robustness and temporal stability, we refine the pose estimates using a kinematic pose tracking energy. To train the CNNs, we introduce a new photorealistic dataset that uses a merged reality approach to capture and synthesize large amounts of annotated data of natural hand interaction in cluttered scenes. Through quantitative and qualitative evaluation, we show that our method is robust to self-occlusion and occlusions by objects, particularly in moving egocentric perspectives.
Tasks Hand Pose Estimation, Pose Estimation, Pose Tracking
Published 2017-04-07
URL http://arxiv.org/abs/1704.02201v2
PDF http://arxiv.org/pdf/1704.02201v2.pdf
PWC https://paperswithcode.com/paper/real-time-hand-tracking-under-occlusion-from
Repo
Framework

Efficient Registration of Pathological Images: A Joint PCA/Image-Reconstruction Approach

Title Efficient Registration of Pathological Images: A Joint PCA/Image-Reconstruction Approach
Authors Xu Han, Xiao Yang, Stephen Aylward, Roland Kwitt, Marc Niethammer
Abstract Registration involving one or more images containing pathologies is challenging, as standard image similarity measures and spatial transforms cannot account for common changes due to pathologies. Low-rank/Sparse (LRS) decomposition removes pathologies prior to registration; however, LRS is memory-demanding and slow, which limits its use on larger data sets. Additionally, LRS blurs normal tissue regions, which may degrade registration performance. This paper proposes an efficient alternative to LRS: (1) normal tissue appearance is captured by principal component analysis (PCA) and (2) blurring is avoided by an integrated model for pathology removal and image reconstruction. Results on synthetic and BRATS 2015 data demonstrate its utility.
Tasks Image Reconstruction
Published 2017-03-31
URL http://arxiv.org/abs/1704.00036v1
PDF http://arxiv.org/pdf/1704.00036v1.pdf
PWC https://paperswithcode.com/paper/efficient-registration-of-pathological-images
Repo
Framework
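The PCA half of the paper's idea, capture normal tissue appearance with a low-rank basis and reconstruct a quasi-normal image from it, can be sketched on toy vectors. The data below are random stand-ins rather than BRATS images, and the sketch omits the paper's joint registration/reconstruction model entirely; it only shows why projecting onto a normal-appearance basis suppresses a pathology.

```python
import numpy as np

rng = np.random.default_rng(0)

# 40 "normal" images, flattened to 100-pixel vectors (toy stand-ins).
normals = rng.normal(size=(40, 100))
mean = normals.mean(axis=0)
_, _, Vt = np.linalg.svd(normals - mean, full_matrices=False)
basis = Vt[:10]                     # top-10 principal components

# A pathological image: a normal image plus a bright "lesion".
lesion = 5.0 * (np.arange(100) >= 90)
patho = normals[0] + lesion

# Quasi-normal reconstruction: project onto the normal-appearance basis.
coeff = basis @ (patho - mean)
recon = mean + coeff @ basis

# The lesion is largely suppressed: it lies mostly outside the basis.
print(np.linalg.norm((recon - normals[0])[90:]) <
      np.linalg.norm((patho - normals[0])[90:]))  # True
```

Registering the quasi-normal reconstruction instead of the pathological image is what lets standard similarity measures work again.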

3D Face Reconstruction with Geometry Details from a Single Image

Title 3D Face Reconstruction with Geometry Details from a Single Image
Authors Luo Jiang, Juyong Zhang, Bailin Deng, Hao Li, Ligang Liu
Abstract 3D face reconstruction from a single image is a classical and challenging problem, with wide applications in many areas. Inspired by recent works in face animation from RGB-D or monocular video inputs, we develop a novel method for reconstructing 3D faces from unconstrained 2D images, using a coarse-to-fine optimization strategy. First, a smooth coarse 3D face is generated from an example-based bilinear face model, by aligning the projection of 3D face landmarks with 2D landmarks detected from the input image. Afterwards, using local corrective deformation fields, the coarse 3D face is refined using photometric consistency constraints, resulting in a medium face shape. Finally, a shape-from-shading method is applied on the medium face to recover fine geometric details. Our method outperforms state-of-the-art approaches in terms of accuracy and detail recovery, which is demonstrated in extensive experiments using real world models and publicly available datasets.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2017-02-18
URL http://arxiv.org/abs/1702.05619v2
PDF http://arxiv.org/pdf/1702.05619v2.pdf
PWC https://paperswithcode.com/paper/3d-face-reconstruction-with-geometry-details
Repo
Framework

Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages

Title Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages
Authors Zheng Wei, Erin M. Conlon
Abstract Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group-specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.
Tasks
Published 2017-12-16
URL http://arxiv.org/abs/1712.05907v2
PDF http://arxiv.org/pdf/1712.05907v2.pdf
PWC https://paperswithcode.com/paper/parallel-markov-chain-monte-carlo-for
Repo
Framework
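The two-stage scheme can be sketched on a toy two-level normal model with known unit variances. Stage 1 samples each group's parameter independently under a flat prior (embarrassingly parallel in practice; run sequentially here for brevity), and stage 2 uses those draws as independence proposals: because each proposal is proportional to its group likelihood, the Metropolis-Hastings ratio reduces to a ratio of priors. All modeling choices are simplified stand-ins for the paper's three- and four-level models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_gj ~ N(theta_g, 1), theta_g ~ N(mu, 1), flat prior on mu.
groups = [rng.normal(loc=m, size=50) for m in (-1.0, 0.0, 1.0)]

# Stage 1: per group, theta_g | y_g ~ N(ybar_g, 1/n_g) under a flat prior.
stage1 = [rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=4000)
          for y in groups]

# Stage 2: stage-1 draws are independence proposals for each theta_g;
# the MH acceptance ratio reduces to a ratio of N(mu, 1) prior densities.
theta = np.array([s[0] for s in stage1])
mu, mu_draws = 0.0, []
for _ in range(4000):
    for g, s in enumerate(stage1):
        prop = s[rng.integers(len(s))]
        log_ratio = 0.5 * ((theta[g] - mu) ** 2 - (prop - mu) ** 2)
        if np.log(rng.uniform()) < log_ratio:
            theta[g] = prop
    # Conjugate update: mu | theta ~ N(mean(theta), 1/G).
    mu = rng.normal(theta.mean(), 1 / np.sqrt(len(theta)))
    mu_draws.append(mu)

print(round(float(np.mean(mu_draws[500:])), 2))  # posterior mean of mu, near 0
```

The expensive per-group sampling happens once, in parallel, and stage 2 only recombines stored draws, which is the source of the reported efficiency gains.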

Variance-Reduced Stochastic Learning under Random Reshuffling

Title Variance-Reduced Stochastic Learning under Random Reshuffling
Authors Bicheng Ying, Kun Yuan, Ali H. Sayed
Abstract Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizer. The existing convergence results assume uniform data sampling with replacement. However, it has been observed in related works that random reshuffling can deliver superior performance over uniform sampling and, yet, no formal proofs or guarantees of exact convergence exist for variance-reduced algorithms under random reshuffling. This paper makes two contributions. First, it resolves this open issue and provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA; the argument is also adaptable to other variance-reduced algorithms. Second, under random reshuffling, the paper proposes a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements compared to SAGA and with balanced gradient computations compared to SVRG. AVRG is also shown analytically to converge linearly.
Tasks
Published 2017-08-04
URL http://arxiv.org/abs/1708.01383v3
PDF http://arxiv.org/pdf/1708.01383v3.pdf
PWC https://paperswithcode.com/paper/variance-reduced-stochastic-learning-under
Repo
Framework
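The sampling change the paper analyzes, running SAGA over a fresh random permutation of the data each epoch instead of sampling with replacement, is easy to sketch. This is plain SAGA with reshuffling on a toy least-squares problem, not the paper's AVRG algorithm; the step size and data are made up.

```python
import numpy as np

def saga_rr(grad_i, w, n, epochs, lr):
    """SAGA with random-reshuffling sampling: each epoch visits all n
    component gradients in a fresh random order."""
    table = np.zeros((n,) + w.shape)        # stored per-sample gradients
    avg = table.mean(axis=0)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        for i in rng.permutation(n):        # reshuffle every epoch
            g = grad_i(w, i)
            w = w - lr * (g - table[i] + avg)
            avg = avg + (g - table[i]) / n  # keep the running mean in sync
            table[i] = g
    return w

# Least squares: f(w) = (1/2n) sum_i (x_i . w - y_i)^2, minimizer w_true.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i]
w = saga_rr(grad_i, np.zeros(3), n=100, epochs=100, lr=0.01)
print(np.round(w, 3))  # should converge to w_true = [1, -2, 0.5]
```

Because the objective is noise-free at its minimizer, the variance-reduced update converges to the exact solution, matching the linear-convergence guarantee the paper establishes for SAGA under reshuffling.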

Deep Exploration via Randomized Value Functions

Title Deep Exploration via Randomized Value Functions
Authors Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen
Abstract We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a tabular representation.
Tasks Efficient Exploration
Published 2017-03-22
URL https://arxiv.org/abs/1703.07608v5
PDF https://arxiv.org/pdf/1703.07608v5.pdf
PWC https://paperswithcode.com/paper/deep-exploration-via-randomized-value
Repo
Framework
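The core idea, acting greedily with respect to a randomly sampled value function rather than the mean estimate, is visible even in the simplest setting. This toy bandit sketch collapses to Thompson-style sampling and omits the paper's multi-step machinery (e.g. randomized least-squares value iteration); the arm means and noise scales are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three-armed bandit: keep a Gaussian value estimate per arm, and each
# round act greedily w.r.t. a *sampled* value function, so uncertainty
# itself drives deep-enough exploration.
true_means = np.array([0.1, 0.5, 0.9])
counts = np.ones(3)   # pseudo-count of 1 per arm to seed the estimates
means = np.zeros(3)
for t in range(2000):
    sampled_q = rng.normal(means, 1 / np.sqrt(counts))  # randomized values
    a = int(np.argmax(sampled_q))                       # greedy on the sample
    r = rng.normal(true_means[a], 0.1)
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]              # incremental mean
print(int(np.argmax(counts)))  # the best arm, index 2, is pulled most
```

As an arm's count grows, its sampled values concentrate around the empirical mean, so exploration shuts off automatically for well-understood actions.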

Deep Robust Kalman Filter

Title Deep Robust Kalman Filter
Authors Shirli Di-Castro Shashua, Shie Mannor
Abstract A Robust Markov Decision Process (RMDP) is a sequential decision making model that accounts for uncertainty in the parameters of dynamic systems. This uncertainty introduces difficulties in learning an optimal policy, especially for environments with large state spaces. We propose two algorithms, RTD-DQN and Deep-RoK, for solving large-scale RMDPs using nonlinear approximation schemes such as deep neural networks. The RTD-DQN algorithm incorporates the robust Bellman temporal difference error into a robust loss function, yielding robust policies for the agent. The Deep-RoK algorithm is a robust Bayesian method, based on the Extended Kalman Filter (EKF), that accounts for both the uncertainty in the weights of the approximated value function and the uncertainty in the transition probabilities, improving the robustness of the agent. We provide theoretical results for our approach and test the proposed algorithms on a continuous state domain.
Tasks Decision Making
Published 2017-03-07
URL http://arxiv.org/abs/1703.02310v1
PDF http://arxiv.org/pdf/1703.02310v1.pdf
PWC https://paperswithcode.com/paper/deep-robust-kalman-filter
Repo
Framework