July 27, 2019

2991 words 15 mins read

Paper Group ANR 562

ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models

Title ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models
Authors Talal Ahmed, Waheed U. Bajwa
Abstract Statistical inference can be computationally prohibitive in ultrahigh-dimensional linear models. Correlation-based variable screening, in which one leverages marginal correlations for removal of irrelevant variables from the model prior to statistical inference, can be used to overcome this challenge. Prior works on correlation-based variable screening either impose strong statistical priors on the linear model or assume specific post-screening inference methods. This paper first extends the analysis of correlation-based variable screening to arbitrary linear models and post-screening inference techniques. In particular, ($i$) it shows that a condition—termed the screening condition—is sufficient for successful correlation-based screening of linear models, and ($ii$) it provides insights into the dependence of marginal correlation-based screening on different problem parameters. Numerical experiments confirm that these insights are not mere artifacts of analysis; rather, they are reflective of the challenges associated with marginal correlation-based variable screening. Second, the paper explicitly derives the screening condition for two families of linear models, namely, sub-Gaussian linear models and arbitrary (random or deterministic) linear models. In the process, it establishes that—under appropriate conditions—it is possible to reduce the dimension of an ultrahigh-dimensional, arbitrary linear model to almost the sample size even when the number of active variables scales almost linearly with the sample size.
Tasks
Published 2017-08-21
URL http://arxiv.org/abs/1708.06077v1
PDF http://arxiv.org/pdf/1708.06077v1.pdf
PWC https://paperswithcode.com/paper/exsis-extended-sure-independence-screening
Repo
Framework
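The screening step described in the abstract, ranking variables by absolute marginal correlation with the response and keeping only the strongest ones, can be sketched in a few lines. This is a generic toy illustration of marginal correlation screening, not the paper's ExSIS procedure or its screening-condition analysis; the data, dimensions, and function name are made up.

```python
import numpy as np

def marginal_screen(X, y, d):
    """Keep the d variables with the largest absolute marginal
    correlation with the response y (the generic screening step)."""
    # Standardize columns and center y so inner products are
    # proportional to marginal correlations.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / len(y)
    # Indices of the d most correlated variables, in sorted order.
    return np.sort(np.argsort(corr)[-d:])

# Toy ultrahigh-ish model: only variables 0 and 1 are active.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)
kept = marginal_screen(X, y, d=5)
print(kept)  # the active variables 0 and 1 should survive screening
```

The point of the paper is characterizing when this cheap step is safe (the "screening condition"); the sketch only shows the mechanics of the dimension reduction itself.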

On Encoding Temporal Evolution for Real-time Action Prediction

Title On Encoding Temporal Evolution for Real-time Action Prediction
Authors Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis
Abstract Anticipating future actions is a key component of intelligence, particularly for real-time systems such as robots or autonomous cars. While recent works have addressed prediction of raw RGB pixel values, we focus on anticipating the motion evolution in future video frames. To this end, we construct dynamic images (DIs) by summarising moving pixels through a sequence of future frames. We train convolutional LSTMs to predict the next DIs in an unsupervised manner, and then recognise the activity associated with the predicted DI. We demonstrate the effectiveness of our approach on three benchmark action datasets, showing that despite running on videos with complex activities, our approach anticipates the next human action with high accuracy and obtains better results than state-of-the-art methods.
Tasks
Published 2017-09-22
URL http://arxiv.org/abs/1709.07894v3
PDF http://arxiv.org/pdf/1709.07894v3.pdf
PWC https://paperswithcode.com/paper/on-encoding-temporal-evolution-for-real-time
Repo
Framework
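The "dynamic image" construction the abstract mentions is commonly computed by approximate rank pooling (Bilen et al.'s dynamic image work), which reduces a clip to a single weighted sum of frames with weight 2t - T - 1 on frame t. Whether this paper uses exactly these weights is an assumption; the sketch below only illustrates the general DI idea on toy frames.

```python
import numpy as np

def dynamic_image(frames):
    """Summarise a clip into one 'dynamic image' via approximate rank
    pooling: a weighted sum of frames with weights alpha_t = 2t - T - 1,
    t = 1..T, so later frames get positive weight and earlier ones
    negative weight."""
    T = len(frames)
    alphas = 2 * np.arange(1, T + 1) - T - 1
    return np.tensordot(alphas, np.asarray(frames, dtype=float), axes=1)

# Toy clip of four 2x2 frames whose brightness ramps up over time.
frames = [np.full((2, 2), t, dtype=float) for t in range(1, 5)]
di = dynamic_image(frames)
print(di)  # every pixel is positive: brightness increases over the clip
```

Static pixels cancel out (the weights sum to zero), so the DI encodes only temporal evolution, which is what the paper then asks a convolutional LSTM to predict.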

The 2017 Hands in the Million Challenge on 3D Hand Pose Estimation

Title The 2017 Hands in the Million Challenge on 3D Hand Pose Estimation
Authors Shanxin Yuan, Qi Ye, Guillermo Garcia-Hernando, Tae-Kyun Kim
Abstract We present the 2017 Hands in the Million Challenge, a public competition designed for the evaluation of the task of 3D hand pose estimation. The goal of this challenge is to assess how far the state of the art has come in solving 3D hand pose estimation, and to detect major failure and strength modes of both systems and evaluation metrics that can help identify future research directions. The challenge follows up on the recent publication of the BigHand2.2M and First-Person Hand Action datasets, which were designed to exhaustively cover multiple hands, viewpoints, hand articulations, and occlusions. The challenge consists of a standardized dataset, an evaluation protocol for two different tasks, and a public competition. This document describes the different aspects of the challenge; together with the results of the participants, it will be presented at the 3rd International Workshop on Observing and Understanding Hands in Action (HANDS 2017), held with ICCV 2017.
Tasks Hand Pose Estimation, Pose Estimation
Published 2017-07-07
URL http://arxiv.org/abs/1707.02237v1
PDF http://arxiv.org/pdf/1707.02237v1.pdf
PWC https://paperswithcode.com/paper/the-2017-hands-in-the-million-challenge-on-3d
Repo
Framework

Greed is Good: Near-Optimal Submodular Maximization via Greedy Optimization

Title Greed is Good: Near-Optimal Submodular Maximization via Greedy Optimization
Authors Moran Feldman, Christopher Harshaw, Amin Karbasi
Abstract It is known that greedy methods perform well for maximizing monotone submodular functions. At the same time, such methods perform poorly in the face of non-monotonicity. In this paper, we show, perhaps surprisingly, that invoking the classical greedy algorithm $O(\sqrt{k})$ times leads to the (currently) fastest deterministic algorithm, called Repeated Greedy, for maximizing a general submodular function subject to $k$-independent system constraints. Repeated Greedy achieves a $(1 + O(1/\sqrt{k}))k$-approximation using $O(nr\sqrt{k})$ function evaluations (here, $n$ and $r$ denote the size of the ground set and the maximum size of a feasible solution, respectively). We then show that by a careful sampling procedure, we can run the greedy algorithm only once and obtain the (currently) fastest randomized algorithm, called Sample Greedy, for maximizing a submodular function subject to $k$-extendible system constraints (a subclass of $k$-independent system constraints). Sample Greedy achieves a $(k + 3)$-approximation with only $O(nr/k)$ function evaluations. Finally, we derive an almost matching lower bound, showing that no polynomial-time algorithm can have an approximation ratio smaller than $k + 1/2 - \varepsilon$. To further support our theoretical results, we compare the performance of Repeated Greedy and Sample Greedy with prior art in a concrete application (movie recommendation). We consistently observe that while Sample Greedy achieves practically the same utility as the best baseline, it runs at least two orders of magnitude faster.
Tasks
Published 2017-04-05
URL http://arxiv.org/abs/1704.01652v1
PDF http://arxiv.org/pdf/1704.01652v1.pdf
PWC https://paperswithcode.com/paper/greed-is-good-near-optimal-submodular
Repo
Framework
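The classical greedy subroutine that Repeated Greedy invokes $O(\sqrt{k})$ times can be sketched for the simplest constraint, a cardinality bound, on a toy coverage objective. The paper's $k$-independent and $k$-extendible systems are far more general; this is just the base algorithm, with made-up data.

```python
def greedy(ground, f, k):
    """Classical greedy for a submodular set function f under a
    cardinality-k constraint: repeatedly add the element with the
    largest marginal gain, stopping when no gain is positive."""
    S = set()
    for _ in range(k):
        candidates = ground - S
        if not candidates:
            break
        best = max(candidates, key=lambda e: f(S | {e}) - f(S))
        if f(S | {best}) - f(S) <= 0:
            break
        S.add(best)
    return S

# Toy monotone submodular objective: coverage of a universe of items.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}

def cover(S):
    covered = set()
    for e in S:
        covered |= sets[e]
    return len(covered)

S = greedy(set(sets), cover, k=2)
print(S, cover(S))  # two sets that together cover all six items
```

Repeated Greedy reruns this on shrinking ground sets and keeps the best of the resulting solutions, which is how it copes with non-monotonicity.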

On the Reconstruction of Face Images from Deep Face Templates

Title On the Reconstruction of Face Images from Deep Face Templates
Authors Guangcan Mai, Kai Cao, Pong C. Yuen, Anil K. Jain
Abstract State-of-the-art face recognition systems are based on deep (convolutional) neural networks. Therefore, it is imperative to determine to what extent face templates derived from deep networks can be inverted to obtain the original face image. In this paper, we study the vulnerabilities of a state-of-the-art face recognition system via a template reconstruction attack. We propose a neighborly de-convolutional neural network (\textit{NbNet}) to reconstruct face images from their deep templates. In our experiments, we assumed that no knowledge about the target subject or the deep network is available. To train the \textit{NbNet} reconstruction models, we augmented two benchmark face datasets (VGG-Face and Multi-PIE) with a large collection of images synthesized using a face generator. The proposed reconstruction was evaluated using type-I attacks (comparing the reconstructed images against the original face images used to generate the deep templates) and type-II attacks (comparing the reconstructed images against a different face image of the same subject). Given the images reconstructed from \textit{NbNets}, we show that for verification we achieve a TAR of 95.20% (58.05%) on LFW under type-I (type-II) attacks at a FAR of 0.1%. In addition, 96.58% (92.84%) of the images reconstructed from templates of partition \textit{fa} (\textit{fb}) can be identified from partition \textit{fa} in color FERET. Our study demonstrates the need to secure deep templates in face recognition systems.
Tasks Face Recognition
Published 2017-03-02
URL http://arxiv.org/abs/1703.00832v4
PDF http://arxiv.org/pdf/1703.00832v4.pdf
PWC https://paperswithcode.com/paper/on-the-reconstruction-of-face-images-from
Repo
Framework

Automatic Measurement of Pre-aspiration

Title Automatic Measurement of Pre-aspiration
Authors Yaniv Sheena, Míša Hejná, Yossi Adi, Joseph Keshet
Abstract Pre-aspiration is defined as the period of glottal friction occurring in sequences of vocalic/consonantal sonorants and phonetically voiceless obstruents. We propose two machine learning methods for automatic measurement of pre-aspiration duration: a feedforward neural network, which works at the frame level; and a structured prediction model, which relies on manually designed feature functions, and works at the segment level. The input for both algorithms is a speech signal of an arbitrary length containing a single obstruent, and the output is a pair of times which constitutes the pre-aspiration boundaries. We train both models on a set of manually annotated examples. Results suggest that the structured model is superior to the frame-based model as it yields higher accuracy in predicting the boundaries and generalizes to new speakers and new languages. Finally, we demonstrate the applicability of our structured prediction algorithm by replicating linguistic analysis of pre-aspiration in Aberystwyth English with high correlation.
Tasks Structured Prediction
Published 2017-04-05
URL http://arxiv.org/abs/1704.01653v3
PDF http://arxiv.org/pdf/1704.01653v3.pdf
PWC https://paperswithcode.com/paper/automatic-measurement-of-pre-aspiration
Repo
Framework

Word Affect Intensities

Title Word Affect Intensities
Authors Saif M. Mohammad
Abstract Words often convey affect: emotions, feelings, and attitudes. Lexicons of word-affect associations have applications in automatic emotion analysis and natural language generation. However, existing lexicons indicate only coarse categories of affect association. Here, for the first time, we create an affect intensity lexicon with real-valued scores of association. We use a technique called best-worst scaling that improves annotation consistency and obtains reliable fine-grained scores. The lexicon includes terms common in general English as well as terms specific to social media communications. It has close to 6,000 entries for four basic emotions. We will be adding entries for other affect dimensions shortly.
Tasks Emotion Recognition, Text Generation
Published 2017-04-28
URL http://arxiv.org/abs/1704.08798v1
PDF http://arxiv.org/pdf/1704.08798v1.pdf
PWC https://paperswithcode.com/paper/word-affect-intensities
Repo
Framework
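The best-worst scaling technique the abstract mentions has a simple counting core: annotators see small tuples of terms and mark the most and least intense; a term's score is the fraction of times it was chosen best minus the fraction of times it was chosen worst. This sketch shows that counting step only, with invented annotations; the lexicon itself rescales scores and uses the authors' full annotation protocol.

```python
from collections import Counter

def bws_scores(annotations):
    """Best-worst scaling counts: each annotation is (tuple_of_terms,
    best_term, worst_term); a term's score is
    (#chosen best - #chosen worst) / #appearances, in [-1, 1]."""
    best, worst, seen = Counter(), Counter(), Counter()
    for terms, b, w in annotations:
        seen.update(terms)
        best[b] += 1
        worst[w] += 1
    return {t: (best[t] - worst[t]) / seen[t] for t in seen}

# Two hypothetical 4-tuple annotations for anger intensity.
anns = [
    (("furious", "annoyed", "calm", "upset"), "furious", "calm"),
    (("furious", "mad", "calm", "irked"), "furious", "calm"),
]
scores = bws_scores(anns)
print(scores["furious"], scores["calm"])  # 1.0 -1.0
```

Because each judgment is a comparison rather than an absolute rating, the resulting real-valued scores are far more consistent across annotators than direct intensity ratings.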

BigHand2.2M Benchmark: Hand Pose Dataset and State of the Art Analysis

Title BigHand2.2M Benchmark: Hand Pose Dataset and State of the Art Analysis
Authors Shanxin Yuan, Qi Ye, Bjorn Stenger, Siddhant Jain, Tae-Kyun Kim
Abstract In this paper we introduce a large-scale hand pose dataset, collected using a novel capture method. Existing datasets are either generated synthetically or captured using depth sensors: synthetic datasets exhibit a certain level of appearance difference from real depth images, and real datasets are limited in quantity and coverage, mainly due to the difficulty of annotating them. We propose a tracking system with six 6D magnetic sensors and inverse kinematics to automatically obtain 21-joint hand pose annotations of depth maps, captured with minimal restriction on the range of motion. The capture protocol aims to fully cover the natural hand pose space. As shown in embedding plots, the new dataset exhibits a significantly wider and denser range of hand poses than existing benchmarks. Current state-of-the-art methods are evaluated on the dataset, and we demonstrate significant improvements in cross-benchmark performance. We also show significant improvements in egocentric hand pose estimation with a CNN trained on the new dataset.
Tasks Hand Pose Estimation, Pose Estimation
Published 2017-04-09
URL http://arxiv.org/abs/1704.02612v2
PDF http://arxiv.org/pdf/1704.02612v2.pdf
PWC https://paperswithcode.com/paper/bighand22m-benchmark-hand-pose-dataset-and
Repo
Framework

Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor

Title Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor
Authors Franziska Mueller, Dushyant Mehta, Oleksandr Sotnychenko, Srinath Sridhar, Dan Casas, Christian Theobalt
Abstract We present an approach for real-time, robust and accurate hand pose estimation from moving egocentric RGB-D cameras in cluttered real environments. Existing methods typically fail for hand-object interactions in cluttered scenes imaged from egocentric viewpoints, common for virtual or augmented reality applications. Our approach uses two subsequently applied Convolutional Neural Networks (CNNs) to localize the hand and regress 3D joint locations. Hand localization is achieved by using a CNN to estimate the 2D position of the hand center in the input, even in the presence of clutter and occlusions. The localized hand position, together with the corresponding input depth value, is used to generate a normalized cropped image that is fed into a second CNN to regress relative 3D hand joint locations in real time. For added accuracy, robustness and temporal stability, we refine the pose estimates using a kinematic pose tracking energy. To train the CNNs, we introduce a new photorealistic dataset that uses a merged reality approach to capture and synthesize large amounts of annotated data of natural hand interaction in cluttered scenes. Through quantitative and qualitative evaluation, we show that our method is robust to self-occlusion and occlusions by objects, particularly in moving egocentric perspectives.
Tasks Hand Pose Estimation, Pose Estimation, Pose Tracking
Published 2017-04-07
URL http://arxiv.org/abs/1704.02201v2
PDF http://arxiv.org/pdf/1704.02201v2.pdf
PWC https://paperswithcode.com/paper/real-time-hand-tracking-under-occlusion-from
Repo
Framework

Efficient Registration of Pathological Images: A Joint PCA/Image-Reconstruction Approach

Title Efficient Registration of Pathological Images: A Joint PCA/Image-Reconstruction Approach
Authors Xu Han, Xiao Yang, Stephen Aylward, Roland Kwitt, Marc Niethammer
Abstract Registration involving one or more images containing pathologies is challenging, as standard image similarity measures and spatial transforms cannot account for common changes due to pathologies. Low-rank/Sparse (LRS) decomposition removes pathologies prior to registration; however, LRS is memory-demanding and slow, which limits its use on larger data sets. Additionally, LRS blurs normal tissue regions, which may degrade registration performance. This paper proposes an efficient alternative to LRS: (1) normal tissue appearance is captured by principal component analysis (PCA) and (2) blurring is avoided by an integrated model for pathology removal and image reconstruction. Results on synthetic and BRATS 2015 data demonstrate its utility.
Tasks Image Reconstruction
Published 2017-03-31
URL http://arxiv.org/abs/1704.00036v1
PDF http://arxiv.org/pdf/1704.00036v1.pdf
PWC https://paperswithcode.com/paper/efficient-registration-of-pathological-images
Repo
Framework
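The PCA half of the paper's idea, capture normal tissue appearance with a low-rank basis and reconstruct a quasi-normal image from it, can be sketched on toy vectors. The data below are random stand-ins rather than BRATS images, and the sketch omits the paper's joint registration/reconstruction model entirely; it only shows why projecting onto a normal-appearance basis suppresses a pathology.

```python
import numpy as np

rng = np.random.default_rng(0)

# 40 "normal" images, flattened to 100-pixel vectors (toy stand-ins).
normals = rng.normal(size=(40, 100))
mean = normals.mean(axis=0)
_, _, Vt = np.linalg.svd(normals - mean, full_matrices=False)
basis = Vt[:10]                     # top-10 principal components

# A pathological image: a normal image plus a bright "lesion".
lesion = 5.0 * (np.arange(100) >= 90)
patho = normals[0] + lesion

# Quasi-normal reconstruction: project onto the normal-appearance basis.
coeff = basis @ (patho - mean)
recon = mean + coeff @ basis

# The lesion is largely suppressed: it lies mostly outside the basis.
print(np.linalg.norm((recon - normals[0])[90:]) <
      np.linalg.norm((patho - normals[0])[90:]))  # True
```

Registering the quasi-normal reconstruction instead of the pathological image is what lets standard similarity measures work again.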

3D Face Reconstruction with Geometry Details from a Single Image

Title 3D Face Reconstruction with Geometry Details from a Single Image
Authors Luo Jiang, Juyong Zhang, Bailin Deng, Hao Li, Ligang Liu
Abstract 3D face reconstruction from a single image is a classical and challenging problem, with wide applications in many areas. Inspired by recent works in face animation from RGB-D or monocular video inputs, we develop a novel method for reconstructing 3D faces from unconstrained 2D images, using a coarse-to-fine optimization strategy. First, a smooth coarse 3D face is generated from an example-based bilinear face model, by aligning the projection of 3D face landmarks with 2D landmarks detected from the input image. Afterwards, using local corrective deformation fields, the coarse 3D face is refined using photometric consistency constraints, resulting in a medium face shape. Finally, a shape-from-shading method is applied on the medium face to recover fine geometric details. Our method outperforms state-of-the-art approaches in terms of accuracy and detail recovery, which is demonstrated in extensive experiments using real world models and publicly available datasets.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2017-02-18
URL http://arxiv.org/abs/1702.05619v2
PDF http://arxiv.org/pdf/1702.05619v2.pdf
PWC https://paperswithcode.com/paper/3d-face-reconstruction-with-geometry-details
Repo
Framework

Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages

Title Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages
Authors Zheng Wei, Erin M. Conlon
Abstract Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group-specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.
Tasks
Published 2017-12-16
URL http://arxiv.org/abs/1712.05907v2
PDF http://arxiv.org/pdf/1712.05907v2.pdf
PWC https://paperswithcode.com/paper/parallel-markov-chain-monte-carlo-for
Repo
Framework
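The two-stage scheme can be sketched on a toy two-level normal model with known unit variances. Stage 1 samples each group's parameter independently under a flat prior (embarrassingly parallel in practice; run sequentially here for brevity), and stage 2 uses those draws as independence proposals: because each proposal is proportional to its group likelihood, the Metropolis-Hastings ratio reduces to a ratio of priors. All modeling choices are simplified stand-ins for the paper's three- and four-level models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_gj ~ N(theta_g, 1), theta_g ~ N(mu, 1), flat prior on mu.
groups = [rng.normal(loc=m, size=50) for m in (-1.0, 0.0, 1.0)]

# Stage 1: per group, theta_g | y_g ~ N(ybar_g, 1/n_g) under a flat prior.
stage1 = [rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=4000)
          for y in groups]

# Stage 2: stage-1 draws are independence proposals for each theta_g;
# the MH acceptance ratio reduces to a ratio of N(mu, 1) prior densities.
theta = np.array([s[0] for s in stage1])
mu, mu_draws = 0.0, []
for _ in range(4000):
    for g, s in enumerate(stage1):
        prop = s[rng.integers(len(s))]
        log_ratio = 0.5 * ((theta[g] - mu) ** 2 - (prop - mu) ** 2)
        if np.log(rng.uniform()) < log_ratio:
            theta[g] = prop
    # Conjugate update: mu | theta ~ N(mean(theta), 1/G).
    mu = rng.normal(theta.mean(), 1 / np.sqrt(len(theta)))
    mu_draws.append(mu)

print(round(float(np.mean(mu_draws[500:])), 2))  # posterior mean of mu, near 0
```

The expensive per-group sampling happens once, in parallel, and stage 2 only recombines stored draws, which is the source of the reported efficiency gains.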

Variance-Reduced Stochastic Learning under Random Reshuffling

Title Variance-Reduced Stochastic Learning under Random Reshuffling
Authors Bicheng Ying, Kun Yuan, Ali H. Sayed
Abstract Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizer. The existing convergence results assume uniform data sampling with replacement. However, it has been observed in related works that random reshuffling can deliver superior performance over uniform sampling and, yet, no formal proofs or guarantees of exact convergence exist for variance-reduced algorithms under random reshuffling. This paper makes two contributions. First, it resolves this open issue and provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA; the argument is also adaptable to other variance-reduced algorithms. Second, under random reshuffling, the paper proposes a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements compared to SAGA and with balanced gradient computations compared to SVRG. AVRG is also shown analytically to converge linearly.
Tasks
Published 2017-08-04
URL http://arxiv.org/abs/1708.01383v3
PDF http://arxiv.org/pdf/1708.01383v3.pdf
PWC https://paperswithcode.com/paper/variance-reduced-stochastic-learning-under
Repo
Framework
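The sampling change the paper analyzes, running SAGA over a fresh random permutation of the data each epoch instead of sampling with replacement, is easy to sketch. This is plain SAGA with reshuffling on a toy least-squares problem, not the paper's AVRG algorithm; the step size and data are made up.

```python
import numpy as np

def saga_rr(grad_i, w, n, epochs, lr):
    """SAGA with random-reshuffling sampling: each epoch visits all n
    component gradients in a fresh random order."""
    table = np.zeros((n,) + w.shape)        # stored per-sample gradients
    avg = table.mean(axis=0)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        for i in rng.permutation(n):        # reshuffle every epoch
            g = grad_i(w, i)
            w = w - lr * (g - table[i] + avg)
            avg = avg + (g - table[i]) / n  # keep the running mean in sync
            table[i] = g
    return w

# Least squares: f(w) = (1/2n) sum_i (x_i . w - y_i)^2, minimizer w_true.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i]
w = saga_rr(grad_i, np.zeros(3), n=100, epochs=100, lr=0.01)
print(np.round(w, 3))  # should converge to w_true = [1, -2, 0.5]
```

Because the objective is noise-free at its minimizer, the variance-reduced update converges to the exact solution, matching the linear-convergence guarantee the paper establishes for SAGA under reshuffling.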

Deep Exploration via Randomized Value Functions

Title Deep Exploration via Randomized Value Functions
Authors Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen
Abstract We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a tabular representation.
Tasks Efficient Exploration
Published 2017-03-22
URL https://arxiv.org/abs/1703.07608v5
PDF https://arxiv.org/pdf/1703.07608v5.pdf
PWC https://paperswithcode.com/paper/deep-exploration-via-randomized-value
Repo
Framework
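The core idea, acting greedily with respect to a randomly sampled value function rather than the mean estimate, is visible even in the simplest setting. This toy bandit sketch collapses to Thompson-style sampling and omits the paper's multi-step machinery (e.g. randomized least-squares value iteration); the arm means and noise scales are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three-armed bandit: keep a Gaussian value estimate per arm, and each
# round act greedily w.r.t. a *sampled* value function, so uncertainty
# itself drives deep-enough exploration.
true_means = np.array([0.1, 0.5, 0.9])
counts = np.ones(3)   # pseudo-count of 1 per arm to seed the estimates
means = np.zeros(3)
for t in range(2000):
    sampled_q = rng.normal(means, 1 / np.sqrt(counts))  # randomized values
    a = int(np.argmax(sampled_q))                       # greedy on the sample
    r = rng.normal(true_means[a], 0.1)
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]              # incremental mean
print(int(np.argmax(counts)))  # the best arm, index 2, is pulled most
```

As an arm's count grows, its sampled values concentrate around the empirical mean, so exploration shuts off automatically for well-understood actions.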

Deep Robust Kalman Filter

Title Deep Robust Kalman Filter
Authors Shirli Di-Castro Shashua, Shie Mannor
Abstract A Robust Markov Decision Process (RMDP) is a sequential decision making model that accounts for uncertainty in the parameters of dynamic systems. This uncertainty introduces difficulties in learning an optimal policy, especially for environments with large state spaces. We propose two algorithms, RTD-DQN and Deep-RoK, for solving large-scale RMDPs using nonlinear approximation schemes such as deep neural networks. The RTD-DQN algorithm incorporates the robust Bellman temporal difference error into a robust loss function, yielding robust policies for the agent. The Deep-RoK algorithm is a robust Bayesian method, based on the Extended Kalman Filter (EKF), that accounts for both the uncertainty in the weights of the approximated value function and the uncertainty in the transition probabilities, improving the robustness of the agent. We provide theoretical results for our approach and test the proposed algorithms on a continuous state domain.
Tasks Decision Making
Published 2017-03-07
URL http://arxiv.org/abs/1703.02310v1
PDF http://arxiv.org/pdf/1703.02310v1.pdf
PWC https://paperswithcode.com/paper/deep-robust-kalman-filter
Repo
Framework