January 26, 2020


Paper Group ANR 1426

POSEAMM: A Unified Framework for Solving Pose Problems using an Alternating Minimization Method. Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes. Entropic Risk Measure in Policy Search. InfoRL: Interpretable Reinforcement Learning using Information Maximization. Tensor Sparse PCA and Face Recognition: A Novel …

POSEAMM: A Unified Framework for Solving Pose Problems using an Alternating Minimization Method


Title	POSEAMM: A Unified Framework for Solving Pose Problems using an Alternating Minimization Method
Authors	Joao Campos, Joao R. Cardoso, Pedro Miraldo
Abstract	Pose estimation is one of the most important problems in computer vision. It can be divided into two categories – absolute and relative – and may involve two types of camera models: central and non-central. State-of-the-art methods have been designed to solve these problems separately. This paper presents a unified framework that is able to solve any pose problem by alternating optimization between two sets of parameters, rotation and translation. To make this possible, it is necessary to define an objective function that captures the problem at hand. Since the objective function depends on both the rotation and the translation, it cannot be solved as a simple minimization problem. Hence the use of Alternating Minimization methods, in which the function is alternately minimized with respect to the rotation and the translation. We show how to use our framework in three distinct pose problems. Our methods are then benchmarked with both synthetic and real data, showing a better balance between computational time and accuracy.
Tasks	Pose Estimation
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04858v1
PDF	http://arxiv.org/pdf/1904.04858v1.pdf
PWC	https://paperswithcode.com/paper/poseamm-a-unified-framework-for-solving-pose
Repo
Framework
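
As a rough illustration of the alternating scheme described in the abstract, the sketch below alternates closed-form rotation and translation updates on a toy 2D point-registration objective, sum ||R(theta) p + t - q||^2. The objective, the function names (`apply_pose`, `alternate_rotation_translation`) and the sample points are assumptions chosen for illustration; they are not the objective functions or solvers used in the paper.

```python
import math


def apply_pose(points, theta, t):
    """Rotate 2D points by theta and translate by t (helper for the toy example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


def alternate_rotation_translation(src, dst, iterations=100):
    """Alternately minimize sum ||R(theta) p + t - q||^2 over theta and t."""
    theta, tx, ty = 0.0, 0.0, 0.0
    n = len(src)
    for _ in range(iterations):
        # Rotation step: with t fixed, the optimal angle has a closed form.
        dot = cross = 0.0
        for (px, py), (qx, qy) in zip(src, dst):
            ux, uy = qx - tx, qy - ty
            dot += px * ux + py * uy
            cross += px * uy - py * ux
        theta = math.atan2(cross, dot)
        # Translation step: with theta fixed, the optimal t is the mean residual.
        c, s = math.cos(theta), math.sin(theta)
        tx = sum(qx - (c * px - s * py) for (px, py), (qx, _) in zip(src, dst)) / n
        ty = sum(qy - (s * px + c * py) for (px, py), (_, qy) in zip(src, dst)) / n
    return theta, (tx, ty)


# Hypothetical noise-free data: dst is src under a known pose.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 0.5)]
dst = apply_pose(src, 0.7, (1.5, -0.3))
theta, t = alternate_rotation_translation(src, dst)
print(theta, t)
```

Each half-step solves its subproblem exactly, so the objective never increases. On this toy problem a joint closed-form solution also exists; alternation is shown only to mirror the structure the abstract describes.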

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes


Title	Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes
Authors	Alekh Agarwal, Sham M. Kakade, Jason D. Lee, Gaurav Mahajan
Abstract	Policy gradient methods are among the most effective methods in challenging reinforcement learning problems with large state and/or action spaces. However, little is known about even their most basic theoretical convergence properties, including: if and how fast they converge to a globally optimal solution (say with a sufficiently rich policy class); how they cope with approximation error due to using a restricted class of parametric policies; or their finite sample behavior. Such characterizations are important not only to compare these methods to their approximate value function counterparts (where such issues are relatively well understood, at least in the worst case), but also to help with more principled approaches to algorithm design. This work provides provable characterizations of computational, approximation, and sample size issues with regard to policy gradient methods in the context of discounted Markov Decision Processes (MDPs). We focus on both: 1) “tabular” policy parameterizations, where the optimal policy is contained in the class and where we show global convergence to the optimal policy, and 2) restricted policy classes, which may not contain the optimal policy and where we provide agnostic learning results. One insight of this work is in formalizing how a favorable initial state distribution provides a means to circumvent worst-case exploration issues. Overall, these results place policy gradient methods on a solid theoretical footing, analogous to the global convergence guarantees of iterative value function based algorithms.
Tasks	Policy Gradient Methods
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00261v2
PDF	https://arxiv.org/pdf/1908.00261v2.pdf
PWC	https://paperswithcode.com/paper/optimality-and-approximation-with-policy
Repo
Framework

Entropic Risk Measure in Policy Search


Title	Entropic Risk Measure in Policy Search
Authors	David Nass, Boris Belousov, Jan Peters
Abstract	With the increasing pace of automation, modern robotic systems need to act in stochastic, non-stationary, partially observable environments. A range of algorithms for finding parameterized policies that optimize for long-term average performance have been proposed in the past. However, the majority of the proposed approaches do not explicitly take into account the variability of the performance metric, which may lead to finding policies that, although performing well on average, can perform spectacularly badly in a particular run or over a period of time. To address this shortcoming, we study an approach to policy optimization that explicitly takes into account higher order statistics of the reward function. In this paper, we extend policy gradient methods to include the entropic risk measure in the objective function and evaluate their performance in simulation experiments and on a real-robot task of learning a hitting motion in robot badminton.
Tasks	Policy Gradient Methods
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09090v2
PDF	https://arxiv.org/pdf/1906.09090v2.pdf
PWC	https://paperswithcode.com/paper/entropic-risk-measure-in-policy-search
Repo
Framework
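
For readers unfamiliar with the entropic risk measure mentioned in the abstract, one common form is (1/beta) * log E[exp(beta * R)] for a return R and risk parameter beta. The sketch below estimates it from sampled returns. The sign convention (negative beta penalizes variability of rewards) and the function name `entropic_risk` are assumptions here; the paper may parameterize the objective differently.

```python
import math


def entropic_risk(returns, beta):
    """Sample estimate of (1/beta) * log(mean(exp(beta * R)))."""
    if beta == 0:
        # Risk-neutral limit: the plain average return.
        return sum(returns) / len(returns)
    scaled = [beta * r for r in returns]
    shift = max(scaled)  # subtract the max before exponentiating for stability
    log_mean_exp = shift + math.log(
        sum(math.exp(s - shift) for s in scaled) / len(scaled)
    )
    return log_mean_exp / beta


# Hypothetical returns with the same mean but very different spread.
steady = [5.0, 5.0, 5.0, 5.0]
erratic = [0.0, 10.0, 0.0, 10.0]
print(entropic_risk(steady, -1.0), entropic_risk(erratic, -1.0))
```

With beta < 0, Jensen's inequality makes the value no larger than the mean return, so the erratic policy scores lower than the steady one even though both average 5.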

InfoRL: Interpretable Reinforcement Learning using Information Maximization


Title	InfoRL: Interpretable Reinforcement Learning using Information Maximization
Authors	Aadil Hayat, Utsav Singh, Vinay P. Namboodiri
Abstract	Recent advances in reinforcement learning have proved that given an environment we can learn to perform a task in that environment if we have access to some form of a reward function (dense, sparse or derived from IRL). But most of the algorithms focus on learning a single best policy to perform a given set of tasks. In this paper, we focus on an algorithm that learns not just to perform a task but to perform the same task in different ways. As we know, when the environment is complex enough there always exist multiple ways to perform a task. We show that using the concept of information maximization it is possible to learn latent codes for discovering multiple ways to perform any given task in an environment.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10404v1
PDF	https://arxiv.org/pdf/1905.10404v1.pdf
PWC	https://paperswithcode.com/paper/inforl-interpretable-reinforcement-learning
Repo
Framework

Tensor Sparse PCA and Face Recognition: A Novel Approach


Title	Tensor Sparse PCA and Face Recognition: A Novel Approach
Authors	Loc Hoang Tran, Linh Hoang Tran
Abstract	Face recognition is an important field in machine learning and pattern recognition research. It has many applications in military, finance, and public security, to name a few. In this paper, the combination of the tensor sparse PCA with the nearest-neighbor method (and with the kernel ridge regression method) is proposed and applied to the face dataset. Experimental results show that the combination of the tensor sparse PCA with any classification system does not always reach the best accuracy performance measures. However, the accuracy of the combination of the sparse PCA method and one specific classification system is always better than the accuracy of the combination of the PCA method and one specific classification system and is always better than the accuracy of the classification system itself.
Tasks	Face Recognition
Published	2019-04-12
URL	https://arxiv.org/abs/1904.08496v3
PDF	https://arxiv.org/pdf/1904.08496v3.pdf
PWC	https://paperswithcode.com/paper/190408496
Repo
Framework

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies


Title	Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Authors	Kaiqing Zhang, Alec Koppel, Hao Zhu, Tamer Başar
Abstract	Policy gradient (PG) methods are a widely used reinforcement learning methodology in many applications such as video games, autonomous driving, and robotics. In spite of their empirical success, a rigorous understanding of the global convergence of PG methods is lacking in the literature. In this work, we close the gap by viewing PG methods from a nonconvex optimization perspective. In particular, we propose a new variant of PG methods for infinite-horizon problems that uses a random rollout horizon for the Monte-Carlo estimation of the policy gradient. This method then yields an unbiased estimate of the policy gradient with bounded variance, which enables the tools from nonconvex optimization to be applied to establish global convergence. Employing this perspective, we first recover the convergence results with rates to the stationary-point policies in the literature. More interestingly, motivated by advances in nonconvex optimization, we modify the proposed PG method by introducing periodically enlarged stepsizes. The modified algorithm is shown to escape saddle points under mild assumptions on the reward and the policy parameterization. Under a further strict saddle points assumption, this result establishes convergence to essentially locally-optimal policies of the underlying problem, and thus bridges the gap in existing literature on the convergence of PG methods. Results from experiments on the inverted pendulum are then provided to corroborate our theory, namely, by slightly reshaping the reward function to satisfy our assumption, unfavorable saddle points can be avoided and better limit points can be attained. Intriguingly, this empirical finding justifies the benefit of reward-reshaping from a nonconvex optimization perspective.
Tasks	Autonomous Driving, Policy Gradient Methods
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08383v2
PDF	https://arxiv.org/pdf/1906.08383v2.pdf
PWC	https://paperswithcode.com/paper/global-convergence-of-policy-gradient-methods-3
Repo
Framework
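
The random rollout horizon in the abstract rests on a simple identity: if the horizon T satisfies P(T >= t) = gamma^t, then the expected undiscounted reward sum up to T equals the infinite-horizon discounted return. The sketch below checks that identity numerically. It illustrates the general idea only and is not the paper's exact estimator; `sample_horizon`, `random_horizon_return` and the constant reward are hypothetical.

```python
import math
import random


def sample_horizon(gamma, rng):
    """Draw T on {0, 1, ...} with P(T >= t) = gamma**t."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log(gamma)))


def random_horizon_return(reward_fn, gamma, rng):
    """Undiscounted sum of rewards over steps 0..T for a random horizon T."""
    horizon = sample_horizon(gamma, rng)
    return sum(reward_fn(t) for t in range(horizon + 1))


# Constant reward of 1 per step: the discounted return is 1 / (1 - gamma).
rng = random.Random(0)
gamma = 0.9
samples = [random_horizon_return(lambda t: 1.0, gamma, rng) for _ in range(20000)]
estimate = sum(samples) / len(samples)
print(estimate, 1.0 / (1.0 - gamma))
```

Because every rollout terminates after finitely many steps, the estimator needs no truncation of the infinite sum, which is what removes the truncation bias of a fixed-horizon rollout.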

Identification of relevant diffusion MRI metrics impacting cognitive functions using a novel feature selection method


Title	Identification of relevant diffusion MRI metrics impacting cognitive functions using a novel feature selection method
Authors	Tongda Xu, Xiyan Cai, Yao Wang, Xiuyuan Wang, Sohae Chung, Els Fieremans, Joseph Rath, Steven Flanagan, Yvonne W Lui
Abstract	Mild Traumatic Brain Injury (mTBI) is a significant public health problem. The most troubling symptoms after mTBI are cognitive complaints. Studies show measurable differences between patients with mTBI and healthy controls with respect to tissue microstructure using diffusion MRI. However, it remains unclear which diffusion measures are the most informative with regard to cognitive functions, both in the healthy state and after injury. In this study, we use diffusion MRI to formulate a predictive model for performance on working memory based on the most relevant MRI features. The key challenge is to identify relevant features over a large feature space with high accuracy in an efficient manner. To tackle this challenge, we propose a novel improvement of the best first search approach with crossover operators inspired by genetic algorithms. Compared against other heuristic feature selection algorithms, the proposed method achieves significantly more accurate predictions and yields clinically interpretable selected features.
Tasks	Feature Selection
Published	2019-08-10
URL	https://arxiv.org/abs/1908.04752v2
PDF	https://arxiv.org/pdf/1908.04752v2.pdf
PWC	https://paperswithcode.com/paper/identification-of-relevant-diffusion-mri
Repo
Framework

3DFR: A Swift 3D Feature Reductionist Framework for Scene Independent Change Detection


Title	3DFR: A Swift 3D Feature Reductionist Framework for Scene Independent Change Detection
Authors	Murari Mandal, Vansh Dhar, Abhishek Mishra, Santosh Kumar Vipparthi
Abstract	In this paper we propose an end-to-end swift 3D feature reductionist framework (3DFR) for scene independent change detection. The 3DFR framework consists of three feature streams: a swift 3D feature reductionist stream (AvFeat), a contemporary feature stream (ConFeat) and a temporal median feature map. These multilateral foreground/background features are further refined through an encoder-decoder network. As a result, the proposed framework not only detects temporal changes but also learns high-level appearance features. Thus, it incorporates the object semantics for effective change detection. Furthermore, the proposed framework is validated through a scene independent evaluation scheme in order to demonstrate the robustness and generalization capability of the network. The performance of the proposed method is evaluated on the benchmark CDnet 2014 dataset. The experimental results show that the proposed 3DFR network outperforms the state-of-the-art approaches.
Tasks
Published	2019-12-26
URL	https://arxiv.org/abs/1912.11891v1
PDF	https://arxiv.org/pdf/1912.11891v1.pdf
PWC	https://paperswithcode.com/paper/3dfr-a-swift-3d-feature-reductionist
Repo
Framework

Learning with Wasserstein barycenters and applications


Title	Learning with Wasserstein barycenters and applications
Authors	G. Domazakis, D. Drivaliaris, S. Koukoulas, G. Papayiannis, A. Tsekrekos, A. Yannacopoulos
Abstract	In this work, learning schemes for measure-valued data are proposed, i.e. data whose structure can be more efficiently represented as probability measures instead of points in $\mathbb{R}^d$, employing the concept of probability barycenters as defined with respect to the Wasserstein metric. Such learning approaches are valuable in many fields where the observational/experimental error is significant (e.g. astronomy, biology, remote sensing, etc.) or where the data are more complex and traditional learning algorithms are not applicable or effective (e.g. network data, interval data, high frequency records, matrix data, etc.). Under this perspective, each observation is identified by an appropriate probability measure and the proposed statistical learning schemes rely on discrimination criteria that utilize the geometric structure of the space of probability measures through core techniques from optimal transport theory. The discussed approaches are implemented in two real world applications: (a) clustering eurozone countries according to their observed government bond yield curves and (b) classifying the areas of a satellite image into land-use categories, which is a standard task in remote sensing. In both case studies the results are particularly interesting and meaningful while the accuracy obtained is high.
Tasks
Published	2019-12-26
URL	https://arxiv.org/abs/1912.11801v1
PDF	https://arxiv.org/pdf/1912.11801v1.pdf
PWC	https://paperswithcode.com/paper/learning-with-wasserstein-barycenters-and
Repo
Framework
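
To make the barycenter notion in the abstract concrete, the sketch below covers the simplest case: one-dimensional empirical measures with the same number of equally weighted samples, where the 2-Wasserstein barycenter is obtained by averaging sorted samples (quantile averaging). The function name `barycenter_1d` and the data are hypothetical, and the paper's applications involve more general measures than this special case.

```python
def barycenter_1d(samples, weights=None):
    """2-Wasserstein barycenter of equal-size 1D samples via quantile averaging."""
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same length")
    if weights is None:
        weights = [1.0 / k] * k  # uniform barycentric weights
    ordered = [sorted(s) for s in samples]
    # The i-th atom of the barycenter is the weighted mean of the i-th order statistics.
    return [sum(w * s[i] for w, s in zip(weights, ordered)) for i in range(n)]


# Two hypothetical samples with the same shape but different locations.
print(barycenter_1d([[0.0, 1.0, 2.0], [12.0, 10.0, 11.0]]))  # [5.0, 6.0, 7.0]
```

Unlike a pointwise mixture of the two distributions, which would be bimodal, the barycenter keeps the common shape and moves it to an intermediate location.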

KNN and ANN-based Recognition of Handwritten Pashto Letters using Zoning Features


Title	KNN and ANN-based Recognition of Handwritten Pashto Letters using Zoning Features
Authors	Sulaiman Khan, Hazrat Ali, Zahid Ullah, Nasru Minallah, Shahid Maqsood, Abdul Hafeez
Abstract	This paper presents a recognition system for handwritten Pashto letters. Handwritten character recognition is a challenging task: these letters not only differ in shape and style but also vary among individuals. The recognition becomes further daunting due to the lack of standard datasets for inscribed Pashto letters. In this work, we have designed a database of moderate size, which encompasses a total of 4488 images, stemming from 102 distinct samples for each of the 44 letters in Pashto. The recognition framework uses a zoning feature extractor followed by K-Nearest Neighbour (KNN) and Neural Network (NN) classifiers for classifying individual letters. Based on the evaluation of the proposed system, an overall classification accuracy of approximately 70.05% is achieved by using KNN while 72% is achieved by using NN.
Tasks
Published	2019-04-06
URL	https://arxiv.org/abs/1904.03391v2
PDF	https://arxiv.org/pdf/1904.03391v2.pdf
PWC	https://paperswithcode.com/paper/higher-accurate-recognition-of-handwritten
Repo
Framework
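
The pipeline in the abstract (zoning features followed by a nearest-neighbour classifier) can be sketched in a few lines. The version below assumes square binary images whose side is divisible by the number of zones, uses foreground-pixel density as the zone feature, and uses squared Euclidean distance. These choices, the function names and the tiny "bar" images are illustrative assumptions, not the paper's configuration or data.

```python
from collections import Counter


def zoning_features(image, zones_per_side):
    """Fraction of foreground pixels in each cell of a zones x zones grid."""
    step = len(image) // zones_per_side
    features = []
    for zr in range(zones_per_side):
        for zc in range(zones_per_side):
            cell = [
                image[r][c]
                for r in range(zr * step, (zr + 1) * step)
                for c in range(zc * step, (zc + 1) * step)
            ]
            features.append(sum(cell) / len(cell))
    return features


def knn_predict(train, query, k=1):
    """Majority label among the k training vectors closest to the query."""
    ranked = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query))
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 4x4 "letters": a vertical bar and a horizontal bar.
vertical = [[0, 1, 0, 0] for _ in range(4)]
horizontal = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
train = [
    (zoning_features(vertical, 2), "vertical"),
    (zoning_features(horizontal, 2), "horizontal"),
]
shifted = [[1, 0, 0, 0] for _ in range(4)]  # vertical bar moved one column left
print(knn_predict(train, zoning_features(shifted, 2)))
```

Because each zone summarizes a block of pixels, a small shift of the stroke inside a zone leaves the feature vector unchanged, which is the main appeal of zoning for handwriting.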

Template-based Unseen Instance Detection


Title	Template-based Unseen Instance Detection
Authors	Jean-Philippe Mercier, Mathieu Garon, Philippe Giguère, Jean-François Lalonde
Abstract	Much of the focus in the object detection literature has been on the problem of identifying the bounding box of a particular class of object in an image. Yet, in contexts such as robotics and augmented reality, it is often necessary to find a specific object instance—a unique toy or a custom industrial part for example—rather than a generic object class. Here, applications can require a rapid shift from one object instance to another, thus requiring fast turnaround which affords little-to-no training time. In this context, we propose a generic approach to detect unseen instances based on templates rendered from a textured 3D model. To this effect, we introduce a network architecture which employs tunable filters, and leverage learned feature embeddings to correlate object templates and query images. At test time, our approach is able to successfully detect a previously unknown (not seen in training) object, even under significant occlusion. For instance, our method offers an improvement of almost 30 mAP over the previous template matching methods on the challenging Occluded Linemod (overall mAP of 50.7). With no access to the objects to be detected at training time, our method still yields detection results that are on par with existing ones that are allowed to train on the objects. By reviving this research direction in the context of more powerful, deep feature extractors, our work sets the stage for more development in the area of unseen object instance detection.
Tasks	Object Detection
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11822v2
PDF	https://arxiv.org/pdf/1911.11822v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-match-templates-for-unseen
Repo
Framework

Balancing Reconstruction Quality and Regularisation in ELBO for VAEs


Title	Balancing Reconstruction Quality and Regularisation in ELBO for VAEs
Authors	Shuyu Lin, Stephen Roberts, Niki Trigoni, Ronald Clark
Abstract	A trade-off exists between reconstruction quality and the prior regularisation in the Evidence Lower Bound (ELBO) loss that Variational Autoencoder (VAE) models use for learning. There are few satisfactory approaches to deal with a balance between the prior and reconstruction objective, with most methods dealing with this problem through heuristics. In this paper, we show that the noise variance (often set as a fixed value) in the Gaussian likelihood p(x|z) for real-valued data can naturally act to provide such a balance. By learning this noise variance so as to maximise the ELBO loss, we automatically obtain an optimal trade-off between the reconstruction error and the prior constraint on the posteriors. This variance can be interpreted intuitively as the necessary noise level for the current model to be the best explanation of the observed dataset. Further, by allowing the variance inference to be more flexible it can conveniently be used as an uncertainty estimator for reconstructed or generated samples. We demonstrate that optimising the noise variance is a crucial component of VAE learning, and showcase the performance on MNIST, Fashion MNIST and CelebA datasets. We find our approach can significantly improve the quality of generated samples whilst maintaining a smooth latent-space manifold to represent the data. The method also offers an indication of uncertainty in the final generative model.
Tasks
Published	2019-09-09
URL	https://arxiv.org/abs/1909.03765v1
PDF	https://arxiv.org/pdf/1909.03765v1.pdf
PWC	https://paperswithcode.com/paper/balancing-reconstruction-quality-and
Repo
Framework
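
The role of the noise variance described in the abstract can be seen in the Gaussian reconstruction term alone. For a single shared variance sigma^2 and fixed reconstructions, the per-element negative log-likelihood is 0.5 * log(2 * pi * sigma^2) + MSE / (2 * sigma^2), which is minimized at sigma^2 = MSE. The sketch below checks this numerically. The shared-variance assumption, the function names and the numbers are illustrative; the paper's parameterization of the variance may be richer.

```python
import math


def mean_squared_error(targets, recons):
    return sum((x - y) ** 2 for x, y in zip(targets, recons)) / len(targets)


def gaussian_nll(targets, recons, sigma2):
    """Average per-element negative log-likelihood under N(recon, sigma2)."""
    mse = mean_squared_error(targets, recons)
    return 0.5 * math.log(2.0 * math.pi * sigma2) + mse / (2.0 * sigma2)


def optimal_sigma2(targets, recons):
    """Setting the derivative of the NLL with respect to sigma2 to zero gives sigma2 = MSE."""
    return mean_squared_error(targets, recons)


# Hypothetical data and reconstructions.
targets = [0.0, 1.0, 2.0, 3.0]
recons = [0.1, 0.9, 2.3, 2.8]
best = optimal_sigma2(targets, recons)
for sigma2 in (0.5 * best, best, 2.0 * best):
    print(sigma2, gaussian_nll(targets, recons, sigma2))
```

Since the KL term of the ELBO does not depend on sigma^2, this is also the ELBO-optimal variance for fixed encoder and decoder outputs. A smaller variance weights reconstruction more heavily against the KL term, which is the balance the abstract refers to.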

Leveraging Frequent Query Substructures to Generate Formal Queries for Complex Question Answering


Title	Leveraging Frequent Query Substructures to Generate Formal Queries for Complex Question Answering
Authors	Jiwei Ding, Wei Hu, Qixin Xu, Yuzhong Qu
Abstract	Formal query generation aims to generate correct executable queries for question answering over knowledge bases (KBs), given entity and relation linking results. Current approaches build universal paraphrasing or ranking models for the whole questions, which are likely to fail in generating queries for complex, long-tail questions. In this paper, we propose SubQG, a new query generation approach based on frequent query substructures, which helps rank the existing (but nonsignificant) query structures or build new query structures. Our experiments on two benchmark datasets show that our approach significantly outperforms the existing ones, especially for complex questions. Also, it achieves promising performance with limited training data and noisy entity/relation linking results.
Tasks	Question Answering
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11053v1
PDF	https://arxiv.org/pdf/1908.11053v1.pdf
PWC	https://paperswithcode.com/paper/leveraging-frequent-query-substructures-to
Repo
Framework

On Neural Learnability of Chaotic Dynamics


Title	On Neural Learnability of Chaotic Dynamics
Authors	Ziwei Li, Sai Ravela
Abstract	Neural networks are of interest for prediction and uncertainty quantification of nonlinear dynamics. The learnability of chaotic dynamics by neural models, however, remains poorly understood. In this paper, we show that a parsimonious feed-forward network trained on a few data points suffices for accurate prediction of local divergence rates on the whole attractor of the Lorenz 63 system. We show that the neural mappings consist of a series of geometric stretching and compressing operations that indicate topological mixing and, therefore, chaos. Thus, chaotic dynamics is learnable. The emergence of topological mixing within the neural system demands a parsimonious neural structure. We synthesize parsimonious structure using an approach that matches the spectrum of learning dynamics with that of a polynomial regression machine derived from the polynomial Lorenz equations.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05081v1
PDF	https://arxiv.org/pdf/1912.05081v1.pdf
PWC	https://paperswithcode.com/paper/on-neural-learnability-of-chaotic-dynamics
Repo
Framework

Posture and sequence recognition for Bharatanatyam dance performances using machine learning approach


Title	Posture and sequence recognition for Bharatanatyam dance performances using machine learning approach
Authors	Tanwi Mallick, Partha Pratim Das, Arun Kumar Majumdar
Abstract	Understanding the underlying semantics of performing arts like dance is a challenging task. Dance is multimedia in nature and spans time as well as space. Capturing and analyzing the multimedia content of the dance is useful for preserving cultural heritage, building video recommendation systems, and assisting learners through tutoring systems. To develop an application for dance, three aspects of dance analysis need to be addressed: 1) Segmentation of the dance video to find the representative action elements, 2) Matching or recognition of the detected action elements, and 3) Recognition of the dance sequences formed by combining a number of action elements under certain rules. This paper attempts to solve these three fundamental problems of dance analysis for understanding the underlying semantics of dance forms. Our focus is on an Indian Classical Dance (ICD) form known as Bharatanatyam. As dance is driven by music, we use the music as well as motion information for key posture extraction. Next, we recognize the key postures using machine learning as well as deep learning techniques. Finally, the dance sequence is recognized using the Hidden Markov Model (HMM). We capture the multi-modal data of Bharatanatyam dance using Kinect and build an annotated data set for research in ICD.
Tasks	Recommendation Systems
Published	2019-09-24
URL	https://arxiv.org/abs/1909.11023v1
PDF	https://arxiv.org/pdf/1909.11023v1.pdf
PWC	https://paperswithcode.com/paper/posture-and-sequence-recognition-for
Repo
Framework