January 28, 2020

3199 words 16 mins read

Paper Group ANR 983


Performance Analysis of Spatial and Transform Filters for Efficient Image Noise Reduction. Interpolating Local and Global Search by Controlling the Variance of Standard Bit Mutation. Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization. A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmen …

Performance Analysis of Spatial and Transform Filters for Efficient Image Noise Reduction

Title Performance Analysis of Spatial and Transform Filters for Efficient Image Noise Reduction
Authors Santosh Paudel, Ajay Kumar Shrestha, Pradip Singh Maharjan, Rameshwar Rijal
Abstract During the acquisition of an image from its source, noise inevitably becomes an integral part of it. Various algorithms have been used in the past to denoise images, yet image denoising still has scope for improvement. Visual information transmitted in the form of digital images has become a major mode of communication in the modern age, but the image obtained after transmission is often corrupted by noise. In this paper, we review existing denoising algorithms, such as filtering and wavelet-based approaches, and then perform a comparative study with bilateral filters. We use different noise models to describe additive and multiplicative noise in an image. Based on samples of degraded pixel neighbourhoods as inputs, an efficient filtering approach shows better image denoising performance. This yields promising qualitative and quantitative results on the degraded noisy images in terms of Peak Signal to Noise Ratio, Mean Square Error and Universal Quality Identifier.
Tasks Denoising, Image Denoising
Published 2019-09-14
URL https://arxiv.org/abs/1909.06507v1
PDF https://arxiv.org/pdf/1909.06507v1.pdf
PWC https://paperswithcode.com/paper/performance-analysis-of-spatial-and-transform
Repo
Framework
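
The comparison above comes down to applying standard spatial filters to a noisy image and scoring the output with full-reference metrics such as MSE and PSNR. Below is a minimal sketch of that evaluation loop using scipy filters; the test image, noise level, and filter sizes are illustrative assumptions, not the paper's settings.

```python
# Add synthetic noise, denoise with standard spatial filters, score with MSE / PSNR.
import numpy as np
from scipy import ndimage

def mse(ref, img):
    return float(np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2))

def psnr(ref, img, peak=255.0):
    m = mse(ref, img)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0, 255, 256), (256, 1))               # toy test image
noisy = np.clip(clean + rng.normal(0, 20, clean.shape), 0, 255)   # additive Gaussian noise

candidates = {
    "mean":     ndimage.uniform_filter(noisy, size=3),
    "median":   ndimage.median_filter(noisy, size=3),
    "gaussian": ndimage.gaussian_filter(noisy, sigma=1.0),
    # a bilateral filter (e.g. cv2.bilateralFilter) would be scored the same way
}
for name, out in candidates.items():
    print(f"{name:9s}  MSE={mse(clean, out):7.2f}  PSNR={psnr(clean, out):5.2f} dB")
```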

Interpolating Local and Global Search by Controlling the Variance of Standard Bit Mutation

Title Interpolating Local and Global Search by Controlling the Variance of Standard Bit Mutation
Authors Furong Ye, Carola Doerr, Thomas Bäck
Abstract A key property underlying the success of evolutionary algorithms (EAs) is their global search behavior, which allows the algorithms to ‘jump’ from a current state to other parts of the search space, thereby avoiding getting stuck in local optima. This property is obtained through a random choice of the radius at which offspring are sampled from previously evaluated solutions. It is well known that, thanks to this global search behavior, the probability that an EA using standard bit mutation finds a global optimum of an arbitrary function $f:\{0,1\}^n \to \mathbb{R}$ tends to one as the number of function evaluations grows. This advantage over heuristics using a fixed search radius, however, comes at the cost of using non-optimal step sizes even in those regimes in which the optimal rate is stable for a long time. This downside results in significant performance losses for many standard benchmark problems. We introduce in this work a simple way to interpolate between the random global search of EAs and their deterministic counterparts, which sample from a fixed radius only. To this end, we introduce \emph{normalized standard bit mutation}, in which the binomial choice of the search radius is replaced by a normal distribution. Normalized standard bit mutation allows a straightforward way to control its variance, and hence the degree of randomness involved. We experiment with a self-adjusting choice of this variance and demonstrate its effectiveness on the two classic benchmark problems LeadingOnes and OneMax. Our work thereby also touches on a largely ignored question in discrete evolutionary computation: multi-dimensional parameter control.
Tasks
Published 2019-01-17
URL http://arxiv.org/abs/1901.05573v1
PDF http://arxiv.org/pdf/1901.05573v1.pdf
PWC https://paperswithcode.com/paper/interpolating-local-and-global-search-by
Repo
Framework
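
A minimal sketch of normalized standard bit mutation as described above: draw the mutation strength k from a normal distribution centred at the usual expected strength n·p, then flip k distinct bits. The resampling rule for out-of-range radii is an assumption of this sketch, not necessarily the authors' choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_standard_bit_mutation(x, p=None, sigma=1.0):
    """Flip k distinct bits of x, where k ~ round(N(n*p, sigma^2)); sigma -> 0
    recovers a fixed search radius, larger sigma recovers global search."""
    n = len(x)
    p = 1.0 / n if p is None else p
    k = -1
    while not (0 <= k <= n):                      # resample until the radius is feasible
        k = int(round(rng.normal(loc=n * p, scale=sigma)))
    y = x.copy()
    flip = rng.choice(n, size=k, replace=False)   # flip k distinct positions
    y[flip] ^= 1
    return y

# a (1+1)-EA style loop with a self-adjusting sigma is the natural next step;
# here we just apply a single mutation.
x = rng.integers(0, 2, size=50)
y = normalized_standard_bit_mutation(x, sigma=1.0)
print("hamming distance:", int(np.sum(x != y)))
```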

Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization

Title Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization
Authors Homa Hosseinmardi, Hsien-Te Kao, Kristina Lerman, Emilio Ferrara
Abstract In recent years, the rapid growth in technology has increased the opportunity for longitudinal human behavioral studies. Rich multimodal data, from wearables like Fitbit, online social networks, mobile phones, etc., can be collected in natural environments. Uncovering the underlying low-dimensional structure of noisy multi-way data in an unsupervised setting is a challenging problem. Tensor factorization has been successful in extracting the interconnected low-dimensional descriptions of multi-way data. In this paper, we apply non-negative tensor factorization to a real-world wearable sensor dataset, StudentLife, to find latent temporal factors and groups of similar individuals. Metadata is available for the semester schedule, as well as the individuals’ performance and personality. We demonstrate that non-negative tensor factorization can successfully discover clusters of individuals who exhibit higher academic performance, as well as those who frequently engage in leisure activities. The recovered latent temporal patterns associated with these groups are validated against ground truth data to demonstrate the accuracy of our framework.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1905.08846v1
PDF https://arxiv.org/pdf/1905.08846v1.pdf
PWC https://paperswithcode.com/paper/discovering-hidden-structure-in-high
Repo
Framework
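
A sketch of non-negative tensor factorization on a (person × sensing channel × time) tensor in the spirit of the paper above, using a recent version of the tensorly library. The tensor shape, rank, and random data are placeholder assumptions, not the StudentLife setup.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_parafac

rng = np.random.default_rng(0)
tensor = tl.tensor(rng.random((30, 5, 100)))       # persons x sensing channels x time bins

weights, factors = non_negative_parafac(tensor, rank=3, n_iter_max=200)
persons, channels, time = factors                  # all factor entries are non-negative

# Rows of `persons` can be clustered to find groups of similar individuals,
# and columns of `time` give the latent temporal patterns.
print(persons.shape, channels.shape, time.shape)   # (30, 3) (5, 3) (100, 3)
```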

A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation

Title A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation
Authors Hilde Kuehne, Alexander Richard, Juergen Gall
Abstract Action recognition has become a rapidly developing research field within the last decade. But with the increasing demand for large scale data, the need of hand annotated data for the training becomes more and more impractical. One way to avoid frame-based human annotation is the use of action order information to learn the respective action classes. In this context, we propose a hierarchical approach to address the problem of weakly supervised learning of human actions from ordered action labels by structuring recognition in a coarse-to-fine manner. Given a set of videos and an ordered list of the occurring actions, the task is to infer start and end frames of the related action classes within the video and to train the respective action classifiers without any need for hand labeled frame boundaries. We address this problem by combining a framewise RNN model with a coarse probabilistic inference. This combination allows for the temporal alignment of long sequences and thus, for an iterative training of both elements. While this system alone already generates good results, we show that the performance can be further improved by approximating the number of subactions to the characteristics of the different action classes as well as by the introduction of a regularizing length prior. The proposed system is evaluated on two benchmark datasets, the Breakfast and the Hollywood extended dataset, showing a competitive performance on various weak learning tasks such as temporal action segmentation and action alignment.
Tasks Action Segmentation
Published 2019-06-03
URL https://arxiv.org/abs/1906.01028v1
PDF https://arxiv.org/pdf/1906.01028v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-rnn-hmm-approach-for-weakly
Repo
Framework
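
The coarse inference step described above amounts to aligning video frames to an ordered action transcript so that frame boundaries can be inferred without frame-level labels. A minimal dynamic-programming sketch of that monotonic alignment follows; the framewise scores come from a random stand-in rather than the paper's RNN.

```python
import numpy as np

def align_frames_to_transcript(log_probs, transcript):
    """log_probs: (T, num_classes) framewise log-probabilities.
    transcript: ordered list of action class indices.
    Returns one action label per frame, monotone in transcript order."""
    T, K = log_probs.shape[0], len(transcript)
    score = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    score[0, 0] = log_probs[0, transcript[0]]
    for t in range(1, T):
        for k in range(min(t + 1, K)):            # need at least k frames for k segments
            stay = score[t - 1, k]
            advance = score[t - 1, k - 1] if k > 0 else -np.inf
            score[t, k] = log_probs[t, transcript[k]] + max(stay, advance)
            back[t, k] = k if stay >= advance else k - 1
    # backtrack from the last frame / last transcript entry
    labels, k = [transcript[K - 1]], K - 1
    for t in range(T - 1, 0, -1):
        k = back[t, k]
        labels.append(transcript[k])
    return labels[::-1]

rng = np.random.default_rng(0)
frame_log_probs = np.log(rng.dirichlet(np.ones(4), size=20))   # 20 frames, 4 action classes
print(align_frames_to_transcript(frame_log_probs, transcript=[2, 0, 3]))
```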

Intriguing properties of adversarial training at scale

Title Intriguing properties of adversarial training at scale
Authors Cihang Xie, Alan Yuille
Abstract Adversarial training is one of the main defenses against adversarial attacks. In this paper, we provide the first rigorous study on diagnosing elements of adversarial training, which reveals two intriguing properties. First, we study the role of normalization. Batch normalization (BN) is a crucial element for achieving state-of-the-art performance on many vision tasks, but we show it may prevent networks from obtaining strong robustness in adversarial training. One unexpected observation is that, for models trained with BN, simply removing clean images from training data largely boosts adversarial robustness, i.e., 18.3%. We relate this phenomenon to the hypothesis that clean images and adversarial images are drawn from two different domains. This two-domain hypothesis may explain the issue of BN when training with a mixture of clean and adversarial images, as estimating normalization statistics of this mixture distribution is challenging. Guided by this two-domain hypothesis, we show disentangling the mixture distribution for normalization, i.e., applying separate BNs to clean and adversarial images for statistics estimation, achieves much stronger robustness. Additionally, we find that enforcing BNs to behave consistently at training and testing can further enhance robustness. Second, we study the role of network capacity. We find our so-called “deep” networks are still shallow for the task of adversarial learning. Unlike traditional classification tasks where accuracy is only marginally improved by adding more layers to “deep” networks (e.g., ResNet-152), adversarial training exhibits a much stronger demand on deeper networks to achieve higher adversarial robustness. This robustness improvement can be observed substantially and consistently even by pushing the network capacity to an unprecedented scale, i.e., ResNet-638.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.03787v2
PDF https://arxiv.org/pdf/1906.03787v2.pdf
PWC https://paperswithcode.com/paper/intriguing-properties-of-adversarial-training
Repo
Framework
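
The "separate BNs for clean and adversarial images" idea above can be sketched as a drop-in module that keeps two BatchNorm2d layers and routes each batch to one of them. This is an illustration of the idea, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.bn_clean = nn.BatchNorm2d(num_features)
        self.bn_adv = nn.BatchNorm2d(num_features)

    def forward(self, x, adversarial: bool = False):
        # Statistics (and affine parameters) are estimated separately per domain,
        # so clean and adversarial distributions are never mixed in one estimate.
        return self.bn_adv(x) if adversarial else self.bn_clean(x)

bn = DualBatchNorm2d(16)
clean = torch.randn(8, 16, 32, 32)
adv = clean + 0.03 * torch.randn_like(clean)   # stand-in for a PGD perturbation
out_clean, out_adv = bn(clean, adversarial=False), bn(adv, adversarial=True)
print(out_clean.shape, out_adv.shape)
```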

The Deeper, the Better: Analysis of Person Attributes Recognition

Title The Deeper, the Better: Analysis of Person Attributes Recognition
Authors Esube Bekele, Wallace Lawson
Abstract In person attributes recognition, we describe a person in terms of their appearance. Typically, this includes a wide range of traits including age, gender, clothing, and footwear. Although this could be used in a wide variety of scenarios, it generally is applied to video surveillance, where attribute recognition is impacted by low resolution, and other issues such as variable pose, occlusion and shadow. Recent approaches have used deep convolutional neural networks (CNNs) to improve the accuracy in person attribute recognition. However, many of these networks are relatively shallow and it is unclear to what extent they use contextual cues to improve classification accuracy. In this paper, we propose deeper methods for person attribute recognition. Interpreting the reasons behind the classification is highly important, as it can provide insight into how the classifier is making decisions. Interpretation suggests that deeper networks generally take more contextual information into consideration, which helps improve classification accuracy and generalizability. We present experimental analysis and results for whole body attributes using the PA-100K and PETA datasets and facial attributes using the CelebA dataset.
Tasks
Published 2019-01-11
URL http://arxiv.org/abs/1901.03756v1
PDF http://arxiv.org/pdf/1901.03756v1.pdf
PWC https://paperswithcode.com/paper/the-deeper-the-better-analysis-of-person
Repo
Framework
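
Person attribute recognition as described above is typically cast as multi-label classification over a CNN backbone. A minimal sketch with a torchvision ResNet-50 follows; the attribute count and the random inputs are assumptions, not the paper's architecture or data.

```python
import torch
import torch.nn as nn
from torchvision import models

num_attributes = 26                                   # e.g. a PA-100K-style attribute set
backbone = models.resnet50(weights=None)              # deeper backbone, per the paper's theme
backbone.fc = nn.Linear(backbone.fc.in_features, num_attributes)

criterion = nn.BCEWithLogitsLoss()                    # independent sigmoid per attribute
images = torch.randn(4, 3, 224, 224)
targets = torch.randint(0, 2, (4, num_attributes)).float()
loss = criterion(backbone(images), targets)
loss.backward()
print(float(loss))
```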

Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation

Title Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation
Authors Xipeng Chen, Kwan-Yee Lin, Wentao Liu, Chen Qian, Xiaogang Wang, Liang Lin
Abstract Recent studies have shown remarkable advances in 3D human pose estimation from monocular images, with the help of large-scale indoor 3D datasets and sophisticated network architectures. However, generalizability to different environments remains an elusive goal. In this work, we propose a geometry-aware 3D representation of the human pose to address this limitation, using multiple views in a simple auto-encoder model at the training stage and only 2D keypoint information as supervision. A view synthesis framework is proposed to learn the shared 3D representation between viewpoints by synthesizing the human pose in one viewpoint from another. Instead of performing a direct transfer at the raw image level, we propose a skeleton-based encoder-decoder mechanism to distil only pose-related representation in the latent space. A learning-based representation consistency constraint is further introduced to improve the robustness of the latent 3D representation. Since the learnt representation encodes 3D geometry information, mapping it to 3D pose is much easier than in conventional frameworks that use an image or 2D coordinates as the input to the 3D pose estimator. We demonstrate our approach on the task of 3D human pose estimation. Comprehensive experiments on three popular benchmarks show that our model can significantly improve the performance of state-of-the-art methods by simply injecting the representation as a robust 3D prior.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2019-03-21
URL http://arxiv.org/abs/1903.08839v2
PDF http://arxiv.org/pdf/1903.08839v2.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-discovery-of-geometry-aware
Repo
Framework
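
The skeleton-based encoder-decoder above maps 2D keypoints seen in one view to the same pose seen in another view, so the latent code is forced to carry 3D-aware information. A minimal MLP sketch of that view-synthesis training signal follows; the joint count, latent size, and paired-view data are assumptions, and the camera-rotation conditioning used in the paper is omitted.

```python
import torch
import torch.nn as nn

num_joints, latent_dim = 17, 64

encoder = nn.Sequential(nn.Linear(num_joints * 2, 256), nn.ReLU(), nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, num_joints * 2))

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

# paired 2D keypoints of the same pose from two synchronized cameras (random stand-ins)
pose_view_a = torch.randn(32, num_joints * 2)
pose_view_b = torch.randn(32, num_joints * 2)

latent = encoder(pose_view_a)                 # geometry-aware representation
pred_view_b = decoder(latent)                 # synthesize the pose in the other view
loss = nn.functional.mse_loss(pred_view_b, pose_view_b)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```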

Soft Options Critic

Title Soft Options Critic
Authors Elita Lobo, Scott Jordan
Abstract The option-critic architecture (Bacon, Harb, and Precup 2017) and several variants have successfully demonstrated the use of the options framework proposed by Sutton et al. (Sutton, Precup, and Singh 1999) to scale learning and planning in hierarchical tasks. Although most of these frameworks use entropy as a regularizer to improve exploration, they do not maximize entropy along with returns at every time step. (Haarnoja et al., 2018d) recently introduced an off-policy actor-critic algorithm in the Soft Actor Critic paper that maximizes returns while maximizing entropy in a constrained manner, thus enabling the learning of robust options in continuous and discrete action spaces. In this paper we adopt the architecture of the soft actor-critic to investigate the effect of maximizing the entropy of each option and of the inter-option policy in the options framework. We derive the soft options improvement theorem and propose a novel soft-options framework that incorporates maximization of the entropy of actions and options in a constrained manner. Our experiments show that the modified option-critic framework generates robust policies which allow fast recovery when the environment is subjected to perturbations, and outperforms the vanilla option-critic framework in most hierarchical tasks.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.11222v2
PDF https://arxiv.org/pdf/1905.11222v2.pdf
PWC https://paperswithcode.com/paper/soft-options-critic
Repo
Framework
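
The "soft" ingredient above is an entropy-regularized, SAC-style update applied per option. A minimal sketch of that actor loss for a discrete intra-option policy follows; the Q-values and logits are random stand-ins, and the rest of the option-critic machinery (termination functions, inter-option policy) is omitted.

```python
import torch
import torch.nn.functional as F

alpha = 0.1                                           # entropy temperature
logits = torch.randn(32, 4, requires_grad=True)       # intra-option policy logits (batch x actions)
q_values = torch.randn(32, 4)                         # critic's Q(s, o, a) estimates

log_pi = F.log_softmax(logits, dim=-1)
pi = log_pi.exp()
# expectation over actions of (alpha * log pi - Q): minimizing this maximizes Q plus entropy
actor_loss = (pi * (alpha * log_pi - q_values)).sum(dim=-1).mean()
actor_loss.backward()
print(float(actor_loss))
```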

A new Potential-Based Reward Shaping for Reinforcement Learning Agent

Title A new Potential-Based Reward Shaping for Reinforcement Learning Agent
Authors Babak Badnava, Nasser Mozayani
Abstract Potential-based reward shaping (PBRS) is a particular category of machine learning methods which aims to improve the learning speed of a reinforcement learning agent by extracting and utilizing extra knowledge while performing a task. There are two steps in the process of transfer learning: extracting knowledge from previously learned tasks and transferring that knowledge to use it in a target task. The latter step is well discussed in the literature, with various methods being proposed for it, while the former has been explored less. With this in mind, the type of knowledge that is transferred is very important and can lead to considerable improvement. In the literature on both transfer learning and potential-based reward shaping, a subject that has never been addressed is the knowledge gathered during the learning process itself. In this paper, we present a novel potential-based reward shaping method that attempts to extract knowledge from the learning process. The proposed method extracts knowledge from episodes’ cumulative rewards. The proposed method has been evaluated in the Arcade Learning Environment, and the results indicate an improvement in the learning process for both single-task and multi-task reinforcement learning agents.
Tasks Atari Games, Transfer Learning
Published 2019-02-17
URL https://arxiv.org/abs/1902.06239v2
PDF https://arxiv.org/pdf/1902.06239v2.pdf
PWC https://paperswithcode.com/paper/a-new-potential-based-reward-shaping-for
Repo
Framework
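
Potential-based reward shaping adds F(s, s') = γ·Φ(s') − Φ(s) to the environment reward, which is known to preserve the optimal policy. A minimal sketch follows; the potential function (which the paper derives from episode cumulative rewards) is a placeholder assumption here.

```python
def shaped_reward(reward, s, s_next, phi, gamma=0.99, done=False):
    """Return r + gamma * Phi(s') - Phi(s); terminal potentials are conventionally zero."""
    phi_next = 0.0 if done else phi(s_next)
    return reward + gamma * phi_next - phi(s)

# toy usage with a hypothetical potential table
potential = {0: 0.0, 1: 0.5, 2: 1.0}
print(shaped_reward(1.0, s=0, s_next=1, phi=lambda s: potential[s]))
```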

Development of a hand pose recognition system on an embedded computer using CNNs

Title Development of a hand pose recognition system on an embedded computer using CNNs
Authors Dennis Núñez Fernández
Abstract Demand for hand pose recognition systems has been growing in recent years in technologies such as human-machine interfaces. This work proposes an approach for hand pose recognition on embedded computers using hand tracking and CNNs. Results show a fast response time with an accuracy of 94.50% and low power consumption.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.11100v1
PDF https://arxiv.org/pdf/1910.11100v1.pdf
PWC https://paperswithcode.com/paper/development-of-a-hand-pose-recognition-system
Repo
Framework
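
A compact CNN of the kind that fits on an embedded computer for classifying hand-pose crops produced by a tracker, as sketched above. The input size, number of pose classes, and layer widths are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class TinyHandPoseNet(nn.Module):
    def __init__(self, num_poses=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_poses)

    def forward(self, x):                      # x: (batch, 1, 64, 64) grayscale hand crops
        return self.classifier(self.features(x).flatten(1))

model = TinyHandPoseNet()
print(model(torch.randn(2, 1, 64, 64)).shape)   # torch.Size([2, 6])
```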

Run-Time Efficient RNN Compression for Inference on Edge Devices

Title Run-Time Efficient RNN Compression for Inference on Edge Devices
Authors Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina
Abstract Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints. As a result, there is a need for compression techniques that can achieve significant compression without negatively impacting inference run-time and task accuracy. This paper explores a new compressed RNN cell implementation called Hybrid Matrix Decomposition (HMD) that achieves this dual objective. This scheme divides the weight matrix into two parts: an unconstrained upper half and a lower half composed of rank-1 blocks. This results in output features where the upper sub-vector has “richer” features while the lower sub-vector has “constrained” features. HMD can compress RNNs by a factor of 2-4x while having a faster run-time than pruning (Zhu & Gupta, 2017) and retaining more model accuracy than matrix factorization (Grachev et al., 2017). We evaluate this technique on 5 benchmarks spanning 3 different applications, illustrating its generality in the domain of edge computing.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.04886v3
PDF https://arxiv.org/pdf/1906.04886v3.pdf
PWC https://paperswithcode.com/paper/run-time-efficient-rnn-compression-for
Repo
Framework
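
A sketch of the Hybrid Matrix Decomposition idea above: the top half of a weight matrix stays dense while the bottom half is a stack of rank-1 blocks, so only the two vectors of each block need to be stored. The matrix size and block layout are assumptions of this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
rows, cols, num_blocks = 256, 256, 4
block_rows = (rows // 2) // num_blocks

upper = rng.standard_normal((rows // 2, cols))                       # unconstrained half
blocks = [np.outer(rng.standard_normal(block_rows),                  # each block is rank-1:
                   rng.standard_normal(cols))                        # an outer product u v^T
          for _ in range(num_blocks)]
lower = np.vstack(blocks)

W = np.vstack([upper, lower])                                        # reconstructed weight matrix
dense_lower_params = (rows // 2) * cols
hmd_lower_params = num_blocks * (block_rows + cols)                  # only u and v stored per block
print(W.shape, f"compression of lower half: {dense_lower_params / hmd_lower_params:.1f}x")
```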

Nonparametric Functional Approximation with Delaunay Triangulation

Title Nonparametric Functional Approximation with Delaunay Triangulation
Authors Yehong Liu, Guosheng Yin
Abstract We propose a differentiable nonparametric algorithm, the Delaunay triangulation learner (DTL), to solve the functional approximation problem on the basis of a $p$-dimensional feature space. By conducting the Delaunay triangulation algorithm on the data points, the DTL partitions the feature space into a series of $p$-dimensional simplices in a geometrically optimal way, and fits a linear model within each simplex. We study its theoretical properties by exploring the geometric properties of the Delaunay triangulation, and compare its performance with other statistical learners in numerical studies.
Tasks
Published 2019-06-02
URL https://arxiv.org/abs/1906.00350v1
PDF https://arxiv.org/pdf/1906.00350v1.pdf
PWC https://paperswithcode.com/paper/190600350
Repo
Framework
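
The DTL described above fits a piecewise-linear surface over the Delaunay triangulation of the training points; for prediction this coincides with barycentric interpolation, which scipy provides directly. The toy 2-D data below are an assumption for illustration.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
X = rng.random((200, 2))                               # p = 2 feature space
y = np.sin(4 * X[:, 0]) + X[:, 1] ** 2                 # target function values

tri = Delaunay(X)                                      # geometric partition into simplices
dtl = LinearNDInterpolator(tri, y)                     # linear model within each simplex

X_test = rng.random((5, 2))
print(dtl(X_test))                                     # NaN only outside the convex hull
```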

Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation

Title Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation
Authors Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh, Man-Chung Yue, Daniel Kuhn, Wolfram Wiesemann
Abstract The likelihood function is a fundamental component in Bayesian statistics. However, evaluating the likelihood of an observation is computationally intractable in many applications. In this paper, we propose a non-parametric approximation of the likelihood that identifies a probability measure which lies in the neighborhood of the nominal measure and that maximizes the probability of observing the given sample point. We show that when the neighborhood is constructed by the Kullback-Leibler divergence, by moment conditions or by the Wasserstein distance, then our \textit{optimistic likelihood} can be determined through the solution of a convex optimization problem, and it admits an analytical expression in particular cases. We also show that the posterior inference problem with our optimistic likelihood approximation enjoys strong theoretical performance guarantees, and it performs competitively in a probabilistic classification task.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10583v1
PDF https://arxiv.org/pdf/1910.10583v1.pdf
PWC https://paperswithcode.com/paper/optimistic-distributionally-robust
Repo
Framework
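
Over a finite support, the optimistic likelihood with a KL neighborhood reduces to a small convex program: maximize the probability assigned to the observed point over all distributions within KL-distance ε of the nominal one. A cvxpy sketch follows; the support size, nominal distribution, and ε are assumptions, and an exponential-cone solver such as SCS is needed.

```python
import cvxpy as cp
import numpy as np

p_nominal = np.array([0.05, 0.15, 0.30, 0.35, 0.15])    # nominal measure on 5 support points
observed = 0                                            # index of the observed sample point
eps = 0.1                                               # radius of the KL neighborhood

q = cp.Variable(len(p_nominal), nonneg=True)
constraints = [cp.sum(q) == 1,
               cp.sum(cp.kl_div(q, p_nominal)) <= eps]  # KL(q || p_nominal) <= eps
problem = cp.Problem(cp.Maximize(q[observed]), constraints)
problem.solve(solver=cp.SCS)

print(f"nominal likelihood {p_nominal[observed]:.3f} -> optimistic likelihood {q.value[observed]:.3f}")
```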

Countering Inconsistent Labelling by Google’s Vision API for Rotated Images

Title Countering Inconsistent Labelling by Google’s Vision API for Rotated Images
Authors Aman Apte, Aritra Bandyopadhyay, K Akhilesh Shenoy, Jason Peter Andrews, Aditya Rathod, Manish Agnihotri, Aditya Jajodia
Abstract Google’s Vision API analyses images and provides a variety of output predictions; one such type is context-based labelling. In this paper, it is shown that adversarial examples that cause incorrect label prediction and spoofing can be generated by rotating the images. Due to the black-box nature of the API, a modular context-based pre-processing pipeline is proposed, consisting of a ResNet-50 model that predicts the angle by which the image must be rotated to correct its orientation. The pipeline performs the correction whilst maintaining the image’s resolution and feeds the result to the API, which then generates labels similar to those of the original correctly oriented image. Using a Percentage Error metric, the performance on the corrected images is found to be significantly higher than on their rotated counterparts. These observations imply that the API can benefit from such a pre-processing pipeline to increase robustness to rotational perturbations.
Tasks
Published 2019-11-17
URL https://arxiv.org/abs/1911.07201v1
PDF https://arxiv.org/pdf/1911.07201v1.pdf
PWC https://paperswithcode.com/paper/countering-inconsistent-labelling-by-googles
Repo
Framework
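
A sketch of the pre-processing stage described above: a ResNet-50 predicts which of four canonical rotations an image has undergone, and the image is rotated back before being sent to the labelling API. The model here is untrained (weights and training data are assumptions), and the Vision API call itself is omitted.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

angles = [0, 90, 180, 270]
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, len(angles))   # 4-way rotation classifier
model.eval()

preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def correct_orientation(image: Image.Image) -> Image.Image:
    with torch.no_grad():
        pred = model(preprocess(image).unsqueeze(0)).argmax(dim=1).item()
    # rotate back by the predicted angle so the downstream API sees an upright image
    return image.rotate(-angles[pred], expand=True)

corrected = correct_orientation(Image.new("RGB", (640, 480), "white"))
print(corrected.size)
```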

Deep One-Class Classification Using Intra-Class Splitting

Title Deep One-Class Classification Using Intra-Class Splitting
Authors Patrick Schlachter, Yiwen Liao, Bin Yang
Abstract This paper introduces a generic method which enables conventional deep neural networks to be used as end-to-end one-class classifiers. The method is based on splitting given data from one class into two subsets. In one-class classification, only samples of one normal class are available for training. During inference, a closed and tight decision boundary around the training samples is sought, which conventional binary or multi-class neural networks are not able to provide. By splitting data into typical and atypical normal subsets, the proposed method can use a binary loss and defines an auxiliary subnetwork for distance constraints in the latent space. Various experiments on three well-known image datasets showed the effectiveness of the proposed method, which outperformed seven baselines and had a better or comparable performance to the state-of-the-art.
Tasks
Published 2019-02-04
URL https://arxiv.org/abs/1902.01194v4
PDF https://arxiv.org/pdf/1902.01194v4.pdf
PWC https://paperswithcode.com/paper/deep-one-class-classification-using-data
Repo
Framework
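
A sketch of the intra-class splitting idea above: score the single normal class with a simple typicality measure, call the lowest-scoring fraction "atypical", and train an ordinary binary classifier on the two resulting subsets. The scoring rule, split ratio, and shallow classifier are assumptions; the paper uses a deep network with an auxiliary distance constraint rather than this stand-in.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 16))                     # only one (normal) class is available

# typicality score: negative distance to the class centroid
scores = -np.linalg.norm(X_normal - X_normal.mean(axis=0), axis=1)
split_ratio = 0.1                                         # bottom 10% treated as "atypical"
threshold = np.quantile(scores, split_ratio)
labels = (scores > threshold).astype(int)                 # 1 = typical, 0 = atypical

clf = LogisticRegression(max_iter=1000).fit(X_normal, labels)

# at test time, a low "typical" probability flags a candidate anomaly
X_test = np.vstack([rng.normal(size=(5, 16)), rng.normal(loc=4.0, size=(5, 16))])
print(np.round(clf.predict_proba(X_test)[:, 1], 2))
```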