April 1, 2020


Paper Group ANR 459

Deep Frequent Spatial Temporal Learning for Face Anti-Spoofing. On the Robustness of Cooperative Multi-Agent Reinforcement Learning. Safe Crossover of Neural Networks Through Neuron Alignment. Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar. Estimation for Compositional Data using Measurements from Nonlinear Sy …

Deep Frequent Spatial Temporal Learning for Face Anti-Spoofing


Title	Deep Frequent Spatial Temporal Learning for Face Anti-Spoofing
Authors	Ying Huang, Wenwei Zhang, Jinzhuo Wang
Abstract	Face anti-spoofing is crucial for the security of face recognition systems, as it prevents intrusion through presentation attacks. Previous works have shown the effectiveness of using depth and temporal supervision for this task. However, depth supervision is often considered only in a single frame, and temporal supervision is explored by utilizing certain signals which are not robust to changes of scene. In this work, motivated by two-stream ConvNets, we propose a novel two-stream FreqSaptialTemporalNet for face anti-spoofing which simultaneously takes advantage of frequent, spatial and temporal information. Compared with existing methods which mine spoofing cues in multi-frame RGB images, we use multi-frame spectrum images as one input stream for the discriminative deep neural network, encouraging the primary differences between live and fake videos to be automatically unearthed. Extensive experiments show promising improvements using the proposed architecture. Meanwhile, we propose a concise method to obtain a large amount of spoofing training data by utilizing a frequent augmentation pipeline, which helps visualize the detailed differences between live and fake images and alleviates the data insufficiency issue when training large networks.
Tasks	Face Anti-Spoofing, Face Recognition
Published	2020-01-20
URL	https://arxiv.org/abs/2002.03723v1
PDF	https://arxiv.org/pdf/2002.03723v1.pdf
PWC	https://paperswithcode.com/paper/deep-frequent-spatial-temporal-learning-for
Repo
Framework

On the Robustness of Cooperative Multi-Agent Reinforcement Learning


Title	On the Robustness of Cooperative Multi-Agent Reinforcement Learning
Authors	Jieyu Lin, Kristina Dzeparoska, Sai Qian Zhang, Alberto Leon-Garcia, Nicolas Papernot
Abstract	In cooperative multi-agent reinforcement learning (c-MARL), agents learn to cooperatively take actions as a team to maximize a total team reward. We analyze the robustness of c-MARL to adversaries capable of attacking one of the agents on a team. Through the ability to manipulate this agent’s observations, the adversary seeks to decrease the total team reward. Attacking c-MARL is challenging for three reasons: first, it is difficult to estimate team rewards or how they are impacted by an agent mispredicting; second, models are non-differentiable; and third, the feature space is low-dimensional. Thus, we introduce a novel attack. The attacker first trains a policy network with reinforcement learning to find a wrong action it should encourage the victim agent to take. Then, the adversary uses targeted adversarial examples to force the victim to take this action. Our results on the StarCraft II multi-agent benchmark demonstrate that c-MARL teams are highly vulnerable to perturbations applied to one of their agents’ observations. By attacking a single agent, our attack method has a highly negative impact on the overall team reward, reducing it from 20 to 9.4. This causes the team’s winning rate to drop from 98.9% to 0%.
Tasks	Multi-agent Reinforcement Learning
Published	2020-03-08
URL	https://arxiv.org/abs/2003.03722v1
PDF	https://arxiv.org/pdf/2003.03722v1.pdf
PWC	https://paperswithcode.com/paper/on-the-robustness-of-cooperative-multi-agent
Repo
Framework

Safe Crossover of Neural Networks Through Neuron Alignment


Title	Safe Crossover of Neural Networks Through Neuron Alignment
Authors	Thomas Uriot, Dario Izzo
Abstract	One of the main and largely unexplored challenges in evolving the weights of neural networks using genetic algorithms is to find a sensible crossover operation between parent networks. Indeed, naive crossover leads to functionally damaged offspring that do not retain information from the parents. This is because neural networks are invariant to permutations of neurons, giving rise to multiple ways of representing the same solution. This is often referred to as the competing conventions problem. In this paper, we propose a two-step safe crossover (SC) operator. First, the neurons of the parents are functionally aligned by computing how well they correlate, and only then are the parents recombined. We compare two ways of measuring relationships between neurons: Pairwise Correlation (PwC) and Canonical Correlation Analysis (CCA). We test our safe crossover operators (SC-PwC and SC-CCA) on MNIST and CIFAR-10 by performing arithmetic crossover on the weights of feed-forward neural network pairs. We show that it effectively transmits information from parents to offspring and significantly improves upon naive crossover. Our method is computationally fast, can serve as a way to explore the fitness landscape more efficiently, and makes safe crossover a potentially promising operator in future neuroevolution research and applications.
Tasks
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10306v2
PDF	https://arxiv.org/pdf/2003.10306v2.pdf
PWC	https://paperswithcode.com/paper/safe-crossover-of-neural-networks-through
Repo
Framework
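
The two-step idea in the abstract above (align neurons by correlation, then recombine) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the greedy matching, the flat per-neuron weight lists, and the names `align_neurons` and `safe_crossover` are all assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def align_neurons(acts_a, acts_b):
    """Greedily pair each neuron of parent A with the most correlated unused neuron of B.

    acts_a[i] / acts_b[j] are activations of neuron i / j recorded on the same inputs.
    Returns perm, where B's neuron perm[i] is paired with A's neuron i.
    Greedy matching is an assumption of this sketch.
    """
    n = len(acts_a)
    scored = sorted(
        ((pearson(acts_a[i], acts_b[j]), i, j) for i in range(n) for j in range(n)),
        reverse=True,
    )
    perm = [None] * n
    used_b = set()
    for _, i, j in scored:
        if perm[i] is None and j not in used_b:
            perm[i] = j
            used_b.add(j)
    return perm


def safe_crossover(neurons_a, neurons_b, perm, t=0.5):
    """Arithmetic crossover after reordering B's neurons to line up with A's.

    Each neuron is a flat list of its weights (hypothetical layout for this sketch).
    """
    child = []
    for i, weights_a in enumerate(neurons_a):
        weights_b = neurons_b[perm[i]]
        child.append([t * a + (1 - t) * b for a, b in zip(weights_a, weights_b)])
    return child


# Parent B is parent A with its hidden neurons shuffled: same function, different order.
acts_a = [[0.1, 0.9, 0.3, 0.7], [0.8, 0.2, 0.6, 0.1], [0.5, 0.5, 0.9, 0.2]]
neurons_a = [[0.3, -1.0], [2.0, 0.5], [-0.7, 1.5]]
shuffle = [2, 0, 1]
acts_b = [acts_a[k] for k in shuffle]
neurons_b = [neurons_a[k] for k in shuffle]

perm = align_neurons(acts_a, acts_b)
child = safe_crossover(neurons_a, neurons_b, perm)
```

In this toy case the aligned child reproduces parent A, whereas averaging the unaligned lists position by position would mix unrelated neurons.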


Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar


Title	Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar
Authors	Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone, Kelly D. Sherbondy
Abstract	In this work, we first describe a framework for the application of Reinforcement Learning (RL) control to a radar system that operates in a congested spectral setting. We then compare the utility of several RL algorithms through a discussion of experiments performed on commercial off-the-shelf (COTS) hardware. Each RL technique is evaluated in terms of convergence, radar detection performance achieved in a congested spectral environment, and the ability to share 100 MHz of spectrum with an uncooperative communications system. We examine policy iteration, which solves an environment posed as a Markov Decision Process (MDP) by directly solving for a stochastic mapping between environmental states and radar waveforms, as well as Deep RL techniques, which utilize a form of Q-Learning to approximate a parameterized function that is used by the radar to select optimal actions. We show that RL techniques are beneficial over a Sense-and-Avoid (SAA) scheme and discuss the conditions under which each approach is most effective.
Tasks	Q-Learning
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01799v2
PDF	https://arxiv.org/pdf/2001.01799v2.pdf
PWC	https://paperswithcode.com/paper/experimental-analysis-of-reinforcement
Repo
Framework
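
As a rough illustration of the Q-Learning update the abstract refers to, here is a tabular sketch on a toy spectrum-sharing problem. Everything about the environment is hypothetical (three sub-bands, an interferer that hops to the next band each slot, a reward of 1 for avoiding a collision); the paper itself uses deep function approximation and real hardware, which this does not attempt to model.

```python
import random

N_BANDS = 3  # hypothetical number of sub-bands


def step(interferer_band, radar_band):
    """Toy environment: the interferer hops to the next band every slot."""
    next_band = (interferer_band + 1) % N_BANDS
    reward = 0.0 if radar_band == next_band else 1.0  # collision gives no reward
    return next_band, reward


def train_q_table(n_steps=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * N_BANDS for _ in range(N_BANDS)]  # q[state][action]
    state = 0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = rng.randrange(N_BANDS)
        else:
            action = max(range(N_BANDS), key=lambda a: q[state][a])
        next_state, reward = step(state, action)
        # Standard Q-learning target: reward plus discounted best next value.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q_table = train_q_table()
greedy_policy = [max(range(N_BANDS), key=lambda a: q_table[s][a]) for s in range(N_BANDS)]
```

After training, the greedy action in each state should avoid the band the interferer is about to occupy.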

Estimation for Compositional Data using Measurements from Nonlinear Systems using Artificial Neural Networks


Title	Estimation for Compositional Data using Measurements from Nonlinear Systems using Artificial Neural Networks
Authors	Se Un Park
Abstract	Our objective is to estimate the unknown compositional input from its output response through an unknown system after estimating the inverse of the original system with a training set. The proposed methods using artificial neural networks (ANNs) can compete with the optimal bounds for linear systems, where convex optimization theory applies, and demonstrate promising results for nonlinear system inversions. We performed extensive experiments by designing numerous different types of nonlinear systems.
Tasks
Published	2020-01-24
URL	https://arxiv.org/abs/2001.09040v1
PDF	https://arxiv.org/pdf/2001.09040v1.pdf
PWC	https://paperswithcode.com/paper/estimation-for-compositional-data-using
Repo
Framework

DCT-Conv: Coding filters in convolutional networks with Discrete Cosine Transform


Title	DCT-Conv: Coding filters in convolutional networks with Discrete Cosine Transform
Authors	Karol Chęciński, Paweł Wawrzyński
Abstract	Convolutional neural networks are based on a huge number of trained weights. Consequently, they are often data-greedy, sensitive to overtraining, and learn slowly. We follow the line of research in which filters of convolutional neural layers are determined on the basis of a smaller number of trained parameters. In this paper, the trained parameters define a frequency spectrum which is transformed into convolutional filters with Inverse Discrete Cosine Transform (IDCT, the same is applied in decompression from JPEG). We analyze how switching off selected components of the spectra, thereby reducing the number of trained weights of the network, affects its performance. Our experiments show that coding the filters with trained DCT parameters leads to improvement over traditional convolution. Also, the performance of the networks modified this way decreases very slowly with the increasing extent of switching off these parameters. In some experiments, a good performance is observed when even 99.9% of these parameters are switched off.
Tasks
Published	2020-01-23
URL	https://arxiv.org/abs/2001.08517v3
PDF	https://arxiv.org/pdf/2001.08517v3.pdf
PWC	https://paperswithcode.com/paper/dct-conv-coding-filters-in-convolutional
Repo
Framework
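
To make the idea of coding a filter by its spectrum concrete, the sketch below builds a small convolution filter from DCT coefficients with an inverse DCT and lets a mask switch off selected components. The orthonormal scaling, the mask representation, and the function names are assumptions of this example, not details taken from the paper.

```python
import math


def idct2(spectrum):
    """Orthonormal 2-D inverse DCT (DCT-III) of a square grid of coefficients."""
    n = len(spectrum)

    def basis(k, x):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        return scale * math.cos((2 * x + 1) * k * math.pi / (2 * n))

    return [
        [
            sum(spectrum[u][v] * basis(u, x) * basis(v, y)
                for u in range(n) for v in range(n))
            for y in range(n)
        ]
        for x in range(n)
    ]


def filter_from_spectrum(spectrum, mask):
    """Zero the coefficients whose mask entry is False, then transform to a spatial filter."""
    masked = [
        [coef if keep else 0.0 for coef, keep in zip(row, mask_row)]
        for row, mask_row in zip(spectrum, mask)
    ]
    return idct2(masked)


# A 3x3 spectrum in which only the lowest-frequency (DC) component is kept.
spectrum = [[3.0, 0.5, -0.2], [0.1, 0.4, 0.0], [-0.3, 0.0, 0.2]]
dc_only = [[True, False, False], [False, False, False], [False, False, False]]
flat_filter = filter_from_spectrum(spectrum, dc_only)
```

With only the DC coefficient active, every tap of the resulting filter has the same value; in a trainable layer the unmasked coefficients would be the learned parameters.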

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning


Title	Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Authors	Ali Mousavi, Lihong Li, Qiang Liu, Denny Zhou
Abstract	Off-policy estimation for long-horizon problems is important in many real-life applications such as healthcare and robotics, where high-fidelity simulators may not be available and on-policy evaluation is expensive or impossible. Recently, Liu et al. (2018) proposed an approach that avoids the curse of horizon suffered by typical importance-sampling-based methods. While showing promising results, this approach is limited in practice as it requires data be drawn from the stationary distribution of a known behavior policy. In this work, we propose a novel approach that eliminates such limitations. In particular, we formulate the problem as solving for the fixed point of a certain operator. Using tools from Reproducing Kernel Hilbert Spaces (RKHSs), we develop a new estimator that computes importance ratios of stationary distributions, without knowledge of how the off-policy data are collected. We analyze its asymptotic consistency and finite-sample generalization. Experiments on benchmarks verify the effectiveness of our approach.
Tasks
Published	2020-03-24
URL	https://arxiv.org/abs/2003.11126v1
PDF	https://arxiv.org/pdf/2003.11126v1.pdf
PWC	https://paperswithcode.com/paper/black-box-off-policy-estimation-for-infinite-1
Repo
Framework

Polarimetric Guided Nonlocal Means Covariance Matrix Estimation for Defoliation Mapping


Title	Polarimetric Guided Nonlocal Means Covariance Matrix Estimation for Defoliation Mapping
Authors	Jørgen A. Agersborg, Stian Normann Anfinsen, Jane Uhd Jepsen
Abstract	In this study we investigate the potential for using Synthetic Aperture Radar (SAR) data to provide high resolution defoliation and regrowth mapping of trees in the tundra-forest ecotone. Using in situ measurements collected in 2017 we calculated the proportion of both live and defoliated tree crown for 165 ground plots of 10 m × 10 m along six transects. Quad-polarimetric SAR data from RADARSAT-2 was collected from the same area, and the complex multilook polarimetric covariance matrix was calculated using a novel extension of guided nonlocal means speckle filtering. The nonlocal approach allows us to preserve the high spatial resolution of single-look complex data, which is essential for accurate mapping of the sparsely scattered trees in the study area. Using a standard random forest classification algorithm, our filtering results in a 73.8% classification accuracy, higher than traditional speckle filtering methods, and on par with the classification accuracy based on optical data.
Tasks
Published	2020-01-24
URL	https://arxiv.org/abs/2001.08976v1
PDF	https://arxiv.org/pdf/2001.08976v1.pdf
PWC	https://paperswithcode.com/paper/polarimetric-guided-nonlocal-means-covariance
Repo
Framework

A continuum limit for the PageRank algorithm


Title	A continuum limit for the PageRank algorithm
Authors	Amber Yuan, Jeff Calder, Braxton Osting
Abstract	Semi-supervised and unsupervised machine learning methods often rely on graphs to model data, prompting research on how theoretical properties of operators on graphs are leveraged in learning problems. While most of the existing literature focuses on undirected graphs, directed graphs are very important in practice, giving models for physical, biological, or transportation networks, among many other applications. In this paper, we propose a new framework for rigorously studying continuum limits of learning algorithms on directed graphs. We use the new framework to study the PageRank algorithm, and show how it can be interpreted as a numerical scheme on a directed graph involving a type of normalized graph Laplacian. We show that the corresponding continuum limit problem, which is taken as the number of webpages grows to infinity, is a second-order, possibly degenerate, elliptic equation that contains reaction, diffusion, and advection terms. We prove that the numerical scheme is consistent and stable and compute explicit rates of convergence of the discrete solution to the solution of the continuum limit PDE. We give applications to proving stability and asymptotic regularity of the PageRank vector.
Tasks
Published	2020-01-24
URL	https://arxiv.org/abs/2001.08973v2
PDF	https://arxiv.org/pdf/2001.08973v2.pdf
PWC	https://paperswithcode.com/paper/a-continuum-limit-for-the-pagerank-algorithm
Repo
Framework
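
For readers who want the discrete algorithm that the continuum analysis starts from, here is a minimal power-iteration sketch of PageRank on a small directed graph. The damping factor 0.85, the uniform teleportation, and the handling of dangling nodes are conventional choices assumed for this example rather than details from the paper.

```python
def pagerank(out_links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank.

    out_links maps every node to the list of nodes it links to; every target
    must also appear as a key. Dangling nodes spread their rank uniformly.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        new_rank = {v: (1.0 - damping) / n for v in nodes}  # teleportation term
        for v in nodes:
            targets = out_links[v] if out_links[v] else nodes
            share = damping * rank[v] / len(targets)
            for w in targets:
                new_rank[w] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Two pages link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

Each iteration is one application of the damped, normalized adjacency operator; the fixed point is the PageRank vector.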

Bag of Tricks for Retail Product Image Classification


Title	Bag of Tricks for Retail Product Image Classification
Authors	Muktabh Mayank Srivastava
Abstract	Retail Product Image Classification is an important Computer Vision and Machine Learning problem for building real world systems like self-checkout stores and automated retail execution evaluation. In this work, we present various tricks to increase accuracy of Deep Learning models on different types of retail product image classification datasets. These tricks enable us to increase the accuracy of fine-tuned convnets for retail product image classification by a large margin. As the most prominent trick, we introduce a new neural network layer called Local-Concepts-Accumulation (LCA) layer which gives consistent gains across multiple datasets. Two other tricks we find to increase accuracy on retail product identification are using an Instagram-pretrained Convnet and using Maximum Entropy as an auxiliary loss for classification.
Tasks	Image Classification
Published	2020-01-12
URL	https://arxiv.org/abs/2001.03992v1
PDF	https://arxiv.org/pdf/2001.03992v1.pdf
PWC	https://paperswithcode.com/paper/bag-of-tricks-for-retail-product-image
Repo
Framework

On the Information Bottleneck Problems: Models, Connections, Applications and Information Theoretic Views


Title	On the Information Bottleneck Problems: Models, Connections, Applications and Information Theoretic Views
Authors	Abdellatif Zaidi, Inaki Estella Aguerri, Shlomo Shamai
Abstract	This tutorial paper focuses on the variants of the bottleneck problem taking an information theoretic perspective and discusses practical methods to solve it, as well as its connection to coding and learning aspects. The intimate connections of this setting to remote source-coding under logarithmic loss distortion measure, information combining, common reconstruction, the Wyner-Ahlswede-Korner problem, the efficiency of investment information, as well as generalization, variational inference, representation learning, autoencoders, and others are highlighted. We discuss its extension to the distributed information bottleneck problem with emphasis on the Gaussian model and highlight the basic connections to the uplink Cloud Radio Access Networks (CRAN) with oblivious processing. For this model, the optimal trade-offs between relevance (i.e., information) and complexity (i.e., rates) in the discrete and vector Gaussian frameworks are determined. In the concluding outlook, some interesting problems are mentioned, such as the characterization of the optimal input (“features”) distributions under power limitations maximizing the “relevance” for the Gaussian information bottleneck, under “complexity” constraints.
Tasks	Representation Learning
Published	2020-01-31
URL	https://arxiv.org/abs/2002.00008v1
PDF	https://arxiv.org/pdf/2002.00008v1.pdf
PWC	https://paperswithcode.com/paper/on-the-information-bottleneck-problems-models
Repo
Framework

PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference


Title	PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference
Authors	Masashi Okada, Norio Kosaka, Tadahiro Taniguchi
Abstract	In the present paper, we propose an extension of the Deep Planning Network (PlaNet), also referred to as PlaNet of the Bayesians (PlaNet-Bayes). There has been a growing demand for model predictive control (MPC) in partially observable environments in which complete information is unavailable because of, for example, lack of expensive sensors. PlaNet is a promising solution to realize such latent MPC, as it is used to train state-space models via model-based reinforcement learning (MBRL) and to conduct planning in the latent space. However, recent state-of-the-art strategies mentioned in MBRL literature, such as incorporating uncertainty into training and planning, have not been considered, significantly suppressing the training performance. The proposed extension is to make PlaNet uncertainty-aware on the basis of Bayesian inference, in which both model and action uncertainty are incorporated. Uncertainty in latent models is represented using a neural network ensemble to approximately infer model posteriors. The ensemble of optimal action candidates is also employed to capture multimodal uncertainty in the optimality. The concept of the action ensemble relies on a general variational inference MPC (VI-MPC) framework and its instance, probabilistic action ensemble with trajectory sampling (PaETS). In this paper, we extend VI-MPC and PaETS, which were originally introduced in previous literature, to address partially observable cases. We experimentally compare the performances on continuous control tasks, and conclude that our method can consistently improve the asymptotic performance compared with PlaNet.
Tasks	Bayesian Inference, Continuous Control
Published	2020-03-01
URL	https://arxiv.org/abs/2003.00370v1
PDF	https://arxiv.org/pdf/2003.00370v1.pdf
PWC	https://paperswithcode.com/paper/planet-of-the-bayesians-reconsidering-and
Repo
Framework

A Note on Portfolio Optimization with Quadratic Transaction Costs


Title	A Note on Portfolio Optimization with Quadratic Transaction Costs
Authors	Pierre Chen, Edmond Lezmi, Thierry Roncalli, Jiali Xu
Abstract	In this short note, we consider mean-variance optimized portfolios with transaction costs. We show that introducing quadratic transaction costs makes the optimization problem more difficult than using linear transaction costs. The reason lies in the specification of the budget constraint, which is no longer linear. We provide numerical algorithms for solving this issue and illustrate how transaction costs may considerably impact the expected returns of optimized portfolios.
Tasks	Portfolio Optimization
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01612v1
PDF	https://arxiv.org/pdf/2001.01612v1.pdf
PWC	https://paperswithcode.com/paper/a-note-on-portfolio-optimization-with
Repo
Framework
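
The point about the budget constraint can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the constraint has the form sum(w) + sum(c_i * (w_i - w0_i)^2) = 1, so that the cash paid in costs is taken out of the invested wealth. The bisection on a single scale factor is a toy device to show the nonlinearity and is not one of the algorithms from the note.

```python
def budget_residual(weights, current, costs):
    """Invested weights plus quadratic transaction costs, minus the unit budget.

    Assumed constraint for this sketch: sum(w) + sum(c_i * (w_i - w0_i)**2) = 1.
    """
    cost = sum(c * (w - w0) ** 2 for w, w0, c in zip(weights, current, costs))
    return sum(weights) + cost - 1.0


def scale_to_budget(target, current, costs, iters=200):
    """Find s in [0, 1] so that the portfolio s * target satisfies the budget.

    target must sum to 1 and costs must be small enough that holding nothing
    leaves a negative residual; the residual is then non-negative at s = 1.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if budget_residual([mid * p for p in target], current, costs) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi


current = [0.5, 0.5]   # weights before rebalancing (hypothetical)
target = [0.8, 0.2]    # desired proportions, summing to 1 (hypothetical)
costs = [0.1, 0.1]     # quadratic cost coefficients (hypothetical)
scale = scale_to_budget(target, current, costs)
after_costs = [scale * p for p in target]
```

Because the cost term is quadratic in the weights, the feasible scale cannot be read off from a linear equation, and the final weights sum to slightly less than 1.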

Generalized Sliced Distances for Probability Distributions


Title	Generalized Sliced Distances for Probability Distributions
Authors	Soheil Kolouri, Kimia Nadjahi, Umut Simsekli, Shahin Shahrampour
Abstract	Probability metrics have become an indispensable part of modern statistics and machine learning, and they play a quintessential role in various applications, including statistical hypothesis testing and generative modeling. However, in a practical setting, the convergence behavior of the algorithms built upon these distances has not been well established, except for a few specific cases. In this paper, we introduce a broad family of probability metrics, coined as Generalized Sliced Probability Metrics (GSPMs), that are deeply rooted in the generalized Radon transform. We first verify that GSPMs are metrics. Then, we identify a subset of GSPMs that are equivalent to maximum mean discrepancy (MMD) with novel positive definite kernels, which come with a unique geometric interpretation. Finally, by exploiting this connection, we consider GSPM-based gradient flows for generative modeling applications and show that under mild assumptions, the gradient flow converges to the global optimum. We illustrate the utility of our approach on both real and synthetic problems.
Tasks
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12537v1
PDF	https://arxiv.org/pdf/2002.12537v1.pdf
PWC	https://paperswithcode.com/paper/generalized-sliced-distances-for-probability
Repo
Framework
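
As background on what "slicing" means, the sketch below estimates the classical sliced Wasserstein-2 distance between two equally sized point clouds: project onto random directions, sort, and compare the one-dimensional samples. This is the familiar linear-projection case shown for orientation only; it is not the GSPM family defined in the paper, and the Monte Carlo settings are assumptions.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_proj=200, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance.

    xs and ys are lists of points (lists of floats) with the same length and dimension.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        # Draw a random unit direction by normalizing a Gaussian vector.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in direction))
        direction = [v / norm for v in direction]
        # In one dimension, optimal transport matches sorted samples.
        proj_x = sorted(sum(a * b for a, b in zip(p, direction)) for p in xs)
        proj_y = sorted(sum(a * b for a, b in zip(p, direction)) for p in ys)
        total += sum((a - b) ** 2 for a, b in zip(proj_x, proj_y)) / len(proj_x)
    return math.sqrt(total / n_proj)


cloud = [[0.0, 0.0], [1.0, 0.5], [0.2, -0.7], [-1.1, 0.3]]
shifted = [[x + 3.0, y] for x, y in cloud]
distance = sliced_wasserstein(cloud, shifted)
```

Shifting a cloud by a vector of length 3 gives a distance between 0 and 3, since each projection only sees the component of the shift along its direction.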

$MC^2RAM$: Markov Chain Monte Carlo Sampling in SRAM for Fast Bayesian Inference


Title	$MC^2RAM$: Markov Chain Monte Carlo Sampling in SRAM for Fast Bayesian Inference
Authors	Priyesh Shukla, Ahish Shylendra, Theja Tulabandhula, Amit Ranjan Trivedi
Abstract	This work discusses the implementation of Markov Chain Monte Carlo (MCMC) sampling from an arbitrary Gaussian mixture model (GMM) within SRAM. We show a novel architecture of SRAM by embedding it with random number generators (RNGs), digital-to-analog converters (DACs), and analog-to-digital converters (ADCs) so that SRAM arrays can be used for high performance Metropolis-Hastings (MH) algorithm-based MCMC sampling. Most of the expensive computations are performed within the SRAM and can be parallelized for high speed sampling. Our iterative compute flow minimizes data movement during sampling. We characterize the power-performance trade-off of our design by simulating on 45 nm CMOS technology. For a two-dimensional, two-mixture GMM, the implementation consumes approximately 91 microwatts of power per sampling iteration and produces 500 samples in 2000 clock cycles on average at 1 GHz clock frequency. Our study highlights interesting insights on how low-level hardware non-idealities can affect high-level sampling characteristics, and recommends ways to optimally operate SRAM within area/power constraints for high performance sampling.
Tasks	Bayesian Inference
Published	2020-02-28
URL	https://arxiv.org/abs/2003.02629v1
PDF	https://arxiv.org/pdf/2003.02629v1.pdf
PWC	https://paperswithcode.com/paper/mc2ram-markov-chain-monte-carlo-sampling-in
Repo
Framework
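
For reference, a software version of the algorithm that the hardware accelerates might look like the sketch below: random-walk Metropolis-Hastings sampling from a two-dimensional, two-component Gaussian mixture. The isotropic components, the proposal step size, and the mixture parameters are assumptions chosen for the example and say nothing about the in-SRAM design.

```python
import math
import random


def gmm_density(x, components):
    """Density of a mixture of isotropic Gaussians.

    components is a list of (weight, mean, sigma) tuples.
    """
    dim = len(x)
    total = 0.0
    for weight, mean, sigma in components:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, mean))
        norm_const = (2.0 * math.pi * sigma * sigma) ** (dim / 2.0)
        total += weight * math.exp(-sq_dist / (2.0 * sigma * sigma)) / norm_const
    return total


def metropolis_hastings(density, start, n_samples, step=1.0, seed=0):
    """Random-walk MH with a symmetric Gaussian proposal; density(start) must be positive."""
    rng = random.Random(seed)
    x = list(start)
    p_x = density(x)
    samples, accepted = [], 0
    for _ in range(n_samples):
        candidate = [v + rng.gauss(0.0, step) for v in x]
        p_c = density(candidate)
        # Accept with probability min(1, p_c / p_x).
        if rng.random() * p_x < p_c:
            x, p_x = candidate, p_c
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_samples


mixture = [(0.5, [-2.0, 0.0], 1.0), (0.5, [2.0, 0.0], 1.0)]
samples, acceptance_rate = metropolis_hastings(
    lambda x: gmm_density(x, mixture), start=[-2.0, 0.0], n_samples=5000
)
```

Each iteration needs one random proposal, one density evaluation, and one comparison against a uniform random number, which is the per-sample work the abstract describes moving into the memory array.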