April 2, 2020

Paper Group ANR 179

Distributionally Robust Chance Constrained Programming with Generative Adversarial Networks (GANs)

Title Distributionally Robust Chance Constrained Programming with Generative Adversarial Networks (GANs)
Authors Shipu Zhao, Fengqi You
Abstract This paper presents a novel deep learning based data-driven optimization method. A novel generative adversarial network (GAN) based data-driven distributionally robust chance constrained programming framework is proposed. The GAN is applied to fully extract distributional information from historical data in a nonparametric and unsupervised way without a priori approximation or assumption. Since the GAN utilizes deep neural networks, complicated data distributions and modes can be learned, and it can model uncertainty efficiently and accurately. Distributionally robust chance constrained programming takes into consideration ambiguous probability distributions of uncertain parameters. To tackle the computational challenges, the sample average approximation method is adopted, and the required data samples are generated by the GAN in an end-to-end way through the differentiable networks. The proposed framework is then applied to supply chain optimization under demand uncertainty. The applicability of the proposed approach is illustrated through a county-level case study of a spatially explicit biofuel supply chain in Illinois.
Published 2020-02-28
URL https://arxiv.org/abs/2002.12486v1
PDF https://arxiv.org/pdf/2002.12486v1.pdf
PWC https://paperswithcode.com/paper/distributionally-robust-chance-constrained
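
The sample average approximation step mentioned in the abstract can be illustrated with a toy problem. The sketch below is not the paper's formulation: it assumes a single capacity decision, a chance constraint of the form P(demand <= capacity) >= 1 - epsilon, and a seeded Gaussian sampler standing in for the GAN. The distributionally robust part (ambiguity over the distribution) is not shown. The names `saa_min_capacity` and `violation_rate` are hypothetical.

```python
import math
import random


def saa_min_capacity(demand_samples, epsilon):
    """Smallest sampled capacity x whose empirical P(demand <= x) is >= 1 - epsilon."""
    ordered = sorted(demand_samples)
    n = len(ordered)
    # At least ceil((1 - epsilon) * n) of the samples must satisfy demand <= x.
    needed = math.ceil((1.0 - epsilon) * n)
    return ordered[max(needed, 1) - 1]


def violation_rate(capacity, demand_samples):
    """Fraction of samples in which demand exceeds the chosen capacity."""
    return sum(d > capacity for d in demand_samples) / len(demand_samples)


# Assumption: a seeded Gaussian sampler stands in for samples drawn from a trained GAN.
rng = random.Random(0)
samples = [rng.gauss(100.0, 15.0) for _ in range(1000)]
capacity = saa_min_capacity(samples, epsilon=0.05)
print(capacity, violation_rate(capacity, samples))
```

Replacing the probability with a sample fraction is what makes the constraint tractable; the quality of the answer then depends on how well the generated samples reflect the true demand distribution.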

Roto-Translation Equivariant Convolutional Networks: Application to Histopathology Image Analysis

Title Roto-Translation Equivariant Convolutional Networks: Application to Histopathology Image Analysis
Authors Maxime W. Lafarge, Erik J. Bekkers, Josien P. W. Pluim, Remco Duits, Mitko Veta
Abstract Rotation-invariance is a desired property of machine-learning models for medical image analysis and in particular for computational pathology applications. We propose a framework to encode the geometric structure of the special Euclidean motion group SE(2) in convolutional networks to yield translation and rotation equivariance via the introduction of SE(2)-group convolution layers. This structure enables models to learn feature representations with a discretized orientation dimension that guarantees that their outputs are invariant under a discrete set of rotations. Conventional approaches for rotation invariance rely mostly on data augmentation, but this does not guarantee the robustness of the output when the input is rotated. Moreover, trained conventional CNNs may require test-time rotation augmentation to reach their full capability. This study is focused on histopathology image analysis applications for which it is desirable that the arbitrary global orientation information of the imaged tissues is not captured by the machine learning models. The proposed framework is evaluated on three different histopathology image analysis tasks (mitosis detection, nuclei segmentation and tumor classification). We present a comparative analysis for each problem and show that a consistent increase in performance can be achieved when using the proposed framework.
Tasks Data Augmentation, Mitosis Detection
Published 2020-02-20
URL https://arxiv.org/abs/2002.08725v1
PDF https://arxiv.org/pdf/2002.08725v1.pdf
PWC https://paperswithcode.com/paper/roto-translation-equivariant-convolutional
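
The idea of a discretized orientation dimension can be shown on a single image patch. The sketch below is a simplification, not the paper's SE(2) layers: it assumes only the four 90-degree rotations, one 3x3 kernel, and a single spatial position. Rotating the input cyclically shifts the responses along the orientation axis (equivariance), so pooling over that axis gives a rotation-invariant value. All names are hypothetical.

```python
def rot90(grid):
    """Rotate a square grid by 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]


def inner(a, b):
    """Sum of element-wise products of two equally sized grids."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def orientation_responses(patch, kernel):
    """Responses of the patch to the kernel rotated by 0, 90, 180 and 270 degrees."""
    responses = []
    k = kernel
    for _ in range(4):
        responses.append(inner(patch, k))
        k = rot90(k)
    return responses


patch = [[1, 2, 0], [0, 3, 1], [4, 0, 2]]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

r0 = orientation_responses(patch, kernel)
r1 = orientation_responses(rot90(patch), kernel)

# Rotating the input shifts the orientation axis by one step ...
print(r0, r1)
# ... so the maximum over orientations does not change.
print(max(r0), max(r1))
```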

On the Compressibility of Affinely Singular Random Vectors

Title On the Compressibility of Affinely Singular Random Vectors
Authors Mohammad-Amin Charusaie, Stefano Rini, Arash Amini
Abstract There are several ways to measure the compressibility of a random measure; they include the general rate-distortion curve, as well as more specific notions such as Renyi information dimension (RID), and dimensional-rate bias (DRB). The RID parameter indicates the concentration of the measure around lower-dimensional subsets of the space while the DRB parameter specifies the compressibility of the distribution over these lower-dimensional subsets. While the evaluation of such compressibility parameters is well-studied for continuous and discrete measures (e.g., the DRB is closely related to the entropy and differential entropy in discrete and continuous cases, respectively), the case of discrete-continuous measures is quite subtle. In this paper, we focus on a class of multi-dimensional random measures that have singularities on affine lower-dimensional subsets. These cases are of interest when working with linear transformation of component-wise independent discrete-continuous random variables. Here, we evaluate the RID and DRB for such probability measures. We further provide an upper-bound for the RID of multi-dimensional random measures that are obtained by Lipschitz functions of component-wise independent discrete-continuous random variables (x). The upper-bound is shown to be achievable when the Lipschitz function is Ax, where A satisfies SPARK(A) = rank (A)+1 (e.g., Vandermonde matrices). The above results in the case of discrete-domain moving-average processes with non-Gaussian excitation noise allow us to evaluate the block-average RID and to find a relationship between this parameter and other existing compressibility measures.
Published 2020-01-12
URL https://arxiv.org/abs/2001.03884v2
PDF https://arxiv.org/pdf/2001.03884v2.pdf
PWC https://paperswithcode.com/paper/on-the-compressibility-of-affinely-singular

Neuron Shapley: Discovering the Responsible Neurons

Title Neuron Shapley: Discovering the Responsible Neurons
Authors Amirata Ghorbani, James Zou
Abstract We develop Neuron Shapley as a new framework to quantify the contribution of individual neurons to the prediction and performance of a deep network. By accounting for interactions across neurons, Neuron Shapley is more effective in identifying important filters compared to common approaches based on activation patterns. Interestingly, removing just 30 filters with the highest Shapley scores effectively destroys the prediction accuracy of Inception-v3 on ImageNet. Visualization of these few critical filters provides insights into how the network functions. Neuron Shapley is a flexible framework and can be applied to identify responsible neurons in many tasks. We illustrate additional applications of identifying filters that are responsible for biased prediction in facial recognition and filters that are vulnerable to adversarial attacks. Removing these filters is a quick way to repair models. Enabling all these applications is a new multi-arm bandit algorithm that we developed to efficiently estimate Neuron Shapley values.
Published 2020-02-23
URL https://arxiv.org/abs/2002.09815v2
PDF https://arxiv.org/pdf/2002.09815v2.pdf
PWC https://paperswithcode.com/paper/neuron-shapley-discovering-the-responsible
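
To make the Shapley idea concrete, here is a minimal sketch that estimates Shapley values by plain Monte Carlo sampling of permutations. This is not the multi-armed bandit algorithm the abstract refers to, and the value function is a toy stand-in for "accuracy of the network when only these filters are kept". The names `shapley_monte_carlo` and `toy_accuracy` are hypothetical.

```python
import random


def shapley_monte_carlo(players, value, num_permutations, seed=0):
    """Average each player's marginal contribution over random orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = value(coalition)
        for p in order:
            coalition.add(p)
            current = value(coalition)
            totals[p] += current - previous
            previous = current
    return {p: t / num_permutations for p, t in totals.items()}


def toy_accuracy(kept):
    """Toy value in percentage points: filters a and b are redundant, c is independent."""
    return 60 * (("a" in kept) or ("b" in kept)) + 30 * ("c" in kept)


scores = shapley_monte_carlo(["a", "b", "c"], toy_accuracy, num_permutations=2000)
print(scores)
```

Because `a` and `b` are redundant in this toy game, removing either one alone changes nothing, yet each still receives credit. That interaction effect is the kind of thing a per-neuron activation statistic cannot capture.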

MSE-Optimal Neural Network Initialization via Layer Fusion

Title MSE-Optimal Neural Network Initialization via Layer Fusion
Authors Ramina Ghods, Andrew S. Lan, Tom Goldstein, Christoph Studer
Abstract Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. However, the use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problems renders parameter learning susceptible to initialization. To address this issue, a variety of methods that rely on random parameter initialization or knowledge distillation have been proposed in the past. In this paper, we propose FuseInit, a novel method to initialize shallower networks by fusing neighboring layers of deeper networks that are trained with random initialization. We develop theoretical results and efficient algorithms for mean-square error (MSE)-optimal fusion of neighboring dense-dense, convolutional-dense, and convolutional-convolutional layers. We show experiments for a range of classification and regression datasets, which suggest that deeper neural networks are less sensitive to initialization and shallower networks can perform better (sometimes as well as their deeper counterparts) if initialized with FuseInit.
Published 2020-01-28
URL https://arxiv.org/abs/2001.10509v1
PDF https://arxiv.org/pdf/2001.10509v1.pdf
PWC https://paperswithcode.com/paper/mse-optimal-neural-network-initialization-via

On the Consistency of Optimal Bayesian Feature Selection in the Presence of Correlations

Title On the Consistency of Optimal Bayesian Feature Selection in the Presence of Correlations
Authors Ali Foroughi pour, Lori A. Dalton
Abstract Optimal Bayesian feature selection (OBFS) is a multivariate supervised screening method designed from the ground up for biomarker discovery. In this work, we prove that Gaussian OBFS is strongly consistent under mild conditions, and provide rates of convergence for key posteriors in the framework. These results are of enormous importance, since they identify precisely what features are selected by OBFS asymptotically, characterize the relative rates of convergence for posteriors on different types of features, provide conditions that guarantee convergence, justify the use of OBFS when its internal assumptions are invalid, and set the stage for understanding the asymptotic behavior of other algorithms based on the OBFS framework.
Tasks Feature Selection
Published 2020-02-01
URL https://arxiv.org/abs/2002.00120v1
PDF https://arxiv.org/pdf/2002.00120v1.pdf
PWC https://paperswithcode.com/paper/on-the-consistency-of-optimal-bayesian

NeurOpt: Neural network based optimization for building energy management and climate control

Title NeurOpt: Neural network based optimization for building energy management and climate control
Authors Achin Jain, Francesco Smarra, Enrico Reticcioli, Alessandro D’Innocenzo, Manfred Morari
Abstract Model predictive control (MPC) can provide significant energy cost savings in building operations in the form of energy-efficient control with better occupant comfort, lower peak demand charges, and risk-free participation in demand response. However, the engineering effort required to obtain physics-based models of buildings for MPC is considered to be the biggest bottleneck in making MPC scalable to real buildings. In this paper, we propose a data-driven control algorithm based on neural networks to reduce this cost of model identification. Our approach does not require building domain expertise or retrofitting of the existing heating and cooling systems. We validate our learning and control algorithms on a two-story building with 10 independently controlled zones, located in Italy. We learn dynamical models of energy consumption and zone temperatures with high accuracy and demonstrate energy savings and better occupant comfort compared to the default system controller.
Published 2020-01-22
URL https://arxiv.org/abs/2001.07831v1
PDF https://arxiv.org/pdf/2001.07831v1.pdf
PWC https://paperswithcode.com/paper/neuropt-neural-network-based-optimization-for

Event sequence metric learning

Title Event sequence metric learning
Authors Dmitrii Babaev, Ivan Kireev, Nikita Ovsov, Mariya Ivanova, Gleb Gusev, Alexander Tuzhilin
Abstract In this paper we consider the challenging problem of learning discriminative vector representations for event sequences generated by real-world users. Vector representations map raw behavioral client data to low-dimensional fixed-length vectors in a latent space. We propose a novel method of learning those vector embeddings based on a metric learning approach. We propose a strategy for generating subsequences of raw data so that the metric learning approach can be applied in a fully self-supervised way. We evaluated the method on several public bank transaction datasets and showed that the self-supervised embeddings outperform other methods when applied to downstream classification tasks. Moreover, the embeddings are compact and provide additional user privacy protection.
Tasks Metric Learning
Published 2020-02-19
URL https://arxiv.org/abs/2002.08232v1
PDF https://arxiv.org/pdf/2002.08232v1.pdf
PWC https://paperswithcode.com/paper/event-sequence-metric-learning
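
One way to picture the self-supervised setup is the pair-generation step: subsequences drawn from the same user's history form positive pairs, and subsequences from different users form negative pairs. The sketch below assumes random contiguous slices, which the abstract does not specify, and assumes every sequence has at least `min_len` events. The function names are hypothetical, and the encoder and metric loss are not shown.

```python
import random


def sample_subsequences(events, k, min_len, max_len, rng):
    """Draw k random contiguous slices from one user's event sequence."""
    slices = []
    for _ in range(k):
        length = rng.randint(min_len, min(max_len, len(events)))
        start = rng.randint(0, len(events) - length)
        slices.append(events[start:start + length])
    return slices


def make_pairs(sequences_by_user, k, min_len, max_len, seed=0):
    """Return (sub_a, sub_b, same_user) tuples for every pair of sampled slices."""
    rng = random.Random(seed)
    subs = [
        (user, s)
        for user, events in sequences_by_user.items()
        for s in sample_subsequences(events, k, min_len, max_len, rng)
    ]
    pairs = []
    for i in range(len(subs)):
        for j in range(i + 1, len(subs)):
            pairs.append((subs[i][1], subs[j][1], subs[i][0] == subs[j][0]))
    return pairs


# Toy data: integers stand in for transaction events.
sequences = {"u1": list(range(10)), "u2": list(range(100, 110))}
pairs = make_pairs(sequences, k=2, min_len=3, max_len=5)
for sub_a, sub_b, same_user in pairs:
    print(sub_a, sub_b, same_user)
```

No labels are needed: the user identity alone supplies the supervision signal for the metric loss.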

See, Attend and Brake: An Attention-based Saliency Map Prediction Model for End-to-End Driving

Title See, Attend and Brake: An Attention-based Saliency Map Prediction Model for End-to-End Driving
Authors Ekrem Aksoy, Ahmet Yazıcı, Mahmut Kasap
Abstract Visual perception is the most critical input for driving decisions. In this study, our aim is to understand the relationship between saliency and driving decisions. We present a novel attention-based saliency map prediction model for making braking decisions. This approach constructs a holistic model of the driving task and can be extended to other driving decisions such as steering and acceleration. The proposed model is a deep neural network that feeds features extracted from the input image to a recurrent neural network with an attention mechanism. The predicted saliency map is then used to make the braking decision. We trained and evaluated the model using the driving attention dataset BDD-A and the saliency dataset CAT2000.
Published 2020-02-24
URL https://arxiv.org/abs/2002.11020v1
PDF https://arxiv.org/pdf/2002.11020v1.pdf
PWC https://paperswithcode.com/paper/see-attend-and-brake-an-attention-based

Automatic Curriculum Learning For Deep RL: A Short Survey

Title Automatic Curriculum Learning For Deep RL: A Short Survey
Authors Rémy Portelas, Cédric Colas, Lilian Weng, Katja Hofmann, Pierre-Yves Oudeyer
Abstract Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization or to solve sparse reward problems, among others. The ambition of this work is dual: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas.
Published 2020-03-10
URL https://arxiv.org/abs/2003.04664v1
PDF https://arxiv.org/pdf/2003.04664v1.pdf
PWC https://paperswithcode.com/paper/automatic-curriculum-learning-for-deep-rl-a

On Localizing a Camera from a Single Image

Title On Localizing a Camera from a Single Image
Authors Pradipta Ghosh, Xiaochen Liu, Hang Qiu, Marcos A. M. Vieira, Gaurav S. Sukhatme, Ramesh Govindan
Abstract Public cameras often have limited metadata describing their attributes. A key missing attribute is the precise location of the camera, with which it is possible to pinpoint the location of events seen in the camera. In this paper, we explore the following question: under what conditions is it possible to estimate the location of a camera from a single image taken by the camera? We show that, using a judicious combination of projective geometry, neural networks, and crowd-sourced annotations from human workers, it is possible to position 95% of the images in our test data set to within 12 m. This performance is two orders of magnitude better than PoseNet, a state-of-the-art neural network that, when trained on a large corpus of images in an area, can estimate the pose of a single image. Finally, we show that the camera’s inferred position and intrinsic parameters can help design a number of virtual sensors, all of which are reasonably accurate.
Published 2020-03-24
URL https://arxiv.org/abs/2003.10664v1
PDF https://arxiv.org/pdf/2003.10664v1.pdf
PWC https://paperswithcode.com/paper/on-localizing-a-camera-from-a-single-image

RePose: Learning Deep Kinematic Priors for Fast Human Pose Estimation

Title RePose: Learning Deep Kinematic Priors for Fast Human Pose Estimation
Authors Hossam Isack, Christian Haene, Cem Keskin, Sofien Bouaziz, Yuri Boykov, Shahram Izadi, Sameh Khamis
Abstract We propose a novel efficient and lightweight model for human pose estimation from a single image. Our model is designed to achieve competitive results at a fraction of the number of parameters and computational cost of various state-of-the-art methods. To this end, we explicitly incorporate part-based structural and geometric priors in a hierarchical prediction framework. At the coarsest resolution, and in a manner similar to classical part-based approaches, we leverage the kinematic structure of the human body to propagate convolutional feature updates between the keypoints or body parts. Unlike classical approaches, we adopt end-to-end training to learn this geometric prior through feature updates from data. We then propagate the feature representation at the coarsest resolution up the hierarchy to refine the predicted pose in a coarse-to-fine fashion. The final network effectively models the geometric prior and intuition within a lightweight deep neural network, yielding state-of-the-art results for a model of this size on two standard datasets, Leeds Sports Pose and MPII Human Pose.
Tasks Pose Estimation
Published 2020-02-10
URL https://arxiv.org/abs/2002.03933v1
PDF https://arxiv.org/pdf/2002.03933v1.pdf
PWC https://paperswithcode.com/paper/repose-learning-deep-kinematic-priors-for

BERT as a Teacher: Contextual Embeddings for Sequence-Level Reward

Title BERT as a Teacher: Contextual Embeddings for Sequence-Level Reward
Authors Florian Schmidt, Thomas Hofmann
Abstract Measuring the quality of a generated sequence against a set of references is a central problem in many learning frameworks, be it to compute a score, to assign a reward, or to perform discrimination. Despite great advances in model architectures, metrics that scale independently of the number of references are still based on n-gram estimates. We show that the underlying operations, counting words and comparing counts, can be lifted to embedding words and comparing embeddings. An in-depth analysis of BERT embeddings shows empirically that contextual embeddings can be employed to capture the required dependencies while maintaining the necessary scalability through appropriate pruning and smoothing techniques. We cast unconditional generation as a reinforcement learning problem and show that our reward function indeed provides a more effective learning signal than n-gram reward in this challenging setting.
Published 2020-03-05
URL https://arxiv.org/abs/2003.02738v1
PDF https://arxiv.org/pdf/2003.02738v1.pdf
PWC https://paperswithcode.com/paper/bert-as-a-teacher-contextual-embeddings-for
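
The contrast between "counting words and comparing counts" and "embedding words and comparing embeddings" can be sketched as follows. This is a generic illustration, not the paper's reward function: the count-based score is a clipped n-gram precision, and the embedding-based score is a simple greedy cosine match over a hand-made, static toy embedding table (an assumption; the paper uses contextual BERT embeddings). The function names and vectors are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, references, n):
    """Clipped n-gram precision of the candidate against a set of references."""
    cand = ngrams(candidate, n)
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return matched / total if total else 0.0


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_precision(candidate, references, embed):
    """Average, over candidate tokens, of the best cosine match in the references."""
    ref_vectors = [embed[t] for ref in references for t in ref]
    sims = [max(cosine(embed[t], r) for r in ref_vectors) for t in candidate]
    return sum(sims) / len(sims)


# Assumption: hand-made toy vectors in which "sat" and "ran" are similar.
toy_embed = {
    "the": (1.0, 0.0, 0.0),
    "cat": (0.0, 1.0, 0.0),
    "sat": (0.0, 0.0, 1.0),
    "ran": (0.0, 0.6, 0.8),
}
candidate = "the cat sat".split()
references = ["the cat ran".split()]

hard = ngram_precision(candidate, references, n=1)
soft = soft_precision(candidate, references, toy_embed)
print(hard, soft)
```

The count-based score gives no credit for "sat" against "ran", while the embedding-based score gives partial credit, which is the kind of smoother signal the abstract argues for.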

ABBA: Saliency-Regularized Motion-Based Adversarial Blur Attack

Title ABBA: Saliency-Regularized Motion-Based Adversarial Blur Attack
Authors Qing Guo, Felix Juefei-Xu, Xiaofei Xie, Lei Ma, Jian Wang, Wei Feng, Yang Liu
Abstract Deep neural networks are vulnerable to noise-based adversarial examples, which can mislead the networks by adding random-like noise. However, such examples are hardly found in the real world and easily perceived when thumping noises are used to keep their high transferability across different models. In this paper, we identify a new attacking method termed motion-based adversarial blur attack (ABBA) that can generate visually natural motion-blurred adversarial examples even with relatively high perturbation, allowing much better transferability than noise-based methods. To this end, we first formulate the kernel-prediction-based attack where an input image is convolved with kernels in a pixel-wise way, and the misclassification capability is achieved by tuning the kernel weights. To generate visually more natural and plausible examples, we further propose the saliency-regularized adversarial kernel prediction where the salient region serves as a moving object, and the predicted kernel is regularized to achieve visually natural effects. Besides, the attack can be further enhanced by adaptively tuning the translations of the object and background. Extensive experimental results on the NeurIPS’17 adversarial competition dataset validate the effectiveness of ABBA by considering various kernel sizes, translations, and regions. Furthermore, we study the effects of state-of-the-art GAN-based deblurring mechanisms on our method.
Tasks Deblurring
Published 2020-02-10
URL https://arxiv.org/abs/2002.03500v1
PDF https://arxiv.org/pdf/2002.03500v1.pdf
PWC https://paperswithcode.com/paper/abba-saliency-regularized-motion-based
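
The phrase "convolved with kernels in a pixel-wise way" means that every pixel gets its own kernel rather than sharing one across the image. The sketch below shows only that filtering step, under simplifying assumptions: fixed 3x3 kernels, zero padding, a grayscale image as nested lists, and a hand-made mask standing in for the salient region. The adversarial tuning of the kernel weights is not shown, and all names are hypothetical.

```python
def pixelwise_filter(image, kernels):
    """Apply a separate 3x3 kernel at every pixel (zero padding at the borders)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernels[r][c][dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
# Horizontal blur: average of the pixel and its left and right neighbours.
H_BLUR = [[0, 0, 0], [1 / 3, 1 / 3, 1 / 3], [0, 0, 0]]

image = [[0, 0, 0], [3, 6, 9], [0, 0, 0]]
# Assumption: a hand-made mask marks the "moving object" (here, the centre pixel).
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
kernels = [[H_BLUR if mask[r][c] else IDENTITY for c in range(3)] for r in range(3)]

blurred = pixelwise_filter(image, kernels)
print(blurred)
```

Only the masked pixel is blurred; every other pixel passes through its identity kernel unchanged.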

Lossless Compression of Mosaic Images with Convolutional Neural Network Prediction

Title Lossless Compression of Mosaic Images with Convolutional Neural Network Prediction
Authors Seyed Mehdi Ayyoubzadeh, Xiaolin Wu
Abstract We present a CNN-based predictive lossless compression scheme for raw color mosaic images of digital cameras. This specialized application problem was previously understudied but it is now becoming increasingly important, because modern CNN methods for image restoration tasks (e.g., superresolution, low lighting enhancement, deblurring), must operate on original raw mosaic images to obtain the best possible results. The key innovation of this paper is a high-order nonlinear CNN predictor of spatial-spectral mosaic patterns. The deep learning prediction can model highly complex sample dependencies in spatial-spectral mosaic images more accurately and hence remove statistical redundancies more thoroughly than existing image predictors. Experiments show that the proposed CNN predictor achieves unprecedented lossless compression performance on camera raw images.
Tasks Deblurring, Image Restoration
Published 2020-01-28
URL https://arxiv.org/abs/2001.10484v1
PDF https://arxiv.org/pdf/2001.10484v1.pdf
PWC https://paperswithcode.com/paper/lossless-compression-of-mosaic-images-with
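
Predictive lossless coding works by storing the difference between each sample and a prediction that the decoder can reproduce exactly. The sketch below swaps the paper's CNN predictor for a trivial one: it assumes a Bayer-like layout in which the sample two columns to the left has the same colour, and uses that sample as the prediction. The entropy coder that would compress the residuals is not shown, and the function names are hypothetical.

```python
def encode_residuals(mosaic):
    """Replace each sample with its difference from the same-colour left neighbour."""
    residuals = []
    for row in mosaic:
        out = []
        for c, value in enumerate(row):
            # Assumed Bayer-like layout: same colour repeats every two columns.
            prediction = row[c - 2] if c >= 2 else 0
            out.append(value - prediction)
        residuals.append(out)
    return residuals


def decode_residuals(residuals):
    """Rebuild the mosaic by repeating the encoder's predictions on decoded samples."""
    mosaic = []
    for row in residuals:
        out = []
        for c, residual in enumerate(row):
            prediction = out[c - 2] if c >= 2 else 0
            out.append(residual + prediction)
        mosaic.append(out)
    return mosaic


mosaic = [
    [100, 50, 102, 52, 101, 51],
    [48, 20, 50, 22, 49, 21],
]
residuals = encode_residuals(mosaic)
restored = decode_residuals(residuals)
print(residuals)
print(restored == mosaic)
```

A better predictor leaves smaller residuals, which an entropy coder can store in fewer bits; the round trip stays exact because the decoder forms the same predictions from already decoded samples.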