February 1, 2020

Paper Group AWR 189

Adversarial attacks hidden in plain sight

Title Adversarial attacks hidden in plain sight
Authors Jan Philip Göpfert, André Artelt, Heiko Wersing, Barbara Hammer
Abstract Convolutional neural networks have been used to achieve a string of successes during recent years, but their lack of interpretability remains a serious issue. Adversarial examples are designed to deliberately fool neural networks into making any desired incorrect classification, potentially with very high certainty. Several defensive approaches increase robustness against adversarial attacks, demanding attacks of greater magnitude, which lead to visible artifacts. By considering human visual perception, we compose a technique that allows us to hide such adversarial attacks in regions of high complexity, such that they are imperceptible even to an astute observer. We carry out a user study on classifying adversarially modified images to validate the perceptual quality of our approach and find significant evidence for its concealment with regard to human visual perception.
Tasks
Published 2019-02-25
URL https://arxiv.org/abs/1902.09286v2
PDF https://arxiv.org/pdf/1902.09286v2.pdf
PWC https://paperswithcode.com/paper/adversarial-attacks-hidden-in-plain-sight
Repo https://github.com/jangop/entropy-based-adversarials
Framework none

Machine Learning meets Stochastic Geometry: Determinantal Subset Selection for Wireless Networks

Title Machine Learning meets Stochastic Geometry: Determinantal Subset Selection for Wireless Networks
Authors Chiranjib Saha, Harpreet S. Dhillon
Abstract In wireless networks, many problems can be formulated as subset selection problems where the goal is to select a subset from the ground set that maximizes some objective function. These problems are typically NP-hard and hence solved through carefully constructed heuristics, which are themselves mostly NP-complete and thus not easily applicable to large networks. On the other hand, subset selection problems occur in a slightly different context in machine learning (ML), where the goal is to select a subset of high-quality yet diverse items from a ground set. In this paper, we introduce a novel DPP-based learning (DPPL) framework for efficiently solving subset selection problems in wireless networks. The DPPL is intended to replace the traditional optimization algorithms for subset selection by learning the quality-diversity trade-off in the optimal subsets selected by an optimization routine. As a case study, we apply DPPL to the wireless link scheduling problem, where the goal is to determine the subset of simultaneously active links which maximizes the network-wide sum-rate. We demonstrate that the proposed DPPL approaches the optimal solution with significantly lower computational complexity than the popular optimization algorithms used for this problem in the literature.
Tasks
Published 2019-05-01
URL http://arxiv.org/abs/1905.00504v1
PDF http://arxiv.org/pdf/1905.00504v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-meets-stochastic-geometry
Repo https://github.com/stochastic-geometry/DPPL
Framework none
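
The quality-diversity trade-off mentioned in the abstract above can be illustrated with a small determinantal point process (DPP) kernel. The following is a minimal sketch, not the DPPL framework itself: the quality scores, the similarity matrix, and the helper names (`l_kernel`, `subset_score`, `best_subset`) are assumptions made for illustration. It relies only on the standard DPP property that the probability of a subset is proportional to the determinant of the corresponding kernel submatrix, so similar items are unlikely to be selected together.

```python
from itertools import combinations


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def l_kernel(quality, similarity):
    """Quality-diversity decomposition: L[i][j] = q_i * S[i][j] * q_j."""
    n = len(quality)
    return [[quality[i] * similarity[i][j] * quality[j] for j in range(n)]
            for i in range(n)]


def subset_score(kernel, subset):
    """Unnormalized DPP probability of a subset: det of the submatrix."""
    sub = [[kernel[i][j] for j in subset] for i in subset]
    return determinant(sub)


def best_subset(kernel, size):
    """Brute-force search, only feasible for tiny ground sets."""
    candidates = combinations(range(len(kernel)), size)
    return max(candidates, key=lambda s: subset_score(kernel, s))


# Hypothetical toy data: items 0 and 1 are near-duplicates, item 2 is distinct.
quality = [1.0, 1.0, 0.8]
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
kernel = l_kernel(quality, similarity)
chosen = best_subset(kernel, 2)
```

Here the pair of near-duplicates scores lower than a pair that includes the lower-quality but dissimilar item, which is the behavior a scheduler would want if similarity stood in for mutual interference between links (an interpretation assumed here, not taken from the paper).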

NGBoost: Natural Gradient Boosting for Probabilistic Prediction

Title NGBoost: Natural Gradient Boosting for Probabilistic Prediction
Authors Tony Duan, Anand Avati, Daisy Yi Ding, Khanh K. Thai, Sanjay Basu, Andrew Y. Ng, Alejandro Schuler
Abstract We present Natural Gradient Boosting (NGBoost), an algorithm for generic probabilistic prediction via gradient boosting. Typical regression models return a point estimate, conditional on covariates, but probabilistic regression models output a full probability distribution over the outcome space, conditional on the covariates. This allows for predictive uncertainty estimation, which is crucial in applications like healthcare and weather forecasting. NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm. Furthermore, we show how the Natural Gradient is required to correct the training dynamics of our multiparameter boosting approach. NGBoost can be used with any base learner, any family of distributions with continuous parameters, and any scoring rule. NGBoost matches or exceeds the performance of existing methods for probabilistic prediction while offering additional benefits in flexibility, scalability, and usability.
Tasks Weather Forecasting
Published 2019-10-08
URL https://arxiv.org/abs/1910.03225v3
PDF https://arxiv.org/pdf/1910.03225v3.pdf
PWC https://paperswithcode.com/paper/ngboost-natural-gradient-boosting-for
Repo https://github.com/stanfordmlgroup/ngboost
Framework none
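
To make the role of the natural gradient concrete, here is a minimal sketch for a single observation under a Normal distribution parameterized by (mu, log sigma) and scored with the negative log-likelihood. This is not the NGBoost library's API; the function names are hypothetical, and the choice of distribution and scoring rule is an assumption for illustration. The natural gradient is the ordinary gradient multiplied by the inverse Fisher information, which for this parameterization is diag(1/sigma^2, 2).

```python
import math


def nll(y, mu, log_sigma):
    """Negative log-likelihood of y under N(mu, exp(log_sigma)^2)."""
    sigma = math.exp(log_sigma)
    return log_sigma + 0.5 * ((y - mu) / sigma) ** 2 + 0.5 * math.log(2 * math.pi)


def ordinary_gradient(y, mu, log_sigma):
    """Gradient of the NLL with respect to (mu, log_sigma)."""
    sigma2 = math.exp(2 * log_sigma)
    d_mu = (mu - y) / sigma2
    d_log_sigma = 1.0 - (y - mu) ** 2 / sigma2
    return d_mu, d_log_sigma


def natural_gradient(y, mu, log_sigma):
    """Ordinary gradient rescaled by the inverse Fisher information.

    For N(mu, sigma^2) in (mu, log sigma) coordinates the Fisher
    information is diag(1 / sigma^2, 2), so the inverse is diag(sigma^2, 1/2).
    """
    sigma2 = math.exp(2 * log_sigma)
    d_mu, d_log_sigma = ordinary_gradient(y, mu, log_sigma)
    return d_mu * sigma2, d_log_sigma / 2.0


# One observation y = 1 under a current prediction of N(0, 2^2).
grad = ordinary_gradient(1.0, 0.0, math.log(2.0))
nat_grad = natural_gradient(1.0, 0.0, math.log(2.0))
```

Note how the natural gradient for mu reduces to the plain residual (mu - y), independent of the current sigma, which is one way to see why it evens out the training dynamics across the two parameters.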

From Importance Sampling to Doubly Robust Policy Gradient

Title From Importance Sampling to Doubly Robust Policy Gradient
Authors Jiawei Huang, Nan Jiang
Abstract We show that on-policy policy gradient (PG) and its variance reduction variants can be derived by taking finite difference of function evaluations supplied by estimators from the importance sampling (IS) family for off-policy evaluation (OPE). Starting from the doubly robust (DR) estimator (Jiang & Li, 2016), we provide a simple derivation of a very general and flexible form of PG, which subsumes the state-of-the-art variance reduction technique (Cheng et al., 2019) as its special case and immediately hints at further variance reduction opportunities overlooked by existing literature. We analyze the variance of the new DR-PG estimator, compare it to existing methods as well as the Cramer-Rao lower bound of policy gradient, and empirically show its effectiveness.
Tasks
Published 2019-10-20
URL https://arxiv.org/abs/1910.09066v2
PDF https://arxiv.org/pdf/1910.09066v2.pdf
PWC https://paperswithcode.com/paper/from-importance-sampling-to-doubly-robust
Repo https://github.com/Leonardo-H/DR-PG
Framework none
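
The link between importance sampling and policy gradient stated in the abstract above can be checked numerically in the simplest setting. The sketch below is an illustration under stated assumptions, not the paper's DR-PG estimator: it uses a hypothetical two-action bandit with a sigmoid policy, a plain importance sampling (IS) value estimate, and made-up logged data. Taking a central finite difference of the IS estimate around the behavior policy's parameter reproduces the score-function (REINFORCE) gradient.

```python
import math


def prob(theta, action):
    """Sigmoid policy over two actions: P(action = 1) = sigmoid(theta)."""
    p1 = 1.0 / (1.0 + math.exp(-theta))
    return p1 if action == 1 else 1.0 - p1


def is_value(theta_target, theta_behavior, data):
    """Importance sampling estimate of the target policy's expected reward."""
    total = 0.0
    for action, reward in data:
        weight = prob(theta_target, action) / prob(theta_behavior, action)
        total += weight * reward
    return total / len(data)


def score_function_gradient(theta, data):
    """On-policy policy gradient estimate: mean of grad-log-prob times reward."""
    p1 = prob(theta, 1)
    total = 0.0
    for action, reward in data:
        grad_log = (1.0 - p1) if action == 1 else -p1
        total += grad_log * reward
    return total / len(data)


def finite_difference_gradient(theta, data, h=1e-5):
    """Central difference of the IS estimate, evaluated around theta."""
    upper = is_value(theta + h, theta, data)
    lower = is_value(theta - h, theta, data)
    return (upper - lower) / (2 * h)


# Hypothetical logged (action, reward) pairs collected under theta = 0.3.
theta = 0.3
data = [(1, 1.0), (0, 0.2), (1, 0.7), (0, 0.0), (1, 1.3)]
pg = score_function_gradient(theta, data)
fd = finite_difference_gradient(theta, data)
```

The two estimates agree up to finite-difference error because the derivative of the importance weight at the behavior parameter equals the gradient of the log-probability.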

A flexible EM-like clustering algorithm for noisy data

Title A flexible EM-like clustering algorithm for noisy data
Authors Violeta Roizman, Matthieu Jonckheere, Frédéric Pascal
Abstract Though very popular, it is well known that the EM algorithm suffers from non-Gaussian distribution shapes, outliers and high-dimensionality. In this paper, we design a new robust clustering algorithm that can efficiently deal with high-dimensionality, noise and outliers in diverse data sets. As an EM-like algorithm, it is based on both estimations of cluster centers and covariances. In addition, using a semi-parametric paradigm, the method estimates an unknown scale parameter per data-point. This allows the algorithm to leverage high-dimensionality and to accommodate heavier-tailed distributions and outliers without significantly losing efficiency in various classical scenarios. After deriving and analyzing the proposed algorithm, we study the convergence and accuracy of the algorithm by considering first synthetic data. Then, we show that the proposed algorithm outperforms other classical unsupervised methods of the literature such as k-means, the EM algorithm and its recent modifications or spectral clustering when applied to real data sets such as MNIST, NORB and 20newsgroups.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01660v2
PDF https://arxiv.org/pdf/1907.01660v2.pdf
PWC https://paperswithcode.com/paper/a-flexible-em-like-clustering-algorithm-for
Repo https://github.com/violetr/frem
Framework none

Adaptive Loss Scaling for Mixed Precision Training

Title Adaptive Loss Scaling for Mixed Precision Training
Authors Ruizhe Zhao, Brian Vogel, Tanvir Ahmed
Abstract Mixed precision training (MPT) is becoming a practical technique to improve the speed and energy efficiency of training deep neural networks by leveraging the fast hardware support for IEEE half-precision floating point that is available in existing GPUs. MPT is typically used in combination with a technique called loss scaling, which works by scaling the loss value up before the start of backpropagation in order to minimize the impact of numerical underflow on training. Unfortunately, existing methods make this loss scale value a hyperparameter that needs to be tuned per-model, and a single scale cannot be adapted to different layers at different training stages. We introduce a loss scaling-based training method called adaptive loss scaling that makes MPT easier and more practical to use, by removing the need to tune a model-specific loss scale hyperparameter. We achieve this by introducing layer-wise loss scale values which are automatically computed during training to deal with underflow more effectively than existing methods. We present experimental results on a variety of networks and tasks that show our approach can shorten the time to convergence and improve accuracy compared to the existing state-of-the-art MPT and single-precision floating point
Tasks
Published 2019-10-28
URL https://arxiv.org/abs/1910.12385v1
PDF https://arxiv.org/pdf/1910.12385v1.pdf
PWC https://paperswithcode.com/paper/adaptive-loss-scaling-for-mixed-precision
Repo https://github.com/kumasento/ada-loss
Framework none
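
The underflow problem that loss scaling addresses can be reproduced with the standard library alone. The sketch below shows plain static loss scaling with one global scale, not the adaptive layer-wise method from the paper; the helper names and the toy chain of local derivatives are assumptions for illustration. It rounds every intermediate gradient to IEEE half precision using the `struct` module's `e` format.

```python
import struct


def to_half(x):
    """Round a Python float to IEEE half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def backward_in_half(loss_grad, local_grads, scale=1.0):
    """Chain-rule product with every intermediate stored in half precision.

    The loss gradient is multiplied by `scale` before backpropagation and
    the final result is divided by `scale` in full precision afterwards.
    """
    grad = to_half(loss_grad * scale)
    for g in local_grads:
        grad = to_half(grad * g)
    return grad / scale


# Hypothetical chain of small local derivatives; the true product is 1e-8,
# which is below the smallest positive half-precision value (about 6e-8).
local_grads = [1e-3, 1e-3, 1e-2]
unscaled = backward_in_half(1.0, local_grads)
scaled = backward_in_half(1.0, local_grads, scale=1024.0)
```

Without scaling the gradient is flushed to zero, while the scaled run recovers a value close to the true product. A single scale like this has to be tuned by hand, which is the limitation the abstract describes.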

Gendered Ambiguous Pronouns Shared Task: Boosting Model Confidence by Evidence Pooling

Title Gendered Ambiguous Pronouns Shared Task: Boosting Model Confidence by Evidence Pooling
Authors Sandeep Attree
Abstract This paper presents a strong set of results for resolving gendered ambiguous pronouns on the Gendered Ambiguous Pronouns shared task. The model presented here draws upon the strengths of state-of-the-art language and coreference resolution models, and introduces a novel evidence-based deep learning architecture. Injecting evidence from the coreference models complements the base architecture, and analysis shows that the model is not hindered by their weaknesses, specifically gender bias. The modularity and simplicity of the architecture make it very easy to extend for further improvement and applicable to other NLP problems. Evaluation on GAP test data results in a state-of-the-art performance at 92.5% F1 (gender bias of 0.97), edging closer to the human performance of 96.6%. The end-to-end solution presented here placed 1st in the Kaggle competition, winning by a significant lead. The code is available at https://github.com/sattree/gap.
Tasks Coreference Resolution
Published 2019-06-03
URL https://arxiv.org/abs/1906.00839v1
PDF https://arxiv.org/pdf/1906.00839v1.pdf
PWC https://paperswithcode.com/paper/190600839
Repo https://github.com/sattree/gap
Framework tf

Least Squares Approximation for a Distributed System

Title Least Squares Approximation for a Distributed System
Authors Xuening Zhu, Feng Li, Hansheng Wang
Abstract In this work, we develop a distributed least squares approximation (DLSA) method that is able to solve a large family of regression problems (e.g., linear regression, logistic regression, and Cox’s model) on a distributed system. By approximating the local objective function using a local quadratic form, we are able to obtain a combined estimator by taking a weighted average of local estimators. The resulting estimator is proved to be statistically as efficient as the global estimator. Moreover, it requires only one round of communication. We further conduct shrinkage estimation based on the DLSA estimation using an adaptive Lasso approach. The solution can be easily obtained by using the LARS algorithm on the master node. It is theoretically shown that the resulting estimator possesses the oracle property and is selection consistent by using a newly designed distributed Bayesian information criterion (DBIC). The finite sample performance and the computational efficiency are further illustrated by an extensive numerical study and an airline dataset. The airline dataset is 52 GB in size. The entire methodology has been implemented in Python for a de-facto standard Spark system. The proposed DLSA algorithm on the Spark system takes 26 minutes to obtain a logistic regression estimator, whereas a full likelihood algorithm takes 15 hours to obtain an inferior result.
Tasks
Published 2019-08-14
URL https://arxiv.org/abs/1908.04904v2
PDF https://arxiv.org/pdf/1908.04904v2.pdf
PWC https://paperswithcode.com/paper/least-squares-approximation-for-a-distributed
Repo https://github.com/feng-li/dlsa
Framework none
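
The one-round weighted averaging idea in the abstract above can be sketched for the simplest case, a no-intercept linear regression with a single coefficient. This is an illustration under stated assumptions rather than the DLSA implementation: the shard data and function names are hypothetical, and each worker's weight is taken to be its local curvature (the second derivative of its least squares objective). For linear least squares the local objective is exactly quadratic, so the combined estimate coincides with the global fit; for models such as logistic regression the quadratic form is only an approximation.

```python
def local_fit(xs, ys):
    """Local least squares for y = beta * x.

    Returns the local estimate and the local curvature sum(x^2), which
    serves as the weight of this worker in the combination step.
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx, sxx


def combine(local_results):
    """Curvature-weighted average of local estimates (one communication round)."""
    total_weight = sum(weight for _, weight in local_results)
    return sum(beta * weight for beta, weight in local_results) / total_weight


# Hypothetical data split across three workers.
shards = [
    ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2]),
    ([4.0, 5.0], [8.1, 9.8]),
    ([0.5, 1.5, 2.5, 3.5], [1.2, 2.9, 5.1, 6.8]),
]

# Each worker sends only (estimate, weight) to the master node.
combined = combine([local_fit(xs, ys) for xs, ys in shards])

# Reference: a single fit on the pooled data.
all_x = [x for xs, _ in shards for x in xs]
all_y = [y for _, ys in shards for y in ys]
global_beta, _ = local_fit(all_x, all_y)
```

Each worker transmits two numbers regardless of how many rows it holds, which is why a single round of communication suffices in this toy setting.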

Generating High-fidelity, Synthetic Time Series Datasets with DoppelGANger

Title Generating High-fidelity, Synthetic Time Series Datasets with DoppelGANger
Authors Zinan Lin, Alankar Jain, Chen Wang, Giulia Fanti, Vyas Sekar
Abstract Limited data access is a substantial barrier to data-driven networking research and development. Although many organizations are motivated to share data, privacy concerns often prevent the sharing of proprietary data, including between teams in the same organization and with outside stakeholders (e.g., researchers, vendors). Many researchers have therefore proposed synthetic data models, most of which have not gained traction because of their narrow scope. In this work, we present DoppelGANger, a synthetic data generation framework based on generative adversarial networks (GANs). DoppelGANger is designed to work on time series datasets with both continuous features (e.g., traffic measurements) and discrete ones (e.g., protocol name). Modeling time series and mixed-type data is known to be difficult; DoppelGANger circumvents these problems through a new conditional architecture that isolates the generation of metadata from time series, but uses metadata to strongly influence time series generation. We demonstrate the efficacy of DoppelGANger on three real-world datasets. We show that DoppelGANger achieves up to 43% better fidelity than baseline models, and captures structural properties of data that baseline methods are unable to learn. Additionally, it gives data holders an easy mechanism for protecting attributes of their data without substantial loss of data utility.
Tasks Synthetic Data Generation, Time Series
Published 2019-09-30
URL https://arxiv.org/abs/1909.13403v1
PDF https://arxiv.org/pdf/1909.13403v1.pdf
PWC https://paperswithcode.com/paper/generating-high-fidelity-synthetic-time
Repo https://github.com/fjxmlzn/DoppelGANger
Framework tf

An Atomistic Machine Learning Package for Surface Science and Catalysis

Title An Atomistic Machine Learning Package for Surface Science and Catalysis
Authors Martin Hangaard Hansen, José A. Garrido Torres, Paul C. Jennings, Ziyun Wang, Jacob R. Boes, Osman G. Mamun, Thomas Bligaard
Abstract We present workflows and a software module for machine learning model building in surface science and heterogeneous catalysis. The module includes fingerprinting of atomic structures from 3D structure and/or connectivity information, descriptor selection methods and benchmarks, and active learning frameworks for atomic structure optimization, acceleration of screening studies, and exploration of the structure space of nanoparticles, all of which are atomic structure problems relevant to surface science and heterogeneous catalysis. Our overall goal is to provide a repository to ease machine learning model building for catalysis, to advance the models beyond the chemical intuition of the user and to increase autonomy for exploration of chemical space.
Tasks Active Learning
Published 2019-04-01
URL http://arxiv.org/abs/1904.00904v1
PDF http://arxiv.org/pdf/1904.00904v1.pdf
PWC https://paperswithcode.com/paper/an-atomistic-machine-learning-package-for
Repo https://github.com/SUNCAT-Center/CatLearn
Framework none

Branched Multi-Task Networks: Deciding What Layers To Share

Title Branched Multi-Task Networks: Deciding What Layers To Share
Authors Simon Vandenhende, Stamatios Georgoulis, Bert De Brabandere, Luc Van Gool
Abstract In the context of multi-task learning, neural networks with branched architectures have often been employed to jointly tackle the tasks at hand. Such ramified networks typically start with a number of shared layers, after which different tasks branch out into their own sequence of layers. Understandably, as the number of possible network configurations is combinatorially large, deciding what layers to share and where to branch out becomes cumbersome. Prior works have either relied on ad hoc methods to determine the level of layer sharing, which is suboptimal, or utilized neural architecture search techniques to establish the network design, which is considerably expensive. In this paper, we go beyond these limitations and propose a principled approach to automatically construct branched multi-task networks, by leveraging the employed tasks’ affinities. Given a specific budget, i.e. number of learnable parameters, the proposed approach generates architectures in which shallow layers are task-agnostic, whereas deeper ones gradually grow more task-specific. Extensive experimental analysis across numerous, diverse multi-tasking datasets shows that, for a given budget, our method consistently yields networks with the highest performance, while for a certain performance threshold it requires the fewest learnable parameters.
Tasks Multi-Task Learning, Neural Architecture Search
Published 2019-04-05
URL https://arxiv.org/abs/1904.02920v2
PDF https://arxiv.org/pdf/1904.02920v2.pdf
PWC https://paperswithcode.com/paper/branched-multi-task-networks-deciding-what
Repo https://github.com/SimonVandenhende/Branched-Multi-Task-Networks-Deciding-What-Layers-To-Share
Framework none

Linear colour segmentation revisited

Title Linear colour segmentation revisited
Authors Anna Smagina, Valentina Bozhkova, Sergey Gladilin, Dmitry Nikolaev
Abstract In this work we discuss the known algorithms for linear colour segmentation based on a physical approach and propose a new modification of the segmentation algorithm. This algorithm is based on a region adjacency graph framework without a pre-segmentation stage. The proposed edge weight functions are defined from a linear image model with normal noise. The colour space projective transform is introduced as a novel pre-processing technique for better handling of shadow and highlight areas. The resulting algorithm is tested on a benchmark dataset consisting of the images of 19 natural scenes selected from the Barnard’s DXC-930 SFU dataset and 12 natural scene images newly published for common use. The dataset is provided with pixel-by-pixel ground truth colour segmentation for every image. Using this dataset, we show that the proposed algorithm modifications lead to qualitative advantages over other model-based segmentation algorithms, and also show the positive effect of each proposed modification. The source code and datasets for this work are available for free access at http://github.com/visillect/segmentation.
Tasks
Published 2019-01-02
URL http://arxiv.org/abs/1901.00534v1
PDF http://arxiv.org/pdf/1901.00534v1.pdf
PWC https://paperswithcode.com/paper/linear-colour-segmentation-revisited
Repo https://github.com/visillect/segmentation
Framework none

Normalized Diversification

Title Normalized Diversification
Authors Shaohui Liu, Xiao Zhang, Jianqiao Wangni, Jianbo Shi
Abstract Generating diverse yet specific data is the goal of the generative adversarial network (GAN), but it suffers from the problem of mode collapse. We introduce the concept of normalized diversity, which forces the model to preserve the normalized pairwise distance between the sparse samples from a latent parametric distribution and their corresponding high-dimensional outputs. The normalized diversification aims to unfold the manifold of unknown topology and non-uniform distribution, which leads to safe interpolation between valid latent variables. By alternating the maximization over the pairwise distance and updating the total distance (normalizer), we encourage the model to actively explore in the high-dimensional output space. We demonstrate that by combining the normalized diversity loss and the adversarial loss, we generate diverse data without suffering from mode collapsing. Experimental results show that our method achieves consistent improvement on unsupervised image generation, conditional image generation and hand pose estimation over strong baselines.
Tasks Conditional Image Generation, Hand Pose Estimation, Image Generation, Pose Estimation
Published 2019-04-07
URL http://arxiv.org/abs/1904.03608v2
PDF http://arxiv.org/pdf/1904.03608v2.pdf
PWC https://paperswithcode.com/paper/normalized-diversification
Repo https://github.com/B1ueber2y/NDiv
Framework pytorch

Correlating neural and symbolic representations of language

Title Correlating neural and symbolic representations of language
Authors Grzegorz Chrupała, Afra Alishahi
Abstract Analysis methods which enable us to better understand the representations and functioning of neural models of language are increasingly needed as deep learning becomes the dominant approach in NLP. Here we present two methods based on Representational Similarity Analysis (RSA) and Tree Kernels (TK) which allow us to directly quantify how strongly the information encoded in neural activation patterns corresponds to information represented by symbolic structures such as syntax trees. We first validate our methods on the case of a simple synthetic language for arithmetic expressions with clearly defined syntax and semantics, and show that they exhibit the expected pattern of results. We then apply our methods to correlate neural representations of English sentences with their constituency parse trees.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.06401v2
PDF https://arxiv.org/pdf/1905.06401v2.pdf
PWC https://paperswithcode.com/paper/correlating-neural-and-symbolic
Repo https://github.com/gchrupala/correlating-neural-and-symbolic-representations-of-language
Framework pytorch
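
The core of Representational Similarity Analysis, as described in the abstract above, is to compare two sets of pairwise (dis)similarities over the same items. The following is a minimal sketch, not the authors' code: the toy arithmetic expressions, the two-dimensional "neural" vectors, and the use of nesting-depth difference as the symbolic distance are all assumptions for illustration, standing in for the tree kernels used in the paper.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pairwise(items, distance):
    """Distances for every unordered pair, in a fixed order."""
    return [distance(a, b) for a, b in combinations(items, 2)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nesting_depth(expr):
    """Maximum parenthesis depth of an arithmetic expression string."""
    depth = best = 0
    for ch in expr:
        if ch == "(":
            depth += 1
            best = max(best, depth)
        elif ch == ")":
            depth -= 1
    return best


def depth_distance(a, b):
    """Toy symbolic distance (a stand-in for a tree-kernel similarity)."""
    return abs(nesting_depth(a) - nesting_depth(b))


def rsa_score(vectors, expressions):
    """Correlate neural-space distances with symbolic distances."""
    return pearson(pairwise(vectors, euclidean),
                   pairwise(expressions, depth_distance))


expressions = ["1", "(1+2)", "((1+2)*3)", "(((1+2)*3)-4)"]
# Hypothetical activations that encode depth in their first coordinate.
aligned = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
# The same activations assigned to the wrong expressions.
scrambled = [(3.0, 0.0), (0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]

aligned_score = rsa_score(aligned, expressions)
scrambled_score = rsa_score(scrambled, expressions)
```

A score near 1 indicates that the geometry of the representation space mirrors the symbolic structure, while mismatched representations give a lower score. No mapping between the two spaces needs to be learned, since only within-space distances are compared.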

Kornia: an Open Source Differentiable Computer Vision Library for PyTorch

Title Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
Authors Edgar Riba, Dmytro Mishkin, Daniel Ponsa, Ethan Rublee, Gary Bradski
Abstract This work presents Kornia – an open source computer vision library which consists of a set of differentiable routines and modules to solve generic computer vision problems. The package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions. Inspired by OpenCV, Kornia is composed of a set of modules containing operators that can be inserted inside neural networks to train models to perform image transformations, camera calibration, epipolar geometry, and low level image processing techniques, such as filtering and edge detection that operate directly on high dimensional tensor representations. Examples of classical vision problems implemented using our framework are provided including a benchmark comparing to existing vision libraries.
Tasks Calibration, Edge Detection
Published 2019-10-05
URL https://arxiv.org/abs/1910.02190v2
PDF https://arxiv.org/pdf/1910.02190v2.pdf
PWC https://paperswithcode.com/paper/kornia-an-open-source-differentiable-computer
Repo https://github.com/kornia/kornia
Framework pytorch