April 3, 2020

3038 words 15 mins read

Paper Group AWR 50

The Reciprocal Bayesian LASSO. Certified Defenses for Adversarial Patches. Training Binary Neural Networks using the Bayesian Learning Rule. AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC. Tidying Deep Saliency Prediction Architectures. Directional Deep Embedding and Appearance Learning for Fast Video Object Segment …

The Reciprocal Bayesian LASSO


Title	The Reciprocal Bayesian LASSO
Authors	Himel Mallick, Rahim Alhamzawi, Vladimir Svetnik
Abstract	A reciprocal LASSO (rLASSO) regularization employs a decreasing penalty function as opposed to conventional penalization methods that use increasing penalties on the coefficients, leading to stronger parsimony and superior model selection relative to traditional shrinkage methods. Here we consider a fully Bayesian formulation of the rLASSO problem, which is based on the observation that the rLASSO estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters are assigned independent inverse Laplace priors. Bayesian inference from this posterior is possible using an expanded hierarchy motivated by a scale mixture of double Pareto or truncated normal distributions. On simulated and real datasets, we show that the Bayesian formulation outperforms its classical cousin in estimation, prediction, and variable selection across a wide range of scenarios while offering the advantage of posterior inference. Finally, we discuss other variants of this new approach and provide a unified framework for variable selection using flexible reciprocal penalties. All methods described in this paper are publicly available as an R package at: https://github.com/himelmallick/BayesRecipe.
Tasks	Bayesian Inference, Model Selection
Published	2020-01-23
URL	https://arxiv.org/abs/2001.08327v1
PDF	https://arxiv.org/pdf/2001.08327v1.pdf
PWC	https://paperswithcode.com/paper/the-reciprocal-bayesian-lasso
Repo	https://github.com/himelmallick/BayesRecipe
Framework	none

Certified Defenses for Adversarial Patches


Title	Certified Defenses for Adversarial Patches
Authors	Ping-Yeh Chiang, Renkun Ni, Ahmed Abdelkader, Chen Zhu, Christoph Studor, Tom Goldstein
Abstract	Adversarial patch attacks are among one of the most practical threat models against real-world computer vision systems. This paper studies certified and empirical defenses against patch attacks. We begin with a set of experiments showing that most existing defenses, which work by pre-processing input images to mitigate adversarial patches, are easily broken by simple white-box adversaries. Motivated by this finding, we propose the first certified defense against patch attacks, and propose faster methods for its training. Furthermore, we experiment with different patch shapes for testing, obtaining surprisingly good robustness transfer across shapes, and present preliminary results on certified defense against sparse attacks. Our complete implementation can be found on: https://github.com/Ping-C/certifiedpatchdefense.
Tasks
Published	2020-03-14
URL	https://arxiv.org/abs/2003.06693v1
PDF	https://arxiv.org/pdf/2003.06693v1.pdf
PWC	https://paperswithcode.com/paper/certified-defenses-for-adversarial-patches-1
Repo	https://github.com/Ping-C/certifiedpatchdefense
Framework	pytorch

Training Binary Neural Networks using the Bayesian Learning Rule


Title	Training Binary Neural Networks using the Bayesian Learning Rule
Authors	Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan
Abstract	Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which justifies and extends existing approaches.
Tasks	Continual Learning
Published	2020-02-25
URL	https://arxiv.org/abs/2002.10778v2
PDF	https://arxiv.org/pdf/2002.10778v2.pdf
PWC	https://paperswithcode.com/paper/training-binary-neural-networks-using-the
Repo	https://github.com/team-approx-bayes/BayesBiNN
Framework	pytorch

AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC


Title	AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
Authors	Ruqi Zhang, A. Feder Cooper, Christopher De Sa
Abstract	Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method for sampling from continuous distributions. It is a faster alternative to HMC: instead of using the whole dataset at each iteration, SGHMC uses only a subsample. This improves performance, but introduces bias that can cause SGHMC to converge to the wrong distribution. One can prevent this using a step size that decays to zero, but such a step size schedule can drastically slow down convergence. To address this tension, we propose a novel second-order SG-MCMC algorithm—AMAGOLD—that infrequently uses Metropolis-Hastings (M-H) corrections to remove bias. The infrequency of corrections amortizes their cost. We prove AMAGOLD converges to the target distribution with a fixed, rather than a diminishing, step size, and that its convergence rate is at most a constant factor slower than a full-batch baseline. We empirically demonstrate AMAGOLD’s effectiveness on synthetic distributions, Bayesian logistic regression, and Bayesian neural networks.
Tasks
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00193v1
PDF	https://arxiv.org/pdf/2003.00193v1.pdf
PWC	https://paperswithcode.com/paper/amagold-amortized-metropolis-adjustment-for
Repo	https://github.com/ruqizhang/amagold
Framework	pytorch

Tidying Deep Saliency Prediction Architectures


Title	Tidying Deep Saliency Prediction Architectures
Authors	Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi
Abstract	Learning computational models for visual attention (saliency estimation) is an effort to inch machines/robots closer to human visual cognitive abilities. Data-driven efforts have dominated the landscape since the introduction of deep neural network architectures. In deep learning research, the choices in architecture design are often empirical and frequently lead to more complex models than necessary. The complexity, in turn, hinders the application requirements. In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions. We review the existing state of the art models on these four components and propose novel and simpler alternatives. As a result, we propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, minimal, more interpretable and achieve state of the art performance on public saliency benchmarks. SimpleNet is an optimized encoder-decoder architecture and brings notable performance gains on the SALICON dataset (the largest saliency benchmark). MDNSal is a parametric model that directly predicts parameters of a GMM distribution and is aimed to bring more interpretability to the prediction maps. The proposed saliency models can be inferred at 25fps, making them suitable for real-time applications. Code and pre-trained models are available at https://github.com/samyak0210/saliency.
Tasks	Saliency Prediction
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04942v1
PDF	https://arxiv.org/pdf/2003.04942v1.pdf
PWC	https://paperswithcode.com/paper/tidying-deep-saliency-prediction
Repo	https://github.com/samyak0210/saliency
Framework	pytorch

Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation


Title	Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation
Authors	Yingjie Yin, De Xu, Xingang Wang, Lei Zhang
Abstract	Most recent semi-supervised video object segmentation (VOS) methods rely on fine-tuning deep convolutional neural networks online using the given mask of the first frame or predicted masks of subsequent frames. However, the online fine-tuning process is usually time-consuming, limiting the practical use of such methods. We propose a directional deep embedding and appearance learning (DDEAL) method, which is free of the online fine-tuning process, for fast VOS. First, a global directional matching module, which can be efficiently implemented by parallel convolutional operations, is proposed to learn a semantic pixel-wise embedding as an internal guidance. Second, an effective directional appearance model based statistics is proposed to represent the target and background on a spherical embedding space for VOS. Equipped with the global directional matching module and the directional appearance model learning module, DDEAL learns static cues from the labeled first frame and dynamically updates cues of the subsequent frames for object segmentation. Our method exhibits state-of-the-art VOS performance without using online fine-tuning. Specifically, it achieves a J & F mean score of 74.8% on DAVIS 2017 dataset and an overall score G of 71.3% on the large-scale YouTube-VOS dataset, while retaining a speed of 25 fps with a single NVIDIA TITAN Xp GPU. Furthermore, our faster version runs 31 fps with only a little accuracy loss. Our code and trained networks are available at https://github.com/YingjieYin/Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation.
Tasks	Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2020-02-17
URL	https://arxiv.org/abs/2002.06736v1
PDF	https://arxiv.org/pdf/2002.06736v1.pdf
PWC	https://paperswithcode.com/paper/directional-deep-embedding-and-appearance
Repo	https://github.com/YingjieYin/Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation
Framework	pytorch

Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits


Title	Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits
Authors	Thibaut Cuvelier, Richard Combes, Eric Gourdin
Abstract	We consider combinatorial semi-bandits over a set of arms ${\cal X} \subset {0,1}^d$ where rewards are uncorrelated across items. For this problem, the algorithm ESCB yields the smallest known regret bound $R(T) = {\cal O}\Big( {d (\ln m)^2 (\ln T) \over \Delta_{\min} }\Big)$, but it has computational complexity ${\cal O}({\cal X})$ which is typically exponential in $d$, and cannot be used in large dimensions. We propose the first algorithm which is both computationally and statistically efficient for this problem with regret $R(T) = {\cal O} \Big({d (\ln m)^2 (\ln T)\over \Delta_{\min} }\Big)$ and computational complexity ${\cal O}(T {\bf poly}(d))$. Our approach involves carefully designing an approximate version of ESCB with the same regret guarantees, showing that this approximate algorithm can be implemented in time ${\cal O}(T {\bf poly}(d))$ by repeatedly maximizing a linear function over ${\cal X}$ subject to a linear budget constraint, and showing how to solve this maximization problems efficiently.
Tasks
Published	2020-02-17
URL	https://arxiv.org/abs/2002.07258v1
PDF	https://arxiv.org/pdf/2002.07258v1.pdf
PWC	https://paperswithcode.com/paper/statistically-efficient-polynomial-time
Repo	https://github.com/dourouc05/CombinatorialBandits.jl
Framework	none

Convolutional Kernel Networks for Graph-Structured Data


Title	Convolutional Kernel Networks for Graph-Structured Data
Authors	Dexiong Chen, Laurent Jacob, Julien Mairal
Abstract	We introduce a family of multilayer graph kernels and establish new links between graph convolutional neural networks and kernel methods. Our approach generalizes convolutional kernel networks to graph-structured data, by representing graphs as a sequence of kernel feature maps, where each node carries information about local graph substructures. On the one hand, the kernel point of view offers an unsupervised, expressive, and easy-to-regularize data representation, which is useful when limited samples are available. On the other hand, our model can also be trained end-to-end on large-scale data, leading to new types of graph convolutional neural networks. We show that our method achieves competitive performance on several graph classification benchmarks, while offering simple model interpretation. Our code is freely available at https://github.com/claying/GCKN.
Tasks	Graph Classification
Published	2020-03-11
URL	https://arxiv.org/abs/2003.05189v1
PDF	https://arxiv.org/pdf/2003.05189v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-kernel-networks-for-graph
Repo	https://github.com/claying/GCKN
Framework	pytorch

COVID-CT-Dataset: A CT Scan Dataset about COVID-19


Title	COVID-CT-Dataset: A CT Scan Dataset about COVID-19
Authors	Jinyu Zhao, Yichen Zhang, Xuehai He, Pengtao Xie
Abstract	CT scans are promising in providing accurate, fast, and cheap screening and testing of COVID-19. In this paper, we build a publicly available COVID-CT dataset, containing 275 CT scans that are positive for COVID-19, to foster the research and development of deep learning methods which predict whether a person is affected with COVID-19 by analyzing his/her CTs. We train a deep convolutional neural network on this dataset and achieve an F1 of 0.85 which is a promising performance but yet to be further improved. The data and code are available at https://github.com/UCSD-AI4H/COVID-CT
Tasks
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13865v1
PDF	https://arxiv.org/pdf/2003.13865v1.pdf
PWC	https://paperswithcode.com/paper/covid-ct-dataset-a-ct-scan-dataset-about
Repo	https://github.com/UCSD-AI4H/COVID-CT
Framework	pytorch

Flows for simultaneous manifold learning and density estimation


Title	Flows for simultaneous manifold learning and density estimation
Authors	Johann Brehmer, Kyle Cranmer
Abstract	We introduce manifold-modeling flows (MFMFs), a new class of generative models that simultaneously learn the data manifold as well as a tractable probability density on that manifold. Combining aspects of normalizing flows, GANs, autoencoders, and energy-based models, they have the potential to represent data sets with a manifold structure more faithfully and provide handles on dimensionality reduction, denoising, and out-of-distribution detection. We argue why such models should not be trained by maximum likelihood alone and present a new training algorithm that separates manifold and density updates. With two pedagogical examples we demonstrate how manifold-modeling flows let us learn the data manifold and allow for better inference than standard flows in the ambient data space.
Tasks	Denoising, Density Estimation, Dimensionality Reduction, Out-of-Distribution Detection
Published	2020-03-31
URL	https://arxiv.org/abs/2003.13913v1
PDF	https://arxiv.org/pdf/2003.13913v1.pdf
PWC	https://paperswithcode.com/paper/flows-for-simultaneous-manifold-learning-and
Repo	https://github.com/johannbrehmer/manifold-flow
Framework	pytorch

Unsupervised Neural Universal Denoiser for Finite-Input General-Output Noisy Channel


Title	Unsupervised Neural Universal Denoiser for Finite-Input General-Output Noisy Channel
Authors	Tae-Eon Park, Taesup Moon
Abstract	We devise a novel neural network-based universal denoiser for the finite-input, general-output (FIGO) channel. Based on the assumption of known noisy channel densities, which is realistic in many practical scenarios, we train the network such that it can denoise as well as the best sliding window denoiser for any given underlying clean source data. Our algorithm, dubbed as Generalized CUDE (Gen-CUDE), enjoys several desirable properties; it can be trained in an unsupervised manner (solely based on the noisy observation data), has much smaller computational complexity compared to the previously developed universal denoiser for the same setting, and has much tighter upper bound on the denoising performance, which is obtained by a theoretical analysis. In our experiments, we show such tighter upper bound is also realized in practice by showing that Gen-CUDE achieves much better denoising results compared to other strong baselines for both synthetic and real underlying clean sequences.
Tasks	Denoising
Published	2020-03-05
URL	https://arxiv.org/abs/2003.02623v1
PDF	https://arxiv.org/pdf/2003.02623v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-neural-universal-denoiser-for
Repo	https://github.com/pte1236/Gen-CUDE
Framework	none

Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models


Title	Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models
Authors	Théo Galy-Fajou, Florian Wenzel, Manfred Opper
Abstract	We propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian processes (GP) models. Our method automatically constructs an auxiliary variable augmentation that renders the GP model conditionally conjugate. Building on the conjugate structure of the augmented model, we develop two inference methods. First, a fast and scalable stochastic variational inference method that uses efficient block coordinate ascent updates, which are computed in closed form. Second, an asymptotically correct Gibbs sampler that is useful for small datasets. Our experiments show that our method are up two orders of magnitude faster and more robust than existing state-of-the-art black-box methods.
Tasks	Gaussian Processes
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11451v1
PDF	https://arxiv.org/pdf/2002.11451v1.pdf
PWC	https://paperswithcode.com/paper/automated-augmented-conjugate-inference-for
Repo	https://github.com/theogf/AutoConjGP_Exp
Framework	tf

Bulbar ALS Detection Based on Analysis of Voice Perturbation and Vibrato


Title	Bulbar ALS Detection Based on Analysis of Voice Perturbation and Vibrato
Authors	Maxim Vashkevich, Alexander Petrovsky, Yuliya Rushkevich
Abstract	On average the lack of biological markers causes a one year diagnostic delay to detect amyotrophic lateral sclerosis (ALS). To improve the diagnostic process an automatic voice assessment based on acoustic analysis can be used. The purpose of this work was to verify the sutability of the sustain vowel phonation test for automatic detection of patients with ALS. We proposed enhanced procedure for separation of voice signal into fundamental periods that requires for calculation of perturbation measurements (such as jitter and shimmer). Also we proposed method for quantitative assessment of pathological vibrato manifestations in sustain vowel phonation. The study’s experiments show that using the proposed acoustic analysis methods, the classifier based on linear discriminant analysis attains 90.7% accuracy with 86.7% sensitivity and 92.2% specificity.
Tasks	ALS Detection
Published	2020-03-24
URL	https://arxiv.org/abs/2003.10806v1
PDF	https://arxiv.org/pdf/2003.10806v1.pdf
PWC	https://paperswithcode.com/paper/bulbar-als-detection-based-on-analysis-of
Repo	https://github.com/Mak-Sim/Troparion
Framework	none

On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping


Title	On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
Authors	Sanghyun Hong, Varun Chandrasekaran, Yiğitcan Kaya, Tudor Dumitraş, Nicolas Papernot
Abstract	Machine learning algorithms are vulnerable to data poisoning attacks. Prior taxonomies that focus on specific scenarios, e.g., indiscriminate or targeted, have enabled defenses for the corresponding subset of known attacks. Yet, this introduces an inevitable arms race between adversaries and defenders. In this work, we study the feasibility of an attack-agnostic defense relying on artifacts that are common to all poisoning attacks. Specifically, we focus on a common element between all attacks: they modify gradients computed to train the model. We identify two main artifacts of gradients computed in the presence of poison: (1) their $\ell_2$ norms have significantly higher magnitudes than those of clean gradients, and (2) their orientation differs from clean gradients. Based on these observations, we propose the prerequisite for a generic poisoning defense: it must bound gradient magnitudes and minimize differences in orientation. We call this gradient shaping. As an exemplar tool to evaluate the feasibility of gradient shaping, we use differentially private stochastic gradient descent (DP-SGD), which clips and perturbs individual gradients during training to obtain privacy guarantees. We find that DP-SGD, even in configurations that do not result in meaningful privacy guarantees, increases the model’s robustness to indiscriminate attacks. It also mitigates worst-case targeted attacks and increases the adversary’s cost in multi-poison scenarios. The only attack we find DP-SGD to be ineffective against is a strong, yet unrealistic, indiscriminate attack. Our results suggest that, while we currently lack a generic poisoning defense, gradient shaping is a promising direction for future research.
Tasks	data poisoning
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11497v2
PDF	https://arxiv.org/pdf/2002.11497v2.pdf
PWC	https://paperswithcode.com/paper/on-the-effectiveness-of-mitigating-data
Repo	https://github.com/Sanghyun-Hong/Gradient-Shaping
Framework	jax

What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization


Title	What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
Authors	Caleb Belth, Xinyi Zheng, Jilles Vreeken, Danai Koutra
Abstract	Knowledge graphs (KGs) store highly heterogeneous information about the world in the structure of a graph, and are useful for tasks such as question answering and reasoning. However, they often contain errors and are missing information. Vibrant research in KG refinement has worked to resolve these issues, tailoring techniques to either detect specific types of errors or complete a KG. In this work, we introduce a unified solution to KG characterization by formulating the problem as unsupervised KG summarization with a set of inductive, soft rules, which describe what is normal in a KG, and thus can be used to identify what is abnormal, whether it be strange or missing. Unlike first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns that describe the expected neighborhood around a (seen or unseen) node, based on its type, and information in the KG. Stepping away from the traditional support/confidence-based rule mining techniques, we propose KGist, Knowledge Graph Inductive SummarizaTion, which learns a summary of inductive rules that best compress the KG according to the Minimum Description Length principle—a formulation that we are the first to use in the context of KG rule mining. We apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as compression, various types of error detection, and identification of incomplete information. We show that KGist outperforms task-specific, supervised and unsupervised baselines in error detection and incompleteness identification, (identifying the location of up to 93% of missing entities—over 10% more than baselines), while also being efficient for large knowledge graphs.
Tasks	Knowledge Graphs, Question Answering
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10412v1
PDF	https://arxiv.org/pdf/2003.10412v1.pdf
PWC	https://paperswithcode.com/paper/what-is-normal-what-is-strange-and-what-is
Repo	https://github.com/GemsLab/KGist
Framework	none