Paper Group AWR 50
The Reciprocal Bayesian LASSO. Certified Defenses for Adversarial Patches. Training Binary Neural Networks using the Bayesian Learning Rule. AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC. Tidying Deep Saliency Prediction Architectures. Directional Deep Embedding and Appearance Learning for Fast Video Object Segment …
The Reciprocal Bayesian LASSO
Title | The Reciprocal Bayesian LASSO |
Authors | Himel Mallick, Rahim Alhamzawi, Vladimir Svetnik |
Abstract | A reciprocal LASSO (rLASSO) regularization employs a decreasing penalty function as opposed to conventional penalization methods that use increasing penalties on the coefficients, leading to stronger parsimony and superior model selection relative to traditional shrinkage methods. Here we consider a fully Bayesian formulation of the rLASSO problem, which is based on the observation that the rLASSO estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters are assigned independent inverse Laplace priors. Bayesian inference from this posterior is possible using an expanded hierarchy motivated by a scale mixture of double Pareto or truncated normal distributions. On simulated and real datasets, we show that the Bayesian formulation outperforms its classical cousin in estimation, prediction, and variable selection across a wide range of scenarios while offering the advantage of posterior inference. Finally, we discuss other variants of this new approach and provide a unified framework for variable selection using flexible reciprocal penalties. All methods described in this paper are publicly available as an R package at: https://github.com/himelmallick/BayesRecipe. |
Tasks | Bayesian Inference, Model Selection |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08327v1 |
https://arxiv.org/pdf/2001.08327v1.pdf | |
PWC | https://paperswithcode.com/paper/the-reciprocal-bayesian-lasso |
Repo | https://github.com/himelmallick/BayesRecipe |
Framework | none |
Certified Defenses for Adversarial Patches
Title | Certified Defenses for Adversarial Patches |
Authors | Ping-Yeh Chiang, Renkun Ni, Ahmed Abdelkader, Chen Zhu, Christoph Studor, Tom Goldstein |
Abstract | Adversarial patch attacks are among one of the most practical threat models against real-world computer vision systems. This paper studies certified and empirical defenses against patch attacks. We begin with a set of experiments showing that most existing defenses, which work by pre-processing input images to mitigate adversarial patches, are easily broken by simple white-box adversaries. Motivated by this finding, we propose the first certified defense against patch attacks, and propose faster methods for its training. Furthermore, we experiment with different patch shapes for testing, obtaining surprisingly good robustness transfer across shapes, and present preliminary results on certified defense against sparse attacks. Our complete implementation can be found on: https://github.com/Ping-C/certifiedpatchdefense. |
Tasks | |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.06693v1 |
https://arxiv.org/pdf/2003.06693v1.pdf | |
PWC | https://paperswithcode.com/paper/certified-defenses-for-adversarial-patches-1 |
Repo | https://github.com/Ping-C/certifiedpatchdefense |
Framework | pytorch |
Training Binary Neural Networks using the Bayesian Learning Rule
Title | Training Binary Neural Networks using the Bayesian Learning Rule |
Authors | Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan |
Abstract | Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which justifies and extends existing approaches. |
Tasks | Continual Learning |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10778v2 |
https://arxiv.org/pdf/2002.10778v2.pdf | |
PWC | https://paperswithcode.com/paper/training-binary-neural-networks-using-the |
Repo | https://github.com/team-approx-bayes/BayesBiNN |
Framework | pytorch |
AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
Title | AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC |
Authors | Ruqi Zhang, A. Feder Cooper, Christopher De Sa |
Abstract | Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method for sampling from continuous distributions. It is a faster alternative to HMC: instead of using the whole dataset at each iteration, SGHMC uses only a subsample. This improves performance, but introduces bias that can cause SGHMC to converge to the wrong distribution. One can prevent this using a step size that decays to zero, but such a step size schedule can drastically slow down convergence. To address this tension, we propose a novel second-order SG-MCMC algorithm—AMAGOLD—that infrequently uses Metropolis-Hastings (M-H) corrections to remove bias. The infrequency of corrections amortizes their cost. We prove AMAGOLD converges to the target distribution with a fixed, rather than a diminishing, step size, and that its convergence rate is at most a constant factor slower than a full-batch baseline. We empirically demonstrate AMAGOLD’s effectiveness on synthetic distributions, Bayesian logistic regression, and Bayesian neural networks. |
Tasks | |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00193v1 |
https://arxiv.org/pdf/2003.00193v1.pdf | |
PWC | https://paperswithcode.com/paper/amagold-amortized-metropolis-adjustment-for |
Repo | https://github.com/ruqizhang/amagold |
Framework | pytorch |
Tidying Deep Saliency Prediction Architectures
Title | Tidying Deep Saliency Prediction Architectures |
Authors | Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi |
Abstract | Learning computational models for visual attention (saliency estimation) is an effort to inch machines/robots closer to human visual cognitive abilities. Data-driven efforts have dominated the landscape since the introduction of deep neural network architectures. In deep learning research, the choices in architecture design are often empirical and frequently lead to more complex models than necessary. The complexity, in turn, hinders the application requirements. In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions. We review the existing state of the art models on these four components and propose novel and simpler alternatives. As a result, we propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, minimal, more interpretable and achieve state of the art performance on public saliency benchmarks. SimpleNet is an optimized encoder-decoder architecture and brings notable performance gains on the SALICON dataset (the largest saliency benchmark). MDNSal is a parametric model that directly predicts parameters of a GMM distribution and is aimed to bring more interpretability to the prediction maps. The proposed saliency models can be inferred at 25fps, making them suitable for real-time applications. Code and pre-trained models are available at https://github.com/samyak0210/saliency. |
Tasks | Saliency Prediction |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.04942v1 |
https://arxiv.org/pdf/2003.04942v1.pdf | |
PWC | https://paperswithcode.com/paper/tidying-deep-saliency-prediction |
Repo | https://github.com/samyak0210/saliency |
Framework | pytorch |
Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation
Title | Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation |
Authors | Yingjie Yin, De Xu, Xingang Wang, Lei Zhang |
Abstract | Most recent semi-supervised video object segmentation (VOS) methods rely on fine-tuning deep convolutional neural networks online using the given mask of the first frame or predicted masks of subsequent frames. However, the online fine-tuning process is usually time-consuming, limiting the practical use of such methods. We propose a directional deep embedding and appearance learning (DDEAL) method, which is free of the online fine-tuning process, for fast VOS. First, a global directional matching module, which can be efficiently implemented by parallel convolutional operations, is proposed to learn a semantic pixel-wise embedding as an internal guidance. Second, an effective directional appearance model based statistics is proposed to represent the target and background on a spherical embedding space for VOS. Equipped with the global directional matching module and the directional appearance model learning module, DDEAL learns static cues from the labeled first frame and dynamically updates cues of the subsequent frames for object segmentation. Our method exhibits state-of-the-art VOS performance without using online fine-tuning. Specifically, it achieves a J & F mean score of 74.8% on DAVIS 2017 dataset and an overall score G of 71.3% on the large-scale YouTube-VOS dataset, while retaining a speed of 25 fps with a single NVIDIA TITAN Xp GPU. Furthermore, our faster version runs 31 fps with only a little accuracy loss. Our code and trained networks are available at https://github.com/YingjieYin/Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation. |
Tasks | Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06736v1 |
https://arxiv.org/pdf/2002.06736v1.pdf | |
PWC | https://paperswithcode.com/paper/directional-deep-embedding-and-appearance |
Repo | https://github.com/YingjieYin/Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation |
Framework | pytorch |
Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits
Title | Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits |
Authors | Thibaut Cuvelier, Richard Combes, Eric Gourdin |
Abstract | We consider combinatorial semi-bandits over a set of arms ${\cal X} \subset {0,1}^d$ where rewards are uncorrelated across items. For this problem, the algorithm ESCB yields the smallest known regret bound $R(T) = {\cal O}\Big( {d (\ln m)^2 (\ln T) \over \Delta_{\min} }\Big)$, but it has computational complexity ${\cal O}({\cal X})$ which is typically exponential in $d$, and cannot be used in large dimensions. We propose the first algorithm which is both computationally and statistically efficient for this problem with regret $R(T) = {\cal O} \Big({d (\ln m)^2 (\ln T)\over \Delta_{\min} }\Big)$ and computational complexity ${\cal O}(T {\bf poly}(d))$. Our approach involves carefully designing an approximate version of ESCB with the same regret guarantees, showing that this approximate algorithm can be implemented in time ${\cal O}(T {\bf poly}(d))$ by repeatedly maximizing a linear function over ${\cal X}$ subject to a linear budget constraint, and showing how to solve this maximization problems efficiently. |
Tasks | |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.07258v1 |
https://arxiv.org/pdf/2002.07258v1.pdf | |
PWC | https://paperswithcode.com/paper/statistically-efficient-polynomial-time |
Repo | https://github.com/dourouc05/CombinatorialBandits.jl |
Framework | none |
Convolutional Kernel Networks for Graph-Structured Data
Title | Convolutional Kernel Networks for Graph-Structured Data |
Authors | Dexiong Chen, Laurent Jacob, Julien Mairal |
Abstract | We introduce a family of multilayer graph kernels and establish new links between graph convolutional neural networks and kernel methods. Our approach generalizes convolutional kernel networks to graph-structured data, by representing graphs as a sequence of kernel feature maps, where each node carries information about local graph substructures. On the one hand, the kernel point of view offers an unsupervised, expressive, and easy-to-regularize data representation, which is useful when limited samples are available. On the other hand, our model can also be trained end-to-end on large-scale data, leading to new types of graph convolutional neural networks. We show that our method achieves competitive performance on several graph classification benchmarks, while offering simple model interpretation. Our code is freely available at https://github.com/claying/GCKN. |
Tasks | Graph Classification |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05189v1 |
https://arxiv.org/pdf/2003.05189v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-kernel-networks-for-graph |
Repo | https://github.com/claying/GCKN |
Framework | pytorch |
COVID-CT-Dataset: A CT Scan Dataset about COVID-19
Title | COVID-CT-Dataset: A CT Scan Dataset about COVID-19 |
Authors | Jinyu Zhao, Yichen Zhang, Xuehai He, Pengtao Xie |
Abstract | CT scans are promising in providing accurate, fast, and cheap screening and testing of COVID-19. In this paper, we build a publicly available COVID-CT dataset, containing 275 CT scans that are positive for COVID-19, to foster the research and development of deep learning methods which predict whether a person is affected with COVID-19 by analyzing his/her CTs. We train a deep convolutional neural network on this dataset and achieve an F1 of 0.85 which is a promising performance but yet to be further improved. The data and code are available at https://github.com/UCSD-AI4H/COVID-CT |
Tasks | |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13865v1 |
https://arxiv.org/pdf/2003.13865v1.pdf | |
PWC | https://paperswithcode.com/paper/covid-ct-dataset-a-ct-scan-dataset-about |
Repo | https://github.com/UCSD-AI4H/COVID-CT |
Framework | pytorch |
Flows for simultaneous manifold learning and density estimation
Title | Flows for simultaneous manifold learning and density estimation |
Authors | Johann Brehmer, Kyle Cranmer |
Abstract | We introduce manifold-modeling flows (MFMFs), a new class of generative models that simultaneously learn the data manifold as well as a tractable probability density on that manifold. Combining aspects of normalizing flows, GANs, autoencoders, and energy-based models, they have the potential to represent data sets with a manifold structure more faithfully and provide handles on dimensionality reduction, denoising, and out-of-distribution detection. We argue why such models should not be trained by maximum likelihood alone and present a new training algorithm that separates manifold and density updates. With two pedagogical examples we demonstrate how manifold-modeling flows let us learn the data manifold and allow for better inference than standard flows in the ambient data space. |
Tasks | Denoising, Density Estimation, Dimensionality Reduction, Out-of-Distribution Detection |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.13913v1 |
https://arxiv.org/pdf/2003.13913v1.pdf | |
PWC | https://paperswithcode.com/paper/flows-for-simultaneous-manifold-learning-and |
Repo | https://github.com/johannbrehmer/manifold-flow |
Framework | pytorch |
Unsupervised Neural Universal Denoiser for Finite-Input General-Output Noisy Channel
Title | Unsupervised Neural Universal Denoiser for Finite-Input General-Output Noisy Channel |
Authors | Tae-Eon Park, Taesup Moon |
Abstract | We devise a novel neural network-based universal denoiser for the finite-input, general-output (FIGO) channel. Based on the assumption of known noisy channel densities, which is realistic in many practical scenarios, we train the network such that it can denoise as well as the best sliding window denoiser for any given underlying clean source data. Our algorithm, dubbed as Generalized CUDE (Gen-CUDE), enjoys several desirable properties; it can be trained in an unsupervised manner (solely based on the noisy observation data), has much smaller computational complexity compared to the previously developed universal denoiser for the same setting, and has much tighter upper bound on the denoising performance, which is obtained by a theoretical analysis. In our experiments, we show such tighter upper bound is also realized in practice by showing that Gen-CUDE achieves much better denoising results compared to other strong baselines for both synthetic and real underlying clean sequences. |
Tasks | Denoising |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02623v1 |
https://arxiv.org/pdf/2003.02623v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-neural-universal-denoiser-for |
Repo | https://github.com/pte1236/Gen-CUDE |
Framework | none |
Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models
Title | Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models |
Authors | Théo Galy-Fajou, Florian Wenzel, Manfred Opper |
Abstract | We propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian processes (GP) models. Our method automatically constructs an auxiliary variable augmentation that renders the GP model conditionally conjugate. Building on the conjugate structure of the augmented model, we develop two inference methods. First, a fast and scalable stochastic variational inference method that uses efficient block coordinate ascent updates, which are computed in closed form. Second, an asymptotically correct Gibbs sampler that is useful for small datasets. Our experiments show that our method are up two orders of magnitude faster and more robust than existing state-of-the-art black-box methods. |
Tasks | Gaussian Processes |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11451v1 |
https://arxiv.org/pdf/2002.11451v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-augmented-conjugate-inference-for |
Repo | https://github.com/theogf/AutoConjGP_Exp |
Framework | tf |
Bulbar ALS Detection Based on Analysis of Voice Perturbation and Vibrato
Title | Bulbar ALS Detection Based on Analysis of Voice Perturbation and Vibrato |
Authors | Maxim Vashkevich, Alexander Petrovsky, Yuliya Rushkevich |
Abstract | On average the lack of biological markers causes a one year diagnostic delay to detect amyotrophic lateral sclerosis (ALS). To improve the diagnostic process an automatic voice assessment based on acoustic analysis can be used. The purpose of this work was to verify the sutability of the sustain vowel phonation test for automatic detection of patients with ALS. We proposed enhanced procedure for separation of voice signal into fundamental periods that requires for calculation of perturbation measurements (such as jitter and shimmer). Also we proposed method for quantitative assessment of pathological vibrato manifestations in sustain vowel phonation. The study’s experiments show that using the proposed acoustic analysis methods, the classifier based on linear discriminant analysis attains 90.7% accuracy with 86.7% sensitivity and 92.2% specificity. |
Tasks | ALS Detection |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10806v1 |
https://arxiv.org/pdf/2003.10806v1.pdf | |
PWC | https://paperswithcode.com/paper/bulbar-als-detection-based-on-analysis-of |
Repo | https://github.com/Mak-Sim/Troparion |
Framework | none |
On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
Title | On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping |
Authors | Sanghyun Hong, Varun Chandrasekaran, Yiğitcan Kaya, Tudor Dumitraş, Nicolas Papernot |
Abstract | Machine learning algorithms are vulnerable to data poisoning attacks. Prior taxonomies that focus on specific scenarios, e.g., indiscriminate or targeted, have enabled defenses for the corresponding subset of known attacks. Yet, this introduces an inevitable arms race between adversaries and defenders. In this work, we study the feasibility of an attack-agnostic defense relying on artifacts that are common to all poisoning attacks. Specifically, we focus on a common element between all attacks: they modify gradients computed to train the model. We identify two main artifacts of gradients computed in the presence of poison: (1) their $\ell_2$ norms have significantly higher magnitudes than those of clean gradients, and (2) their orientation differs from clean gradients. Based on these observations, we propose the prerequisite for a generic poisoning defense: it must bound gradient magnitudes and minimize differences in orientation. We call this gradient shaping. As an exemplar tool to evaluate the feasibility of gradient shaping, we use differentially private stochastic gradient descent (DP-SGD), which clips and perturbs individual gradients during training to obtain privacy guarantees. We find that DP-SGD, even in configurations that do not result in meaningful privacy guarantees, increases the model’s robustness to indiscriminate attacks. It also mitigates worst-case targeted attacks and increases the adversary’s cost in multi-poison scenarios. The only attack we find DP-SGD to be ineffective against is a strong, yet unrealistic, indiscriminate attack. Our results suggest that, while we currently lack a generic poisoning defense, gradient shaping is a promising direction for future research. |
Tasks | data poisoning |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11497v2 |
https://arxiv.org/pdf/2002.11497v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-effectiveness-of-mitigating-data |
Repo | https://github.com/Sanghyun-Hong/Gradient-Shaping |
Framework | jax |
What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
Title | What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization |
Authors | Caleb Belth, Xinyi Zheng, Jilles Vreeken, Danai Koutra |
Abstract | Knowledge graphs (KGs) store highly heterogeneous information about the world in the structure of a graph, and are useful for tasks such as question answering and reasoning. However, they often contain errors and are missing information. Vibrant research in KG refinement has worked to resolve these issues, tailoring techniques to either detect specific types of errors or complete a KG. In this work, we introduce a unified solution to KG characterization by formulating the problem as unsupervised KG summarization with a set of inductive, soft rules, which describe what is normal in a KG, and thus can be used to identify what is abnormal, whether it be strange or missing. Unlike first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns that describe the expected neighborhood around a (seen or unseen) node, based on its type, and information in the KG. Stepping away from the traditional support/confidence-based rule mining techniques, we propose KGist, Knowledge Graph Inductive SummarizaTion, which learns a summary of inductive rules that best compress the KG according to the Minimum Description Length principle—a formulation that we are the first to use in the context of KG rule mining. We apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as compression, various types of error detection, and identification of incomplete information. We show that KGist outperforms task-specific, supervised and unsupervised baselines in error detection and incompleteness identification, (identifying the location of up to 93% of missing entities—over 10% more than baselines), while also being efficient for large knowledge graphs. |
Tasks | Knowledge Graphs, Question Answering |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10412v1 |
https://arxiv.org/pdf/2003.10412v1.pdf | |
PWC | https://paperswithcode.com/paper/what-is-normal-what-is-strange-and-what-is |
Repo | https://github.com/GemsLab/KGist |
Framework | none |