Paper Group ANR 323
Towards Probabilistic Verification of Machine Unlearning
Title | Towards Probabilistic Verification of Machine Unlearning |
Authors | David Marco Sommer, Liwei Song, Sameer Wagh, Prateek Mittal |
Abstract | The right to be forgotten, also known as the right to erasure, is the right of individuals to have their data erased from an entity storing it. The General Data Protection Regulation in the European Union legally solidified the status of this long-held notion. As a consequence, there is a growing need for the development of mechanisms whereby users can verify if service providers comply with their deletion requests. In this work, we take the first step in proposing a formal framework to study the design of such verification mechanisms for data deletion requests – also known as machine unlearning – in the context of systems that provide machine learning as a service. We propose a backdoor-based verification mechanism and demonstrate its effectiveness in certifying data deletion with high confidence using the above framework. Our mechanism makes a novel use of backdoor attacks in ML as a basis for quantitatively inferring machine unlearning. In our mechanism, each user poisons part of its training data by injecting a user-specific backdoor trigger associated with a user-specific target label. The prediction of target labels on test samples with the backdoor trigger is then used as an indication of the user’s data being used to train the ML model. We formalize the verification process as a hypothesis testing problem, and provide theoretical guarantees on the statistical power of the hypothesis test. We experimentally demonstrate that our approach has minimal effect on the machine learning service but provides high confidence verification of unlearning. We show that with a $30\%$ poison ratio and merely $20$ test queries, our verification mechanism has both false positive and false negative ratios below $10^{-5}$. Furthermore, we also show the effectiveness of our approach by testing it against an adaptive adversary that uses a state-of-the-art backdoor defense method. |
Tasks | |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04247v1 |
PDF | https://arxiv.org/pdf/2003.04247v1.pdf |
PWC | https://paperswithcode.com/paper/towards-probabilistic-verification-of-machine |
Repo | |
Framework | |
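The hypothesis-testing view in the abstract above can be made concrete with a minimal sketch. It assumes the verifier already knows (or has estimated) how often the backdoor fires when the user's data was deleted versus kept. The probabilities `p_deleted` and `p_kept`, the threshold, and the function names are hypothetical choices for illustration, not values or code from the paper.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def error_rates(n, p_deleted, p_kept, threshold):
    """Error rates of the rule: accept 'deleted' if successes < threshold.

    p_deleted / p_kept are ASSUMED per-query backdoor success rates when the
    user's data was removed from / is still present in the training set.
    """
    # Data still in use, but too few triggers fired: deletion wrongly accepted.
    wrongly_accept = sum(binom_pmf(k, n, p_kept) for k in range(threshold))
    # Data deleted, but the backdoor fired often enough to look undeleted.
    wrongly_reject = sum(
        binom_pmf(k, n, p_deleted) for k in range(threshold, n + 1)
    )
    return wrongly_accept, wrongly_reject


# Illustrative numbers only: 20 queries, backdoor fires 10% vs 90% of the time.
accept_err, reject_err = error_rates(n=20, p_deleted=0.1, p_kept=0.9, threshold=10)
print(f"wrongly accept: {accept_err:.2e}, wrongly reject: {reject_err:.2e}")
```

Both error rates shrink quickly as the gap between the two success rates widens or the number of queries grows, which is the intuition behind needing only a small number of test queries.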
Uncertainty Estimation in Cancer Survival Prediction
Title | Uncertainty Estimation in Cancer Survival Prediction |
Authors | Hrushikesh Loya, Pranav Poduval, Deepak Anand, Neeraj Kumar, Amit Sethi |
Abstract | Survival models are used in various fields, such as the development of cancer treatment protocols. Although many statistical and machine learning models have been proposed to achieve accurate survival predictions, little attention has been paid to obtaining well-calibrated uncertainty estimates associated with each prediction. The currently popular models are opaque and untrustworthy in that they often express high confidence even on those test cases that are not similar to the training samples, and even when their predictions are wrong. We propose a Bayesian framework for survival models that not only gives more accurate survival predictions but also quantifies the survival uncertainty better. Our approach is a novel combination of variational inference for uncertainty estimation, neural multi-task logistic regression for estimating nonlinear and time-varying risk models, and an additional sparsity-inducing prior to work with high dimensional data. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08573v2 |
PDF | https://arxiv.org/pdf/2003.08573v2.pdf |
PWC | https://paperswithcode.com/paper/uncertainty-estimation-in-cancer-survival |
Repo | |
Framework | |
iNALU: Improved Neural Arithmetic Logic Unit
Title | iNALU: Improved Neural Arithmetic Logic Unit |
Authors | Daniel Schlör, Markus Ring, Andreas Hotho |
Abstract | Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relations implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is a novel neural architecture which is able to explicitly represent the mathematical relationships by the units of the network to learn operations such as summation, subtraction or multiplication. Although NALUs have been shown to perform well on various downstream tasks, an in-depth analysis reveals practical shortcomings by design, such as the inability to multiply or divide negative input values or training stability issues for deeper networks. We address these issues and propose an improved model architecture. We evaluate our model empirically in various settings from learning basic arithmetic operations to more complex functions. Our experiments indicate that our model solves stability issues and outperforms the original NALU model in terms of arithmetic precision and convergence. |
Tasks | |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07629v1 |
PDF | https://arxiv.org/pdf/2003.07629v1.pdf |
PWC | https://paperswithcode.com/paper/inalu-improved-neural-arithmetic-logic-unit |
Repo | |
Framework | |
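The shortcoming with negative inputs mentioned in the abstract above is easy to see in a minimal sketch of a single NALU output unit. The sketch follows the commonly cited formulation of the original NALU (additive path, log-space multiplicative path, sigmoid gate); the function and parameter names are illustrative, and iNALU's modifications are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nalu_unit(x, w_hat, m_hat, g_weights, eps=1e-7):
    """One output of an original-style NALU cell (illustrative, untrained)."""
    # Effective weights are pushed towards {-1, 0, 1} by tanh * sigmoid.
    weights = [math.tanh(w) * sigmoid(m) for w, m in zip(w_hat, m_hat)]
    # Additive path: a weighted sum, which covers addition and subtraction.
    additive = sum(w * xi for w, xi in zip(weights, x))
    # Multiplicative path works in log space, so only |x| is visible to it.
    log_sum = sum(w * math.log(abs(xi) + eps) for w, xi in zip(weights, x))
    multiplicative = math.exp(log_sum)
    # The gate blends the two paths and itself depends on the input.
    gate = sigmoid(sum(g * xi for g, xi in zip(g_weights, x)))
    return gate * additive + (1.0 - gate) * multiplicative


# Gate forced towards the multiplicative path: the sign of -3 * 4 is lost.
product = nalu_unit([-3.0, 4.0], [20.0, 20.0], [20.0, 20.0], [0.0, -20.0])
print(round(product, 3))  # close to 12, not -12
```

Because the multiplicative path takes the logarithm of absolute values, the unit returns roughly 12 for the inputs -3 and 4, which illustrates why multiplying or dividing negative values needs a separate treatment of the sign.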
Fairness-Aware Learning with Prejudice Free Representations
Title | Fairness-Aware Learning with Prejudice Free Representations |
Authors | Ramanujam Madhavan, Mohit Wadhwa |
Abstract | Machine learning models are extensively being used to make decisions that have a significant impact on human life. These models are trained over historical data that may contain information about sensitive attributes such as race, sex, religion, etc. The presence of such sensitive attributes can impact certain population subgroups unfairly. It is straightforward to remove sensitive features from the data; however, a model could pick up prejudice from latent sensitive attributes that may exist in the training data. This has led to the growing apprehension about the fairness of the employed models. In this paper, we propose a novel algorithm that can effectively identify and treat latent discriminating features. The approach is agnostic of the learning algorithm and generalizes well for classification as well as regression tasks. It can also be used as a key aid in proving that the model is free of discrimination towards regulatory compliance if the need arises. The approach helps to collect discrimination-free features that would improve the model performance while ensuring the fairness of the model. The experimental results from our evaluations on publicly available real-world datasets show a near-ideal fairness measurement in comparison to other methods. |
Tasks | |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.12143v1 |
PDF | https://arxiv.org/pdf/2002.12143v1.pdf |
PWC | https://paperswithcode.com/paper/fairness-aware-learning-with-prejudice-free |
Repo | |
Framework | |
3D Point Cloud Processing and Learning for Autonomous Driving
Title | 3D Point Cloud Processing and Learning for Autonomous Driving |
Authors | Siheng Chen, Baoan Liu, Chen Feng, Carlos Vallespi-Gonzalez, Carl Wellington |
Abstract | We present a review of 3D point cloud processing and learning for autonomous driving. As one of the most important sensors in autonomous vehicles, light detection and ranging (LiDAR) sensors collect 3D point clouds that precisely record the external surfaces of objects and scenes. The tools for 3D point cloud processing and learning are critical to the map creation, localization, and perception modules in an autonomous vehicle. While much attention has been paid to data collected from cameras, such as images and videos, an increasing number of researchers have recognized the importance and significance of LiDAR in autonomous driving and have proposed processing and learning algorithms to exploit 3D point clouds. We review the recent progress in this research area and summarize what has been tried and what is needed for practical and safe autonomous vehicles. We also offer perspectives on open issues that need to be solved in the future. |
Tasks | Autonomous Driving, Autonomous Vehicles |
Published | 2020-03-01 |
URL | https://arxiv.org/abs/2003.00601v1 |
PDF | https://arxiv.org/pdf/2003.00601v1.pdf |
PWC | https://paperswithcode.com/paper/3d-point-cloud-processing-and-learning-for |
Repo | |
Framework | |
Training Adversarial Agents to Exploit Weaknesses in Deep Control Policies
Title | Training Adversarial Agents to Exploit Weaknesses in Deep Control Policies |
Authors | Sampo Kuutti, Saber Fallah, Richard Bowden |
Abstract | Deep learning has become an increasingly common technique for various control problems, such as robotic arm manipulation, robot navigation, and autonomous vehicles. However, the downside of using deep neural networks to learn control policies is their opaque nature and the difficulties of validating their safety. As the networks used to obtain state-of-the-art results become increasingly deep and complex, the rules they have learned and how they operate become more challenging to understand. This presents an issue, since in safety-critical applications the safety of the control policy must be ensured to a high confidence level. In this paper, we propose an automated black box testing framework based on adversarial reinforcement learning. The technique uses an adversarial agent, whose goal is to degrade the performance of the target model under test. We test the approach on an autonomous vehicle problem, by training an adversarial reinforcement learning agent, which aims to cause a deep neural network-driven autonomous vehicle to collide. Two neural networks trained for autonomous driving are compared, and the results from the testing are used to compare the robustness of their learned control policies. We show that the proposed framework is able to find weaknesses in both control policies that were not evident during online testing and therefore demonstrate a significant benefit over manual testing methods. |
Tasks | Autonomous Driving, Autonomous Vehicles, Robot Navigation |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12078v1 |
PDF | https://arxiv.org/pdf/2002.12078v1.pdf |
PWC | https://paperswithcode.com/paper/training-adversarial-agents-to-exploit |
Repo | |
Framework | |
Unifying Deep Local and Global Features for Image Search
Title | Unifying Deep Local and Global Features for Image Search |
Authors | Bingyi Cao, Andre Araujo, Jack Sim |
Abstract | Image retrieval is the problem of searching an image database for items that are similar to a query image. To address this task, two main types of image representations have been studied: global and local image features. In this work, our key contribution is to unify global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction. We refer to the new model as DELG, standing for DEep Local and Global features. We leverage lessons from recent feature learning work and propose a model that combines generalized mean pooling for global features and attentive selection for local features. The entire network can be learned end-to-end by carefully balancing the gradient flow between two heads – requiring only image-level labels. We also introduce an autoencoder-based dimensionality reduction technique for local features, which is integrated into the model, improving training efficiency and matching performance. Experiments on the Revisited Oxford and Paris datasets demonstrate that our jointly learned ResNet-50 based features outperform all previous results using deep global features (most with heavier backbones), and those that further re-rank with local features. Code and models will be released. |
Tasks | Dimensionality Reduction, Image Retrieval |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.05027v2 |
PDF | https://arxiv.org/pdf/2001.05027v2.pdf |
PWC | https://paperswithcode.com/paper/unifying-deep-local-and-global-features-for |
Repo | |
Framework | |
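The generalized mean (GeM) pooling named in the abstract above can be sketched for a single feature channel as follows. This is a plain-Python illustration of the standard GeM formula, not the DELG implementation; the default exponent `p=3.0` and the clamping constant `eps` are assumptions made here.

```python
def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized mean pooling over one channel's spatial activations.

    p = 1 gives average pooling; large p approaches max pooling.
    Activations are clamped to eps so the power is well defined.
    """
    clamped = [max(a, eps) for a in activations]
    mean_of_powers = sum(a**p for a in clamped) / len(clamped)
    return mean_of_powers ** (1.0 / p)


channel = [1.0, 2.0, 3.0]
print(gem_pool(channel, p=1.0))   # behaves like the arithmetic mean
print(gem_pool(channel, p=50.0))  # moves towards the maximum activation
```

Applying this per channel to the final convolutional feature map yields one global descriptor vector; in practice the exponent is often treated as a learnable parameter.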
Deconfounded Image Captioning: A Causal Retrospect
Title | Deconfounded Image Captioning: A Causal Retrospect |
Authors | Xu Yang, Hanwang Zhang, Jianfei Cai |
Abstract | The dataset bias in vision-language tasks is becoming one of the main problems that hinder the progress of our community. However, recent studies lack a principled analysis of the bias. In this paper, we present a novel perspective, Deconfounded Image Captioning (DIC), to find out the cause of the bias in image captioning, then review modern neural image captioners in retrospect, and finally propose a DIC framework: DICv1.0. DIC is based on causal inference, whose two principles, the backdoor and front-door adjustments, help us review previous work and design effective models. In particular, we showcase that DICv1.0 can strengthen two prevailing captioning models and achieve a single-model 130.7 CIDEr-D and 128.4 c40 CIDEr-D on the Karpathy split and the online split of the challenging MS-COCO dataset, respectively. Last but not least, DICv1.0 is merely a natural derivation from our causal retrospect, which opens a promising direction for image captioning. |
Tasks | Causal Inference, Image Captioning |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.03923v1 |
PDF | https://arxiv.org/pdf/2003.03923v1.pdf |
PWC | https://paperswithcode.com/paper/deconfounded-image-captioning-a-causal |
Repo | |
Framework | |
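The backdoor adjustment mentioned in the abstract above replaces the observational quantity P(y | x) with the interventional quantity P(y | do(x)) = sum over z of P(y | x, z) P(z), where z is a confounder. A minimal numeric sketch, with made-up probability tables and hypothetical function names, shows how the two can differ:

```python
def backdoor_adjust(p_y_given_xz, p_z):
    """P(y | do(x)): weight each confounder value by its marginal P(z)."""
    return sum(pyz * pz for pyz, pz in zip(p_y_given_xz, p_z))


def observational(p_y_given_xz, p_z_given_x):
    """P(y | x): weight each confounder value by P(z | x) instead."""
    return sum(pyz * pzx for pyz, pzx in zip(p_y_given_xz, p_z_given_x))


# Illustrative tables for a binary confounder z (values are invented).
p_y_given_xz = [0.2, 0.8]   # P(y=1 | x=1, z=0), P(y=1 | x=1, z=1)
p_z = [0.5, 0.5]            # marginal P(z)
p_z_given_x = [0.1, 0.9]    # z=1 co-occurs with x=1 far more often

print(backdoor_adjust(p_y_given_xz, p_z))
print(observational(p_y_given_xz, p_z_given_x))
```

When the confounder is strongly correlated with x, the observational estimate overstates the effect of x on y; the adjusted estimate removes that dependence, which is the kind of dataset bias the abstract refers to.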
Stability and Learning in Strategic Queuing Systems
Title | Stability and Learning in Strategic Queuing Systems |
Authors | Jason Gaitonde, Eva Tardos |
Abstract | Bounding the price of anarchy, which quantifies the damage to social welfare due to selfish behavior of the participants, has been an important area of research. In this paper, we study this phenomenon in the context of a game modeling queuing systems: routers compete for servers, where packets that do not get service will be resent at future rounds, resulting in a system where the number of packets at each round depends on the success of the routers in the previous rounds. We model this as an (infinitely) repeated game, where the system holds a state (number of packets held by each queue) that arises from the results of the previous round. We assume that routers satisfy the no-regret condition, e.g. they use learning strategies to identify the server where their packets get the best service. Classical work on repeated games makes the strong assumption that the subsequent rounds of the repeated games are independent (beyond the influence on learning from past history). The carryover effect caused by packets remaining in this system makes learning in our context result in a highly dependent random process. We analyze this random process and find that if the capacity of the servers is high enough to allow a centralized and knowledgeable scheduler to get all packets served even with double the packet arrival rate, and queues use no-regret learning algorithms, then the expected number of packets in the queues will remain bounded throughout time, assuming older packets have priority. This paper is the first to study the effect of selfish learning in a queuing system, where the learners compete for resources, but rounds are not all independent: the number of packets to be routed at each round depends on the success of the routers in the previous rounds. |
Tasks | |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07009v1 |
PDF | https://arxiv.org/pdf/2003.07009v1.pdf |
PWC | https://paperswithcode.com/paper/stability-and-learning-in-strategic-queuing |
Repo | |
Framework | |
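The no-regret condition in the abstract above is satisfied by several standard online learning rules. As a minimal sketch, the following shows one such rule, the Hedge (multiplicative weights) update, for a router choosing among servers. It is not necessarily the algorithm the authors analyze; the learning rate `eta` and the reward encoding are assumptions.

```python
import math


def hedge_update(probabilities, rewards, eta=0.5):
    """One Hedge step: boost each server's weight by exp(eta * reward)."""
    boosted = [p * math.exp(eta * r) for p, r in zip(probabilities, rewards)]
    total = sum(boosted)
    return [b / total for b in boosted]


# Two servers; server 0 always serves the packet (reward 1), server 1 never.
probabilities = [0.5, 0.5]
for _ in range(10):
    probabilities = hedge_update(probabilities, rewards=[1.0, 0.0])
print(probabilities)  # mass concentrates on server 0
```

In the queuing game the rewards are not fixed as in this toy loop: they depend on the other routers' choices and on the packets carried over from earlier rounds, which is the dependence between rounds that the paper studies.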
Distributed Learning with Dependent Samples
Title | Distributed Learning with Dependent Samples |
Authors | Shao-Bo Lin |
Abstract | This paper focuses on learning rate analysis of distributed kernel ridge regression for strong mixing sequences. Using a recently developed integral operator approach and a classical covariance inequality for Banach-valued strong mixing sequences, we succeed in deriving the optimal learning rate for distributed kernel ridge regression. As a byproduct, we also deduce a sufficient condition for the mixing property to guarantee the optimal learning rates for kernel ridge regression. Our results extend the applicable range of distributed learning from i.i.d. samples to non-i.i.d. sequences. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03757v1 |
PDF | https://arxiv.org/pdf/2002.03757v1.pdf |
PWC | https://paperswithcode.com/paper/distributed-learning-with-dependent-samples |
Repo | |
Framework | |
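Distributed kernel ridge regression, the estimator analyzed in the abstract above, is usually implemented as divide-and-conquer: fit KRR on each data shard and average the local predictors. The sketch below assumes that averaging form, one-dimensional inputs, and a Gaussian kernel by default; the helper names and the tiny linear solver are illustrative only.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    return math.exp(-gamma * (a - b) ** 2)


def solve(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_krr(xs, ys, lam, kernel=gaussian_kernel):
    """Local KRR: solve (K + n * lam * I) alpha = y, return the predictor."""
    n = len(xs)
    gram = [
        [kernel(a, b) + (n * lam if i == j else 0.0) for j, b in enumerate(xs)]
        for i, a in enumerate(xs)
    ]
    alpha = solve(gram, ys)
    return lambda x: sum(a * kernel(x, xi) for a, xi in zip(alpha, xs))


def distributed_krr(shards, lam, kernel=gaussian_kernel):
    """Average the predictors fitted independently on each (xs, ys) shard."""
    local = [fit_krr(xs, ys, lam, kernel) for xs, ys in shards]
    return lambda x: sum(f(x) for f in local) / len(local)
```

Each shard only needs its own Gram matrix, so the cubic cost of the linear solve applies to the shard size rather than the full sample size. The paper's contribution concerns when this averaging still attains optimal rates if the samples are dependent.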
Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings
Title | Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings |
Authors | Hongyu Ren, Weihua Hu, Jure Leskovec |
Abstract | Answering complex logical queries on large-scale incomplete knowledge graphs (KGs) is a fundamental yet challenging task. Recently, a promising approach to this problem has been to embed KG entities as well as the query into a vector space such that entities that answer the query are embedded close to the query. However, prior work models queries as single points in the vector space, which is problematic because a complex query represents a potentially large set of its answer entities, but it is unclear how such a set can be represented as a single point. Furthermore, prior work can only handle queries that use conjunctions ($\wedge$) and existential quantifiers ($\exists$). Handling queries with logical disjunctions ($\vee$) remains an open problem. Here we propose query2box, an embedding-based framework for reasoning over arbitrary queries with $\wedge$, $\vee$, and $\exists$ operators in massive and incomplete KGs. Our main insight is that queries can be embedded as boxes (i.e., hyper-rectangles), where a set of points inside the box corresponds to a set of answer entities of the query. We show that conjunctions can be naturally represented as intersections of boxes and also prove a negative result that handling disjunctions would require embedding with dimension proportional to the number of KG entities. However, we show that by transforming queries into a Disjunctive Normal Form, query2box is capable of handling arbitrary logical queries with $\wedge$, $\vee$, $\exists$ in a scalable manner. We demonstrate the effectiveness of query2box on three large KGs and show that query2box achieves up to 25% relative improvement over the state of the art. |
Tasks | Knowledge Graphs |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.05969v2 |
PDF | https://arxiv.org/pdf/2002.05969v2.pdf |
PWC | https://paperswithcode.com/paper/query2box-reasoning-over-knowledge-graphs-in-1 |
Repo | |
Framework | |
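The box idea in the abstract above can be illustrated with a minimal geometric sketch: a query is an axis-aligned box given by a center and a non-negative offset, an entity is a point, and a conjunction is a box intersection. The plain min/max intersection used here is a simplification chosen for illustration; the operators in query2box itself are learned, and all names below are hypothetical.

```python
def bounds(box):
    """Return the lower and upper corners of a (center, offset) box."""
    center, offset = box
    lower = [c - o for c, o in zip(center, offset)]
    upper = [c + o for c, o in zip(center, offset)]
    return lower, upper


def contains(box, point):
    """An entity answers the query if its point lies inside the box."""
    lower, upper = bounds(box)
    return all(lo <= p <= hi for lo, p, hi in zip(lower, point, upper))


def intersect(box_a, box_b):
    """Geometric intersection of two boxes, or None if they do not overlap."""
    lower_a, upper_a = bounds(box_a)
    lower_b, upper_b = bounds(box_b)
    lower = [max(a, b) for a, b in zip(lower_a, lower_b)]
    upper = [min(a, b) for a, b in zip(upper_a, upper_b)]
    if any(lo > hi for lo, hi in zip(lower, upper)):
        return None
    center = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    offset = [(hi - lo) / 2 for lo, hi in zip(lower, upper)]
    return center, offset


query_a = ([0.0, 0.0], [2.0, 2.0])
query_b = ([1.0, 1.0], [2.0, 2.0])
both = intersect(query_a, query_b)
print(both, contains(both, [0.5, 0.5]))
```

The intersection of two boxes is again a box, which is why conjunctions compose naturally in this representation. A union of boxes is generally not a box, which matches the abstract's point that disjunctions need the Disjunctive Normal Form rewriting.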
Linear predictor on linearly-generated data with missing values: non consistency and solutions
Title | Linear predictor on linearly-generated data with missing values: non consistency and solutions |
Authors | Marine Le Morvan, Nicolas Prost, Julie Josse, Erwan Scornet, Gaël Varoquaux |
Abstract | We consider building predictors when the data have missing values. We study the seemingly-simple case where the target to predict is a linear function of the fully-observed data and we show that, in the presence of missing values, the optimal predictor may not be linear. In the particular Gaussian case, it can be written as a linear function of multiway interactions between the observed data and the various missing-value indicators. Due to its intrinsic complexity, we study a simple approximation and prove generalization bounds with finite samples, highlighting regimes for which each method performs best. We then show that multilayer perceptrons with ReLU activation functions can be consistent, and can explore good trade-offs between the true model and approximations. Our study highlights the interesting family of models that are beneficial to fit with missing values depending on the amount of data available. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00658v1 |
PDF | https://arxiv.org/pdf/2002.00658v1.pdf |
PWC | https://paperswithcode.com/paper/linear-predictor-on-linearly-generated-data |
Repo | |
Framework | |
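The abstract above describes predictors that are linear in interactions between the observed values and the missing-value indicators. A minimal sketch of such a feature expansion is given below, assuming missing entries are encoded as `None` and imputed with zero. The function name and the restriction to pairwise interactions are illustrative assumptions, not the paper's exact construction.

```python
def expand_with_mask(row, with_interactions=False):
    """Build features for a linear model from a row with missing entries.

    Returns [1, zero-imputed values, missingness indicators] and, optionally,
    pairwise products value_j * indicator_k for j != k.
    """
    mask = [1.0 if v is None else 0.0 for v in row]
    filled = [0.0 if v is None else float(v) for v in row]
    features = [1.0] + filled + mask
    if with_interactions:
        d = len(row)
        features += [
            filled[j] * mask[k] for j in range(d) for k in range(d) if j != k
        ]
    return features


print(expand_with_mask([1.0, None, 3.0]))
```

Fitting ordinary least squares on the first-order features lets the intercept shift with the missingness pattern, while the interaction terms let the slope of each observed variable depend on which other variables are missing. The number of such terms grows quickly with dimension, which is the complexity trade-off the abstract highlights.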
Neural Enhanced Belief Propagation on Factor Graphs
Title | Neural Enhanced Belief Propagation on Factor Graphs |
Authors | Victor Garcia Satorras, Max Welling |
Abstract | A graphical model is a structured representation of locally dependent random variables. A traditional method to reason over these random variables is to perform inference using belief propagation. When provided with the true data generating process, belief propagation can infer the optimal posterior probability estimates in tree structured factor graphs. However, in many cases we may only have access to a poor approximation of the data generating process, or we may face loops in the factor graph, leading to suboptimal estimates. In this work we first extend graph neural networks to factor graphs (FG-GNN). We then propose a new hybrid model that runs an FG-GNN conjointly with belief propagation. The FG-GNN receives as input messages from belief propagation at every inference iteration and outputs a corrected version of them. As a result, we obtain a more accurate algorithm that combines the benefits of both belief propagation and graph neural networks. We apply our ideas to error correction decoding tasks, and we show that our algorithm can outperform belief propagation for LDPC codes on bursty channels. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.01998v1 |
PDF | https://arxiv.org/pdf/2003.01998v1.pdf |
PWC | https://paperswithcode.com/paper/neural-enhanced-belief-propagation-on-factor |
Repo | |
Framework | |
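The claim in the abstract above that belief propagation is exact on tree-structured graphs can be checked on the simplest tree, a chain. The following is a minimal sum-product sketch for a chain with unary and pairwise potentials; it covers plain belief propagation only, not the FG-GNN correction, and the data layout and names are assumptions made for illustration.

```python
def chain_marginals(unary, pairwise):
    """Sum-product on a chain.

    unary[i][s] is the potential of node i in state s;
    pairwise[i][s][t] couples node i (state s) with node i + 1 (state t).
    """
    n, k = len(unary), len(unary[0])
    # Forward messages: what nodes to the left say about node i.
    forward = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        forward[i] = [
            sum(forward[i - 1][s] * unary[i - 1][s] * pairwise[i - 1][s][t]
                for s in range(k))
            for t in range(k)
        ]
    # Backward messages: what nodes to the right say about node i.
    backward = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        backward[i] = [
            sum(backward[i + 1][t] * unary[i + 1][t] * pairwise[i][s][t]
                for t in range(k))
            for s in range(k)
        ]
    marginals = []
    for i in range(n):
        belief = [forward[i][s] * unary[i][s] * backward[i][s] for s in range(k)]
        total = sum(belief)
        marginals.append([b / total for b in belief])
    return marginals


unary = [[0.6, 0.4], [0.5, 0.5]]
pairwise = [[[2.0, 1.0], [1.0, 2.0]]]
print(chain_marginals(unary, pairwise))
```

On graphs with loops the same message updates are only approximate, which is the setting where the abstract proposes letting a learned model correct the messages at each iteration.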
Gradient Statistics Aware Power Control for Over-the-Air Federated Learning in Fading Channels
Title | Gradient Statistics Aware Power Control for Over-the-Air Federated Learning in Fading Channels |
Authors | Naifu Zhang, Meixia Tao |
Abstract | To enable communication-efficient federated learning, fast model aggregation can be designed using over-the-air computation (AirComp). In order to implement a reliable and high-performance AirComp over fading channels, power control at edge devices is crucial. Existing works focus on the traditional data aggregation which often assumes that the local data collected at different devices are identically distributed and can be normalized with zero mean and unit variance. This assumption, however, does not hold for gradient aggregation in machine learning. In this paper, we study the optimal power control problem for efficient over-the-air FL by taking gradient statistics into account. Our goal is to minimize the model aggregation error measured by mean square error (MSE) by jointly optimizing the transmit power of each device and the denoising factor at the edge server. We first derive the optimal solution in closed form where the gradient first-order and second-order statistics are known. The derived optimal power control structure depends on the multivariate coefficient of variation of the gradient. We then propose a method to estimate the gradient statistics based on the historical aggregated gradients and then dynamically adjust the transmit power on devices over each training iteration. Experimental results show that our proposed power control is better than full power transmission and threshold-based power control in both model accuracy and convergence rate. |
Tasks | Denoising |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02089v1 |
PDF | https://arxiv.org/pdf/2003.02089v1.pdf |
PWC | https://paperswithcode.com/paper/gradient-statistics-aware-power-control-for |
Repo | |
Framework | |
On the Evaluation of Prohibited Item Classification and Detection in Volumetric 3D Computed Tomography Baggage Security Screening Imagery
Title | On the Evaluation of Prohibited Item Classification and Detection in Volumetric 3D Computed Tomography Baggage Security Screening Imagery |
Authors | Qian Wang, Neelanjan Bhowmik, Toby P. Breckon |
Abstract | X-ray Computed Tomography (CT) based 3D imaging is widely used in airports for aviation security screening whilst prior work on prohibited item detection focuses primarily on 2D X-ray imagery. In this paper, we aim to evaluate the possibility of extending the automatic prohibited item detection from 2D X-ray imagery to volumetric 3D CT baggage security screening imagery. To this end, we take advantage of 3D Convolutional Neural Networks (CNN) and popular object detection frameworks such as RetinaNet and Faster R-CNN in our work. As the first attempt to use 3D CNN for volumetric 3D CT baggage security screening, we first evaluate different CNN architectures on the classification of isolated prohibited item volumes and compare against traditional methods which use hand-crafted features. Subsequently, we evaluate object detection performance of different architectures on volumetric 3D CT baggage images. The results of our experiments on Bottle and Handgun datasets demonstrate that 3D CNN models can achieve comparable performance (98% true positive rate and 1.5% false positive rate) to traditional methods but require significantly less time for inference (0.014s per volume). Furthermore, the extended 3D object detection models achieve promising performance in detecting prohibited items within volumetric 3D CT baggage imagery with 76% mAP for bottles and 88% mAP for handguns, which shows both the challenge and promise of such threat detection within 3D CT X-ray security imagery. |
Tasks | 3D Object Detection, Computed Tomography (CT), Object Detection |
Published | 2020-03-27 |
URL | https://arxiv.org/abs/2003.12625v1 |
PDF | https://arxiv.org/pdf/2003.12625v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-evaluation-of-prohibited-item |
Repo | |
Framework | |