Paper Group ANR 429
Multi-level conformal clustering: A distribution-free technique for clustering and anomaly detection
Title | Multi-level conformal clustering: A distribution-free technique for clustering and anomaly detection |
Authors | Ilia Nouretdinov, James Gammerman, Matteo Fontana, Daljit Rehal |
Abstract | In this work we present a clustering technique called multi-level conformal clustering (MLCC). The technique is hierarchical in nature because it can be performed at multiple significance levels, which yields greater insight into the data than performing it at just one level. We describe the theoretical underpinnings of MLCC, compare and contrast it with the hierarchical clustering algorithm, and then apply it to real-world datasets to assess its performance. There are several advantages to using MLCC over more classical clustering techniques: once a significance level has been set, MLCC is able to automatically select the number of clusters. Furthermore, thanks to the conformal prediction framework, the resulting clustering model has a clear statistical meaning without any assumptions about the distribution of the data. This statistical robustness also allows us to perform clustering and anomaly detection simultaneously. Moreover, due to the flexibility of the conformal prediction framework, our algorithm can be used on top of many other machine learning algorithms. |
Tasks | Anomaly Detection |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08105v2 |
https://arxiv.org/pdf/1910.08105v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-level-conformal-clustering-a |
Repo | |
Framework | |
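Illustrative sketch (not the authors' implementation): conformal clustering typically evaluates a grid of candidate points, computes a conformal p-value for each via a nonconformity measure, and keeps the points whose p-value exceeds the significance level; connected components of the kept region act as clusters, and observations falling outside every component are flagged as anomalies. The snippet below sketches the p-value computation under assumed choices (2-D data, k-NN distance as the nonconformity measure, a rectangular grid).

```python
# Minimal sketch of conformal p-values for clustering / anomaly detection.
# Assumptions (not from the paper): 2-D data, k-NN distance nonconformity,
# and the common leave-one-out approximation for the training scores.
import numpy as np
from scipy.spatial.distance import cdist

def knn_score(point, reference, k=3):
    """Nonconformity: distance to the k-th nearest neighbour in `reference`."""
    return np.sort(cdist(point[None, :], reference).ravel())[k - 1]

rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
n = len(train)

# Leave-one-out nonconformity scores of the training points (computed once).
train_scores = np.array([knn_score(train[i], np.delete(train, i, axis=0))
                         for i in range(n)])

def p_value(z):
    """Conformal p-value of a candidate point z with respect to the training set."""
    return (np.sum(train_scores >= knn_score(z, train)) + 1) / (n + 1)

# At each significance level eps, the kept grid points form the prediction
# region; its connected components act as clusters, and training points outside
# every component can be flagged as anomalies.
grid = np.stack(np.meshgrid(np.linspace(-1, 4, 40),
                            np.linspace(-1, 4, 40)), -1).reshape(-1, 2)
for eps in (0.05, 0.10, 0.20):
    kept = np.array([p_value(g) > eps for g in grid])
    print(f"eps={eps:.2f}: {kept.sum()} grid points in the prediction region")
```

Sweeping eps is what makes the procedure multi-level: raising eps shrinks the prediction region and splits or discards clusters, while lowering it merges them, giving a hierarchy-like view of the data.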
Enabling Computer Vision Driven Assistive Devices for the Visually Impaired via Micro-architecture Design Exploration
Title | Enabling Computer Vision Driven Assistive Devices for the Visually Impaired via Micro-architecture Design Exploration |
Authors | Linda Wang, Alexander Wong |
Abstract | Recent improvements in object detection have shown potential to aid in tasks where previous solutions fell short. A particular area is assistive devices for individuals with visual impairment. While state-of-the-art deep neural networks have been shown to achieve superior object detection performance, their high computational and memory requirements make them cost-prohibitive for on-device operation. Alternatively, cloud-based operation raises privacy concerns; neither option is attractive to potential users. To address these challenges, this study investigates creating an efficient object detection network specifically for OLIV, an AI-powered assistant for object localization for the visually impaired, via micro-architecture design exploration. In particular, we formulate the problem of finding an optimal network micro-architecture as a numerical optimization problem, where we find the set of hyperparameters controlling the MobileNetV2-SSD network micro-architecture that maximizes a modified NetScore objective function for the MSCOCO-OLIV dataset of indoor objects. Experimental results show that such a micro-architecture design exploration strategy leads to a compact deep neural network with a balanced trade-off between accuracy, size, and speed, making it well-suited for enabling on-device computer vision driven assistive devices for the visually impaired. |
Tasks | Object Detection, Object Localization |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.07836v1 |
https://arxiv.org/pdf/1905.07836v1.pdf | |
PWC | https://paperswithcode.com/paper/enabling-computer-vision-driven-assistive |
Repo | |
Framework | |
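For context, the objective mentioned in the abstract builds on the NetScore metric. The sketch below shows the generic (unmodified) NetScore formula together with a toy random search over a hypothetical width-multiplier hyperparameter; the paper's modified objective, actual search space, and training pipeline are not reproduced, and the `evaluate` function only fabricates numbers so the example runs.

```python
# Illustrative sketch only: generic NetScore (accuracy vs. parameters vs. MACs)
# plus a toy random search over a hypothetical micro-architecture hyperparameter.
import math
import random

def netscore(accuracy_pct, params_M, macs_M, kappa=2.0, beta=0.5, gamma=0.5):
    """NetScore-style objective: reward accuracy, penalize parameter and MAC counts
    (parameters and MACs expressed in millions here)."""
    return 20 * math.log10(accuracy_pct ** kappa /
                           (params_M ** beta * macs_M ** gamma))

def evaluate(config):
    """Placeholder for training/measuring a MobileNetV2-SSD variant.
    The returned numbers are fabricated stand-ins, not real measurements."""
    width = config["width_multiplier"]
    return {"accuracy_pct": 50 + 20 * width,   # stand-in detection accuracy
            "params_M": 3.0 * width ** 2,      # stand-in parameter count
            "macs_M": 800 * width ** 2}        # stand-in MAC count

random.seed(0)
best = None
for _ in range(20):                            # toy random search over the space
    config = {"width_multiplier": random.uniform(0.35, 1.0)}
    score = netscore(**evaluate(config))
    if best is None or score > best[0]:
        best = (score, config)
print("best NetScore %.2f with config %s" % best)
```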
An AI-based, Multi-stage detection system of banking botnets
Title | An AI-based, Multi-stage detection system of banking botnets |
Authors | Li Ling, Zhiqiang Gao, Michael A Silas, Ian Lee, Erwan A Le Doeuff |
Abstract | Banking Trojans and botnets are primary drivers of financially-motivated cybercrime. In this paper, we first analyze how an APT-based banking botnet works, step by step, through its whole lifecycle. Specifically, we present a multi-stage system that detects malicious banking botnet activities which potentially target organizations. The system leverages a Cyber Data Lake as well as multiple artificial intelligence techniques at different stages. The evaluation results using public datasets showed that deep-learning-based detection was highly successful compared with baseline models. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08276v3 |
https://arxiv.org/pdf/1907.08276v3.pdf | |
PWC | https://paperswithcode.com/paper/an-ai-based-multi-stage-detection-system-of |
Repo | |
Framework | |
Federated Learning with Bayesian Differential Privacy
Title | Federated Learning with Bayesian Differential Privacy |
Authors | Aleksei Triastcyn, Boi Faltings |
Abstract | We consider the problem of reinforcing federated learning with formal privacy guarantees. We propose to employ Bayesian differential privacy, a relaxation of differential privacy for similarly distributed data, to provide sharper privacy loss bounds. We adapt the Bayesian privacy accounting method to the federated setting and suggest multiple improvements for more efficient privacy budgeting at different levels. Our experiments show a significant advantage over state-of-the-art differential privacy bounds for federated learning on image classification tasks, including a medical application, bringing the privacy budget below 1 at the client level and below 0.1 at the instance level. Lower amounts of noise also benefit model accuracy and reduce the number of communication rounds. |
Tasks | Image Classification |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.10071v1 |
https://arxiv.org/pdf/1911.10071v1.pdf | |
PWC | https://paperswithcode.com/paper/federated-learning-with-bayesian-differential |
Repo | |
Framework | |
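For orientation, a minimal sketch of the client-level Gaussian mechanism commonly used in differentially private federated averaging (clip each client update, add calibrated noise to the aggregate) is shown below. The paper's contribution, the Bayesian privacy accountant that tracks the cost of such rounds, is not reproduced, and all constants here are placeholders.

```python
# Sketch of one round of client-level DP federated averaging (standard mechanism,
# not the paper's accountant): clip each client's update, average, add Gaussian noise.
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0, seed=0):
    rng = np.random.default_rng(seed)
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
               for u in client_updates]                    # bound each client's influence
    avg = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(client_updates)
    return avg + rng.normal(0.0, sigma, size=avg.shape)    # noise on the averaged update

updates = [np.random.default_rng(i).normal(size=10) for i in range(5)]
print(dp_fedavg_round(updates))
```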
A Novel Adaptive Kernel for the RBF Neural Networks
Title | A Novel Adaptive Kernel for the RBF Neural Networks |
Authors | Shujaat Khan, Imran Naseem, Roberto Togneri, Mohammed Bennamoun |
Abstract | In this paper, we propose a novel adaptive kernel for radial basis function (RBF) neural networks. The proposed kernel adaptively fuses the Euclidean and cosine distance measures to exploit the reciprocating properties of the two. The proposed framework dynamically adapts the weights of the participating kernels using the gradient descent method, thereby alleviating the need for predetermined weights. The proposed method is shown to outperform manual fusion of the kernels on three major estimation problems, namely nonlinear system identification, pattern classification, and function approximation. |
Tasks | |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03546v1 |
https://arxiv.org/pdf/1905.03546v1.pdf | |
PWC | https://paperswithcode.com/paper/190503546 |
Repo | |
Framework | |
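A minimal sketch of the idea described in the abstract (not the authors' code), assuming a toy 2-D function-approximation task and arbitrary hyperparameters: an RBF network whose kernel response is a weighted fusion of a Gaussian (Euclidean) kernel and a cosine-similarity kernel, with the mixing weights learned by gradient descent alongside the output weights.

```python
# Sketch: RBF network with an adaptively fused Euclidean/cosine kernel.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (300, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])         # toy target for function approximation
centers = rng.uniform(-3, 3, (15, 2))         # RBF centres
sigma, lr = 1.0, 0.01

w = np.zeros(len(centers))                    # output weights
alpha, beta = 0.5, 0.5                        # kernel mixing weights (also learned)

def fused_kernel(x):
    g = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2 * sigma ** 2))   # Gaussian part
    c = (centers @ x) / (np.linalg.norm(centers, axis=1) * np.linalg.norm(x) + 1e-12)
    return g, c                                # cosine-similarity part

for epoch in range(30):
    for xi, ti in zip(X, y):
        g, c = fused_kernel(xi)
        phi = alpha * g + beta * c            # adaptively fused kernel response
        err = w @ phi - ti
        grad_w, grad_a, grad_b = err * phi, err * (w @ g), err * (w @ c)
        w -= lr * grad_w                      # LMS-style gradient-descent updates
        alpha -= lr * grad_a
        beta -= lr * grad_b

print(f"learned mixing weights: alpha={alpha:.3f}, beta={beta:.3f}")
```

Because alpha and beta are updated from the same squared-error gradient as the output weights, no hand-tuned fusion ratio is needed, which is the point the abstract makes.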
Real-Time Boiler Control Optimization with Machine Learning
Title | Real-Time Boiler Control Optimization with Machine Learning |
Authors | Yukun Ding, Yiyu Shi |
Abstract | In coal-fired power plants, it is critical to improve the operational efficiency of boilers for sustainability. In this work, we formulate real-time boiler control as an optimization problem that looks for the best distribution of temperature in different zones and oxygen content from the flue to improve the boiler’s stability and energy efficiency. We employ an efficient algorithm by integrating appropriate machine learning and optimization techniques. We obtain a large dataset collected from a real boiler for more than two months from our industry partner, and conduct extensive experiments to demonstrate the effectiveness and efficiency of the proposed algorithm. |
Tasks | |
Published | 2019-03-07 |
URL | http://arxiv.org/abs/1903.04958v1 |
http://arxiv.org/pdf/1903.04958v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-boiler-control-optimization-with |
Repo | |
Framework | |
A Survey on Neural Architecture Search
Title | A Survey on Neural Architecture Search |
Authors | Martin Wistuba, Ambrish Rawat, Tejaswini Pedapati |
Abstract | The growing interest in both the automation of machine learning and deep learning has inevitably led to the development of a wide variety of automated methods for neural architecture search. The choice of the network architecture has proven to be critical, and many advances in deep learning spring from its immediate improvements. However, deep learning techniques are computationally intensive and their application requires a high level of domain knowledge. Therefore, even partial automation of this process helps to make deep learning more accessible to both researchers and practitioners. With this survey, we provide a formalism which unifies and categorizes the landscape of existing methods along with a detailed analysis that compares and contrasts the different approaches. We achieve this via a comprehensive discussion of the commonly adopted architecture search spaces and architecture optimization algorithms based on principles of reinforcement learning and evolutionary algorithms along with approaches that incorporate surrogate and one-shot models. Additionally, we address the new research directions which include constrained and multi-objective architecture search as well as automated data augmentation, optimizer and activation function search. |
Tasks | Data Augmentation, Neural Architecture Search |
Published | 2019-05-04 |
URL | https://arxiv.org/abs/1905.01392v2 |
https://arxiv.org/pdf/1905.01392v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-neural-architecture-search |
Repo | |
Framework | |
Federated Multi-task Hierarchical Attention Model for Sensor Analytics
Title | Federated Multi-task Hierarchical Attention Model for Sensor Analytics |
Authors | Yujing Chen, Yue Ning, Zheng Chai, Huzefa Rangwala |
Abstract | Sensors are an integral part of modern Internet of Things (IoT) applications. There is a critical need for the analysis of heterogeneous multivariate temporal data obtained from the individual sensors of these systems. In this paper, we particularly focus on the problem of the scarce amount of training data available per sensor. We propose a novel federated multi-task hierarchical attention model (FATHOM) that jointly trains classification/regression models from multiple sensors. The attention mechanism of the proposed model seeks to extract feature representations from the input and learn a shared representation focused on time dimensions across multiple sensors. The underlying temporal and non-linear relationships are modeled using a combination of an attention mechanism and long short-term memory (LSTM) networks. We find that our proposed method outperforms a wide range of competitive baselines in both classification and regression settings on activity recognition and environment monitoring datasets. We further provide visualizations of the feature representations learned by our model at the input sensor level and central time level. |
Tasks | Activity Recognition |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05142v1 |
https://arxiv.org/pdf/1905.05142v1.pdf | |
PWC | https://paperswithcode.com/paper/federated-multi-task-hierarchical-attention |
Repo | |
Framework | |
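As a rough illustration of the hierarchical-attention idea (not the released FATHOM model), the sketch below encodes a multivariate sensor window with an LSTM and applies softmax attention over the time dimension before a task-specific head; layer sizes and names are hypothetical. In the federated multi-task setting described in the abstract, the attention parameters would be the shared component aggregated across clients while the heads stay local.

```python
# Sketch: LSTM encoder + attention over the time axis + task-specific head.
import torch
import torch.nn as nn

class TimeAttention(nn.Module):
    """Softmax attention over the time axis of an LSTM output sequence."""
    def __init__(self, hidden):
        super().__init__()
        self.score = nn.Linear(hidden, 1)

    def forward(self, h):                               # h: (batch, time, hidden)
        weights = torch.softmax(self.score(h), dim=1)   # (batch, time, 1)
        return (weights * h).sum(dim=1)                 # (batch, hidden)

class SensorModel(nn.Module):
    def __init__(self, n_features, hidden=32, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attend = TimeAttention(hidden)             # would be shared across clients
        self.head = nn.Linear(hidden, n_classes)        # task-specific output head

    def forward(self, x):                               # x: (batch, time, n_features)
        h, _ = self.lstm(x)
        return self.head(self.attend(h))

model = SensorModel(n_features=9)
out = model(torch.randn(4, 50, 9))   # 4 windows, 50 time steps, 9 sensor channels
print(out.shape)                     # torch.Size([4, 5])
```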
Towards Inverse Reinforcement Learning for Limit Order Book Dynamics
Title | Towards Inverse Reinforcement Learning for Limit Order Book Dynamics |
Authors | Jacobo Roa-Vicens, Cyrine Chtourou, Angelos Filos, Francisco Rullan, Yarin Gal, Ricardo Silva |
Abstract | Multi-agent learning is a promising method to simulate aggregate competitive behaviour in finance. Learning expert agents’ reward functions through their external demonstrations is hence particularly relevant for the subsequent design of realistic agent-based simulations. Inverse Reinforcement Learning (IRL) aims at acquiring such reward functions through inference, allowing the resulting policy to generalize to states not observed in the past. This paper investigates whether IRL can infer such rewards from agents within real financial stochastic environments: limit order books (LOB). We introduce a simple one-level LOB, where the interactions of a number of stochastic agents and an expert trading agent are modelled as a Markov decision process. We consider two cases for the expert’s reward: either a simple linear function of state features, or a complex, more realistic non-linear function. Given the expert agent’s demonstrations, we attempt to discover their strategy by modelling their latent reward function using linear and Gaussian process (GP) regressors from previous literature, and our own approach through Bayesian neural networks (BNN). While all three methods can learn the linear case, only the GP-based and our proposed BNN methods are able to discover the non-linear reward case. Our BNN IRL algorithm outperforms the other two approaches as the number of samples increases. These results illustrate that complex behaviours, induced by non-linear reward functions amid agent-based stochastic scenarios, can be deduced through inference, encouraging the use of inverse reinforcement learning for opponent-modelling in multi-agent systems. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04813v1 |
https://arxiv.org/pdf/1906.04813v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-inverse-reinforcement-learning-for |
Repo | |
Framework | |
Face Recognition: A Novel Multi-Level Taxonomy based Survey
Title | Face Recognition: A Novel Multi-Level Taxonomy based Survey |
Authors | Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia |
Abstract | In a world where security issues have been gaining growing importance, face recognition systems have attracted increasing attention in multiple application areas, ranging from forensics and surveillance to commerce and entertainment. To help understand the landscape and abstraction levels relevant for face recognition systems, face recognition taxonomies allow a deeper dissection and comparison of the existing solutions. This paper proposes a new, more encompassing and richer multi-level face recognition taxonomy, facilitating the organization and categorization of available and emerging face recognition solutions; this taxonomy may also guide researchers in the development of more efficient face recognition solutions. The proposed multi-level taxonomy considers levels related to the face structure, feature support, and feature extraction approach. Following the proposed taxonomy, a comprehensive survey of representative face recognition solutions is presented. The paper concludes with a discussion on current algorithmic and application-related challenges which may define future research directions for face recognition. |
Tasks | Face Recognition |
Published | 2019-01-03 |
URL | http://arxiv.org/abs/1901.00713v1 |
http://arxiv.org/pdf/1901.00713v1.pdf | |
PWC | https://paperswithcode.com/paper/face-recognition-a-novel-multi-level-taxonomy |
Repo | |
Framework | |
Parametric Gaussian Process Regressors
Title | Parametric Gaussian Process Regressors |
Authors | Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner |
Abstract | The combination of inducing point methods with stochastic variational inference has enabled approximate Gaussian Process (GP) inference on large datasets. Unfortunately, the resulting predictive distributions often exhibit substantially underestimated uncertainties. Notably, in the regression case the predictive variance is typically dominated by observation noise, yielding uncertainty estimates that make little use of the input-dependent function uncertainty that makes GP priors attractive. In this work we propose two simple methods for scalable GP regression that address this issue and thus yield substantially improved predictive uncertainties. The first applies variational inference to FITC (Fully Independent Training Conditional; Snelson et al., 2006). The second bypasses posterior approximations and instead directly targets the posterior predictive distribution. In an extensive empirical comparison with a number of alternative methods for scalable GP regression, we find that the resulting predictive distributions exhibit significantly better calibrated uncertainties and higher log likelihoods, often by as much as half a nat per datapoint. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07123v2 |
https://arxiv.org/pdf/1910.07123v2.pdf | |
PWC | https://paperswithcode.com/paper/sparse-gaussian-process-regression-beyond |
Repo | |
Framework | |
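For reference, the standard FITC predictive equations (Snelson & Ghahramani, 2006) that the paper's first method treats variationally are, in the usual sparse-GP notation (not taken verbatim from the paper, with inducing variables u, training outputs y, and test input *):

```latex
\begin{aligned}
\Lambda &= \operatorname{diag}\!\bigl(K_{ff} - K_{fu}K_{uu}^{-1}K_{uf}\bigr) + \sigma^2 I,\\
Q_M &= K_{uu} + K_{uf}\Lambda^{-1}K_{fu},\\
\mu_* &= K_{*u}\,Q_M^{-1}K_{uf}\,\Lambda^{-1}\,y,\\
\sigma_*^2 &= K_{**} - K_{*u}\bigl(K_{uu}^{-1} - Q_M^{-1}\bigr)K_{u*} + \sigma^2 .
\end{aligned}
```

The heteroscedastic diagonal correction in Λ is what lets the predictive variance reflect input-dependent function uncertainty rather than being dominated by the observation noise σ², which is the calibration issue the abstract targets.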
Multi-View Multiple Clustering
Title | Multi-View Multiple Clustering |
Authors | Shixing Yao, Guoxian Yu, Jun Wang, Carlotta Domeniconi, Xiangliang Zhang |
Abstract | Multiple clustering aims at exploring alternative clusterings to organize the data into meaningful groups from different perspectives. Existing multiple clustering algorithms are designed for single-view data. We assume that the individuality and commonality of multi-view data can be leveraged to generate high-quality and diverse clusterings. To this end, we propose a novel multi-view multiple clustering (MVMC) algorithm. MVMC first adapts multi-view self-representation learning to explore the individuality encoding matrices and the shared commonality matrix of multi-view data. It additionally reduces the redundancy (i.e., enhancing the individuality) among the matrices using the Hilbert-Schmidt Independence Criterion (HSIC), and collects shared information by forcing the shared matrix to be smooth across all views. It then uses matrix factorization on the individual matrices, along with the shared matrix, to generate diverse clusterings of high quality. We further extend multiple co-clustering to multi-view data and propose a solution called multi-view multiple co-clustering (MVMCC). Our empirical study shows that MVMC (MVMCC) can exploit multi-view data to generate multiple high-quality and diverse clusterings (co-clusterings), with performance superior to state-of-the-art methods. |
Tasks | Representation Learning |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05053v1 |
https://arxiv.org/pdf/1905.05053v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-view-multiple-clustering |
Repo | |
Framework | |
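The redundancy penalty named in the abstract is the Hilbert-Schmidt Independence Criterion. As a hedged illustration (MVMC's actual formulation and optimization are not reproduced), the snippet below computes the standard biased HSIC estimate between two representation matrices, assuming linear kernels for simplicity.

```python
# Sketch: biased HSIC estimate, the kind of redundancy measure MVMC penalizes
# between per-view individuality matrices (linear kernels assumed here).
import numpy as np

def hsic(A, B):
    """Biased HSIC estimate between row-wise representations A and B (n x d)."""
    n = A.shape[0]
    K, L = A @ A.T, B @ B.T                    # linear kernel matrices
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
view1 = rng.normal(size=(50, 8))
print("independent views:", hsic(view1, rng.normal(size=(50, 8))))
print("redundant views:  ", hsic(view1, view1 * 2.0))   # larger => more redundant
```

Minimizing this quantity between pairs of individuality matrices pushes the views toward encoding distinct structure, which is what "enhancing the individuality" refers to.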
Distributional property testing in a quantum world
Title | Distributional property testing in a quantum world |
Authors | András Gilyén, Tongyang Li |
Abstract | A fundamental problem in statistics and learning theory is to test properties of distributions. We show that quantum computers can solve such problems with significant speed-ups. In particular, we give fast quantum algorithms for testing closeness between unknown distributions, testing independence between two distributions, and estimating the Shannon / von Neumann entropy of distributions. The distributions can be either classical or quantum; however, our quantum algorithms require coherent quantum access to a process preparing the samples. Our results build on the recent technique of quantum singular value transformation, combined with more standard tricks such as divide-and-conquer. The presented approach is a natural fit for distributional property testing in both the classical and the quantum case, demonstrating the first speed-ups for testing properties of density operators that can be accessed coherently rather than only via sampling; for classical distributions our algorithms significantly improve the precision dependence of some earlier results. |
Tasks | |
Published | 2019-02-02 |
URL | http://arxiv.org/abs/1902.00814v1 |
http://arxiv.org/pdf/1902.00814v1.pdf | |
PWC | https://paperswithcode.com/paper/distributional-property-testing-in-a-quantum |
Repo | |
Framework | |
Reducing Adversarial Example Transferability Using Gradient Regularization
Title | Reducing Adversarial Example Transferability Using Gradient Regularization |
Authors | George Adam, Petr Smirnov, Benjamin Haibe-Kains, Anna Goldenberg |
Abstract | Deep learning algorithms have increasingly been shown to lack robustness to simple adversarial examples (AdvX). An equally troubling observation is that these adversarial examples transfer between different architectures trained on different datasets. We investigate the transferability of adversarial examples between models using the angle between the input-output Jacobians of different models. To demonstrate the relevance of this approach, we perform case studies that involve jointly training pairs of models. These case studies empirically justify the theoretical intuitions for why the angle between gradients is a fundamental quantity in AdvX transferability. Furthermore, we consider the asymmetry of AdvX transferability between two models of the same architecture and explain it in terms of differences in gradient norms between the models. Lastly, we provide a simple modification to existing training setups that reduces transferability of adversarial examples between pairs of models. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07980v1 |
http://arxiv.org/pdf/1904.07980v1.pdf | |
PWC | https://paperswithcode.com/paper/reducing-adversarial-example-transferability |
Repo | |
Framework | |
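A hedged sketch of the kind of training modification the abstract describes (not the authors' exact setup): two models are trained jointly with an added penalty on the cosine similarity between their input gradients, which increases the angle between their input-output Jacobians; the penalty weight and architecture below are arbitrary placeholders.

```python
# Sketch: joint training of two models with a gradient-alignment penalty.
import torch
import torch.nn.functional as F

def joint_loss(model_a, model_b, x, y, lam=1.0):
    x = x.clone().requires_grad_(True)
    loss_a = F.cross_entropy(model_a(x), y)
    loss_b = F.cross_entropy(model_b(x), y)
    grad_a, = torch.autograd.grad(loss_a, x, create_graph=True)
    grad_b, = torch.autograd.grad(loss_b, x, create_graph=True)
    # Penalize aligned input gradients to widen the angle between the models' Jacobians.
    cos = F.cosine_similarity(grad_a.flatten(1), grad_b.flatten(1), dim=1).mean()
    return loss_a + loss_b + lam * cos

models = [torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
          for _ in range(2)]
opt = torch.optim.Adam([p for m in models for p in m.parameters()], lr=1e-3)
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))   # toy batch
loss = joint_loss(models[0], models[1], x, y)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```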
Compression of Recurrent Neural Networks for Efficient Language Modeling
Title | Compression of Recurrent Neural Networks for Efficient Language Modeling |
Authors | Artem M. Grachev, Dmitry I. Ignatov, Andrey V. Savchenko |
Abstract | Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression methods in the context of their deployment on devices: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference, and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. The experimental study with the Penn Treebank (PTB) dataset shows that the most efficient results in terms of speed and compression-perplexity balance are obtained by matrix decomposition techniques. |
Tasks | Language Modelling, Quantization |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02380v1 |
http://arxiv.org/pdf/1902.02380v1.pdf | |
PWC | https://paperswithcode.com/paper/compression-of-recurrent-neural-networks-for |
Repo | |
Framework | |
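As a hedged illustration of one of the compression routes the abstract lists, the snippet below applies a truncated-SVD low-rank factorization to a large output (softmax) weight matrix; the rank, matrix sizes, and the random stand-in weights are arbitrary, and real trained matrices compress far better than a random example does.

```python
# Sketch: low-rank factorization of a vocabulary-sized output matrix via truncated SVD.
import numpy as np

def low_rank_factorize(W, rank):
    """Return factors A (out x rank) and B (rank x in) with A @ B approximating W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]
    return A, Vt[:rank]

rng = np.random.default_rng(0)
W = rng.normal(size=(10000, 650))     # vocab x hidden stand-in (random, for illustration)
A, B = low_rank_factorize(W, rank=64)
print(f"parameters: {W.size} -> {A.size + B.size} "
      f"({W.size / (A.size + B.size):.1f}x smaller)")
```

At inference time the dense multiplication by W is replaced by two thinner multiplications (by B, then A), which is where both the size and the speed savings come from.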