Paper Group ANR 429
Multi-level conformal clustering: A distribution-free technique for clustering and anomaly detection
Title | Multi-level conformal clustering: A distribution-free technique for clustering and anomaly detection |
Authors | Ilia Nouretdinov, James Gammerman, Matteo Fontana, Daljit Rehal |
Abstract | In this work we present a clustering technique called multi-level conformal clustering (MLCC). The technique is hierarchical in nature because it can be performed at multiple significance levels, which yields greater insight into the data than performing it at just one level. We describe the theoretical underpinnings of MLCC, compare and contrast it with the hierarchical clustering algorithm, and then apply it to real-world datasets to assess its performance. There are several advantages to using MLCC over more classical clustering techniques: once a significance level has been set, MLCC is able to automatically select the number of clusters. Furthermore, thanks to the conformal prediction framework, the resulting clustering model has a clear statistical meaning without any assumptions about the distribution of the data. This statistical robustness also allows us to perform clustering and anomaly detection simultaneously. Moreover, due to the flexibility of the conformal prediction framework, our algorithm can be used on top of many other machine learning algorithms. |
Tasks | Anomaly Detection |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08105v2 |
https://arxiv.org/pdf/1910.08105v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-level-conformal-clustering-a |
Repo | |
Framework | |
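Illustrative sketch (not the authors' implementation): conformal clustering typically evaluates a grid of candidate points, computes a conformal p-value for each via a nonconformity measure, and keeps the points whose p-value exceeds the significance level; connected components of the kept region act as clusters, and observations falling outside every component are flagged as anomalies. The snippet below sketches the p-value computation under assumed choices (2-D data, k-NN distance as the nonconformity measure, a rectangular grid).

```python
# Minimal sketch of conformal p-values for clustering / anomaly detection.
# Assumptions (not from the paper): 2-D data, k-NN distance nonconformity,
# and the common leave-one-out approximation for the training scores.
import numpy as np
from scipy.spatial.distance import cdist

def knn_score(point, reference, k=3):
    """Nonconformity: distance to the k-th nearest neighbour in `reference`."""
    return np.sort(cdist(point[None, :], reference).ravel())[k - 1]

rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
n = len(train)

# Leave-one-out nonconformity scores of the training points (computed once).
train_scores = np.array([knn_score(train[i], np.delete(train, i, axis=0))
                         for i in range(n)])

def p_value(z):
    """Conformal p-value of a candidate point z with respect to the training set."""
    return (np.sum(train_scores >= knn_score(z, train)) + 1) / (n + 1)

# At each significance level eps, the kept grid points form the prediction
# region; its connected components act as clusters, and training points outside
# every component can be flagged as anomalies.
grid = np.stack(np.meshgrid(np.linspace(-1, 4, 40),
                            np.linspace(-1, 4, 40)), -1).reshape(-1, 2)
for eps in (0.05, 0.10, 0.20):
    kept = np.array([p_value(g) > eps for g in grid])
    print(f"eps={eps:.2f}: {kept.sum()} grid points in the prediction region")
```

Sweeping eps is what makes the procedure multi-level: raising eps shrinks the prediction region and splits or discards clusters, while lowering it merges them, giving a hierarchy-like view of the data.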
Enabling Computer Vision Driven Assistive Devices for the Visually Impaired via Micro-architecture Design Exploration
Title | Enabling Computer Vision Driven Assistive Devices for the Visually Impaired via Micro-architecture Design Exploration |
Authors | Linda Wang, Alexander Wong |
Abstract | Recent improvements in object detection have shown potential to aid in tasks where previous solutions fell short. A particular area is assistive devices for individuals with visual impairment. While state-of-the-art deep neural networks have been shown to achieve superior object detection performance, their high computational and memory requirements make them cost-prohibitive for on-device operation. Alternatively, cloud-based operation raises privacy concerns; neither option is attractive to potential users. To address these challenges, this study investigates creating an efficient object detection network specifically for OLIV, an AI-powered assistant for object localization for the visually impaired, via micro-architecture design exploration. In particular, we formulate the problem of finding an optimal network micro-architecture as a numerical optimization problem, where we find the set of hyperparameters controlling the MobileNetV2-SSD network micro-architecture that maximizes a modified NetScore objective function for the MSCOCO-OLIV dataset of indoor objects. Experimental results show that such a micro-architecture design exploration strategy leads to a compact deep neural network with a balanced trade-off between accuracy, size, and speed, making it well-suited for enabling on-device computer vision driven assistive devices for the visually impaired. |
Tasks | Object Detection, Object Localization |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.07836v1 |
https://arxiv.org/pdf/1905.07836v1.pdf | |
PWC | https://paperswithcode.com/paper/enabling-computer-vision-driven-assistive |
Repo | |
Framework | |
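For context, the objective mentioned in the abstract builds on the NetScore metric. The sketch below shows the generic (unmodified) NetScore formula together with a toy random search over a hypothetical width-multiplier hyperparameter; the paper's modified objective, actual search space, and training pipeline are not reproduced, and the `evaluate` function only fabricates numbers so the example runs.

```python
# Illustrative sketch only: generic NetScore (accuracy vs. parameters vs. MACs)
# plus a toy random search over a hypothetical micro-architecture hyperparameter.
import math
import random

def netscore(accuracy_pct, params_M, macs_M, kappa=2.0, beta=0.5, gamma=0.5):
    """NetScore-style objective: reward accuracy, penalize parameter and MAC counts
    (parameters and MACs expressed in millions here)."""
    return 20 * math.log10(accuracy_pct ** kappa /
                           (params_M ** beta * macs_M ** gamma))

def evaluate(config):
    """Placeholder for training/measuring a MobileNetV2-SSD variant.
    The returned numbers are fabricated stand-ins, not real measurements."""
    width = config["width_multiplier"]
    return {"accuracy_pct": 50 + 20 * width,   # stand-in detection accuracy
            "params_M": 3.0 * width ** 2,      # stand-in parameter count
            "macs_M": 800 * width ** 2}        # stand-in MAC count

random.seed(0)
best = None
for _ in range(20):                            # toy random search over the space
    config = {"width_multiplier": random.uniform(0.35, 1.0)}
    score = netscore(**evaluate(config))
    if best is None or score > best[0]:
        best = (score, config)
print("best NetScore %.2f with config %s" % best)
```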
An AI-based, Multi-stage detection system of banking botnets
Title | An AI-based, Multi-stage detection system of banking botnets |
Authors | Li Ling, Zhiqiang Gao, Michael A Silas, Ian Lee, Erwan A Le Doeuff |
Abstract | Banking Trojans and botnets are primary drivers of financially-motivated cybercrime. In this paper, we first analyze how an APT-based banking botnet works, step by step, through its whole lifecycle. Specifically, we present a multi-stage system that detects malicious banking botnet activities which potentially target organizations. The system leverages a Cyber Data Lake as well as multiple artificial intelligence techniques at different stages. The evaluation results using public datasets showed that deep-learning-based detection was highly successful compared with baseline models. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08276v3 |
https://arxiv.org/pdf/1907.08276v3.pdf | |
PWC | https://paperswithcode.com/paper/an-ai-based-multi-stage-detection-system-of |
Repo | |
Framework | |
Federated Learning with Bayesian Differential Privacy
Title | Federated Learning with Bayesian Differential Privacy |
Authors | Aleksei Triastcyn, Boi Faltings |
Abstract | We consider the problem of reinforcing federated learning with formal privacy guarantees. We propose to employ Bayesian differential privacy, a relaxation of differential privacy for similarly distributed data, to provide sharper privacy loss bounds. We adapt the Bayesian privacy accounting method to the federated setting and suggest multiple improvements for more efficient privacy budgeting at different levels. Our experiments show a significant advantage over state-of-the-art differential privacy bounds for federated learning on image classification tasks, including a medical application, bringing the privacy budget below 1 at the client level and below 0.1 at the instance level. Lower amounts of noise also benefit model accuracy and reduce the number of communication rounds. |
Tasks | Image Classification |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.10071v1 |
https://arxiv.org/pdf/1911.10071v1.pdf | |
PWC | https://paperswithcode.com/paper/federated-learning-with-bayesian-differential |
Repo | |
Framework | |
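For orientation, a minimal sketch of the client-level Gaussian mechanism commonly used in differentially private federated averaging (clip each client update, add calibrated noise to the aggregate) is shown below. The paper's contribution, the Bayesian privacy accountant that tracks the cost of such rounds, is not reproduced, and all constants here are placeholders.

```python
# Sketch of one round of client-level DP federated averaging (standard mechanism,
# not the paper's accountant): clip each client's update, average, add Gaussian noise.
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0, seed=0):
    rng = np.random.default_rng(seed)
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
               for u in client_updates]                    # bound each client's influence
    avg = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(client_updates)
    return avg + rng.normal(0.0, sigma, size=avg.shape)    # noise on the averaged update

updates = [np.random.default_rng(i).normal(size=10) for i in range(5)]
print(dp_fedavg_round(updates))
```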
A Novel Adaptive Kernel for the RBF Neural Networks
Title | A Novel Adaptive Kernel for the RBF Neural Networks |
Authors | Shujaat Khan, Imran Naseem, Roberto Togneri, Mohammed Bennamoun |
Abstract | In this paper, we propose a novel adaptive kernel for radial basis function (RBF) neural networks. The proposed kernel adaptively fuses the Euclidean and cosine distance measures to exploit the reciprocating properties of the two. The proposed framework dynamically adapts the weights of the participating kernels using the gradient descent method, thereby alleviating the need for predetermined weights. The proposed method is shown to outperform manual fusion of the kernels on three major estimation problems, namely nonlinear system identification, pattern classification, and function approximation. |
Tasks | |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03546v1 |
https://arxiv.org/pdf/1905.03546v1.pdf | |
PWC | https://paperswithcode.com/paper/190503546 |
Repo | |
Framework | |
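A minimal sketch of the idea described in the abstract (not the authors' code), assuming a toy 2-D function-approximation task and arbitrary hyperparameters: an RBF network whose kernel response is a weighted fusion of a Gaussian (Euclidean) kernel and a cosine-similarity kernel, with the mixing weights learned by gradient descent alongside the output weights.

```python
# Sketch: RBF network with an adaptively fused Euclidean/cosine kernel.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (300, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])         # toy target for function approximation
centers = rng.uniform(-3, 3, (15, 2))         # RBF centres
sigma, lr = 1.0, 0.01

w = np.zeros(len(centers))                    # output weights
alpha, beta = 0.5, 0.5                        # kernel mixing weights (also learned)

def fused_kernel(x):
    g = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2 * sigma ** 2))   # Gaussian part
    c = (centers @ x) / (np.linalg.norm(centers, axis=1) * np.linalg.norm(x) + 1e-12)
    return g, c                                # cosine-similarity part

for epoch in range(30):
    for xi, ti in zip(X, y):
        g, c = fused_kernel(xi)
        phi = alpha * g + beta * c            # adaptively fused kernel response
        err = w @ phi - ti
        grad_w, grad_a, grad_b = err * phi, err * (w @ g), err * (w @ c)
        w -= lr * grad_w                      # LMS-style gradient-descent updates
        alpha -= lr * grad_a
        beta -= lr * grad_b

print(f"learned mixing weights: alpha={alpha:.3f}, beta={beta:.3f}")
```

Because alpha and beta are updated from the same squared-error gradient as the output weights, no hand-tuned fusion ratio is needed, which is the point the abstract makes.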
Real-Time Boiler Control Optimization with Machine Learning
Title | Real-Time Boiler Control Optimization with Machine Learning |
Authors | Yukun Ding, Yiyu Shi |
Abstract | In coal-fired power plants, it is critical to improve the operational efficiency of boilers for sustainability. In this work, we formulate real-time boiler control as an optimization problem that looks for the best distribution of temperature in different zones and oxygen content from the flue to improve the boiler’s stability and energy efficiency. We employ an efficient algorithm by integrating appropriate machine learning and optimization techniques. We obtain a large dataset collected from a real boiler for more than two months from our industry partner, and conduct extensive experiments to demonstrate the effectiveness and efficiency of the proposed algorithm. |
Tasks | |
Published | 2019-03-07 |
URL | http://arxiv.org/abs/1903.04958v1 |
http://arxiv.org/pdf/1903.04958v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-boiler-control-optimization-with |
Repo | |
Framework | |
A Survey on Neural Architecture Search
Title | A Survey on Neural Architecture Search |
Authors | Martin Wistuba, Ambrish Rawat, Tejaswini Pedapati |
Abstract | The growing interest in both the automation of machine learning and deep learning has inevitably led to the development of a wide variety of automated methods for neural architecture search. The choice of the network architecture has proven to be critical, and many advances in deep learning spring from its immediate improvements. However, deep learning techniques are computationally intensive and their application requires a high level of domain knowledge. Therefore, even partial automation of this process helps to make deep learning more accessible to both researchers and practitioners. With this survey, we provide a formalism which unifies and categorizes the landscape of existing methods along with a detailed analysis that compares and contrasts the different approaches. We achieve this via a comprehensive discussion of the commonly adopted architecture search spaces and architecture optimization algorithms based on principles of reinforcement learning and evolutionary algorithms along with approaches that incorporate surrogate and one-shot models. Additionally, we address the new research directions which include constrained and multi-objective architecture search as well as automated data augmentation, optimizer and activation function search. |
Tasks | Data Augmentation, Neural Architecture Search |
Published | 2019-05-04 |
URL | https://arxiv.org/abs/1905.01392v2 |
https://arxiv.org/pdf/1905.01392v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-neural-architecture-search |
Repo | |
Framework | |
Federated Multi-task Hierarchical Attention Model for Sensor Analytics
Title | Federated Multi-task Hierarchical Attention Model for Sensor Analytics |
Authors | Yujing Chen, Yue Ning, Zheng Chai, Huzefa Rangwala |
Abstract | Sensors are an integral part of modern Internet of Things (IoT) applications. There is a critical need for the analysis of heterogeneous multivariate temporal data obtained from the individual sensors of these systems. In this paper, we particularly focus on the problem of the scarce amount of training data available per sensor. We propose a novel federated multi-task hierarchical attention model (FATHOM) that jointly trains classification/regression models from multiple sensors. The attention mechanism of the proposed model seeks to extract feature representations from the input and learn a shared representation focused on time dimensions across multiple sensors. The underlying temporal and non-linear relationships are modeled using a combination of an attention mechanism and long short-term memory (LSTM) networks. We find that our proposed method outperforms a wide range of competitive baselines in both classification and regression settings on activity recognition and environment monitoring datasets. We further provide visualizations of the feature representations learned by our model at the input sensor level and central time level. |
Tasks | Activity Recognition |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05142v1 |
https://arxiv.org/pdf/1905.05142v1.pdf | |
PWC | https://paperswithcode.com/paper/federated-multi-task-hierarchical-attention |
Repo | |
Framework | |
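As a rough illustration of the hierarchical-attention idea (not the released FATHOM model), the sketch below encodes a multivariate sensor window with an LSTM and applies softmax attention over the time dimension before a task-specific head; layer sizes and names are hypothetical. In the federated multi-task setting described in the abstract, the attention parameters would be the shared component aggregated across clients while the heads stay local.

```python
# Sketch: LSTM encoder + attention over the time axis + task-specific head.
import torch
import torch.nn as nn

class TimeAttention(nn.Module):
    """Softmax attention over the time axis of an LSTM output sequence."""
    def __init__(self, hidden):
        super().__init__()
        self.score = nn.Linear(hidden, 1)

    def forward(self, h):                               # h: (batch, time, hidden)
        weights = torch.softmax(self.score(h), dim=1)   # (batch, time, 1)
        return (weights * h).sum(dim=1)                 # (batch, hidden)

class SensorModel(nn.Module):
    def __init__(self, n_features, hidden=32, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attend = TimeAttention(hidden)             # would be shared across clients
        self.head = nn.Linear(hidden, n_classes)        # task-specific output head

    def forward(self, x):                               # x: (batch, time, n_features)
        h, _ = self.lstm(x)
        return self.head(self.attend(h))

model = SensorModel(n_features=9)
out = model(torch.randn(4, 50, 9))   # 4 windows, 50 time steps, 9 sensor channels
print(out.shape)                     # torch.Size([4, 5])
```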
Towards Inverse Reinforcement Learning for Limit Order Book Dynamics
Title | Towards Inverse Reinforcement Learning for Limit Order Book Dynamics |
Authors | Jacobo Roa-Vicens, Cyrine Chtourou, Angelos Filos, Francisco Rullan, Yarin Gal, Ricardo Silva |
Abstract | Multi-agent learning is a promising method to simulate aggregate competitive behaviour in finance. Learning expert agents’ reward functions through their external demonstrations is hence particularly relevant for the subsequent design of realistic agent-based simulations. Inverse Reinforcement Learning (IRL) aims at acquiring such reward functions through inference, allowing the resulting policy to generalize to states not observed in the past. This paper investigates whether IRL can infer such rewards from agents within real financial stochastic environments: limit order books (LOB). We introduce a simple one-level LOB, where the interactions of a number of stochastic agents and an expert trading agent are modelled as a Markov decision process. We consider two cases for the expert’s reward: either a simple linear function of state features, or a complex, more realistic non-linear function. Given the expert agent’s demonstrations, we attempt to discover their strategy by modelling their latent reward function using linear and Gaussian process (GP) regressors from previous literature, and our own approach through Bayesian neural networks (BNN). While all three methods can learn the linear case, only the GP-based and our proposed BNN methods are able to discover the non-linear reward case. Our BNN IRL algorithm outperforms the other two approaches as the number of samples increases. These results illustrate that complex behaviours, induced by non-linear reward functions amid agent-based stochastic scenarios, can be deduced through inference, encouraging the use of inverse reinforcement learning for opponent-modelling in multi-agent systems. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04813v1 |
https://arxiv.org/pdf/1906.04813v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-inverse-reinforcement-learning-for |
Repo | |
Framework | |
Face Recognition: A Novel Multi-Level Taxonomy based Survey
Title | Face Recognition: A Novel Multi-Level Taxonomy based Survey |
Authors | Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia |
Abstract | In a world where security issues have been gaining growing importance, face recognition systems have attracted increasing attention in multiple application areas, ranging from forensics and surveillance to commerce and entertainment. To help understand the landscape and abstraction levels relevant for face recognition systems, face recognition taxonomies allow a deeper dissection and comparison of the existing solutions. This paper proposes a new, more encompassing and richer multi-level face recognition taxonomy, facilitating the organization and categorization of available and emerging face recognition solutions; this taxonomy may also guide researchers in the development of more efficient face recognition solutions. The proposed multi-level taxonomy considers levels related to the face structure, feature support, and feature extraction approach. Following the proposed taxonomy, a comprehensive survey of representative face recognition solutions is presented. The paper concludes with a discussion on current algorithmic and application-related challenges which may define future research directions for face recognition. |
Tasks | Face Recognition |
Published | 2019-01-03 |
URL | http://arxiv.org/abs/1901.00713v1 |
http://arxiv.org/pdf/1901.00713v1.pdf | |
PWC | https://paperswithcode.com/paper/face-recognition-a-novel-multi-level-taxonomy |
Repo | |
Framework | |
Parametric Gaussian Process Regressors
Title | Parametric Gaussian Process Regressors |
Authors | Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner |
Abstract | The combination of inducing point methods with stochastic variational inference has enabled approximate Gaussian Process (GP) inference on large datasets. Unfortunately, the resulting predictive distributions often exhibit substantially underestimated uncertainties. Notably, in the regression case the predictive variance is typically dominated by observation noise, yielding uncertainty estimates that make little use of the input-dependent function uncertainty that makes GP priors attractive. In this work we propose two simple methods for scalable GP regression that address this issue and thus yield substantially improved predictive uncertainties. The first applies variational inference to FITC (Fully Independent Training Conditional; Snelson et al., 2006). The second bypasses posterior approximations and instead directly targets the posterior predictive distribution. In an extensive empirical comparison with a number of alternative methods for scalable GP regression, we find that the resulting predictive distributions exhibit significantly better calibrated uncertainties and higher log likelihoods, often by as much as half a nat per datapoint. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07123v2 |
https://arxiv.org/pdf/1910.07123v2.pdf | |
PWC | https://paperswithcode.com/paper/sparse-gaussian-process-regression-beyond |
Repo | |
Framework | |
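For reference, the standard FITC predictive equations (Snelson & Ghahramani, 2006) that the paper's first method treats variationally are, in the usual sparse-GP notation (not taken verbatim from the paper, with inducing variables u, training outputs y, and test input *):

```latex
\begin{aligned}
\Lambda &= \operatorname{diag}\!\bigl(K_{ff} - K_{fu}K_{uu}^{-1}K_{uf}\bigr) + \sigma^2 I,\\
Q_M &= K_{uu} + K_{uf}\Lambda^{-1}K_{fu},\\
\mu_* &= K_{*u}\,Q_M^{-1}K_{uf}\,\Lambda^{-1}\,y,\\
\sigma_*^2 &= K_{**} - K_{*u}\bigl(K_{uu}^{-1} - Q_M^{-1}\bigr)K_{u*} + \sigma^2 .
\end{aligned}
```

The heteroscedastic diagonal correction in Λ is what lets the predictive variance reflect input-dependent function uncertainty rather than being dominated by the observation noise σ², which is the calibration issue the abstract targets.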
Multi-View Multiple Clustering
Title | Multi-View Multiple Clustering |
Authors | Shixing Yao, Guoxian Yu, Jun Wang, Carlotta Domeniconi, Xiangliang Zhang |
Abstract | Multiple clustering aims at exploring alternative clusterings to organize the data into meaningful groups from different perspectives. Existing multiple clustering algorithms are designed for single-view data. We assume that the individuality and commonality of multi-view data can be leveraged to generate high-quality and diverse clusterings. To this end, we propose a novel multi-view multiple clustering (MVMC) algorithm. MVMC first adapts multi-view self-representation learning to explore the individuality encoding matrices and the shared commonality matrix of multi-view data. It additionally reduces the redundancy (i.e., enhancing the individuality) among the matrices using the Hilbert-Schmidt Independence Criterion (HSIC), and collects shared information by forcing the shared matrix to be smooth across all views. It then uses matrix factorization on the individual matrices, along with the shared matrix, to generate diverse clusterings of high quality. We further extend multiple co-clustering to multi-view data and propose a solution called multi-view multiple co-clustering (MVMCC). Our empirical study shows that MVMC (MVMCC) can exploit multi-view data to generate multiple high-quality and diverse clusterings (co-clusterings), with performance superior to state-of-the-art methods. |
Tasks | Representation Learning |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05053v1 |
https://arxiv.org/pdf/1905.05053v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-view-multiple-clustering |
Repo | |
Framework | |
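The redundancy penalty named in the abstract is the Hilbert-Schmidt Independence Criterion. As a hedged illustration (MVMC's actual formulation and optimization are not reproduced), the snippet below computes the standard biased HSIC estimate between two representation matrices, assuming linear kernels for simplicity.

```python
# Sketch: biased HSIC estimate, the kind of redundancy measure MVMC penalizes
# between per-view individuality matrices (linear kernels assumed here).
import numpy as np

def hsic(A, B):
    """Biased HSIC estimate between row-wise representations A and B (n x d)."""
    n = A.shape[0]
    K, L = A @ A.T, B @ B.T                    # linear kernel matrices
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
view1 = rng.normal(size=(50, 8))
print("independent views:", hsic(view1, rng.normal(size=(50, 8))))
print("redundant views:  ", hsic(view1, view1 * 2.0))   # larger => more redundant
```

Minimizing this quantity between pairs of individuality matrices pushes the views toward encoding distinct structure, which is what "enhancing the individuality" refers to.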
Distributional property testing in a quantum world
Title | Distributional property testing in a quantum world |
Authors | András Gilyén, Tongyang Li |
Abstract | A fundamental problem in statistics and learning theory is to test properties of distributions. We show that quantum computers can solve such problems with significant speed-ups. In particular, we give fast quantum algorithms for testing closeness between unknown distributions, testing independence between two distributions, and estimating the Shannon / von Neumann entropy of distributions. The distributions can be either classical or quantum; however, our quantum algorithms require coherent quantum access to a process preparing the samples. Our results build on the recent technique of quantum singular value transformation, combined with more standard tricks such as divide-and-conquer. The presented approach is a natural fit for distributional property testing in both the classical and the quantum case, demonstrating the first speed-ups for testing properties of density operators that can be accessed coherently rather than only via sampling; for classical distributions our algorithms significantly improve the precision dependence of some earlier results. |
Tasks | |
Published | 2019-02-02 |
URL | http://arxiv.org/abs/1902.00814v1 |
http://arxiv.org/pdf/1902.00814v1.pdf | |
PWC | https://paperswithcode.com/paper/distributional-property-testing-in-a-quantum |
Repo | |
Framework | |
Reducing Adversarial Example Transferability Using Gradient Regularization
Title | Reducing Adversarial Example Transferability Using Gradient Regularization |
Authors | George Adam, Petr Smirnov, Benjamin Haibe-Kains, Anna Goldenberg |
Abstract | Deep learning algorithms have increasingly been shown to lack robustness to simple adversarial examples (AdvX). An equally troubling observation is that these adversarial examples transfer between different architectures trained on different datasets. We investigate the transferability of adversarial examples between models using the angle between the input-output Jacobians of different models. To demonstrate the relevance of this approach, we perform case studies that involve jointly training pairs of models. These case studies empirically justify the theoretical intuitions for why the angle between gradients is a fundamental quantity in AdvX transferability. Furthermore, we consider the asymmetry of AdvX transferability between two models of the same architecture and explain it in terms of differences in gradient norms between the models. Lastly, we provide a simple modification to existing training setups that reduces transferability of adversarial examples between pairs of models. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07980v1 |
http://arxiv.org/pdf/1904.07980v1.pdf | |
PWC | https://paperswithcode.com/paper/reducing-adversarial-example-transferability |
Repo | |
Framework | |
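A hedged sketch of the kind of training modification the abstract describes (not the authors' exact setup): two models are trained jointly with an added penalty on the cosine similarity between their input gradients, which increases the angle between their input-output Jacobians; the penalty weight and architecture below are arbitrary placeholders.

```python
# Sketch: joint training of two models with a gradient-alignment penalty.
import torch
import torch.nn.functional as F

def joint_loss(model_a, model_b, x, y, lam=1.0):
    x = x.clone().requires_grad_(True)
    loss_a = F.cross_entropy(model_a(x), y)
    loss_b = F.cross_entropy(model_b(x), y)
    grad_a, = torch.autograd.grad(loss_a, x, create_graph=True)
    grad_b, = torch.autograd.grad(loss_b, x, create_graph=True)
    # Penalize aligned input gradients to widen the angle between the models' Jacobians.
    cos = F.cosine_similarity(grad_a.flatten(1), grad_b.flatten(1), dim=1).mean()
    return loss_a + loss_b + lam * cos

models = [torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
          for _ in range(2)]
opt = torch.optim.Adam([p for m in models for p in m.parameters()], lr=1e-3)
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))   # toy batch
loss = joint_loss(models[0], models[1], x, y)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```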
Compression of Recurrent Neural Networks for Efficient Language Modeling
Title | Compression of Recurrent Neural Networks for Efficient Language Modeling |
Authors | Artem M. Grachev, Dmitry I. Ignatov, Andrey V. Savchenko |
Abstract | Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression methods in the context of their deployment on devices: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference, and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. The experimental study with the Penn Treebank (PTB) dataset shows that the most efficient results in terms of speed and compression-perplexity balance are obtained by matrix decomposition techniques. |
Tasks | Language Modelling, Quantization |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02380v1 |
http://arxiv.org/pdf/1902.02380v1.pdf | |
PWC | https://paperswithcode.com/paper/compression-of-recurrent-neural-networks-for |
Repo | |
Framework | |
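As a hedged illustration of one of the compression routes the abstract lists, the snippet below applies a truncated-SVD low-rank factorization to a large output (softmax) weight matrix; the rank, matrix sizes, and the random stand-in weights are arbitrary, and real trained matrices compress far better than a random example does.

```python
# Sketch: low-rank factorization of a vocabulary-sized output matrix via truncated SVD.
import numpy as np

def low_rank_factorize(W, rank):
    """Return factors A (out x rank) and B (rank x in) with A @ B approximating W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]
    return A, Vt[:rank]

rng = np.random.default_rng(0)
W = rng.normal(size=(10000, 650))     # vocab x hidden stand-in (random, for illustration)
A, B = low_rank_factorize(W, rank=64)
print(f"parameters: {W.size} -> {A.size + B.size} "
      f"({W.size / (A.size + B.size):.1f}x smaller)")
```

At inference time the dense multiplication by W is replaced by two thinner multiplications (by B, then A), which is where both the size and the speed savings come from.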