January 27, 2020

3412 words 17 mins read

Paper Group ANR 1203

Paper Group ANR 1203

Cardiac MRI Segmentation with Strong Anatomical Guarantees. A necessary and sufficient stability notion for adaptive generalization. Necessary and Sufficient Geometries for Gradient Methods. Adaptive Configuration Oracle for Online Portfolio Selection Methods. Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay …

Cardiac MRI Segmentation with Strong Anatomical Guarantees

Title Cardiac MRI Segmentation with Strong Anatomical Guarantees
Authors Nathan Painchaud, Youssef Skandarani, Thierry Judge, Olivier Bernard, Alain Lalande, Pierre-Marc Jodoin
Abstract Recent publications have shown that the segmentation accuracy of modern-day convolutional neural networks (CNN) applied on cardiac MRI can reach the inter-expert variability, a great achievement in this area of research. However, despite these successes, CNNs still produce anatomically inaccurate segmentations as they provide no guarantee on the anatomical plausibility of their outcome, even when using a shape prior. In this paper, we propose a cardiac MRI segmentation method which always produces anatomically plausible results. At the core of the method is an adversarial variational autoencoder (aVAE) whose latent space encodes a smooth manifold on which lies a large spectrum of valid cardiac shapes. This aVAE is used to automatically warp anatomically inaccurate cardiac shapes towards a close but correct shape. Our method can accommodate any cardiac segmentation method and convert its anatomically implausible results to plausible ones without affecting its overall geometric and clinical metrics. With our method, CNNs can now produce results that are both within the inter-expert variability and always anatomically plausible.
Tasks Cardiac Segmentation
Published 2019-07-05
URL https://arxiv.org/abs/1907.02865v2
PDF https://arxiv.org/pdf/1907.02865v2.pdf
PWC https://paperswithcode.com/paper/cardiac-mri-segmentation-with-strong
Repo
Framework

A necessary and sufficient stability notion for adaptive generalization

Title A necessary and sufficient stability notion for adaptive generalization
Authors Katrina Ligett, Moshe Shenfeld
Abstract We introduce a new notion of the stability of computations, which holds under post-processing and adaptive composition. We show that the notion is both necessary and sufficient to ensure generalization in the face of adaptivity, for any computations that respond to bounded-sensitivity linear queries while providing accuracy with respect to the data sample set. The stability notion is based on quantifying the effect of observing a computation’s outputs on the posterior over the data sample elements. We show a separation between this stability notion and previously studied notion and observe that all differentially private algorithms also satisfy this notion.
Tasks
Published 2019-06-03
URL https://arxiv.org/abs/1906.00930v2
PDF https://arxiv.org/pdf/1906.00930v2.pdf
PWC https://paperswithcode.com/paper/190600930
Repo
Framework

Necessary and Sufficient Geometries for Gradient Methods

Title Necessary and Sufficient Geometries for Gradient Methods
Authors Daniel Levy, John C. Duchi
Abstract We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We further provide a converse that shows that when the constraints are not quadratically convex—for example, any $\ell_p$-ball for $p < 2$—the methods are far from optimal. Based on this, we can provide concrete recommendations for when one should use adaptive, mirror or stochastic gradient methods.
Tasks
Published 2019-09-23
URL https://arxiv.org/abs/1909.10455v2
PDF https://arxiv.org/pdf/1909.10455v2.pdf
PWC https://paperswithcode.com/paper/necessary-and-sufficient-conditions-for-2
Repo
Framework

Adaptive Configuration Oracle for Online Portfolio Selection Methods

Title Adaptive Configuration Oracle for Online Portfolio Selection Methods
Authors Favour M. Nyikosa, Michael A. Osborne, Stephen J. Roberts
Abstract Financial markets are complex environments that produce enormous amounts of noisy and non-stationary data. One fundamental problem is online portfolio selection, the goal of which is to exploit this data to sequentially select portfolios of assets to achieve positive investment outcomes while managing risks. Various algorithms have been proposed for solving this problem in fields such as finance, statistics and machine learning, among others. Most of the methods have parameters that are estimated from backtests for good performance. Since these algorithms operate on non-stationary data that reflects the complexity of financial markets, we posit that adaptively tuning these parameters in an intelligent manner is a remedy for dealing with this complexity. In this paper, we model the mapping between the parameter space and the space of performance metrics using a Gaussian process prior. We then propose an oracle based on adaptive Bayesian optimization for automatically and adaptively configuring online portfolio selection methods. We test the efficacy of our solution on algorithms operating on equity and index data from various markets.
Tasks
Published 2019-08-22
URL https://arxiv.org/abs/1908.08258v1
PDF https://arxiv.org/pdf/1908.08258v1.pdf
PWC https://paperswithcode.com/paper/adaptive-configuration-oracle-for-online
Repo
Framework

Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness

Title Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness
Authors Xueru Zhang, Mohammad Mahdi Khalili, Cem Tekin, Mingyan Liu
Abstract Machine Learning (ML) models trained on data from multiple demographic groups can inherit representation disparity (Hashimoto et al., 2018) that may exist in the data: the model may be less favorable to groups contributing less to the training process; this in turn can degrade population retention in these groups over time, and exacerbate representation disparity in the long run. In this study, we seek to understand the interplay between ML decisions and the underlying group representation, how they evolve in a sequential framework, and how the use of fairness criteria plays a role in this process. We show that the representation disparity can easily worsen over time under a natural user dynamics (arrival and departure) model when decisions are made based on a commonly used objective and fairness criteria, resulting in some groups diminishing entirely from the sample pool in the long run. It highlights the fact that fairness criteria have to be defined while taking into consideration the impact of decisions on user dynamics. Toward this end, we explain how a proper fairness criterion can be selected based on a general user dynamics model.
Tasks Decision Making
Published 2019-05-02
URL https://arxiv.org/abs/1905.00569v2
PDF https://arxiv.org/pdf/1905.00569v2.pdf
PWC https://paperswithcode.com/paper/long-term-impact-of-fair-machine-learning-in
Repo
Framework

On the Noisy Gradient Descent that Generalizes as SGD

Title On the Noisy Gradient Descent that Generalizes as SGD
Authors Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu
Abstract The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning. While past studies confirm that the magnitude and the covariance structure of gradient noise are critical for regularization, it remains unclear whether or not the class of noise distributions is important. In this work we provide negative results by showing that noises in classes different from the SGD noise can also effectively regularize gradient descent. Our finding is based on a novel observation on the structure of the SGD noise: it is the multiplication of the gradient matrix and a sampling noise that arises from the mini-batch sampling procedure. Moreover, the sampling noises unify two kinds of gradient regularizing noises that belong to the Gaussian class: the one using (scaled) Fisher as covariance and the one using the gradient covariance of SGD as covariance. Finally, thanks to the flexibility of choosing noise class, an algorithm is proposed to perform noisy gradient descent that generalizes well, the variant of which even benefits large batch SGD training without hurting generalization.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07405v2
PDF https://arxiv.org/pdf/1906.07405v2.pdf
PWC https://paperswithcode.com/paper/the-multiplicative-noise-in-stochastic
Repo
Framework

Efficient Matrix Profile Computation Using Different Distance Functions

Title Efficient Matrix Profile Computation Using Different Distance Functions
Authors Reza Akbarinia, Bertrand Cloez
Abstract Matrix profile has been recently proposed as a promising technique to the problem of all-pairs-similarity search on time series. Efficient algorithms have been proposed for computing it, e.g., STAMP, STOMP and SCRIMP++. All these algorithms use the z-normalized Euclidean distance to measure the distance between subsequences. However, as we observed, for some datasets other Euclidean measurements are more useful for knowledge discovery from time series. In this paper, we propose efficient algorithms for computing matrix profile for a general class of Euclidean distances. We first propose a simple but efficient algorithm called AAMP for computing matrix profile with the “pure” (non-normalized) Euclidean distance. Then, we extend our algorithm for the p-norm distance. We also propose an algorithm, called ACAMP, that uses the same principle as AAMP, but for the case of z-normalized Euclidean distance. We implemented our algorithms, and evaluated their performance through experimentation. The experiments show excellent performance results. For example, they show that AAMP is very efficient for computing matrix profile for non-normalized Euclidean distances. The results also show that the ACAMP algorithm is significantly faster than SCRIMP++ (the state of the art matrix profile algorithm) for the case of z-normalized Euclidean distance.
Tasks Time Series
Published 2019-01-17
URL http://arxiv.org/abs/1901.05708v1
PDF http://arxiv.org/pdf/1901.05708v1.pdf
PWC https://paperswithcode.com/paper/efficient-matrix-profile-computation-using
Repo
Framework

SEdroid: A Robust Android Malware Detector using Selective Ensemble Learning

Title SEdroid: A Robust Android Malware Detector using Selective Ensemble Learning
Authors Ji Wang, Qi Jing, Jianbo Gao
Abstract For the dramatic increase of Android malware and low efficiency of manual check process, deep learning methods started to be an auxiliary means for Android malware detection these years. However, these models are highly dependent on the quality of datasets, and perform unsatisfactory results when the quality of training data is not good enough. In the real world, the quality of datasets without manually check cannot be guaranteed, even Google Play may contain malicious applications, which will cause the trained model failure. To address the challenge, we propose a robust Android malware detection approach based on selective ensemble learning, trying to provide an effective solution not that limited to the quality of datasets. The proposed model utilizes genetic algorithm to help find the best combination of the component learners and improve robustness of the model. Our results show that the proposed approach achieves a more robust performance than other approaches in the same area.
Tasks Android Malware Detection, Malware Detection
Published 2019-09-06
URL https://arxiv.org/abs/1909.03837v1
PDF https://arxiv.org/pdf/1909.03837v1.pdf
PWC https://paperswithcode.com/paper/sedroid-a-robust-android-malware-detector
Repo
Framework

On Defending Against Label Flipping Attacks on Malware Detection Systems

Title On Defending Against Label Flipping Attacks on Malware Detection Systems
Authors Rahim Taheri, Reza Javidan, Mohammad Shojafar, Zahra Pooranian, Ali Miri, Mauro Conti
Abstract Label manipulation attacks are a subclass of data poisoning attacks in adversarial machine learning used against different applications, such as malware detection. These types of attacks represent a serious threat to detection systems in environments having high noise rate or uncertainty, such as complex networks and Internet of Thing (IoT). Recent work in the literature has suggested using the $K$-Nearest Neighboring (KNN) algorithm to defend against such attacks. However, such an approach can suffer from low to wrong detection accuracy. In this paper, we design an architecture to tackle the Android malware detection problem in IoT systems. We develop an attack mechanism based on Silhouette clustering method, modified for mobile Android platforms. We proposed two Convolutional Neural Network (CNN)-type deep learning algorithms against this \emph{Silhouette Clustering-based Label Flipping Attack (SCLFA)}. We show the effectiveness of these two defense algorithms - \emph{Label-based Semi-supervised Defense (LSD)} and \emph{clustering-based Semi-supervised Defense (CSD)} - in correcting labels being attacked. We evaluate the performance of the proposed algorithms by varying the various machine learning parameters on three Android datasets: Drebin, Contagio, and Genome and three types of features: API, intent, and permission. Our evaluation shows that using random forest feature selection and varying ratios of features can result in an improvement of up to 19% accuracy when compared with the state-of-the-art method in the literature.
Tasks Android Malware Detection, data poisoning, Feature Selection, Malware Detection
Published 2019-08-13
URL https://arxiv.org/abs/1908.04473v2
PDF https://arxiv.org/pdf/1908.04473v2.pdf
PWC https://paperswithcode.com/paper/on-defending-against-label-flipping-attacks
Repo
Framework

Similarity-based Android Malware Detection Using Hamming Distance of Static Binary Features

Title Similarity-based Android Malware Detection Using Hamming Distance of Static Binary Features
Authors Rahim Taheri, Meysam Ghahramani, Reza Javidan, Mohammad Shojafar, Zahra Pooranian, Mauro Conti
Abstract In this paper, we develop four malware detection methods using Hamming distance to find similarity between samples which are first nearest neighbors (FNN), all nearest neighbors (ANN), weighted all nearest neighbors (WANN), and k-medoid based nearest neighbors (KMNN). In our proposed methods, we can trigger the alarm if we detect an Android app is malicious. Hence, our solutions help us to avoid the spread of detected malware on a broader scale. We provide a detailed description of the proposed detection methods and related algorithms. We include an extensive analysis to asses the suitability of our proposed similarity-based detection methods. In this way, we perform our experiments on three datasets, including benign and malware Android apps like Drebin, Contagio, and Genome. Thus, to corroborate the actual effectiveness of our classifier, we carry out performance comparisons with some state-of-the-art classification and malware detection algorithms, namely Mixed and Separated solutions, the program dissimilarity measure based on entropy (PDME) and the FalDroid algorithms. We test our experiments in a different type of features: API, intent, and permission features on these three datasets. The results confirm that accuracy rates of proposed algorithms are more than 90% and in some cases (i.e., considering API features) are more than 99%, and are comparable with existing state-of-the-art solutions.
Tasks Android Malware Detection, Malware Detection
Published 2019-08-13
URL https://arxiv.org/abs/1908.05759v2
PDF https://arxiv.org/pdf/1908.05759v2.pdf
PWC https://paperswithcode.com/paper/similarity-based-android-malware-detection
Repo
Framework

Network Embedding: on Compression and Learning

Title Network Embedding: on Compression and Learning
Authors Esra Akbas, Mehmet Aktas
Abstract Recently, network embedding that encodes structural information of graphs into a vector space has become popular for network analysis. Although recent methods show promising performance for various applications, the huge sizes of graphs may hinder a direct application of existing network embedding method to them. This paper presents NECL, a novel efficient Network Embedding method with two goals. 1) Is there an ideal Compression of a network? 2) Will the compression of a network significantly boost the representation Learning of the network? For the first problem, we propose a neighborhood similarity based graph compression method that compresses the input graph to get a smaller graph without losing any/much information about the global structure of the graph and the local proximity of the vertices in the graph. For the second problem, we use the compressed graph for network embedding instead of the original large graph to bring down the embedding cost. NECL is a general meta-strategy to improve the efficiency of all of the state-of-the-art graph embedding algorithms based on random walks, including DeepWalk and Node2vec, without losing their effectiveness. Extensive experiments on large real-world networks validate the efficiency of NECL method that yields an average improvement of 23 - 57% embedding time, including walking and learning time without decreasing classification accuracy as evaluated on single and multi-label classification tasks on real-world graphs such as DBLP, BlogCatalog, Cora and Wiki.
Tasks Graph Embedding, Multi-Label Classification, Network Embedding, Representation Learning
Published 2019-07-05
URL https://arxiv.org/abs/1907.02811v2
PDF https://arxiv.org/pdf/1907.02811v2.pdf
PWC https://paperswithcode.com/paper/network-embedding-on-compression-and-learning
Repo
Framework

Geometric Insights into the Convergence of Nonlinear TD Learning

Title Geometric Insights into the Convergence of Nonlinear TD Learning
Authors David Brandfonbrener, Joan Bruna
Abstract While there are convergence guarantees for temporal difference (TD) learning when using linear function approximators, the situation for nonlinear models is far less understood, and divergent examples are known. Here we take a first step towards extending theoretical convergence guarantees to TD learning with nonlinear function approximation. More precisely, we consider the expected learning dynamics of the TD(0) algorithm for value estimation. As the step-size converges to zero, these dynamics are defined by a nonlinear ODE which depends on the geometry of the space of function approximators, the structure of the underlying Markov chain, and their interaction. We find a set of function approximators that includes ReLU networks and has geometry amenable to TD learning regardless of environment, so that the solution performs about as well as linear TD in the worst case. Then, we show how environments that are more reversible induce dynamics that are better for TD learning and prove global convergence to the true value function for well-conditioned function approximators. Finally, we generalize a divergent counterexample to a family of divergent problems to demonstrate how the interaction between approximator and environment can go wrong and to motivate the assumptions needed to prove convergence.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12185v4
PDF https://arxiv.org/pdf/1905.12185v4.pdf
PWC https://paperswithcode.com/paper/on-the-expected-dynamics-of-nonlinear-td
Repo
Framework

Re-evaluating ADEM: A Deeper Look at Scoring Dialogue Responses

Title Re-evaluating ADEM: A Deeper Look at Scoring Dialogue Responses
Authors Ananya B. Sai, Mithun Das Gupta, Mitesh M. Khapra, Mukundhan Srinivasan
Abstract Automatically evaluating the quality of dialogue responses for unstructured domains is a challenging problem. ADEM(Lowe et al. 2017) formulated the automatic evaluation of dialogue systems as a learning problem and showed that such a model was able to predict responses which correlate significantly with human judgements, both at utterance and system level. Their system was shown to have beaten word-overlap metrics such as BLEU with large margins. We start with the question of whether an adversary can game the ADEM model. We design a battery of targeted attacks at the neural network based ADEM evaluation system and show that automatic evaluation of dialogue systems still has a long way to go. ADEM can get confused with a variation as simple as reversing the word order in the text! We report experiments on several such adversarial scenarios that draw out counterintuitive scores on the dialogue responses. We take a systematic look at the scoring function proposed by ADEM and connect it to linear system theory to predict the shortcomings evident in the system. We also devise an attack that can fool such a system to rate a response generation system as favorable. Finally, we allude to future research directions of using the adversarial attacks to design a truly automated dialogue evaluation system.
Tasks
Published 2019-02-23
URL http://arxiv.org/abs/1902.08832v1
PDF http://arxiv.org/pdf/1902.08832v1.pdf
PWC https://paperswithcode.com/paper/re-evaluating-adem-a-deeper-look-at-scoring
Repo
Framework

A time resolved clustering method revealing longterm structures and their short-term internal dynamics

Title A time resolved clustering method revealing longterm structures and their short-term internal dynamics
Authors Jonas I. Liechti, Sebastian Bonhoeffer
Abstract The last decades have not only been characterized by an explosive growth of data, but also an increasing appreciation of data as a valuable resource. Their value comes with the ability to extract meaningful patterns that are of economic, societal or scientific relevance. A particular challenge is the identification of patterns across time, including those that might only become apparent when the temporal dimension is taken into account. Here, we present a novel method that aims to achieve this by detecting dynamic clusters, i.e. structural elements that can be present over prolonged durations. It is based on an adaptive identification of majority overlaps between groups at different time points and accommodates the transient decompositions in otherwise persistent dynamic clusters. Our method enables the detection of persistent structural elements with internal dynamics and can be applied to any classifiable data, ranging from social contact networks to arbitrary sets of time stamped feature vectors. It represents a unique tool to study systems with non-trivial temporal dynamics and has a broad applicability to scientific, societal and economic data.
Tasks
Published 2019-12-09
URL https://arxiv.org/abs/1912.04261v2
PDF https://arxiv.org/pdf/1912.04261v2.pdf
PWC https://paperswithcode.com/paper/a-time-resolved-clustering-method-revealing
Repo
Framework

SAFE: Scale Aware Feature Encoder for Scene Text Recognition

Title SAFE: Scale Aware Feature Encoder for Scene Text Recognition
Authors Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong
Abstract In this paper, we address the problem of having characters with different scales in scene text recognition. We propose a novel scale aware feature encoder (SAFE) that is designed specifically for encoding characters with different scales. SAFE is composed of a multi-scale convolutional encoder and a scale attention network. The multi-scale convolutional encoder targets at extracting character features under multiple scales, and the scale attention network is responsible for selecting features from the most relevant scale(s). SAFE has two main advantages over the traditional single-CNN encoder used in current state-of-the-art text recognizers. First, it explicitly tackles the scale problem by extracting scale-invariant features from the characters. This allows the recognizer to put more effort in handling other challenges in scene text recognition, like those caused by view distortion and poor image quality. Second, it can transfer the learning of feature encoding across different character scales. This is particularly important when the training set has a very unbalanced distribution of character scales, as training with such a dataset will make the encoder biased towards extracting features from the predominant scale. To evaluate the effectiveness of SAFE, we design a simple text recognizer named scale-spatial attention network (S-SAN) that employs SAFE as its feature encoder, and carry out experiments on six public benchmarks. Experimental results demonstrate that S-SAN can achieve state-of-the-art (or, in some cases, extremely competitive) performance without any post-processing.
Tasks Scene Text Recognition
Published 2019-01-17
URL http://arxiv.org/abs/1901.05770v1
PDF http://arxiv.org/pdf/1901.05770v1.pdf
PWC https://paperswithcode.com/paper/safe-scale-aware-feature-encoder-for-scene
Repo
Framework
comments powered by Disqus