Paper Group ANR 721
BPPSA: Scaling Back-propagation by Parallel Scan Algorithm
Title | BPPSA: Scaling Back-propagation by Parallel Scan Algorithm |
Authors | Shang Wang, Yifan Bai, Gennady Pekhimenko |
Abstract | In an era when the performance of a single compute device plateaus, software must be designed to scale on massively parallel systems for better runtime performance. However, in the context of training deep learning models, the popular back-propagation (BP) algorithm imposes a strong sequential dependency in the process of gradient computation. Under model parallelism, BP takes $\Theta (n)$ steps to complete, which hinders its scalability on parallel systems ($n$ represents the number of compute devices into which a model is partitioned). In this work, in order to improve the scalability of BP, we reformulate BP into a scan operation which is a primitive that performs an in-order aggregation on a sequence of values and returns the partial result at each step. We can then scale such a reformulation of BP on parallel systems by our modified version of the Blelloch scan algorithm which theoretically takes $\Theta (\log n)$ steps. We evaluate our approach on a vanilla Recurrent Neural Network (RNN) training with synthetic datasets and an RNN with Gated Recurrent Units (GRU) training with the IRMAS dataset, and demonstrate up to $2.75\times$ speedup on the overall training time and $108\times$ speedup on the backward pass. We also demonstrate that the retraining of pruned networks can be a practical use case of our method. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.10134v3 |
https://arxiv.org/pdf/1907.10134v3.pdf | |
PWC | https://paperswithcode.com/paper/scaling-back-propagation-by-parallel-scan |
Repo | |
Framework | |
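To make the scan reformulation concrete, below is a minimal sequential sketch (not the authors' implementation): the backward pass of a linear chain is written as an in-order aggregation over transposed Jacobians, which is exactly the scan that BPPSA parallelizes with a modified Blelloch algorithm in $\Theta(\log n)$ steps. Layer count, shapes, and names are illustrative.

```python
import numpy as np

# Toy sketch: back-propagation through a linear chain viewed as a scan.
# Layer i contributes a Jacobian J_i, and the gradient w.r.t. layer i's input
# is the running product J_i^T @ J_{i+1}^T @ ... @ J_n^T @ g_loss, i.e. an
# in-order aggregation (scan) with matrix multiplication as the associative
# binary operator. BPPSA parallelizes exactly this scan.

def backprop_as_scan(jacobians, g_loss):
    """Sequential reference scan: returns the gradient at every layer input."""
    grads = []
    g = g_loss
    for J in reversed(jacobians):      # aggregate from the last layer backwards
        g = J.T @ g                    # one scan step: apply the transposed Jacobian
        grads.append(g)
    return list(reversed(grads))

rng = np.random.default_rng(0)
jacobians = [rng.standard_normal((4, 4)) for _ in range(8)]   # hypothetical toy chain
grads = backprop_as_scan(jacobians, rng.standard_normal(4))
print(len(grads), grads[0].shape)
```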
Combining Representation Learning with Tensor Factorization for Risk Factor Analysis - an application to Epilepsy and Alzheimer’s disease
Title | Combining Representation Learning with Tensor Factorization for Risk Factor Analysis - an application to Epilepsy and Alzheimer’s disease |
Authors | Xiaoqian Jiang, Samden Lhatoo, Guo-Qiang Zhang, Luyao Chen, Yejin Kim |
Abstract | Existing studies consider Alzheimer’s disease (AD) a comorbidity of epilepsy, but also recognize epilepsy to occur more frequently in patients with AD than those without. The goal of this paper is to understand the relationship between epilepsy and AD by studying causal relations among subgroups of epilepsy patients. We develop an approach combining representation learning with tensor factorization to provide an in-depth analysis of the risk factors among epilepsy patients for AD. An epilepsy-AD cohort of ~600,000 patients was extracted from Cerner Health Facts data (50M patients). Our experimental results not only suggested a causal relationship between epilepsy and later onset of AD (p = 1.92e-51), but also identified five epilepsy subgroups with distinct phenotypic patterns leading to AD. While such findings are preliminary, the proposed method combining representation learning with tensor factorization seems to be an effective approach for risk factor analysis. |
Tasks | Representation Learning |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05830v1 |
https://arxiv.org/pdf/1905.05830v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-representation-learning-with-tensor |
Repo | |
Framework | |
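To illustrate the tensor-factorization half of the method, here is a small, self-contained CP (PARAFAC) decomposition by alternating least squares. The cohort tensor, its modes, and the rank below are hypothetical stand-ins, not the paper's actual pipeline or data.

```python
import numpy as np
from functools import reduce

# Minimal CP-ALS sketch: factor a 3-way cohort tensor into per-mode factor
# matrices whose columns can be read as phenotype-like components.

def khatri_rao(mats):
    """Column-wise Khatri-Rao product of a list of factor matrices."""
    n_cols = mats[0].shape[1]
    out = mats[0]
    for m in mats[1:]:
        out = np.einsum('ir,jr->ijr', out, m).reshape(-1, n_cols)
    return out

def cp_als(X, rank, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(X.ndim):
            others = [f for i, f in enumerate(factors) if i != mode]
            unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
            gram = reduce(np.multiply, [f.T @ f for f in others])
            factors[mode] = unfolded @ khatri_rao(others) @ np.linalg.pinv(gram)
    return factors

# Hypothetical small cohort tensor: 100 patients x 20 diagnoses x 10 drugs.
X = np.random.default_rng(1).random((100, 20, 10))
patient_f, dx_f, rx_f = cp_als(X, rank=5)
print(patient_f.shape, dx_f.shape, rx_f.shape)   # (100, 5) (20, 5) (10, 5)
```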
Deep Reinforcement Learning for Foreign Exchange Trading
Title | Deep Reinforcement Learning for Foreign Exchange Trading |
Authors | Chun-Chieh Wang, Yun-Cheng Tsai |
Abstract | Reinforcement learning can interact with the environment and is suitable for applications in decision control systems. Therefore, we used the reinforcement learning method to establish a foreign exchange transaction, avoiding the long-standing problem of unstable trends in deep learning predictions. In the system design, we optimized the Sure-Fire statistical arbitrage policy, set three different actions, encoded the continuous price over a period of time into a heat-map view of the Gramian Angular Field (GAF), and compared the Deep Q Learning (DQN) and Proximal Policy Optimization (PPO) algorithms. To test feasibility, we analyzed three currency pairs, namely EUR/USD, GBP/USD, and AUD/USD. We trained on data in units of four hours from 1 August 2018 to 30 November 2018 and tested model performance using data between 1 December 2018 and 31 December 2018. The test results of the various models indicated that favorable investment performance was achieved as long as the model was able to handle complex and random processes and the state was able to describe the environment, validating the feasibility of reinforcement learning in the development of trading strategies. |
Tasks | Q-Learning |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.08036v1 |
https://arxiv.org/pdf/1908.08036v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-foreign |
Repo | |
Framework | |
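The GAF encoding mentioned in the abstract is straightforward to reproduce; below is an illustrative sketch of the Gramian Angular Summation Field for one price window. The window length and the GASF (vs. GADF) choice are assumptions, not details taken from the paper.

```python
import numpy as np

# Sketch of the Gramian Angular (Summation) Field: turn a 1-D window of prices
# into an n x n image-like input for the RL agent's network.

def gramian_angular_field(series):
    x = np.asarray(series, dtype=float)
    # Rescale to [-1, 1] so that arccos is well defined.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    x = np.clip(x, -1.0, 1.0)
    phi = np.arccos(x)                      # polar-coordinate angle per time step
    # GASF: pairwise cos(phi_i + phi_j), a "heat map" of the window.
    return np.cos(phi[:, None] + phi[None, :])

window = np.random.default_rng(0).random(24)    # e.g. 24 four-hour candles
image = gramian_angular_field(window)
print(image.shape)                              # (24, 24), fed to the policy network
```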
On the Equivalence between Node Embeddings and Structural Graph Representations
Title | On the Equivalence between Node Embeddings and Structural Graph Representations |
Authors | Balasubramaniam Srinivasan, Bruno Ribeiro |
Abstract | This work provides the first unifying theoretical framework for node embeddings and structural graph representations, bridging methods like matrix factorization and graph neural networks. Using invariant theory, we show that the relationship between structural representations and node embeddings is analogous to that of a distribution and its samples. We prove that all tasks that can be performed by node embeddings can also be performed by structural representations and vice-versa. We also show that the concept of transductive and inductive learning is unrelated to node embeddings and graph representations, clearing another source of confusion in the literature. Finally, we introduce new practical guidelines for generating and using node embeddings, which fix significant shortcomings of the standard operating procedures used today. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00452v2 |
https://arxiv.org/pdf/1910.00452v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-equivalence-between-node-embeddings-1 |
Repo | |
Framework | |
Multi-node environment strategy for Parallel Deterministic Multi-Objective Fractal Decomposition
Title | Multi-node environment strategy for Parallel Deterministic Multi-Objective Fractal Decomposition |
Authors | Leo Souquet, Amir Nakib |
Abstract | This paper presents a new implementation of deterministic multiobjective (MO) optimization called the Multiobjective Fractal Decomposition Algorithm (Mo-FDA). The original algorithm was designed for mono-objective large-scale continuous optimization problems. It is based on a divide-and-conquer strategy and a geometric fractal decomposition of the search space using hyperspheres. Then, to deal with MO problems, a scalarization approach is used. In this work, a new approach has been developed in a multi-node environment using containers. The performance of Mo-FDA was compared to state-of-the-art algorithms from the literature on classical benchmarks of multi-objective optimization. |
Tasks | |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.02149v1 |
https://arxiv.org/pdf/1908.02149v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-node-environment-strategy-for-parallel |
Repo | |
Framework | |
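As a concrete illustration of the scalarization step mentioned above, here is a generic weighted-Tchebycheff scalarization sketch; the specific scalarization used by Mo-FDA may differ, and the toy objectives, weights, and reference point are assumptions.

```python
import numpy as np

# Weighted-Tchebycheff scalarization: collapse a vector of objective values
# into one scalar cost so a mono-objective optimizer can handle it.

def tchebycheff(objectives, weights, reference):
    return np.max(weights * np.abs(np.asarray(objectives) - reference))

def f(x):
    # Toy bi-objective problem on a continuous search space.
    return np.array([np.sum(x ** 2), np.sum((x - 2.0) ** 2)])

weights = np.array([0.5, 0.5])
reference = np.zeros(2)            # ideal point assumed known for the sketch
x = np.random.default_rng(0).uniform(-5, 5, size=10)
print(tchebycheff(f(x), weights, reference))
```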
Intrusion Detection for Industrial Control Systems: Evaluation Analysis and Adversarial Attacks
Title | Intrusion Detection for Industrial Control Systems: Evaluation Analysis and Adversarial Attacks |
Authors | Giulio Zizzo, Chris Hankin, Sergio Maffeis, Kevin Jones |
Abstract | Neural networks are increasingly used in security applications for intrusion detection on industrial control systems. In this work we examine two areas that must be considered for their effective use. The first is their vulnerability to adversarial attacks when used in a time series setting. The second is the potential over-estimation of performance arising from data leakage artefacts. To investigate these areas, we implement a long short-term memory (LSTM) based intrusion detection system (IDS) which effectively detects cyber-physical attacks on a water treatment testbed, representing a strong baseline IDS. For investigating adversarial attacks, we model two different white-box attackers. The first attacker is able to manipulate sensor readings on a subset of the Secure Water Treatment (SWaT) system. By creating a stream of adversarial data, the attacker is able to hide the cyber-physical attacks from the IDS. For the cyber-physical attacks which are detected by the IDS, the attacker required on average 2.48 out of 12 total sensors to be compromised for the cyber-physical attacks to be hidden from the IDS. The second attacker model we explore is an $L_{\infty}$ bounded attacker who can send fake readings to the IDS but, to remain imperceptible, limits their perturbations to the smallest $L_{\infty}$ value needed. Additionally, we examine data leakage problems arising from tuning for $F_1$ score on the whole SWaT attack set and propose a method to tune detection parameters that does not utilise any attack data. If attack after-effects are accounted for, then our new parameter tuning method achieved an $F_1$ score of 0.811$\pm$0.0103. |
Tasks | Intrusion Detection, Time Series |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.04278v1 |
https://arxiv.org/pdf/1911.04278v1.pdf | |
PWC | https://paperswithcode.com/paper/intrusion-detection-for-industrial-control |
Repo | |
Framework | |
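For intuition about the second attacker model, here is a generic $L_{\infty}$-bounded perturbation sketch (projected gradient with sign steps). The scorer, window shape, and epsilon are illustrative stand-ins, not the paper's SWaT LSTM IDS.

```python
import torch

# Generic L-infinity bounded attack sketch: perturb the readings sent to the
# IDS by at most eps per value so the anomaly score drops.

def linf_attack(scorer, window, eps, steps=10, lr=0.01):
    """Projected-gradient perturbation constrained to an L-inf ball of radius eps."""
    delta = torch.zeros_like(window, requires_grad=True)
    for _ in range(steps):
        score = scorer(window + delta)          # higher score = "attack detected"
        score.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()     # descend to hide the attack
            delta.clamp_(-eps, eps)             # stay imperceptible (L-inf bound)
            delta.grad.zero_()
    return (window + delta).detach()

# Hypothetical stand-in for the LSTM IDS: mean squared deviation from a setpoint.
scorer = lambda w: ((w - 0.5) ** 2).mean()
window = torch.rand(12, 50)                     # 12 sensors x 50 time steps
adv = linf_attack(scorer, window, eps=0.05)
print((adv - window).abs().max())               # <= eps
```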
The Threat of Adversarial Attacks on Machine Learning in Network Security – A Survey
Title | The Threat of Adversarial Attacks on Machine Learning in Network Security – A Survey |
Authors | Olakunle Ibitoye, Rana Abou-Khamis, Ashraf Matrawy, M. Omair Shafiq |
Abstract | Machine learning models have made many decision support systems faster, more accurate, and more efficient. However, applications of machine learning in network security face a disproportionately higher threat of active adversarial attacks compared to other domains. This is because machine learning applications in network security such as malware detection, intrusion detection, and spam filtering are by themselves adversarial in nature. In what could be considered an arms race between attackers and defenders, adversaries constantly probe machine learning systems with inputs which are explicitly designed to bypass the system and induce a wrong prediction. In this survey, we first provide a taxonomy of machine learning techniques, styles, and algorithms. We then introduce a classification of machine learning in network security applications. Next, we examine various adversarial attacks against machine learning in network security and introduce two classification approaches for adversarial attacks in network security. First, we classify adversarial attacks in network security based on a taxonomy of network security applications. Second, we categorize adversarial attacks in network security into a problem space vs. feature space dimensional classification model. We then analyze the various defenses against adversarial attacks on machine learning-based network security applications. We conclude by introducing an adversarial risk model and evaluating several existing adversarial attacks against machine learning in network security using the risk model. We also identify where each attack classification resides within the adversarial risk model. |
Tasks | Intrusion Detection, Malware Detection |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02621v1 |
https://arxiv.org/pdf/1911.02621v1.pdf | |
PWC | https://paperswithcode.com/paper/the-threat-of-adversarial-attacks-on-machine |
Repo | |
Framework | |
Improved local search for graph edit distance
Title | Improved local search for graph edit distance |
Authors | Nicolas Boria, David B. Blumenthal, Sébastien Bougleux, Luc Brun |
Abstract | The graph edit distance (GED) measures the dissimilarity between two graphs as the minimal cost of a sequence of elementary operations transforming one graph into another. This measure is fundamental in many areas such as structural pattern recognition or classification. However, exactly computing GED is NP-hard. Among different classes of heuristic algorithms that were proposed to compute approximate solutions, local search based algorithms provide the tightest upper bounds for GED. In this paper, we present K-REFINE and RANDPOST. K-REFINE generalizes and improves an existing local search algorithm and performs particularly well on small graphs. RANDPOST is a general warm start framework that stochastically generates promising initial solutions to be used by any local search based GED algorithm. It is particularly efficient on large graphs. An extensive empirical evaluation demonstrates that both K-REFINE and RANDPOST perform excellently in practice. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02929v2 |
https://arxiv.org/pdf/1907.02929v2.pdf | |
PWC | https://paperswithcode.com/paper/improved-local-search-for-graph-edit-distance |
Repo | |
Framework | |
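To give a flavour of refinement-style local search, here is a much-simplified swap-based sketch over a node assignment. Real GED heuristics such as K-REFINE also handle edge costs, insertions, and deletions; the cost matrix and neighbourhood below are illustrative only.

```python
import numpy as np
from itertools import combinations

# Simplified local search over a node-to-node assignment between two toy
# graphs: greedily apply the best pairwise swap until no swap improves cost.

def assignment_cost(cost, mapping):
    return sum(cost[i, j] for i, j in enumerate(mapping))

def refine(cost, mapping):
    mapping = list(mapping)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(range(len(mapping)), 2):
            candidate = mapping.copy()
            candidate[a], candidate[b] = candidate[b], candidate[a]
            if assignment_cost(cost, candidate) < assignment_cost(cost, mapping):
                mapping, improved = candidate, True
    return mapping

rng = np.random.default_rng(0)
cost = rng.random((6, 6))            # node-substitution costs between two toy graphs
initial = list(rng.permutation(6))   # a random warm start (RANDPOST draws these smartly)
print(assignment_cost(cost, refine(cost, initial)) <= assignment_cost(cost, initial))
```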
Investigating Resistance of Deep Learning-based IDS against Adversaries using min-max Optimization
Title | Investigating Resistance of Deep Learning-based IDS against Adversaries using min-max Optimization |
Authors | Rana Abou Khamis, Omair Shafiq, Ashraf Matrawy |
Abstract | With the growth of adversarial attacks against machine learning models, several concerns have emerged about potential vulnerabilities in designing deep neural network-based intrusion detection systems (IDS). In this paper, we study the resilience of deep learning-based intrusion detection systems against adversarial attacks. We apply the min-max (or saddle-point) approach to train intrusion detection systems against adversarial attack samples in the NSW-NB 15 dataset. We use the max approach to generate adversarial samples that achieve maximum loss and attack the deep neural network. On the other side, we utilize the existing min approach [2] [9] as a defense strategy to optimize intrusion detection systems that minimize the loss on the incorporated adversarial samples during adversarial training. We study and measure the effectiveness of the adversarial attack methods as well as the resistance of the adversarially trained models against such attacks. We find that the adversarial attack methods that were designed in binary domains can be used in continuous domains and exhibit different misclassification levels. We finally show that principal component analysis (PCA) based feature reduction can boost the robustness of a deep neural network (DNN) based intrusion detection system (IDS). |
Tasks | Adversarial Attack, Intrusion Detection |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14107v1 |
https://arxiv.org/pdf/1910.14107v1.pdf | |
PWC | https://paperswithcode.com/paper/investigating-resistance-of-deep-learning |
Repo | |
Framework | |
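A generic min-max adversarial-training loop looks like the sketch below: an inner PGD step approximates the max, and an outer optimizer step performs the min. The model, feature size, and epsilon are illustrative, not the paper's IDS or its dataset features.

```python
import torch
import torch.nn as nn

# Min-max (saddle-point) adversarial training sketch: the inner "max" builds
# worst-case perturbations of a batch, the outer "min" updates the detector.

model = nn.Sequential(nn.Linear(42, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def inner_max(x, y, eps=0.1, steps=5, alpha=0.02):
    """PGD: find the perturbation inside the eps-ball that maximizes the loss."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
            delta.grad.zero_()
    return (x + delta).detach()

# One adversarial training step on a hypothetical flow-feature batch.
x = torch.rand(32, 42)
y = torch.randint(0, 2, (32,))
x_adv = inner_max(x, y)                      # inner maximization
opt.zero_grad()                              # clear grads accumulated by the inner loop
loss_fn(model(x_adv), y).backward()          # outer minimization
opt.step()
```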
Learning transport cost from subset correspondence
Title | Learning transport cost from subset correspondence |
Authors | Ruishan Liu, Akshay Balsubramani, James Zou |
Abstract | Learning to align multiple datasets is an important problem with many applications, and it is especially useful when we need to integrate multiple experiments or correct for confounding. Optimal transport (OT) is a principled approach to align datasets, but a key challenge in applying OT is that we need to specify a transport cost function that accurately captures how the two datasets are related. Reliable cost functions are typically not available and practitioners often resort to using hand-crafted or Euclidean cost even if it may not be appropriate. In this work, we investigate how to learn the cost function using a small amount of side information which is often available. The side information we consider captures subset correspondence, i.e. certain subsets of points in the two data sets are known to be related. For example, we may have some images labeled as cars in both datasets; or we may have a common annotated cell type in single-cell data from two batches. We develop an end-to-end optimizer (OT-SI) that differentiates through the Sinkhorn algorithm and effectively learns the suitable cost function from side information. On systematic experiments in images, marriage-matching and single-cell RNA-seq, our method substantially outperforms state-of-the-art benchmarks. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13203v1 |
https://arxiv.org/pdf/1909.13203v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-transport-cost-from-subset-1 |
Repo | |
Framework | |
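The object OT-SI differentiates through is the Sinkhorn iteration; the plain (non-learned) version below shows why that is possible, since every step is smooth in the cost matrix. The cost parameterization and problem sizes are illustrative assumptions.

```python
import numpy as np

# Entropy-regularized optimal transport via Sinkhorn iterations. Making the
# cost matrix a learnable function and back-propagating through these updates
# is the idea behind OT-SI (sketch only, not the paper's code).

def sinkhorn(cost, a, b, reg=0.1, n_iter=200):
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]        # transport plan

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(5, 2)), rng.normal(size=(7, 2))
cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)     # squared-Euclidean stand-in
plan = sinkhorn(cost, np.full(5, 1 / 5), np.full(7, 1 / 7))
print(plan.sum())                                          # ~1, marginals are matched
```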
Image-Guided Depth Upsampling via Hessian and TV Priors
Title | Image-Guided Depth Upsampling via Hessian and TV Priors |
Authors | Alireza Ahrabian, Joao F. C. Mota, Andrew M. Wallace |
Abstract | We propose a method that combines sparse depth (LiDAR) measurements with an intensity image to produce a dense high-resolution depth image. As there are few, but accurate, depth measurements from the scene, our method infers the remaining depth values by incorporating information from the intensity image, namely the magnitudes and directions of the identified edges, and by assuming that the scene is composed mostly of flat surfaces. Such inference is achieved by solving a convex optimisation problem with properly weighted regularisers that are based on the $\ell_1$-norm (specifically, on total variation). We solve the resulting problem with a computationally efficient ADMM-based algorithm. Using the SYNTHIA and KITTI datasets, our experiments show that the proposed method achieves a depth reconstruction performance comparable to or better than other model-based methods. |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14377v1 |
https://arxiv.org/pdf/1910.14377v1.pdf | |
PWC | https://paperswithcode.com/paper/image-guided-depth-upsampling-via-hessian-and |
Repo | |
Framework | |
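A compact way to see the formulation is the small convex program below: fit the few LiDAR samples where they exist and regularize elsewhere with total variation. The paper additionally uses intensity-guided weights, a Hessian prior, and an ADMM solver, all omitted here; the sizes and regularization weight are assumptions, and `cvxpy` is used purely for illustration.

```python
import numpy as np
import cvxpy as cp

# Toy image-guided-free version of the depth-upsampling objective:
# data fidelity on the sparse LiDAR pixels + total-variation smoothness.

rng = np.random.default_rng(0)
h, w = 32, 32
mask = (rng.random((h, w)) < 0.05).astype(float)      # ~5% of pixels have LiDAR depth
sparse_depth = mask * rng.random((h, w))

D = cp.Variable((h, w))
data_term = cp.sum_squares(cp.multiply(mask, D - sparse_depth))
objective = cp.Minimize(data_term + 0.1 * cp.tv(D))   # lambda = 0.1 chosen arbitrarily
cp.Problem(objective).solve()
print(D.value.shape)                                  # dense (32, 32) depth estimate
```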
Robust Visual Tracking via Implicit Low-Rank Constraints and Structural Color Histograms
Title | Robust Visual Tracking via Implicit Low-Rank Constraints and Structural Color Histograms |
Authors | Yi-Xuan Wang, Xiao-Jun Wu, Xue-Feng Zhu |
Abstract | With the guaranteed discrimination and efficiency of the spatial appearance model, Discriminative Correlation Filter (DCF) based tracking methods have achieved outstanding performance recently. However, the construction of an effective temporal appearance model is still challenging because filter degeneration becomes a significant factor that causes tracking failures in the DCF framework. To encourage temporal continuity and to explore the smooth variation of target appearance, we propose to enhance the low-rank structure of the learned filters, which can be realized by constraining the successive filters within an $\ell_2$-norm ball. Moreover, we design a global descriptor, structural color histograms, to provide complementary support to the final response map, improving the stability and robustness of the DCF framework. The experimental results on standard benchmarks demonstrate that our Implicit Low-Rank Constraints and Structural Color Histograms (ILRCSCH) tracker outperforms state-of-the-art methods. |
Tasks | Visual Tracking |
Published | 2019-12-24 |
URL | https://arxiv.org/abs/1912.11343v1 |
https://arxiv.org/pdf/1912.11343v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-visual-tracking-via-implicit-low-rank |
Repo | |
Framework | |
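The temporal constraint can be pictured as keeping each new filter inside an $\ell_2$-norm ball around the previous one. The post-hoc projection below is only an illustration; the tracker itself enforces the constraint inside the DCF objective, and the radius and filter size here are arbitrary.

```python
import numpy as np

# Keep the newly learned correlation filter within an l2-ball centred on the
# previous filter, limiting frame-to-frame drift (illustrative projection only).

def constrain_to_ball(new_filter, prev_filter, radius):
    diff = new_filter - prev_filter
    norm = np.linalg.norm(diff)
    if norm <= radius:
        return new_filter
    return prev_filter + diff * (radius / norm)   # project back onto the ball

prev_f = np.random.default_rng(0).standard_normal((31, 31))
new_f = prev_f + np.random.default_rng(1).standard_normal((31, 31))
constrained = constrain_to_ball(new_f, prev_f, radius=0.5)
print(np.linalg.norm(constrained - prev_f))       # <= 0.5
```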
Comparison of Classification Methods for Very High-Dimensional Data in Sparse Random Projection Representation
Title | Comparison of Classification Methods for Very High-Dimensional Data in Sparse Random Projection Representation |
Authors | Anton Akusok, Emil Eirola |
Abstract | The big data trend has inspired feature-driven learning tasks, which cannot be handled by conventional machine learning models. Unstructured data produces very large binary matrices with millions of columns when converted to vector form. However, such data is often sparse, and hence can be manageable through the use of sparse random projections. This work studies efficient non-iterative and iterative methods suitable for such data, evaluating the results on two representative machine learning tasks with millions of samples and features. An efficient Jaccard kernel is introduced as an alternative to the sparse random projection. Findings indicate that non-iterative methods can find larger, more accurate models than iterative methods in different application scenarios. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08616v1 |
https://arxiv.org/pdf/1912.08616v1.pdf | |
PWC | https://paperswithcode.com/paper/comparison-of-classification-methods-for-very |
Repo | |
Framework | |
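The preprocessing step the paper builds on can be sketched in a few lines: project a huge sparse binary matrix down with a sparse random projection before fitting a conventional classifier. The sizes, density, and number of components below are illustrative; the paper operates at the scale of millions of rows and columns.

```python
import numpy as np
from scipy import sparse
from sklearn.random_projection import SparseRandomProjection

# Reduce a very wide, very sparse binary feature matrix with a sparse random
# projection so standard (non-)iterative learners become tractable.

rng = np.random.default_rng(0)
X = sparse.random(1000, 100_000, density=1e-4, format='csr', random_state=0)
X.data[:] = 1.0                                   # binary "bag of features" matrix

proj = SparseRandomProjection(n_components=256, random_state=0)
X_small = proj.fit_transform(X)                   # classifier-friendly representation
print(X.shape, '->', X_small.shape)
```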
Multilevel Monte Carlo estimation of log marginal likelihood
Title | Multilevel Monte Carlo estimation of log marginal likelihood |
Authors | Takashi Goda, Kei Ishikawa |
Abstract | In this short note we provide an unbiased multilevel Monte Carlo estimator of the log marginal likelihood and discuss its application to variational Bayes. |
Tasks | |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10636v1 |
https://arxiv.org/pdf/1912.10636v1.pdf | |
PWC | https://paperswithcode.com/paper/multilevel-monte-carlo-estimation-of-log |
Repo | |
Framework | |
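For context, estimators of this kind build on the standard MLMC telescoping identity

$$\mathbb{E}[P_L] \;=\; \mathbb{E}[P_0] \;+\; \sum_{\ell=1}^{L} \mathbb{E}\left[P_\ell - P_{\ell-1}\right],$$

where $P_\ell$ is an increasingly accurate (and increasingly expensive) approximation of the quantity of interest, here the log marginal likelihood, and each level difference is estimated with its own, typically much smaller, sample budget. One generic way to instantiate the levels, which may differ from the note's exact unbiased construction, is $P_\ell = \log\big(\tfrac{1}{N_\ell}\sum_{i=1}^{N_\ell} p(y \mid \theta_i)\big)$ with $\theta_i$ drawn from the prior and $N_\ell$ doubling per level.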
Region based Ensemble Learning Network for Fine-grained Classification
Title | Region based Ensemble Learning Network for Fine-grained Classification |
Authors | Weikuang Li, Tian Wang, Chuanyun Wang, Guangcun Shan, Mengyi Zhang, Hichem Snoussi |
Abstract | As an important research topic in computer vision, fine-grained classification, which aims to recognize subordinate-level categories, has attracted significant attention. We propose a novel region based ensemble learning network for fine-grained classification. Our approach contains a detection module and a classification module. The detection module is based on the faster R-CNN framework to locate the semantic regions of the object. The classification module uses an ensemble learning method, which trains a set of sub-classifiers for different semantic regions and combines them to get a stronger classifier. In the evaluation, we implement experiments on the CUB-2011 dataset and the experimental results prove our method is efficient for fine-grained classification. We also extend our approach to remote scene recognition and evaluate it on the NWPU-RESISC45 dataset. |
Tasks | Scene Recognition |
Published | 2019-02-09 |
URL | http://arxiv.org/abs/1902.03377v1 |
http://arxiv.org/pdf/1902.03377v1.pdf | |
PWC | https://paperswithcode.com/paper/region-based-ensemble-learning-network-for |
Repo | |
Framework | |
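The ensemble step can be pictured as fusing per-region sub-classifier scores into one prediction; the sketch below simply averages class probabilities across regions. Detection (the faster R-CNN part) is skipped, and the region count, class count, and fusion rule are illustrative choices rather than the paper's exact design.

```python
import numpy as np

# Fuse per-region sub-classifier logits into a single class prediction by
# averaging their softmax probabilities (illustrative ensemble rule).

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_regions(region_logits):
    probs = softmax(np.stack(region_logits))    # (n_regions, n_classes)
    return probs.mean(axis=0)

rng = np.random.default_rng(0)
region_logits = [rng.standard_normal(200) for _ in range(4)]   # 4 regions, 200 classes
print(int(fuse_regions(region_logits).argmax()))
```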