Paper Group ANR 1092
A short review on Applications of Deep learning for Cyber security
Title | A short review on Applications of Deep learning for Cyber security |
Authors | Mohammed Harun Babu R, Vinayakumar R, Soman KP |
Abstract | Deep learning is an advanced form of traditional machine learning that can extract optimal feature representations from raw input samples. It has been applied to various cyber security use cases such as intrusion detection, malware classification, Android malware detection, spam and phishing detection, and binary analysis. This paper surveys the work on deep learning based solutions for these cyber security use cases. Keywords: Deep learning, intrusion detection, malware detection, Android malware detection, spam & phishing detection, traffic analysis, binary analysis. |
Tasks | Android Malware Detection, Intrusion Detection, Malware Classification, Malware Detection |
Published | 2018-12-15 |
URL | http://arxiv.org/abs/1812.06292v2 |
http://arxiv.org/pdf/1812.06292v2.pdf | |
PWC | https://paperswithcode.com/paper/a-short-review-on-applications-of-deep |
Repo | |
Framework | |
Comparison of Deep Learning Approaches for Multi-Label Chest X-Ray Classification
Title | Comparison of Deep Learning Approaches for Multi-Label Chest X-Ray Classification |
Authors | Ivo M. Baltruschat, Hannes Nickisch, Michael Grass, Tobias Knopp, Axel Saalbach |
Abstract | The increased availability of X-ray image archives (e.g. the ChestX-ray14 dataset from the NIH Clinical Center) has triggered a growing interest in deep learning techniques. To provide better insight into the different approaches, and their applications to chest X-ray classification, we investigate a powerful network architecture in detail: the ResNet-50. Building on prior work in this domain, we consider transfer learning with and without fine-tuning as well as the training of a dedicated X-ray network from scratch. To leverage the high spatial resolution of X-ray data, we also include an extended ResNet-50 architecture, and a network integrating non-image data (patient age, gender and acquisition type) in the classification process. In a concluding experiment, we also investigate multiple ResNet depths (i.e. ResNet-38 and ResNet-101). In a systematic evaluation, using 5-fold re-sampling and a multi-label loss function, we compare the performance of the different approaches for pathology classification by ROC statistics and analyze differences between the classifiers using rank correlation. Overall, we observe a considerable spread in the achieved performance and conclude that the X-ray-specific ResNet-38, integrating non-image data, yields the best overall results. Furthermore, class activation maps are used to understand the classification process, and a detailed analysis of the impact of non-image features is provided. |
Tasks | Transfer Learning |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02315v2 |
http://arxiv.org/pdf/1803.02315v2.pdf | |
PWC | https://paperswithcode.com/paper/comparison-of-deep-learning-approaches-for |
Repo | |
Framework | |
Process Monitoring Using Maximum Sequence Divergence
Title | Process Monitoring Using Maximum Sequence Divergence |
Authors | Yihuang Kang, Vladimir Zadorozhny |
Abstract | Process Monitoring involves tracking a system’s behaviors, evaluating the current state of the system, and discovering interesting events that require immediate actions. In this paper, we consider monitoring temporal system state sequences to help detect the changes of dynamic systems, check the divergence of the system development, and evaluate the significance of the deviation. We begin with discussions of data reduction, symbolic data representation, and anomaly detection in temporal discrete sequences. Time-series representation methods are also discussed and used in this paper to discretize raw data into sequences of system states. Markov Chains and stationary state distributions are continuously generated from temporal sequences to represent snapshots of the system dynamics in different time frames. We use the generalized Jensen-Shannon Divergence as the measure to monitor changes of the stationary symbol probability distributions and evaluate the significance of system deviations. We prove that the proposed approach is able to detect deviations of the systems we monitor and assess the deviation significance in a probabilistic manner. |
Tasks | Anomaly Detection, Time Series |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03387v1 |
http://arxiv.org/pdf/1807.03387v1.pdf | |
PWC | https://paperswithcode.com/paper/process-monitoring-using-maximum-sequence |
Repo | |
Framework | |
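The pipeline described in the abstract above (discretized state sequences, Markov chains, stationary distributions, generalized Jensen-Shannon divergence) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the additive smoothing constant, and the use of uniform weights are assumptions made here, and the divergence is the standard definition, the entropy of the weighted mixture minus the weighted mean of the individual entropies.

```python
import numpy as np


def transition_matrix(seq, n_states, smoothing=1e-6):
    """Row-stochastic transition matrix estimated from a sequence of integer states.

    The small additive smoothing (an assumption of this sketch) keeps every
    row well defined and makes the stationary distribution unique.
    """
    counts = np.full((n_states, n_states), smoothing, dtype=float)
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(vals - 1.0))
    pi = np.abs(np.real(vecs[:, idx]))
    return pi / pi.sum()


def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def generalized_jsd(dists, weights=None):
    """Generalized Jensen-Shannon divergence of several distributions."""
    dists = np.asarray(dists, dtype=float)
    if weights is None:
        weights = np.full(len(dists), 1.0 / len(dists))
    mixture = weights @ dists
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, dists))


# Two time frames of a two-state system with visibly different dynamics.
frame_a = [0, 1] * 50
frame_b = [0, 0, 0, 1] * 25
pi_a = stationary_distribution(transition_matrix(frame_a, 2))
pi_b = stationary_distribution(transition_matrix(frame_b, 2))
divergence = generalized_jsd([pi_a, pi_b])
```

A monitoring loop would recompute the stationary distribution for each new time frame and flag frames whose divergence from a reference frame is large; how that threshold is chosen is left open here.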
Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization
Title | Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization |
Authors | Jiacheng Zhang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun |
Abstract | Although neural machine translation has made significant progress recently, how to integrate multiple overlapping, arbitrary prior knowledge sources remains a challenge. In this work, we propose to use posterior regularization to provide a general framework for integrating prior knowledge into neural machine translation. We represent prior knowledge sources as features in a log-linear model, which guides the learning process of the neural translation model. Experiments on Chinese-English translation show that our approach leads to significant improvements. |
Tasks | Machine Translation |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.01100v1 |
http://arxiv.org/pdf/1811.01100v1.pdf | |
PWC | https://paperswithcode.com/paper/prior-knowledge-integration-for-neural |
Repo | |
Framework | |
Bayesian Optimization Using Monotonicity Information and Its Application in Machine Learning Hyperparameter
Title | Bayesian Optimization Using Monotonicity Information and Its Application in Machine Learning Hyperparameter |
Authors | Wenyi Wang, William J. Welch |
Abstract | We propose an algorithm for a family of optimization problems where the objective can be decomposed as a sum of functions with monotonicity properties. The motivating problem is optimization of hyperparameters of machine learning algorithms, where we argue that the objective, validation error, can be decomposed as monotonic functions of the hyperparameters. Our proposed algorithm adapts Bayesian optimization methods to incorporate the monotonicity constraints. We illustrate the advantages of exploiting monotonicity using illustrative examples and demonstrate the improvements in optimization efficiency for some machine learning hyperparameter tuning applications. |
Tasks | |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03532v2 |
http://arxiv.org/pdf/1802.03532v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-optimization-using-monotonicity |
Repo | |
Framework | |
Faces as Lighting Probes via Unsupervised Deep Highlight Extraction
Title | Faces as Lighting Probes via Unsupervised Deep Highlight Extraction |
Authors | Renjiao Yi, Chenyang Zhu, Ping Tan, Stephen Lin |
Abstract | We present a method for estimating detailed scene illumination using human faces in a single image. In contrast to previous works that estimate lighting in terms of low-order basis functions or distant point lights, our technique estimates illumination at a higher precision in the form of a non-parametric environment map. Based on the observation that faces can exhibit strong highlight reflections from a broad range of lighting directions, we propose a deep neural network for extracting highlights from faces, and then trace these reflections back to the scene to acquire the environment map. Since real training data for highlight extraction is very limited, we introduce an unsupervised scheme for finetuning the network on real images, based on the consistent diffuse chromaticity of a given face seen in multiple real images. In tracing the estimated highlights to the environment, we reduce the blurring effect of skin reflectance on reflected light through a deconvolution determined by prior knowledge on face material properties. Comparisons to previous techniques for highlight extraction and illumination estimation show the state-of-the-art performance of this approach on a variety of indoor and outdoor scenes. |
Tasks | |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06340v2 |
http://arxiv.org/pdf/1803.06340v2.pdf | |
PWC | https://paperswithcode.com/paper/faces-as-lighting-probes-via-unsupervised |
Repo | |
Framework | |
Estimating Learnability in the Sublinear Data Regime
Title | Estimating Learnability in the Sublinear Data Regime |
Authors | Weihao Kong, Gregory Valiant |
Abstract | We consider the problem of estimating how well a model class is capable of fitting a distribution of labeled data. We show that it is often possible to accurately estimate this “learnability” even when given an amount of data that is too small to reliably learn any accurate model. Our first result applies to the setting where the data is drawn from a $d$-dimensional distribution with isotropic covariance (or known covariance), and the label of each datapoint is an arbitrary noisy function of the datapoint. In this setting, we show that with $O(\sqrt{d})$ samples, one can accurately estimate the fraction of the variance of the label that can be explained via the best linear function of the data. In contrast to this sublinear sample size, finding an approximation of the best-fit linear function requires on the order of $d$ samples. Our sublinear sample results and approach also extend to the non-isotropic setting, where the data distribution has an (unknown) arbitrary covariance matrix: we show that, if the label $y$ of point $x$ is a linear function with independent noise, $y = \langle x, \beta \rangle + \text{noise}$ with $\beta$ bounded, the variance of the noise can be estimated to error $\epsilon$ with $O(d^{1-1/\log(1/\epsilon)})$ samples if the covariance matrix has bounded condition number, or $O(d^{1-\sqrt{\epsilon}})$ samples if there are no bounds on the condition number. We also establish that these sample complexities are optimal, to constant factors. Finally, we extend these techniques to the setting of binary classification, where we obtain analogous sample complexities for the problem of estimating the prediction error of the best linear classifier, in a natural model of binary labeled data. We demonstrate the practical viability of our approaches on several real and synthetic datasets. |
Tasks | |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01626v3 |
http://arxiv.org/pdf/1805.01626v3.pdf | |
PWC | https://paperswithcode.com/paper/estimating-learnability-in-the-sublinear-data |
Repo | |
Framework | |
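For the isotropic setting in the abstract above, one way to see why the explained variance can be estimated without fitting a model is the identity $E[y_i y_j \langle x_i, x_j \rangle] = \|E[yx]\|^2$ for independent samples $i \neq j$. With zero-mean data and identity covariance, $E[yx]$ is the best linear predictor $\beta$, so averaging over pairs gives an unbiased estimate of $\|\beta\|^2$. The sketch below assumes exactly that setting (zero-mean features, identity covariance, zero-mean labels); the function names are illustrative and are not taken from the paper.

```python
import numpy as np


def estimate_beta_norm_sq(X, y):
    """Unbiased estimate of ||beta||^2 under identity covariance.

    Averages y_i * y_j * <x_i, x_j> over all ordered pairs i != j, using
    sum_{i != j} <z_i, z_j> = ||sum_i z_i||^2 - sum_i ||z_i||^2 with z_i = y_i * x_i.
    """
    n = len(y)
    z = X * y[:, None]
    total = z.sum(axis=0)
    cross = total @ total - np.einsum("ij,ij->", z, z)
    return float(cross / (n * (n - 1)))


def estimate_explained_fraction(X, y):
    """Estimated fraction of label variance explained by the best linear function.

    Assumes zero-mean labels, so mean(y^2) estimates the label variance.
    """
    fraction = estimate_beta_norm_sq(X, y) / float(np.mean(y ** 2))
    return min(max(fraction, 0.0), 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 1000, 500  # fewer samples than dimensions
    beta = np.full(d, np.sqrt(0.5 / d))  # ||beta||^2 = 0.5
    X = rng.standard_normal((n, d))
    y = X @ beta + np.sqrt(0.5) * rng.standard_normal(n)
    print("estimated explained fraction:", estimate_explained_fraction(X, y))
```

The estimate is noisy when the sample size is close to $\sqrt{d}$, so the printed value should be read as a rough indication rather than a tight figure.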
Head Mounted Pupil Tracking Using Convolutional Neural Network
Title | Head Mounted Pupil Tracking Using Convolutional Neural Network |
Authors | Yinheng Zhu, Wanli Chen, Xun Zhan, Zonglin Guo, Hongjian Shi, Ian G. Harris |
Abstract | Pupil tracking is an important branch of object tracking which requires high precision. We investigate head mounted pupil tracking, which is often more convenient and precise than remote pupil tracking, but also more challenging. When pupil tracking suffers from noise such as bad illumination, detection precision decreases dramatically. Due to the appearance of head mounted recording devices and public benchmark image datasets, head mounted tracking algorithms have become easier to design and evaluate. In this paper, we propose a robust head mounted pupil detection algorithm which uses a Convolutional Neural Network (CNN) to combine different features of the pupil. Here we consider three features of the pupil. First, we use three pupil feature-based algorithms to find the pupil center independently. Second, we use a CNN to evaluate the quality of each result. Finally, we select the best result as output. The experimental results show that our proposed algorithm performs better than the present state of the art. |
Tasks | Object Tracking |
Published | 2018-04-15 |
URL | http://arxiv.org/abs/1805.00311v2 |
http://arxiv.org/pdf/1805.00311v2.pdf | |
PWC | https://paperswithcode.com/paper/head-mounted-pupil-tracking-using |
Repo | |
Framework | |
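The three-step selection scheme in the abstract above (independent candidate detectors, a learned quality score, pick the best) can be sketched in a few lines. Everything here is hypothetical scaffolding: the detector callables stand in for the three feature-based algorithms and `quality` stands in for the CNN, neither of which is specified in the listing.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def select_pupil_center(
    image: object,
    detectors: Sequence[Callable[[object], Point]],
    quality: Callable[[object, Point], float],
) -> Point:
    """Run every detector, score each candidate center, and return the best one.

    `detectors` and `quality` are placeholders for the feature-based
    algorithms and the CNN quality estimator, respectively.
    """
    candidates = [detect(image) for detect in detectors]
    scores = [quality(image, candidate) for candidate in candidates]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index]
```

Keeping the detectors and the scorer as plain callables means any of them can be swapped out without touching the selection logic.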
Autonomous Driving without a Burden: View from Outside with Elevated LiDAR
Title | Autonomous Driving without a Burden: View from Outside with Elevated LiDAR |
Authors | Nalin Jayaweera, Nandana Rajatheva, Matti Latva-aho |
Abstract | The current autonomous driving architecture places a heavy signal processing burden on the graphics processing units (GPUs) in the car. This directly translates into battery drain and lower energy efficiency, crucial factors in electric vehicles. This is due to the high bit rate of the captured video and other sensing inputs, mainly due to the Light Detection and Ranging (LiDAR) sensor at the top of the car, which is an essential feature in autonomous vehicles. LiDAR is needed to obtain a high precision map for the vehicle AI to make relevant decisions. However, this is still a quite restricted view from the car. This is the same even in the case of cars without a LiDAR such as Tesla. The existing LiDARs and cameras have limited horizontal and vertical fields of vision. In all cases it can be argued that precision is lower, given the smaller map generated. This also results in the accumulation of a large amount of data, on the order of several TBs in a day, the storage of which becomes challenging. If we are to reduce the effort for the processing units inside the car, we need to uplink the data to the edge or an appropriately placed cloud. However, the required data rates, on the order of several Gbps, are difficult to meet even with the advent of 5G. Therefore, we propose to have a coordinated set of LiDARs outside at an elevation which can provide an integrated view with a much larger field of vision (FoV) to a centralized decision making body which then sends the required control actions to the vehicles with a lower bit rate in the downlink and with the required latency. The calculations we have based on industry standard equipment from several manufacturers show that this is not just a concept but a feasible system which can be implemented. The proposed system can play a supportive role with existing autonomous vehicle architecture and it is easily applicable in an urban area. |
Tasks | Autonomous Driving, Autonomous Vehicles, Decision Making |
Published | 2018-08-26 |
URL | http://arxiv.org/abs/1808.08617v2 |
http://arxiv.org/pdf/1808.08617v2.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-driving-without-a-burden-view-from |
Repo | |
Framework | |
VoxelAtlasGAN: 3D Left Ventricle Segmentation on Echocardiography with Atlas Guided Generation and Voxel-to-voxel Discrimination
Title | VoxelAtlasGAN: 3D Left Ventricle Segmentation on Echocardiography with Atlas Guided Generation and Voxel-to-voxel Discrimination |
Authors | Suyu Dong, Gongning Luo, Kuanquan Wang, Shaodong Cao, Ashley Mercado, Olga Shmuilovich, Henggui Zhang, Shuo Li |
Abstract | 3D left ventricle (LV) segmentation on echocardiography is very important for the diagnosis and treatment of cardiac disease, not only because echocardiography is a real-time imaging technology that is widespread in clinical application, but also because LV segmentation on 3D echocardiography can provide more full-volume information about the heart than LV segmentation on 2D echocardiography. However, 3D LV segmentation on echocardiography is still an open and challenging task owing to the lower contrast, higher noise and data dimensionality, and limited annotation of 3D echocardiography. In this paper, we propose a novel real-time framework, VoxelAtlasGAN, for 3D LV segmentation on 3D echocardiography. This framework makes three contributions: 1) It is based on voxel-to-voxel conditional generative adversarial nets (cGAN). For the first time, cGAN is used for 3D LV segmentation on echocardiography, and it advantageously fuses substantial 3D spatial context information from 3D echocardiography by self-learning a structured loss; 2) For the first time, it embeds the atlas into an end-to-end optimization framework, which uses the 3D LV atlas as powerful prior knowledge to improve the inference speed and address the lower contrast and limited annotation problems of 3D echocardiography; 3) It combines the traditional discrimination loss with the newly proposed consistent constraint, which further improves the generalization of the proposed framework. VoxelAtlasGAN was validated on 3D echocardiography from 60 subjects and achieved satisfactory segmentation results and high inference speed. The mean surface distance is 1.85 mm, the mean Hausdorff surface distance is 7.26 mm, the mean Dice is 0.953, the correlation of EF is 0.918, and the mean inference time is 0.1 s. These results demonstrate that the proposed method has great potential for clinical application. |
Tasks | |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03619v1 |
http://arxiv.org/pdf/1806.03619v1.pdf | |
PWC | https://paperswithcode.com/paper/voxelatlasgan-3d-left-ventricle-segmentation |
Repo | |
Framework | |
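The Dice score reported in the abstract above is the standard overlap measure between a predicted and a reference segmentation, $2|A \cap B| / (|A| + |B|)$. A minimal sketch for binary 3D volumes follows; the function name and the convention of returning 1.0 when both masks are empty are choices made here, not details from the paper.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(2.0 * intersection / total)
```

A value of 1.0 means the two masks coincide exactly, and 0.0 means they share no voxels.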
Concentric ESN: Assessing the Effect of Modularity in Cycle Reservoirs
Title | Concentric ESN: Assessing the Effect of Modularity in Cycle Reservoirs |
Authors | Davide Bacciu, Andrea Bongiorno |
Abstract | The paper introduces the concentric Echo State Network, an approach to designing reservoir topologies that tries to bridge the gap between deterministically constructed simple cycle models and deep reservoir computing approaches. We show how to modularize the reservoir into simple unidirectional and concentric cycles with pairwise bidirectional jump connections between adjacent loops. We provide a preliminary experimental assessment showing how concentric reservoirs yield superior predictive accuracy and memory capacity with respect to single cycle reservoirs and deep reservoir models. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09244v1 |
http://arxiv.org/pdf/1805.09244v1.pdf | |
PWC | https://paperswithcode.com/paper/concentric-esn-assessing-the-effect-of |
Repo | |
Framework | |
Truth Validation with Evidence
Title | Truth Validation with Evidence |
Authors | Papis Wongchaisuwat, Diego Klabjan |
Abstract | In the modern era, abundant information is easily accessible from various sources; however, only a few of these sources are reliable, as most contain unverified content. We develop a system to validate the truthfulness of a given statement together with underlying evidence. The proposed system provides supporting evidence when the statement is tagged as false. Our work relies on an inference method on a knowledge graph (KG) to identify the truthfulness of statements. In order to extract the evidence of falseness, the proposed algorithm takes into account combined knowledge from the KG and ontologies. The system shows very good results as it provides valid and concise evidence. The quality of the KG plays a role in the performance of the inference method, which explicitly affects the performance of our evidence-extracting algorithm. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05786v1 |
http://arxiv.org/pdf/1802.05786v1.pdf | |
PWC | https://paperswithcode.com/paper/truth-validation-with-evidence |
Repo | |
Framework | |
Continuous Learning in Single-Incremental-Task Scenarios
Title | Continuous Learning in Single-Incremental-Task Scenarios |
Authors | Davide Maltoni, Vincenzo Lomonaco |
Abstract | It was recently shown that architectural, regularization and rehearsal strategies can be used to train deep models sequentially on a number of disjoint tasks without forgetting previously acquired knowledge. However, these strategies are still unsatisfactory if the tasks are not disjoint but constitute a single incremental task (e.g., class-incremental learning). In this paper we point out the differences between multi-task and single-incremental-task scenarios and show that well-known approaches such as LWF, EWC and SI are not ideal for incremental task scenarios. A new approach, denoted as AR1, combining architectural and regularization strategies is then specifically proposed. The overhead of AR1 (in terms of memory and computation) is very small, making it suitable for online learning. When tested on CORe50 and iCIFAR-100, AR1 outperformed existing regularization strategies by a good margin. |
Tasks | |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.08568v3 |
http://arxiv.org/pdf/1806.08568v3.pdf | |
PWC | https://paperswithcode.com/paper/continuous-learning-in-single-incremental |
Repo | |
Framework | |
Conditional Activation for Diverse Neurons in Heterogeneous Networks
Title | Conditional Activation for Diverse Neurons in Heterogeneous Networks |
Authors | Albert Lee, Bonnie Lam, Wenyuan Li, Hochul Lee, Wei-Hao Chen, Meng-Fan Chang, Kang. -L. Wang |
Abstract | In this paper, we propose a new scheme for modelling the diverse behavior of neurons. We introduce the conditional activation, in which a neuron's activation function is dynamically modified by a control signal. We apply this method to recreate the behavior of special neurons existing in the human auditory and visual systems. A heterogeneous multilayered perceptron (MLP) incorporating the developed models demonstrates simultaneous improvement in learning speed and performance across various numbers of hidden units and layers, compared to a homogeneous network composed of the conventional neuron model. For similar performance, the proposed model significantly lowers the memory needed to store network parameters. |
Tasks | |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.05006v1 |
http://arxiv.org/pdf/1803.05006v1.pdf | |
PWC | https://paperswithcode.com/paper/conditional-activation-for-diverse-neurons-in |
Repo | |
Framework | |
Gradient-Free Learning Based on the Kernel and the Range Space
Title | Gradient-Free Learning Based on the Kernel and the Range Space |
Authors | Kar-Ann Toh, Zhiping Lin, Zhengguo Li, Beomseok Oh, Lei Sun |
Abstract | In this article, we show that solving a system of linear equations by manipulating the kernel and the range space is equivalent to solving the problem of least squares error approximation. This lays the groundwork for a gradient-free learning search when the system can be expressed in the form of a linear matrix equation. When the nonlinear activation function is invertible, the learning problem of a fully-connected multilayer feedforward neural network can be easily adapted for this novel learning framework. By a series of kernel and range space manipulations, it turns out that such network learning boils down to solving a set of cross-coupling equations. By having the weights randomly initialized, the equations can be decoupled and the network solution shows relatively good learning capability for real world data sets of small to moderate dimensions. Based on the structural information of the matrix equation, the network representation is found to be dependent on the number of data samples and the output dimension. |
Tasks | |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11581v1 |
http://arxiv.org/pdf/1810.11581v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-free-learning-based-on-the-kernel |
Repo | |
Framework | |
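The core idea in the abstract above, that an invertible activation turns weight learning into a linear least-squares problem, can be illustrated on the simplest possible case. The sketch below is a single-layer simplification chosen for illustration, not the multilayer cross-coupled procedure the paper describes: for a layer $Y = \tanh(XW)$, applying the inverse activation gives the linear system $XW = \operatorname{arctanh}(Y)$, which the Moore-Penrose pseudoinverse solves in the least-squares sense without any gradient steps. The function name and the clipping constant are assumptions made here.

```python
import numpy as np


def fit_single_layer(X, Y, eps=1e-12):
    """Solve tanh(X @ W) = Y for W in the least-squares sense, without gradients.

    Targets are clipped slightly inside (-1, 1) so the inverse activation
    stays finite; the pseudoinverse then solves X @ W = arctanh(Y).
    """
    Z = np.arctanh(np.clip(Y, -1.0 + eps, 1.0 - eps))
    return np.linalg.pinv(X) @ Z


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 3))
    W_true = 0.3 * rng.standard_normal((3, 2))
    Y = np.tanh(X @ W_true)
    W = fit_single_layer(X, Y)
    print("max abs weight error:", np.max(np.abs(W - W_true)))
```

When `X` has full column rank, the pseudoinverse solution coincides with the one returned by `np.linalg.lstsq`, which is the equivalence with least squares error approximation that the abstract refers to.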