Paper Group ANR 515
Using Randomness to Improve Robustness of Machine-Learning Models Against Evasion Attacks. Diagnostics in Semantic Segmentation. Robbins-Monro conditions for persistent exploration learning strategies. Explaining the Unique Nature of Individual Gait Patterns with Deep Learning. Multiple Measurement Vectors Problem: A Decoupling Property and its App …
Using Randomness to Improve Robustness of Machine-Learning Models Against Evasion Attacks
Title | Using Randomness to Improve Robustness of Machine-Learning Models Against Evasion Attacks |
Authors | Fan Yang, Zhiyuan Chen |
Abstract | Machine learning models have been widely used in security applications such as intrusion detection, spam filtering, and virus or malware detection. However, it is well known that adversaries are always trying to adapt their attacks to evade detection. For example, an email spammer may guess what features spam detection models use and modify or remove those features to avoid detection. There has been some work on making machine learning models more robust to such attacks. However, one simple but promising approach called randomization is underexplored. This paper proposes a novel randomization-based approach to improve the robustness of machine learning models against evasion attacks. The proposed approach incorporates randomization into both model training time and model application time (meaning when the model is used to detect attacks). We also apply this approach to random forest, an existing ML method which already has some degree of randomness. Experiments on intrusion detection and spam filtering data show that our approach further improves the robustness of the random forest method. We also discuss how this approach can be applied to other ML models. |
Tasks | Intrusion Detection, Malware Detection |
Published | 2018-08-10 |
URL | http://arxiv.org/abs/1808.03601v1 |
http://arxiv.org/pdf/1808.03601v1.pdf | |
PWC | https://paperswithcode.com/paper/using-randomness-to-improve-robustness-of |
Repo | |
Framework | |
Diagnostics in Semantic Segmentation
Title | Diagnostics in Semantic Segmentation |
Authors | Vladimir Nekrasov, Chunhua Shen, Ian Reid |
Abstract | Over the past years, the computer vision community has contributed to enormous progress in semantic image segmentation, a per-pixel classification task that is crucial for dense scene understanding and is rapidly becoming vital in many real-world applications, including driverless cars and medical imaging. Most recent models are now reaching previously unthinkable numbers (e.g., 89% mean IoU on PASCAL VOC, 83% on CityScapes). While intersection-over-union and a range of other metrics provide the general picture of model performance, in this paper we aim to extend them to other characteristics that are meaningful and important for applications, answering such questions as ‘how accurate is the model segmentation on small objects in the general scene?’ or ‘what are the sources of uncertainty that cause the model to make an erroneous prediction?’. Besides establishing a methodology that covers the performance of a single model from different perspectives, we also showcase several extensions that can be worth pursuing in order to further improve current results in semantic segmentation. |
Tasks | Scene Understanding, Semantic Segmentation |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10328v1 |
http://arxiv.org/pdf/1809.10328v1.pdf | |
PWC | https://paperswithcode.com/paper/diagnostics-in-semantic-segmentation |
Repo | |
Framework | |
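The abstract above treats mean intersection-over-union (mean IoU) as the headline segmentation metric. A minimal sketch of how that metric is commonly computed follows, assuming flattened per-pixel label lists; the helper name `mean_iou` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both label lists
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six "pixels", three classes (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. A single average like this hides per-class and per-object-size behaviour, which is the kind of gap the paper's diagnostics aim to fill.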
Robbins-Monro conditions for persistent exploration learning strategies
Title | Robbins-Monro conditions for persistent exploration learning strategies |
Authors | Dmitry B. Rokhlin |
Abstract | We formulate simple assumptions, implying the Robbins-Monro conditions for the $Q$-learning algorithm with the local learning rate, depending on the number of visits of a particular state-action pair (local clock) and the number of iterations (global clock). It is assumed that the Markov decision process is communicating and the learning policy ensures the persistent exploration. The restrictions are imposed on the functional dependence of the learning rate on the local and global clocks. The result partially confirms the conjecture of Bradtke (1994). |
Tasks | Q-Learning |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00245v3 |
http://arxiv.org/pdf/1808.00245v3.pdf | |
PWC | https://paperswithcode.com/paper/robbins-monro-conditions-for-persistent |
Repo | |
Framework | |
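For reference, the Robbins-Monro conditions mentioned in the abstract above are usually stated as follows for tabular $Q$-learning. This is the standard textbook form, not a formula copied from the paper; the symbols $\alpha_t(s,a)$, $r_t$ and $\gamma$ are notation assumed here, with $\alpha_t(s,a) = 0$ whenever the pair $(s,a)$ is not updated at step $t$.

```latex
% Tabular Q-learning update with learning rate \alpha_t(s, a):
Q_{t+1}(s_t, a_t) = \bigl(1 - \alpha_t(s_t, a_t)\bigr)\, Q_t(s_t, a_t)
  + \alpha_t(s_t, a_t) \Bigl( r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') \Bigr)

% Robbins-Monro conditions, required for every state-action pair (s, a):
\sum_{t=0}^{\infty} \alpha_t(s, a) = \infty,
\qquad
\sum_{t=0}^{\infty} \alpha_t^2(s, a) < \infty
```

The first sum requires that every pair keeps being updated with enough total step size, which is where persistent exploration enters; the second keeps the accumulated noise bounded.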
Explaining the Unique Nature of Individual Gait Patterns with Deep Learning
Title | Explaining the Unique Nature of Individual Gait Patterns with Deep Learning |
Authors | Fabian Horst, Sebastian Lapuschkin, Wojciech Samek, Klaus-Robert Müller, Wolfgang I. Schöllhorn |
Abstract | Machine learning (ML) techniques such as (deep) artificial neural networks (DNN) are very successfully solving a plethora of tasks and provide new predictive models for complex physical, chemical, biological and social systems. However, in most cases this comes with the disadvantage of acting as a black box, rarely providing information about what made them arrive at a particular prediction. This black box aspect of ML techniques can be problematic especially in medical diagnoses, so far hampering a clinical acceptance. The present paper studies the uniqueness of individual gait patterns in clinical biomechanics using DNNs. By attributing portions of the model predictions back to the input variables (ground reaction forces and full-body joint angles), the Layer-Wise Relevance Propagation (LRP) technique reliably demonstrates which variables at what time windows of the gait cycle are most relevant for the characterisation of gait patterns from a certain individual. By measuring the time-resolved contribution of each input variable to the prediction of ML techniques such as DNNs, our method describes the first general framework that makes it possible to understand and interpret non-linear ML methods in (biomechanical) gait analysis and thereby supplies a powerful tool for analysis, diagnosis and treatment of human gait. |
Tasks | |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04308v2 |
http://arxiv.org/pdf/1808.04308v2.pdf | |
PWC | https://paperswithcode.com/paper/explaining-the-unique-nature-of-individual |
Repo | |
Framework | |
Multiple Measurement Vectors Problem: A Decoupling Property and its Applications
Title | Multiple Measurement Vectors Problem: A Decoupling Property and its Applications |
Authors | Saeid Haghighatshoar, Giuseppe Caire |
Abstract | We study a Compressed Sensing (CS) problem known as the Multiple Measurement Vectors (MMV) problem, which arises in joint estimation of multiple signal realizations when the signal samples have a common (joint) sparse support over a fixed known dictionary. Although there is a vast literature on the analysis of MMV, it is not yet fully known how the number of signal samples and their statistical correlations affect the performance of the joint estimation in MMV. Moreover, in many instances of MMV the underlying sparsifying dictionary may not be precisely known, and it is still an open problem to quantify how the dictionary mismatch may affect the estimation performance. In this paper, we focus on $\ell_{2,1}$-norm regularized least squares ($\ell_{2,1}$-LS) as a well-known and widely-used MMV algorithm in the literature. We prove an interesting decoupling property for $\ell_{2,1}$-LS, where we show that it can be decomposed into two phases: i) use all the signal samples to estimate the signal covariance matrix (coupled phase), ii) plug in the resulting covariance estimate as the true covariance matrix into the Minimum Mean Squared Error (MMSE) estimator to reconstruct each signal sample individually (decoupled phase). As a consequence of this decomposition, we are able to provide further insights on the performance of $\ell_{2,1}$-LS for MMV. In particular, we address how the signal correlations and dictionary mismatch affect its performance. Moreover, we show that by using the decoupling property one can obtain a variety of MMV algorithms with performances even better than that of $\ell_{2,1}$-LS. We also provide numerical simulations to validate our theoretical results. |
Tasks | |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1810.13421v2 |
http://arxiv.org/pdf/1810.13421v2.pdf | |
PWC | https://paperswithcode.com/paper/multiple-measurement-vectors-problem-a |
Repo | |
Framework | |
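As a reading aid for the abstract above, $\ell_{2,1}$-norm regularized least squares is commonly written in the form below. The symbols ($Y$, $A$, $X$, $\lambda$) and the factor $\tfrac{1}{2}$ are assumptions made here for illustration; the paper's own normalization and scaling of the regularizer may differ.

```latex
% Y: m x T matrix of measurements (one column per signal sample)
% A: m x n dictionary, X: n x T coefficient matrix with few nonzero rows
\hat{X} = \arg\min_{X} \; \tfrac{1}{2} \lVert Y - A X \rVert_F^2
          + \lambda \lVert X \rVert_{2,1},
\qquad
\lVert X \rVert_{2,1} = \sum_{i=1}^{n} \lVert X_{i,:} \rVert_2
```

The $\ell_{2,1}$ norm sums the Euclidean norms of the rows of $X$, so it drives entire rows to zero and thereby encodes the common sparse support shared by all signal samples.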
M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments
Title | M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments |
Authors | Tim Laibacher, Tillman Weyde, Sepehr Jalali |
Abstract | In this paper, we present a novel neural network architecture for retinal vessel segmentation that improves over the state of the art on two benchmark datasets, is the first to run in real time on high resolution images, and its small memory and processing requirements make it deployable in mobile and embedded systems. The M2U-Net has a new encoder-decoder architecture that is inspired by the U-Net. It adds pretrained components of MobileNetV2 in the encoder part and novel contractive bottleneck blocks in the decoder part that, combined with bilinear upsampling, drastically reduce the parameter count to 0.55M compared to 31.03M in the original U-Net. We have evaluated its performance against a wide body of previously published results on three public datasets. On two of them, the M2U-Net achieves new state-of-the-art performance by a considerable margin. When implemented on a GPU, our method is the first to achieve real-time inference speeds on high-resolution fundus images. We also implemented our proposed network on an ARM-based embedded system where it segments images in between 0.6 and 15 sec, depending on the resolution. Thus, the M2U-Net enables a number of applications of retinal vessel structure extraction, such as early diagnosis of eye diseases, retinal biometric authentication systems, and robot assisted microsurgery. |
Tasks | Retinal Vessel Segmentation |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07738v3 |
http://arxiv.org/pdf/1811.07738v3.pdf | |
PWC | https://paperswithcode.com/paper/m2u-net-effective-and-efficient-retinal |
Repo | |
Framework | |
Deep Learning for Malicious Flow Detection
Title | Deep Learning for Malicious Flow Detection |
Authors | Yun-Chun Chen, Yu-Jhe Li, Aragorn Tseng, Tsungnan Lin |
Abstract | Cyber security has become a hot issue in recent years, and identifying potential malware is a challenging task. To tackle this challenge, we adopt deep learning approaches and perform flow detection on real data. However, real data often suffers from an imbalanced data distribution, which leads to a gradient dilution issue. When training a neural network, this problem not only results in a bias toward the majority class but also in an inability to learn from the minority classes. In this paper, we propose an end-to-end trainable Tree-Shaped Deep Neural Network (TSDNN) which classifies the data in a layer-wise manner. To better learn from the minority classes, we propose a Quantity Dependent Backpropagation (QDBP) algorithm which incorporates the knowledge of the disparity between classes. We evaluate our method on an imbalanced data set. Experimental results demonstrate that our approach outperforms the state-of-the-art methods and that the proposed method is able to overcome the difficulty of imbalanced learning. We also conduct a partial flow experiment which shows the feasibility of real-time detection and a zero-shot learning experiment which supports the generalization capability of deep learning in cyber security. |
Tasks | Zero-Shot Learning |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03358v1 |
http://arxiv.org/pdf/1802.03358v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-malicious-flow-detection |
Repo | |
Framework | |
Super Characters: A Conversion from Sentiment Classification to Image Classification
Title | Super Characters: A Conversion from Sentiment Classification to Image Classification |
Authors | Baohua Sun, Lin Yang, Patrick Dong, Wenhan Zhang, Jason Dong, Charles Young |
Abstract | We propose a method named Super Characters for sentiment classification. This method converts the sentiment classification problem into an image classification problem by projecting texts into images and then applying CNN models for classification. Text features are extracted automatically from the generated Super Characters images, hence there is no need for any explicit step of embedding the words or characters into numerical vector representations. Experimental results show that the Super Characters method consistently outperforms other methods for sentiment classification and topic classification tasks on ten large social media datasets of millions of contents in four different languages, including Chinese, Japanese, Korean and English. |
Tasks | Image Classification, Sentiment Analysis |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.07653v1 |
http://arxiv.org/pdf/1810.07653v1.pdf | |
PWC | https://paperswithcode.com/paper/super-characters-a-conversion-from-sentiment |
Repo | |
Framework | |
Scalable Magnetic Field SLAM in 3D Using Gaussian Process Maps
Title | Scalable Magnetic Field SLAM in 3D Using Gaussian Process Maps |
Authors | Manon Kok, Arno Solin |
Abstract | We present a method for scalable and fully 3D magnetic field simultaneous localisation and mapping (SLAM) using local anomalies in the magnetic field as a source of position information. These anomalies are due to the presence of ferromagnetic material in the structure of buildings and in objects such as furniture. We represent the magnetic field map using a Gaussian process model and take well-known physical properties of the magnetic field into account. We build local maps using three-dimensional hexagonal block tiling. To make our approach computationally tractable we use reduced-rank Gaussian process regression in combination with a Rao-Blackwellised particle filter. We show that it is possible to obtain accurate position and orientation estimates using measurements from a smartphone, and that our approach provides a scalable magnetic field SLAM algorithm in terms of both computational complexity and map storage. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.01926v2 |
http://arxiv.org/pdf/1804.01926v2.pdf | |
PWC | https://paperswithcode.com/paper/scalable-magnetic-field-slam-in-3d-using |
Repo | |
Framework | |
Coverage-Based Designs Improve Sample Mining and Hyper-Parameter Optimization
Title | Coverage-Based Designs Improve Sample Mining and Hyper-Parameter Optimization |
Authors | Gowtham Muniraju, Bhavya Kailkhura, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Cihan Tepedelenlioglu, Andreas Spanias |
Abstract | Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning, and sequential optimization has become a popular solution. Typical examples include data summarization, sample mining for predictive modeling and hyper-parameter optimization. Existing solutions attempt to adaptively trade off between global exploration and local exploitation, wherein the initial exploratory sample is critical to their success. While discrepancy-based samples have become the de facto approach for exploration, results from computer graphics suggest that coverage-based designs, e.g. Poisson disk sampling, can be a superior alternative. In order to successfully adapt coverage-based sample designs, which were originally developed for 2-d image analysis, to ML applications, we propose fundamental advances by constructing a parameterized family of designs with provably improved coverage characteristics, and by developing algorithms for effective sample synthesis. Using experiments in sample mining and hyper-parameter optimization for supervised learning, we show that our approach consistently outperforms existing exploratory sampling methods in both blind exploration and sequential search with Bayesian optimization. |
Tasks | Data Summarization |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01712v3 |
http://arxiv.org/pdf/1809.01712v3.pdf | |
PWC | https://paperswithcode.com/paper/controlled-random-search-improves-hyper |
Repo | |
Framework | |
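The abstract above names Poisson disk sampling as an example of a coverage-based design. The sketch below shows only the basic idea, using naive dart throwing in the unit hypercube: a candidate is kept only if it lies at least `r` away from every accepted point. It is not the parameterized family of designs or the synthesis algorithm proposed in the paper, and the function name and parameters are assumptions chosen for illustration.

```python
import math
import random


def poisson_disk_sample(n_target, r, dim=2, max_tries=10000, seed=0):
    """Naive dart throwing: accept up to n_target points in [0, 1]^dim
    such that every pair of accepted points is at least r apart."""
    rng = random.Random(seed)
    points = []
    for _ in range(max_tries):
        if len(points) >= n_target:
            break
        candidate = [rng.random() for _ in range(dim)]
        # Reject the candidate if it falls inside any existing point's disk.
        if all(math.dist(candidate, p) >= r for p in points):
            points.append(candidate)
    return points


# Hypothetical use: 20 well-separated hyper-parameter settings in 2-d.
samples = poisson_disk_sample(n_target=20, r=0.1)
print(len(samples))
```

Dart throwing becomes slow as the space fills up or the dimension grows, which is one reason dedicated synthesis algorithms are of interest.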
Towards Governing Agent’s Efficacy: Action-Conditional $β$-VAE for Deep Transparent Reinforcement Learning
Title | Towards Governing Agent’s Efficacy: Action-Conditional $β$-VAE for Deep Transparent Reinforcement Learning |
Authors | John Yang, Gyujeong Lee, Minsung Hyun, Simyung Chang, Nojun Kwak |
Abstract | We tackle the blackbox issue of deep neural networks in the settings of reinforcement learning (RL) where neural agents learn towards maximizing reward gains in an uncontrollable way. Such a learning approach is risky when the interacting environment includes an expanse of state space because it is then almost impossible to foresee all unwanted outcomes and penalize them with negative rewards beforehand. Unlike reverse analysis of learned neural features from previous works, our proposed method tackles the blackbox issue by encouraging an RL policy network to learn interpretable latent features through an implementation of a disentangled representation learning method. Toward this end, our method allows an RL agent to understand self-efficacy by distinguishing its influences from uncontrollable environmental factors, which closely resembles the way humans understand their scenes. Our experimental results show that the learned latent factors not only are interpretable, but also enable modeling the distribution of the entire visited state space with a specific action condition. Our experiments indicate that this characteristic of the proposed structure can lead to ex post facto governance for desired behaviors of RL agents. |
Tasks | Representation Learning |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04350v1 |
http://arxiv.org/pdf/1811.04350v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-governing-agents-efficacy-action |
Repo | |
Framework | |
Coarse-to-fine Seam Estimation for Image Stitching
Title | Coarse-to-fine Seam Estimation for Image Stitching |
Authors | Tianli Liao, Jing Chen, Yifang Xu |
Abstract | Seam-cutting and seam-driven techniques have been proven effective for handling imperfect image series in image stitching. Generally, a seam-driven method uses seam-cutting to find the best seam from one or finitely many alignment hypotheses based on a predefined seam quality metric. However, the quality metrics in most methods are defined to measure the average performance of the pixels on the seam without considering the relevance and variance among them. As a result, the seam with the minimal measure may not be optimal in human perception (perception-inconsistent). In this paper, we propose a novel coarse-to-fine seam estimation method which applies the evaluation in a different way. For pixels on the seam, we develop a patch-point evaluation algorithm that concentrates more on their correlation and variation. The evaluations are then used to recalculate the difference map of the overlapping region and reestimate a stitching seam. This evaluation-reestimation procedure iterates until the current seam changes negligibly compared with the previous seams. Experiments show that our proposed method can finally find a nearly perception-consistent seam after several iterations, which outperforms the conventional seam-cutting and other seam-driven methods. |
Tasks | Image Stitching |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09578v1 |
http://arxiv.org/pdf/1805.09578v1.pdf | |
PWC | https://paperswithcode.com/paper/coarse-to-fine-seam-estimation-for-image |
Repo | |
Framework | |
Resource Mention Extraction for MOOC Discussion Forums
Title | Resource Mention Extraction for MOOC Discussion Forums |
Authors | Ya-Hui An, Liangming Pan, Min-Yen Kan, Qiang Dong, Yan Fu |
Abstract | In discussions hosted on discussion forums for MOOCs, references to online learning resources are often of central importance. They contextualize the discussion, anchoring the discussion participants’ presentation of the issues and their understanding. However, they are usually mentioned in free text, without appropriate hyperlinking to their associated resource. Automated learning resource mention hyperlinking and categorization will facilitate discussion and searching within MOOC forums, and also benefit the contextualization of such resources across disparate views. We propose the novel problem of learning resource mention identification in MOOC forums. As this is a novel task with no publicly available data, we first contribute a large-scale labeled dataset, dubbed the Forum Resource Mention (FoRM) dataset, to facilitate our current research and future research on this task. We then formulate this task as a sequence tagging problem and investigate solution architectures to address the problem. Importantly, we identify two major challenges that hinder the application of sequence tagging models to the task: (1) the diversity of resource mention expression, and (2) long-range contextual dependencies. We address these challenges by incorporating character-level and thread context information into an LSTM-CRF model. First, we incorporate a character encoder to address the out-of-vocabulary problem caused by the diversity of mention expressions. Second, to address the context dependency challenge, we encode thread contexts using an RNN-based context encoder, and apply the attention mechanism to selectively leverage useful context information during sequence tagging. Experiments on FoRM show that the proposed method improves the baseline deep sequence tagging models notably, significantly bettering performance on instances that exemplify the two challenges. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08853v1 |
http://arxiv.org/pdf/1811.08853v1.pdf | |
PWC | https://paperswithcode.com/paper/resource-mention-extraction-for-mooc |
Repo | |
Framework | |
Color Constancy by GANs: An Experimental Survey
Title | Color Constancy by GANs: An Experimental Survey |
Authors | Partha Das, Anil S. Baslamisli, Yang Liu, Sezer Karaoglu, Theo Gevers |
Abstract | In this paper, we formulate the color constancy task as an image-to-image translation problem using GANs. By conducting a large set of experiments on different datasets, an experimental survey is provided on the use of different types of GANs to solve for color constancy i.e. CC-GANs (Color Constancy GANs). Based on the experimental review, recommendations are given for the design of CC-GAN architectures based on different criteria, circumstances and datasets. |
Tasks | Color Constancy, Image-to-Image Translation |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03085v1 |
http://arxiv.org/pdf/1812.03085v1.pdf | |
PWC | https://paperswithcode.com/paper/color-constancy-by-gans-an-experimental |
Repo | |
Framework | |
Stochastic Zeroth-order Optimization via Variance Reduction method
Title | Stochastic Zeroth-order Optimization via Variance Reduction method |
Authors | Liu Liu, Minhao Cheng, Cho-Jui Hsieh, Dacheng Tao |
Abstract | Derivative-free optimization has become an important technique used in machine learning for optimizing black-box models. To conduct updates without explicitly computing the gradient, most current approaches iteratively sample a random search direction from a Gaussian distribution and compute the estimated gradient along that direction. However, due to the variance in the search direction, the convergence rates and query complexities of existing methods suffer from a factor of $d$, where $d$ is the problem dimension. In this paper, we introduce a novel Stochastic Zeroth-order method with Variance Reduction under Gaussian smoothing (SZVR-G) and establish the complexity for optimizing non-convex problems. With variance reduction on both sample space and search space, the complexity of our algorithm is sublinear in $d$ and is strictly better than current approaches, in both smooth and non-smooth cases. Moreover, we extend the proposed method to the mini-batch version. Our experimental results demonstrate the superior performance of the proposed method over existing derivative-free optimization techniques. Furthermore, we successfully apply our method to conduct a universal black-box attack on deep neural networks and present some interesting results. |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.11811v3 |
http://arxiv.org/pdf/1805.11811v3.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-zeroth-order-optimization-via |
Repo | |
Framework | |
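The abstract above describes the baseline that most zeroth-order methods share: sample a Gaussian search direction and estimate the gradient along it from function values only. A minimal sketch of that plain Gaussian-smoothing estimator follows. It is not the SZVR-G algorithm and includes no variance reduction; the function name `zo_gradient`, the smoothing radius `mu` and the sample count are assumptions chosen for illustration.

```python
import random


def zo_gradient(f, x, mu=1e-4, num_samples=5000, seed=0):
    """Estimate grad f(x) from function values only, averaging
    ((f(x + mu * u) - f(x)) / mu) * u over Gaussian directions u."""
    rng = random.Random(seed)
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random search direction
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        scale = (f(x_pert) - fx) / mu  # finite difference along u
        for i in range(d):
            grad[i] += scale * u[i] / num_samples
    return grad


# Toy black-box objective (hypothetical): f(x) = sum of squares, gradient 2x.
def sphere(x):
    return sum(xi * xi for xi in x)


print(zo_gradient(sphere, [1.0, -2.0]))  # roughly [2, -4], up to sampling noise
```

Each single-direction estimate is very noisy, and the noise grows with the dimension $d$. That dependence on $d$ is the cost the abstract says variance reduction is meant to lower.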