Paper Group ANR 332
A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction. Clipping free attacks against artificial neural networks. Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection. Adaptive Monte-Carlo Optimization. Cardiopulmonary resuscitation quality parameters from motion capture …
A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction
Title | A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction |
Authors | Xiaodan Zhang, Xinbo Gao, Wen Lu, Lihuo He |
Abstract | Learning fine-grained details is a key issue in image aesthetic assessment. Most of the previous methods extract the fine-grained details via a random cropping strategy, which may undermine the integrity of semantic information. Extensive studies show that humans perceive fine-grained details with a mixture of foveal vision and peripheral vision. The fovea has the highest visual acuity and is responsible for seeing the details. Peripheral vision is used for perceiving the broad spatial scene and selecting the attended regions for the fovea. Inspired by these observations, we propose a Gated Peripheral-Foveal Convolutional Neural Network (GPF-CNN). It is a dedicated double-subnet neural network, i.e., a peripheral subnet and a foveal subnet. The former aims to mimic the functions of peripheral vision to encode the holistic information and provide the attended regions. The latter aims to extract fine-grained features from these key regions. Considering that peripheral vision and foveal vision play different roles in processing different visual stimuli, we further employ a gated information fusion (GIF) network to weight their contributions. The weights are determined by fully connected layers followed by a sigmoid function. We conduct comprehensive experiments on the standard AVA and Photo.net datasets for unified aesthetic prediction tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction. The experimental results demonstrate the effectiveness of the proposed method. |
Tasks | |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.07989v2 |
PDF | https://arxiv.org/pdf/1812.07989v2.pdf |
PWC | https://paperswithcode.com/paper/a-gated-peripheral-foveal-convolutional |
Repo | |
Framework | |
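To make the gated information fusion (GIF) step concrete, below is a minimal NumPy sketch of sigmoid-gated fusion of the peripheral and foveal feature streams. The feature dimension, the element-wise form of the gate, and the single fully connected layer are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical feature dimension; the paper's actual sizes may differ.
d = 128
f_peripheral = rng.standard_normal(d)   # holistic features from the peripheral subnet
f_foveal     = rng.standard_normal(d)   # fine-grained features from the foveal subnet

# Gated information fusion: a fully connected layer on the concatenated
# features followed by a sigmoid produces per-dimension gate weights.
W = rng.standard_normal((d, 2 * d)) * 0.01
b = np.zeros(d)
gate = sigmoid(W @ np.concatenate([f_peripheral, f_foveal]) + b)

# Weighted combination of the two streams (element-wise gating assumed here).
fused = gate * f_peripheral + (1.0 - gate) * f_foveal
print(fused.shape)  # (128,)
```

The sigmoid keeps each gate value in (0, 1), so the fusion is always a convex combination of the two streams, which is what lets the network shift weight between holistic and fine-grained cues per input.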
Clipping free attacks against artificial neural networks
Title | Clipping free attacks against artificial neural networks |
Authors | Boussad Addad, Jerome Kodjabachian, Christophe Meyer |
Abstract | In recent years, a remarkable breakthrough has been made in the AI domain thanks to artificial deep neural networks, which have achieved great success in many machine learning tasks in computer vision, natural language processing, speech recognition, malware detection and so on. However, they are highly vulnerable to easily crafted adversarial examples. Many investigations have pointed out this fact, and different approaches have been proposed to generate attacks while adding a limited perturbation to the original data. The most robust method known so far is the so-called C&W attack [1]. Nonetheless, a countermeasure known as feature squeezing, coupled with ensemble defense, showed that most of these attacks can be destroyed [6]. In this paper, we present a new method we call Centered Initial Attack (CIA) whose advantage is twofold: first, it ensures by construction that the maximum perturbation is smaller than a threshold fixed beforehand, without the clipping process that degrades the quality of attacks. Second, it is robust against recently introduced defenses such as feature squeezing, JPEG encoding and even against a voting ensemble of defenses. While its application is not limited to images, we illustrate this using five of the current best classifiers on the ImageNet dataset, two of which are adversarially retrained on purpose to be robust against attacks. With a fixed maximum perturbation of only 1.5% on any pixel, around 80% of targeted attacks fool the voting ensemble defense, and nearly 100% when the perturbation is only 6%. While this shows how difficult it is to defend against CIA attacks, the last section of the paper gives some guidelines to limit their impact. |
Tasks | Malware Detection, Speech Recognition |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09468v2 |
PDF | http://arxiv.org/pdf/1803.09468v2.pdf |
PWC | https://paperswithcode.com/paper/clipping-free-attacks-against-artificial |
Repo | |
Framework | |
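The abstract's key claim, a perturbation that stays below the budget by construction rather than by clipping the finished attack, can be illustrated with a tanh reparameterization plus a one-time centering of the starting point. This is only a sketch of that general idea under assumed details; the paper's exact change of variables and optimization loop may differ, and the 1.5% budget is the only figure taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.uniform(0.0, 1.0, size=(8, 8))   # toy image with pixel values in [0, 1]
eps = 0.015                               # maximum perturbation per pixel (1.5%)

# Center the starting point once so that x_c +/- eps stays inside [0, 1];
# after this, no clipping of the attack itself is ever required.
x_c = np.clip(x, eps, 1.0 - eps)

# Free optimization variable (an attacker would optimize this with gradients
# of the target model's loss; here it is just random for illustration).
w = rng.standard_normal(x.shape)

# The perturbation eps * tanh(w) is bounded by eps by construction.
x_adv = x_c + eps * np.tanh(w)

assert np.all(np.abs(x_adv - x_c) <= eps)
assert np.all((x_adv >= 0.0) & (x_adv <= 1.0))
```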
Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection
Title | Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection |
Authors | Andy Brown, Aaron Tuor, Brian Hutchinson, Nicole Nichols |
Abstract | Deep learning has recently demonstrated state-of-the-art performance on key tasks related to the maintenance of computer systems, such as intrusion detection, denial-of-service attack detection, hardware and software system failures, and malware detection. In these contexts, model interpretability is vital for administrators and analysts to trust and act on the automated analysis of machine learning models. Deep learning methods have been criticized as black-box oracles which allow limited insight into decision factors. In this work we seek to “bridge the gap” between the impressive performance of deep learning models and the need for interpretable model introspection. To this end we present recurrent neural network (RNN) language models augmented with attention for anomaly detection in system logs. Our methods are generally applicable to any computer system and logging source. By incorporating attention variants into our RNN language models we create opportunities for model introspection and analysis without sacrificing state-of-the-art performance. We demonstrate model performance and illustrate model interpretability on an intrusion detection task using the Los Alamos National Laboratory (LANL) cyber security dataset, reporting upwards of 0.99 area under the receiver operating characteristic curve despite being trained only on a single day’s worth of data. |
Tasks | Anomaly Detection, Intrusion Detection, Malware Detection |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.04967v1 |
PDF | http://arxiv.org/pdf/1803.04967v1.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-network-attention-mechanisms |
Repo | |
Framework | |
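As a rough illustration of attention over the hidden states of an RNN language model for log lines, the sketch below computes dot-product attention weights (which double as an introspection signal) and an anomaly score from the negative log-likelihood of the observed next token. The hidden states, vocabulary size, token id, and this particular attention variant are stand-ins; the paper compares several attention mechanisms.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hidden states of an RNN language model over one log line (T steps, d dims);
# random stand-ins here instead of real RNN outputs.
T, d, vocab = 12, 64, 500
H = rng.standard_normal((T, d))

# Dot-product attention over the hidden states (one simple variant).
w_att = rng.standard_normal(d)
alpha = softmax(H @ w_att)          # attention weights, also usable for introspection
context = alpha @ H                 # weighted summary of the hidden states

# Next-token distribution from the attended context; the negative
# log-likelihood of the observed token serves as the anomaly score.
W_out = rng.standard_normal((vocab, d)) * 0.01
probs = softmax(W_out @ context)
observed_token = 42                 # hypothetical next token id from the log line
anomaly_score = -np.log(probs[observed_token])
print(alpha.round(3), anomaly_score)
```

Inspecting `alpha` shows which positions in the log line the model relied on, which is the introspection opportunity the abstract refers to.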
Adaptive Monte-Carlo Optimization
Title | Adaptive Monte-Carlo Optimization |
Authors | Vivek Bagaria, Govinda M. Kamath, David N. Tse |
Abstract | The celebrated Monte Carlo method estimates a quantity that is expensive to compute by random sampling. We propose adaptive Monte Carlo optimization: a general framework for discrete optimization of an expensive-to-compute function by adaptive random sampling. Applications of this framework have already appeared in machine learning but are tied to their specific contexts and developed in isolation. We take a unified view and show that the framework has broad applicability by applying it to several common machine learning problems: $k$-nearest neighbors, hierarchical clustering and maximum mutual information feature selection. On real data we show that this framework allows us to develop algorithms that confer a gain of one to two orders of magnitude over exact computation. We also characterize the performance gain theoretically under regularity assumptions on the data that we verify in real-world data. The code is available at https://github.com/govinda-kamath/combinatorial_MAB. |
Tasks | Feature Selection |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08321v2 |
PDF | http://arxiv.org/pdf/1805.08321v2.pdf |
PWC | https://paperswithcode.com/paper/adaptive-monte-carlo-optimization |
Repo | |
Framework | |
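The framework can be illustrated with a bandit-style nearest-neighbour search: each candidate point is an arm, pulling an arm samples one random coordinate of the squared distance, and a confidence-bound rule concentrates samples on promising candidates. This is a simplified sketch of adaptive random sampling, not the paper's algorithm or its confidence intervals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Find the nearest neighbour of a query among n high-dimensional points
# without computing any full distance exactly.
n, d = 50, 10_000
X = rng.standard_normal((n, d))
query = rng.standard_normal(d)

counts = np.ones(n)
idx0 = rng.integers(d, size=n)                      # one initial coordinate per candidate
means = (X[np.arange(n), idx0] - query[idx0]) ** 2  # running per-coordinate estimates

budget = 20_000
for t in range(budget):
    # lower confidence bound on the (scaled) distance: we minimize, so explore
    # candidates whose estimate might still be small
    lcb = means - np.sqrt(2.0 * np.log(budget) / counts)
    a = int(np.argmin(lcb))                         # most promising candidate
    j = rng.integers(d)                             # sample one random coordinate
    sample = (X[a, j] - query[j]) ** 2
    counts[a] += 1
    means[a] += (sample - means[a]) / counts[a]     # update running mean

print("estimated NN:", int(np.argmin(means)))
print("exact NN:    ", int(np.argmin(((X - query) ** 2).sum(axis=1))))
```

On this toy instance the adaptive search touches roughly 20k coordinates instead of the 500k an exact scan would need, which is the kind of gain the abstract describes.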
Cardiopulmonary resuscitation quality parameters from motion capture data using Differential Evolution fitting of sinusoids
Title | Cardiopulmonary resuscitation quality parameters from motion capture data using Differential Evolution fitting of sinusoids |
Authors | Christian Lins, Daniel Eckhoff, Andreas Klausen, Sandra Hellmers, Andreas Hein, Sebastian Fudickar |
Abstract | Cardiopulmonary resuscitation (CPR) is, alongside electrical defibrillation, the most crucial countermeasure for sudden cardiac arrest, which affects thousands of individuals every year. In this paper, we present a novel approach in which sinusoid models are dynamically fitted to skeletal motion data from an RGB-D (Kinect) sensor using the Differential Evolution (DE) optimization algorithm, deriving frequency and depth parameters for cardiopulmonary resuscitation training. It is intended to be part of a robust and easy-to-use feedback system for CPR training, allowing its use for unsupervised training. The accuracy of this DE-based approach is evaluated against data from 28 participants recorded by a state-of-the-art training mannequin. We optimized the DE algorithm hyperparameters and showed that, with these optimized parameters, the CPR frequency is recognized with a median error of $\pm 2.9$ compressions per minute compared to the reference training mannequin. |
Tasks | Motion Capture |
Published | 2018-06-26 |
URL | https://arxiv.org/abs/1806.10115v4 |
PDF | https://arxiv.org/pdf/1806.10115v4.pdf |
PWC | https://paperswithcode.com/paper/cardiopulmonary-resuscitation-quality |
Repo | |
Framework | |
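A small sketch of the core fitting step using SciPy's differential_evolution: a sinusoid is fitted to a synthetic chest-marker trajectory, and the compression rate and depth are read off the fitted parameters. The sampling rate, noise level, and parameter bounds are assumptions for illustration, not the tuned hyperparameters reported in the paper.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)

# Synthetic chest-marker trajectory: ~110 compressions/min, ~5 cm depth,
# sampled at an assumed 30 Hz for 5 seconds.
t = np.linspace(0.0, 5.0, 150)
freq_true, depth_true = 110.0 / 60.0, 0.05
y = (depth_true / 2.0) * np.sin(2 * np.pi * freq_true * t) \
    + 0.01 * rng.standard_normal(t.size)

def residual(params):
    # mean squared error between the sinusoid model and the observed trajectory
    amp, freq, phase, offset = params
    model = amp * np.sin(2 * np.pi * freq * t + phase) + offset
    return np.mean((y - model) ** 2)

# Differential Evolution searches the bounded parameter space; bounds are
# plausible guesses (amplitude in metres, frequency in Hz).
bounds = [(0.0, 0.1), (0.5, 3.5), (-np.pi, np.pi), (-0.05, 0.05)]
result = differential_evolution(residual, bounds, seed=0)
amp, freq, phase, offset = result.x
print(f"rate  ~ {freq * 60:.1f} compressions/min")
print(f"depth ~ {2 * amp * 100:.1f} cm")
```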
Automatic Electrodes Detection during simultaneous EEG/fMRI acquisition
Title | Automatic Electrodes Detection during simultaneous EEG/fMRI acquisition |
Authors | Mathis Fleury, Pierre Maurel, Marsel Mano, Elise Bannier, Christian Barillot |
Abstract | Simultaneous EEG/fMRI acquisition makes it possible to measure brain activity at high spatial and temporal resolution. The localisation of EEG sources depends on several parameters, including the position of the electrodes on the scalp. The position of the electrodes during MR acquisition is obtained using a UTE sequence, which allows their visualisation. Electrode retrieval consists of obtaining the volume in which the electrodes are located by applying a sphere detection algorithm. We detect around 90% of electrodes for each subject, and our UTE-based electrode detection showed an average position error of 3.7 mm across all subjects. |
Tasks | EEG |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06139v1 |
PDF | http://arxiv.org/pdf/1809.06139v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-electrodes-detection-during |
Repo | |
Framework | |
Local Wealth Redistribution Promotes Cooperation in Multiagent Systems
Title | Local Wealth Redistribution Promotes Cooperation in Multiagent Systems |
Authors | Flávio L. Pinheiro, Fernando P. Santos |
Abstract | Designing mechanisms that leverage cooperation between agents has been a long-standing goal in Multiagent Systems. The task is especially challenging when agents are selfish, lack common goals and face social dilemmas, i.e., situations in which individual interest conflicts with social welfare. Past works explored mechanisms that explain cooperation in biological and social systems, providing important clues for the aim of designing cooperative artificial societies. In particular, several works show that cooperation is able to emerge when specific network structures underlie agents’ interactions. Notwithstanding, social dilemmas in which defection is highly tempting still pose challenges concerning the effective sustainability of cooperation. Here we propose a new redistribution mechanism that can be applied in structured populations of agents. Importantly, we show that, when implemented locally (i.e., agents share a fraction of their wealth surplus with their nearest neighbors), redistribution excels at promoting cooperation under regimes where, before, only defection prevailed. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01730v1 |
PDF | http://arxiv.org/pdf/1802.01730v1.pdf |
PWC | https://paperswithcode.com/paper/local-wealth-redistribution-promotes |
Repo | |
Framework | |
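A toy sketch of local redistribution on a ring network: each agent gives away a fraction of its wealth surplus, split equally among its nearest neighbours, and total wealth is conserved. The surplus threshold, sharing fraction, and network are illustrative assumptions; in the paper the mechanism is embedded in an evolutionary game on structured populations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ring of 10 agents; each agent's "nearest neighbours" are its two ring neighbours.
n = 10
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0

wealth = rng.uniform(0.0, 2.0, size=n)
baseline = 1.0        # hypothetical threshold above which wealth counts as surplus
share_fraction = 0.2  # hypothetical fraction of the surplus that is redistributed

def redistribute(wealth):
    surplus = np.maximum(wealth - baseline, 0.0)
    given = share_fraction * surplus          # what each agent gives away
    degree = A.sum(axis=1)
    # each agent's contribution is split equally among its neighbours
    received = A.T @ (given / degree)
    return wealth - given + received

print(wealth.sum(), redistribute(wealth).sum())  # total wealth is conserved
```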
Explaining Black-box Android Malware Detection
Title | Explaining Black-box Android Malware Detection |
Authors | Marco Melis, Davide Maiorca, Battista Biggio, Giorgio Giacinto, Fabio Roli |
Abstract | Machine-learning models have recently been used for detecting malicious Android applications, reporting impressive performance on benchmark datasets, even when trained only on features statically extracted from the application, such as system calls and permissions. However, recent findings have highlighted the fragility of such in-vitro evaluations with benchmark datasets, showing that very few changes to the content of Android malware may suffice to evade detection. How can we thus trust that a malware detector performing well on benchmark data will continue to do so when deployed in an operating environment? To mitigate this issue, the most popular Android malware detectors use linear, explainable machine-learning models to easily identify the most influential features contributing to each decision. In this work, we generalize this approach to any black-box machine-learning model, by leveraging a gradient-based approach to identify the most influential local features. This enables using nonlinear models to potentially increase accuracy without sacrificing interpretability of decisions. Our approach also highlights the global characteristics learned by the model to discriminate between benign and malware applications. Finally, as shown by our empirical analysis on a popular Android malware detection task, it also helps identify potential vulnerabilities of linear and nonlinear models against adversarial manipulations. |
Tasks | Android Malware Detection, Malware Detection |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03544v2 |
PDF | http://arxiv.org/pdf/1803.03544v2.pdf |
PWC | https://paperswithcode.com/paper/explaining-black-box-android-malware |
Repo | |
Framework | |
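To illustrate gradient-based local feature attribution when only a detector's score is observable, the sketch below estimates the gradient by finite differences and ranks features by gradient times input. The toy logistic detector and binary feature vector are assumptions; the paper's approach computes gradients of the learned decision function itself rather than finite-difference estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Black-box detector: only its decision score is observable.  Here it is a
# toy logistic score over binary features (e.g. permissions / API calls).
d = 20
w_hidden = rng.standard_normal(d)
def detector_score(x):
    return 1.0 / (1.0 + np.exp(-(x @ w_hidden)))

x = rng.integers(0, 2, size=d).astype(float)   # one app's feature vector

# Finite-difference estimate of the local gradient (no access to internals),
# then gradient * input to rank the features driving this decision.
eps = 1e-4
grad = np.array([
    (detector_score(x + eps * np.eye(d)[j])
     - detector_score(x - eps * np.eye(d)[j])) / (2 * eps)
    for j in range(d)
])
relevance = grad * x                            # only features present in the app get credit
top = np.argsort(-np.abs(relevance))[:5]
print("most influential features:", top)
```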
Autoencoding any Data through Kernel Autoencoders
Title | Autoencoding any Data through Kernel Autoencoders |
Authors | Pierre Laforgue, Stephan Clémençon, Florence d’Alché-Buc |
Abstract | This paper investigates a novel algorithmic approach to data representation based on kernel methods. Assuming that the observations lie in a Hilbert space X, the introduced Kernel Autoencoder (KAE) is the composition of mappings from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs) that minimizes the expected reconstruction error. Beyond a first extension of the autoencoding scheme to possibly infinite-dimensional Hilbert spaces, the KAE further makes it possible to autoencode any kind of data by choosing X to be itself an RKHS. A theoretical analysis of the model is carried out, providing a generalization bound and shedding light on its connection with Kernel Principal Component Analysis. The proposed algorithms are then detailed at length: they crucially rely on the form taken by the minimizers, revealed by a dedicated Representer Theorem. Finally, numerical experiments on both simulated data and real labeled graphs (molecules) provide empirical evidence of the KAE’s performance. |
Tasks | |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11028v2 |
PDF | http://arxiv.org/pdf/1805.11028v2.pdf |
PWC | https://paperswithcode.com/paper/autoencoding-any-data-through-kernel |
Repo | |
Framework | |
DNN: A Two-Scale Distributional Tale of Heterogeneous Treatment Effect Inference
Title | DNN: A Two-Scale Distributional Tale of Heterogeneous Treatment Effect Inference |
Authors | Yingying Fan, Jinchi Lv, Jingbo Wang |
Abstract | Heterogeneous treatment effects are the center of gravity in many modern causal inference applications. In this paper, we investigate the estimation and inference of heterogeneous treatment effects with precision in a general nonparametric setting. To this end, we enhance the classical $k$-nearest neighbor method with a simple algorithm, extend it to a distributional setting, and suggest the two-scale distributional nearest neighbors (DNN) estimator with reduced finite-sample bias. Our recipe is first to subsample the data and average the 1-nearest neighbor estimators from each subsample. With appropriately chosen subsampling scale, the resulting DNN estimator is proved to be asymptotically unbiased and normal under mild regularity conditions. We then proceed with combining DNN estimators with different subsampling scales to further reduce bias. Our theoretical results on the advantages of the new two-scale DNN framework are well supported by several Monte Carlo simulations. The newly suggested method is also applied to a real-life data set to study the heterogeneity of treatment effects of smoking on children’s birth weights across mothers’ ages. |
Tasks | Causal Inference |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.08469v1 |
PDF | http://arxiv.org/pdf/1808.08469v1.pdf |
PWC | https://paperswithcode.com/paper/dnn-a-two-scale-distributional-tale-of |
Repo | |
Framework | |
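The subsample-and-average recipe and the two-scale bias cancellation can be sketched as follows. The assumed bias rate s**(-2/d) used to pick the combination weights, the subsampling scales, and the toy regression data are all for illustration; the paper derives the precise asymptotics and weighting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data; mu(x) = sin(x0) is the target conditional mean.
n, d = 2000, 2
X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
x0 = np.array([0.3, -0.2])

def dnn_estimate(s, n_subsamples=500):
    """Distributional nearest neighbours: average the 1-NN label over
    many random subsamples of size s."""
    ests = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=s, replace=False)
        nn = idx[np.argmin(((X[idx] - x0) ** 2).sum(axis=1))]
        ests.append(y[nn])
    return float(np.mean(ests))

# Two-scale combination: assuming the leading bias scales like s**(-2/d),
# choose weights with w1 + w2 = 1 and w1*b1 + w2*b2 = 0 to cancel it.
s1, s2 = 50, 200
b1, b2 = s1 ** (-2 / d), s2 ** (-2 / d)
w2 = b1 / (b1 - b2)
w1 = 1.0 - w2
estimate = w1 * dnn_estimate(s1) + w2 * dnn_estimate(s2)
print(estimate, np.sin(x0[0]))   # two-scale DNN estimate vs. true mean
```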
Facial Landmark Point Localization using Coarse-to-Fine Deep Recurrent Neural Network
Title | Facial Landmark Point Localization using Coarse-to-Fine Deep Recurrent Neural Network |
Authors | Shahar Mahpod, Rig Das, Emanuele Maiorana, Yosi Keller, Patrizio Campisi |
Abstract | The accurate localization of facial landmarks is at the core of face analysis tasks, such as face recognition and facial expression analysis, to name a few. In this work we propose a novel localization approach based on a Deep Learning architecture that utilizes dual cascaded CNN subnetworks of the same length, where each subnetwork in a cascade refines the accuracy of its predecessor. The first set of cascaded subnetworks estimates heatmaps that encode the landmarks’ locations, while the second set of cascaded subnetworks refines the heatmaps-based localization using regression, and also receives as input the output of the corresponding heatmap estimation subnetwork. The proposed scheme is experimentally shown to compare favorably with contemporary state-of-the-art schemes. |
Tasks | Face Recognition, Facial Landmark Detection |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01760v2 |
PDF | http://arxiv.org/pdf/1805.01760v2.pdf |
PWC | https://paperswithcode.com/paper/facial-landmark-point-localization-using |
Repo | |
Framework | |
Densely Dilated Spatial Pooling Convolutional Network using benign loss functions for imbalanced volumetric prostate segmentation
Title | Densely Dilated Spatial Pooling Convolutional Network using benign loss functions for imbalanced volumetric prostate segmentation |
Authors | Qiuhua Liu, Min Fu, Xinqi Gong, Hao Jiang |
Abstract | The high incidence rate of prostate disease creates a need for early detection for diagnosis. As one of the main imaging methods used for prostate cancer detection, Magnetic Resonance Imaging (MRI) presents a wide range of appearance variations and imbalance problems, making automated prostate segmentation fundamental but challenging. Here we propose a novel Densely Dilated Spatial Pooling Convolutional Network (DDSP ConNet) with an encoder-decoder structure. It employs a dense structure to combine dilated convolution and global pooling, thus supplying coarse segmentation results from the encoder and decoder subnets and preserving more contextual information. To obtain richer hierarchical feature maps, residual long connections are further adopted to fuse contextual features. Meanwhile, we adopt the DSC loss and Jaccard loss functions to train our DDSP ConNet. Surprisingly, we found and proved that, in contrast to re-weighted cross entropy, the DSC loss and Jaccard loss have many benign theoretical properties, including symmetry, continuity and differentiability with respect to the network parameters. Extensive experiments on the MICCAI PROMISE12 challenge dataset corroborate the effectiveness of our DDSP ConNet with DSC loss and Jaccard loss. Overall, our method achieves a score of 85.78 on the test dataset, outperforming most other competitors. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10517v2 |
PDF | http://arxiv.org/pdf/1801.10517v2.pdf |
PWC | https://paperswithcode.com/paper/densely-dilated-spatial-pooling-convolutional |
Repo | |
Framework | |
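For reference, here is a minimal NumPy version of the soft Dice (DSC) and Jaccard losses discussed in the abstract. The smoothing constant and the use of probability-valued predictions are common conventions assumed here rather than details from the paper.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-7):
    """1 - DSC on soft predictions (pred in [0,1], target in {0,1})."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def soft_jaccard_loss(pred, target, eps=1e-7):
    """1 - IoU (Jaccard index) on soft predictions."""
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return 1.0 - (inter + eps) / (union + eps)

rng = np.random.default_rng(0)
# Highly imbalanced toy volume: only ~10% foreground voxels.
target = (rng.uniform(size=(16, 16, 16)) > 0.9).astype(float)
pred = np.clip(target + 0.2 * rng.standard_normal(target.shape), 0.0, 1.0)
print(soft_dice_loss(pred, target), soft_jaccard_loss(pred, target))
```

Because both losses are ratios of overlap to total mass, they are insensitive to the large background class, which is why they suit imbalanced volumetric segmentation better than plain cross entropy.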
Kernel embedding of maps for sequential Bayesian inference: The variational mapping particle filter
Title | Kernel embedding of maps for sequential Bayesian inference: The variational mapping particle filter |
Authors | Manuel Pulido, Peter Jan van Leeuwen |
Abstract | In this work, a novel sequential Monte Carlo filter is introduced which aims at efficient sampling of high-dimensional state spaces with a limited number of particles. Particles are pushed forward from the prior to the posterior density using a sequence of mappings that minimizes the Kullback-Leibler divergence between the posterior and the sequence of intermediate densities. The sequence of mappings represents a gradient flow. A key ingredient of the mappings is that they are embedded in a reproducing kernel Hilbert space, which allows for a practical and efficient algorithm. The embedding provides a direct means to calculate the gradient of the Kullback-Leibler divergence leading to quick convergence using well-known gradient-based stochastic optimization algorithms. Evaluation of the method is conducted in the chaotic Lorenz-63 system, the Lorenz-96 system, which is a coarse prototype of atmospheric dynamics, and an epidemic model that describes cholera dynamics. No resampling is required in the mapping particle filter even for long recursive sequences. The number of effective particles remains close to the total number of particles in all the experiments. |
Tasks | Bayesian Inference, Stochastic Optimization |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11380v1 |
PDF | http://arxiv.org/pdf/1805.11380v1.pdf |
PWC | https://paperswithcode.com/paper/kernel-embedding-of-maps-for-sequential |
Repo | |
Framework | |
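The kernel-embedded gradient flow described here is closely related to Stein variational gradient descent; the sketch below moves particles toward a simple Gaussian target with an SVGD-style update in an RBF reproducing kernel Hilbert space. The target density, fixed bandwidth, and step size are illustrative assumptions, and a full filter would interleave such mapping steps with the forecast step of a sequential scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target (posterior) density: standard 2-D Gaussian, so grad log p(x) = -x.
def grad_log_p(x):
    return -x

def rbf_kernel(X, h=0.5):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * h ** 2))
    # gradient of k(x_j, x_i) with respect to x_j
    gradK = -(X[:, None, :] - X[None, :, :]) / h ** 2 * K[:, :, None]
    return K, gradK

# Particles start far from the posterior and flow toward it; the repulsive
# kernel term keeps them spread out, so no resampling is needed.
N = 100
X = rng.normal(3.0, 2.0, size=(N, 2))
step = 0.1
for _ in range(300):
    K, gradK = rbf_kernel(X)
    phi = (K @ grad_log_p(X) + gradK.sum(axis=0)) / N   # kernel-embedded KL gradient
    X = X + step * phi

print(X.mean(axis=0), X.std(axis=0))   # roughly [0, 0] and [1, 1]
```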
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Title | Mini-batch Serialization: CNN Training with Inter-layer Data Reuse |
Authors | Sangkug Lym, Armand Behroozi, Wei Wen, Ge Li, Yongkee Kwon, Mattan Erez |
Abstract | Training convolutional neural networks (CNNs) requires intense computations and high memory bandwidth. We find that bandwidth today is over-provisioned because most memory accesses in CNN training can be eliminated by rearranging computation to better utilize on-chip buffers and avoid traffic resulting from large per-layer memory footprints. We introduce the MBS CNN training approach that significantly reduces memory traffic by partially serializing mini-batch processing across groups of layers. This optimizes reuse within on-chip buffers and balances both intra-layer and inter-layer reuse. We also introduce the WaveCore CNN training accelerator that effectively trains CNNs in the MBS approach with high functional-unit utilization. Combined, WaveCore and MBS reduce DRAM traffic by 75%, improve performance by 53%, and save 26% system energy for modern deep CNN training compared to conventional training mechanisms and accelerators. |
Tasks | |
Published | 2018-09-30 |
URL | https://arxiv.org/abs/1810.00307v4 |
PDF | https://arxiv.org/pdf/1810.00307v4.pdf |
PWC | https://paperswithcode.com/paper/mini-batch-serialization-cnn-training-with |
Repo | |
Framework | |
Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation
Title | Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation |
Authors | Junyang Lin, Shuming Ma, Qi Su, Xu Sun |
Abstract | The attention-based sequence-to-sequence model has proved successful in Neural Machine Translation (NMT). However, attention without consideration of the decoding history, which includes the past information in the decoder and the attention mechanism, often causes much repetition. To address this problem, we propose decoding-history-based Adaptive Control of Attention (ACA) for the NMT model. ACA learns to control the attention by keeping track of the decoding history and the current information with a memory vector, so that the model can take the translated contents and the current information into consideration. Experiments on Chinese-English and English-Vietnamese translation have demonstrated that our model significantly outperforms the strong baselines. The analysis shows that our model is capable of generating translations with less repetition and higher accuracy. The code will be available at https://github.com/lancopku |
Tasks | Machine Translation |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.01812v1 |
PDF | http://arxiv.org/pdf/1802.01812v1.pdf |
PWC | https://paperswithcode.com/paper/decoding-history-based-adaptive-control-of |
Repo | |
Framework | |
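As a very rough sketch of controlling attention with a memory of the decoding history, the snippet below keeps a running memory vector of past attention contexts and uses a sigmoid gate, computed from that memory and the current decoder state, to rescale each new context. The gate parameterization, the moving-average memory update, and all dimensions are assumptions for illustration only; ACA as defined in the paper differs in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

T_src, d = 8, 32
H_src = rng.standard_normal((T_src, d))      # encoder states (stand-ins)
W_g = rng.standard_normal((d, 2 * d)) * 0.1  # gate parameters (hypothetical form)

memory = np.zeros(d)                          # summary of the decoding history
for step in range(5):
    s_t = rng.standard_normal(d)              # current decoder state (stand-in)
    context = softmax(H_src @ s_t) @ H_src    # standard attention context

    # gate computed from the decoding history and the current state, used to
    # rescale the attention context before it enters the decoder
    gate = sigmoid(W_g @ np.concatenate([memory, s_t]))
    controlled_context = gate * context

    # memory vector keeps track of what has already been attended to
    memory = 0.9 * memory + 0.1 * controlled_context
```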