Paper Group ANR 332
A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction. Clipping free attacks against artificial neural networks. Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection. Adaptive Monte-Carlo Optimization. Cardiopulmonary resuscitation quality parameters from motion capture …
A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction
Title | A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction |
Authors | Xiaodan Zhang, Xinbo Gao, Wen Lu, Lihuo He |
Abstract | Learning fine-grained details is a key issue in image aesthetic assessment. Most of the previous methods extract the fine-grained details via a random cropping strategy, which may undermine the integrity of semantic information. Extensive studies show that humans perceive fine-grained details with a mixture of foveal vision and peripheral vision. The fovea has the highest visual acuity and is responsible for seeing the details. Peripheral vision is used for perceiving the broad spatial scene and selecting the attended regions for the fovea. Inspired by these observations, we propose a Gated Peripheral-Foveal Convolutional Neural Network (GPF-CNN). It is a dedicated double-subnet neural network, i.e., a peripheral subnet and a foveal subnet. The former aims to mimic the functions of peripheral vision to encode the holistic information and provide the attended regions. The latter aims to extract fine-grained features from these key regions. Considering that peripheral vision and foveal vision play different roles in processing different visual stimuli, we further employ a gated information fusion (GIF) network to weight their contributions. The weights are determined by fully connected layers followed by a sigmoid function. We conduct comprehensive experiments on the standard AVA and Photo.net datasets for unified aesthetic prediction tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction. The experimental results demonstrate the effectiveness of the proposed method. |
Tasks | |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.07989v2 |
PDF | https://arxiv.org/pdf/1812.07989v2.pdf |
PWC | https://paperswithcode.com/paper/a-gated-peripheral-foveal-convolutional |
Repo | |
Framework | |
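To make the gated information fusion (GIF) step concrete, below is a minimal NumPy sketch of sigmoid-gated fusion of the peripheral and foveal feature streams. The feature dimension, the element-wise form of the gate, and the single fully connected layer are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical feature dimension; the paper's actual sizes may differ.
d = 128
f_peripheral = rng.standard_normal(d)   # holistic features from the peripheral subnet
f_foveal     = rng.standard_normal(d)   # fine-grained features from the foveal subnet

# Gated information fusion: a fully connected layer on the concatenated
# features followed by a sigmoid produces per-dimension gate weights.
W = rng.standard_normal((d, 2 * d)) * 0.01
b = np.zeros(d)
gate = sigmoid(W @ np.concatenate([f_peripheral, f_foveal]) + b)

# Weighted combination of the two streams (element-wise gating assumed here).
fused = gate * f_peripheral + (1.0 - gate) * f_foveal
print(fused.shape)  # (128,)
```

The sigmoid keeps each gate value in (0, 1), so the fusion is always a convex combination of the two streams, which is what lets the network shift weight between holistic and fine-grained cues per input.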
Clipping free attacks against artificial neural networks
Title | Clipping free attacks against artificial neural networks |
Authors | Boussad Addad, Jerome Kodjabachian, Christophe Meyer |
Abstract | In recent years, a remarkable breakthrough has been made in the AI domain thanks to artificial deep neural networks, which have achieved great success in many machine learning tasks in computer vision, natural language processing, speech recognition, malware detection and so on. However, they are highly vulnerable to easily crafted adversarial examples. Many investigations have pointed out this fact, and different approaches have been proposed to generate attacks while adding a limited perturbation to the original data. The most robust method known so far is the so-called C&W attack [1]. Nonetheless, a countermeasure known as feature squeezing, coupled with ensemble defense, showed that most of these attacks can be destroyed [6]. In this paper, we present a new method we call Centered Initial Attack (CIA) whose advantage is twofold: first, it ensures by construction that the maximum perturbation is smaller than a threshold fixed beforehand, without the clipping process that degrades the quality of attacks. Second, it is robust against recently introduced defenses such as feature squeezing, JPEG encoding and even against a voting ensemble of defenses. While its application is not limited to images, we illustrate this using five of the current best classifiers on the ImageNet dataset, two of which are adversarially retrained on purpose to be robust against attacks. With a fixed maximum perturbation of only 1.5% on any pixel, around 80% of targeted attacks fool the voting ensemble defense, and nearly 100% when the perturbation is only 6%. While this shows how difficult it is to defend against CIA attacks, the last section of the paper gives some guidelines to limit their impact. |
Tasks | Malware Detection, Speech Recognition |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09468v2 |
PDF | http://arxiv.org/pdf/1803.09468v2.pdf |
PWC | https://paperswithcode.com/paper/clipping-free-attacks-against-artificial |
Repo | |
Framework | |
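The abstract's key claim, a perturbation that stays below the budget by construction rather than by clipping the finished attack, can be illustrated with a tanh reparameterization plus a one-time centering of the starting point. This is only a sketch of that general idea under assumed details; the paper's exact change of variables and optimization loop may differ, and the 1.5% budget is the only figure taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.uniform(0.0, 1.0, size=(8, 8))   # toy image with pixel values in [0, 1]
eps = 0.015                               # maximum perturbation per pixel (1.5%)

# Center the starting point once so that x_c +/- eps stays inside [0, 1];
# after this, no clipping of the attack itself is ever required.
x_c = np.clip(x, eps, 1.0 - eps)

# Free optimization variable (an attacker would optimize this with gradients
# of the target model's loss; here it is just random for illustration).
w = rng.standard_normal(x.shape)

# The perturbation eps * tanh(w) is bounded by eps by construction.
x_adv = x_c + eps * np.tanh(w)

assert np.all(np.abs(x_adv - x_c) <= eps)
assert np.all((x_adv >= 0.0) & (x_adv <= 1.0))
```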
Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection
Title | Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection |
Authors | Andy Brown, Aaron Tuor, Brian Hutchinson, Nicole Nichols |
Abstract | Deep learning has recently demonstrated state-of-the-art performance on key tasks related to the maintenance of computer systems, such as intrusion detection, denial-of-service attack detection, hardware and software system failures, and malware detection. In these contexts, model interpretability is vital for administrators and analysts to trust and act on the automated analysis of machine learning models. Deep learning methods have been criticized as black-box oracles which allow limited insight into decision factors. In this work we seek to “bridge the gap” between the impressive performance of deep learning models and the need for interpretable model introspection. To this end we present recurrent neural network (RNN) language models augmented with attention for anomaly detection in system logs. Our methods are generally applicable to any computer system and logging source. By incorporating attention variants into our RNN language models we create opportunities for model introspection and analysis without sacrificing state-of-the-art performance. We demonstrate model performance and illustrate model interpretability on an intrusion detection task using the Los Alamos National Laboratory (LANL) cyber security dataset, reporting upwards of 0.99 area under the receiver operating characteristic curve despite being trained only on a single day’s worth of data. |
Tasks | Anomaly Detection, Intrusion Detection, Malware Detection |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.04967v1 |
PDF | http://arxiv.org/pdf/1803.04967v1.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-network-attention-mechanisms |
Repo | |
Framework | |
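As a rough illustration of attention over the hidden states of an RNN language model for log lines, the sketch below computes dot-product attention weights (which double as an introspection signal) and an anomaly score from the negative log-likelihood of the observed next token. The hidden states, vocabulary size, token id, and this particular attention variant are stand-ins; the paper compares several attention mechanisms.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hidden states of an RNN language model over one log line (T steps, d dims);
# random stand-ins here instead of real RNN outputs.
T, d, vocab = 12, 64, 500
H = rng.standard_normal((T, d))

# Dot-product attention over the hidden states (one simple variant).
w_att = rng.standard_normal(d)
alpha = softmax(H @ w_att)          # attention weights, also usable for introspection
context = alpha @ H                 # weighted summary of the hidden states

# Next-token distribution from the attended context; the negative
# log-likelihood of the observed token serves as the anomaly score.
W_out = rng.standard_normal((vocab, d)) * 0.01
probs = softmax(W_out @ context)
observed_token = 42                 # hypothetical next token id from the log line
anomaly_score = -np.log(probs[observed_token])
print(alpha.round(3), anomaly_score)
```

Inspecting `alpha` shows which positions in the log line the model relied on, which is the introspection opportunity the abstract refers to.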
Adaptive Monte-Carlo Optimization
Title | Adaptive Monte-Carlo Optimization |
Authors | Vivek Bagaria, Govinda M. Kamath, David N. Tse |
Abstract | The celebrated Monte Carlo method estimates a quantity that is expensive to compute by random sampling. We propose adaptive Monte Carlo optimization: a general framework for discrete optimization of an expensive-to-compute function by adaptive random sampling. Applications of this framework have already appeared in machine learning but are tied to their specific contexts and developed in isolation. We take a unified view and show that the framework has broad applicability by applying it to several common machine learning problems: $k$-nearest neighbors, hierarchical clustering and maximum mutual information feature selection. On real data we show that this framework allows us to develop algorithms that confer a gain of one to two orders of magnitude over exact computation. We also characterize the performance gain theoretically under regularity assumptions on the data that we verify in real-world data. The code is available at https://github.com/govinda-kamath/combinatorial_MAB. |
Tasks | Feature Selection |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08321v2 |
PDF | http://arxiv.org/pdf/1805.08321v2.pdf |
PWC | https://paperswithcode.com/paper/adaptive-monte-carlo-optimization |
Repo | |
Framework | |
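The framework can be illustrated with a bandit-style nearest-neighbour search: each candidate point is an arm, pulling an arm samples one random coordinate of the squared distance, and a confidence-bound rule concentrates samples on promising candidates. This is a simplified sketch of adaptive random sampling, not the paper's algorithm or its confidence intervals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Find the nearest neighbour of a query among n high-dimensional points
# without computing any full distance exactly.
n, d = 50, 10_000
X = rng.standard_normal((n, d))
query = rng.standard_normal(d)

counts = np.ones(n)
idx0 = rng.integers(d, size=n)                      # one initial coordinate per candidate
means = (X[np.arange(n), idx0] - query[idx0]) ** 2  # running per-coordinate estimates

budget = 20_000
for t in range(budget):
    # lower confidence bound on the (scaled) distance: we minimize, so explore
    # candidates whose estimate might still be small
    lcb = means - np.sqrt(2.0 * np.log(budget) / counts)
    a = int(np.argmin(lcb))                         # most promising candidate
    j = rng.integers(d)                             # sample one random coordinate
    sample = (X[a, j] - query[j]) ** 2
    counts[a] += 1
    means[a] += (sample - means[a]) / counts[a]     # update running mean

print("estimated NN:", int(np.argmin(means)))
print("exact NN:    ", int(np.argmin(((X - query) ** 2).sum(axis=1))))
```

On this toy instance the adaptive search touches roughly 20k coordinates instead of the 500k an exact scan would need, which is the kind of gain the abstract describes.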
Cardiopulmonary resuscitation quality parameters from motion capture data using Differential Evolution fitting of sinusoids
Title | Cardiopulmonary resuscitation quality parameters from motion capture data using Differential Evolution fitting of sinusoids |
Authors | Christian Lins, Daniel Eckhoff, Andreas Klausen, Sandra Hellmers, Andreas Hein, Sebastian Fudickar |
Abstract | Cardiopulmonary resuscitation (CPR) is, alongside electrical defibrillation, the most crucial countermeasure for sudden cardiac arrest, which affects thousands of individuals every year. In this paper, we present a novel approach in which sinusoid models are dynamically fitted to skeletal motion data from an RGB-D (Kinect) sensor using the Differential Evolution (DE) optimization algorithm, deriving frequency and depth parameters for cardiopulmonary resuscitation training. It is intended to be part of a robust and easy-to-use feedback system for CPR training, allowing its use for unsupervised training. The accuracy of this DE-based approach is evaluated against data from 28 participants recorded by a state-of-the-art training mannequin. We optimized the DE algorithm hyperparameters and showed that, with these optimized parameters, the CPR frequency is recognized with a median error of $\pm 2.9$ compressions per minute compared to the reference training mannequin. |
Tasks | Motion Capture |
Published | 2018-06-26 |
URL | https://arxiv.org/abs/1806.10115v4 |
PDF | https://arxiv.org/pdf/1806.10115v4.pdf |
PWC | https://paperswithcode.com/paper/cardiopulmonary-resuscitation-quality |
Repo | |
Framework | |
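A small sketch of the core fitting step using SciPy's differential_evolution: a sinusoid is fitted to a synthetic chest-marker trajectory, and the compression rate and depth are read off the fitted parameters. The sampling rate, noise level, and parameter bounds are assumptions for illustration, not the tuned hyperparameters reported in the paper.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)

# Synthetic chest-marker trajectory: ~110 compressions/min, ~5 cm depth,
# sampled at an assumed 30 Hz for 5 seconds.
t = np.linspace(0.0, 5.0, 150)
freq_true, depth_true = 110.0 / 60.0, 0.05
y = (depth_true / 2.0) * np.sin(2 * np.pi * freq_true * t) \
    + 0.01 * rng.standard_normal(t.size)

def residual(params):
    # mean squared error between the sinusoid model and the observed trajectory
    amp, freq, phase, offset = params
    model = amp * np.sin(2 * np.pi * freq * t + phase) + offset
    return np.mean((y - model) ** 2)

# Differential Evolution searches the bounded parameter space; bounds are
# plausible guesses (amplitude in metres, frequency in Hz).
bounds = [(0.0, 0.1), (0.5, 3.5), (-np.pi, np.pi), (-0.05, 0.05)]
result = differential_evolution(residual, bounds, seed=0)
amp, freq, phase, offset = result.x
print(f"rate  ~ {freq * 60:.1f} compressions/min")
print(f"depth ~ {2 * amp * 100:.1f} cm")
```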
Automatic Electrodes Detection during simultaneous EEG/fMRI acquisition
Title | Automatic Electrodes Detection during simultaneous EEG/fMRI acquisition |
Authors | Mathis Fleury, Pierre Maurel, Marsel Mano, Elise Bannier, Christian Barillot |
Abstract | Simultaneous EEG/fMRI acquisition makes it possible to measure brain activity at high spatial and temporal resolution. The localisation of EEG sources depends on several parameters, including the position of the electrodes on the scalp. The position of the electrodes during MR acquisition is obtained using a UTE sequence, which allows their visualisation. Electrode retrieval consists of obtaining the volume in which the electrodes are located by applying a sphere detection algorithm. We detect around 90% of electrodes for each subject, and our UTE-based electrode detection showed an average position error of 3.7 mm across all subjects. |
Tasks | EEG |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06139v1 |
PDF | http://arxiv.org/pdf/1809.06139v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-electrodes-detection-during |
Repo | |
Framework | |
Local Wealth Redistribution Promotes Cooperation in Multiagent Systems
Title | Local Wealth Redistribution Promotes Cooperation in Multiagent Systems |
Authors | Flávio L. Pinheiro, Fernando P. Santos |
Abstract | Designing mechanisms that leverage cooperation between agents has been a long-standing goal in Multiagent Systems. The task is especially challenging when agents are selfish, lack common goals and face social dilemmas, i.e., situations in which individual interest conflicts with social welfare. Past works explored mechanisms that explain cooperation in biological and social systems, providing important clues for the aim of designing cooperative artificial societies. In particular, several works show that cooperation is able to emerge when specific network structures underlie agents’ interactions. Notwithstanding, social dilemmas in which defection is highly tempting still pose challenges concerning the effective sustainability of cooperation. Here we propose a new redistribution mechanism that can be applied in structured populations of agents. Importantly, we show that, when implemented locally (i.e., agents share a fraction of their wealth surplus with their nearest neighbors), redistribution excels at promoting cooperation under regimes where, before, only defection prevailed. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01730v1 |
PDF | http://arxiv.org/pdf/1802.01730v1.pdf |
PWC | https://paperswithcode.com/paper/local-wealth-redistribution-promotes |
Repo | |
Framework | |
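A toy sketch of local redistribution on a ring network: each agent gives away a fraction of its wealth surplus, split equally among its nearest neighbours, and total wealth is conserved. The surplus threshold, sharing fraction, and network are illustrative assumptions; in the paper the mechanism is embedded in an evolutionary game on structured populations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ring of 10 agents; each agent's "nearest neighbours" are its two ring neighbours.
n = 10
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0

wealth = rng.uniform(0.0, 2.0, size=n)
baseline = 1.0        # hypothetical threshold above which wealth counts as surplus
share_fraction = 0.2  # hypothetical fraction of the surplus that is redistributed

def redistribute(wealth):
    surplus = np.maximum(wealth - baseline, 0.0)
    given = share_fraction * surplus          # what each agent gives away
    degree = A.sum(axis=1)
    # each agent's contribution is split equally among its neighbours
    received = A.T @ (given / degree)
    return wealth - given + received

print(wealth.sum(), redistribute(wealth).sum())  # total wealth is conserved
```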
Explaining Black-box Android Malware Detection
Title | Explaining Black-box Android Malware Detection |
Authors | Marco Melis, Davide Maiorca, Battista Biggio, Giorgio Giacinto, Fabio Roli |
Abstract | Machine-learning models have recently been used for detecting malicious Android applications, reporting impressive performance on benchmark datasets, even when trained only on features statically extracted from the application, such as system calls and permissions. However, recent findings have highlighted the fragility of such in-vitro evaluations with benchmark datasets, showing that very few changes to the content of Android malware may suffice to evade detection. How can we thus trust that a malware detector performing well on benchmark data will continue to do so when deployed in an operating environment? To mitigate this issue, the most popular Android malware detectors use linear, explainable machine-learning models to easily identify the most influential features contributing to each decision. In this work, we generalize this approach to any black-box machine-learning model, by leveraging a gradient-based approach to identify the most influential local features. This enables using nonlinear models to potentially increase accuracy without sacrificing interpretability of decisions. Our approach also highlights the global characteristics learned by the model to discriminate between benign and malware applications. Finally, as shown by our empirical analysis on a popular Android malware detection task, it also helps identify potential vulnerabilities of linear and nonlinear models against adversarial manipulations. |
Tasks | Android Malware Detection, Malware Detection |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03544v2 |
PDF | http://arxiv.org/pdf/1803.03544v2.pdf |
PWC | https://paperswithcode.com/paper/explaining-black-box-android-malware |
Repo | |
Framework | |
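To illustrate gradient-based local feature attribution when only a detector's score is observable, the sketch below estimates the gradient by finite differences and ranks features by gradient times input. The toy logistic detector and binary feature vector are assumptions; the paper's approach computes gradients of the learned decision function itself rather than finite-difference estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Black-box detector: only its decision score is observable.  Here it is a
# toy logistic score over binary features (e.g. permissions / API calls).
d = 20
w_hidden = rng.standard_normal(d)
def detector_score(x):
    return 1.0 / (1.0 + np.exp(-(x @ w_hidden)))

x = rng.integers(0, 2, size=d).astype(float)   # one app's feature vector

# Finite-difference estimate of the local gradient (no access to internals),
# then gradient * input to rank the features driving this decision.
eps = 1e-4
grad = np.array([
    (detector_score(x + eps * np.eye(d)[j])
     - detector_score(x - eps * np.eye(d)[j])) / (2 * eps)
    for j in range(d)
])
relevance = grad * x                            # only features present in the app get credit
top = np.argsort(-np.abs(relevance))[:5]
print("most influential features:", top)
```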
Autoencoding any Data through Kernel Autoencoders
Title | Autoencoding any Data through Kernel Autoencoders |
Authors | Pierre Laforgue, Stephan Clémençon, Florence d’Alché-Buc |
Abstract | This paper investigates a novel algorithmic approach to data representation based on kernel methods. Assuming that the observations lie in a Hilbert space X, the introduced Kernel Autoencoder (KAE) is the composition of mappings from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs) that minimizes the expected reconstruction error. Beyond a first extension of the autoencoding scheme to possibly infinite-dimensional Hilbert spaces, the KAE further makes it possible to autoencode any kind of data by choosing X to be itself an RKHS. A theoretical analysis of the model is carried out, providing a generalization bound and shedding light on its connection with Kernel Principal Component Analysis. The proposed algorithms are then detailed at length: they crucially rely on the form taken by the minimizers, revealed by a dedicated Representer Theorem. Finally, numerical experiments on both simulated data and real labeled graphs (molecules) provide empirical evidence of the KAE’s performance. |
Tasks | |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11028v2 |
PDF | http://arxiv.org/pdf/1805.11028v2.pdf |
PWC | https://paperswithcode.com/paper/autoencoding-any-data-through-kernel |
Repo | |
Framework | |
DNN: A Two-Scale Distributional Tale of Heterogeneous Treatment Effect Inference
Title | DNN: A Two-Scale Distributional Tale of Heterogeneous Treatment Effect Inference |
Authors | Yingying Fan, Jinchi Lv, Jingbo Wang |
Abstract | Heterogeneous treatment effects are the center of gravity in many modern causal inference applications. In this paper, we investigate the estimation and inference of heterogeneous treatment effects with precision in a general nonparametric setting. To this end, we enhance the classical $k$-nearest neighbor method with a simple algorithm, extend it to a distributional setting, and suggest the two-scale distributional nearest neighbors (DNN) estimator with reduced finite-sample bias. Our recipe is first to subsample the data and average the 1-nearest neighbor estimators from each subsample. With appropriately chosen subsampling scale, the resulting DNN estimator is proved to be asymptotically unbiased and normal under mild regularity conditions. We then proceed with combining DNN estimators with different subsampling scales to further reduce bias. Our theoretical results on the advantages of the new two-scale DNN framework are well supported by several Monte Carlo simulations. The newly suggested method is also applied to a real-life data set to study the heterogeneity of treatment effects of smoking on children’s birth weights across mothers’ ages. |
Tasks | Causal Inference |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.08469v1 |
PDF | http://arxiv.org/pdf/1808.08469v1.pdf |
PWC | https://paperswithcode.com/paper/dnn-a-two-scale-distributional-tale-of |
Repo | |
Framework | |
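The subsample-and-average recipe and the two-scale bias cancellation can be sketched as follows. The assumed bias rate s**(-2/d) used to pick the combination weights, the subsampling scales, and the toy regression data are all for illustration; the paper derives the precise asymptotics and weighting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data; mu(x) = sin(x0) is the target conditional mean.
n, d = 2000, 2
X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
x0 = np.array([0.3, -0.2])

def dnn_estimate(s, n_subsamples=500):
    """Distributional nearest neighbours: average the 1-NN label over
    many random subsamples of size s."""
    ests = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=s, replace=False)
        nn = idx[np.argmin(((X[idx] - x0) ** 2).sum(axis=1))]
        ests.append(y[nn])
    return float(np.mean(ests))

# Two-scale combination: assuming the leading bias scales like s**(-2/d),
# choose weights with w1 + w2 = 1 and w1*b1 + w2*b2 = 0 to cancel it.
s1, s2 = 50, 200
b1, b2 = s1 ** (-2 / d), s2 ** (-2 / d)
w2 = b1 / (b1 - b2)
w1 = 1.0 - w2
estimate = w1 * dnn_estimate(s1) + w2 * dnn_estimate(s2)
print(estimate, np.sin(x0[0]))   # two-scale DNN estimate vs. true mean
```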
Facial Landmark Point Localization using Coarse-to-Fine Deep Recurrent Neural Network
Title | Facial Landmark Point Localization using Coarse-to-Fine Deep Recurrent Neural Network |
Authors | Shahar Mahpod, Rig Das, Emanuele Maiorana, Yosi Keller, Patrizio Campisi |
Abstract | The accurate localization of facial landmarks is at the core of face analysis tasks, such as face recognition and facial expression analysis, to name a few. In this work we propose a novel localization approach based on a Deep Learning architecture that utilizes dual cascaded CNN subnetworks of the same length, where each subnetwork in a cascade refines the accuracy of its predecessor. The first set of cascaded subnetworks estimates heatmaps that encode the landmarks’ locations, while the second set of cascaded subnetworks refines the heatmaps-based localization using regression, and also receives as input the output of the corresponding heatmap estimation subnetwork. The proposed scheme is experimentally shown to compare favorably with contemporary state-of-the-art schemes. |
Tasks | Face Recognition, Facial Landmark Detection |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01760v2 |
PDF | http://arxiv.org/pdf/1805.01760v2.pdf |
PWC | https://paperswithcode.com/paper/facial-landmark-point-localization-using |
Repo | |
Framework | |
Densely Dilated Spatial Pooling Convolutional Network using benign loss functions for imbalanced volumetric prostate segmentation
Title | Densely Dilated Spatial Pooling Convolutional Network using benign loss functions for imbalanced volumetric prostate segmentation |
Authors | Qiuhua Liu, Min Fu, Xinqi Gong, Hao Jiang |
Abstract | The high incidence rate of prostate disease creates a need for early detection for diagnosis. As one of the main imaging methods used for prostate cancer detection, Magnetic Resonance Imaging (MRI) presents a wide range of appearance variations and imbalance problems, making automated prostate segmentation fundamental but challenging. Here we propose a novel Densely Dilated Spatial Pooling Convolutional Network (DDSP ConNet) with an encoder-decoder structure. It employs a dense structure to combine dilated convolution and global pooling, thus supplying coarse segmentation results from the encoder and decoder subnets and preserving more contextual information. To obtain richer hierarchical feature maps, residual long connections are further adopted to fuse contextual features. Meanwhile, we adopt the DSC loss and Jaccard loss functions to train our DDSP ConNet. Surprisingly, we found and proved that, in contrast to re-weighted cross entropy, the DSC loss and Jaccard loss have many benign theoretical properties, including symmetry, continuity and differentiability with respect to the network parameters. Extensive experiments on the MICCAI PROMISE12 challenge dataset corroborate the effectiveness of our DDSP ConNet with DSC loss and Jaccard loss. Overall, our method achieves a score of 85.78 on the test dataset, outperforming most other competitors. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10517v2 |
PDF | http://arxiv.org/pdf/1801.10517v2.pdf |
PWC | https://paperswithcode.com/paper/densely-dilated-spatial-pooling-convolutional |
Repo | |
Framework | |
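For reference, here is a minimal NumPy version of the soft Dice (DSC) and Jaccard losses discussed in the abstract. The smoothing constant and the use of probability-valued predictions are common conventions assumed here rather than details from the paper.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-7):
    """1 - DSC on soft predictions (pred in [0,1], target in {0,1})."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def soft_jaccard_loss(pred, target, eps=1e-7):
    """1 - IoU (Jaccard index) on soft predictions."""
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return 1.0 - (inter + eps) / (union + eps)

rng = np.random.default_rng(0)
# Highly imbalanced toy volume: only ~10% foreground voxels.
target = (rng.uniform(size=(16, 16, 16)) > 0.9).astype(float)
pred = np.clip(target + 0.2 * rng.standard_normal(target.shape), 0.0, 1.0)
print(soft_dice_loss(pred, target), soft_jaccard_loss(pred, target))
```

Because both losses are ratios of overlap to total mass, they are insensitive to the large background class, which is why they suit imbalanced volumetric segmentation better than plain cross entropy.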
Kernel embedding of maps for sequential Bayesian inference: The variational mapping particle filter
Title | Kernel embedding of maps for sequential Bayesian inference: The variational mapping particle filter |
Authors | Manuel Pulido, Peter Jan van Leeuwen |
Abstract | In this work, a novel sequential Monte Carlo filter is introduced which aims at efficient sampling of high-dimensional state spaces with a limited number of particles. Particles are pushed forward from the prior to the posterior density using a sequence of mappings that minimizes the Kullback-Leibler divergence between the posterior and the sequence of intermediate densities. The sequence of mappings represents a gradient flow. A key ingredient of the mappings is that they are embedded in a reproducing kernel Hilbert space, which allows for a practical and efficient algorithm. The embedding provides a direct means to calculate the gradient of the Kullback-Leibler divergence leading to quick convergence using well-known gradient-based stochastic optimization algorithms. Evaluation of the method is conducted in the chaotic Lorenz-63 system, the Lorenz-96 system, which is a coarse prototype of atmospheric dynamics, and an epidemic model that describes cholera dynamics. No resampling is required in the mapping particle filter even for long recursive sequences. The number of effective particles remains close to the total number of particles in all the experiments. |
Tasks | Bayesian Inference, Stochastic Optimization |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11380v1 |
PDF | http://arxiv.org/pdf/1805.11380v1.pdf |
PWC | https://paperswithcode.com/paper/kernel-embedding-of-maps-for-sequential |
Repo | |
Framework | |
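The kernel-embedded gradient flow described here is closely related to Stein variational gradient descent; the sketch below moves particles toward a simple Gaussian target with an SVGD-style update in an RBF reproducing kernel Hilbert space. The target density, fixed bandwidth, and step size are illustrative assumptions, and a full filter would interleave such mapping steps with the forecast step of a sequential scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target (posterior) density: standard 2-D Gaussian, so grad log p(x) = -x.
def grad_log_p(x):
    return -x

def rbf_kernel(X, h=0.5):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * h ** 2))
    # gradient of k(x_j, x_i) with respect to x_j
    gradK = -(X[:, None, :] - X[None, :, :]) / h ** 2 * K[:, :, None]
    return K, gradK

# Particles start far from the posterior and flow toward it; the repulsive
# kernel term keeps them spread out, so no resampling is needed.
N = 100
X = rng.normal(3.0, 2.0, size=(N, 2))
step = 0.1
for _ in range(300):
    K, gradK = rbf_kernel(X)
    phi = (K @ grad_log_p(X) + gradK.sum(axis=0)) / N   # kernel-embedded KL gradient
    X = X + step * phi

print(X.mean(axis=0), X.std(axis=0))   # roughly [0, 0] and [1, 1]
```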
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Title | Mini-batch Serialization: CNN Training with Inter-layer Data Reuse |
Authors | Sangkug Lym, Armand Behroozi, Wei Wen, Ge Li, Yongkee Kwon, Mattan Erez |
Abstract | Training convolutional neural networks (CNNs) requires intense computations and high memory bandwidth. We find that bandwidth today is over-provisioned because most memory accesses in CNN training can be eliminated by rearranging computation to better utilize on-chip buffers and avoid traffic resulting from large per-layer memory footprints. We introduce the MBS CNN training approach that significantly reduces memory traffic by partially serializing mini-batch processing across groups of layers. This optimizes reuse within on-chip buffers and balances both intra-layer and inter-layer reuse. We also introduce the WaveCore CNN training accelerator that effectively trains CNNs in the MBS approach with high functional-unit utilization. Combined, WaveCore and MBS reduce DRAM traffic by 75%, improve performance by 53%, and save 26% system energy for modern deep CNN training compared to conventional training mechanisms and accelerators. |
Tasks | |
Published | 2018-09-30 |
URL | https://arxiv.org/abs/1810.00307v4 |
PDF | https://arxiv.org/pdf/1810.00307v4.pdf |
PWC | https://paperswithcode.com/paper/mini-batch-serialization-cnn-training-with |
Repo | |
Framework | |
Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation
Title | Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation |
Authors | Junyang Lin, Shuming Ma, Qi Su, Xu Sun |
Abstract | The attention-based sequence-to-sequence model has proved successful in Neural Machine Translation (NMT). However, attention without consideration of the decoding history, which includes the past information in the decoder and the attention mechanism, often causes much repetition. To address this problem, we propose decoding-history-based Adaptive Control of Attention (ACA) for the NMT model. ACA learns to control the attention by keeping track of the decoding history and the current information with a memory vector, so that the model can take the translated contents and the current information into consideration. Experiments on Chinese-English and English-Vietnamese translation have demonstrated that our model significantly outperforms the strong baselines. The analysis shows that our model is capable of generating translations with less repetition and higher accuracy. The code will be available at https://github.com/lancopku |
Tasks | Machine Translation |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.01812v1 |
PDF | http://arxiv.org/pdf/1802.01812v1.pdf |
PWC | https://paperswithcode.com/paper/decoding-history-based-adaptive-control-of |
Repo | |
Framework | |
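As a very rough sketch of controlling attention with a memory of the decoding history, the snippet below keeps a running memory vector of past attention contexts and uses a sigmoid gate, computed from that memory and the current decoder state, to rescale each new context. The gate parameterization, the moving-average memory update, and all dimensions are assumptions for illustration only; ACA as defined in the paper differs in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

T_src, d = 8, 32
H_src = rng.standard_normal((T_src, d))      # encoder states (stand-ins)
W_g = rng.standard_normal((d, 2 * d)) * 0.1  # gate parameters (hypothetical form)

memory = np.zeros(d)                          # summary of the decoding history
for step in range(5):
    s_t = rng.standard_normal(d)              # current decoder state (stand-in)
    context = softmax(H_src @ s_t) @ H_src    # standard attention context

    # gate computed from the decoding history and the current state, used to
    # rescale the attention context before it enters the decoder
    gate = sigmoid(W_g @ np.concatenate([memory, s_t]))
    controlled_context = gate * context

    # memory vector keeps track of what has already been attended to
    memory = 0.9 * memory + 0.1 * controlled_context
```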