Paper Group ANR 901
The Adversarial Machine Learning Conundrum: Can The Insecurity of ML Become The Achilles’ Heel of Cognitive Networks?
Title | The Adversarial Machine Learning Conundrum: Can The Insecurity of ML Become The Achilles’ Heel of Cognitive Networks? |
Authors | Muhammad Usama, Junaid Qadir, Ala Al-Fuqaha, Mounir Hamdi |
Abstract | The holy grail of networking is to create *cognitive networks* that organize, manage, and drive themselves. Such a vision now seems attainable thanks in large part to progress in the field of machine learning (ML), which has already disrupted a number of industries and revolutionized practically all fields of research. But are ML models foolproof and robust enough against security attacks to be put in charge of managing the network? Unfortunately, many modern ML models are easily misled by simple and easily-crafted adversarial perturbations, which does not bode well for the future of ML-based cognitive networks unless ML vulnerabilities in the cognitive networking environment are identified, addressed, and fixed. The purpose of this article is to highlight the problem of insecure ML and to sensitize readers to the danger of adversarial ML by showing how an easily-crafted adversarial example can compromise the operations of a cognitive self-driving network. In this paper, we demonstrate adversarial attacks on two simple yet representative cognitive networking applications (namely, intrusion detection and network traffic classification). We also provide guidelines for designing secure ML models for cognitive networks that are robust to adversarial attacks on the ML pipeline. |
Tasks | Intrusion Detection |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00679v1 |
https://arxiv.org/pdf/1906.00679v1.pdf | |
PWC | https://paperswithcode.com/paper/190600679 |
Repo | |
Framework | |
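The attack surface the abstract describes can be illustrated with a tiny, hedged sketch: a fast-gradient-sign-style perturbation against a toy differentiable traffic classifier. This is a generic stand-in for "easily-crafted adversarial perturbations", not the paper's exact attack; the model, feature dimensions, and epsilon below are invented for illustration.

```python
# Hedged sketch: FGSM-style perturbation of a flow-feature vector against a
# toy differentiable traffic classifier. A generic stand-in, not the paper's attack.
import torch
import torch.nn as nn

def fgsm_perturb(model, features, label, epsilon=0.05):
    """Return features nudged in the direction that increases the loss."""
    x = features.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # One signed-gradient step is often enough to flip the prediction.
    return (x + epsilon * x.grad.sign()).detach()

# Toy classifier over 20 flow statistics, 2 classes (benign / malicious).
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
flow = torch.randn(1, 20)
true_label = torch.tensor([1])
adv_flow = fgsm_perturb(model, flow, true_label)
print(model(flow).argmax(1).item(), model(adv_flow).argmax(1).item())
```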
Deep Anchored Convolutional Neural Networks
Title | Deep Anchored Convolutional Neural Networks |
Authors | Jiahui Huang, Kshitij Dwivedi, Gemma Roig |
Abstract | Convolutional Neural Networks (CNNs) have proven extremely successful at solving computer vision tasks. State-of-the-art methods favor such deep network architectures for their accuracy, at the cost of a massive number of parameters and high weight redundancy. Previous works have studied how to prune the weights of such CNNs. In this paper, we go to the other extreme and analyze the performance of a network stacked with a single convolution kernel across layers, as well as other weight-sharing techniques. We name it the Deep Anchored Convolutional Neural Network (DACNN). Sharing the same kernel weights across layers reduces the model size tremendously; more precisely, the network is compressed in memory by a factor of L, where L is the desired depth of the network, disregarding the fully connected layer for prediction. The number of parameters in DACNN barely increases as the network grows deeper, which allows us to build deep DACNNs without any concern about memory costs. We also introduce a partial shared-weights network (DACNN-mix) as well as an easy plug-in module, coined regulators, to boost the performance of our architecture. We validate our idea on 3 datasets: CIFAR-10, CIFAR-100 and SVHN. Our results show that we can save massive amounts of memory with our model while maintaining high accuracy. |
Tasks | |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09764v1 |
http://arxiv.org/pdf/1904.09764v1.pdf | |
PWC | https://paperswithcode.com/paper/190409764 |
Repo | |
Framework | |
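A minimal, hedged sketch of the anchoring idea: a single convolution layer whose weights are reused at every depth, so the parameter count is essentially independent of depth. The channel width, depth, and pooling are placeholder choices; the regulator modules and the DACNN-mix variant are omitted.

```python
# Hedged sketch: one "anchored" convolution reused at every depth. Not the
# paper's full DACNN (no regulators, no DACNN-mix); sizes are illustrative.
import torch
import torch.nn as nn

class TinyDACNN(nn.Module):
    def __init__(self, channels=64, depth=16, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        # The single shared kernel applied at every subsequent layer.
        self.shared = nn.Conv2d(channels, channels, 3, padding=1)
        self.depth = depth
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = torch.relu(self.stem(x))
        for _ in range(self.depth):          # same weights applied repeatedly
            x = torch.relu(self.shared(x))
        x = x.mean(dim=(2, 3))               # global average pooling
        return self.head(x)

for depth in (4, 16, 64):
    m = TinyDACNN(depth=depth)
    print(depth, sum(p.numel() for p in m.parameters()))  # identical counts
```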
The Virtual Patch Clamp: Imputing C. elegans Membrane Potentials from Calcium Imaging
Title | The Virtual Patch Clamp: Imputing C. elegans Membrane Potentials from Calcium Imaging |
Authors | Andrew Warrington, Arthur Spencer, Frank Wood |
Abstract | We develop a stochastic whole-brain and body simulator of the nematode roundworm Caenorhabditis elegans (C. elegans) and show that it is sufficiently regularizing to allow imputation of latent membrane potentials from partial calcium fluorescence imaging observations. This is the first attempt we know of to “complete the circle,” where an anatomically grounded whole-connectome simulator is used to impute a time-varying “brain” state at single-cell fidelity from covariates that are measurable in practice. The sequential Monte Carlo (SMC) method we employ not only enables imputation of said latent states but also presents a strategy for learning simulator parameters via variational optimization of the noisy model evidence approximation provided by SMC. Our imputation and parameter estimation experiments were conducted on distributed systems using novel implementations of the aforementioned techniques, applied to synthetic data whose dimension and type are representative of what is currently measured in laboratories. |
Tasks | Imputation |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.11075v1 |
https://arxiv.org/pdf/1907.11075v1.pdf | |
PWC | https://paperswithcode.com/paper/the-virtual-patch-clamp-imputing-c-elegans |
Repo | |
Framework | |
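A hedged sketch of the SMC ingredient only: a bootstrap particle filter imputing a latent "membrane potential" trajectory from noisy "calcium" observations on a toy one-neuron linear-Gaussian model. The paper's whole-connectome, anatomically grounded simulator and its variational parameter learning are not reproduced.

```python
# Hedged sketch: bootstrap particle filter on a toy one-neuron model,
# standing in for the SMC imputation described above.
import numpy as np

rng = np.random.default_rng(0)
T, N = 100, 500                      # time steps, particles

# Toy dynamics: v_t = 0.95 v_{t-1} + noise;  observation: c_t = v_t + noise.
true_v = np.zeros(T)
obs_c = np.zeros(T)
for t in range(1, T):
    true_v[t] = 0.95 * true_v[t - 1] + rng.normal(0, 0.3)
    obs_c[t] = true_v[t] + rng.normal(0, 0.5)

particles = np.zeros(N)
imputed = np.zeros(T)
for t in range(1, T):
    particles = 0.95 * particles + rng.normal(0, 0.3, N)       # propagate
    logw = -0.5 * ((obs_c[t] - particles) / 0.5) ** 2           # weight by likelihood
    w = np.exp(logw - logw.max()); w /= w.sum()
    imputed[t] = np.sum(w * particles)                          # posterior-mean estimate
    particles = particles[rng.choice(N, N, p=w)]                # resample

print(np.corrcoef(true_v, imputed)[0, 1])                       # imputation tracks truth
```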
Time series cluster kernels to exploit informative missingness and incomplete label information
Title | Time series cluster kernels to exploit informative missingness and incomplete label information |
Authors | Karl Øyvind Mikalsen, Cristina Soguero-Ruiz, Filippo Maria Bianchi, Arthur Revhaug, Robert Jenssen |
Abstract | The time series cluster kernel (TCK) provides a powerful tool for analysing multivariate time series subject to missing data. TCK is designed using an ensemble learning approach in which Bayesian mixture models form the base models. Because of the Bayesian approach, TCK can naturally deal with missing values without resorting to imputation, and the ensemble strategy ensures robustness to hyperparameters, making it particularly well suited for unsupervised learning. However, TCK assumes that data are missing at random and that the underlying missingness mechanism is ignorable, i.e. uninformative, an assumption that does not hold in many real-world applications, such as medicine. To overcome this limitation, we present a kernel capable of exploiting the potentially rich information in the missing values and patterns, as well as the information from the observed data. In our approach, we create a representation of the missing pattern, which is incorporated into mixed-mode mixture models in such a way that the information provided by the missing patterns is effectively exploited. Moreover, we also propose a semi-supervised kernel capable of taking advantage of incomplete label information to learn more accurate similarities. Experiments on benchmark data, as well as a real-world case study of patients described by longitudinal electronic health record data who potentially suffer from hospital-acquired infections, demonstrate the effectiveness of the proposed methods. |
Tasks | Imputation, Time Series |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.05251v1 |
https://arxiv.org/pdf/1907.05251v1.pdf | |
PWC | https://paperswithcode.com/paper/time-series-cluster-kernels-to-exploit |
Repo | |
Framework | |
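A hedged sketch of the representation idea only: augment each (flattened) series with its binary missingness mask, fit an ensemble of Gaussian mixtures, and accumulate posterior-assignment inner products into a kernel matrix. The actual TCK uses Bayesian mixed-mode mixtures that handle missing values without the crude mean fill used here.

```python
# Hedged sketch: missingness pattern as extra features + ensemble of mixtures
# accumulated into a kernel. Illustrative only; not the paper's Bayesian TCK.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))                       # 60 series, 10 flattened values
missing = rng.random(X.shape) < 0.2                 # informative missingness pattern
X_obs = np.where(missing, np.nan, X)
X_filled = np.where(missing, np.nanmean(X_obs), X)  # crude fill for this sketch only
Z = np.hstack([X_filled, missing.astype(float)])    # values + missing-pattern features

K = np.zeros((len(Z), len(Z)))
for seed in range(10):                              # ensemble over random restarts
    gmm = GaussianMixture(n_components=3, random_state=seed).fit(Z)
    P = gmm.predict_proba(Z)                        # soft cluster assignments
    K += P @ P.T                                    # similar assignments -> similar series
print(K.shape, np.round(K[0, :3], 2))
```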
Class-Conditional Domain Adaptation on Semantic Segmentation
Title | Class-Conditional Domain Adaptation on Semantic Segmentation |
Authors | Yue Wang, Yuke Li, James H. Elder, Runmin Wu, Huchuan Lu |
Abstract | Semantic segmentation is an important sub-task for many applications, but pixel-level ground-truth labeling is costly, and there is a tendency to overfit the training data, limiting generalization. Unsupervised domain adaptation can potentially address these problems, allowing systems trained on labelled datasets from one or more source domains (including less expensive synthetic domains) to be adapted to novel target domains. The conventional approach is to automatically align the representational distributions of source and target domains. One limitation of this approach is that it tends to disadvantage lower-probability classes. We address this problem by introducing a Class-Conditional Domain Adaptation method (CCDA). It includes a class-conditional multi-scale discriminator and a class-conditional loss. This novel CCDA method encourages the network to shift the domain in a class-conditional manner, and it equalizes the loss over classes. We evaluate our CCDA method on two transfer tasks and demonstrate performance comparable to state-of-the-art methods. |
Tasks | Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11981v2 |
https://arxiv.org/pdf/1911.11981v2.pdf | |
PWC | https://paperswithcode.com/paper/class-conditional-domain-adaptation-on |
Repo | |
Framework | |
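A hedged sketch of the "equalize loss over classes" idea: average the per-pixel loss within each class before averaging across classes, so rare classes are not drowned out by frequent ones. The class-conditional multi-scale discriminator and the adversarial alignment itself are omitted; the function name and shapes are illustrative.

```python
# Hedged sketch: class-balanced segmentation loss, illustrating only the
# "equalize loss over classes" idea, not the full CCDA method.
import torch
import torch.nn.functional as F

def class_balanced_ce(logits, labels, num_classes):
    """Cross-entropy averaged per class, then across classes present in the batch."""
    per_pixel = F.cross_entropy(logits, labels, reduction="none")   # (B, H, W)
    losses = []
    for c in range(num_classes):
        m = labels == c
        if m.any():
            losses.append(per_pixel[m].mean())
    return torch.stack(losses).mean()

logits = torch.randn(2, 5, 8, 8, requires_grad=True)   # toy 5-class segmentation
labels = torch.randint(0, 5, (2, 8, 8))
loss = class_balanced_ce(logits, labels, 5)
loss.backward()
print(loss.item())
```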
Reconstruction and Membership Inference Attacks against Generative Models
Title | Reconstruction and Membership Inference Attacks against Generative Models |
Authors | Benjamin Hilprecht, Martin Härterich, Daniel Bernau |
Abstract | We present two information leakage attacks that outperform previous work on membership inference against generative models. The first attack allows membership inference without assumptions on the type of the generative model. Contrary to previous evaluation metrics for generative models, such as Kernel Density Estimation, it only considers samples of the model which are close to training data records. The second attack specifically targets Variational Autoencoders, achieving high membership inference accuracy. Furthermore, previous work mostly considers membership inference adversaries who perform single-record membership inference. We argue for considering regulatory actors who perform set membership inference to identify the use of specific datasets for training. The attacks are evaluated on two generative model architectures, Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), trained on standard image datasets. Our results show that the two attacks yield success rates superior to previous work on most datasets while making only very mild assumptions. We envision the two attacks, in combination with the membership inference attack type formalization, as especially useful, for example, to enforce data privacy standards and to automatically assess model quality in machine-learning-as-a-service setups. In practice, our work motivates the use of GANs since they prove less vulnerable to information leakage attacks while producing detailed samples. |
Tasks | Density Estimation, Inference Attack |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03006v1 |
https://arxiv.org/pdf/1906.03006v1.pdf | |
PWC | https://paperswithcode.com/paper/reconstruction-and-membership-inference |
Repo | |
Framework | |
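A hedged sketch of the model-agnostic first attack: score a candidate record by the fraction of generator samples that fall inside an epsilon-ball around it, and guess that high-scoring records were training members. The distance metric, epsilon choice, and the set-membership aggregation differ in the paper; the "generator" below is simulated.

```python
# Hedged sketch: Monte-Carlo membership score from generator samples.
# A generic stand-in for the paper's model-agnostic attack.
import numpy as np

rng = np.random.default_rng(0)

def membership_score(record, generated, eps=0.5):
    d = np.linalg.norm(generated - record, axis=1)
    return np.mean(d < eps)            # fraction of samples "close" to the record

# Toy "generator output" clustered around the training data.
train = rng.normal(0, 1, size=(100, 5))
generated = train[rng.integers(0, 100, 5000)] + rng.normal(0, 0.3, (5000, 5))
member, non_member = train[0], rng.normal(0, 1, 5)

print(membership_score(member, generated), membership_score(non_member, generated))
```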
Graph-based Transforms for Video Coding
Title | Graph-based Transforms for Video Coding |
Authors | Hilmi E. Egilmez, Yung-Hsuan Chao, Antonio Ortega |
Abstract | In many state-of-the-art compression systems, signal transformation is an integral part of the encoding and decoding process, where transforms provide compact representations for the signals of interest. This paper introduces a class of transforms called graph-based transforms (GBTs) for video compression, and proposes two different techniques to design GBTs. In the first technique, we formulate an optimization problem to learn graphs from data and provide solutions for optimal separable and nonseparable GBT designs, called GL-GBTs. The optimality of the proposed GL-GBTs is also theoretically analyzed based on Gaussian-Markov random field (GMRF) models for intra and inter predicted block signals. The second technique develops edge-adaptive GBTs (EA-GBTs) in order to flexibly adapt transforms to block signals with image edges (discontinuities). The advantages of EA-GBTs are both theoretically and empirically demonstrated. Our experimental results demonstrate that the proposed transforms can significantly outperform the traditional Karhunen-Loeve transform (KLT). |
Tasks | Video Compression |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.00952v1 |
https://arxiv.org/pdf/1909.00952v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-transforms-for-video-coding |
Repo | |
Framework | |
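A hedged sketch of a graph-based transform on a 1-D block: the eigenvectors of a line-graph Laplacian form the transform basis, and lowering the weight of the link that crosses a discontinuity gives an edge-adaptive variant that compacts the energy of a piecewise-constant signal into fewer coefficients. The graph construction and weights here are illustrative, not the paper's learned GL-GBTs.

```python
# Hedged sketch: GBT = eigenbasis of a graph Laplacian; an edge-adaptive
# variant weakens the link across a detected discontinuity.
import numpy as np

def line_graph_laplacian(n, weights=None):
    w = np.ones(n - 1) if weights is None else np.asarray(weights, float)
    L = np.zeros((n, n))
    for i, wi in enumerate(w):
        L[i, i] += wi; L[i + 1, i + 1] += wi
        L[i, i + 1] -= wi; L[i + 1, i] -= wi
    return L

n = 8
signal = np.concatenate([np.ones(4), 5 * np.ones(4)])      # block with an "edge"
_, U = np.linalg.eigh(line_graph_laplacian(n))              # uniform-weight GBT basis
edge_w = np.ones(n - 1); edge_w[3] = 0.1                    # weaken the edge link
_, U_ea = np.linalg.eigh(line_graph_laplacian(n, edge_w))   # edge-adaptive GBT basis

print(np.round(U.T @ signal, 2))     # coefficients spread across frequencies
print(np.round(U_ea.T @ signal, 2))  # energy compacted into fewer coefficients
```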
Visual Tree Convolutional Neural Network in Image Classification
Title | Visual Tree Convolutional Neural Network in Image Classification |
Authors | Yuntao Liu, Yong Dou, Ruochun Jin, Peng Qiao |
Abstract | In image classification, Convolutional Neural Network (CNN) models have achieved high performance with the rapid development of deep learning. However, some categories in image datasets are more difficult to distinguish than others. Improving the classification accuracy on these confused categories benefits the overall performance. In this paper, we build a Confusion Visual Tree (CVT) based on confused semantic-level information to identify the confused categories. With the information provided by the CVT, we can lead the CNN training procedure to pay more attention to these confused categories. Therefore, we propose Visual Tree Convolutional Neural Networks (VT-CNN) based on the original deep CNN embedded with our CVT. We evaluate our VT-CNN model on the benchmark datasets CIFAR-10 and CIFAR-100. In our experiments, we build 3 different VT-CNN models, and they obtain improvements over their base CNN models of 1.36%, 0.89% and 0.64%, respectively. |
Tasks | Image Classification |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01536v1 |
https://arxiv.org/pdf/1906.01536v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-tree-convolutional-neural-network-in |
Repo | |
Framework | |
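A hedged sketch of turning a confusion matrix into a category tree: symmetrize the confusion counts into a "confusability" similarity and cluster it hierarchically, so frequently confused classes end up in the same subtree. How the paper builds its CVT and feeds it back into CNN training is not shown; the confusion matrix here is random.

```python
# Hedged sketch: hierarchical clustering of classes by confusability,
# a generic stand-in for the Confusion Visual Tree construction.
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
C = rng.random((10, 10)); np.fill_diagonal(C, 5.0)     # toy confusion counts
C = C / C.sum(axis=1, keepdims=True)                   # row-normalise

D = 1.0 - (C + C.T) / 2                                # more confusion -> smaller distance
np.fill_diagonal(D, 0.0)
tree = linkage(squareform(D, checks=False), method="average")
print(leaves_list(tree))    # leaf order groups frequently confused classes together
```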
Personalized sentence generation using generative adversarial networks with author-specific word usage
Title | Personalized sentence generation using generative adversarial networks with author-specific word usage |
Authors | Chenhan Yuan, Yi-Chin Huang |
Abstract | Author-specific word usage is a vital feature that lets readers perceive the writing style of an author. In this work, a personalized sentence generation method based on generative adversarial networks (GANs) is proposed to cope with this issue. Frequently used function words and content words are incorporated not only as input features but also as sentence-structure constraints for the GAN training. For sentence generation on topics chosen by the user, the Named Entity Recognition (NER) information of the input words is also used in network training. We compared the proposed method with existing GAN-based sentence generation methods, and the experimental results show that the sentences generated by our method are more similar to the original sentences of the same author, based on objective evaluations such as BLEU and SimHash scores. |
Tasks | Named Entity Recognition |
Published | 2019-04-20 |
URL | http://arxiv.org/abs/1904.09442v1 |
http://arxiv.org/pdf/1904.09442v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-sentence-generation-using |
Repo | |
Framework | |
Diagnosis of Autism Spectrum Disorder by Causal Influence Strength Learned from Resting-State fMRI Data
Title | Diagnosis of Autism Spectrum Disorder by Causal Influence Strength Learned from Resting-State fMRI Data |
Authors | Biwei Huang, Kun Zhang, Ruben Sanchez-Romero, Joseph Ramsey, Madelyn Glymour, Clark Glymour |
Abstract | Autism spectrum disorder (ASD) is one of the major developmental disorders affecting children. Recently, it has been hypothesized that ASD is associated with atypical brain connectivities. A substantial body of research uses Pearson’s correlation coefficients, mutual information, or partial correlation to investigate the differences in brain connectivities between ASD and typical controls from functional Magnetic Resonance Imaging (fMRI). However, correlation or partial correlation does not directly reveal causal influences - the information flow - between brain regions. Compared to correlation, causality pinpoints the key connectivity characteristics and removes redundant features for diagnosis. In this paper, we propose a two-step method for large-scale and cyclic causal discovery from fMRI. It can identify brain causal structures without interventional experiments. The learned causal structure, as well as the causal influence strength, provides us with the path and effectiveness of information flow. With the recovered causal influence strengths as candidate features, we then perform ASD diagnosis through feature selection and classification. We apply our methods to three datasets from the Autism Brain Imaging Data Exchange (ABIDE). Experimental results show that with causal connectivities, the diagnostic accuracy improves considerably. A closer examination shows that information flows from the superior frontal gyrus to the default mode network and posterior areas are largely reduced. Moreover, all enhanced information flows are from posterior to anterior or within local areas. Overall, long-range influences show a larger proportion of reductions than local ones, while local influences show a larger proportion of increases than long-range ones. By examining the graph properties of the brain causal structure, the ASD group shows reduced small-worldness. |
Tasks | Causal Discovery, Feature Selection |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1902.10073v2 |
http://arxiv.org/pdf/1902.10073v2.pdf | |
PWC | https://paperswithcode.com/paper/diagnosis-of-autism-spectrum-disorder-by |
Repo | |
Framework | |
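A hedged sketch of the downstream diagnosis step only: treat the recovered causal influence strengths (random placeholders below) as per-subject features, select the most informative ones, and classify ASD versus control with cross-validation. The two-step cyclic causal discovery procedure itself is not reproduced.

```python
# Hedged sketch: feature selection + classification on (placeholder) causal
# influence strengths. The causal discovery step is not shown.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_subjects, n_edges = 120, 300
X = rng.normal(size=(n_subjects, n_edges))       # causal strength per directed edge
y = rng.integers(0, 2, n_subjects)               # ASD (1) vs. typical control (0)

clf = make_pipeline(SelectKBest(f_classif, k=30), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5).mean())   # ~chance here, since X is random
```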
HexaGAN: Generative Adversarial Nets for Real World Classification
Title | HexaGAN: Generative Adversarial Nets for Real World Classification |
Authors | Uiwon Hwang, Dahuin Jung, Sungroh Yoon |
Abstract | Most deep learning classification studies assume clean data. However, when dealing with real-world data, we encounter three problems: 1) missing data, 2) class imbalance, and 3) missing labels. These problems undermine the performance of a classifier. Various preprocessing techniques have been proposed to mitigate one of these problems, but an algorithm that addresses all three problems together has not been proposed yet. In this paper, we propose HexaGAN, a generative adversarial network framework that shows promising classification performance for all three problems. We interpret the three problems from a single perspective to solve them jointly. To enable this, the framework consists of six components, which interact with each other. We also devise novel loss functions corresponding to the architecture. The designed loss functions allow us to achieve state-of-the-art imputation performance, with up to a 14% improvement, and to generate high-quality class-conditional data. We evaluate the classification performance (F1-score) of the proposed method with 20% missingness and confirm up to a 5% improvement in comparison with combinations of state-of-the-art methods. |
Tasks | Imputation |
Published | 2019-02-26 |
URL | https://arxiv.org/abs/1902.09913v2 |
https://arxiv.org/pdf/1902.09913v2.pdf | |
PWC | https://paperswithcode.com/paper/hexagan-generative-adversarial-nets-for-real |
Repo | |
Framework | |
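A hedged, generic GAN-style imputation sketch covering only the missing-data aspect: a generator proposes values for masked entries and a discriminator predicts, per entry, whether it was observed or imputed. HexaGAN's six-component architecture and its handling of class imbalance and missing labels are not reproduced; the networks, shapes, and losses are illustrative assumptions.

```python
# Hedged sketch: generic GAN-based imputation (generator fills missing entries,
# discriminator guesses which entries were imputed). Not HexaGAN itself.
import torch
import torch.nn as nn

d = 10
G = nn.Sequential(nn.Linear(2 * d, 64), nn.ReLU(), nn.Linear(64, d))
D = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, d), nn.Sigmoid())

x = torch.randn(32, d)
mask = (torch.rand(32, d) > 0.2).float()          # 1 = observed, 0 = missing
x_obs = x * mask

x_hat = G(torch.cat([x_obs, mask], dim=1))        # propose values everywhere
x_filled = mask * x_obs + (1 - mask) * x_hat      # keep observed entries
p_observed = D(x_filled)                          # per-entry "was this observed?"

d_loss = nn.functional.binary_cross_entropy(p_observed, mask)
g_loss = -torch.log(p_observed + 1e-8)[mask == 0].mean()   # fool D on imputed entries
print(d_loss.item(), g_loss.item())
```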
Multilingual Factor Analysis
Title | Multilingual Factor Analysis |
Authors | Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos-Korfiatis, Nils Hammerla |
Abstract | In this work we approach the task of learning multilingual word representations in an offline manner by fitting a generative latent variable model to a multilingual dictionary. We model equivalent words in different languages as different views of the same word generated by a common latent variable representing their latent lexical meaning. We explore the task of alignment by querying the fitted model for multilingual embeddings achieving competitive results across a variety of tasks. The proposed model is robust to noise in the embedding space making it a suitable method for distributed representations learned from noisy corpora. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05547v2 |
https://arxiv.org/pdf/1905.05547v2.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-factor-analysis |
Repo | |
Framework | |
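A hedged sketch that uses classical CCA as a deterministic stand-in for the paper's generative latent-variable model: dictionary translation pairs are projected into a shared space, and nearest neighbours there act as alignments. The synthetic "English" and "French" embeddings below are generated from a common latent factor, which is exactly the modelling assumption being illustrated.

```python
# Hedged sketch: CCA as a deterministic proxy for the shared-latent-variable
# view of multilingual word pairs. Not the paper's probabilistic model.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 20))                       # latent "lexical meaning"
X_en = z @ rng.normal(size=(20, 50)) + 0.1 * rng.normal(size=(500, 50))  # English view
X_fr = z @ rng.normal(size=(20, 50)) + 0.1 * rng.normal(size=(500, 50))  # French view

cca = CCA(n_components=20, max_iter=1000).fit(X_en, X_fr)
U, V = cca.transform(X_en, X_fr)                     # shared multilingual space

# Word 0's nearest French neighbour in the shared space should be its translation.
dists = np.linalg.norm(V - U[0], axis=1)
print(int(dists.argmin()))                           # expected: 0
```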
Deep Mixture Point Processes: Spatio-temporal Event Prediction with Rich Contextual Information
Title | Deep Mixture Point Processes: Spatio-temporal Event Prediction with Rich Contextual Information |
Authors | Maya Okawa, Tomoharu Iwata, Takeshi Kurashima, Yusuke Tanaka, Hiroyuki Toda, Naonori Ueda |
Abstract | Predicting when and where events will occur in cities, like taxi pick-ups, crimes, and vehicle collisions, is a challenging and important problem with many applications in fields such as urban planning, transportation optimization and location-based marketing. Though many point processes have been proposed to model events in a continuous spatio-temporal space, none of them allow for the consideration of the rich contextual factors that affect event occurrence, such as weather, social activities, geographical characteristics, and traffic. In this paper, we propose \textsf{DMPP} (Deep Mixture Point Processes), a point process model for predicting spatio-temporal events with the use of rich contextual information; a key advance is its incorporation of the heterogeneous and high-dimensional context available in image and text data. Specifically, we design the intensity of our point process model as a mixture of kernels, where the mixture weights are modeled by a deep neural network. This formulation allows us to automatically learn the complex nonlinear effects of the contextual factors on event occurrence. At the same time, this formulation makes analytical integration over the intensity, which is required for point process estimation, tractable. We use real-world data sets from different domains to demonstrate that DMPP has better predictive performance than existing methods. |
Tasks | Point Processes |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.08952v1 |
https://arxiv.org/pdf/1906.08952v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-mixture-point-processes-spatio-temporal |
Repo | |
Framework | |
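A hedged sketch of the intensity parameterisation: a mixture of fixed Gaussian kernels centred at representative space-time points, with non-negative mixture weights produced by a small neural network from contextual features. The kernel centres, the network, and the context vector are toy placeholders; the paper's tractable integration and training procedure are omitted.

```python
# Hedged sketch: lambda(t, x, y) = sum_j w_j(context) * k((t, x, y), centre_j),
# with w_j produced by a small network. Placeholders throughout.
import torch
import torch.nn as nn

n_kernels, ctx_dim = 16, 8
centres = torch.rand(n_kernels, 3)                     # (t, x, y) representative points
bandwidth = 0.1
weight_net = nn.Sequential(nn.Linear(ctx_dim, 32), nn.ReLU(),
                           nn.Linear(32, n_kernels), nn.Softplus())

def intensity(points, context):
    w = weight_net(context)                            # (n_kernels,) non-negative weights
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    k = torch.exp(-d2 / (2 * bandwidth ** 2))          # Gaussian kernels, (N, n_kernels)
    return k @ w

context = torch.randn(ctx_dim)                          # e.g. weather / traffic features
query = torch.rand(5, 3)
print(intensity(query, context))
```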
FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary
Title | FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary |
Authors | Yingzhen Yang, Nebojsa Jojic, Jun Huan |
Abstract | We present a novel method of compression of deep Convolutional Neural Networks (CNNs). Our method reduces the number of parameters of each convolutional layer by learning a 3D tensor termed the Filter Summary (FS). The convolutional filters are extracted from the FS as overlapping 3D blocks, and nearby filters in the FS share weights in their overlapping regions in a natural way. The resulting neural network based on this weight-sharing scheme, termed Filter Summary CNN or FSNet, has an FS in each convolution layer instead of a set of independent filters as in a conventional convolution layer. FSNet has the same architecture as the baseline CNN to be compressed, and each convolution layer of FSNet generates the same number of filters from the FS as the baseline CNN in the forward process. Without hurting the inference speed, the parameter space of FSNet is much smaller than that of the baseline CNN. In addition, FSNet is compatible with weight quantization, leading to an even higher compression ratio when combined with weight quantization. Experiments demonstrate the effectiveness of FSNet in the compression of CNNs for computer vision tasks including image classification and object detection. For the classification task, FSNet with 0.22M effective parameters has a prediction accuracy of 93.91% on the CIFAR-10 dataset with less than a 0.3% accuracy drop, using ResNet-18 with 11.18M parameters as the baseline. Furthermore, the FSNet version of ResNet-50 with 2.75M effective parameters achieves top-1 and top-5 accuracies of 63.80% and 85.72% respectively on the ILSVRC-12 benchmark. For the object detection task, FSNet is used to compress the Single Shot MultiBox Detector (SSD300) of 26.32M parameters. FSNet with 0.45M effective parameters achieves an mAP of 67.63% on the VOC2007 test data with weight quantization, and FSNet with 0.68M effective parameters achieves an mAP of 70.00% with weight quantization on the same test data. |
Tasks | Image Classification, Object Detection, Quantization |
Published | 2019-02-08 |
URL | http://arxiv.org/abs/1902.03264v2 |
http://arxiv.org/pdf/1902.03264v2.pdf | |
PWC | https://paperswithcode.com/paper/fsnet-compression-of-deep-convolutional |
Repo | |
Framework | |
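A hedged sketch of the filter-summary idea: a layer stores one shared tensor and slices its convolutional filters out of it as overlapping windows, so nearby filters share weights and far fewer parameters are stored than in a conventional layer. The summary shape, stride rule, and sizes below are illustrative, not the paper's exact construction.

```python
# Hedged sketch: extract overlapping filters from a shared Filter Summary tensor.
# Shapes and stride rule are illustrative only.
import torch
import torch.nn.functional as F

c_in, c_out, k, stride = 16, 32, 3, 4
summary_len = (c_out - 1) * stride + c_in               # length of the FS "spine"
fs = torch.randn(summary_len, k, k, requires_grad=True) # the shared Filter Summary

def filters_from_summary(fs, c_in, c_out, stride):
    """Extract c_out overlapping (c_in, k, k) filters from the summary tensor."""
    return torch.stack([fs[i * stride: i * stride + c_in] for i in range(c_out)])

weight = filters_from_summary(fs, c_in, c_out, stride)  # (c_out, c_in, k, k)
x = torch.randn(1, c_in, 8, 8)
y = F.conv2d(x, weight, padding=1)
print(weight.shape, y.shape, fs.numel(), weight.numel())  # far fewer stored parameters
```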
Group Pruning using a Bounded-Lp norm for Group Gating and Regularization
Title | Group Pruning using a Bounded-Lp norm for Group Gating and Regularization |
Authors | Chaithanya Kumar Mummadi, Tim Genewein, Dan Zhang, Thomas Brox, Volker Fischer |
Abstract | Deep neural networks achieve state-of-the-art results on several tasks while increasing in complexity. It has been shown that neural networks can be pruned during training by imposing sparsity-inducing regularizers. In this paper, we investigate two techniques for group-wise pruning during training in order to improve network efficiency. We propose a gating factor after every convolutional layer to induce channel-level sparsity, encouraging insignificant channels to become exactly zero. Further, we introduce and analyse a bounded variant of the L1 regularizer, which interpolates between the L1 and L0 norms to retain the performance of the network at higher pruning rates. To underline the effectiveness of the proposed methods, we show that the number of parameters of ResNet-164, DenseNet-40 and MobileNetV2 can be reduced by 30%, 69% and 75% on CIFAR-100 respectively without a significant drop in accuracy. We achieve state-of-the-art pruning results for ResNet-50 with higher accuracy on ImageNet. Furthermore, we show that the lightweight MobileNetV2 can be compressed further on ImageNet without a significant drop in performance. |
Tasks | |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03463v1 |
https://arxiv.org/pdf/1908.03463v1.pdf | |
PWC | https://paperswithcode.com/paper/group-pruning-using-a-bounded-lp-norm-for |
Repo | |
Framework | |
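A hedged sketch of channel gating with a bounded sparsity penalty: each output channel is scaled by a learnable gate, and a saturating surrogate |g| / (|g| + beta), one possible bounded variant of L1 that behaves like L1 near zero and like L0 for large gates, pushes insignificant channels toward exactly zero so they can be pruned. The specific penalty and hyperparameters are assumptions, not the paper's exact regularizer.

```python
# Hedged sketch: per-channel gate after a convolution plus a bounded (saturating)
# sparsity penalty on the gates. Illustrative, not the paper's exact formulation.
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)
        self.gate = nn.Parameter(torch.ones(c_out))      # one gate per output channel

    def forward(self, x):
        return self.conv(x) * self.gate.view(1, -1, 1, 1)

def bounded_l1(gates, beta=0.5):
    a = gates.abs()
    return (a / (a + beta)).sum()                        # L1-like near 0, L0-like for large g

layer = GatedConv(3, 8)
x = torch.randn(2, 3, 16, 16)
loss = layer(x).pow(2).mean() + 1e-3 * bounded_l1(layer.gate)
loss.backward()
print(loss.item(), layer.gate.grad.abs().mean().item())  # gates receive sparsity gradient
```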