January 28, 2020

3150 words 15 mins read

Paper Group ANR 831

Analyzing Adversarial Attacks Against Deep Learning for Intrusion Detection in IoT Networks

Title Analyzing Adversarial Attacks Against Deep Learning for Intrusion Detection in IoT Networks
Authors Olakunle Ibitoye, Omair Shafiq, Ashraf Matrawy
Abstract Adversarial attacks have been widely studied in the field of computer vision, but their impact on network security applications remains an area of open research. As IoT, 5G and AI continue to converge to realize the promise of the fourth industrial revolution (Industry 4.0), security incidents and events on IoT networks have increased. Deep learning techniques are being applied to detect and mitigate many such security threats against IoT networks. Feedforward Neural Networks (FNN) have been widely used for classifying intrusion attacks in IoT networks. In this paper, we consider a variant of the FNN known as the Self-normalizing Neural Network (SNN) and compare its performance with the FNN for classifying intrusion attacks in an IoT network. Our analysis is performed using the BoT-IoT dataset from the Cyber Range Lab of UNSW Canberra Cyber. In our experimental results, the FNN outperforms the SNN for intrusion detection in IoT networks based on multiple performance metrics such as accuracy, precision, and recall, as well as multi-classification metrics such as Cohen’s Kappa score. However, when tested for adversarial robustness, the SNN demonstrates better resilience against adversarial samples from the IoT dataset, presenting a promising future in the quest for safer and more secure deep learning in IoT networks.
Tasks Intrusion Detection
Published 2019-05-13
URL https://arxiv.org/abs/1905.05137v1
PDF https://arxiv.org/pdf/1905.05137v1.pdf
PWC https://paperswithcode.com/paper/analyzing-adversarial-attacks-against-deep
Repo
Framework
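
A minimal sketch of the two model families the paper compares, assuming PyTorch and illustrative layer sizes (not the authors’ exact architecture or hyperparameters): a plain feedforward network with ReLU versus a self-normalizing network with SELU activations and AlphaDropout.

```python
import torch
import torch.nn as nn

def make_fnn(n_features, n_classes):
    # plain feedforward baseline with ReLU and standard dropout
    return nn.Sequential(
        nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(0.2),
        nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.2),
        nn.Linear(64, n_classes),
    )

def make_snn(n_features, n_classes):
    # SELU keeps activations roughly zero-mean/unit-variance;
    # AlphaDropout preserves the self-normalizing property.
    return nn.Sequential(
        nn.Linear(n_features, 128), nn.SELU(), nn.AlphaDropout(0.05),
        nn.Linear(128, 64), nn.SELU(), nn.AlphaDropout(0.05),
        nn.Linear(64, n_classes),
    )

x = torch.randn(32, 40)           # stand-in batch of 40-dimensional flow features
print(make_fnn(40, 5)(x).shape)   # torch.Size([32, 5])
print(make_snn(40, 5)(x).shape)   # torch.Size([32, 5])
```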

Automatic cough detection based on airflow signals for portable spirometry system

Title Automatic cough detection based on airflow signals for portable spirometry system
Authors Mateusz Soliński, Michał Łepek, Łukasz Kołtowski
Abstract We give a short introduction to cough detection efforts undertaken during the last decade and describe the solution for automatic cough detection developed for the AioCare portable spirometry system. In contrast to the more popular analysis of sound and audio recordings, we based our approach entirely on airflow signals. As the system is intended to be used in a large variety of environments and with different patients, we trained and validated the algorithm using AioCare-collected data and the large database of spirometry curves from the NHANES database by the American National Center for Health Statistics. We trained different classifiers, such as logistic regression, a feed-forward artificial neural network, a support vector machine, and a random forest, to choose the one with the best performance. The ANN solution was selected as the final classifier. The classification results on the test set (AioCare data) are: 0.86 (sensitivity), 0.91 (specificity), 0.91 (accuracy) and 0.88 (F1 score). The classification methodology developed in this study is robust for detecting cough events during spirometry measurements. As far as we know, the solution presented in this work is the first fully reproducible description of an automatic cough detection algorithm based entirely on airflow signals, and the first published cough detection implemented in a commercial spirometry system.
Tasks
Published 2019-02-26
URL https://arxiv.org/abs/1903.03588v4
PDF https://arxiv.org/pdf/1903.03588v4.pdf
PWC https://paperswithcode.com/paper/automatic-cough-detection-for-portable
Repo
Framework
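
A hedged sketch of the classifier-comparison step using scikit-learn; the random features below are stand-ins, not the AioCare airflow features or the actual preprocessing.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))        # stand-in for airflow-window features
y = rng.integers(0, 2, size=500)      # 1 = cough event, 0 = no cough

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "ann": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000),
    "svm": SVC(),
    "rf": RandomForestClassifier(n_estimators=200),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: F1 = {scores.mean():.2f}")
```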

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

Title OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits
Authors Niladri S. Chatterji, Vidya Muthukumar, Peter L. Bartlett
Abstract We consider the stochastic linear (multi-armed) contextual bandit problem with the possibility of hidden \textit{simple multi-armed bandit} structure in which the rewards are independent of the contextual information. Algorithms that are designed solely for one of the regimes are known to be sub-optimal for their alternate regime. We design a single computationally efficient algorithm that simultaneously obtains problem-dependent optimal regret rates in the simple multi-armed bandit regime and minimax optimal regret rates in the linear contextual bandit regime, without knowing a priori which of the two models generates the rewards. These results are proved under the condition of stochasticity of contextual information over multiple rounds. Our results should be viewed as a step towards principled data-dependent policy class selection for contextual bandits.
Tasks Multi-Armed Bandits
Published 2019-05-24
URL https://arxiv.org/abs/1905.10040v3
PDF https://arxiv.org/pdf/1905.10040v3.pdf
PWC https://paperswithcode.com/paper/osom-a-simultaneously-optimal-algorithm-for
Repo
Framework
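
A toy illustration of the model-selection idea the abstract describes, not the OSOM algorithm or its statistical test: play a simple UCB strategy, keep a ridge-regression fit of rewards on contexts, and switch to a contextual strategy if the linear model has been predicting rewards markedly better. The switching rule and constants here are ad hoc.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, T = 5, 4, 2000
theta = rng.normal(size=d)                 # unknown; would be ~0 in the simple regime

counts, means = np.zeros(K), np.zeros(K)
A, b = np.eye(d), np.zeros(d)              # ridge statistics for the linear model
err_simple = err_linear = 0.0
use_contextual = False

for t in range(1, T + 1):
    ctx = rng.normal(size=(K, d))          # one context vector per arm
    theta_hat = np.linalg.solve(A, b)
    if use_contextual:
        arm = int(np.argmax(ctx @ theta_hat))            # greedy contextual choice
    else:
        ucb = means + np.sqrt(2 * np.log(t) / np.maximum(counts, 1e-9))
        arm = int(np.argmax(ucb))                        # classic UCB choice
    reward = ctx[arm] @ theta + rng.normal(scale=0.1)
    # track how well each model has been predicting the observed rewards
    err_simple += (reward - means[arm]) ** 2
    err_linear += (reward - ctx[arm] @ theta_hat) ** 2
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]
    A += np.outer(ctx[arm], ctx[arm]); b += reward * ctx[arm]
    if not use_contextual and t > 100 and err_linear + 10.0 < err_simple:
        use_contextual = True                            # ad hoc switching rule

print("switched to contextual model:", use_contextual)
```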

SliceNDice: Mining Suspicious Multi-attribute Entity Groups with Multi-view Graphs

Title SliceNDice: Mining Suspicious Multi-attribute Entity Groups with Multi-view Graphs
Authors Hamed Nilforoshan, Neil Shah
Abstract Given the reach of web platforms, bad actors have considerable incentives to manipulate and defraud users at the expense of platform integrity. This has spurred research in numerous suspicious behavior detection tasks, including detection of sybil accounts, false information, and payment scams/fraud. In this paper, we draw the insight that many such initiatives can be tackled in a common framework by posing a detection task which seeks to find groups of entities which share too many properties with one another across multiple attributes (sybil accounts created at the same time and location, propaganda spreaders broadcasting articles with the same rhetoric and with similar reshares, etc.). Our work makes four core contributions: Firstly, we posit a novel formulation of this task as a multi-view graph mining problem, in which distinct views reflect distinct attribute similarities across entities, and contextual similarity and attribute importance are respected. Secondly, we propose a novel suspiciousness metric for scoring entity groups given the abnormality of their synchronicity across multiple views, which obeys intuitive desiderata that existing metrics do not. Thirdly, we propose the SliceNDice algorithm, which enables efficient extraction of highly suspicious entity groups. Finally, we demonstrate its practicality in production, in terms of strong detection performance and discoveries on Snapchat’s large advertiser ecosystem (89% precision and numerous discoveries of real fraud rings), marked outperformance of baselines (over 97% precision/recall in simulated settings), and linear scalability.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.07087v2
PDF https://arxiv.org/pdf/1908.07087v2.pdf
PWC https://paperswithcode.com/paper/silcendice-mining-suspicious-multi-attribute
Repo
Framework
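
A hedged sketch of the multi-view graph construction described above (the suspiciousness metric and the SliceNDice search itself are omitted): each attribute induces one view, and the edge weight between two entities in a view counts the attribute values they share.

```python
from collections import defaultdict
from itertools import combinations

# toy entities with two attributes; values are hypothetical
entities = {
    "acct1": {"signup_city": {"NYC"}, "article": {"a", "b"}},
    "acct2": {"signup_city": {"NYC"}, "article": {"a", "b", "c"}},
    "acct3": {"signup_city": {"LA"},  "article": {"d"}},
}
attributes = ["signup_city", "article"]

views = {attr: defaultdict(int) for attr in attributes}
for u, v in combinations(entities, 2):
    for attr in attributes:
        shared = entities[u][attr] & entities[v][attr]
        if shared:
            views[attr][(u, v)] = len(shared)   # edge weight = shared values in this view

# Groups whose members are densely connected across *multiple* views
# (e.g. acct1/acct2 here) are the candidates a suspiciousness score would rank.
print(dict(views["signup_city"]), dict(views["article"]))
```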

AdvGAN++ : Harnessing latent layers for adversary generation

Title AdvGAN++ : Harnessing latent layers for adversary generation
Authors Puneet Mangla, Surgan Jandial, Sakshi Varshney, Vineeth N Balasubramanian
Abstract Adversarial examples are fabricated examples, indistinguishable from the original images, that mislead neural networks and drastically lower their performance. The recently proposed AdvGAN, a GAN-based approach, takes the input image as a prior for generating adversaries to target a model. In this work, we show how latent features can serve as better priors than input images for adversary generation by proposing AdvGAN++, a version of AdvGAN that achieves higher attack rates than AdvGAN and at the same time generates perceptually realistic images on the MNIST and CIFAR-10 datasets.
Tasks
Published 2019-08-02
URL https://arxiv.org/abs/1908.00706v2
PDF https://arxiv.org/pdf/1908.00706v2.pdf
PWC https://paperswithcode.com/paper/advgan-harnessing-latent-layers-for-adversary
Repo
Framework
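
A rough sketch of the core idea, assuming PyTorch and toy MNIST-sized shapes (not the authors’ architecture or training loop): the generator consumes the target model’s latent features rather than the raw image and is trained to produce images that fool the target; the GAN realism loss is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TargetNet(nn.Module):                      # stand-in target classifier
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, 10)
    def features(self, x):                       # latent features used as the prior
        return F.relu(self.conv(x))
    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class Generator(nn.Module):                      # maps latent features -> adversarial image
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(8, 1, 3, padding=1), nn.Tanh())
    def forward(self, feats):
        return self.net(feats)

target, gen = TargetNet(), Generator()
x = torch.rand(16, 1, 28, 28)
y = torch.randint(0, 10, (16,))
x_adv = gen(target.features(x))
# adversarial loss: push the target's prediction away from the true label;
# a full implementation adds the discriminator's realism loss as well.
adv_loss = -F.cross_entropy(target(x_adv), y)
adv_loss.backward()
```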

Detecting Cybersecurity Events from Noisy Short Text

Title Detecting Cybersecurity Events from Noisy Short Text
Authors Semih Yagcioglu, Mehmet Saygin Seyfioglu, Begum Citamak, Batuhan Bardak, Seren Guldamlasioglu, Azmi Yuksel, Emin Islam Tatli
Abstract Analyzing messages shared over social networks is critical for cyber threat intelligence and cyber-crime prevention. In this study, we propose a method that leverages both domain-specific word embeddings and task-specific features to detect cyber security events from tweets. Our model employs a convolutional neural network (CNN) and a long short-term memory (LSTM) recurrent neural network which takes word-level meta-embeddings as inputs and incorporates contextual embeddings to classify noisy short text. We collected a new dataset of cyber security related tweets from Twitter and manually annotated a subset of 2K of them. We experimented with this dataset and concluded that the proposed model outperforms both traditional and neural baselines. The results suggest that our method works well for detecting cyber security events from noisy short text.
Tasks Word Embeddings
Published 2019-04-10
URL https://arxiv.org/abs/1904.05054v2
PDF https://arxiv.org/pdf/1904.05054v2.pdf
PWC https://paperswithcode.com/paper/detecting-cybersecurity-events-from-noisy
Repo
Framework
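
A minimal sketch of a CNN + LSTM classifier over word embeddings for short, noisy text, assuming PyTorch and illustrative dimensions; the authors’ model additionally uses meta-embeddings and task-specific contextual features.

```python
import torch
import torch.nn as nn

class EventDetector(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=100, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, 64, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(64, 64, batch_first=True)
        self.out = nn.Linear(64, n_classes)

    def forward(self, tokens):                         # tokens: (batch, seq_len) word ids
        e = self.emb(tokens)                           # (batch, seq, emb_dim)
        c = torch.relu(self.conv(e.transpose(1, 2)))   # convolution over time
        _, (h, _) = self.lstm(c.transpose(1, 2))       # LSTM over conv features
        return self.out(h[-1])                         # logits: event vs. no event

model = EventDetector()
print(model(torch.randint(0, 20000, (8, 30))).shape)   # torch.Size([8, 2])
```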

A Scalable Learned Index Scheme in Storage Systems

Title A Scalable Learned Index Scheme in Storage Systems
Authors Pengfei Li, Yu Hua, Pengfei Zuo, Jingnan Jia
Abstract Index structures are important for efficient data access and have been widely used to improve performance in many in-memory systems. Due to high in-memory overheads, traditional index structures struggle to keep up with the explosive growth of data, let alone provide low latency and high throughput with limited system resources. Promising learned indexes leverage deep-learning models to complement existing index structures and obtain significant memory savings. However, learned indexes fail to scale due to heavy inter-model dependency and expensive retraining. To address these problems, we propose a scalable learned index scheme, AIDEL, which constructs different linear regression models according to the data distribution. Moreover, the models are independent of one another, which reduces the complexity of retraining and makes it easy to partition and store the data in different pages, blocks, or distributed systems. Our experimental results show that, compared with state-of-the-art schemes, AIDEL improves insertion performance by about 2$\times$ and provides comparable lookup performance, while efficiently supporting scalability.
Tasks
Published 2019-05-08
URL https://arxiv.org/abs/1905.06256v1
PDF https://arxiv.org/pdf/1905.06256v1.pdf
PWC https://paperswithcode.com/paper/190506256
Repo
Framework
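
A hedged sketch of a segmented learned index in the spirit described above (illustrative, not the AIDEL implementation): sorted keys are split into independent segments, each with its own linear model mapping key to position and a per-segment error bound that limits the final local search.

```python
import numpy as np

keys = np.sort(np.random.default_rng(0).uniform(0, 1e6, size=10000))
seg_size = 1000
segments = []                                        # (lo_key, slope, intercept, max_err)
for start in range(0, len(keys), seg_size):
    seg = keys[start:start + seg_size]
    pos = np.arange(start, start + len(seg))
    slope, intercept = np.polyfit(seg, pos, deg=1)   # per-segment least-squares line
    err = np.max(np.abs(slope * seg + intercept - pos))
    segments.append((seg[0], slope, intercept, int(np.ceil(err))))

def lookup(key):
    # pick the segment whose first key is <= key, then predict and search locally
    idx = max(0, np.searchsorted([s[0] for s in segments], key, side="right") - 1)
    _, slope, intercept, max_err = segments[idx]
    guess = int(round(slope * key + intercept))
    lo, hi = max(0, guess - max_err), min(len(keys), guess + max_err + 1)
    return lo + int(np.searchsorted(keys[lo:hi], key))   # narrow binary search

q = keys[1234]
print(lookup(q), np.searchsorted(keys, q))           # both should print 1234
```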

NETR-Tree: An Efficient Framework for Social-Based Time-Aware Spatial Keyword Query

Title NETR-Tree: An Efficient Framework for Social-Based Time-Aware Spatial Keyword Query
Authors Zhixian Yang, Yuanning Gao, Xiaofeng Gao, Guihai Chen
Abstract The prevalence of social media and the development of geo-positioning technology stimulate the growth of location-based social networks (LBSNs). With a large volume of data containing locations, texts, check-in information, and social relationships, spatial keyword queries in LBSNs have become increasingly complex. In this paper, we identify and solve the Social-based Time-aware Spatial Keyword Query (STSKQ), which returns the top-k objects by taking geo-spatial score, keyword similarity, visiting time score, and social relationship effect into consideration. To tackle STSKQ, we propose a two-layer hybrid index structure called Network Embedding Time-aware R-tree (NETR-tree). In the user layer, we exploit a network embedding strategy to measure relationship effect in users’ relationship network. In the location layer, we build a Time-aware R-tree (TR-tree), which considers spatial objects’ spatio-temporal check-in information. On the basis of NETR-tree, a corresponding query processing algorithm is presented. Finally, extensive experiments on real data collected from two different real-life LBSNs demonstrate the effectiveness and efficiency of the proposed methods compared with existing state-of-the-art methods.
Tasks Network Embedding
Published 2019-08-26
URL https://arxiv.org/abs/1908.09520v1
PDF https://arxiv.org/pdf/1908.09520v1.pdf
PWC https://paperswithcode.com/paper/netr-tree-an-eifficient-framework-for-social
Repo
Framework
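
An illustrative scoring function only; the actual score definitions, weights, and the NETR-tree index structure are the paper’s. It combines geo proximity, keyword similarity, a visiting-time score, and a social term from user-embedding similarity, as the abstract lists.

```python
import math

def combined_score(query, obj, user_vec, friend_vecs, alpha=(0.3, 0.3, 0.2, 0.2)):
    # geo score: decays with Euclidean distance
    geo = 1.0 / (1.0 + math.dist(query["loc"], obj["loc"]))
    # keyword similarity: Jaccard overlap of keyword sets
    kw = len(query["kw"] & obj["kw"]) / len(query["kw"] | obj["kw"])
    # time score: fraction of the object's check-ins inside the query window
    lo, hi = query["time_window"]
    time = sum(lo <= t <= hi for t in obj["checkin_hours"]) / max(len(obj["checkin_hours"]), 1)
    # social score: cosine similarity (2-D toy embeddings) to the closest friend
    def cos(a, b):
        na, nb = math.hypot(*a), math.hypot(*b)
        return (a[0] * b[0] + a[1] * b[1]) / (na * nb) if na and nb else 0.0
    social = max((cos(user_vec, f) for f in friend_vecs), default=0.0)
    return sum(w * s for w, s in zip(alpha, (geo, kw, time, social)))

query = {"loc": (0.0, 0.0), "kw": {"coffee", "wifi"}, "time_window": (9, 12)}
obj = {"loc": (0.5, 0.2), "kw": {"coffee"}, "checkin_hours": [10, 11, 20]}
print(combined_score(query, obj, (1.0, 0.0), [(0.9, 0.1), (-1.0, 0.0)]))
```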

Animal Wildlife Population Estimation Using Social Media Images Collections

Title Animal Wildlife Population Estimation Using Social Media Images Collections
Authors Matteo Foglio, Lorenzo Semeria, Guido Muscioni, Riccardo Pressiani, Tanya Berger-Wolf
Abstract We are losing biodiversity at an unprecedented scale, and in many cases we do not even know the basic data for the species. Traditional methods for wildlife monitoring are inadequate. The development of new computer vision tools enables the use of images as a source of information about wildlife. Social media is a rich source of wildlife images, but these images come with a huge bias, thwarting traditional population size estimation approaches. Here, we present a new framework that takes the social media bias into account when using this data source to provide wildlife population size estimates. We show that, surprisingly, this is a learnable and potentially solvable problem.
Tasks
Published 2019-08-05
URL https://arxiv.org/abs/1908.01875v2
PDF https://arxiv.org/pdf/1908.01875v2.pdf
PWC https://paperswithcode.com/paper/animal-wildlife-population-estimation-using
Repo
Framework

“Best-of-Many-Samples” Distribution Matching

Title “Best-of-Many-Samples” Distribution Matching
Authors Apratim Bhattacharyya, Mario Fritz, Bernt Schiele
Abstract Generative Adversarial Networks (GANs) can achieve state-of-the-art sample quality in generative modelling tasks but suffer from the mode collapse problem. Variational Autoencoders (VAEs), on the other hand, explicitly maximize a reconstruction-based data log-likelihood, forcing them to cover all modes, but suffer from poorer sample quality. Recent works have proposed hybrid VAE-GAN frameworks which integrate a GAN-based synthetic likelihood into the VAE objective to address both the mode collapse and sample quality issues, with limited success. This is because the VAE objective forces a trade-off between the data log-likelihood and divergence to the latent prior. The synthetic likelihood ratio term also shows instability during training. We propose a novel objective with a “Best-of-Many-Samples” reconstruction cost and a stable direct estimate of the synthetic likelihood. This enables our hybrid VAE-GAN framework to achieve high data log-likelihood and low divergence to the latent prior at the same time, and shows significant improvement over both hybrid VAE-GANs and plain GANs in mode coverage and quality.
Tasks
Published 2019-09-27
URL https://arxiv.org/abs/1909.12598v1
PDF https://arxiv.org/pdf/1909.12598v1.pdf
PWC https://paperswithcode.com/paper/best-of-many-samples-distribution-matching-1
Repo
Framework
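
A sketch of the “Best-of-Many-Samples” reconstruction term, assuming PyTorch; the encoder, decoder, KL term, and the GAN-based synthetic-likelihood part are omitted. Several latent samples are drawn per input and only the closest reconstruction contributes to the loss.

```python
import torch

def best_of_many_recon_loss(x, decoder, mu, logvar, n_samples=10):
    """x: (batch, dim); mu/logvar: encoder outputs, (batch, latent_dim)."""
    losses = []
    for _ in range(n_samples):
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        recon = decoder(z)
        losses.append(((recon - x) ** 2).sum(dim=1))           # per-example MSE
    losses = torch.stack(losses, dim=0)                        # (n_samples, batch)
    return losses.min(dim=0).values.mean()                     # best sample per example

# toy usage with a linear "decoder"
decoder = torch.nn.Linear(8, 32)
x = torch.randn(16, 32)
mu, logvar = torch.randn(16, 8), torch.zeros(16, 8)
print(best_of_many_recon_loss(x, decoder, mu, logvar))
```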

Visual Localization Using Sparse Semantic 3D Map

Title Visual Localization Using Sparse Semantic 3D Map
Authors Tianxin Shi, Shuhan Shen, Xiang Gao, Lingjie Zhu
Abstract Accurate and robust visual localization under a wide range of viewing condition variations, including season and illumination changes as well as weather and day-night variations, is the key component for many computer vision and robotics applications. Under these conditions, most traditional methods would fail to locate the camera. In this paper we present a visual localization algorithm that combines a structure-based method and an image-based method with semantic information. Given semantic information about the query and database images, the retrieved images are scored according to the semantic consistency of the 3D model and the query image. The semantic matching score is then used as the weight for RANSAC’s sampling, and the pose is solved by a standard PnP solver. Experiments on the challenging long-term visual localization benchmark dataset demonstrate that our method has significant improvement compared with the state of the art.
Tasks Visual Localization
Published 2019-04-08
URL https://arxiv.org/abs/1904.03803v2
PDF https://arxiv.org/pdf/1904.03803v2.pdf
PWC https://paperswithcode.com/paper/visual-localization-using-sparse-semantic-3d
Repo
Framework
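
A hedged sketch of semantically weighted RANSAC with a PnP solver, assuming OpenCV and synthetic correspondences; the semantic-consistency scores below are random stand-ins and the thresholds are illustrative, not the paper’s.

```python
import numpy as np
import cv2

rng = np.random.default_rng(0)
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
dist = np.zeros(5)
pts3d = rng.uniform(-1, 1, size=(100, 3)) + np.array([0, 0, 5.0])
proj, _ = cv2.projectPoints(pts3d, np.zeros(3), np.zeros(3), K, dist)
pts2d = proj.reshape(-1, 2) + rng.normal(scale=0.5, size=(100, 2))
sem_weight = rng.uniform(0.1, 1.0, size=100)     # stand-in semantic-consistency scores

best_inliers, best_pose = -1, None
p = sem_weight / sem_weight.sum()
for _ in range(200):
    idx = rng.choice(100, size=6, replace=False, p=p)   # semantically weighted sample
    ok, rvec, tvec = cv2.solvePnP(pts3d[idx], pts2d[idx], K, dist,
                                  flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        continue
    reproj, _ = cv2.projectPoints(pts3d, rvec, tvec, K, dist)
    errs = np.linalg.norm(reproj.reshape(-1, 2) - pts2d, axis=1)
    inliers = int((errs < 3.0).sum())                   # reprojection-error inlier test
    if inliers > best_inliers:
        best_inliers, best_pose = inliers, (rvec, tvec)

print("inliers:", best_inliers)
```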

Mixup of Feature Maps in a Hidden Layer for Training of Convolutional Neural Network

Title Mixup of Feature Maps in a Hidden Layer for Training of Convolutional Neural Network
Authors Hideki Oki, Takio Kurita
Abstract The deep Convolutional Neural Network (CNN) has become very popular as a fundamental technique for image classification and object recognition. To improve recognition accuracy on more complex tasks, deeper networks have been introduced. However, the recognition accuracy of a trained deep CNN drastically decreases for samples obtained from outside the region of the training samples. To improve the generalization ability for such samples, Krizhevsky et al. proposed to generate additional samples through transformations of the existing samples, making the training set richer. This method is known as data augmentation. Hongyi Zhang et al. introduced a data augmentation method called mixup which achieves state-of-the-art performance on various datasets. Mixup generates new samples by mixing two different training samples. Mixing of the two images is implemented with simple image morphing. In this paper, we propose to apply mixup to the feature maps in a hidden layer. To implement mixup in a hidden layer, we use a Siamese network or triplet network architecture to mix feature maps. From the experimental comparison, it is observed that mixup of the feature maps obtained from the first convolution layer is more effective than the original image mixup.
Tasks Data Augmentation, Image Classification, Image Morphing
Published 2019-06-24
URL https://arxiv.org/abs/1906.09739v1
PDF https://arxiv.org/pdf/1906.09739v1.pdf
PWC https://paperswithcode.com/paper/mixup-of-feature-maps-in-a-hidden-layer-for
Repo
Framework
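
A sketch of mixing feature maps after the first convolution layer, assuming PyTorch; this simplifies the paper’s Siamese/triplet formulation into a single shared layer, and the network, data, and Beta parameter are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(3, 16, 3, padding=1)                  # shared first layer
rest = nn.Sequential(nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                     nn.Linear(16, 10))                  # remainder of the network

xa, ya = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
xb, yb = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))

lam = torch.distributions.Beta(1.0, 1.0).sample().item()
# both inputs pass through the shared first layer, then their feature maps are mixed
mixed = lam * conv1(xa) + (1 - lam) * conv1(xb)
logits = rest(mixed)
# the loss is mixed the same way the labels are in input-space mixup
loss = lam * F.cross_entropy(logits, ya) + (1 - lam) * F.cross_entropy(logits, yb)
loss.backward()
```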

Variable Grouping Based Bayesian Additive Regression Tree

Title Variable Grouping Based Bayesian Additive Regression Tree
Authors Yuhao Su, Jie Ding
Abstract Using ensemble methods for regression has been highly successful in obtaining high-accuracy predictions. Examples are bagging, random forests, boosting, BART (Bayesian additive regression trees), and their variants. In this paper, we propose a new perspective, named variable grouping, to enhance predictive performance. The main idea is to seek a potential grouping of variables such that there is no nonlinear interaction term between variables of different groups. Given a sum-of-learners model, each learner is then responsible for only one group of variables, which is more efficient for modeling nonlinear interactions. We propose a two-stage method named variable grouping based Bayesian additive regression tree (GBART), with a well-developed Python package, gbart, available. The first stage searches for potential interactions and an appropriate grouping of variables. The second stage builds the final model based on the discovered groups. Experiments on synthetic and real data show that the proposed method can perform significantly better than classical approaches.
Tasks
Published 2019-11-03
URL https://arxiv.org/abs/1911.00922v2
PDF https://arxiv.org/pdf/1911.00922v2.pdf
PWC https://paperswithcode.com/paper/variable-grouping-based-bayesian-additive
Repo
Framework
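
An illustrative sketch of the sum-of-learners idea behind variable grouping (not the GBART algorithm or the gbart package): each learner sees only its own group’s columns and fits the residual left by the learners before it, so no learner models cross-group interactions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
# true function: an x0*x1 interaction plus a separate x2*x3 interaction
y = X[:, 0] * X[:, 1] + np.sin(X[:, 2] * X[:, 3]) + rng.normal(scale=0.1, size=1000)

groups = [[0, 1], [2, 3]]          # a grouping with no cross-group interactions
learners, residual = [], y.copy()
for g in groups:
    m = GradientBoostingRegressor().fit(X[:, g], residual)   # fit this group's learner
    residual -= m.predict(X[:, g])                           # pass residual to the next
    learners.append((g, m))

pred = sum(m.predict(X[:, g]) for g, m in learners)          # sum-of-learners prediction
print("train RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```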

The power of dynamic social networks to predict individuals’ mental health

Title The power of dynamic social networks to predict individuals’ mental health
Authors Shikang Liu, David Hachen, Omar Lizardo, Christian Poellabauer, Aaron Striegel, Tijana Milenkovic
Abstract Precision medicine has received attention both in and outside the clinic. We focus on the latter, by exploiting the relationship between individuals’ social interactions and their mental health to develop a predictive model of one’s likelihood to be depressed or anxious from rich dynamic social network data. To our knowledge, we are the first to do this. Existing studies differ from our work in at least one aspect: they do not model social interaction data as a network; they do so but analyze static network data; they examine “correlation” between social networks and health but without developing a predictive model; or they study other individual traits but not mental health. In a systematic and comprehensive evaluation, we show that our predictive model that uses dynamic social network data is superior to its static network as well as non-network equivalents when run on the same data.
Tasks
Published 2019-08-06
URL https://arxiv.org/abs/1908.02614v1
PDF https://arxiv.org/pdf/1908.02614v1.pdf
PWC https://paperswithcode.com/paper/the-power-of-dynamic-social-networks-to
Repo
Framework

Natural language processing of MIMIC-III clinical notes for identifying diagnosis and procedures with neural networks

Title Natural language processing of MIMIC-III clinical notes for identifying diagnosis and procedures with neural networks
Authors Siddhartha Nuthakki, Sunil Neela, Judy W. Gichoya, Saptarshi Purkayastha
Abstract Coding diagnoses and procedures in medical records is a crucial process in the healthcare industry, underpinning accurate billing, reimbursement from payers, and standardized patient care records. In the United States, billing- and insurance-related activities cost around $471 billion in 2012, which constitutes about 25% of all U.S. hospital spending. In this paper, we report the performance of a natural language processing model that can map clinical notes to medical codes and predict final diagnoses from unstructured entries such as the history of present illness and symptoms at the time of admission. Previous studies have demonstrated that deep learning models perform better at such mapping than conventional machine learning models. Therefore, we employed the state-of-the-art deep learning method ULMFiT on the largest emergency department clinical notes dataset, MIMIC-III, which has 1.2M clinical notes, to predict the top-10 and top-50 diagnosis and procedure codes. Our models were able to predict the top-10 diagnoses and procedures with 80.3% and 80.5% accuracy, whereas the top-50 ICD-9 codes for diagnoses and procedures are predicted with 70.7% and 63.9% accuracy. Predicting diagnoses and procedures from unstructured clinical notes can help human coders save time, eliminate errors, and minimize costs. With promising scores from our present model, the next step would be to deploy it in a small-scale real-world scenario and compare it with human coders as the gold standard. We believe that further research on this approach can yield highly accurate predictions that ease the workflow in a clinical setting.
Tasks
Published 2019-12-28
URL https://arxiv.org/abs/1912.12397v1
PDF https://arxiv.org/pdf/1912.12397v1.pdf
PWC https://paperswithcode.com/paper/natural-language-processing-of-mimic-iii
Repo
Framework
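
A small hedged sketch of the label setup described above (the column names and notes are hypothetical, not the MIMIC-III schema, and ULMFiT itself is not shown): restrict notes to the k most frequent ICD-9 codes and score a classifier’s predictions on them.

```python
import pandas as pd
from sklearn.metrics import accuracy_score

notes = pd.DataFrame({
    "text": ["chest pain and shortness of breath", "fever and cough",
             "chest pain radiating to arm", "productive cough"],
    "icd9": ["410", "486", "410", "486"],
})
k = 2
top_codes = notes["icd9"].value_counts().nlargest(k).index   # k most frequent codes
subset = notes[notes["icd9"].isin(top_codes)]                # keep only those notes

preds = ["410", "486", "410", "410"]                         # stand-in model output
print("top-%d accuracy: %.2f" % (k, accuracy_score(subset["icd9"], preds)))
```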