Paper Group AWR 245
N-BaIoT: Network-based Detection of IoT Botnet Attacks Using Deep Autoencoders. Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations. Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding. Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs. Lenia - Biology of Artificial Life. The Fast and …
N-BaIoT: Network-based Detection of IoT Botnet Attacks Using Deep Autoencoders
Title | N-BaIoT: Network-based Detection of IoT Botnet Attacks Using Deep Autoencoders |
Authors | Yair Meidan, Michael Bohadana, Yael Mathov, Yisroel Mirsky, Dominik Breitenbacher, Asaf Shabtai, Yuval Elovici |
Abstract | The proliferation of IoT devices, which can be more easily compromised than desktop computers, has led to an increase in the occurrence of IoT-based botnet attacks. In order to mitigate this new threat there is a need to develop new methods for detecting attacks launched from compromised IoT devices and to differentiate between hour- and millisecond-long IoT-based attacks. In this paper we propose and empirically evaluate a novel network-based anomaly detection method which extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic emanating from compromised IoT devices. To evaluate our method, we infected nine commercial IoT devices in our lab with two of the most widely known IoT-based botnets, Mirai and BASHLITE. Our evaluation results demonstrated our proposed method’s ability to accurately and instantly detect the attacks as they were being launched from the compromised IoT devices which were part of a botnet. |
Tasks | Anomaly Detection |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03409v1 |
PDF | http://arxiv.org/pdf/1805.03409v1.pdf |
PWC | https://paperswithcode.com/paper/n-baiot-network-based-detection-of-iot-botnet |
Repo | https://github.com/sergts/botnet-traffic-analysis |
Framework | tf |
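A minimal sketch of the detection idea described in the abstract: train a deep autoencoder on benign traffic feature snapshots only, then flag traffic whose reconstruction error exceeds a threshold set on held-out benign data. The 115-dimensional input follows the published N-BaIoT dataset; the layer sizes, epoch count, and the mean-plus-3-sigma threshold rule are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
import tensorflow as tf

def build_autoencoder(dim):
    # Symmetric encoder/decoder; the bottleneck forces a compact
    # representation of benign traffic behavior.
    inp = tf.keras.Input(shape=(dim,))
    x = tf.keras.layers.Dense(int(dim * 0.75), activation="relu")(inp)
    x = tf.keras.layers.Dense(int(dim * 0.50), activation="relu")(x)
    x = tf.keras.layers.Dense(int(dim * 0.25), activation="relu")(x)  # bottleneck
    x = tf.keras.layers.Dense(int(dim * 0.50), activation="relu")(x)
    x = tf.keras.layers.Dense(int(dim * 0.75), activation="relu")(x)
    out = tf.keras.layers.Dense(dim, activation=None)(x)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

# Stand-in for real benign traffic feature snapshots.
X_benign = np.random.rand(5000, 115).astype("float32")
X_train, X_val = X_benign[:4000], X_benign[4000:]

ae = build_autoencoder(115)
ae.fit(X_train, X_train, epochs=10, batch_size=64, verbose=0)

# Threshold: mean + k*std of benign reconstruction MSE (k is a design choice).
val_err = np.mean((ae.predict(X_val, verbose=0) - X_val) ** 2, axis=1)
threshold = val_err.mean() + 3 * val_err.std()

def is_anomalous(x):
    err = np.mean((ae.predict(x[None], verbose=0)[0] - x) ** 2)
    return err > threshold
```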
Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations
Title | Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations |
Authors | Sosuke Kobayashi |
Abstract | We propose a novel data augmentation method for labeled sentences called contextual augmentation. We assume an invariance that sentences are natural even if the words in the sentences are replaced with other words that hold paradigmatic relations. We stochastically replace words with other words that are predicted by a bi-directional language model at the word positions. Words predicted according to a context are numerous yet appropriate for the augmentation of the original words. Furthermore, we retrofit a language model with a label-conditional architecture, which allows the model to augment sentences without breaking the label-compatibility. Through experiments on six different text classification tasks, we demonstrate that the proposed method improves classifiers based on convolutional or recurrent neural networks. |
Tasks | Data Augmentation, Language Modelling, Text Augmentation, Text Classification |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06201v1 |
PDF | http://arxiv.org/pdf/1805.06201v1.pdf |
PWC | https://paperswithcode.com/paper/contextual-augmentation-data-augmentation-by |
Repo | https://github.com/pfnet-research/contextual_augmentation |
Framework | none |
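A hedged sketch of the replacement step. The paper trains a label-conditional bi-directional LSTM language model; as a stand-in, this sketch uses an off-the-shelf masked language model (BERT via the HuggingFace `fill-mask` pipeline) to propose paradigmatic replacements, sampling from the top-k predictions at a randomly chosen position. The label-conditioning is omitted here.

```python
import random
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def augment(sentence, k=5):
    tokens = sentence.split()
    i = random.randrange(len(tokens))
    original = tokens[i]
    tokens[i] = fill_mask.tokenizer.mask_token
    candidates = fill_mask(" ".join(tokens), top_k=k)
    # Avoid the trivial replacement (predicting the original word back).
    choices = [c["token_str"] for c in candidates if c["token_str"] != original]
    tokens[i] = random.choice(choices) if choices else original
    return " ".join(tokens)

print(augment("the actors are fantastic in this movie"))
```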
Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding
Title | Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding |
Authors | Yutai Hou, Yijia Liu, Wanxiang Che, Ting Liu |
Abstract | In this paper, we study the problem of data augmentation for language understanding in task-oriented dialogue systems. In contrast to previous work, which augments an utterance without considering its relation with other utterances, we propose a sequence-to-sequence generation based data augmentation framework that leverages one utterance’s same semantic alternatives in the training data. A novel diversity rank is incorporated into the utterance representation to make the model produce diverse utterances, and these diversely augmented utterances help to improve the language understanding module. Experimental results on the Airline Travel Information System dataset and a newly created semantic frame annotation on the Stanford Multi-turn, Multi-domain Dialogue Dataset show that our framework achieves significant improvements of 6.38 and 10.04 F-scores respectively when only a training set of hundreds of utterances is available. Case studies also confirm that our method generates diverse utterances. |
Tasks | Data Augmentation, Text Augmentation |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01554v1 |
PDF | http://arxiv.org/pdf/1807.01554v1.pdf |
PWC | https://paperswithcode.com/paper/sequence-to-sequence-data-augmentation-for |
Repo | https://github.com/AtmaHou/Seq2SeqDataAugmentationForLU |
Framework | pytorch |
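A rough sketch of the training-pair construction the abstract describes: utterances sharing the same semantic frame are treated as alternatives of one another, and a rank token prepended to the source lets the seq2seq decoder be steered toward more or less "diverse" paraphrases at generation time. The frame key, the similarity-based ranking, and the `<rank_k>` token names are illustrative assumptions, not the paper's exact recipe.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Toy utterances annotated with a (flattened) semantic frame.
utterances = [
    ("show flights from boston to denver", "flight|from:boston|to:denver"),
    ("i want to fly from boston to denver", "flight|from:boston|to:denver"),
    ("list boston to denver flights please", "flight|from:boston|to:denver"),
]

by_frame = defaultdict(list)
for text, frame in utterances:
    by_frame[frame].append(text)

def make_pairs(group):
    pairs = []
    for src in group:
        # Rank the alternatives: rank 0 is the most similar paraphrase,
        # higher ranks are increasingly dissimilar (more "diverse").
        alts = sorted((t for t in group if t != src),
                      key=lambda t: SequenceMatcher(None, src, t).ratio(),
                      reverse=True)
        pairs += [(f"<rank_{k}> {src}", tgt) for k, tgt in enumerate(alts)]
    return pairs

for group in by_frame.values():
    for src, tgt in make_pairs(group):
        print(src, "=>", tgt)
```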
Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs
Title | Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs |
Authors | Claude Coulombe |
Abstract | In practice, it is common to find oneself with far too little text data to train a deep neural network. This “Big Data Wall” represents a challenge for minority language communities on the Internet, organizations, laboratories and companies that compete with GAFAM (Google, Amazon, Facebook, Apple, Microsoft). While most of the research effort in text data augmentation aims at the long-term goal of finding end-to-end learning solutions, which is equivalent to “using neural networks to feed neural networks”, this engineering work focuses on the use of practical, robust, scalable and easy-to-implement data augmentation pre-processing techniques similar to those that are successful in computer vision. Several text augmentation techniques were evaluated. Some existing ones were tested for comparison purposes, such as noise injection or the use of regular expressions. Others are modified or improved techniques, like lexical replacement. Finally, more innovative ones, such as the generation of paraphrases using back-translation or the transformation of syntactic trees, are based on robust, scalable, and easy-to-use NLP Cloud APIs. All the text augmentation techniques studied, with an amplification factor of only 5, increased the accuracy of the results in a range of 4.3% to 21.6%, with significant statistical fluctuations, on a standardized task of text polarity prediction. Several standard deep neural network architectures were tested: the multilayer perceptron (MLP), the long short-term memory recurrent network (LSTM) and the bidirectional LSTM (biLSTM). The classical XGBoost algorithm was also tested, showing improvements of up to 2.5%. |
Tasks | Data Augmentation, Text Augmentation |
Published | 2018-12-05 |
URL | https://arxiv.org/abs/1812.04718v1 |
PDF | https://arxiv.org/pdf/1812.04718v1.pdf |
PWC | https://paperswithcode.com/paper/text-data-augmentation-made-simple-by |
Repo | https://github.com/ClaudeCoulombe/TextDataAugmentation |
Framework | none |
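A minimal sketch of back-translation, one of the cloud-API techniques the abstract mentions: translate to a pivot language and back to obtain a paraphrase. The `translate()` helper below is a hypothetical placeholder; in practice it would wrap a translation service or a local model.

```python
def translate(text: str, source: str, target: str) -> str:
    """Hypothetical wrapper around a translation API or model."""
    raise NotImplementedError("plug in your translation backend here")

def back_translate(text: str, pivot: str = "fr") -> str:
    # Round-trip through a pivot language to produce a paraphrase.
    return translate(translate(text, "en", pivot), pivot, "en")

# Usage (once translate() is implemented):
#   back_translate("the plot is thin but the acting is superb")
```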
Lenia - Biology of Artificial Life
Title | Lenia - Biology of Artificial Life |
Authors | Bert Wang-Chak Chan |
Abstract | We report a new system of artificial life called Lenia (from Latin lenis “smooth”), a two-dimensional cellular automaton with continuous space-time-state and generalized local rule. Computer simulations show that Lenia supports a great diversity of complex autonomous patterns or “lifeforms” bearing resemblance to real-world microscopic organisms. More than 400 species in 18 families have been identified, many discovered via interactive evolutionary computation. They differ from other cellular automata patterns in being geometric, metameric, fuzzy, resilient, adaptive, and rule-generic. We present basic observations of the system regarding the properties of space-time and basic settings. We provide a broad survey of the lifeforms, categorize them into a hierarchical taxonomy, and map their distribution in the parameter hyperspace. We describe their morphological structures and behavioral dynamics, and propose possible mechanisms of their self-propulsion, self-organization and plasticity. Finally, we discuss how the study of Lenia would be related to biology, artificial life, and artificial intelligence. |
Tasks | Artificial Life |
Published | 2018-12-13 |
URL | https://arxiv.org/abs/1812.05433v3 |
PDF | https://arxiv.org/pdf/1812.05433v3.pdf |
PWC | https://paperswithcode.com/paper/lenia-biology-of-artificial-life |
Repo | https://github.com/Chakazul/Lenia |
Framework | none |
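A minimal sketch of Lenia's update rule: convolve the continuous-valued world with a smooth ring-shaped kernel, map the resulting neighborhood potential through a Gaussian "growth" function, and take a small clipped Euler step. The kernel core and growth mapping follow the paper's exponential forms; the parameter values (R, T, mu, sigma) are typical of published demos rather than canonical.

```python
import numpy as np

N, R, T, mu, sigma = 128, 13, 10, 0.15, 0.015  # grid, radius, time resolution, growth params

# Smooth ring-shaped kernel on a toroidal grid, normalized to sum to 1.
y, x = np.ogrid[-N // 2:N // 2, -N // 2:N // 2]
r = np.hypot(x, y) / R
rc = np.clip(r, 1e-6, 1 - 1e-6)
K = np.exp(4.0 - 1.0 / (rc * (1.0 - rc))) * (r < 1)
K /= K.sum()
K_fft = np.fft.fft2(np.fft.ifftshift(K))  # kernel re-centered at the origin

def growth(u):
    # Gaussian growth mapping: positive near mu, negative elsewhere.
    return 2.0 * np.exp(-((u - mu) ** 2) / (2.0 * sigma ** 2)) - 1.0

def step(A):
    U = np.real(np.fft.ifft2(np.fft.fft2(A) * K_fft))  # neighborhood potential
    return np.clip(A + growth(U) / T, 0.0, 1.0)

A = np.random.rand(N, N) * (r * R < 20)  # random blob in the center
for _ in range(200):
    A = step(A)
```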
The Fast and the Flexible: training neural networks to learn to follow instructions from small data
Title | The Fast and the Flexible: training neural networks to learn to follow instructions from small data |
Authors | Rezka Leonandya, Elia Bruni, Dieuwke Hupkes, Germán Kruszewski |
Abstract | Learning to follow human instructions is a long-pursued goal in artificial intelligence. The task becomes particularly challenging if no prior knowledge of the employed language is assumed while relying only on a handful of examples to learn from. Work in the past has relied on hand-coded components or manually engineered features to provide strong inductive biases that make learning in such situations possible. In contrast, here we seek to establish whether this knowledge can be acquired automatically by a neural network system through a two-phase training procedure: a (slow) offline learning stage where the network learns about the general structure of the task and a (fast) online adaptation phase where the network learns the language of a new speaker. Controlled experiments show that when the network is exposed to familiar instructions that contain novel words, the model adapts very efficiently to the new vocabulary. Moreover, even for human speakers whose language usage can depart significantly from our artificial training language, our network can still make use of its automatically acquired inductive bias to learn to follow instructions more effectively. |
Tasks | |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06194v2 |
PDF | http://arxiv.org/pdf/1809.06194v2.pdf |
PWC | https://paperswithcode.com/paper/the-fast-and-the-flexible-training-neural |
Repo | https://github.com/rezkaaufar/fast-and-flexible |
Framework | pytorch |
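A hedged PyTorch sketch of the two-phase idea: after slow offline training, adapt quickly to a new speaker by updating only the word-embedding layer while the rest of the network stays frozen, so only the new speaker's vocabulary is (re)learned from a handful of examples. The model below is a generic recurrent encoder, not the paper's architecture, and the learning rate is an arbitrary assumption.

```python
import torch
import torch.nn as nn

class InstructionFollower(nn.Module):
    def __init__(self, vocab_size, dim=64, n_actions=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, n_actions)

    def forward(self, tokens):
        _, (h, _) = self.encoder(self.embed(tokens))
        return self.head(h[-1])

model = InstructionFollower(vocab_size=1000)
# ... slow offline phase: ordinary training across many speakers ...

# Fast online phase: freeze everything except the embeddings.
for p in model.parameters():
    p.requires_grad = False
model.embed.weight.requires_grad = True
optim = torch.optim.Adam([model.embed.weight], lr=1e-2)
```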
A Benchmark for Interpretability Methods in Deep Neural Networks
Title | A Benchmark for Interpretability Methods in Deep Neural Networks |
Authors | Sara Hooker, Dumitru Erhan, Pieter-Jan Kindermans, Been Kim |
Abstract | We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks. Our results across several large-scale image classification datasets show that many popular interpretability methods produce estimates of feature importance that are not better than a random designation of feature importance. Only certain ensemble-based approaches (VarGrad and SmoothGrad-Squared) outperform such a random assignment of importance. The manner of ensembling remains critical: we show that some approaches do no better than the underlying method yet carry a far higher computational burden. |
Tasks | Feature Importance, Image Classification |
Published | 2018-06-28 |
URL | https://arxiv.org/abs/1806.10758v3 |
PDF | https://arxiv.org/pdf/1806.10758v3.pdf |
PWC | https://paperswithcode.com/paper/evaluating-feature-importance-estimates |
Repo | https://github.com/LLNL/fastcam |
Framework | pytorch |
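A minimal sketch of the two ensembling estimators the abstract singles out: SmoothGrad-Squared averages squared input gradients over noisy copies of the input, and VarGrad takes the per-pixel variance of those gradients instead. `model` is any differentiable classifier; the noise scale and sample count are tunable assumptions.

```python
import torch

def grad_samples(model, x, target, n=25, sigma=0.1):
    # Gradients of the target logit w.r.t. n noisy copies of the input.
    grads = []
    for _ in range(n):
        xi = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(xi.unsqueeze(0))[0, target]
        grads.append(torch.autograd.grad(score, xi)[0])
    return torch.stack(grads)  # shape: (n, *x.shape)

def smoothgrad_squared(model, x, target, **kw):
    return grad_samples(model, x, target, **kw).pow(2).mean(dim=0)

def vargrad(model, x, target, **kw):
    return grad_samples(model, x, target, **kw).var(dim=0)
```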
Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with $β$-Divergences
Title | Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with $β$-Divergences |
Authors | Jeremias Knoblauch, Jack Jewson, Theodoros Damoulas |
Abstract | We present the very first robust Bayesian Online Changepoint Detection algorithm through General Bayesian Inference (GBI) with $\beta$-divergences. The resulting inference procedure is doubly robust for both the parameter and the changepoint (CP) posterior, with linear time and constant space complexity. We provide a construction for exponential models and demonstrate it on the Bayesian Linear Regression model. In so doing, we make two additional contributions: Firstly, we make GBI scalable using Structural Variational approximations that are exact as $\beta \to 0$. Secondly, we give a principled way of choosing the divergence parameter $\beta$ by minimizing expected predictive loss on-line. On real-world data, our method reduces the False Discovery Rate of CPs from more than 90% to 0%, establishing a new state of the art. |
Tasks | Bayesian Inference, Change Point Detection |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02261v2 |
PDF | http://arxiv.org/pdf/1806.02261v2.pdf |
PWC | https://paperswithcode.com/paper/doubly-robust-bayesian-inference-for-non-1 |
Repo | https://github.com/alan-turing-institute/bocpdms |
Framework | none |
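A hedged sketch of the $\beta$-divergence loss that General Bayesian Inference builds on, written out for a univariate Gaussian. Unlike the log loss, the $\beta$-loss saturates for outliers, which is what makes the resulting posteriors robust. This is the standard GBI construction only; the paper's full online changepoint recursion is not shown, and the choice $\beta = 0.25$ is illustrative.

```python
import numpy as np
from scipy.stats import norm

def beta_loss_gauss(x, mu, sig, beta):
    # l_beta(x; mu, sig) = -(1/beta) * f(x)^beta
    #                      + (1/(beta+1)) * integral of f^(beta+1).
    # For a Gaussian the integral has the closed form
    #   (2*pi*sig^2)^(-beta/2) * (1 + beta)^(-1/2).
    f = norm.pdf(x, mu, sig)
    integral = (2 * np.pi * sig**2) ** (-beta / 2) / np.sqrt(1 + beta)
    return -(f**beta) / beta + integral / (beta + 1)

# The log loss grows without bound for an outlier; the beta-loss saturates.
for x in [0.0, 3.0, 30.0]:
    print(x, -norm.logpdf(x, 0, 1), beta_loss_gauss(x, 0, 1, beta=0.25))
```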
Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling
Title | Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling |
Authors | Shannon R. McCurdy |
Abstract | Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results. We provide provable guarantees for deterministic column sampling using ridge leverage scores. The matrix sketch returned by our algorithm is a column subset of the original matrix, yielding additional interpretability. Like the randomized counterparts, the deterministic algorithm provides $(1 + \epsilon)$ error column subset selection, $(1 + \epsilon)$ error projection-cost preservation, and an additive-multiplicative spectral bound. We also show that under the assumption of power-law decay of ridge leverage scores, this deterministic algorithm is provably as accurate as randomized algorithms. Lastly, ridge regression is frequently used to regularize ill-posed linear least-squares problems. While ridge regression provides shrinkage for the regression coefficients, many of the coefficients remain small but non-zero. Performing ridge regression with the matrix sketch returned by our algorithm and a particular regularization parameter forces coefficients to zero and has a provable $(1 + \epsilon)$ bound on the statistical risk. As such, it is an interesting alternative to elastic net regularization. |
Tasks | |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.06010v2 |
PDF | http://arxiv.org/pdf/1803.06010v2.pdf |
PWC | https://paperswithcode.com/paper/ridge-regression-and-provable-deterministic-1 |
Repo | https://github.com/srmcc/deterministic-ridge-leverage-sampling |
Framework | none |
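A sketch of rank-k ridge leverage scores as commonly defined, tau_i = a_i^T (A A^T + lam I)^(-1) a_i with lam = ||A - A_k||_F^2 / k, followed by deterministic column selection. Keeping a fixed number of top-scoring columns is a simple stand-in for the paper's provable threshold condition; the sizes below are arbitrary.

```python
import numpy as np

def ridge_leverage_scores(A, k):
    # lam = ||A - A_k||_F^2 / k, the residual energy beyond rank k.
    _, s, _ = np.linalg.svd(A, full_matrices=False)
    lam = (s[k:] ** 2).sum() / k
    G = A @ A.T + lam * np.eye(A.shape[0])
    # tau_i = a_i^T G^{-1} a_i for every column a_i of A.
    return np.einsum("ij,ji->i", A.T, np.linalg.solve(G, A))

A = np.random.randn(50, 200)
tau = ridge_leverage_scores(A, k=10)
keep = np.argsort(tau)[::-1][:40]  # e.g., keep the 40 highest-scoring columns
C = A[:, keep]                     # interpretable sketch: actual columns of A
```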
A Simple Cache Model for Image Recognition
Title | A Simple Cache Model for Image Recognition |
Authors | A. Emin Orhan |
Abstract | Training large-scale image recognition models is computationally expensive. This raises the question of whether there might be simple ways to improve the test performance of an already trained model without having to re-train or fine-tune it with new data. Here, we show that, surprisingly, this is indeed possible. The key observation we make is that the layers of a deep network close to the output layer contain independent, easily extractable class-relevant information that is not contained in the output layer itself. We propose to extract this extra class-relevant information using a simple key-value cache memory to improve the classification performance of the model at test time. Our cache memory is directly inspired by a similar cache model previously proposed for language modeling (Grave et al., 2017). This cache component does not require any training or fine-tuning; it can be applied to any pre-trained model and, by properly setting only two hyper-parameters, leads to significant improvements in its classification performance. Improvements are observed across several architectures and datasets. In the cache component, using features extracted from layers close to the output (but not from the output layer itself) as keys leads to the largest improvements. Concatenating features from multiple layers to form keys can further improve performance over using single-layer features as keys. The cache component also has a regularizing effect, a simple consequence of which is that it substantially increases the robustness of models against adversarial attacks. |
Tasks | Language Modelling |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08709v2 |
PDF | http://arxiv.org/pdf/1805.08709v2.pdf |
PWC | https://paperswithcode.com/paper/a-simple-cache-model-for-image-recognition-1 |
Repo | https://github.com/eminorhan/simple-cache |
Framework | tf |
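A minimal sketch of the key-value cache described in the abstract: keys are features from a layer near the output for the training images, values are their labels, and at test time a similarity-weighted vote over cached items is mixed with the model's own softmax. `theta` (similarity sharpness) and `lam` (mixture weight) stand for the two hyper-parameters the paper refers to; the names and default values are my own.

```python
import numpy as np

def build_cache(features, labels, n_classes):
    # Keys: L2-normalized late-layer features; values: one-hot labels.
    keys = features / np.linalg.norm(features, axis=1, keepdims=True)
    values = np.eye(n_classes)[labels]
    return keys, values

def cache_predict(query, p_model, keys, values, theta=30.0, lam=0.5):
    q = query / np.linalg.norm(query)
    sims = keys @ q                      # cosine similarities to all cached items
    w = np.exp(theta * sims)             # sharpened similarity weights
    p_cache = (w @ values) / w.sum()     # similarity-weighted label vote
    return lam * p_cache + (1 - lam) * p_model
```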
The Perfect Match: 3D Point Cloud Matching with Smoothed Densities
Title | The Perfect Match: 3D Point Cloud Matching with Smoothed Densities |
Authors | Zan Gojcic, Caifa Zhou, Jan D. Wegner, Andreas Wieser |
Abstract | We propose 3DSmoothNet, a full workflow to match 3D point clouds with a siamese deep learning architecture and fully convolutional layers using a voxelized smoothed density value (SDV) representation. The latter is computed per interest point and aligned to the local reference frame (LRF) to achieve rotation invariance. Our compact, learned, rotation-invariant 3D point cloud descriptor achieves 94.9% average recall on the 3DMatch benchmark data set, outperforming the state-of-the-art by more than 20 percentage points with only 32 output dimensions. This very low output dimension allows for near real-time correspondence search with 0.1 ms per feature point on a standard PC. Our approach is sensor- and scene-agnostic because of SDV, LRF and learning highly descriptive features with fully convolutional layers. We show that 3DSmoothNet trained only on RGB-D indoor scenes of buildings achieves 79.0% average recall on laser scans of outdoor vegetation, more than double the performance of our closest, learning-based competitors. Code, data and pre-trained models are available online at https://github.com/zgojcic/3DSmoothNet. |
Tasks | 3D Point Cloud Matching |
Published | 2018-11-16 |
URL | https://arxiv.org/abs/1811.06879v3 |
PDF | https://arxiv.org/pdf/1811.06879v3.pdf |
PWC | https://paperswithcode.com/paper/the-perfect-match-3d-point-cloud-matching |
Repo | https://github.com/zgojcic/3DSmoothNet |
Framework | tf |
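A hedged sketch of the smoothed density value (SDV) voxelization: points in a local neighborhood contribute Gaussian-smoothed mass to a voxel grid centered on the interest point. Alignment to the local reference frame (LRF), which gives 3DSmoothNet its rotation invariance, is omitted here, and the grid size, radius, and smoothing width are illustrative assumptions.

```python
import numpy as np

def sdv_grid(neighbors, center, n_vox=16, radius=0.3, sigma=None):
    # Voxel-center coordinates of an n_vox^3 grid spanning [-radius, radius]^3.
    step = 2 * radius / n_vox
    sigma = sigma if sigma is not None else step  # smoothing ~ voxel size
    lin = -radius + step * (np.arange(n_vox) + 0.5)
    gx, gy, gz = np.meshgrid(lin, lin, lin, indexing="ij")
    centers = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)

    local = neighbors - center  # express neighbors in local coordinates
    d2 = ((centers[:, None, :] - local[None, :, :]) ** 2).sum(-1)
    grid = np.exp(-d2 / (2 * sigma**2)).sum(axis=1)  # Gaussian-smoothed density
    return (grid / max(grid.max(), 1e-12)).reshape(n_vox, n_vox, n_vox)

pts = np.random.randn(500, 3) * 0.1
print(sdv_grid(pts, center=np.zeros(3)).shape)  # (16, 16, 16)
```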
HexaConv
Title | HexaConv |
Authors | Emiel Hoogeboom, Jorn W. T. Peters, Taco S. Cohen, Max Welling |
Abstract | The effectiveness of Convolutional Neural Networks stems in large part from their ability to exploit the translation invariance that is inherent in many learning problems. Recently, it was shown that CNNs can exploit other invariances, such as rotation invariance, by using group convolutions instead of planar convolutions. However, for reasons of performance and ease of implementation, it has been necessary to limit the group convolution to transformations that can be applied to the filters without interpolation. Thus, for images with square pixels, only integer translations, rotations by multiples of 90 degrees, and reflections are admissible. Whereas the square tiling provides a 4-fold rotational symmetry, a hexagonal tiling of the plane has a 6-fold rotational symmetry. In this paper we show how one can efficiently implement planar convolution and group convolution over hexagonal lattices, by re-using existing highly optimized convolution routines. We find that, due to the reduced anisotropy of hexagonal filters, planar HexaConv provides better accuracy than planar convolution with square filters, given a fixed parameter budget. Furthermore, we find that the increased degree of symmetry of the hexagonal grid increases the effectiveness of group convolutions, by allowing for more parameter sharing. We show that our method significantly outperforms conventional CNNs on the AID aerial scene classification dataset, even outperforming ImageNet pre-trained models. |
Tasks | Scene Classification |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02108v1 |
PDF | http://arxiv.org/pdf/1803.02108v1.pdf |
PWC | https://paperswithcode.com/paper/hexaconv |
Repo | https://github.com/ehoogeboom/hexaconv |
Framework | none |
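A minimal sketch of the axial-coordinate trick that lets hexagonal convolution re-use optimized square convolution routines: stored in axial coordinates, the 6 hexagonal neighbours of a cell are exactly a 3x3 square neighbourhood minus two opposite corners, so masking those two corner weights turns a standard conv2d into a hexagonal one. This shows the planar case only; the paper's group convolutions over hexagonal symmetries are not sketched here.

```python
import torch
import torch.nn.functional as F

def hex_mask():
    m = torch.ones(3, 3)
    m[0, 0] = 0.0  # axial offset (-1, -1): not a hexagonal neighbour
    m[2, 2] = 0.0  # axial offset (+1, +1): not a hexagonal neighbour
    return m

class PlanarHexConv(torch.nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(c_out, c_in, 3, 3) * 0.1)
        self.register_buffer("mask", hex_mask())

    def forward(self, x):  # x: (B, C, H, W), hex lattice in axial coordinates
        return F.conv2d(x, self.weight * self.mask, padding=1)

x = torch.randn(1, 3, 32, 32)
print(PlanarHexConv(3, 8)(x).shape)  # torch.Size([1, 8, 32, 32])
```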
Orthographic Feature Transform for Monocular 3D Object Detection
Title | Orthographic Feature Transform for Monocular 3D Object Detection |
Authors | Thomas Roddick, Alex Kendall, Roberto Cipolla |
Abstract | 3D object detection from monocular images has proven to be an enormously challenging task, with the performance of leading systems not yet achieving even 10% of that of LiDAR-based counterparts. One explanation for this performance gap is that existing systems are entirely at the mercy of the perspective image-based representation, in which the appearance and scale of objects varies drastically with depth and meaningful distances are difficult to infer. In this work we argue that the ability to reason about the world in 3D is an essential element of the 3D object detection task. To this end, we introduce the orthographic feature transform, which enables us to escape the image domain by mapping image-based features into an orthographic 3D space. This allows us to reason holistically about the spatial configuration of the scene in a domain where scale is consistent and distances between objects are meaningful. We apply this transformation as part of an end-to-end deep learning architecture and achieve state-of-the-art performance on the KITTI 3D object benchmark. We will release full source code and pretrained models upon acceptance of this manuscript for publication. |
Tasks | 3D Object Detection, 3D Object Detection From Monocular Images, Object Detection |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08188v1 |
PDF | http://arxiv.org/pdf/1811.08188v1.pdf |
PWC | https://paperswithcode.com/paper/orthographic-feature-transform-for-monocular |
Repo | https://github.com/tom-roddick/oft |
Framework | pytorch |
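A hedged sketch of the orthographic feature transform: voxel centers on a ground-plane grid are projected into the image with the camera intrinsics, and image features are sampled at the projected locations to populate an orthographic (bird's-eye-view) feature map. The paper pools features over each voxel's full projected region via integral images; point sampling with `grid_sample` is a simplification, and the ranges, grid size, and intrinsics below are illustrative.

```python
import torch
import torch.nn.functional as F

def oft(feats, K, x_range=(-20, 20), z_range=(5, 45), y=1.0, cells=40):
    # feats: (1, C, H, W) image feature map; K: (3, 3) camera intrinsics.
    _, _, H, W = feats.shape
    xs = torch.linspace(*x_range, cells)
    zs = torch.linspace(*z_range, cells)
    gx, gz = torch.meshgrid(xs, zs, indexing="ij")
    pts = torch.stack([gx, torch.full_like(gx, y), gz], -1).reshape(-1, 3)

    uvw = pts @ K.T                     # pinhole projection of voxel centers
    uv = uvw[:, :2] / uvw[:, 2:3]       # pixel coordinates
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], -1).view(1, cells, cells, 2)
    return F.grid_sample(feats, grid, align_corners=True)  # (1, C, cells, cells)

K = torch.tensor([[720.0, 0, 320], [0, 720.0, 180], [0, 0, 1]])
bev = oft(torch.randn(1, 64, 96, 320), K)
print(bev.shape)  # torch.Size([1, 64, 40, 40])
```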
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Title | OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields |
Authors | Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, Yaser Sheikh |
Abstract | OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation |
Tasks | Keypoint Detection, Pose Estimation |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.08008v2 |
PDF | https://arxiv.org/pdf/1812.08008v2.pdf |
PWC | https://paperswithcode.com/paper/openpose-realtime-multi-person-2d-pose |
Repo | https://github.com/CMU-Perceptual-Computing-Lab/openpose_unity_plugin |
Framework | none |
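A hedged sketch of the Part Affinity Field association score at the heart of OpenPose's bottom-up matching: the PAF is a 2-channel unit-vector field per limb type, and a candidate joint pair is scored by the line integral of the field along the segment between them, approximated with uniform samples. Nearest-pixel lookup is used here where a real implementation would interpolate; the sample count is an assumption.

```python
import numpy as np

def paf_score(paf, j1, j2, n_samples=10):
    # paf: (H, W, 2) affinity field for one limb type; j1, j2: (x, y) pixels.
    j1, j2 = np.asarray(j1, float), np.asarray(j2, float)
    v = j2 - j1
    norm = np.linalg.norm(v)
    if norm < 1e-6:
        return 0.0
    v = v / norm
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = j1[None, :] + ts[:, None] * (j2 - j1)[None, :]
    xs = np.clip(pts[:, 0].round().astype(int), 0, paf.shape[1] - 1)
    ys = np.clip(pts[:, 1].round().astype(int), 0, paf.shape[0] - 1)
    # Average alignment between the field and the candidate limb direction.
    return float(np.mean(paf[ys, xs] @ v))
```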
FilterReg: Robust and Efficient Probabilistic Point-Set Registration using Gaussian Filter and Twist Parameterization
Title | FilterReg: Robust and Efficient Probabilistic Point-Set Registration using Gaussian Filter and Twist Parameterization |
Authors | Wei Gao, Russ Tedrake |
Abstract | Probabilistic point-set registration methods have been gaining more attention for their robustness to noise, outliers and occlusions. However, these methods tend to be much slower than the popular iterative closest point (ICP) algorithms, which severely limits their usability. In this paper, we contribute a novel probabilistic registration method that achieves state-of-the-art robustness as well as substantially faster computational performance than modern ICP implementations. This is achieved using a rigorous yet computationally-efficient probabilistic formulation. Point-set registration is cast as a maximum likelihood estimation and solved using the EM algorithm. We show that with a simple augmentation, the E step can be formulated as a filtering problem, allowing us to leverage advances in efficient Gaussian filtering methods. We also propose a customized permutohedral filter for improved efficiency while retaining sufficient accuracy for our task. Additionally, we present a simple and efficient twist parameterization that generalizes our method to the registration of articulated and deformable objects. For articulated objects, the complexity of our method is almost independent of the Degrees Of Freedom (DOFs), which makes it highly efficient even for high DOF systems. The results demonstrate that the proposed method consistently outperforms many competitive baselines on a variety of registration tasks. |
Tasks | |
Published | 2018-11-26 |
URL | https://arxiv.org/abs/1811.10136v3 |
PDF | https://arxiv.org/pdf/1811.10136v3.pdf |
PWC | https://paperswithcode.com/paper/filterreg-robust-and-efficient-probabilistic |
Repo | https://github.com/neka-nat/probreg |
Framework | none |
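A hedged sketch of probabilistic rigid registration via EM in the spirit of the abstract: the E-step computes Gaussian soft correspondences (here by brute force; FilterReg's contribution is computing these moments with a fast permutohedral Gaussian filter), and the M-step solves a weighted rigid alignment. The twist parameterization, outlier model, and sigma annealing are all omitted, and the fixed sigma is an assumption.

```python
import numpy as np

def em_rigid(src, tgt, iters=30, sigma=0.1):
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = src @ R.T + t
        # E-step: Gaussian responsibilities of target points for each source point.
        d2 = ((moved[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * sigma**2))
        w = W.sum(1, keepdims=True) + 1e-12
        virt = (W @ tgt) / w               # per-source "virtual" target points
        c = w[:, 0] / w[:, 0].sum()        # per-source confidence weights
        # M-step: weighted Kabsch between source points and virtual targets.
        mu_s, mu_v = c @ src, c @ virt
        H = (src - mu_s).T @ ((virt - mu_v) * c[:, None])
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))]) @ U.T
        t = mu_v - R @ mu_s
    return R, t

# Usage: recover a known rotation + translation applied to a random cloud.
src = np.random.rand(200, 3)
theta = 0.3
Rt = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1.0]])
R, t = em_rigid(src, src @ Rt.T + 0.5)
```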