April 2, 2020

Paper Group ANR 225

IoT Network Behavioral Fingerprint Inference with Limited Network Trace for Cyber Investigation: A Meta Learning Approach. Optimisation of Large Wave Farms using a Multi-strategy Evolutionary Framework. Robust Policy Search for Robot Navigation with Stochastic Meta-Policies. CNN-based Repetitive self-revised learning for photos’ aesthetics imbalanc …

IoT Network Behavioral Fingerprint Inference with Limited Network Trace for Cyber Investigation: A Meta Learning Approach

Title IoT Network Behavioral Fingerprint Inference with Limited Network Trace for Cyber Investigation: A Meta Learning Approach
Authors Jonathan Pan
Abstract The development and adoption of Internet of Things (IoT) devices will grow significantly in the coming years to enable Industry 4.0. Many forms of IoT devices will be developed and used across industry verticals. However, the euphoria of this technology adoption is shadowed by the solemn presence of cyber threats that will follow its growth trajectory. Cyber threats would either embed their malicious code or attack vulnerabilities in IoT that could induce significant consequences in cyber and physical realms. In order to manage such destructive effects, incident responders and cyber investigators require the capabilities to find these rogue IoT devices and contain them quickly. Such online devices may only leave network activity traces. A collection of relevant traces could be used to infer an IoT device’s network behavioral fingerprint, which in turn could help investigators find these devices. However, the challenge is how to infer these fingerprints when there are only limited network activity traces. This research proposes a novel model construct that learns to infer the network behavioral fingerprint of a specific IoT device from limited network activity traces using a One-Card Time Series Meta-Learner called DeepNetPrint. Our research also demonstrates the application of DeepNetPrint to identify IoT devices, where it performs comparatively well against leading supervised learning models. Our solution would enable cyber investigators to identify specific IoT devices of interest while overcoming the constraint of having only limited network traces of those devices.
Tasks Meta-Learning, Time Series
Published 2020-01-14
URL https://arxiv.org/abs/2001.04705v2
PDF https://arxiv.org/pdf/2001.04705v2.pdf
PWC https://paperswithcode.com/paper/iot-network-behavioral-fingerprint-inference
Repo
Framework

Optimisation of Large Wave Farms using a Multi-strategy Evolutionary Framework

Title Optimisation of Large Wave Farms using a Multi-strategy Evolutionary Framework
Authors Mehdi Neshat, Bradley Alexander, Nataliia Y. Sergiienko, Markus Wagner
Abstract Wave energy is a fast-developing and promising renewable energy resource. The primary goal of this research is to maximise the total harnessed power of a large wave farm consisting of fully-submerged three-tether wave energy converters (WECs). Energy maximisation for large farms is a challenging search problem due to the costly calculations of the hydrodynamic interactions between WECs in a large wave farm and the high dimensionality of the search space. To address this problem, we propose a new hybrid multi-strategy evolutionary framework combining smart initialisation, binary population-based evolutionary algorithm, discrete local search and continuous global optimisation. For assessing the performance of the proposed hybrid method, we compare it with a wide variety of state-of-the-art optimisation approaches, including six continuous evolutionary algorithms, four discrete search techniques and three hybrid optimisation methods. The results show that the proposed method performs considerably better in terms of convergence speed and farm output.
Tasks
Published 2020-03-21
URL https://arxiv.org/abs/2003.09594v1
PDF https://arxiv.org/pdf/2003.09594v1.pdf
PWC https://paperswithcode.com/paper/optimisation-of-large-wave-farms-using-a
Repo
Framework

Robust Policy Search for Robot Navigation with Stochastic Meta-Policies

Title Robust Policy Search for Robot Navigation with Stochastic Meta-Policies
Authors Javier Garcia-Barcos, Ruben Martinez-Cantin
Abstract Bayesian optimization is an efficient nonlinear optimization method where the queries are carefully selected to gather information about the optimum location. Thus, in the context of policy search, it has been called active policy search. The main ingredients of Bayesian optimization for sample efficiency are the probabilistic surrogate model and the optimal decision heuristics. In this work, we exploit those to provide robustness to different issues for policy search algorithms. We combine several methods and show how their interaction works better than the sum of the parts. First, to deal with input noise and provide a safe and repeatable policy we use an improved version of unscented Bayesian optimization. Then, to deal with mismodeling errors and improve exploration we use stochastic meta-policies for query selection and an adaptive kernel. We compare the proposed algorithm with previous results in several optimization benchmarks and robot tasks, such as pushing objects with a robot arm, or path finding with a rover.
Tasks Robot Navigation
Published 2020-03-02
URL https://arxiv.org/abs/2003.01000v1
PDF https://arxiv.org/pdf/2003.01000v1.pdf
PWC https://paperswithcode.com/paper/robust-policy-search-for-robot-navigation
Repo
Framework
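
The abstract above mentions unscented Bayesian optimization as a way to cope with input noise. As a rough illustration only, the sketch below implements the standard unscented transform, which approximates the expected value of a function under Gaussian input noise using 2d+1 sigma points. It is not the improved variant the authors describe; the function name `unscented_expectation` and the `kappa` parameter are assumptions chosen for this example.

```python
import numpy as np

def unscented_expectation(f, x, cov, kappa=1.0):
    """Approximate E[f(x + noise)] for noise ~ N(0, cov) using 2d+1 sigma points.

    Standard unscented transform (an illustrative assumption, not the paper's
    improved version). `kappa` is the usual spread parameter.
    """
    x = np.asarray(x, dtype=float)
    d = x.size
    # Columns of the Cholesky factor of (d + kappa) * cov are the sigma-point offsets.
    offsets = np.linalg.cholesky((d + kappa) * np.asarray(cov, dtype=float))
    points = [x]
    points += [x + offsets[:, i] for i in range(d)]
    points += [x - offsets[:, i] for i in range(d)]
    # The central point gets weight kappa / (d + kappa); the rest share the remainder.
    weights = np.full(2 * d + 1, 1.0 / (2.0 * (d + kappa)))
    weights[0] = kappa / (d + kappa)
    return float(sum(w * f(p) for w, p in zip(weights, points)))

# Usage: for a quadratic objective the transform recovers the exact noisy mean,
# here x.x + trace(cov).
value = unscented_expectation(lambda p: p @ p, [1.0, 2.0], np.diag([0.1, 0.2]))
```

In a policy-search setting one would presumably apply this averaging to the surrogate or acquisition value at a candidate query, so that queries whose neighbourhood is poor are penalised.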

CNN-based Repetitive self-revised learning for photos’ aesthetics imbalanced classification

Title CNN-based Repetitive self-revised learning for photos’ aesthetics imbalanced classification
Authors Ying Dai
Abstract Aesthetic assessment is subjective, and the distribution of the aesthetic levels is imbalanced. In order to realize the auto-assessment of photo aesthetics, we focus on using repetitive self-revised learning (RSRL) to train the CNN-based aesthetics classification network on an imbalanced data set. In RSRL, the network is trained repetitively by dropping the low-likelihood photo samples at the middle levels of aesthetics from the training data set, based on the previously trained network. Further, the two retained networks are used to extract highlight regions of the photos related to the aesthetic assessment. Experimental results show that the CNN-based repetitive self-revised learning is effective for improving the performance of imbalanced classification.
Tasks
Published 2020-03-06
URL https://arxiv.org/abs/2003.03081v4
PDF https://arxiv.org/pdf/2003.03081v4.pdf
PWC https://paperswithcode.com/paper/cnn-based-repetitive-self-revised-learning
Repo
Framework

BADGR: An Autonomous Self-Supervised Learning-Based Navigation System

Title BADGR: An Autonomous Self-Supervised Learning-Based Navigation System
Authors Gregory Kahn, Pieter Abbeel, Sergey Levine
Abstract Mobile robot navigation is typically regarded as a geometric problem, in which the robot’s objective is to perceive the geometry of the environment in order to plan collision-free paths towards a desired goal. However, a purely geometric view of the world can be insufficient for many navigation problems. For example, a robot navigating based on geometry may avoid a field of tall grass because it believes it is untraversable, and will therefore fail to reach its desired goal. In this work, we investigate how to move beyond these purely geometric-based approaches using a method that learns about physical navigational affordances from experience. Our approach, which we call BADGR, is an end-to-end learning-based mobile robot navigation system that can be trained with self-supervised off-policy data gathered in real-world environments, without any simulation or human supervision. BADGR can navigate in real-world urban and off-road environments with geometrically distracting obstacles. It can also incorporate terrain preferences, generalize to novel environments, and continue to improve autonomously by gathering more data. Videos, code, and other supplemental material are available on our website https://sites.google.com/view/badgr
Tasks Robot Navigation
Published 2020-02-13
URL https://arxiv.org/abs/2002.05700v1
PDF https://arxiv.org/pdf/2002.05700v1.pdf
PWC https://paperswithcode.com/paper/badgr-an-autonomous-self-supervised-learning
Repo
Framework

On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach

Title On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach
Authors Nicolò Botteghi, Beril Sirmacek, Khaled A. A. Mustafa, Mannes Poel, Stefano Stramigioli
Abstract We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in unknown environments; it relies only on 40-dimensional raw laser data and odometry information. The planner is trained using a reward function shaped based on the online knowledge of the map of the training environment, obtained using a grid-based Rao-Blackwellized particle filter, in an attempt to enhance the obstacle awareness of the agent. The agent is trained in a complex simulated environment and evaluated in two unseen ones. We show that the policy trained using the introduced reward function not only outperforms standard reward functions in terms of convergence speed, by a reduction of 36.9% of the iteration steps, and reduction of the collision samples, but it also drastically improves the behaviour of the agent in unseen environments, respectively by 23% in a simpler workspace and by 45% in a more clustered one. Furthermore, the policy trained in the simulation environment can be directly and successfully transferred to the real robot. A video of our experiments can be found at: https://youtu.be/UEV7W6e6ZqI
Tasks Robot Navigation
Published 2020-02-10
URL https://arxiv.org/abs/2002.04109v1
PDF https://arxiv.org/pdf/2002.04109v1.pdf
PWC https://paperswithcode.com/paper/on-reward-shaping-for-mobile-robot-navigation
Repo
Framework

Stochastic Finite State Control of POMDPs with LTL Specifications

Title Stochastic Finite State Control of POMDPs with LTL Specifications
Authors Mohamadreza Ahmadi, Rangoli Sharan, Joel W. Burdick
Abstract Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g. robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such that the probability of satisfying a set of high-level specifications in terms of linear temporal logic (LTL) formulae is maximized. We begin by casting the latter problem into an optimization and use relaxations based on the Poisson equation and McCormick envelopes. Then, we propose a stochastic bounded policy iteration algorithm, leading to a controlled growth in sFSC size and an anytime algorithm, where the performance of the controller improves with successive iterations, but can be stopped by the user based on time or memory considerations. We illustrate the proposed method by a robot navigation case study.
Tasks Decision Making, Decision Making Under Uncertainty, Robot Navigation, Self-Driving Cars
Published 2020-01-21
URL https://arxiv.org/abs/2001.07679v1
PDF https://arxiv.org/pdf/2001.07679v1.pdf
PWC https://paperswithcode.com/paper/stochastic-finite-state-control-of-pomdps
Repo
Framework

Assurance Monitoring of Cyber-Physical Systems with Machine Learning Components

Title Assurance Monitoring of Cyber-Physical Systems with Machine Learning Components
Authors Dimitrios Boursinos, Xenofon Koutsoukos
Abstract Machine learning components such as deep neural networks are used extensively in Cyber-physical Systems (CPS). However, they may introduce new types of hazards that can have disastrous consequences and need to be addressed for engineering trustworthy systems. Although deep neural networks offer advanced capabilities, they must be complemented by engineering methods and practices that allow effective integration in CPS. In this paper, we investigate how to use the conformal prediction framework for assurance monitoring of CPS with machine learning components. In order to handle high-dimensional inputs in real time, we compute nonconformity scores using embedding representations of the learned models. By leveraging conformal prediction, the approach provides well-calibrated confidence and can allow monitoring that ensures a bounded small error rate while limiting the number of inputs for which an accurate prediction cannot be made. Empirical evaluation results using the German Traffic Sign Recognition Benchmark and a robot navigation dataset demonstrate that the error rates are well-calibrated while the number of alarms is small. The method is computationally efficient, and therefore, the approach is promising for assurance monitoring of CPS.
Tasks Robot Navigation, Traffic Sign Recognition
Published 2020-01-14
URL https://arxiv.org/abs/2001.05014v1
PDF https://arxiv.org/pdf/2001.05014v1.pdf
PWC https://paperswithcode.com/paper/assurance-monitoring-of-cyber-physical
Repo
Framework
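
The conformal prediction idea in the abstract above can be sketched in a few lines. The example below is a minimal inductive conformal predictor under stated assumptions: the nonconformity score is the mean distance to the k nearest training embeddings of the candidate class (one common choice, not necessarily the authors'), and all function names are hypothetical. A monitor could raise an alarm whenever the returned set is empty or contains more than one label.

```python
import numpy as np

def knn_nonconformity(embedding, label, train_emb, train_labels, k=3):
    """Mean distance from `embedding` to its k nearest training embeddings of class `label`."""
    same_class = train_emb[train_labels == label]
    dists = np.sort(np.linalg.norm(same_class - embedding, axis=1))
    return float(dists[:k].mean())

def conformal_p_value(test_score, calibration_scores):
    """Fraction of calibration scores at least as nonconforming as the test score."""
    calibration_scores = np.asarray(calibration_scores, dtype=float)
    return (np.sum(calibration_scores >= test_score) + 1) / (calibration_scores.size + 1)

def prediction_set(embedding, classes, train_emb, train_labels, calibration_scores, epsilon=0.1):
    """Return every candidate label whose p-value exceeds the significance level epsilon."""
    kept = []
    for c in classes:
        score = knn_nonconformity(embedding, c, train_emb, train_labels)
        if conformal_p_value(score, calibration_scores) > epsilon:
            kept.append(c)
    return kept

# Toy usage with made-up two-dimensional embeddings for two classes.
train_emb = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
                      [5.0, 5.0], [5.2, 5.0], [5.0, 5.2]])
train_labels = np.array([0, 0, 0, 1, 1, 1])
calibration = [0.10, 0.20, 0.30, 0.15, 0.25, 0.12, 0.22, 0.18, 0.28]
labels = prediction_set(np.array([0.1, 0.1]), [0, 1], train_emb, train_labels,
                        calibration, epsilon=0.2)
```

Computing scores in a low-dimensional embedding space, rather than on raw inputs, is what keeps the nearest-neighbour search cheap enough for the real-time use the abstract describes.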

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

Title A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
Authors Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare
Abstract We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes. We demonstrate its effectiveness by presenting simple and unified proofs of convergence for a variety of commonly-used methods. We show that value-based methods such as TD($\lambda$) and $Q$-Learning have update rules which are contractive in the space of distributions of functions, thus establishing their exponentially fast convergence to a stationary distribution. We demonstrate that the stationary distribution obtained by any algorithm whose target is an expected Bellman update has a mean which is equal to the true value function. Furthermore, we establish that the distributions concentrate around their mean as the step-size shrinks. We further analyse the optimistic policy iteration algorithm, for which the contraction property does not hold, and formulate a probabilistic policy improvement property which entails the convergence of the algorithm.
Tasks Q-Learning
Published 2020-03-27
URL https://arxiv.org/abs/2003.12239v1
PDF https://arxiv.org/pdf/2003.12239v1.pdf
PWC https://paperswithcode.com/paper/a-distributional-analysis-of-sampling-based
Repo
Framework
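
One claim in the abstract above is that, with a constant step size, the iterates settle into a stationary distribution whose mean is the true value function. The sketch below probes this numerically for synchronous TD(0) on a two-state Markov reward process. The chain, rewards, and function name are invented for illustration, and a simulation like this is only a sanity check, not a substitute for the paper's proofs.

```python
import numpy as np

def synchronous_td0(P, r, gamma, alpha, steps, seed=0):
    """Run synchronous TD(0) with a constant step size and return all iterates.

    At every step each state s samples one successor s' ~ P[s, :] and moves
    v[s] toward the sampled target r[s] + gamma * v[s'].
    """
    rng = np.random.default_rng(seed)
    n = len(r)
    cumulative = np.cumsum(P, axis=1)
    v = np.zeros(n)
    history = np.empty((steps, n))
    for t in range(steps):
        # Inverse-CDF sampling of one successor per state.
        u = rng.random(n)
        next_state = np.minimum((u[:, None] > cumulative).sum(axis=1), n - 1)
        v = (1 - alpha) * v + alpha * (r + gamma * v[next_state])
        history[t] = v
    return history

# Hypothetical two-state chain used only for this illustration.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
r = np.array([1.0, 0.0])
gamma = 0.9
history = synchronous_td0(P, r, gamma, alpha=0.1, steps=50_000)
empirical_mean = history[2_000:].mean(axis=0)            # discard burn-in
true_value = np.linalg.solve(np.eye(2) - gamma * P, r)   # solves v = r + gamma P v
```

The individual iterates keep fluctuating because the step size never decays, but their long-run average should sit close to `true_value`; shrinking `alpha` should tighten the spread around that mean.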

PoWER-BERT: Accelerating BERT inference for Classification Tasks

Title PoWER-BERT: Accelerating BERT inference for Classification Tasks
Authors Saurabh Goyal, Anamitra Roy Choudhary, Venkatesan Chakaravarthy, Saurabh ManishRaje, Yogish Sabharwal, Ashish Verma
Abstract BERT has emerged as a popular model for natural language understanding. Given its compute-intensive nature, even for inference, many recent studies have considered optimization of two important performance characteristics: model size and inference time. We consider classification tasks and propose a novel method, called PoWER-BERT, for improving the inference time for the BERT model without significant loss in accuracy. The method works by eliminating word-vectors (intermediate vector outputs) from the encoder pipeline. We design a strategy for measuring the significance of the word-vectors based on the self-attention mechanism of the encoders, which helps us identify the word-vectors to be eliminated. Experimental evaluation on the standard GLUE benchmark shows that PoWER-BERT achieves up to 4.5x reduction in inference time over BERT with < 1% loss in accuracy. We show that, compared to the prior inference time reduction methods, PoWER-BERT offers a better trade-off between accuracy and inference time. Lastly, we demonstrate that our scheme can also be used in conjunction with ALBERT (a highly compressed version of BERT) and can attain up to a 6.8x reduction in inference time with < 1% loss in accuracy.
Tasks
Published 2020-01-24
URL https://arxiv.org/abs/2001.08950v1
PDF https://arxiv.org/pdf/2001.08950v1.pdf
PWC https://paperswithcode.com/paper/power-bert-accelerating-bert-inference-for
Repo
Framework
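
The abstract above says word-vectors are scored using self-attention and the least significant ones are eliminated. The sketch below shows one plausible way to do that, assuming the score of a position is the total attention it receives across all heads and query positions. That scoring rule, the array shapes, and the function names are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def significance_scores(attention):
    """Score each position by the total attention it receives.

    attention: array of shape (heads, seq_len, seq_len) where attention[h, i, j]
    is the weight that query position i places on key position j.
    """
    return attention.sum(axis=(0, 1))

def retain_top_k(hidden, attention, k):
    """Keep the k highest-scoring word-vectors, preserving their original order."""
    scores = significance_scores(attention)
    keep = np.sort(np.argsort(scores)[-k:])
    return hidden[keep], keep

# Toy usage: one head, three positions, two-dimensional word-vectors.
attention = np.array([[[0.1, 0.8, 0.1],
                       [0.2, 0.7, 0.1],
                       [0.3, 0.6, 0.1]]])
hidden = np.arange(6, dtype=float).reshape(3, 2)
pruned, kept_positions = retain_top_k(hidden, attention, k=2)
```

Each later encoder layer then processes fewer vectors, which is where the inference-time saving would come from.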

SPAN: A Stochastic Projected Approximate Newton Method

Title SPAN: A Stochastic Projected Approximate Newton Method
Authors Xunpeng Huang, Xianfeng Liang, Zhengyang Liu, Yitan Li, Linyun Yu, Yue Yu, Lei Li
Abstract Second-order optimization methods have desirable convergence properties. However, the exact Newton method requires expensive computation for the Hessian and its inverse. In this paper, we propose SPAN, a novel approximate and fast Newton method. SPAN computes the inverse of the Hessian matrix via low-rank approximation and stochastic Hessian-vector products. Our experiments on multiple benchmark datasets demonstrate that SPAN outperforms existing first-order and second-order optimization methods in terms of the convergence wall-clock time. Furthermore, we provide a theoretical analysis of the per-iteration complexity, the approximation error, and the convergence rate. Both the theoretical analysis and experimental results show that our proposed method achieves a better trade-off between the convergence rate and the per-iteration efficiency.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.03687v2
PDF https://arxiv.org/pdf/2002.03687v2.pdf
PWC https://paperswithcode.com/paper/span-a-stochastic-projected-approximate
Repo
Framework
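
To make the ingredients named in the abstract above concrete, the sketch below builds an approximate Newton direction from Hessian-vector products and a randomized low-rank eigendecomposition. It is a generic construction under stated assumptions, not the SPAN algorithm itself: the products here use finite differences of a deterministic gradient rather than stochastic mini-batches, the Hessian is assumed positive definite on the sampled subspace, and `damping` is a hypothetical curvature guess for directions outside that subspace.

```python
import numpy as np

def hessian_vector_product(grad, x, v, eps=1e-5):
    """Finite-difference approximation of H(x) @ v using two gradient calls."""
    return (grad(x + eps * v) - grad(x)) / eps

def low_rank_newton_direction(grad, x, rank, damping=1.0, seed=0):
    """Approximate -H^{-1} g using a rank-`rank` model of the Hessian."""
    rng = np.random.default_rng(seed)
    g = grad(x)
    # Sketch the range of H with random probes, one Hessian-vector product each.
    probes = rng.standard_normal((x.size, rank))
    sketch = np.column_stack(
        [hessian_vector_product(grad, x, probes[:, i]) for i in range(rank)])
    Q, _ = np.linalg.qr(sketch)
    # Project H onto the sketched subspace: B = Q^T H Q.
    HQ = np.column_stack(
        [hessian_vector_product(grad, x, Q[:, i]) for i in range(rank)])
    B = Q.T @ HQ
    eigvals, U = np.linalg.eigh((B + B.T) / 2)
    V = Q @ U  # approximate eigenvectors of H
    coeffs = V.T @ g
    in_span = V @ (coeffs / eigvals)             # invert curvature inside the subspace
    out_of_span = (g - V @ coeffs) / damping     # scaled gradient step elsewhere
    return -(in_span + out_of_span)

# Usage on a small quadratic 0.5 x^T A x - b^T x, whose gradient is A x - b.
A = np.array([[4.0, 1.0, 0.0, 0.0],
              [1.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
direction = low_rank_newton_direction(lambda x: A @ x - b, np.zeros(4), rank=4)
```

With `rank` equal to the problem dimension the direction matches the exact Newton step; the point of a low-rank method is to use a much smaller rank, trading some accuracy for far fewer Hessian-vector products per iteration.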

Separating the Effects of Batch Normalization on CNN Training Speed and Stability Using Classical Adaptive Filter Theory

Title Separating the Effects of Batch Normalization on CNN Training Speed and Stability Using Classical Adaptive Filter Theory
Authors Elaina Chai, Mert Pilanci, Boris Murmann
Abstract Batch Normalization (BatchNorm) is commonly used in Convolutional Neural Networks (CNNs) to improve training speed and stability. However, there is still limited consensus on why this technique is effective. This paper uses concepts from the traditional adaptive filter domain to provide insight into the dynamics and inner workings of BatchNorm. First, we show that the convolution weight updates have natural modes whose stability and convergence speed are tied to the eigenvalues of the input autocorrelation matrices, which are controlled by BatchNorm through the convolution layers’ channel-wise structure. Furthermore, our experiments demonstrate that the speed and stability benefits are distinct effects. At low learning rates, it is BatchNorm’s amplification of the smallest eigenvalues that improves convergence speed, while at high learning rates, it is BatchNorm’s suppression of the largest eigenvalues that ensures stability. Lastly, we prove that in the first training step, when normalization is needed most, BatchNorm satisfies the same optimization as Normalized Least Mean Square (NLMS), while it continues to approximate this condition in subsequent steps. The analyses provided in this paper lay the groundwork for gaining further insight into the operation of modern neural network structures using adaptive filter theory.
Tasks
Published 2020-02-25
URL https://arxiv.org/abs/2002.10674v1
PDF https://arxiv.org/pdf/2002.10674v1.pdf
PWC https://paperswithcode.com/paper/separating-the-effects-of-batch-normalization
Repo
Framework
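
For readers unfamiliar with the adaptive-filter side of the comparison in the abstract above, the sketch below shows the textbook Normalized Least Mean Square (NLMS) update, in which each weight correction is divided by the energy of the current input. It illustrates NLMS only and does not reproduce the paper's BatchNorm analysis; the data and the function name are made up for the example.

```python
import numpy as np

def nlms(inputs, targets, step=0.5, eps=1e-8):
    """Fit linear weights with the Normalized LMS rule.

    For each sample: w <- w + step * error * x / (eps + ||x||^2),
    where error = target - w.x and eps guards against division by zero.
    """
    w = np.zeros(inputs.shape[1])
    for x, d in zip(inputs, targets):
        error = d - w @ x
        w = w + step * error * x / (eps + x @ x)
    return w

# Usage: recover known weights from noiseless synthetic data.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
inputs = rng.standard_normal((500, 3))
targets = inputs @ true_w
estimated_w = nlms(inputs, targets)
```

Because the update is normalized by the input energy, the effective step does not blow up for large inputs or stall for small ones, which is the kind of scale insensitivity the abstract attributes to BatchNorm.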

How social feedback processing in the brain shapes collective opinion processes in the era of social media

Title How social feedback processing in the brain shapes collective opinion processes in the era of social media
Authors Sven Banisch, Felix Gaisbauer, Eckehard Olbrich
Abstract What are the mechanisms by which groups with certain opinions gain public voice and force others holding a different view into silence? And how does social media play into this? Drawing on recent neuro-scientific insights into the processing of social feedback, we develop a theoretical model that allows us to address these questions. The model captures phenomena described by the spiral of silence theory of public opinion, provides a mechanism-based foundation for it, and in this way allows more general insight into how different group structures relate to different regimes of collective opinion expression. Even strong majorities can be forced into silence if a minority acts as a cohesive whole. The proposed framework of social feedback theory (SFT) highlights the need for sociological theorising to understand the societal-level implications of findings in social and cognitive neuroscience.
Tasks
Published 2020-03-18
URL https://arxiv.org/abs/2003.08154v1
PDF https://arxiv.org/pdf/2003.08154v1.pdf
PWC https://paperswithcode.com/paper/how-social-feedback-processing-in-the-brain
Repo
Framework

A Multi-view CNN-based Acoustic Classification System for Automatic Animal Species Identification

Title A Multi-view CNN-based Acoustic Classification System for Automatic Animal Species Identification
Authors Weitao Xu, Xiang Zhang, Lina Yao, Wanli Xue, Bo Wei
Abstract Automatic identification of animal species by their vocalization is an important and challenging task. Although many kinds of audio monitoring systems have been proposed in the literature, they suffer from several disadvantages such as non-trivial feature selection, accuracy degradation because of environmental noise, or intensive local computation. In this paper, we propose a deep learning based acoustic classification framework for Wireless Acoustic Sensor Networks (WASNs). The proposed framework is based on a cloud architecture, which relaxes the computational burden on the wireless sensor node. To improve the recognition accuracy, we design a multi-view Convolutional Neural Network (CNN) to extract the short-, middle-, and long-term dependencies in parallel. The evaluation on two real datasets shows that the proposed architecture can achieve high accuracy and outperforms traditional classification systems significantly when the environmental noise dominates the audio signal (low SNR). Moreover, we implement and deploy the proposed system on a testbed and analyse the system performance in real-world environments. Both simulation and real-world evaluation demonstrate the accuracy and robustness of the proposed acoustic classification system in distinguishing species of animals.
Tasks Feature Selection
Published 2020-02-23
URL https://arxiv.org/abs/2002.09821v1
PDF https://arxiv.org/pdf/2002.09821v1.pdf
PWC https://paperswithcode.com/paper/a-multi-view-cnn-based-acoustic
Repo
Framework

Contextual Blocking Bandits

Title Contextual Blocking Bandits
Authors Soumya Basu, Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai
Abstract We study a novel variant of the multi-armed bandit problem, where at each time step, the player observes a context that determines the arms’ mean rewards. However, playing an arm blocks it (across all contexts) for a fixed number of future time steps. This model extends the blocking bandits model (Basu et al., NeurIPS19) to a contextual setting, and captures important scenarios such as recommendation systems or ad placement with diverse users, and processing a diverse pool of jobs. This contextual setting, however, invalidates greedy solution techniques that are effective for its non-contextual counterpart. Assuming knowledge of the mean reward for each arm-context pair, we design a randomized LP-based algorithm which is $\alpha$-optimal in (large enough) $T$ time steps, where $\alpha = \tfrac{d_{\max}}{2d_{\max}-1}\left(1- \epsilon\right)$ for any $\epsilon >0$, and $d_{\max}$ is the maximum delay of the arms. In the bandit setting, we show that a UCB-based variant of the above online policy guarantees $\mathcal{O}\left(\log T\right)$ regret w.r.t. the $\alpha$-optimal strategy in $T$ time steps, which matches the $\Omega(\log(T))$ regret lower bound in this setting. Due to the time correlation caused by the blocking of arms, existing techniques for upper bounding regret fail. As a first, in the presence of such temporal correlations, we combine ideas from coupling of non-stationary Markov chains and opportunistic sub-sampling with suboptimality charging techniques from combinatorial bandits to prove our regret upper bounds.
Tasks Recommendation Systems
Published 2020-03-06
URL https://arxiv.org/abs/2003.03426v1
PDF https://arxiv.org/pdf/2003.03426v1.pdf
PWC https://paperswithcode.com/paper/contextual-blocking-bandits
Repo
Framework
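
The approximation factor $\alpha$ quoted in the abstract above is easy to evaluate numerically. The helper below simply computes that formula; its name is hypothetical and it says nothing about the LP-based algorithm itself.

```python
def approximation_ratio(d_max: int, epsilon: float) -> float:
    """Evaluate alpha = d_max / (2 * d_max - 1) * (1 - epsilon) from the abstract."""
    if d_max < 1:
        raise ValueError("d_max must be a positive integer")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return d_max / (2 * d_max - 1) * (1 - epsilon)

# With no blocking beyond the current step (d_max = 1) the factor is 1 - epsilon;
# as d_max grows it approaches (1 - epsilon) / 2.
ratios = {d: approximation_ratio(d, 0.01) for d in (1, 2, 5, 100)}
```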