January 30, 2020

3475 words 17 mins read

Paper Group ANR 385

Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies?. Learning-Aided Physical Layer Attacks Against Multicarrier Communications in IoT. Factor Analysis in Fault Diagnostics Using Random Forest. Multi-Branch Tensor Network Structure for Tensor-Train Discriminant Analysis. Semantic Adversarial Network wi …

Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies?


Title	Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies?
Authors	Céline Hocquette, Stephen H. Muggleton
Abstract	World-class human players have been outperformed in a number of complex two person games (Go, Chess, Checkers) by Deep Reinforcement Learning systems. However, owing to tractability considerations minimax regret of a learning system cannot be evaluated in such games. In this paper we consider simple games (Noughts-and-Crosses and Hexapawn) in which minimax regret can be efficiently evaluated. We use these games to compare Cumulative Minimax Regret for variants of both standard and deep reinforcement learning against two variants of a new Meta-Interpretive Learning system called MIGO. In our experiments all tested variants of both normal and deep reinforcement learning have worse performance (higher cumulative minimax regret) than both variants of MIGO on Noughts-and-Crosses and Hexapawn. Additionally, MIGO’s learned rules are relatively easy to comprehend, and are demonstrated to achieve significant transfer learning in both directions between Noughts-and-Crosses and Hexapawn.
Tasks	Transfer Learning
Published	2019-02-26
URL	http://arxiv.org/abs/1902.09835v1
PDF	http://arxiv.org/pdf/1902.09835v1.pdf
PWC	https://paperswithcode.com/paper/can-meta-interpretive-learning-outperform
Repo
Framework

Learning-Aided Physical Layer Attacks Against Multicarrier Communications in IoT


Title	Learning-Aided Physical Layer Attacks Against Multicarrier Communications in IoT
Authors	Alireza Nooraiepour, Waheed U. Bajwa, Narayan B. Mandayam
Abstract	Internet-of-Things (IoT) devices that are limited in power and processing capabilities are susceptible to physical layer (PHY) spoofing attacks owing to their inability to implement a full-blown protocol stack for security. The overwhelming adoption of multicarrier communications for the PHY layer makes IoT devices further vulnerable to PHY spoofing attacks. These attacks which aim at injecting bogus data into the receiver, involve inferring transmission parameters and finding PHY characteristics of the transmitted signals so as to spoof the received signal. Non-contiguous orthogonal frequency division multiplexing (NC-OFDM) systems have been argued to have low probability of exploitation (LPE) characteristics against classic attacks based on cyclostationary analysis. However, with the advent of machine learning (ML) algorithms, adversaries can devise data-driven attacks to compromise such systems. It is in this vein that PHY spoofing performance of adversaries equipped with supervised and unsupervised ML tools are investigated in this paper. The supervised ML approach is based on estimation/classification utilizing deep neural networks (DNN) while the unsupervised one employs variational autoencoders (VAEs). In particular, VAEs are shown to be capable of learning representations from NC-OFDM signals related to their PHY characteristics such as frequency pattern and modulation scheme, which are useful for PHY spoofing. In addition, a new metric based on the disentanglement principle is proposed to measure the quality of such learned representations. Simulation results demonstrate that the performance of the spoofing adversaries highly depends on the subcarriers’ allocation patterns used at the transmitter. Particularly, it is shown that utilizing a random subcarrier occupancy pattern precludes the adversary from spoofing and secures NC-OFDM systems against ML-based attacks.
Tasks
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00195v1
PDF	https://arxiv.org/pdf/1908.00195v1.pdf
PWC	https://paperswithcode.com/paper/learning-aided-physical-layer-attacks-against
Repo
Framework

Factor Analysis in Fault Diagnostics Using Random Forest


Title	Factor Analysis in Fault Diagnostics Using Random Forest
Authors	Nagdev Amruthnath, Tarun Gupta
Abstract	Factor analysis or sometimes referred to as variable analysis has been extensively used in classification problems for identifying specific factors that are significant to particular classes. This type of analysis has been widely used in application such as customer segmentation, medical research, network traffic, image, and video classification. Today, factor analysis is prominently being used in fault diagnosis of machines to identify the significant factors and to study the root cause of a specific machine fault. The advantage of performing factor analysis in machine maintenance is to perform prescriptive analysis (helps answer what actions to take?) and preemptive analysis (helps answer how to eliminate the failure mode?). In this paper, a real case of an industrial rotating machine was considered where vibration and ambient temperature data was collected for monitoring the health of the machine. Gaussian mixture model-based clustering was used to cluster the data into significant groups, and spectrum analysis was used to diagnose each cluster to a specific state of the machine. The significant features that attribute to a particular mode of the machine were identified by using the random forest classification model. The significant features for specific modes of the machine were used to conclude that the clusters generated are distinct and have a unique set of significant features.
Tasks	Video Classification
Published	2019-04-30
URL	http://arxiv.org/abs/1904.13366v1
PDF	http://arxiv.org/pdf/1904.13366v1.pdf
PWC	https://paperswithcode.com/paper/factor-analysis-in-fault-diagnostics-using
Repo
Framework

Multi-Branch Tensor Network Structure for Tensor-Train Discriminant Analysis


Title	Multi-Branch Tensor Network Structure for Tensor-Train Discriminant Analysis
Authors	Seyyid Emre Sofuoglu, Selin Aviyente
Abstract	Higher-order data with high dimensionality arise in a diverse set of application areas such as computer vision, video analytics and medical imaging. Tensors provide a natural tool for representing these types of data. Although there has been a lot of work in the area of tensor decomposition and low-rank tensor approximation, extensions to supervised learning, feature extraction and classification are still limited. Moreover, most of the existing supervised tensor learning approaches are based on the orthogonal Tucker model. However, this model has some limitations for large tensors including high memory and computational costs. In this paper, we introduce a supervised learning approach for tensor classification based on the tensor-train model. In particular, we introduce a multi-branch tensor network structure for efficient implementation of tensor-train discriminant analysis (TTDA). The proposed approach takes advantage of the flexibility of the tensor train structure to implement various computationally efficient versions of TTDA. This approach is then evaluated on image and video classification tasks with respect to computation time, storage cost and classification accuracy and is compared to both vector and tensor based discriminant analysis methods.
Tasks	Video Classification
Published	2019-04-15
URL	http://arxiv.org/abs/1904.06788v1
PDF	http://arxiv.org/pdf/1904.06788v1.pdf
PWC	https://paperswithcode.com/paper/multi-branch-tensor-network-structure-for
Repo
Framework

Semantic Adversarial Network with Multi-scale Pyramid Attention for Video Classification


Title	Semantic Adversarial Network with Multi-scale Pyramid Attention for Video Classification
Authors	De Xie, Cheng Deng, Hao Wang, Chao Li, Dapeng Tao
Abstract	Two-stream architecture have shown strong performance in video classification task. The key idea is to learn spatio-temporal features by fusing convolutional networks spatially and temporally. However, there are some problems within such architecture. First, it relies on optical flow to model temporal information, which are often expensive to compute and store. Second, it has limited ability to capture details and local context information for video data. Third, it lacks explicit semantic guidance that greatly decrease the classification performance. In this paper, we proposed a new two-stream based deep framework for video classification to discover spatial and temporal information only from RGB frames, moreover, the multi-scale pyramid attention (MPA) layer and the semantic adversarial learning (SAL) module is introduced and integrated in our framework. The MPA enables the network capturing global and local feature to generate a comprehensive representation for video, and the SAL can make this representation gradually approximate to the real video semantics in an adversarial manner. Experimental results on two public benchmarks demonstrate our proposed methods achieves state-of-the-art results on standard video datasets.
Tasks	Optical Flow Estimation, Video Classification
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02155v1
PDF	http://arxiv.org/pdf/1903.02155v1.pdf
PWC	https://paperswithcode.com/paper/semantic-adversarial-network-with-multi-scale
Repo
Framework

Efficient Video Classification Using Fewer Frames


Title	Efficient Video Classification Using Fewer Frames
Authors	Shweta Bhardwaj, Mukundhan Srinivasan, Mitesh M. Khapra
Abstract	Recently,there has been a lot of interest in building compact models for video classification which have a small memory footprint (<1 GB). While these models are compact, they typically operate by repeated application of a small weight matrix to all the frames in a video. E.g. recurrent neural network based methods compute a hidden state for every frame of the video using a recurrent weight matrix. Similarly, cluster-and-aggregate based methods such as NetVLAD, have a learnable clustering matrix which is used to assign soft-clusters to every frame in the video. Since these models look at every frame in the video, the number of floating point operations (FLOPs) is still large even though the memory footprint is small. We focus on building compute-efficient video classification models which process fewer frames and hence have less number of FLOPs. Similar to memory efficient models, we use the idea of distillation albeit in a different setting. Specifically, in our case, a compute-heavy teacher which looks at all the frames in the video is used to train a compute-efficient student which looks at only a small fraction of frames in the video. This is in contrast to a typical memory efficient Teacher-Student setting, wherein both the teacher and the student look at all the frames in the video but the student has fewer parameters. Our work thus complements the research on memory efficient video classification. We do an extensive evaluation with three types of models for video classification,viz.(i) recurrent models (ii) cluster-and-aggregate models and (iii) memory-efficient cluster-and-aggregate models and show that in each of these cases, a see-it-all teacher can be used to train a compute efficient see-very-little student. We show that the proposed student network can reduce the inference time by 30% and the number of FLOPs by approximately 90% with a negligible drop in the performance.
Tasks	Video Classification
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10640v1
PDF	http://arxiv.org/pdf/1902.10640v1.pdf
PWC	https://paperswithcode.com/paper/efficient-video-classification-using-fewer
Repo
Framework

Understanding and Training Deep Diagonal Circulant Neural Networks


Title	Understanding and Training Deep Diagonal Circulant Neural Networks
Authors	Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre, Jamal Atif
Abstract	In this paper, we study deep diagonal circulant neural networks, that is deep neural networks in which weight matrices are the product of diagonal and circulant ones. Besides making a theoretical analysis of their expressivity, we introduced principled techniques for training these models: we devise an initialization scheme and proposed a smart use of non-linearity functions in order to train deep diagonal circulant networks. Furthermore, we show that these networks outperform recently introduced deep networks with other types of structured layers. We conduct a thorough experimental study to compare the performance of deep diagonal circulant networks with state of the art models based on structured matrices and with dense models. We show that our models achieve better accuracy than other structured approaches while required 2x fewer weights as the next best approach. Finally we train deep diagonal circulant networks to build a compact and accurate models on a real world video classification dataset with over 3.8 million training examples.
Tasks	Video Classification
Published	2019-01-29
URL	https://arxiv.org/abs/1901.10255v3
PDF	https://arxiv.org/pdf/1901.10255v3.pdf
PWC	https://paperswithcode.com/paper/on-the-expressive-power-of-deep-fully
Repo
Framework

Practical Risk Measures in Reinforcement Learning


Title	Practical Risk Measures in Reinforcement Learning
Authors	Dotan Di Castro, Joel Oren, Shie Mannor
Abstract	Practical application of Reinforcement Learning (RL) often involves risk considerations. We study a generalized approximation scheme for risk measures, based on Monte-Carlo simulations, where the risk measures need not necessarily be \emph{coherent}. We demonstrate that, even in simple problems, measures such as the variance of the reward-to-go do not capture the risk in a satisfactory manner. In addition, we show how a risk measure can be derived from model’s realizations. We propose a neural architecture for estimating the risk and suggest the risk critic architecture that can be use to optimize a policy under general risk measures. We conclude our work with experiments that demonstrate the efficacy of our approach.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08379v1
PDF	https://arxiv.org/pdf/1908.08379v1.pdf
PWC	https://paperswithcode.com/paper/practical-risk-measures-in-reinforcement
Repo
Framework

A General Data Renewal Model for Prediction Algorithms in Industrial Data Analytics


Title	A General Data Renewal Model for Prediction Algorithms in Industrial Data Analytics
Authors	Hongzhi Wang, Yijie Yang, Yang Song
Abstract	In industrial data analytics, one of the fundamental problems is to utilize the temporal correlation of the industrial data to make timely predictions in the production process, such as fault prediction and yield prediction. However, the traditional prediction models are fixed while the conditions of the machines change over time, thus making the errors of predictions increase with the lapse of time. In this paper, we propose a general data renewal model to deal with it. Combined with the similarity function and the loss function, it estimates the time of updating the existing prediction model, then updates it according to the evaluation function iteratively and adaptively. We have applied the data renewal model to two prediction algorithms. The experiments demonstrate that the data renewal model can effectively identify the changes of data, update and optimize the prediction model so as to improve the accuracy of prediction.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08368v1
PDF	https://arxiv.org/pdf/1908.08368v1.pdf
PWC	https://paperswithcode.com/paper/a-general-data-renewal-model-for-prediction
Repo
Framework


Title	Social Learning in Multi Agent Multi Armed Bandits
Authors	Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai
Abstract	In this paper, we introduce a distributed version of the classical stochastic Multi-Arm Bandit (MAB) problem. Our setting consists of a large number of agents $n$ that collaboratively and simultaneously solve the same instance of $K$ armed MAB to minimize the average cumulative regret over all agents. The agents can communicate and collaborate among each other \emph{only} through a pairwise asynchronous gossip based protocol that exchange a limited number of bits. In our model, agents at each point decide on (i) which arm to play, (ii) whether to, and if so (iii) what and whom to communicate with. Agents in our model are decentralized, namely their actions only depend on their observed history in the past. We develop a novel algorithm in which agents, whenever they choose, communicate only arm-ids and not samples, with another agent chosen uniformly and independently at random. The per-agent regret scaling achieved by our algorithm is $O \left( \frac{\lceil\frac{K}{n}\rceil+\log(n)}{\Delta} \log(T) + \frac{\log^3(n) \log \log(n)}{\Delta^2} \right)$. Furthermore, any agent in our algorithm communicates only a total of $\Theta(\log(T))$ times over a time interval of $T$. We compare our results to two benchmarks - one where there is no communication among agents and one corresponding to complete interaction. We show both theoretically and empirically, that our algorithm experiences a significant reduction both in per-agent regret when compared to the case when agents do not collaborate and in communication complexity when compared to the full interaction setting which requires $T$ communication attempts by an agent over $T$ arm pulls.
Tasks	Multi-Armed Bandits
Published	2019-10-04
URL	https://arxiv.org/abs/1910.02100v3
PDF	https://arxiv.org/pdf/1910.02100v3.pdf
PWC	https://paperswithcode.com/paper/social-learning-in-multi-agent-multi-armed
Repo
Framework

Comparing Offline and Online Testing of Deep Neural Networks: An Autonomous Car Case Study


Title	Comparing Offline and Online Testing of Deep Neural Networks: An Autonomous Car Case Study
Authors	Fitash Ul Haq, Donghwan Shin, Shiva Nejati, Lionel Briand
Abstract	There is a growing body of research on developing testing techniques for Deep Neural Networks (DNN). We distinguish two general modes of testing for DNNs: Offline testing where DNNs are tested as individual units based on test datasets obtained independently from the DNNs under test, and online testing where DNNs are embedded into a specific application and tested in a close-loop mode in interaction with the application environment. In addition, we identify two sources for generating test datasets for DNNs: Datasets obtained from real-life and datasets generated by simulators. While offline testing can be used with datasets obtained from either sources, online testing is largely confined to using simulators since online testing within real-life applications can be time-consuming, expensive and dangerous. In this paper, we study the following two important questions aiming to compare test datasets and testing modes for DNNs: First, can we use simulator-generated data as a reliable substitute to real-world data for the purpose of DNN testing? Second, how do online and offline testing results differ and complement each other? Though these questions are generally relevant to all autonomous systems, we study them in the context of automated driving systems where, as study subjects, we use DNNs automating end-to-end control of cars’ steering actuators. Our results show that simulator-generated datasets are able to yield DNN prediction errors that are similar to those obtained by testing DNNs with real-life datasets. Further, offline testing is more optimistic than online testing as many safety violations identified by online testing could not be identified by offline testing, while large prediction errors generated by offline testing always led to severe safety violations detectable by online testing.
Tasks
Published	2019-11-28
URL	https://arxiv.org/abs/1912.00805v1
PDF	https://arxiv.org/pdf/1912.00805v1.pdf
PWC	https://paperswithcode.com/paper/comparing-offline-and-online-testing-of-deep
Repo
Framework

Alternating Phase Projected Gradient Descent with Generative Priors for Solving Compressive Phase Retrieval


Title	Alternating Phase Projected Gradient Descent with Generative Priors for Solving Compressive Phase Retrieval
Authors	Rakib Hyder, Viraj Shah, Chinmay Hegde, M. Salman Asif
Abstract	The classical problem of phase retrieval arises in various signal acquisition systems. Due to the ill-posed nature of the problem, the solution requires assumptions on the structure of the signal. In the last several years, sparsity and support-based priors have been leveraged successfully to solve this problem. In this work, we propose replacing the sparsity/support priors with generative priors and propose two algorithms to solve the phase retrieval problem. Our proposed algorithms combine the ideas from AltMin approach for non-convex sparse phase retrieval and projected gradient descent approach for solving linear inverse problems using generative priors. We empirically show that the performance of our method with projected gradient descent is superior to the existing approach for solving phase retrieval under generative priors. We support our method with an analysis of sample complexity with Gaussian measurements.
Tasks
Published	2019-03-07
URL	http://arxiv.org/abs/1903.02707v1
PDF	http://arxiv.org/pdf/1903.02707v1.pdf
PWC	https://paperswithcode.com/paper/alternating-phase-projected-gradient-descent
Repo
Framework

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes


Title	Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes
Authors	Yichun Hu, Nathan Kallus, Xiaojie Mao
Abstract	We study a nonparametric contextual bandit problem where the expected reward functions belong to a H"older class with smoothness parameter $\beta$. We show how this interpolates between two extremes that were previously studied in isolation: non-differentiable bandits ($\beta\leq1$), where rate-optimal regret is achieved by running separate non-contextual bandits in different context regions, and parametric-response bandits ($\beta=\infty$), where rate-optimal regret can be achieved with minimal or no exploration due to infinite extrapolatability. We develop a novel algorithm that carefully adjusts to all smoothness settings and we prove its regret is rate-optimal by establishing matching upper and lower bounds, recovering the existing results at the two extremes. In this sense, our work bridges the gap between the existing literature on parametric and non-differentiable contextual bandit problems and between bandit algorithms that exclusively use global or local information, shedding light on the crucial interplay of complexity and regret in contextual bandits.
Tasks	Multi-Armed Bandits
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02553v1
PDF	https://arxiv.org/pdf/1909.02553v1.pdf
PWC	https://paperswithcode.com/paper/smooth-contextual-bandits-bridging-the
Repo
Framework

Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback


Title	Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback
Authors	Arun Verma, Manjesh K. hanawal, Arun Rajkumar, Raman Sankaran
Abstract	In this paper, we study censored Semi-Bandits, a novel variant of the semi-bandits problem. The learner is assumed to have a fixed amount of resources, which it allocates to the arms at each time step. The loss observed from an arm is random and depends on the amount of resources allocated to it. More specifically, the loss equals zero if the allocation for the arm exceeds a constant (but unknown)threshold that can be dependent on the arm. Our goal is to learn a feasible allocation that minimizes the expected loss. The problem is challenging because the loss distribution and threshold value of each arm are unknown. We study this novel setting by establishing its `equivalence’ to Multiple-Play Multi-Armed Bandits(MP-MAB) and Combinatorial Semi-Bandits. Exploiting these equivalences, we derive optimal algorithms for our setting using existing algorithms for MP-MABand Combinatorial Semi-Bandits. Experiments on synthetically generated data validate performance guarantees of the proposed algorithms. \|
Tasks	Multi-Armed Bandits
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01504v3
PDF	https://arxiv.org/pdf/1909.01504v3.pdf
PWC	https://paperswithcode.com/paper/censored-semi-bandits-a-framework-for
Repo
Framework

A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits


Title	A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits
Authors	Huozhi Zhou, Lingda Wang, Lav R. Varshney, Ee-Peng Lim
Abstract	We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, \texttt{GLR-CUCB}, which incorporates an efficient combinatorial semi-bandit algorithm, \texttt{CUCB}, with an almost parameter-free change-point detector, the \emph{Generalized Likelihood Ratio Test} (GLRT). Our analysis shows that the regret of \texttt{GLR-CUCB} is upper bounded by $\mathcal{O}(\sqrt{NKT\log{T}})$, where $N$ is the number of piecewise-stationary segments, $K$ is the number of base arms, and $T$ is the number of time steps. As a complement, we also derive a nearly matching regret lower bound on the order of $\Omega(\sqrt{NKT}$), for both piecewise-stationary multi-armed bandits and combinatorial semi-bandits, using information-theoretic techniques and judiciously constructed piecewise-stationary bandit instances. Our lower bound is tighter than the best available regret lower bound, which is $\Omega(\sqrt{T})$. Numerical experiments on both synthetic and real-world datasets demonstrate the superiority of \texttt{GLR-CUCB} compared to other state-of-the-art algorithms.
Tasks	Multi-Armed Bandits
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10402v4
PDF	https://arxiv.org/pdf/1908.10402v4.pdf
PWC	https://paperswithcode.com/paper/a-near-optimal-change-detection-based
Repo
Framework