April 2, 2020

# Paper Group ANR 231

RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks. Predictive online optimisation with applications to optical flow. Depth Map Estimation of Dynamic Scenes Using Prior Depth Information. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition. A Model-Based Derivative-Free Approach to Blac …

#### RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks

Title RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks
Authors Jingkun Gao, Xiaomin Song, Qingsong Wen, Pichao Wang, Liang Sun, Huan Xu
Abstract The monitoring and management of numerous and diverse time series data at Alibaba Group calls for an effective and scalable time series anomaly detection service. In this paper, we propose RobustTAD, a Robust Time series Anomaly Detection framework by integrating robust seasonal-trend decomposition and convolutional neural network for time series data. The seasonal-trend decomposition can effectively handle complicated patterns in time series, and meanwhile significantly simplifies the architecture of the neural network, which is an encoder-decoder architecture with skip connections. This architecture can effectively capture the multi-scale information from time series, which is very useful in anomaly detection. Due to the limited labeled data in time series anomaly detection, we systematically investigate data augmentation methods in both time and frequency domains. We also introduce label-based weight and value-based weight in the loss function by utilizing the unbalanced nature of the time series anomaly detection problem. Compared with the widely used forecasting-based anomaly detection algorithms, decomposition-based algorithms, traditional statistical algorithms, as well as recent neural network based algorithms, RobustTAD performs significantly better on public benchmark datasets. It is deployed as a public online service and widely adopted in different business scenarios at Alibaba Group.
Tasks Anomaly Detection, Data Augmentation, Time Series
Published 2020-02-21
URL https://arxiv.org/abs/2002.09545v1
PDF https://arxiv.org/pdf/2002.09545v1.pdf
Repo
Framework
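
The abstract mentions label-based and value-based weights in the loss but gives no formulas. The following is a minimal sketch, assuming a per-point weighted binary cross-entropy: anomalous points get a larger label weight, and the value weight grows with the absolute magnitude of each point. The function name, `pos_weight`, `value_scale`, and the form of both weights are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def weighted_bce(y_true, y_prob, values, pos_weight=5.0, value_scale=1.0, eps=1e-7):
    """Weighted binary cross-entropy for point-wise anomaly labels.

    All weighting choices here are assumptions made for illustration.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    values = np.asarray(values, dtype=float)

    # Label-based weight (assumed form): rare anomalous points count more.
    label_w = np.where(y_true == 1.0, pos_weight, 1.0)
    # Value-based weight (assumed form): larger magnitudes count more.
    value_w = 1.0 + value_scale * np.abs(values)

    bce = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    w = label_w * value_w
    # Normalise by the total weight so the loss scale stays comparable.
    return float(np.sum(w * bce) / np.sum(w))
```

With `pos_weight=1.0` and `value_scale=0.0` this reduces to the ordinary mean cross-entropy, which makes the effect of each weight easy to isolate.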

#### Predictive online optimisation with applications to optical flow

Title Predictive online optimisation with applications to optical flow
Authors Tuomo Valkonen
Abstract Online optimisation revolves around new data being introduced into a problem while it is still being solved; think of deep learning as more training samples become available. We adapt the idea to dynamic inverse problems such as video processing with optical flow. We introduce a corresponding predictive online primal-dual proximal splitting method. The video frames now exactly correspond to the algorithm iterations. A user-prescribed predictor describes the evolution of the primal variable. To prove convergence we need a predictor for the dual variable based on (proximal) gradient flow. This affects the model that the method asymptotically minimises. We show that for inverse problems the effect is, essentially, to construct a new dynamic regulariser based on infimal convolution of the static regularisers with the temporal coupling. We develop regularisation theory for dynamic inverse problems, and show the convergence of the algorithmic solutions in terms of this theory. We finish by demonstrating excellent real-time performance of our method in computational image stabilisation.
Published 2020-02-08
URL https://arxiv.org/abs/2002.03053v1
PDF https://arxiv.org/pdf/2002.03053v1.pdf
PWC https://paperswithcode.com/paper/predictive-online-optimisation-with
Repo
Framework

#### Depth Map Estimation of Dynamic Scenes Using Prior Depth Information

Title Depth Map Estimation of Dynamic Scenes Using Prior Depth Information
Authors James Noraky, Vivienne Sze
Abstract Depth information is useful for many applications. Active depth sensors are appealing because they obtain dense and accurate depth maps. However, due to issues that range from power constraints to multi-sensor interference, these sensors cannot always be continuously used. To overcome this limitation, we propose an algorithm that estimates depth maps using concurrently collected images and a previously measured depth map for dynamic scenes, where both the camera and objects in the scene may be independently moving. To estimate depth in these scenarios, our algorithm models the dynamic scene motion using independent and rigid motions. It then uses the previous depth map to efficiently estimate these rigid motions and obtain a new depth map. Our goal is to balance the acquisition of depth between the active depth sensor and computation, without incurring a large computational cost. Thus, we leverage the prior depth information to avoid computationally expensive operations like dense optical flow estimation or segmentation used in similar approaches. Our approach can obtain dense depth maps at up to real-time (30 FPS) on a standard laptop computer, which is orders of magnitude faster than similar approaches. When evaluated using RGB-D datasets of various dynamic scenes, our approach estimates depth maps with a mean relative error of 2.5% while reducing the active depth sensor usage by over 90%.
Published 2020-02-02
URL https://arxiv.org/abs/2002.00297v1
PDF https://arxiv.org/pdf/2002.00297v1.pdf
PWC https://paperswithcode.com/paper/depth-map-estimation-of-dynamic-scenes-using
Repo
Framework
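
The core geometric step behind reusing a previous depth map can be sketched as follows: back-project each pixel to 3D, apply a rigid motion, and project again. This is only a sketch under stated assumptions: it uses a pinhole camera with a hypothetical intrinsic matrix `K` and takes the rotation `R` and translation `t` as given, whereas the abstract's algorithm estimates several independent rigid motions itself.

```python
import numpy as np

def reproject_depth(depth, K, R, t):
    """Warp a depth map by a known rigid motion (pinhole model assumed).

    Returns (pixels, new_depth): pixels has shape (H*W, 2) with the new
    (u, v) image coordinates, new_depth has shape (H*W,).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0)  # 3 x N

    # Back-project: a ray through each pixel, scaled by its measured depth.
    rays = np.linalg.inv(K) @ pix
    points = rays * depth.ravel()

    # Apply the rigid motion, then project back into the image.
    moved = R @ points + np.asarray(t, dtype=float).reshape(3, 1)
    proj = K @ moved
    pixels = (proj[:2] / proj[2]).T
    return pixels, moved[2]
```

With the identity motion the function returns the original pixel grid and depths, which is a convenient sanity check before plugging in estimated motions.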

#### Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

Title Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
Authors Jonathan Munro, Dima Damen
Abstract Fine-grained action recognition datasets exhibit environmental bias, where multiple video sequences are captured from a limited number of environments. Training a model in one environment and deploying in another results in a drop in performance due to an unavoidable domain shift. Unsupervised Domain Adaptation (UDA) approaches have frequently utilised adversarial training between the source and target domains. However, these approaches have not explored the multi-modal nature of video within each domain. In this work we exploit the correspondence of modalities as a self-supervised alignment approach for UDA in addition to adversarial alignment. We test our approach on three kitchens from our large-scale dataset, EPIC-Kitchens, using two modalities commonly employed for action recognition: RGB and Optical Flow. We show that multi-modal self-supervision alone improves the performance over source-only training by 2.4% on average. We then combine adversarial training with multi-modal self-supervision, showing that our approach outperforms other UDA methods by 3%.
Published 2020-01-27
URL https://arxiv.org/abs/2001.09691v2
PDF https://arxiv.org/pdf/2001.09691v2.pdf
Repo
Framework

#### A Model-Based Derivative-Free Approach to Black-Box Adversarial Examples: BOBYQA

Title A Model-Based Derivative-Free Approach to Black-Box Adversarial Examples: BOBYQA
Authors Giuseppe Ughi, Vinayak Abrol, Jared Tanner
Abstract We demonstrate that model-based derivative-free optimisation algorithms can generate adversarial targeted misclassification of deep networks using fewer network queries than non-model-based methods. Specifically, we consider the black-box setting, and show that the number of network queries is less impacted by making the task more challenging either through reducing the allowed $\ell^{\infty}$ perturbation energy or training the network with defences against adversarial misclassification. We illustrate this by contrasting the BOBYQA algorithm with the state-of-the-art model-free adversarial targeted misclassification approaches based on genetic, combinatorial, and direct-search algorithms. We observe that for high $\ell^{\infty}$ energy perturbations on networks, the aforementioned simpler model-free methods require the fewest queries. In contrast, the proposed BOBYQA-based method achieves state-of-the-art results when the perturbation energy decreases, or if the network is trained against adversarial perturbations.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10349v1
PDF https://arxiv.org/pdf/2002.10349v1.pdf
PWC https://paperswithcode.com/paper/a-model-based-derivative-free-approach-to
Repo
Framework

#### RobustPeriod: Time-Frequency Mining for Robust Multiple Periodicities Detection

Title RobustPeriod: Time-Frequency Mining for Robust Multiple Periodicities Detection
Authors Qingsong Wen, Kai He, Liang Sun, Yingying Zhang, Min Ke, Huan Xu
Abstract Periodicity detection is an important task in time series analysis as it plays a crucial role in many time series tasks such as classification, clustering, compression, anomaly detection, and forecasting. It is challenging for the following reasons: 1, complicated non-stationary time series; 2, dynamic and complicated periodic patterns, including multiple interlaced periodic components; 3, outliers and noise. In this paper, we propose a robust periodicity detection algorithm to address these challenges. Our algorithm applies the maximal overlap discrete wavelet transform to transform the time series into multiple temporal-frequency scales such that different periodicities can be isolated. We rank the scales by wavelet variance, and then at each scale we propose the Huber-periodogram, formulating the periodogram as the solution to an M-estimator to introduce robustness. We rigorously prove the theoretical properties of the Huber-periodogram and justify the use of Fisher’s test on the Huber-periodogram for periodicity detection. To further refine the detected periods, we compute the unbiased autocorrelation function from the Huber-periodogram, based on the Wiener-Khinchin theorem, for improved robustness and efficiency. Experiments on synthetic and real-world datasets show that our algorithm outperforms other popular ones for both single and multiple periodicity detection. It is now implemented and provided as a public online service at Alibaba Group and has been used extensively in different business lines.
Tasks Anomaly Detection, Time Series, Time Series Analysis
Published 2020-02-21
URL https://arxiv.org/abs/2002.09535v1
PDF https://arxiv.org/pdf/2002.09535v1.pdf
PWC https://paperswithcode.com/paper/robustperiod-time-frequency-mining-for-robust
Repo
Framework
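
The refinement step in the abstract relies on the Wiener-Khinchin theorem: the autocorrelation function is the inverse Fourier transform of the power spectrum. Below is a minimal sketch of that relation using the ordinary periodogram, not the Huber-periodogram the paper proposes. The function name and the zero-padding choice are assumptions made for illustration.

```python
import numpy as np

def acf_via_periodogram(x):
    """Unbiased, normalised autocorrelation computed through the periodogram.

    Uses the plain (non-robust) periodogram; the Huber variant from the
    abstract is not implemented here.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    x = x - x.mean()

    # Zero-pad to 2n so the circular correlation equals the linear one.
    spectrum = np.fft.rfft(x, n=2 * n)
    power = np.abs(spectrum) ** 2

    # Wiener-Khinchin: inverse transform of the power spectrum.
    autocov = np.fft.irfft(power, n=2 * n)[:n]

    # Unbiased estimate: lag k is averaged over its n - k available terms.
    autocov = autocov / (n - np.arange(n))
    return autocov / autocov[0]
```

For a clean sinusoid with period 20, the largest autocorrelation value near that lag falls exactly at lag 20, so a peak search over the ACF can refine a coarse period candidate.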

#### Comparing View-Based and Map-Based Semantic Labelling in Real-Time SLAM

Title Comparing View-Based and Map-Based Semantic Labelling in Real-Time SLAM
Authors Zoe Landgraf, Fabian Falck, Michael Bloesch, Stefan Leutenegger, Andrew Davison
Abstract Generally capable Spatial AI systems must build persistent scene representations where geometric models are combined with meaningful semantic labels. The many approaches to labelling scenes can be divided into two clear groups: view-based, which estimate labels from the input view-wise data and then incrementally fuse them into the scene model as it is built; and map-based, which label the generated scene model. However, there has so far been no attempt to quantitatively compare view-based and map-based labelling. Here, we present an experimental framework and comparison which uses real-time height map fusion as an accessible platform for a fair comparison, opening up the route to further systematic research in this area.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10342v1
PDF https://arxiv.org/pdf/2002.10342v1.pdf
PWC https://paperswithcode.com/paper/comparing-view-based-and-map-based-semantic
Repo
Framework

#### Differentiable Reasoning over a Virtual Knowledge Base

Title Differentiable Reasoning over a Virtual Knowledge Base
Authors Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen
Abstract We consider the task of answering complex multi-hop questions using a corpus as a virtual knowledge base (KB). In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus. At each step the module uses a combination of sparse-matrix TFIDF indices and a maximum inner product search (MIPS) on a special index of contextual representations of the mentions. This module is differentiable, so the full system can be trained end-to-end using gradient based methods, starting from natural language inputs. We also describe a pretraining scheme for the contextual representation encoder by generating hard negative examples using existing knowledge bases. We show that DrKIT improves accuracy by 9 points on 3-hop questions in the MetaQA dataset, cutting the gap between text-based and KB-based state-of-the-art by 70%. On HotpotQA, DrKIT leads to a 10% improvement over a BERT-based re-ranking approach to retrieving the relevant passages required to answer a question. DrKIT is also very efficient, processing 10-100x more queries per second than existing multi-hop systems.
Published 2020-02-25
URL https://arxiv.org/abs/2002.10640v1
PDF https://arxiv.org/pdf/2002.10640v1.pdf
PWC https://paperswithcode.com/paper/differentiable-reasoning-over-a-virtual-1
Repo
Framework

#### PCNN: Pattern-based Fine-Grained Regular Pruning towards Optimizing CNN Accelerators

Title PCNN: Pattern-based Fine-Grained Regular Pruning towards Optimizing CNN Accelerators
Authors Zhanhong Tan, Jiebo Song, Xiaolong Ma, Sia-Huat Tan, Hongyang Chen, Yuanqing Miao, Yifu Wu, Shaokai Ye, Yanzhi Wang, Dehui Li, Kaisheng Ma
Abstract Weight pruning is a powerful technique to realize model compression. We propose PCNN, a fine-grained regular 1D pruning method. A novel index format called Sparsity Pattern Mask (SPM) is presented to encode the sparsity in PCNN. Leveraging SPM with limited pruning patterns and non-zero sequences with equal length, PCNN can be efficiently employed in hardware. Evaluated on VGG-16 and ResNet-18, our PCNN achieves a compression rate of up to 8.4X with only 0.2% accuracy loss. We also implement a pattern-aware architecture in a 55nm process, achieving up to 9.0X speedup and 28.39 TOPS/W efficiency with only 3.1% on-chip memory overhead of indices.
Published 2020-02-11
URL https://arxiv.org/abs/2002.04997v1
PDF https://arxiv.org/pdf/2002.04997v1.pdf
PWC https://paperswithcode.com/paper/pcnn-pattern-based-fine-grained-regular
Repo
Framework

#### VarMixup: Exploiting the Latent Space for Robust Training and Inference

Title VarMixup: Exploiting the Latent Space for Robust Training and Inference
Authors Puneet Mangla, Vedant Singh, Shreyas Jayant Havaldar, Vineeth N Balasubramanian
Abstract The vulnerability of Deep Neural Networks (DNNs) to adversarial attacks has led to the development of many defense approaches. Among them, Adversarial Training (AT) is a popular and widely used approach for training adversarially robust models. Mixup Training (MT), a recent popular training algorithm, improves the generalization performance of models by introducing globally linear behavior in between training examples. Although still in its early phase, we observe a shift in the trend of exploiting Mixup from the perspective of generalisation to that of adversarial robustness. It has been shown that Mixup-trained models improve the robustness of models, but only passively. A recent approach, Mixup Inference (MI), proposes an inference principle for Mixup-trained models to counter adversarial examples at inference time by mixing the input with other random clean samples. In this work, we propose a new approach, VarMixup (Variational Mixup), to better sample mixup images by using the latent manifold underlying the data. Our experiments on CIFAR-10, CIFAR-100, SVHN and Tiny-Imagenet demonstrate that VarMixup beats state-of-the-art AT techniques without training the model adversarially. Additionally, we conduct ablations that show that models trained on VarMixup samples are also robust to various input corruptions/perturbations, have low calibration error and are transferable.
Published 2020-03-14
URL https://arxiv.org/abs/2003.06566v1
PDF https://arxiv.org/pdf/2003.06566v1.pdf
PWC https://paperswithcode.com/paper/varmixup-exploiting-the-latent-space-for
Repo
Framework
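
For context, standard Mixup builds each training example as a convex combination of two inputs and their labels, with the mixing coefficient drawn from a Beta distribution. The sketch below shows that plain input-space version only. VarMixup, per the abstract, samples mixup images using the latent manifold instead, and that part is not reproduced here. The function name and default `alpha` are illustrative.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Plain input-space Mixup on one batch (not the VarMixup variant)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y_onehot = np.asarray(y_onehot, dtype=float)

    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Pair every example with a randomly chosen partner from the same batch.
    perm = rng.permutation(len(x))

    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

Because the labels are mixed with the same coefficient as the inputs, each mixed label row still sums to one and can be used directly with a cross-entropy loss on soft targets.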

#### Small, Accurate, and Fast Vehicle Re-ID on the Edge: the SAFR Approach

Title Small, Accurate, and Fast Vehicle Re-ID on the Edge: the SAFR Approach
Authors Abhijit Suprem, Calton Pu, Joao Eduardo Ferreira
Abstract We propose a Small, Accurate, and Fast Re-ID (SAFR) design for flexible vehicle re-id under a variety of compute environments such as cloud, mobile, edge, or embedded devices by only changing the re-id model backbone. Through best-fit design choices, feature extraction, training tricks, global attention, and local attention, we create a re-id model design that optimizes multi-dimensionally along model size, speed, and accuracy for deployment under various memory and compute constraints. We present several variations of our flexible SAFR model: SAFR-Large for cloud-type environments with large compute resources, SAFR-Small for mobile devices with some compute constraints, and SAFR-Micro for edge devices with severe memory and compute constraints. SAFR-Large delivers state-of-the-art results with mAP 81.34 on the VeRi-776 vehicle re-id dataset (15% better than related work). SAFR-Small trades a 5.2% drop in performance (mAP 77.14 on VeRi-776) for over 60% model compression and 150% speedup. SAFR-Micro, at only 6MB and 130MFLOPS, trades a 6.8% drop in accuracy (mAP 75.80 on VeRi-776) for 95% compression and 33x speedup compared to SAFR-Large.
Published 2020-01-24
URL https://arxiv.org/abs/2001.08895v1
PDF https://arxiv.org/pdf/2001.08895v1.pdf
PWC https://paperswithcode.com/paper/small-accurate-and-fast-vehicle-re-id-on-the
Repo
Framework
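
The quoted percentage drops appear to be relative to the SAFR-Large mAP rather than absolute differences in mAP points. A short check of that reading, using only the three mAP values from the abstract (the helper name is illustrative):

```python
def relative_drop(baseline, value):
    """Percentage drop of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline

# mAP values on VeRi-776 as quoted in the abstract.
safr_large, safr_small, safr_micro = 81.34, 77.14, 75.80

small_drop = relative_drop(safr_large, safr_small)
micro_drop = relative_drop(safr_large, safr_micro)
print(f"SAFR-Small: {small_drop:.1f}%  SAFR-Micro: {micro_drop:.1f}%")
```

Rounded to one decimal, the two results match the 5.2% and 6.8% figures in the abstract. The absolute gaps are 4.20 and 5.54 mAP points respectively.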

#### A Fixation-based 360° Benchmark Dataset for Salient Object Detection

Title A Fixation-based 360° Benchmark Dataset for Salient Object Detection
Authors Yi Zhang, Lu Zhang, Wassim Hamidouche, Olivier Deforges
Abstract Fixation prediction (FP) in panoramic contents has been widely investigated along with the booming trend of virtual reality (VR) applications. However, another issue within the field of visual saliency, salient object detection (SOD), has been seldom explored in 360° (or omnidirectional) images due to the lack of datasets representative of real scenes with pixel-level annotations. Toward this end, we collect 107 equirectangular panoramas with challenging scenes and multiple object classes. Based on the consistency between FP and explicit saliency judgements, we further manually annotate 1,165 salient objects over the collected images with precise masks under the guidance of real human eye fixation maps. Six state-of-the-art SOD models are then benchmarked on the proposed fixation-based 360° image dataset (F-360iSOD), by applying a multiple cubic projection-based fine-tuning method. Experimental results show a limitation of the current methods when used for SOD in panoramic images, which indicates the proposed dataset is challenging. Key issues for 360° SOD are also discussed. The proposed dataset is available at https://github.com/Panorama-Bill/F-360iSOD.
Tasks Object Detection, Salient Object Detection
Published 2020-01-22
URL https://arxiv.org/abs/2001.07960v1
PDF https://arxiv.org/pdf/2001.07960v1.pdf
PWC https://paperswithcode.com/paper/a-fixation-based-360-benchmark-dataset-for
Repo
Framework

#### LiDAR guided Small obstacle Segmentation

Title LiDAR guided Small obstacle Segmentation
Abstract Detecting small obstacles on the road is critical for autonomous driving. In this paper, we present a method to reliably detect such obstacles through a multi-modal framework of sparse LiDAR (VLP-16) and monocular vision. LiDAR is employed to provide additional context in the form of confidence maps to monocular segmentation networks. We show significant performance gains when the context is fed as an additional input to monocular semantic segmentation frameworks. We further present a new semantic segmentation dataset to the community, comprising over 3000 image frames with corresponding LiDAR observations. The images come with pixel-wise annotations of three classes: off-road, road, and small obstacle. We stress that precise calibration between LiDAR and camera is crucial for this task and thus propose a novel Hausdorff distance based calibration refinement method over extrinsic parameters. As a first benchmark over this dataset, we report our results with 73% instance detection up to a distance of 50 meters on challenging scenarios. Qualitatively, by showcasing accurate segmentation of obstacles less than 15 cm in size at 50 m depth, and quantitatively, through favourable comparisons vis-à-vis prior art, we vindicate the method’s efficacy. Our project page and dataset are hosted at https://small-obstacle-dataset.github.io/
Tasks Autonomous Driving, Calibration, Semantic Segmentation
Published 2020-03-12
URL https://arxiv.org/abs/2003.05970v1
PDF https://arxiv.org/pdf/2003.05970v1.pdf
PWC https://paperswithcode.com/paper/lidar-guided-small-obstacle-segmentation
Repo
Framework
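
The calibration refinement in the abstract is built on the Hausdorff distance, which measures how far two point sets are from each other in the worst case. Below is a minimal sketch of the symmetric Hausdorff distance between two small point sets. How the authors apply it to LiDAR and image features is not described in the abstract, so only the distance itself is shown. The brute-force pairwise computation is an assumption suited to small inputs.

```python
import numpy as np

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two point sets.

    a has shape (N, D) and b has shape (M, D). Brute force, O(N * M) memory.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # Directed distances: worst-case nearest-neighbour gap in each direction.
    a_to_b = d.min(axis=1).max()
    b_to_a = d.min(axis=0).max()
    return float(max(a_to_b, b_to_a))
```

A refinement loop could, for example, perturb the extrinsic parameters and keep the setting that lowers this distance between projected LiDAR points and image-derived points, though that pairing is a guess at the general idea and not the paper's procedure.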

#### A Hybrid Algorithm Based Robust Big Data Clustering for Solving Unhealthy Initialization, Dynamic Centroid Selection and Empty clustering Problems with Analysis

Title A Hybrid Algorithm Based Robust Big Data Clustering for Solving Unhealthy Initialization, Dynamic Centroid Selection and Empty clustering Problems with Analysis
Authors Y. A. Joarder, Mosabbir Ahmed
Abstract Big Data is a massive volume of both structured and unstructured data that is too large and too difficult to process using traditional techniques. Clustering algorithms have developed into a powerful learning tool that can accurately analyze the volume of data produced by modern applications. Clustering in data mining is the grouping of a particular set of objects based on their characteristics. The main aim of clustering is to classify data into clusters such that objects are grouped in the same cluster when they are similar in their features. Until now, K-MEANS has been the most widely used algorithm, applied in a wide range of areas to identify groups where within-cluster distances are much smaller than between-group distances. Our algorithm builds on K-MEANS to obtain high-quality clustering of big data. Our proposed algorithm, EG K-MEANS (Extended Generation K-MEANS), mainly solves three issues of K-MEANS: unhealthy initialization, dynamic centroid selection and empty clustering. It prevents the unhealthy initialization, dynamic centroid selection and empty clustering problems in order to obtain high-quality clustering.
Published 2020-02-21
URL https://arxiv.org/abs/2002.09380v1
PDF https://arxiv.org/pdf/2002.09380v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-algorithm-based-robust-big-data
Repo
Framework
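
The empty-cluster problem arises when no point is assigned to a centroid during an iteration, which leaves that centroid undefined. The sketch below shows one generic repair, assuming the number of points is at least `k`: an empty cluster takes over the point that lies farthest from its current centroid. This is a common textbook fix and not the EG K-MEANS procedure, which the abstract does not spell out. All names are illustrative.

```python
import numpy as np

def kmeans_no_empty(points, k, n_iter=50, seed=0):
    """Lloyd's k-means with a simple empty-cluster repair step.

    Assumes len(points) >= k. Returns (labels, centroids).
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    n = len(points)
    centroids = points[rng.choice(n, size=k, replace=False)].copy()

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Repair: give every empty cluster the worst-fitting point.
        for j in range(k):
            if np.any(labels == j):
                continue
            counts = np.bincount(labels, minlength=k)
            own = dists[np.arange(n), labels].copy()
            own[counts[labels] <= 1] = -1.0  # never empty a singleton cluster
            labels[int(own.argmax())] = j

        new_centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return labels, centroids
```

Skipping singleton clusters when choosing the donor point keeps the repair from creating a new empty cluster while fixing another one.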

#### AP-MTL: Attention Pruned Multi-task Learning Model for Real-time Instrument Detection and Segmentation in Robot-assisted Surgery

Title AP-MTL: Attention Pruned Multi-task Learning Model for Real-time Instrument Detection and Segmentation in Robot-assisted Surgery
Authors Mobarakol Islam, Vibashan VS, Hongliang Ren
Abstract Surgical scene understanding and multi-task learning are crucial for image-guided robotic surgery. Training a real-time robotic system for the detection and segmentation of high-resolution images is a challenging problem given limited computational resources. The perception drawn can be applied in effective real-time feedback, surgical skill assessment, and human-robot collaborative surgeries to enhance surgical outcomes. For this purpose, we develop a novel end-to-end trainable real-time Multi-Task Learning (MTL) model with a weight-shared encoder and task-aware detection and segmentation decoders. Optimization of multiple tasks at the same convergence point is vital and presents a complex problem. Thus, we propose an asynchronous task-aware optimization (ATO) technique to calculate task-oriented gradients and train the decoders independently. Moreover, MTL models are always computationally expensive, which hinders real-time applications. To address this challenge, we introduce global attention dynamic pruning (GADP), which removes less significant and sparse parameters. We further design a skip squeeze and excitation (SE) module, which suppresses weak features, excites significant features and performs dynamic spatial and channel-wise feature re-calibration. Validated on the robotic instrument segmentation dataset of the MICCAI endoscopic vision challenge, our model significantly outperforms state-of-the-art segmentation and detection models, including the best-performing models in the challenge.
Published 2020-03-10
URL https://arxiv.org/abs/2003.04769v1
PDF https://arxiv.org/pdf/2003.04769v1.pdf
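
The channel-wise re-calibration mentioned in the abstract follows the general squeeze-and-excitation idea: pool each channel to a single number, pass the result through a small bottleneck, and rescale the channels with the resulting gates. Below is a minimal sketch of a standard SE block in NumPy, assuming random, untrained weights. The paper's "skip" variant and its spatial re-calibration are not specified in the abstract and are not reproduced.

```python
import numpy as np

def squeeze_excite(features, w1, w2):
    """Standard squeeze-and-excitation gating on one feature map.

    features: (C, H, W); w1: (C // r, C); w2: (C, C // r) for a reduction
    ratio r. Returns (rescaled_features, gates).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))

    # Excite: bottleneck MLP with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))

    # Re-calibrate: scale every channel by its gate in (0, 1).
    return features * gates[:, None, None], gates
```

Because each gate lies strictly between 0 and 1, the block can only attenuate channels. Weak channels are suppressed most, which matches the "suppresses weak features" description at a high level.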