Paper Group ANR 151
Methods of Adaptive Signal Processing on Graphs Using Vertex-Time Autoregressive Models
Title | Methods of Adaptive Signal Processing on Graphs Using Vertex-Time Autoregressive Models |
Authors | Thiernithi Variddhisai, Danilo Mandic |
Abstract | The concept of a random process has been recently extended to graph signals, whereby random graph processes are a class of multivariate stochastic processes whose coefficients are matrices with a graph-topological structure. The system identification problem of a random graph process therefore revolves around determining its underlying topology, or mathematically, the graph shift operators (GSOs), i.e. an adjacency matrix or a Laplacian matrix. In the same work that introduced random graph processes, a batch optimization method to solve for the GSO was also proposed for the random graph process based on a causal vertex-time autoregressive model. To this end, the online version of this optimization problem was proposed via the framework of adaptive filtering. The modified stochastic gradient projection method was employed on the regularized least squares objective to create the filter. The recursion is divided into 3 regularized sub-problems to address issues like multi-convexity, sparsity, commutativity and bias. A discussion on convergence analysis is also included. Finally, experiments are conducted to illustrate the performance of the proposed algorithm, from traditional MSE measure to successful recovery rate regardless of correct values, all of which shed light on the potential, the limit and the possible research attempt of this work. |
Tasks | |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.05729v1 |
https://arxiv.org/pdf/2003.05729v1.pdf | |
PWC | https://paperswithcode.com/paper/methods-of-adaptive-signal-processing-on |
Repo | |
Framework | |
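For orientation, the causal vertex-time autoregressive model mentioned in the abstract is commonly written in the form below. This is a sketch of the usual causal graph process formulation, not a formula taken from this listing: the symbols $x[k]$ (graph signal at time $k$), $w[k]$ (noise), $A$ (graph shift operator), $M$ (model order) and $c_{ij}$ (scalar coefficients) are assumptions introduced here, and the paper's own notation may differ.

```latex
% Assumed notation: x[k] graph signal, w[k] noise, A the GSO,
% M the model order, c_{ij} scalar polynomial coefficients.
x[k] = w[k] + \sum_{i=1}^{M} P_i(A, c)\, x[k-i],
\qquad
P_i(A, c) = \sum_{j=0}^{i} c_{ij} A^{j}
```

Under this form each coefficient matrix $P_i(A, c)$ is a polynomial in $A$, which is what gives the coefficients their graph-topological structure and makes them commute with one another.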
Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts
Title | Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts |
Authors | Kalun Ho, Janis Keuper, Margret Keuper |
Abstract | Multiple Object Tracking (MOT) is a long-standing task in computer vision. Current approaches based on the tracking by detection paradigm either require some sort of domain knowledge or supervision to associate data correctly into tracks. In this work, we present an unsupervised multiple object tracking approach based on visual features and minimum cost lifted multicuts. Our method is based on straightforward spatio-temporal cues that can be extracted from neighboring frames in an image sequence without supervision. Clustering based on these cues enables us to learn the required appearance invariances for the tracking task at hand and train an autoencoder to generate suitable latent representation. Thus, the resulting latent representations can serve as robust appearance cues for tracking even over large temporal distances where no reliable spatio-temporal features could be extracted. We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking. |
Tasks | Multiple Object Tracking, Object Tracking |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01192v1 |
https://arxiv.org/pdf/2002.01192v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-multiple-person-tracking-using |
Repo | |
Framework | |
BS-NAS: Broadening-and-Shrinking One-Shot NAS with Searchable Numbers of Channels
Title | BS-NAS: Broadening-and-Shrinking One-Shot NAS with Searchable Numbers of Channels |
Authors | Zan Shen, Jiang Qian, Bojin Zhuang, Shaojun Wang, Jing Xiao |
Abstract | One-Shot methods have evolved into one of the most popular methods in Neural Architecture Search (NAS) due to weight sharing and single training of a supernet. However, existing methods generally suffer from two issues: predetermined number of channels in each layer which is suboptimal; and model averaging effects and poor ranking correlation caused by weight coupling and continuously expanding search space. To explicitly address these issues, in this paper, a Broadening-and-Shrinking One-Shot NAS (BS-NAS) framework is proposed, in which 'broadening' refers to broadening the search space with a spring block enabling search for numbers of channels during training of the supernet; while 'shrinking' refers to a novel shrinking strategy gradually turning off those underperforming operations. The above innovations broaden the search space for wider representation and then shrink it by gradually removing underperforming operations, followed by an evolutionary algorithm to efficiently search for the optimal architecture. Extensive experiments on ImageNet illustrate the effectiveness of the proposed BS-NAS as well as the state-of-the-art performance. |
Tasks | Neural Architecture Search |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09821v1 |
https://arxiv.org/pdf/2003.09821v1.pdf | |
PWC | https://paperswithcode.com/paper/bs-nas-broadening-and-shrinking-one-shot-nas |
Repo | |
Framework | |
Creating High Resolution Images with a Latent Adversarial Generator
Title | Creating High Resolution Images with a Latent Adversarial Generator |
Authors | David Berthelot, Peyman Milanfar, Ian Goodfellow |
Abstract | Generating realistic images is difficult, and many formulations for this task have been proposed recently. If we restrict the task to that of generating a particular class of images, however, the task becomes more tractable. That is to say, instead of generating an arbitrary image as a sample from the manifold of natural images, we propose to sample images from a particular “subspace” of natural images, directed by a low-resolution image from the same subspace. The problem we address, while close to the formulation of the single-image super-resolution problem, is in fact rather different. Single image super-resolution is the task of predicting the image closest to the ground truth from a relatively low resolution image. We propose to produce samples of high resolution images given extremely small inputs with a new method called Latent Adversarial Generator (LAG). In our generative sampling framework, we only use the input (possibly of very low-resolution) to direct what class of samples the network should produce. As such, the output of our algorithm is not a unique image that relates to the input, but rather a possible set of related images sampled from the manifold of natural images. Our method learns exclusively in the latent space of the adversary using perceptual loss – it does not have a pixel loss. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02365v1 |
https://arxiv.org/pdf/2003.02365v1.pdf | |
PWC | https://paperswithcode.com/paper/creating-high-resolution-images-with-a-latent |
Repo | |
Framework | |
Zero-shot and few-shot time series forecasting with ordinal regression recurrent neural networks
Title | Zero-shot and few-shot time series forecasting with ordinal regression recurrent neural networks |
Authors | Bernardo Pérez Orozco, Stephen J Roberts |
Abstract | Recurrent neural networks (RNNs) are state-of-the-art in several sequential learning tasks, but they often require considerable amounts of data to generalise well. For many time series forecasting (TSF) tasks, only a few dozen observations may be available at training time, which restricts use of this class of models. We propose a novel RNN-based model that directly addresses this problem by learning a shared feature embedding over the space of many quantised time series. We show how this enables our RNN framework to accurately and reliably forecast unseen time series, even when there is little to no training data available. |
Tasks | Time Series, Time Series Forecasting |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.12162v1 |
https://arxiv.org/pdf/2003.12162v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-and-few-shot-time-series |
Repo | |
Framework | |
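To make the phrase "quantised time series" concrete, here is a minimal sketch of one way to map a real-valued series onto a fixed set of ordinal bins. The function name `quantise`, the per-series min-max scaling and the uniform bin widths are all assumptions made for illustration; the abstract does not say which quantisation scheme the authors use.

```python
def quantise(series, n_bins):
    """Map each value to an ordinal bin index in 0..n_bins-1.

    Hypothetical scheme: per-series min-max scaling with uniform bins.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no ordinal information; use bin 0.
        return [0 for _ in series]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the top bin.
    return [min(int((x - lo) / width), n_bins - 1) for x in series]


small = quantise([0.0, 5.0, 10.0], n_bins=5)
large = quantise([0.0, 50.0, 100.0], n_bins=5)
```

Under this scheme, series with very different scales are mapped onto the same small alphabet of ordinal symbols, which is one plausible reason a single embedding could be shared across many series.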
3D Object Detection From LiDAR Data Using Distance Dependent Feature Extraction
Title | 3D Object Detection From LiDAR Data Using Distance Dependent Feature Extraction |
Authors | Guus Engels, Nerea Aranjuelo, Ignacio Arganda-Carreras, Marcos Nieto, Oihana Otaegui |
Abstract | This paper presents a new approach to 3D object detection that leverages the properties of the data obtained by a LiDAR sensor. State-of-the-art detectors use neural network architectures based on assumptions valid for camera images. However, point clouds obtained from LiDAR are fundamentally different. Most detectors use shared filter kernels to extract features which do not take into account the range dependent nature of the point cloud features. To show this, different detectors are trained on two splits of the KITTI dataset: close range (objects up to 25 meters from LiDAR) and long-range. Top view images are generated from point clouds as input for the networks. Combined results outperform the baseline network trained on the full dataset with a single backbone. Additional research compares the effect of using different input features when converting the point cloud to image. The results indicate that the network focuses on the shape and structure of the objects, rather than exact values of the input. This work proposes an improvement for 3D object detectors by taking into account the properties of LiDAR point clouds over distance. Results show that training separate networks for close-range and long-range objects boosts performance for all KITTI benchmark difficulties. |
Tasks | 3D Object Detection, Object Detection |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00888v2 |
https://arxiv.org/pdf/2003.00888v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-object-detection-from-lidar-data-using |
Repo | |
Framework | |
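The close-range and long-range split described in the abstract can be sketched as a simple partition of labelled objects by distance from the sensor. This is only an illustration: the function name `split_by_range`, the dictionary keys `x` and `y`, and the use of horizontal Euclidean distance with the sensor at the origin are assumptions, since the abstract gives only the 25 meter threshold.

```python
import math


def split_by_range(objects, threshold_m=25.0):
    """Partition objects into (close, far) by ground-plane distance.

    Each object is assumed to be a dict with "x" and "y" in meters,
    measured from a LiDAR placed at the origin (hypothetical layout).
    """
    close, far = [], []
    for obj in objects:
        dist = math.hypot(obj["x"], obj["y"])
        if dist <= threshold_m:
            close.append(obj)
        else:
            far.append(obj)
    return close, far


labels = [
    {"id": "car_a", "x": 3.0, "y": 4.0},     # 5 m away
    {"id": "car_b", "x": 30.0, "y": 0.0},    # 30 m away
    {"id": "ped_c", "x": 10.0, "y": 20.0},   # about 22.4 m away
]
close_objs, far_objs = split_by_range(labels)
```

Each subset would then be used to train its own detector, with the two sets of predictions combined at inference time.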
Validation Set Evaluation can be Wrong: An Evaluator-Generator Approach for Maximizing Online Performance of Ranking in E-commerce
Title | Validation Set Evaluation can be Wrong: An Evaluator-Generator Approach for Maximizing Online Performance of Ranking in E-commerce |
Authors | Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Wen-Ji Zhou, Qing Da, An-Xiang Zeng, Yang Yu |
Abstract | Learning-to-rank (LTR) has become a key technology in E-commerce applications. Previous LTR approaches followed the supervised learning paradigm so that learned models should match the labeled data point-wisely or pair-wisely. However, we have noticed that global context information, including the total order of items in the displayed webpage, can play an important role in interactions with the customers. Therefore, to approach the best global ordering, the exploration in a large combinatorial space of items is necessary, which requires evaluating orders that may not appear in the labeled data. In this scenario, we first show that the classical data-based metrics can be inconsistent with online performance, or even misleading. We then propose to learn an evaluator and search the best model guided by the evaluator, which forms the evaluator-generator framework for training the group-wise LTR model. The evaluator is learned from the labeled data, and is enhanced by incorporating the order context information. The generator is trained with the supervision of the evaluator by reinforcement learning to generate the best order in the combinatorial space. Our experiments in one of the world’s largest retail platforms disclose that the learned evaluator is a much better indicator than classical data-based metrics. Moreover, our LTR model achieves a significant improvement (>2%) from the current industrial-level pair-wise models in terms of both Conversion Rate (CR) and Gross Merchandise Volume (GMV) in online A/B tests. |
Tasks | Learning-To-Rank |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11941v2 |
https://arxiv.org/pdf/2003.11941v2.pdf | |
PWC | https://paperswithcode.com/paper/beyond-the-ground-truth-an-evaluator |
Repo | |
Framework | |
Machine Learning based Anomaly Detection for 5G Networks
Title | Machine Learning based Anomaly Detection for 5G Networks |
Authors | Jordan Lam, Robert Abbas |
Abstract | Protecting the networks of tomorrow is set to be a challenging domain due to increasing cyber security threats and widening attack surfaces created by the Internet of Things (IoT), increased network heterogeneity, increased use of virtualisation technologies and distributed architectures. This paper proposes SDS (Software Defined Security) as a means to provide an automated, flexible and scalable network defence system. SDS will harness current advances in machine learning to design a CNN (Convolutional Neural Network) using NAS (Neural Architecture Search) to detect anomalous network traffic. SDS can be applied to an intrusion detection system to create a more proactive and end-to-end defence for a 5G network. To test this assumption, normal and anomalous network flows from a simulated environment have been collected and analyzed with a CNN. The results from this method are promising as the model has identified benign traffic with a 100% accuracy rate and anomalous traffic with a 96.4% detection rate. This demonstrates the effectiveness of network flow analysis for a variety of common malicious attacks and also provides a viable option for detection of encrypted malicious network traffic. |
Tasks | Anomaly Detection, Intrusion Detection, Neural Architecture Search |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03474v1 |
https://arxiv.org/pdf/2003.03474v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-anomaly-detection-for |
Repo | |
Framework | |
Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks
Title | Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks |
Authors | Feibo Jiang, Kezhi Wang, Li Dong, Cunhua Pan, Kun Yang |
Abstract | An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the mobile users, by optimizing offloading decision, transmission power, and resource allocation in the mobile edge computing (MEC) system. Towards this end, a deep reinforcement learning (DRL) method is proposed to obtain an online resource scheduling policy. Firstly, a related and regularized stacked auto encoder (2r-SAE) with unsupervised learning is proposed to perform data compression and representation for high dimensional channel quality information (CQI) data, which can reduce the state space for DRL. Secondly, we present an adaptive simulated annealing based approach (ASA) as the action search method of DRL, in which an adaptive h-mutation is used to guide the search direction and an adaptive iteration is proposed to enhance the search efficiency during the DRL process. Thirdly, a preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL to train the policy network and find the optimal offloading policy. Numerical results are provided to demonstrate that the proposed algorithm can achieve near-optimal performance while significantly decreasing the computational time compared with existing benchmarks. It also shows that the proposed framework is suitable for resource scheduling problem in large-scale MEC networks, especially in the dynamic environment. |
Tasks | |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.09223v1 |
https://arxiv.org/pdf/2001.09223v1.pdf | |
PWC | https://paperswithcode.com/paper/stacked-auto-encoder-based-deep-reinforcement |
Repo | |
Framework | |
Sampled Training and Node Inheritance for Fast Evolutionary Neural Architecture Search
Title | Sampled Training and Node Inheritance for Fast Evolutionary Neural Architecture Search |
Authors | Haoyu Zhang, Yaochu Jin, Ran Cheng, Kuangrong Hao |
Abstract | The performance of a deep neural network is heavily dependent on its architecture and various neural architecture search strategies have been developed for automated network architecture design. Recently, evolutionary neural architecture search (ENAS) has received increasing attention due to the attractive global optimization capability of evolutionary algorithms. However, ENAS suffers from extremely high computation costs because a large number of performance evaluations is usually required in evolutionary optimization and training deep neural networks is itself computationally very intensive. To address this issue, this paper proposes a new evolutionary framework for fast ENAS based on directed acyclic graph, in which parents are randomly sampled and trained on each mini-batch of training data. In addition, a node inheritance strategy is adopted to generate offspring individuals and their fitness is directly evaluated without training. To enhance the feature processing capability of the evolved neural networks, we also encode a channel attention mechanism in the search space. We evaluate the proposed algorithm on the widely used datasets, in comparison with 26 state-of-the-art peer algorithms. Our experimental results show the proposed algorithm is not only computationally much more efficient, but also highly competitive in learning performance. |
Tasks | Neural Architecture Search |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.11613v1 |
https://arxiv.org/pdf/2003.11613v1.pdf | |
PWC | https://paperswithcode.com/paper/sampled-training-and-node-inheritance-for |
Repo | |
Framework | |
MCENET: Multi-Context Encoder Network for Homogeneous Agent Trajectory Prediction in Mixed Traffic
Title | MCENET: Multi-Context Encoder Network for Homogeneous Agent Trajectory Prediction in Mixed Traffic |
Authors | Hao Cheng, Wentong Liao, Michael Ying Yang, Monika Sester, Bodo Rosenhahn |
Abstract | Trajectory prediction in urban mixed-traffic zones (a.k.a. shared spaces) is critical for many intelligent transportation systems, such as intent detection for autonomous driving. However, there are many challenges to predict the trajectories of heterogeneous road agents (pedestrians, cyclists and vehicles) at a microscopical level. For example, an agent might be able to choose multiple plausible paths in complex interactions with other agents in varying environments. To this end, we propose an approach named Multi-Context Encoder Network (MCENET) that is trained by encoding both past and future scene context, interaction context and motion information to capture the patterns and variations of the future trajectories using a set of stochastic latent variables. At inference time, we combine the past context and motion information of the target agent with samplings of the latent variables to predict multiple realistic trajectories in the future. Through experiments on several datasets of varying scenes, our method outperforms some of the recent state-of-the-art methods for mixed traffic trajectory prediction by a large margin and is more robust in a very challenging environment. The impact of each context is justified via ablation studies. |
Tasks | Autonomous Driving, Intent Detection, Trajectory Prediction |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.05966v3 |
https://arxiv.org/pdf/2002.05966v3.pdf | |
PWC | https://paperswithcode.com/paper/context-conditional-variational-autoencoder |
Repo | |
Framework | |
Monocular 3D Object Detection in Cylindrical Images from Fisheye Cameras
Title | Monocular 3D Object Detection in Cylindrical Images from Fisheye Cameras |
Authors | Elad Plaut, Erez Ben Yaacov, Bat El Shlomo |
Abstract | Detecting objects in 3D from a monocular camera has been successfully demonstrated using various methods based on convolutional neural networks. These methods have been demonstrated on rectilinear perspective images equivalent to being taken by a pinhole camera, whose geometry is explicitly or implicitly exploited. Such methods fail in images with alternative projections, such as those acquired by fisheye cameras, even when provided with a labeled training set of fisheye images and 3D bounding boxes. In this work, we show how to adapt existing 3D object detection methods to images from fisheye cameras, including in the case that no labeled fisheye data is available for training. We significantly outperform existing art on a benchmark of synthetic data, and we also experiment with an internal dataset of real fisheye images. |
Tasks | 3D Object Detection, Object Detection |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.03759v1 |
https://arxiv.org/pdf/2003.03759v1.pdf | |
PWC | https://paperswithcode.com/paper/monocular-3d-object-detection-in-cylindrical |
Repo | |
Framework | |
A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation
Title | A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation |
Authors | Yihua Cheng, Shiyao Huang, Fei Wang, Chen Qian, Feng Lu |
Abstract | Human gaze is essential for various appealing applications. Aiming at more accurate gaze estimation, a series of recent works propose to utilize face and eye images simultaneously. Nevertheless, face and eye images only serve as independent or parallel feature sources in those works, and the intrinsic correlation between their features is overlooked. In this paper we make the following contributions: 1) We propose a coarse-to-fine strategy which estimates a basic gaze direction from face image and refines it with corresponding residual predicted from eye images. 2) Guided by the proposed strategy, we design a framework which introduces a bi-gram model to bridge gaze residual and basic gaze direction, and an attention component to adaptively acquire suitable fine-grained feature. 3) Integrating the above innovations, we construct a coarse-to-fine adaptive network named CA-Net and achieve state-of-the-art performances on MPIIGaze and EyeDiap. |
Tasks | Gaze Estimation |
Published | 2020-01-01 |
URL | https://arxiv.org/abs/2001.00187v1 |
https://arxiv.org/pdf/2001.00187v1.pdf | |
PWC | https://paperswithcode.com/paper/a-coarse-to-fine-adaptive-network-for |
Repo | |
Framework | |
Loop estimator for discounted values in Markov reward processes
Title | Loop estimator for discounted values in Markov reward processes |
Authors | Falcon Z. Dai, Matthew R. Walter |
Abstract | At the working heart of policy iteration algorithms commonly used and studied in the discounted setting of reinforcement learning, the policy evaluation step estimates the value of a state with samples from a Markov reward process induced by following a Markov policy in a Markov decision process. We propose a simple and efficient estimator called the loop estimator that exploits the regenerative structure of Markov reward processes without explicitly estimating a full model. Our method enjoys a space complexity of $O(1)$ when estimating the value of a single positive recurrent state $s$ unlike TD (with $O(S)$) or model-based methods (with $O(S^2)$). Moreover, the regenerative structure enables us to show, without relying on the generative model approach, that the estimator has an instance-dependent convergence rate of $\widetilde{O}(\sqrt{\tau_s/T})$ over steps $T$ on a single sample path, where $\tau_s$ is the maximal expected hitting time to state $s$. In preliminary numerical experiments, the loop estimator outperforms model-free methods, such as TD(k), and is competitive with the model-based estimator. |
Tasks | |
Published | 2020-02-15 |
URL | https://arxiv.org/abs/2002.06299v1 |
https://arxiv.org/pdf/2002.06299v1.pdf | |
PWC | https://paperswithcode.com/paper/loop-estimator-for-discounted-values-in |
Repo | |
Framework | |
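The regenerative idea behind the abstract can be sketched as follows. Each time the process returns to state $s$ it starts afresh, so the value satisfies $V(s) = E[G] + E[\gamma^{L}]\,V(s)$, where $G$ is the discounted reward collected during one loop from $s$ back to $s$ and $L$ is the loop length; solving gives $V(s) = E[G] / (1 - E[\gamma^{L}])$. The code below replaces both expectations with sample means over the loops seen on one path. It is a sketch of that identity under assumed conventions (the reward at time $t$ is received in the state at time $t$, and $\gamma < 1$), and the name `loop_estimate` is hypothetical; it is not necessarily the authors' exact estimator.

```python
def loop_estimate(states, rewards, s, gamma):
    """Estimate the discounted value of state s from one sample path.

    Keeps only a few running scalars, so memory use does not grow
    with the number of states or the length of the path.
    """
    n_loops = 0          # completed loops from s back to s
    sum_return = 0.0     # sum of per-loop discounted rewards
    sum_discount = 0.0   # sum of gamma ** loop_length over loops
    in_loop = False
    g, disc = 0.0, 1.0   # running return and discount inside the current loop

    for x, r in zip(states, rewards):
        if x == s:
            if in_loop:
                # The previous loop just closed; record its statistics.
                n_loops += 1
                sum_return += g
                sum_discount += disc
            in_loop = True
            g, disc = 0.0, 1.0
        if in_loop:
            g += disc * r
            disc *= gamma

    if n_loops == 0:
        raise ValueError("the path must return to s at least once")
    return (sum_return / n_loops) / (1.0 - sum_discount / n_loops)


# Deterministic two-state cycle 0 -> 1 -> 0 -> ..., reward 1 in state 0.
path_states = [0, 1, 0, 1, 0]
path_rewards = [1.0, 0.0, 1.0, 0.0, 1.0]
v0 = loop_estimate(path_states, path_rewards, s=0, gamma=0.5)
```

In the deterministic cycle used here, the exact value of state 0 solves $V = 1 + 0.25\,V$, that is $V = 4/3$, which the estimate reproduces because every loop is identical.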
Exploiting Ergonomic Priors in Human-to-Robot Task Transfer
Title | Exploiting Ergonomic Priors in Human-to-Robot Task Transfer |
Authors | Jeevan Manavalan, Prabhakar Ray, Matthew Howard |
Abstract | In recent years, there has been a booming shift in the development of versatile, autonomous robots by introducing means to intuitively teach robots task-oriented behaviour by demonstration. In this paper, a method based on programming by demonstration is proposed to learn null space policies from constrained motion data. The main advantage to using this is generalisation of a task by retargeting a system's redundancy as well as the capability to fully replace an entire system with another of varying link number and lengths while still accurately repeating a task subject to the same constraints. The effectiveness of the method has been demonstrated in a 3-link simulation and a real world experiment using a human subject as the demonstrator and is verified through task reproduction on a 7DoF physical robot. In simulation, the method works accurately with as few as five data points, producing errors less than $10^{-14}$. The approach is shown to outperform the current state-of-the-art approach in a simulated 3DoF robot manipulator control problem where motions are reproduced using learnt constraints. Retargeting of a system's null space component is also demonstrated in a task where controlling how redundancy is resolved allows for obstacle avoidance. Finally, the approach is verified in a real world experiment using demonstrations from a human subject where the learnt task space trajectory is transferred onto a 7DoF physical robot of a different embodiment. |
Tasks | |
Published | 2020-03-01 |
URL | https://arxiv.org/abs/2003.00544v1 |
https://arxiv.org/pdf/2003.00544v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-ergonomic-priors-in-human-to-robot |
Repo | |
Framework | |
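As background for the term "null space policies", constrained motion is often modelled with the decomposition below. This is the standard formulation from the constrained-motion literature rather than a formula quoted from this paper, and the symbols $u$ (observed action), $A$ (constraint matrix), $b$ (task-space target), $\pi$ (null space policy) and $N$ (null space projection) are assumed notation.

```latex
% Assumed notation: u action, A constraint matrix, b task-space target,
% \pi null space policy, A^{\dagger} the Moore-Penrose pseudoinverse.
u = A^{\dagger} b + N \pi,
\qquad
N = I - A^{\dagger} A
```

The first term satisfies the task constraint, while the second acts only in directions the constraint leaves free, which is where redundancy can be retargeted, for example for obstacle avoidance.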