April 1, 2020

Paper Group ANR 446

Fast Lidar Clustering by Density and Connectivity. Globally Optimal Contrast Maximisation for Event-based Motion Estimation. Self-Adversarial Learning with Comparative Discrimination for Text Generation. Active Lighting Recurrence by Parallel Lighting Analogy for Fine-Grained Change Detection. Self-Supervised Object-Level Deep Reinforcement Learnin …

Fast Lidar Clustering by Density and Connectivity


Title	Fast Lidar Clustering by Density and Connectivity
Authors	Frederik Hasecke, Lukas Hahn, Anton Kummert
Abstract	Lidar sensors are widely used in various applications, ranging from scientific fields over industrial use to integration in consumer products. With an ever growing number of different driver assistance systems, they have been introduced to automotive series production in recent years and are considered an important building block for the practical realisation of autonomous driving. However, due to the potentially large amount of Lidar points per scan, tailored algorithms are required to identify objects (e.g. pedestrians or vehicles) with high precision in a very short time. In this work, we propose an algorithmic approach for real-time instance segmentation of Lidar sensor data. We show how our method leverages the properties of the Euclidean distance to retain three-dimensional measurement information, while being narrowed down to a two-dimensional representation for fast computation. We further introduce what we call skip connections, to make our approach robust against over-segmentation and improve assignment in cases of partial occlusion. Through detailed evaluation on public data and comparison with established methods, we show how these aspects enable state-of-the-art performance and runtime on a single CPU core.
Tasks	Autonomous Driving, Instance Segmentation, Real-time Instance Segmentation, Semantic Segmentation
Published	2020-03-01
URL	https://arxiv.org/abs/2003.00575v1
PDF	https://arxiv.org/pdf/2003.00575v1.pdf
PWC	https://paperswithcode.com/paper/fast-lidar-clustering-by-density-and
Repo
Framework
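
The abstract describes clustering on a two-dimensional representation while keeping three-dimensional Euclidean distances. A minimal sketch of that idea, not the authors' implementation: neighbouring range-image pixels are joined when the law-of-cosines distance between their returns falls below a threshold. The function names, the `skip` parameter (a loose stand-in for the paper's skip connections), and the absence of column wrap-around are all assumptions made for illustration.

```python
import math
from collections import deque


def euclid_from_ranges(r1, r2, angle):
    """Law of cosines: distance between two returns separated by `angle` radians."""
    return math.sqrt(max(0.0, r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * math.cos(angle)))


def cluster_range_image(ranges, h_res, v_res, threshold, skip=1):
    """Label connected pixels of a range image (None marks a missing return).

    h_res / v_res are the angular steps between columns / rows in radians.
    skip > 1 also tests pixels up to `skip` steps away (hypothetical parameter).
    """
    rows, cols = len(ranges), len(ranges[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if ranges[r0][c0] is None or labels[r0][c0]:
                continue
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:  # breadth-first growth of the current cluster
                r, c = queue.popleft()
                for step in range(1, skip + 1):
                    for dr, dc in ((0, step), (0, -step), (step, 0), (-step, 0)):
                        nr, nc = r + dr, c + dc
                        if not (0 <= nr < rows and 0 <= nc < cols):
                            continue
                        if ranges[nr][nc] is None or labels[nr][nc]:
                            continue
                        angle = abs(dc) * h_res + abs(dr) * v_res
                        dist = euclid_from_ranges(ranges[r][c], ranges[nr][nc], angle)
                        if dist < threshold:
                            labels[nr][nc] = next_label
                            queue.append((nr, nc))
    return labels
```

With `skip=1` a single far return between two near ones splits them into separate clusters; with `skip=2` the near returns can still be joined across the gap, which is the over-segmentation case the abstract mentions.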

Globally Optimal Contrast Maximisation for Event-based Motion Estimation


Title	Globally Optimal Contrast Maximisation for Event-based Motion Estimation
Authors	Daqi Liu, Álvaro Parra, Tat-Jun Chin
Abstract	Contrast maximisation estimates the motion captured in an event stream by maximising the sharpness of the motion compensated event image. To carry out contrast maximisation, many previous works employ iterative optimisation algorithms, such as conjugate gradient, which require good initialisation to avoid converging to bad local minima. To alleviate this weakness, we propose a new globally optimal event-based motion estimation algorithm. Based on branch-and-bound (BnB), our method solves rotational (3DoF) motion estimation on event streams, which supports practical applications such as video stabilisation and attitude estimation. Underpinning our method are novel bounding functions for contrast maximisation, whose theoretical validity is rigorously established. We show concrete examples from public datasets where globally optimal solutions are vital to the success of contrast maximisation. Despite its exact nature, our algorithm is currently able to process a 50,000-event input in 300 seconds (a locally optimal solver takes 30 seconds on the same input), and has the potential to be further sped up using GPUs.
Tasks	Motion Estimation
Published	2020-02-25
URL	https://arxiv.org/abs/2002.10686v3
PDF	https://arxiv.org/pdf/2002.10686v3.pdf
PWC	https://paperswithcode.com/paper/globally-optimal-contrast-maximisation-for
Repo
Framework
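
To make the contrast objective in the abstract concrete, here is a small sketch under simplifying assumptions: it uses a 2D translational velocity instead of the paper's 3DoF rotation, measures sharpness as the variance of the warped event image, and picks the best motion by exhaustive search over a candidate list rather than branch-and-bound. The event format `(x, y, t)` and both function names are hypothetical.

```python
def contrast(events, velocity, width, height):
    """Variance of the event image after warping events back to t = 0."""
    vx, vy = velocity
    image = [[0] * width for _ in range(height)]
    for x, y, t in events:
        u = round(x - vx * t)  # motion-compensated column
        v = round(y - vy * t)  # motion-compensated row
        if 0 <= u < width and 0 <= v < height:
            image[v][u] += 1
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def best_velocity(events, candidates, width, height):
    """Exhaustive stand-in for a global solver: highest-contrast candidate wins."""
    return max(candidates, key=lambda vel: contrast(events, vel, width, height))
```

When the candidate velocity matches the true motion, all events from one scene point pile up in a single pixel, so the variance of the image peaks there.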

Self-Adversarial Learning with Comparative Discrimination for Text Generation


Title	Self-Adversarial Learning with Comparative Discrimination for Text Generation
Authors	Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou
Abstract	Conventional Generative Adversarial Networks (GANs) for text generation tend to have issues of reward sparsity and mode collapse that affect the quality and diversity of generated samples. To address the issues, we propose a novel self-adversarial learning (SAL) paradigm for improving GANs’ performance in text generation. In contrast to standard GANs that use a binary classifier as its discriminator to predict whether a sample is real or generated, SAL employs a comparative discriminator which is a pairwise classifier for comparing the text quality between a pair of samples. During training, SAL rewards the generator when its currently generated sentence is found to be better than its previously generated samples. This self-improvement reward mechanism allows the model to receive credits more easily and avoid collapsing towards the limited number of real samples, which not only helps alleviate the reward sparsity issue but also reduces the risk of mode collapse. Experiments on text generation benchmark datasets show that our proposed approach substantially improves both the quality and the diversity, and yields more stable performance compared to the previous GANs for text generation.
Tasks	Text Generation
Published	2020-01-31
URL	https://arxiv.org/abs/2001.11691v2
PDF	https://arxiv.org/pdf/2001.11691v2.pdf
PWC	https://paperswithcode.com/paper/self-adversarial-learning-with-comparative-1
Repo
Framework
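
The self-improvement reward described in the abstract can be sketched as follows. This is an illustration only: `compare(a, b)` is a hypothetical pairwise scorer returning the probability that `a` is better than `b`, and `toy_compare` is a crude stand-in for a learned comparative discriminator, not the model from the paper.

```python
def self_improvement_reward(current, previous_samples, compare):
    """Average margin by which `current` beats earlier samples, in [-1, 1]."""
    if not previous_samples:
        return 0.0
    margins = [2.0 * compare(current, old) - 1.0 for old in previous_samples]
    return sum(margins) / len(margins)


def toy_compare(a, b):
    """Hypothetical comparator: prefers the sentence with more distinct words."""
    qa, qb = len(set(a.split())), len(set(b.split()))
    if qa == qb:
        return 0.5  # indistinguishable
    return 1.0 if qa > qb else 0.0
```

Because the reward depends on the generator's own earlier outputs rather than on matching real samples, a modest improvement already earns a positive signal, which is the mechanism the abstract credits for easing reward sparsity.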

Active Lighting Recurrence by Parallel Lighting Analogy for Fine-Grained Change Detection


Title	Active Lighting Recurrence by Parallel Lighting Analogy for Fine-Grained Change Detection
Authors	Qian Zhang, Wei Feng, Liang Wan, Fei-Peng Tian, Xiaowei Wang, Ping Tan
Abstract	This paper studies a new problem, namely active lighting recurrence (ALR), which physically relocalizes a light source to reproduce the lighting condition of a single reference image of the same scene, which may have undergone fine-grained changes between the two observations. ALR is of great importance for fine-grained visual inspection and change detection, because some phenomena or minute changes can only be clearly observed under particular lighting conditions. Therefore, effective ALR should be able to navigate a light source online toward the target pose, which is challenging due to the complexity and diversity of real-world lighting and imaging processes. To this end, we propose to use simple parallel lighting as an analogy model and, based on the Lambertian law, to compose an instant navigation ball for this purpose. We theoretically prove the feasibility, i.e., equivalence and convergence, of this ALR approach for a realistic near point light source and a small near surface light source. We also theoretically prove the invariance of our ALR approach to the ambiguity of normal and lighting decomposition. The effectiveness and superiority of the proposed approach have been verified by both extensive quantitative experiments and challenging real-world tasks on fine-grained change detection of cultural heritage. We also validate the generality of our approach to non-Lambertian scenes.
Tasks
Published	2020-02-22
URL	https://arxiv.org/abs/2002.09663v1
PDF	https://arxiv.org/pdf/2002.09663v1.pdf
PWC	https://paperswithcode.com/paper/active-lighting-recurrence-by-parallel
Repo
Framework

Self-Supervised Object-Level Deep Reinforcement Learning


Title	Self-Supervised Object-Level Deep Reinforcement Learning
Authors	William Agnew, Pedro Domingos
Abstract	Current deep reinforcement learning approaches incorporate minimal prior knowledge about the environment, limiting computational and sample efficiency. We incorporate a few object-based priors that humans are known to use: “Infants divide perceptual arrays into units that move as connected wholes, that move separately from one another, that tend to maintain their size and shape over motion, and that tend to act upon each other only on contact” [Spelke]. We propose a probabilistic object-based model of environments and use human object priors to develop an efficient self-supervised algorithm for maximum likelihood estimation of the model parameters from observations and for inferring objects directly from the perceptual stream. We then use object features and incorporate object-contact priors to improve the sample efficiency of our object-based RL agent. We evaluate our approach on a subset of the Atari benchmarks, and learn up to four orders of magnitude faster than the standard deep Q-learning network, rendering rapid desktop experiments in this domain feasible. To our knowledge, our system is the first to learn any Atari task in fewer environment interactions than humans.
Tasks	Q-Learning
Published	2020-03-03
URL	https://arxiv.org/abs/2003.01384v1
PDF	https://arxiv.org/pdf/2003.01384v1.pdf
PWC	https://paperswithcode.com/paper/self-supervised-object-level-deep
Repo
Framework

NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs


Title	NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs
Authors	Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, Siddharth Garg
Abstract	This paper proposes a novel two-stage defense (NNoculation) against backdoored neural networks (BadNets) that, unlike existing defenses, makes minimal assumptions on the shape, size and location of backdoor triggers and BadNet’s functioning. In the pre-deployment stage, NNoculation retrains the network using “broad-spectrum” random perturbations of inputs drawn from a clean validation set to partially reduce the adversarial impact of a backdoor. In the post-deployment stage, NNoculation detects and quarantines backdoored test inputs by recording disagreements between the original and pre-deployment patched networks. A CycleGAN is then trained to learn transformations between clean validation inputs and quarantined inputs; i.e., it learns to add triggers to clean validation images. This transformed set of backdoored validation images along with their correct labels is used to further retrain the BadNet, yielding our final defense. NNoculation outperforms state-of-the-art defenses NeuralCleanse and Artificial Brain Simulation (ABS) that we show are ineffective when their restrictive assumptions are circumvented by the attacker.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08313v1
PDF	https://arxiv.org/pdf/2002.08313v1.pdf
PWC	https://paperswithcode.com/paper/nnoculation-broad-spectrum-and-targeted
Repo
Framework
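
The post-deployment step in the abstract, quarantining inputs on which the original and patched networks disagree, can be sketched in a few lines. The two toy classifiers and the trigger value below are invented for illustration and say nothing about the networks used in the paper.

```python
def quarantine_disagreements(inputs, original_model, patched_model):
    """Split inputs into those both models agree on and those they do not."""
    passed, quarantined = [], []
    for x in inputs:
        if original_model(x) == patched_model(x):
            passed.append(x)
        else:
            quarantined.append(x)
    return passed, quarantined


TRIGGER = 9  # hypothetical backdoor trigger: last feature equals 9


def backdoored(x):
    """Toy BadNet: outputs attacker label 7 whenever the trigger is present."""
    return 7 if x[-1] == TRIGGER else sum(x) % 2


def patched(x):
    """Toy patched network: ignores the trigger."""
    return sum(x) % 2
```

For example, `quarantine_disagreements([(1, 2), (3, 9), (4, 4)], backdoored, patched)` sets aside only the triggered input `(3, 9)`. In the abstract's pipeline, the quarantined set is what the CycleGAN is later trained on.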

Superaccurate Camera Calibration via Inverse Rendering


Title	Superaccurate Camera Calibration via Inverse Rendering
Authors	Morten Hannemose, Jakob Wilm, Jeppe Revall Frisvad
Abstract	The most prevalent routine for camera calibration is based on the detection of well-defined feature points on a purpose-made calibration artifact. These could be checkerboard saddle points, circles, rings or triangles, often printed on a planar structure. The feature points are first detected and then used in a nonlinear optimization to estimate the internal camera parameters. We propose a new method for camera calibration using the principle of inverse rendering. Instead of relying solely on detected feature points, we use an estimate of the internal parameters and the pose of the calibration object to implicitly render a non-photorealistic equivalent of the optical features. This enables us to compute pixel-wise differences in the image domain without interpolation artifacts. We can then improve our estimate of the internal parameters by minimizing pixel-wise least-squares differences. In this way, our model optimizes a meaningful metric in the image space assuming normally distributed noise characteristic for camera sensors. We demonstrate using synthetic and real camera images that our method improves the accuracy of estimated camera parameters as compared with current state-of-the-art calibration routines. Our method also estimates these parameters more robustly in the presence of noise and in situations where the number of calibration images is limited.
Tasks	Calibration
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09177v1
PDF	https://arxiv.org/pdf/2003.09177v1.pdf
PWC	https://paperswithcode.com/paper/superaccurate-camera-calibration-via-inverse
Repo
Framework

On the Convergence of Artificial Intelligence and Distributed Ledger Technology: A Scoping Review and Future Research Agenda


Title	On the Convergence of Artificial Intelligence and Distributed Ledger Technology: A Scoping Review and Future Research Agenda
Authors	Konstantin D. Pandl, Scott Thiebes, Manuel Schmidt-Kraepelin, Ali Sunyaev
Abstract	Developments in Artificial Intelligence (AI) and Distributed Ledger Technology (DLT) currently lead to lively debates in academia and practice. AI processes data to perform tasks that were previously thought possible only for humans. DLT has the potential to create consensus over data among a group of participants in uncertain environments. In recent research, both technologies are used in similar and even the same systems. Examples include the design of secure distributed ledgers or the creation of allied learning systems distributed across multiple nodes. This can lead to technological convergence, which in the past, has paved the way for major innovations in information technology. Previous work highlights several potential benefits of the convergence of AI and DLT but only provides a limited theoretical framework to describe upcoming real-world integration cases of both technologies. We aim to contribute by conducting a systematic literature review on previous work and providing rigorously derived future research opportunities. This work helps researchers active in AI or DLT to overcome current limitations in their field, and practitioners to develop systems along with the convergence of both technologies.
Tasks
Published	2020-01-29
URL	https://arxiv.org/abs/2001.11017v2
PDF	https://arxiv.org/pdf/2001.11017v2.pdf
PWC	https://paperswithcode.com/paper/on-the-convergence-of-artificial-intelligence
Repo
Framework

Who2com: Collaborative Perception via Learnable Handshake Communication


Title	Who2com: Collaborative Perception via Learnable Handshake Communication
Authors	Yen-Cheng Liu, Junjiao Tian, Chih-Yao Ma, Nathan Glaser, Chia-Wen Kuo, Zsolt Kira
Abstract	In this paper, we propose the problem of collaborative perception, where robots can combine their local observations with those of neighboring agents in a learnable way to improve accuracy on a perception task. Unlike existing work in robotics and multi-agent reinforcement learning, we formulate the problem as one where learned information must be shared across a set of agents in a bandwidth-sensitive manner to optimize for scene understanding tasks such as semantic segmentation. Inspired by networking communication protocols, we propose a multi-stage handshake communication mechanism where the neural network can learn to compress relevant information needed for each stage. Specifically, a target agent with degraded sensor data sends a compressed request, the other agents respond with matching scores, and the target agent determines who to connect with (i.e., receive information from). We additionally develop the AirSim-CP dataset and metrics based on the AirSim simulator where a group of aerial robots perceive diverse landscapes, such as roads, grasslands, buildings, etc. We show that for the semantic segmentation task, our handshake communication method significantly improves accuracy by approximately 20% over decentralized baselines, and is comparable to centralized ones using a quarter of the bandwidth.
Tasks	Multi-agent Reinforcement Learning, Scene Understanding, Semantic Segmentation
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09575v1
PDF	https://arxiv.org/pdf/2003.09575v1.pdf
PWC	https://paperswithcode.com/paper/who2com-collaborative-perception-via
Repo
Framework
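
The three handshake stages in the abstract (request, matching scores, selection) can be illustrated with plain vectors. In this sketch the request and the other agents' keys are given directly and the score is a dot product; in the paper these are learned by neural networks, so every name and number here is an assumption.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def handshake(request, agent_keys):
    """Return the best-matching agent and all matching scores.

    request:    compressed query sent by the target agent
    agent_keys: mapping from agent name to that agent's key vector
    """
    scores = {name: dot(request, key) for name, key in agent_keys.items()}
    chosen = max(scores, key=scores.get)
    return chosen, scores
```

Only the short request and the scalar scores cross the network before the target agent commits to one sender, which is where the bandwidth saving mentioned in the abstract would come from.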

The Emerging Landscape of Explainable AI Planning and Decision Making


Title	The Emerging Landscape of Explainable AI Planning and Decision Making
Authors	Tathagata Chakraborti, Sarath Sreedharan, Subbarao Kambhampati
Abstract	In this paper, we provide a comprehensive outline of the different threads of work in Explainable AI Planning (XAIP) that has emerged as a focus area in the last couple of years and contrast that with earlier efforts in the field in terms of techniques, target users, and delivery mechanisms. We hope that the survey will provide guidance to new researchers in automated planning towards the role of explanations in the effective design of human-in-the-loop systems, as well as provide the established researcher with some perspective on the evolution of the exciting world of explainable planning.
Tasks	Decision Making
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11697v1
PDF	https://arxiv.org/pdf/2002.11697v1.pdf
PWC	https://paperswithcode.com/paper/the-emerging-landscape-of-explainable-ai
Repo
Framework

Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness


Title	Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness
Authors	Thinh T. Doan, Lam M. Nguyen, Nhan H. Pham, Justin Romberg
Abstract	Motivated by broad applications in reinforcement learning and machine learning, this paper considers the popular stochastic gradient descent (SGD) when the gradients of the underlying objective function are sampled from Markov processes. This Markov sampling leads to the gradient samples being biased and not independent. The existing results for the convergence of SGD under Markov randomness are often established under the assumptions on the boundedness of either the iterates or the gradient samples. Our main focus is to study the finite-time convergence of SGD for different types of objective functions, without requiring these assumptions. We show that SGD converges nearly at the same rate with Markovian gradient samples as with independent gradient samples. The only difference is a logarithmic factor that accounts for the mixing time of the Markov chain.
Tasks
Published	2020-03-24
URL	https://arxiv.org/abs/2003.10973v2
PDF	https://arxiv.org/pdf/2003.10973v2.pdf
PWC	https://paperswithcode.com/paper/finite-time-analysis-of-stochastic-gradient
Repo
Framework
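
A toy instance of the setting in the abstract, offered as a sketch rather than the paper's analysis: SGD on a scalar quadratic where the data index follows a lazy cyclic Markov chain instead of being drawn independently. The objective, the chain, and the 1/(k+1) step size are all choices made here for illustration.

```python
import random


def markov_sgd(targets, stay_prob, steps, seed=0):
    """Minimise the stationary average of 0.5 * (x - a_s)^2 with Markov samples.

    The chain stays in its state with probability `stay_prob` and otherwise
    moves to the next state cyclically, so its stationary law is uniform and
    the minimiser is the plain mean of `targets`.
    """
    rng = random.Random(seed)
    state, x = 0, 0.0
    for k in range(steps):
        grad = x - targets[state]   # gradient of the sampled loss; consecutive samples are correlated
        x -= grad / (k + 1)         # diminishing step size
        if rng.random() > stay_prob:
            state = (state + 1) % len(targets)
    return x
```

Consecutive gradients are strongly correlated when `stay_prob` is close to 1, yet the iterate still settles near the mean of `targets`; a slower-mixing chain simply needs more steps.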

A Deep Multi-Agent Reinforcement Learning Approach to Autonomous Separation Assurance


Title	A Deep Multi-Agent Reinforcement Learning Approach to Autonomous Separation Assurance
Authors	Marc Brittain, Xuxi Yang, Peng Wei
Abstract	A novel deep multi-agent reinforcement learning framework is proposed to identify and resolve conflicts among a variable number of aircraft in a high-density, stochastic, and dynamic sector in en route airspace. Currently, sector capacity is limited by the cognitive limitations of human air traffic controllers. In order to scale up to a high-density airspace, in this work we investigate the feasibility of a new concept (autonomous separation assurance) and a new approach (multi-agent reinforcement learning) to push the sector capacity above human cognitive limitations. We propose the concept of using distributed vehicle autonomy to ensure separation, instead of a centralized sector air traffic controller. Our proposed framework utilizes an actor-critic model, Proximal Policy Optimization (PPO), which we customize to incorporate an attention network. By using the attention network, we are able to encode the information from a variable number of intruder aircraft into a fixed length vector and allow the agents to learn which intruder aircraft’s information is critical to achieve the optimal performance. This allows the agents to have access to variable aircraft information in the sector in a scalable, efficient approach to achieve high traffic throughput under uncertainty. The agents are trained using a centralized learning, decentralized execution scheme where one neural network is learned and shared by all agents in the environment. To validate the proposed framework, we designed three challenging case studies in the BlueSky air traffic control environment. Numerical results show the proposed framework significantly reduces the offline training time without sacrificing performance.
Tasks	Multi-agent Reinforcement Learning
Published	2020-03-17
URL	https://arxiv.org/abs/2003.08353v1
PDF	https://arxiv.org/pdf/2003.08353v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-multi-agent-reinforcement-learning
Repo
Framework
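
The abstract's point about encoding a variable number of intruder aircraft into a fixed-length vector can be shown with softmax attention pooling. This sketch uses an unlearned dot-product score and hand-written vectors; the paper's attention network is trained, so treat the function and its inputs as hypothetical.

```python
import math


def attention_pool(query, intruders):
    """Softmax-weighted sum of intruder feature vectors.

    The output length equals the feature length no matter how many
    intruders are passed in; an empty list yields a zero vector.
    """
    if not intruders:
        return [0.0] * len(query)
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in intruders]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(intruders[0])
    return [sum(w / total * vec[i] for w, vec in zip(weights, intruders))
            for i in range(dim)]
```

Intruders whose features align with the query receive larger weights, which is a simple analogue of the agent learning which aircraft matter most.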

Regularized Cycle Consistent Generative Adversarial Network for Anomaly Detection


Title	Regularized Cycle Consistent Generative Adversarial Network for Anomaly Detection
Authors	Ziyi Yang, Iman Soltani Bozchalooi, Eric Darve
Abstract	In this paper, we investigate algorithms for anomaly detection. Previous anomaly detection methods focus on modeling the distribution of non-anomalous data provided during training. However, this does not necessarily ensure the correct detection of anomalous data. We propose a new Regularized Cycle Consistent Generative Adversarial Network (RCGAN) in which deep neural networks are adversarially trained to better recognize anomalous samples. This approach is based on leveraging a penalty distribution with a new definition of the loss function and novel use of discriminator networks. It is based on a solid mathematical foundation, and proofs show that our approach has stronger guarantees for detecting anomalous examples compared to the current state-of-the-art. Experimental results on both real-world and synthetic data show that our model leads to significant and consistent improvements on previous anomaly detection benchmarks. Notably, RCGAN improves on the state-of-the-art on the KDDCUP, Arrhythmia, Thyroid, Musk and CIFAR10 datasets.
Tasks	Anomaly Detection
Published	2020-01-18
URL	https://arxiv.org/abs/2001.06591v1
PDF	https://arxiv.org/pdf/2001.06591v1.pdf
PWC	https://paperswithcode.com/paper/regularized-cycle-consistent-generative
Repo
Framework

Regularity and stability of feedback relaxed controls


Title	Regularity and stability of feedback relaxed controls
Authors	Christoph Reisinger, Yufei Zhang
Abstract	This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls for multi-dimensional continuous-time stochastic exit time problems. We establish that the regularized control problem admits a Hölder continuous feedback control, and demonstrate that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations. Moreover, we show that a pre-computed feedback relaxed control has a robust performance in a perturbed system, and derive a first-order sensitivity equation for both the value function and optimal feedback relaxed control. We finally prove first-order monotone convergence of the value functions for relaxed control problems with vanishing exploration parameters, which subsequently enables us to construct the pure exploitation strategy of the original control problem based on the feedback relaxed controls.
Tasks
Published	2020-01-09
URL	https://arxiv.org/abs/2001.03148v1
PDF	https://arxiv.org/pdf/2001.03148v1.pdf
PWC	https://paperswithcode.com/paper/regularity-and-stability-of-feedback-relaxed
Repo
Framework

Value Variance Minimization for Learning Approximate Equilibrium in Aggregation Systems


Title	Value Variance Minimization for Learning Approximate Equilibrium in Aggregation Systems
Authors	Tanvi Verma, Pradeep Varakantham
Abstract	For effective matching of resources (e.g., taxis, food, bikes, shopping items) to customer demand, aggregation systems have been extremely successful. In aggregation systems, a central entity (e.g., Uber, Food Panda, Ofo) aggregates supply (e.g., drivers, delivery personnel) and matches demand to supply on a continuous basis (sequential decisions). Due to the objective of the central entity to maximize its profits, individual suppliers get sacrificed thereby creating incentive for individuals to leave the system. In this paper, we consider the problem of learning approximate equilibrium solutions (win-win solutions) in aggregation systems, so that individuals have an incentive to remain in the aggregation system. Unfortunately, such systems have thousands of agents and have to consider demand uncertainty and the underlying problem is a (Partially Observable) Stochastic Game. Given the significant complexity of learning or planning in a stochastic game, we make three key contributions: (a) To exploit infinitesimally small contribution of each agent and anonymity (reward and transitions between agents are dependent on agent counts) in interactions, we represent this as a Multi-Agent Reinforcement Learning (MARL) problem that builds on insights from non-atomic congestion games model; (b) We provide a novel variance reduction mechanism for moving joint solution towards Nash Equilibrium that exploits the infinitesimally small contribution of each agent; and finally (c) We provide detailed results on three different domains to demonstrate the utility of our approach in comparison to state-of-the-art methods.
Tasks	Multi-agent Reinforcement Learning
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07088v1
PDF	https://arxiv.org/pdf/2003.07088v1.pdf
PWC	https://paperswithcode.com/paper/value-variance-minimization-for-learning
Repo
Framework