April 1, 2020

Paper Group ANR 431

Out-of-Distribution Detection with Distance Guarantee in Deep Generative Models

Title Out-of-Distribution Detection with Distance Guarantee in Deep Generative Models
Authors Yufeng Zhang, Wanwei Liu, Zhenbang Chen, Ji Wang, Zhiming Liu, Kenli Li, Hongmei Wei, Zuoning Chen
Abstract Recent research has shown that it is challenging to detect out-of-distribution (OOD) data in deep generative models, including flow-based models and variational autoencoders (VAEs). In this paper, we prove a theorem that, for a well-trained flow-based model, the distance between the distribution of representations of an OOD dataset and the prior can be large enough, as long as the distance between the distributions of the training dataset and the OOD dataset is large enough. Furthermore, our observations show that, for flow-based models and VAEs with a factorized prior, the representations of OOD datasets are more correlated than those of the training dataset. Based on our theorem and observations, we propose detecting OOD data according to the total correlation of representations in flow-based models and VAEs. Experimental results show that our method achieves nearly 100% AUROC on all the widely used benchmarks and is robust against data manipulation, whereas the state-of-the-art method performs no better than random guessing on challenging problems and can be fooled by data manipulation in almost all cases.
Tasks Out-of-Distribution Detection
Published 2020-02-09
URL https://arxiv.org/abs/2002.03328v1
PDF https://arxiv.org/pdf/2002.03328v1.pdf
PWC https://paperswithcode.com/paper/out-of-distribution-detection-with-distance
Repo
Framework
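
The abstract above scores OOD data by the total correlation of latent representations but does not say how that quantity is estimated. As a rough illustration only, the sketch below assumes the representations are jointly Gaussian, in which case total correlation has the closed form 0.5 * (sum of log marginal variances - log det of the covariance). The Gaussian assumption, the function names, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import math
import random


def covariance(rows):
    """Sample covariance matrix (list of lists) of equal-length row vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        centered = [r[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += centered[i] * centered[j] / (n - 1)
    return cov


def log_det_spd(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    d = len(m)
    chol = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(d))


def gaussian_total_correlation(rows):
    """Total correlation under a Gaussian assumption (an assumption of this sketch)."""
    cov = covariance(rows)
    marginal_term = sum(math.log(cov[i][i]) for i in range(len(cov)))
    return 0.5 * (marginal_term - log_det_spd(cov))


# Hypothetical 2-D "representations": one set independent, one strongly correlated.
rng = random.Random(0)
independent = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]
correlated = []
for _ in range(2000):
    a = rng.gauss(0, 1)
    correlated.append([a, 0.9 * a + 0.1 * rng.gauss(0, 1)])

tc_independent = gaussian_total_correlation(independent)
tc_correlated = gaussian_total_correlation(correlated)
print(tc_independent, tc_correlated)
```

In this toy setup the correlated set gets a much larger score than the independent one, which is the kind of separation a correlation-based detector would threshold on.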

“Other-Play” for Zero-Shot Coordination

Title “Other-Play” for Zero-Shot Coordination
Authors Hengyuan Hu, Adam Lerer, Alex Peysakhovich, Jakob Foerster
Abstract We consider the problem of zero-shot coordination - constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans). Standard Multi-Agent Reinforcement Learning (MARL) methods typically focus on the self-play (SP) setting where agents construct strategies by playing the game with themselves repeatedly. Unfortunately, applying SP naively to the zero-shot coordination problem can produce agents that establish highly specialized conventions that do not carry over to novel partners they have not been trained with. We introduce a novel learning algorithm called other-play (OP), that enhances self-play by looking for more robust strategies, exploiting the presence of known symmetries in the underlying problem. We characterize OP theoretically as well as experimentally. We study the cooperative card game Hanabi and show that OP agents achieve higher scores when paired with independently trained agents. In preliminary results we also show that our OP agents obtain higher average scores when paired with human players, compared to state-of-the-art SP agents.
Tasks Multi-agent Reinforcement Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.02979v2
PDF https://arxiv.org/pdf/2003.02979v2.pdf
PWC https://paperswithcode.com/paper/other-play-for-zero-shot-coordination
Repo
Framework
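
To make the idea of "exploiting known symmetries" concrete, here is a minimal sketch on a hypothetical toy coordination game; it is not the paper's algorithm and has nothing to do with Hanabi. Two players each pick one of ten levers and are paid only if they pick the same one. Nine levers pay 1.0 and are interchangeable; one lever pays 0.9 and is distinguishable. The payoffs, the symmetry classes, and the function names are assumptions made for illustration.

```python
# Hypothetical toy game: both players must pick the same lever to be paid.
PAYOFFS = [1.0] * 9 + [0.9]

# Levers inside one class are interchangeable under the game's symmetries.
SYMMETRY_CLASSES = [list(range(9)), [9]]


def self_play_value(lever):
    """Return when the partner is an exact copy that picks the same lever."""
    return PAYOFFS[lever]


def other_play_value(lever):
    """Expected return when the partner's choice is relabelled by a random symmetry.

    Under a uniformly random permutation of a symmetry class, the partner's
    relabelled lever is uniform over that class.
    """
    orbit = next(c for c in SYMMETRY_CLASSES if lever in c)
    matches = sum(PAYOFFS[lever] for partner in orbit if partner == lever)
    return matches / len(orbit)


best_self_play = max(range(len(PAYOFFS)), key=self_play_value)
best_other_play = max(range(len(PAYOFFS)), key=other_play_value)
print(best_self_play, self_play_value(best_self_play))
print(best_other_play, other_play_value(best_other_play))
```

Self-play is happy with any of the 1.0 levers, which is an arbitrary convention a new partner cannot guess. Scoring policies under random relabelling instead favours the 0.9 lever, the one choice that survives every symmetry.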

Visual-Inertial Telepresence for Aerial Manipulation

Title Visual-Inertial Telepresence for Aerial Manipulation
Authors Jongseok Lee, Ribin Balachandran, Yuri S. Sarkisov, Marco De Stefano, Andre Coelho, Kashmira Shinde, Min Jun Kim, Rudolph Triebel, Konstantin Kondak
Abstract This paper presents a novel telepresence system for enhancing aerial manipulation capabilities. It involves not only a haptic device, but also a virtual reality that provides a 3D visual feedback to a remotely-located teleoperator in real-time. We achieve this by utilizing onboard visual and inertial sensors, an object tracking algorithm and a pre-generated object database. As the virtual reality has to closely match the real remote scene, we propose an extension of a marker tracking algorithm with visual-inertial odometry. Both indoor and outdoor experiments show benefits of our proposed system in achieving advanced aerial manipulation tasks, namely grasping, placing, force exertion and peg-in-hole insertion.
Tasks Object Tracking
Published 2020-03-25
URL https://arxiv.org/abs/2003.11509v1
PDF https://arxiv.org/pdf/2003.11509v1.pdf
PWC https://paperswithcode.com/paper/visual-inertial-telepresence-for-aerial
Repo
Framework

Camera-Based Adaptive Trajectory Guidance via Neural Networks

Title Camera-Based Adaptive Trajectory Guidance via Neural Networks
Authors Aditya Rajguru, Christopher Collander, William J. Beksi
Abstract In this paper, we introduce a novel method to capture visual trajectories for navigating an indoor robot in dynamic settings using streaming image data. First, an image processing pipeline is proposed to accurately segment trajectories from noisy backgrounds. Next, the captured trajectories are used to design, train, and compare two neural network architectures for predicting acceleration and steering commands for a line following robot over a continuous space in real time. Lastly, experimental results demonstrate the performance of the neural networks versus human teleoperation of the robot and the viability of the system in environments with occlusions and/or low-light conditions.
Tasks
Published 2020-01-09
URL https://arxiv.org/abs/2001.03205v1
PDF https://arxiv.org/pdf/2001.03205v1.pdf
PWC https://paperswithcode.com/paper/camera-based-adaptive-trajectory-guidance-via
Repo
Framework

A Comprehensive Survey on the Ambulance Routing and Location Problems

Title A Comprehensive Survey on the Ambulance Routing and Location Problems
Authors Joseph Tassone, Salimur Choudhury
Abstract In this research, an extensive literature review was performed on the recent developments of the ambulance routing problem (ARP) and ambulance location problem (ALP). Both are respective modifications of the vehicle routing problem (VRP) and maximum covering problem (MCP), with modifications to objective functions and constraints. Although alike, a key distinction is that emergency service systems (EMS) are considered critical, and their optimization has become all the more important as a result. Similar to their parent problems, these are NP-hard and must resort to approximations if the space size is too large. Much of the current work has simply been on modifying existing systems through simulation to achieve a more acceptable result. There have been attempts towards using meta-heuristics, though practical experimentation is lacking when compared to VRP or MCP. The contributions of this work are a comprehensive survey of current methodologies, summarized models, and suggested future improvements.
Tasks
Published 2020-01-10
URL https://arxiv.org/abs/2001.05288v1
PDF https://arxiv.org/pdf/2001.05288v1.pdf
PWC https://paperswithcode.com/paper/a-comprehensive-survey-on-the-ambulance
Repo
Framework
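
The abstract notes that location problems derived from the maximum covering problem are NP-hard and are approached with approximations. As a generic illustration, not a method taken from the survey, the sketch below applies the standard greedy heuristic for unweighted maximum coverage, which is known to achieve a 1 - 1/e approximation guarantee. The site names, demand points, and function name are hypothetical.

```python
def greedy_max_coverage(candidate_sites, k):
    """Pick up to k sites, each time taking the one that covers the most new demand.

    candidate_sites maps a site name to the set of demand points it can reach.
    Returns the chosen sites in selection order and the set of covered points.
    """
    chosen, covered = [], set()
    remaining = dict(candidate_sites)
    for _ in range(k):
        best = max(remaining, key=lambda s: len(remaining[s] - covered), default=None)
        if best is None or not (remaining[best] - covered):
            break  # nothing left that adds coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical ambulance stations and the demand zones each can reach in time.
sites = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {5, 6, 7},
    "D": {1, 7},
}
chosen, covered = greedy_max_coverage(sites, k=2)
print(chosen, sorted(covered))
```

A realistic ALP model would add demand weights, travel-time constraints, and vehicle availability; the greedy loop only shows the basic coverage trade-off.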

Automatic detection and counting of retina cell nuclei using deep learning

Title Automatic detection and counting of retina cell nuclei using deep learning
Authors S. M. Hadi Hosseini, Hao Chen, Monica M. Jablonski
Abstract The ability to automatically detect and classify retinal cells and other biological objects, and to calculate their size, number, and grade, is critically important in eye diseases such as age-related macular degeneration (AMD). In this paper, we developed an automated tool based on deep learning and the Mask R-CNN model to analyze large datasets of transmission electron microscopy (TEM) images and quantify retinal cells with high speed and precision. We considered three categories for outer nuclear layer (ONL) cells: live, intermediate, and pyknotic. We trained the model using a dataset of 24 samples. We then optimized the hyper-parameters using another set of 6 samples. The results of this research, after applying the model to the test datasets, demonstrated that our method is highly accurate for automatically detecting, categorizing, and counting cell nuclei in the ONL of the retina. Performance of our model was tested using general metrics: general mean average precision (mAP) for detection; and precision, recall, F1-score, and accuracy for categorizing and counting.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.03563v1
PDF https://arxiv.org/pdf/2002.03563v1.pdf
PWC https://paperswithcode.com/paper/automatic-detection-and-counting-of-retina
Repo
Framework
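
The abstract reports precision, recall, F1-score, and accuracy for categorizing and counting nuclei. For readers unfamiliar with these metrics, the sketch below computes them from confusion-matrix counts for a single class. The counts are made up for illustration and are not results from the paper; the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for one nucleus category (e.g. "pyknotic" vs. everything else).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```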

Registration of multi-view point sets under the perspective of expectation-maximization

Title Registration of multi-view point sets under the perspective of expectation-maximization
Authors Jihua Zhu, Jing Zhang, Huimin Lu, Zhongyu Li
Abstract Registration of multi-view point sets is a prerequisite for 3D model reconstruction. To solve this problem, most previous approaches either only partially exploit the available information or blindly use unnecessary information to align each point set, which may lead to undesired results or introduce extra computational complexity. To this end, this paper treats the multi-view registration problem as a maximum likelihood estimation problem and proposes a novel multi-view registration approach from the perspective of Expectation-Maximization (EM). The basic idea of our approach is that different data points are generated by the same number of Gaussian mixture models (GMMs). For each data point in one point set, its nearest neighbors can be searched from the other well-aligned point sets. We can then suppose that this data point is generated by a special GMM in which each nearest neighbor is associated with one Gaussian distribution. Based on this assumption, it is reasonable to define a likelihood function that includes all the rigid transformations that need to be estimated for multi-view registration. Subsequently, the EM algorithm is used to maximize the likelihood function so as to estimate all rigid transformations. Finally, the proposed approach is tested on several benchmark data sets and compared with some state-of-the-art algorithms. Experimental results illustrate its superior performance in accuracy, robustness and efficiency for the registration of multi-view point sets.
Tasks
Published 2020-02-18
URL https://arxiv.org/abs/2002.07464v2
PDF https://arxiv.org/pdf/2002.07464v2.pdf
PWC https://paperswithcode.com/paper/registration-of-multi-view-point-sets-under
Repo
Framework

IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control

Title IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control
Authors François-Xavier Devailly, Denis Larocque, Laurent Charlin
Abstract Scaling adaptive traffic-signal control involves dealing with combinatorial state and action spaces. Multi-agent reinforcement learning attempts to address this challenge by distributing control to specialized agents. However, specialization hinders generalization and transferability, and the computational graphs underlying neural-networks architectures—dominating in the multi-agent setting—do not offer the flexibility to handle an arbitrary number of entities which changes both between road networks, and over time as vehicles traverse the network. We introduce Inductive Graph Reinforcement Learning (IG-RL) based on graph-convolutional networks which adapts to the structure of any road network, to learn detailed representations of traffic-controllers and their surroundings. Our decentralized approach enables learning of a transferable-adaptive-traffic-signal-control policy. After being trained on an arbitrary set of road networks, our model can generalize to new road networks, traffic distributions, and traffic regimes, with no additional training and a constant number of parameters, enabling greater scalability compared to prior methods. Furthermore, our approach can exploit the granularity of available data by capturing the (dynamic) demand at both the lane and the vehicle levels. The proposed method is tested on both road networks and traffic settings never experienced during training. We compare IG-RL to multi-agent reinforcement learning and domain-specific baselines. In both synthetic road networks and in a larger experiment involving the control of the 3,971 traffic signals of Manhattan, we show that different instantiations of IG-RL outperform baselines.
Tasks Multi-agent Reinforcement Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.05738v3
PDF https://arxiv.org/pdf/2003.05738v3.pdf
PWC https://paperswithcode.com/paper/ig-rl-inductive-graph-reinforcement-learning
Repo
Framework

Optimal Resolution of Change-Point Detection with Empirically Observed Statistics and Erasures

Title Optimal Resolution of Change-Point Detection with Empirically Observed Statistics and Erasures
Authors Haiyun He, Qiaosheng Zhang, Vincent Y. F. Tan
Abstract This paper revisits the offline change-point detection problem from a statistical learning perspective. Instead of assuming that the underlying pre- and post-change distributions are known, it is assumed that we have partial knowledge of these distributions based on empirically observed statistics in the form of training sequences. Our problem formulation finds a variety of real-life applications from detecting when climate change occurred to detecting when a virus mutated. Using the training sequences as well as the test sequence consisting of a single-change and allowing for the erasure or rejection option, we derive the optimal resolution between the estimated and true change-points under two different asymptotic regimes on the undetected error probability—namely, the large and moderate deviations regimes. In both regimes, strong converses are also proved. In the moderate deviations case, the optimal resolution is a simple function of a symmetrized version of the chi-square distance.
Tasks Change Point Detection
Published 2020-03-13
URL https://arxiv.org/abs/2003.06511v1
PDF https://arxiv.org/pdf/2003.06511v1.pdf
PWC https://paperswithcode.com/paper/optimal-resolution-of-change-point-detection
Repo
Framework

Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs

Title Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs
Authors Jonathan Frankle, David J. Schwab, Ari S. Morcos
Abstract Batch normalization (BatchNorm) has become an indispensable tool for training deep neural networks, yet it is still poorly understood. Although previous work has typically focused on its normalization component, BatchNorm also adds two per-feature trainable parameters: a coefficient and a bias. However, the role and expressive power of these parameters remains unclear. To study this question, we investigate the performance achieved when training only these parameters and freezing all others at their random initializations. We find that doing so leads to surprisingly high performance. For example, a sufficiently deep ResNet reaches 83% accuracy on CIFAR-10 in this configuration. Interestingly, BatchNorm achieves this performance in part by naturally learning to disable around a third of the random features without any changes to the training objective. Not only do these results highlight the under-appreciated role of the affine parameters in BatchNorm, but - in a broader sense - they characterize the expressive power of neural networks constructed simply by shifting and rescaling random features.
Tasks
Published 2020-02-29
URL https://arxiv.org/abs/2003.00152v1
PDF https://arxiv.org/pdf/2003.00152v1.pdf
PWC https://paperswithcode.com/paper/training-batchnorm-and-only-batchnorm-on-the
Repo
Framework
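
For reference, the "coefficient and bias" mentioned in the abstract are the affine parameters of the standard batch normalization transform. The symbols below follow the usual textbook notation and are not copied from the paper: $\mu_B$ and $\sigma_B^2$ are the mini-batch mean and variance of a feature, $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are the per-feature trainable coefficient and bias.

```latex
\hat{x} = \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}},
\qquad
y = \gamma \, \hat{x} + \beta
```

In the configuration the abstract describes, only $\gamma$ and $\beta$ are trained while all other weights stay at their random initial values. One plausible reading of "disabling" a feature, stated here as an assumption, is that $\gamma$ is driven close to zero, so that $y \approx \beta$ regardless of the input.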

Model adaptation and unsupervised learning with non-stationary batch data under smooth concept drift

Title Model adaptation and unsupervised learning with non-stationary batch data under smooth concept drift
Authors Subhro Das, Prasanth Lade, Soundar Srinivasan
Abstract Most predictive models assume that training and test data are generated from a stationary process. However, this assumption does not hold true in practice. In this paper, we consider the scenario of a gradual concept drift due to the underlying non-stationarity of the data source. While previous work has investigated this scenario under supervised-learning and adaptation conditions, few have addressed the common, real-world scenario in which labels are only available during training. We propose a novel, iterative algorithm for unsupervised adaptation of predictive models. We show that the performance of our batch-adapted prediction algorithm is better than that of its corresponding unadapted version. The proposed algorithm provides similar (or better, in most cases) performance in significantly less run time compared to other state-of-the-art methods. We validate our claims through extensive numerical evaluations on both synthetic and real data.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.04094v1
PDF https://arxiv.org/pdf/2002.04094v1.pdf
PWC https://paperswithcode.com/paper/model-adaptation-and-unsupervised-learning
Repo
Framework

Attention-Based Self-Supervised Feature Learning for Security Data

Title Attention-Based Self-Supervised Feature Learning for Security Data
Authors I-Ta Lee, Manish Marwah, Martin Arlitt
Abstract While applications of machine learning in cyber-security have grown rapidly, most models use manually constructed features. This manual approach is error-prone and requires domain expertise. In this paper, we design a self-supervised sequence-to-sequence model with attention to learn an embedding for data routinely used in cyber-security applications. The method is validated on two real world public data sets. The learned features are used in an anomaly detection model and perform better than learned features from baseline methods.
Tasks Anomaly Detection
Published 2020-03-24
URL https://arxiv.org/abs/2003.10639v1
PDF https://arxiv.org/pdf/2003.10639v1.pdf
PWC https://paperswithcode.com/paper/attention-based-self-supervised-feature
Repo
Framework

Learning to Segment 3D Point Clouds in 2D Image Space

Title Learning to Segment 3D Point Clouds in 2D Image Space
Authors Yecheng Lyu, Xinming Huang, Ziming Zhang
Abstract In contrast to the literature where local patterns in 3D point clouds are captured by customized convolutional operators, in this paper we study the problem of how to effectively and efficiently project such point clouds into a 2D image space so that traditional 2D convolutional neural networks (CNNs) such as U-Net can be applied for segmentation. To this end, we are motivated by graph drawing and reformulate it as an integer programming problem to learn the topology-preserving graph-to-grid mapping for each individual point cloud. To accelerate the computation in practice, we further propose a novel hierarchical approximate algorithm. With the help of the Delaunay triangulation for graph construction from point clouds and a multi-scale U-Net for segmentation, we manage to demonstrate the state-of-the-art performance on ShapeNet and PartNet, respectively, with significant improvement over the literature. Code is available at https://github.com/Zhang-VISLab.
Tasks 3D Part Segmentation, graph construction
Published 2020-03-12
URL https://arxiv.org/abs/2003.05593v3
PDF https://arxiv.org/pdf/2003.05593v3.pdf
PWC https://paperswithcode.com/paper/learning-to-segment-3d-point-clouds-in-2d
Repo
Framework

When Deep Learning Meets Data Alignment: A Review on Deep Registration Networks (DRNs)

Title When Deep Learning Meets Data Alignment: A Review on Deep Registration Networks (DRNs)
Authors Victor Villena-Martinez, Sergiu Oprea, Marcelo Saval-Calvo, Jorge Azorin-Lopez, Andres Fuster-Guillo, Robert B. Fisher
Abstract Registration is the process that computes the transformation that aligns sets of data. Commonly, a registration process can be divided into four main steps: target selection, feature extraction, feature matching, and transform computation for the alignment. The accuracy of the result depends on multiple factors, the most significant of which are the quantity of input data, the presence of noise, outliers and occlusions, the quality of the extracted features, real-time requirements and the type of transformation, especially those defined by multiple parameters, like non-rigid deformations. Recent advancements in machine learning could be a turning point in these issues, particularly with the development of deep learning (DL) techniques, which are helping to improve multiple computer vision problems through an abstract understanding of the input data. In this paper, a review of deep learning-based registration methods is presented. We classify the different papers using a framework derived from the traditional registration pipeline to analyse the strengths of the new learning-based proposals. Deep Registration Networks (DRNs) try to solve the alignment task either by replacing part of the traditional pipeline with a network or by fully solving the registration problem. The main conclusions are: 1) learning-based registration techniques cannot always be clearly classified in the traditional pipeline; 2) these approaches allow more complex inputs, such as conceptual models, as well as the traditional 3D datasets; 3) in spite of the generality of learning, the current proposals are still ad hoc solutions; and 4) this is a young topic that still requires a large effort to reach general solutions able to cope with the problems that affect traditional approaches.
Tasks
Published 2020-03-06
URL https://arxiv.org/abs/2003.03167v1
PDF https://arxiv.org/pdf/2003.03167v1.pdf
PWC https://paperswithcode.com/paper/when-deep-learning-meets-data-alignment-a
Repo
Framework

Crossmodal learning for audio-visual speech event localization

Title Crossmodal learning for audio-visual speech event localization
Authors Rahul Sharma, Krishna Somandepalli, Shrikanth Narayanan
Abstract An objective understanding of media depictions, such as inclusive portrayals of how much someone is heard and seen on screen in film and television, requires machines to automatically discern who is talking, and when, how and where. Media content is rich in multiple modalities such as visuals and audio, which can be used to learn speaker activity in videos. In this work, we present visual representations that have implicit information about when someone is talking and where. We propose a crossmodal neural network for audio speech event detection using the visual frames. We use the learned representations for two downstream tasks: i) audio-visual voice activity detection and ii) active speaker localization in video frames. We present a state-of-the-art audio-visual voice activity detection system and demonstrate that the learned embeddings can effectively localize active speakers in the visual frames.
Tasks Action Detection, Activity Detection
Published 2020-03-09
URL https://arxiv.org/abs/2003.04358v1
PDF https://arxiv.org/pdf/2003.04358v1.pdf
PWC https://paperswithcode.com/paper/crossmodal-learning-for-audio-visual-speech
Repo
Framework