January 30, 2020

3056 words 15 mins read

Paper Group ANR 380


Detecting Driveable Area for Autonomous Vehicles. Hierarchical Meta Learning. Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones. Policy Learning for Malaria Control. Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem. Deep Learning Training with Simulated Approximate Mult …

Detecting Driveable Area for Autonomous Vehicles

Title Detecting Driveable Area for Autonomous Vehicles
Authors Niral Shah, Ashwin Shankar, Jae-hong Park
Abstract Autonomous driving is a challenging problem where there is currently an intense focus on research and development. Human drivers are forced to make thousands of complex decisions in a short amount of time, quickly processing their surroundings and moving factors. One of these aspects, recognizing regions on the road that are driveable, is vital to the success of any autonomous system. This problem can be addressed with deep learning, framed as a region proposal problem. Utilizing a Mask R-CNN trained on the Berkeley Deep Drive (BDD100k) dataset, we aim to see whether recognizing driveable areas, while also differentiating between the car’s direct (current) lane and alternative lanes, is feasible.
Tasks Autonomous Driving, Autonomous Vehicles
Published 2019-11-07
URL https://arxiv.org/abs/1911.02740v1
PDF https://arxiv.org/pdf/1911.02740v1.pdf
PWC https://paperswithcode.com/paper/detecting-driveable-area-for-autonomous
Repo
Framework
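
As a rough, hedged illustration of the region-proposal framing used in this paper, the sketch below runs a COCO-pretrained Mask R-CNN from torchvision on a single road image. It is not the authors' BDD100k-trained model: the checkpoint, image path, and score threshold are placeholder assumptions, and the COCO classes do not include the current-lane vs. alternative-lane distinction the paper targets.

```python
# Illustrative only: a COCO-pretrained Mask R-CNN, not the paper's BDD100k model.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = Image.open("road_scene.jpg").convert("RGB")   # hypothetical input image
with torch.no_grad():
    prediction = model([to_tensor(image)])[0]

# Keep confident detections; the paper's model would instead output two
# drivable-area classes (current lane vs. alternative lanes).
keep = prediction["scores"] > 0.5
masks = prediction["masks"][keep]      # (N, 1, H, W) soft instance masks
labels = prediction["labels"][keep]    # COCO class ids
print(f"{masks.shape[0]} instance masks above threshold")
```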

Hierarchical Meta Learning

Title Hierarchical Meta Learning
Authors Yingtian Zou, Jiashi Feng
Abstract Meta learning is a promising solution to few-shot learning problems. However, existing meta learning methods are restricted to scenarios where training and application tasks share the same output structure. To obtain a meta model applicable to tasks with new structures, it is required to collect new training data and repeat the time-consuming meta training procedure. This makes them inefficient or even inapplicable in learning to solve heterogeneous few-shot learning tasks. We thus develop a novel and principled Hierarchical Meta Learning (HML) method. Different from existing methods that only focus on optimizing the adaptability of a meta model to similar tasks, HML also explicitly optimizes its generalizability across heterogeneous tasks. To this end, HML first factorizes a set of similar training tasks into heterogeneous ones and trains the meta model over them at two levels to maximize adaptation and generalization performance respectively. The resultant model can then directly generalize to new tasks. Extensive experiments on few-shot classification and regression problems clearly demonstrate the superiority of HML over fine-tuning and state-of-the-art meta learning approaches in terms of generalization across heterogeneous tasks.
Tasks Few-Shot Learning, Meta-Learning
Published 2019-04-19
URL http://arxiv.org/abs/1904.09081v1
PDF http://arxiv.org/pdf/1904.09081v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-meta-learning
Repo
Framework
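
The two-level training idea (adapt within similar tasks, generalize across heterogeneous task groups) can be sketched with a toy, Reptile-style first-order loop. This is not the authors' HML algorithm; the task generator, group structure, and all hyperparameters below are invented purely to make the two levels concrete.

```python
# Toy two-level (task / group) first-order meta-learning sketch in numpy.
# Not the HML algorithm from the paper -- only an illustration of training
# at two levels: adapt within a group of similar tasks, then pull a global
# initialization toward the adapted group initializations.
import numpy as np

rng = np.random.default_rng(0)
DIM, INNER_LR, GROUP_LR, GLOBAL_LR = 5, 0.05, 0.5, 0.3

def sample_task(group_center):
    """A linear-regression task whose true weights lie near its group center."""
    w_true = group_center + 0.1 * rng.normal(size=DIM)
    X = rng.normal(size=(20, DIM))
    return X, X @ w_true

def adapt(w, X, y, steps=5):
    """Plain gradient descent on MSE starting from initialization w."""
    for _ in range(steps):
        w = w - INNER_LR * 2 * X.T @ (X @ w - y) / len(y)
    return w

group_centers = [rng.normal(size=DIM) for _ in range(4)]   # heterogeneous groups
w_global = np.zeros(DIM)

for _ in range(200):                      # meta-iterations
    adapted_group_inits = []
    for center in group_centers:
        w_group = w_global.copy()
        # level 1: adapt to tasks within one group, Reptile-style
        for _ in range(3):
            X, y = sample_task(center)
            w_task = adapt(w_group, X, y)
            w_group = w_group + GROUP_LR * (w_task - w_group)
        adapted_group_inits.append(w_group)
    # level 2: move the global initialization toward the group initializations
    w_global = w_global + GLOBAL_LR * (np.mean(adapted_group_inits, axis=0) - w_global)

print("trained global init:", np.round(w_global, 2))
```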

Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones

Title Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones
Authors Xinwei Chen, Marlin W. Ulmer, Barrett W. Thomas
Abstract In this paper, we consider same-day delivery with a heterogeneous fleet of vehicles and drones. Customers make delivery requests over the course of the day and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadline. Vehicles can deliver multiple packages in one route but travel relatively slowly due to the urban traffic. Drones travel faster, but they have limited capacity and require charging or battery swaps. To exploit the different strengths of the fleets, we propose a deep Q-learning approach. Our method learns the value of assigning a new customer to either drones or vehicles as well as the option to not offer service at all. To aid feature selection, we present an analysis that demonstrates the role that different types of information play in the value function and decision making. In a systematic computational analysis, we show the superiority of our policy compared to benchmark policies and the effectiveness of our deep Q-learning approach.
Tasks Decision Making, Feature Selection, Q-Learning
Published 2019-10-25
URL https://arxiv.org/abs/1910.11901v1
PDF https://arxiv.org/pdf/1910.11901v1.pdf
PWC https://paperswithcode.com/paper/deep-q-learning-for-same-day-delivery-with-a
Repo
Framework
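
At its core, the learned value function scores three options for each incoming request: assign it to a vehicle, assign it to a drone, or decline service. A minimal DQN-style head for that choice is sketched below; the state features, network size, and action encoding are hypothetical, not taken from the paper.

```python
# Minimal DQN-style action scoring for one incoming delivery request.
# The feature vector and architecture are hypothetical, not the paper's.
import torch
import torch.nn as nn

ACTIONS = ["assign_vehicle", "assign_drone", "no_service"]

class DispatchQNet(nn.Module):
    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, len(ACTIONS)),       # one Q-value per action
        )

    def forward(self, state):
        return self.net(state)

# Hypothetical state: time of day, request location, fleet availability, etc.
state = torch.randn(1, 16)
qnet = DispatchQNet(n_features=16)
q_values = qnet(state)
best = ACTIONS[q_values.argmax(dim=1).item()]
print("chosen action:", best, "Q-values:", q_values.detach().numpy().round(2))
```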

Policy Learning for Malaria Control

Title Policy Learning for Malaria Control
Authors Van Bach Nguyen, Belaid Mohamed Karim, Bao Long Vu, Jörg Schlötterer, Michael Granitzer
Abstract Sequential decision making is a typical problem in reinforcement learning with plenty of algorithms to solve it. However, only a few of them can work effectively with a very small number of observations. In this report, we describe our progress in learning a policy for Malaria Control, framed as a Reinforcement Learning problem in the KDD Cup Challenge 2019, and propose diverse solutions to deal with the limited-observations problem. We apply a Genetic Algorithm, Bayesian Optimization, and Q-learning with sequence breaking to find the optimal policy for five years in a row with only 20 episodes/100 evaluations. We evaluate these algorithms and compare their performance with Random Search as a baseline. Among them, Q-Learning with sequence breaking was submitted to the challenge and ranked 7th in the KDD Cup.
Tasks Decision Making, Q-Learning
Published 2019-10-20
URL https://arxiv.org/abs/1910.08926v1
PDF https://arxiv.org/pdf/1910.08926v1.pdf
PWC https://paperswithcode.com/paper/policy-learning-for-malaria-control
Repo
Framework

Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem

Title Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem
Authors John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye
Abstract Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace. Hand-crafting heuristic solutions that account for the dynamics in these resource allocation problems is difficult, and may be better handled by an end-to-end machine learning method. Previous works have applied machine learning methods to the problem from a high-level perspective, where the learning method is responsible for either repositioning the drivers or dispatching orders, and as a further simplification, the drivers are considered independent agents maximizing their own reward functions. In this paper we present a deep reinforcement learning approach for tackling the full fleet management and dispatching problems. In addition to treating the drivers as individual agents, we consider the problem from a system-centric perspective, where a central fleet management agent is responsible for decision-making for all drivers.
Tasks Decision Making
Published 2019-11-25
URL https://arxiv.org/abs/1911.11260v1
PDF https://arxiv.org/pdf/1911.11260v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-multi-driver
Repo
Framework

Deep Learning Training with Simulated Approximate Multipliers

Title Deep Learning Training with Simulated Approximate Multipliers
Authors Issam Hammad, Kamal El-Sankary, Jason Gu
Abstract This paper presents by simulation how approximate multipliers can be utilized to enhance the training performance of convolutional neural networks (CNNs). Approximate multipliers have significantly better performance in terms of speed, power, and area compared to exact multipliers. However, approximate multipliers have an inaccuracy which is defined in terms of the Mean Relative Error (MRE). To assess the applicability of approximate multipliers in enhancing CNN training performance, a simulation of the impact of approximate multiplier error on CNN training is presented. The paper demonstrates that using approximate multipliers for CNN training can significantly enhance the performance in terms of speed, power, and area at the cost of a small negative impact on the achieved accuracy. Additionally, the paper proposes a hybrid training method which mitigates this negative impact on the accuracy. Using the proposed hybrid method, training starts with approximate multipliers and then switches to exact multipliers for the last few epochs. Using this method, the performance benefits of approximate multipliers in terms of speed, power, and area can be attained for a large portion of the training stage. On the other hand, the negative impact on the accuracy is diminished by using the exact multipliers for the last epochs of training.
Tasks
Published 2019-12-26
URL https://arxiv.org/abs/2001.00060v1
PDF https://arxiv.org/pdf/2001.00060v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-training-with-simulated
Repo
Framework
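
One simple way to simulate this setup is to perturb layer outputs by a random relative error on the order of the multiplier's MRE during early epochs, then drop the perturbation for the final epochs (the hybrid switch). The sketch below does that for a small PyTorch MLP; the uniform error model, MRE value, and switch point are assumptions for illustration, since real approximate multipliers have structured rather than random error.

```python
# Simulate training with approximate multipliers by injecting relative error
# into linear-layer outputs, then switch to exact arithmetic for the last epochs.
# The error model (uniform relative noise) is a simplification for illustration.
import torch
import torch.nn as nn

class ApproxLinear(nn.Linear):
    def __init__(self, in_f, out_f, mre=0.02):
        super().__init__(in_f, out_f)
        self.mre = mre
        self.approximate = True

    def forward(self, x):
        out = super().forward(x)
        if self.approximate and self.training:
            noise = torch.empty_like(out).uniform_(-self.mre, self.mre)
            out = out * (1.0 + noise)          # relative error of roughly the MRE
        return out

model = nn.Sequential(ApproxLinear(20, 32), nn.ReLU(), ApproxLinear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
X, y = torch.randn(256, 20), torch.randint(0, 2, (256,))

EPOCHS, SWITCH_EPOCH = 20, 17                  # exact multipliers for the last epochs
for epoch in range(EPOCHS):
    if epoch == SWITCH_EPOCH:
        for m in model.modules():
            if isinstance(m, ApproxLinear):
                m.approximate = False          # hybrid switch to exact arithmetic
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print("final loss:", float(loss))
```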

Evaluation of Dataflow through layers of Deep Neural Networks in Classification and Regression Problems

Title Evaluation of Dataflow through layers of Deep Neural Networks in Classification and Regression Problems
Authors Ahmad Kalhor, Mohsen Saffar, Melika Kheirieh, Somayyeh Hoseinipoor, Babak N. Araabi
Abstract This paper introduces two straightforward, effective indices to evaluate the input data and the data flowing through the layers of a feedforward deep neural network. For classification problems, the separation rate of target labels in the space of dataflow is explained as a key factor indicating the performance of designed layers in improving the generalization of the network. According to the explained concept, a shapeless distance-based evaluation index is proposed. Similarly, for regression problems, the smoothness rate of target outputs in the space of dataflow is explained as a key factor indicating the performance of designed layers in improving the generalization of the network. According to the explained smoothness concept, a shapeless distance-based smoothness index is proposed for regression problems. To treat the concepts of separation and smoothness more strictly, their extended versions are introduced, and by interpreting a regression problem as a classification problem, it is shown that the separation and smoothness indices are related to each other. Through four case studies, the benefits of using the introduced indices are shown. In the first case study, for classification and regression problems, the difficulty of some well-known input datasets is compared using the proposed separation and smoothness indices, respectively. In the second case study, the quality of dataflow is evaluated through the layers of two pre-trained VGG-16 networks in the classification of CIFAR-10 and CIFAR-100. In the third case study, it is shown that the correct classification rate and the separation index are almost equivalent across layers, particularly while the separation index increases. In the last case study, two multi-layer neural networks, which are designed for the prediction of Boston Housing prices, are compared layer by layer using the proposed smoothness index.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05156v1
PDF https://arxiv.org/pdf/1906.05156v1.pdf
PWC https://paperswithcode.com/paper/evaluation-of-dataflow-through-layers-of-deep
Repo
Framework
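
One plausible distance-based reading of the separation idea is nearest-neighbor label agreement: for each sample, check whether its nearest neighbor in a layer's representation carries the same label, and average over samples. The numpy sketch below implements that score; it is an illustrative instantiation, not necessarily the exact index defined in the paper.

```python
# Nearest-neighbor label-agreement score as a rough separation measure for the
# representation produced at some layer. One plausible reading of the idea, not
# the paper's exact definition.
import numpy as np

def separation_score(features: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose nearest neighbor (excluding itself) shares their label."""
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # never pick the sample itself
    nearest = d.argmin(axis=1)
    return float((labels[nearest] == labels).mean())

# Toy example: two well-separated Gaussian blobs should score close to 1.0.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (100, 8)), rng.normal(6, 1, (100, 8))])
labs = np.array([0] * 100 + [1] * 100)
print("separation score:", separation_score(feats, labs))
```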

In-Place Zero-Space Memory Protection for CNN

Title In-Place Zero-Space Memory Protection for CNN
Authors Hui Guan, Lin Ning, Zhen Lin, Xipeng Shen, Huiyang Zhou, Seung-Hwan Lim
Abstract Convolutional Neural Networks (CNN) are being actively explored for safety-critical applications such as autonomous vehicles and aerospace, where it is essential to ensure the reliability of inference results in the presence of possible memory faults. Traditional methods such as error correction codes (ECC) and Triple Modular Redundancy (TMR) are CNN-oblivious and incur substantial memory overhead and energy cost. This paper introduces in-place zero-space ECC assisted by a new training scheme, weight distribution-oriented training. The new method provides the first known zero-space-cost memory protection for CNNs without compromising the reliability offered by traditional ECC.
Tasks Autonomous Vehicles
Published 2019-10-31
URL https://arxiv.org/abs/1910.14479v1
PDF https://arxiv.org/pdf/1910.14479v1.pdf
PWC https://paperswithcode.com/paper/in-place-zero-space-memory-protection-for-cnn
Repo
Framework

Bregman Proximal Framework for Deep Linear Neural Networks

Title Bregman Proximal Framework for Deep Linear Neural Networks
Authors Mahesh Chandra Mukkamala, Felix Westerkamp, Emanuel Laude, Daniel Cremers, Peter Ochs
Abstract A typical assumption for the analysis of first order optimization methods is the Lipschitz continuity of the gradient of the objective function. However, for many practical applications this assumption is violated, including loss functions in deep learning. To overcome this issue, certain extensions based on generalized proximity measures known as Bregman distances were introduced. This initiated the development of the Bregman proximal gradient (BPG) algorithm and an inertial variant (momentum based) CoCaIn BPG, which however rely on problem dependent Bregman distances. In this paper, we develop Bregman distances for using BPG methods to train Deep Linear Neural Networks. The main implications of our results are strong convergence guarantees for these algorithms. We also propose several strategies for their efficient implementation, for example, closed form updates and a closed form expression for the inertial parameter of CoCaIn BPG. Moreover, the BPG method requires neither diminishing step sizes nor line search, unlike its corresponding Euclidean version. We numerically illustrate the competitiveness of the proposed methods compared to existing state of the art schemes.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.03638v1
PDF https://arxiv.org/pdf/1910.03638v1.pdf
PWC https://paperswithcode.com/paper/bregman-proximal-framework-for-deep-linear
Repo
Framework
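
For context, the generic BPG step that the paper builds on replaces the Euclidean proximal term with a Bregman distance D_h; constructing a suitable kernel h for deep linear networks is the paper's contribution. The standard update (as stated in the BPG literature, not the paper's specific kernel) reads:

```latex
x^{k+1} \in \operatorname*{arg\,min}_{x}\;
  \Big\{ \langle \nabla f(x^{k}),\, x - x^{k} \rangle
        + \tfrac{1}{\tau}\, D_h(x, x^{k}) \Big\},
\qquad
D_h(x, y) = h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle .
```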

Autonomous Vehicles Meet the Physical World: RSS, Variability, Uncertainty, and Proving Safety (Expanded Version)

Title Autonomous Vehicles Meet the Physical World: RSS, Variability, Uncertainty, and Proving Safety (Expanded Version)
Authors Philip Koopman, Beth Osyk, Jack Weast
Abstract The Responsibility-Sensitive Safety (RSS) model offers provable safety for vehicle behaviors such as minimum safe following distance. However, handling worst-case variability and uncertainty may significantly lower vehicle permissiveness, and in some situations safety cannot be guaranteed. Digging deeper into Newtonian mechanics, we identify complications that result from considering vehicle status, road geometry and environmental parameters. An especially challenging situation occurs if these parameters change during the course of a collision avoidance maneuver such as hard braking. As part of our analysis, we expand the original RSS following distance equation to account for edge cases involving potential collisions mid-way through a braking process. We additionally propose a Micro-Operational Design Domain (µODD) approach to subdividing the operational space as a way of improving permissiveness. Confining probabilistic aspects of safety to µODD transitions permits proving safety (when possible) under the assumption that the system has transitioned to the correct µODD for the situation. Each µODD can additionally be used to encode system fault responses, take credit for advisory information (e.g., from vehicle-to-vehicle communication), and anticipate likely emergent situations.
Tasks Autonomous Vehicles
Published 2019-10-31
URL https://arxiv.org/abs/1911.01207v1
PDF https://arxiv.org/pdf/1911.01207v1.pdf
PWC https://paperswithcode.com/paper/autonomous-vehicles-meet-the-physical-world
Repo
Framework
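
For reference, the original RSS longitudinal safe-following-distance equation that the authors expand is computed in the hedged sketch below. The parameter values are illustrative only, and the paper's extended equation for collisions occurring mid-braking is not reproduced here.

```python
# Original RSS minimum safe following distance (the baseline the paper extends).
# Parameter values below are illustrative, not from the paper.

def rss_min_following_distance(v_rear, v_front, rho,
                               a_max_accel, b_min_rear, b_max_front):
    """Worst case: the rear car accelerates for the response time rho, then
    brakes at only b_min_rear, while the front car brakes as hard as b_max_front."""
    v_rear_after = v_rear + rho * a_max_accel
    d = (v_rear * rho
         + 0.5 * a_max_accel * rho ** 2
         + v_rear_after ** 2 / (2.0 * b_min_rear)
         - v_front ** 2 / (2.0 * b_max_front))
    return max(0.0, d)

# Example: both cars at 30 m/s (~108 km/h), 0.5 s response time.
print(rss_min_following_distance(v_rear=30.0, v_front=30.0, rho=0.5,
                                 a_max_accel=3.0, b_min_rear=4.0, b_max_front=8.0))
```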

Model-based Reinforcement Learning for Predictions and Control for Limit Order Books

Title Model-based Reinforcement Learning for Predictions and Control for Limit Order Books
Authors Haoran Wei, Yuanbo Wang, Lidia Mangu, Keith Decker
Abstract We build a profitable electronic trading agent with Reinforcement Learning that places buy and sell orders in the stock market. An environment model is built only with historical observational data, and the RL agent learns the trading policy by interacting with the environment model instead of with the real market to minimize the risk and potential monetary loss. Trained in an unsupervised and self-supervised fashion, our environment model learned a temporal and causal representation of the market in latent space through deep neural networks. We demonstrate that the trading policy trained entirely within the environment model can be transferred back into the real market and maintain its profitability. We believe that this environment model can serve as a robust simulator that predicts market movement as well as trade impact for further studies.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.03743v1
PDF https://arxiv.org/pdf/1910.03743v1.pdf
PWC https://paperswithcode.com/paper/model-based-reinforcement-learning-for-1
Repo
Framework

Automatic Testing and Falsification with Dynamically Constrained Reinforcement Learning

Title Automatic Testing and Falsification with Dynamically Constrained Reinforcement Learning
Authors Xin Qin, Nikos Aréchiga, Andrew Best, Jyotirmoy Deshmukh
Abstract We consider the problem of using reinforcement learning to train adversarial agents for automatic testing and falsification of cyberphysical systems, such as autonomous vehicles, robots, and airplanes. In order to produce useful agents, however, it is useful to be able to control the degree of adversariality by specifying rules that an agent must follow. For example, when testing an autonomous vehicle, it is useful to find maximally antagonistic traffic participants that obey traffic rules. We model dynamic constraints as hierarchically ordered rules expressed in Signal Temporal Logic, and show how these can be incorporated into an agent training process. We prove that our agent-centric approach is able to find all dangerous behaviors that can be found by traditional falsification techniques while producing modular and reusable agents. We demonstrate our approach on two case studies from the automotive domain.
Tasks Autonomous Vehicles
Published 2019-10-30
URL https://arxiv.org/abs/1910.13645v2
PDF https://arxiv.org/pdf/1910.13645v2.pdf
PWC https://paperswithcode.com/paper/automatic-testing-and-falsification-with
Repo
Framework
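
STL comes with a quantitative robustness semantics: "always" takes a minimum over time and "eventually" a maximum, so a rule such as "always keep speed below the limit" has positive robustness exactly when it holds with margin. The numpy sketch below evaluates two such single-predicate rules on a sampled trace; it shows standard STL robustness only, not the paper's hierarchically ordered rule machinery.

```python
# Quantitative robustness of simple STL rules on a discrete trace:
# positive value = rule satisfied with that much margin, negative = violated.
# Standard STL semantics for single predicates; the paper layers hierarchically
# ordered rules on top of this.
import numpy as np

def robustness_always_leq(signal: np.ndarray, limit: float) -> float:
    return float(np.min(limit - signal))       # G(p): min over time of p's margin

def robustness_eventually_geq(signal: np.ndarray, threshold: float) -> float:
    return float(np.max(signal - threshold))   # F(p): max over time of p's margin

speed_trace = np.array([12.0, 13.5, 14.9, 15.2, 14.1])   # hypothetical m/s samples
print("G(speed <= 15):", robustness_always_leq(speed_trace, 15.0))   # negative: violated
print("F(speed >= 14):", robustness_eventually_geq(speed_trace, 14.0))
```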

A Distributed Model-Free Algorithm for Multi-hop Ride-sharing using Deep Reinforcement Learning

Title A Distributed Model-Free Algorithm for Multi-hop Ride-sharing using Deep Reinforcement Learning
Authors Ashutosh Singh, Abubakr Alabbasi, Vaneet Aggarwal
Abstract The growth of autonomous vehicles, ridesharing systems, and self-driving technology will bring a shift in the way ride hailing platforms plan out their services. However, these advances in technology, coupled with road congestion, environmental concerns, fuel usage, vehicle emissions, and the high cost of vehicle usage, have brought more attention to better utilizing vehicles and their capacities. In this paper, we propose a novel multi-hop ride-sharing (MHRS) algorithm that uses deep reinforcement learning to learn optimal vehicle dispatch and matching decisions by interacting with the external environment. By allowing customers to transfer between vehicles, i.e., ride with one vehicle for some time and then transfer to another one, MHRS helps in attaining 30% lower cost and 20% more efficient utilization of fleets, as compared to ride-sharing algorithms without multi-hop transfers. This flexibility of the multi-hop feature gives a seamless experience to customers and ride-sharing companies, and thus improves ride-sharing services.
Tasks Autonomous Vehicles
Published 2019-10-30
URL https://arxiv.org/abs/1910.14002v1
PDF https://arxiv.org/pdf/1910.14002v1.pdf
PWC https://paperswithcode.com/paper/a-distributed-model-free-algorithm-for-multi
Repo
Framework

On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix

Title On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix
Authors Alia Abbara, Antoine Baker, Florent Krzakala, Lenka Zdeborová
Abstract In a noiseless linear estimation problem, one aims to reconstruct a vector x* from the knowledge of its linear projections y = Phi x*. There have been many theoretical works concentrating on the case where the matrix Phi is a random i.i.d. one, but heuristic evidence suggests that many of these results are universal and extend well beyond this restricted case. Here we revisit this problem through the prism of message passing methods, and consider not only the universality of the l1 transition, as previously addressed, but also that of the optimal Bayesian reconstruction. We observe that the universality extends to the Bayes-optimal minimum mean-squared error (MMSE), and to a range of structured matrices.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04735v1
PDF https://arxiv.org/pdf/1906.04735v1.pdf
PWC https://paperswithcode.com/paper/on-the-universality-of-noiseless-linear
Repo
Framework
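
The l1 transition mentioned here concerns recovering x* by basis pursuit, i.e., minimizing ||x||_1 subject to Phi x = y. The scipy sketch below solves that linear program for a random i.i.d. Phi; substituting a structured Phi is how one could probe the universality question empirically. The problem sizes and sparsity level are arbitrary choices.

```python
# Basis pursuit (min ||x||_1 s.t. Phi x = y) via the standard LP reformulation
# with auxiliary variables t >= |x|. Random i.i.d. Phi here; a structured Phi
# could be substituted to probe universality empirically.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 60, 30, 5                      # signal dim, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
Phi = rng.normal(size=(m, n)) / np.sqrt(m)
y = Phi @ x_true

# Variables z = [x, t]; minimize sum(t) with  x - t <= 0,  -x - t <= 0,  Phi x = y.
c = np.concatenate([np.zeros(n), np.ones(n)])
I = np.eye(n)
A_ub = np.block([[I, -I], [-I, -I]])
b_ub = np.zeros(2 * n)
A_eq = np.hstack([Phi, np.zeros((m, n))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * n + [(0, None)] * n)
x_hat = res.x[:n]
print("recovery error:", np.linalg.norm(x_hat - x_true))
```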

A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction

Title A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction
Authors Fan Zhou, Guojing Cong
Abstract Reducing communication when training large-scale machine learning applications on distributed platforms is still a big challenge. To address this issue, we propose a distributed hierarchical averaging stochastic gradient descent (Hier-AVG) algorithm with infrequent global reduction by introducing local reduction. As a general type of parallel SGD, Hier-AVG can reproduce several popular synchronous parallel SGD variants by adjusting its parameters. We show that Hier-AVG with infrequent global reduction can still achieve the standard convergence rate for non-convex optimization problems. In addition, we show that more frequent local averaging with more participants involved can lead to faster training convergence. By comparing Hier-AVG with another popular distributed training algorithm, K-AVG, we show that by deploying local averaging with fewer global reductions, Hier-AVG can still achieve comparable training speed while frequently obtaining better test accuracy. This indicates that local averaging can serve as an alternative remedy to effectively reduce communication overhead when the number of learners is large. Experimental results for Hier-AVG with several state-of-the-art deep neural nets on CIFAR-10 and ImageNet-1K are presented to validate our analysis and show its superiority.
Tasks
Published 2019-03-12
URL https://arxiv.org/abs/1903.05133v2
PDF https://arxiv.org/pdf/1903.05133v2.pdf
PWC https://paperswithcode.com/paper/a-distributed-hierarchical-sgd-algorithm-with
Repo
Framework
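
To make the local/global reduction structure concrete, the toy numpy sketch below simulates workers running noisy SGD on their own quadratics, averaging within small groups every few steps and across all groups much less often. It is only an illustration of hierarchical averaging under invented parameters, not the Hier-AVG algorithm or its analysis.

```python
# Toy hierarchical averaging: frequent local (per-group) averaging, infrequent
# global reduction. Workers minimize their own quadratics 0.5*(x - c_i)^2 with
# noisy gradients; the consensus point should drift toward the mean of the c_i.
# Illustrative only; not the paper's Hier-AVG algorithm.
import numpy as np

rng = np.random.default_rng(0)
GROUPS, PER_GROUP, LR = 4, 4, 0.05
LOCAL_PERIOD, GLOBAL_PERIOD = 5, 4        # steps per local avg, local avgs per global avg

centers = rng.normal(size=(GROUPS, PER_GROUP))   # each worker's optimum c_i
x = np.zeros((GROUPS, PER_GROUP))                # each worker's (scalar) parameter

local_avgs_done = 0
for step in range(1, 401):
    grad = (x - centers) + 0.1 * rng.normal(size=x.shape)   # noisy gradient
    x -= LR * grad
    if step % LOCAL_PERIOD == 0:
        x[:] = x.mean(axis=1, keepdims=True)     # local reduction within each group
        local_avgs_done += 1
        if local_avgs_done % GLOBAL_PERIOD == 0:
            x[:] = x.mean()                      # sparse global reduction

print("consensus:", round(float(x.mean()), 3), "target:", round(float(centers.mean()), 3))
```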