Paper Group ANR 380
Detecting Driveable Area for Autonomous Vehicles. Hierarchical Meta Learning. Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones. Policy Learning for Malaria Control. Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem. Deep Learning Training with Simulated Approximate Mult …
Detecting Driveable Area for Autonomous Vehicles
Title | Detecting Driveable Area for Autonomous Vehicles |
Authors | Niral Shah, Ashwin Shankar, Jae-hong Park |
Abstract | Autonomous driving is a challenging problem that is currently the focus of intense research and development. Human drivers are forced to make thousands of complex decisions in a short amount of time, quickly processing their surroundings and moving factors. One of these aspects, recognizing which regions of the road are driveable, is vital to the success of any autonomous system. This problem can be addressed with deep learning, framed as a region proposal problem. Utilizing a Mask R-CNN trained on the Berkeley Deep Drive (BDD100k) dataset, we aim to see whether recognizing driveable areas, while also differentiating between the car’s direct (current) lane and alternative lanes, is feasible. |
Tasks | Autonomous Driving, Autonomous Vehicles |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.02740v1 |
https://arxiv.org/pdf/1911.02740v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-driveable-area-for-autonomous |
Repo | |
Framework | |
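As a rough illustration of the region-proposal framing above, the sketch below builds a Mask R-CNN with a three-class head (background, direct lane, alternative lane) using recent torchvision. The class layout, image resolution, and COCO-pretrained backbone are assumptions for illustration, not the authors' BDD100k-trained model, and the replaced heads would still need fine-tuning.

```python
# Hypothetical sketch: Mask R-CNN with a 3-class head (background,
# direct/current lane, alternative lane) in the spirit of the abstract.
# This is NOT the authors' model; the backbone is COCO-pretrained and the
# new heads are untrained until fine-tuned on BDD100k.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 3  # background + direct lane + alternative lane (assumed)

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads so the network predicts the lane classes.
in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, NUM_CLASSES)
in_feats_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feats_mask, 256, NUM_CLASSES)

model.eval()
with torch.no_grad():
    dummy = [torch.rand(3, 720, 1280)]    # one RGB frame at BDD100k resolution
    out = model(dummy)[0]                 # dict: boxes, labels, scores, masks
    keep = out["scores"] > 0.5
    drivable_masks = out["masks"][keep]   # per-region soft masks in [0, 1]
```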
Hierarchical Meta Learning
Title | Hierarchical Meta Learning |
Authors | Yingtian Zou, Jiashi Feng |
Abstract | Meta learning is a promising solution to few-shot learning problems. However, existing meta learning methods are restricted to scenarios where training and application tasks share the same output structure. To obtain a meta model applicable to tasks with new structures, it is required to collect new training data and repeat the time-consuming meta training procedure. This makes them inefficient or even inapplicable for learning to solve heterogeneous few-shot learning tasks. We thus develop a novel and principled Hierarchical Meta Learning (HML) method. Different from existing methods that only focus on optimizing the adaptability of a meta model to similar tasks, HML also explicitly optimizes its generalizability across heterogeneous tasks. To this end, HML first factorizes a set of similar training tasks into heterogeneous ones and trains the meta model over them at two levels to maximize adaptation and generalization performance respectively. The resultant model can then directly generalize to new tasks. Extensive experiments on few-shot classification and regression problems clearly demonstrate the superiority of HML over fine-tuning and state-of-the-art meta learning approaches in terms of generalization across heterogeneous tasks. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09081v1 |
http://arxiv.org/pdf/1904.09081v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-meta-learning |
Repo | |
Framework | |
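HML's heterogeneous task factorization and two-level objective are not spelled out in the abstract, so the sketch below only shows the generic bi-level (inner adaptation / outer generalization) meta-learning loop that such methods build on, applied to toy sinusoid regression with PyTorch ≥ 2.0; it is background, not an implementation of HML.

```python
# A minimal MAML-style bi-level loop on toy sinusoid regression (PyTorch >= 2.0).
# Generic background for the inner-adaptation / outer-generalization structure
# that HML extends; it does NOT implement HML's heterogeneous task factorization.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))
names = [n for n, _ in net.named_parameters()]
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inner_lr, inner_steps, tasks_per_batch = 0.01, 1, 4

def sample_task():
    # Each task is a sinusoid with its own amplitude and phase.
    amp, phase = torch.rand(1) * 4 + 1, torch.rand(1) * 3.14
    def draw(n=10):
        x = torch.rand(n, 1) * 10 - 5
        return x, amp * torch.sin(x + phase)
    return draw

def forward_with(params, x):
    # Run the network with an explicit parameter list (for fast weights).
    return torch.func.functional_call(net, dict(zip(names, params)), (x,))

for step in range(1000):
    meta_opt.zero_grad()
    for _ in range(tasks_per_batch):
        draw = sample_task()
        xs, ys = draw()                 # support set (adaptation)
        xq, yq = draw()                 # query set (generalization)
        fast = list(net.parameters())
        for _ in range(inner_steps):    # inner loop: adapt to this task
            loss = ((forward_with(fast, xs) - ys) ** 2).mean()
            grads = torch.autograd.grad(loss, fast, create_graph=True)
            fast = [p - inner_lr * g for p, g in zip(fast, grads)]
        # Outer loop: query loss of the adapted model drives the meta-update.
        (((forward_with(fast, xq) - yq) ** 2).mean() / tasks_per_batch).backward()
    meta_opt.step()
```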
Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones
Title | Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones |
Authors | Xinwei Chen, Marlin W. Ulmer, Barrett W. Thomas |
Abstract | In this paper, we consider same-day delivery with a heterogeneous fleet of vehicles and drones. Customers make delivery requests over the course of the day, and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadline. Vehicles can deliver multiple packages in one route but travel relatively slowly due to urban traffic. Drones travel faster, but they have limited capacity and require charging or battery swaps. To exploit the different strengths of the two fleets, we propose a deep Q-learning approach. Our method learns the value of assigning a new customer to either drones or vehicles, as well as the option to not offer service at all. To aid feature selection, we present an analysis that demonstrates the role different types of information play in the value function and in decision making. In a systematic computational analysis, we show the superiority of our policy compared to benchmark policies and the effectiveness of our deep Q-learning approach. |
Tasks | Decision Making, Feature Selection, Q-Learning |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11901v1 |
https://arxiv.org/pdf/1910.11901v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-q-learning-for-same-day-delivery-with-a |
Repo | |
Framework | |
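The core decision the abstract describes, scoring whether a new request should go to a vehicle, a drone, or be rejected, can be sketched as a small Q-network over those three options. The feature layout, reward, and the absence of a replay buffer below are simplifying assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the per-request assignment decision: a Q-network
# scores three options (assign to a vehicle, assign to a drone, do not serve).
# Feature layout, reward, and training details are assumptions for illustration.
import random
import torch
import torch.nn as nn

N_FEATURES = 8      # e.g. request location/deadline + fleet availability (assumed)
ACTIONS = ["vehicle", "drone", "no_service"]

q_net = nn.Sequential(nn.Linear(N_FEATURES, 64), nn.ReLU(), nn.Linear(64, len(ACTIONS)))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, eps = 0.99, 0.1

def act(state):
    # Epsilon-greedy assignment for an incoming delivery request.
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        return int(q_net(state).argmax())

def td_update(state, action, reward, next_state, done):
    # One-step temporal-difference update (no replay buffer, for brevity).
    with torch.no_grad():
        target = reward + (0.0 if done else gamma * q_net(next_state).max().item())
    loss = (q_net(state)[action] - target) ** 2
    opt.zero_grad(); loss.backward(); opt.step()

# Example call with a random feature vector standing in for a real request state.
s = torch.rand(N_FEATURES)
a = act(s)
td_update(s, a, reward=1.0, next_state=torch.rand(N_FEATURES), done=False)
```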
Policy Learning for Malaria Control
Title | Policy Learning for Malaria Control |
Authors | Van Bach Nguyen, Belaid Mohamed Karim, Bao Long Vu, Jörg Schlötterer, Michael Granitzer |
Abstract | Sequential decision making is a typical problem in reinforcement learning, with plenty of algorithms to solve it. However, only a few of them can work effectively with a very small number of observations. In this report, we describe our progress in learning a policy for Malaria Control, framed as a reinforcement learning problem in the KDD Cup Challenge 2019, and propose diverse solutions to deal with the limited-observations problem. We apply a genetic algorithm, Bayesian optimization, and Q-learning with sequence breaking to find the optimal policy for five years in a row with only 20 episodes/100 evaluations. We evaluate those algorithms and compare their performance with Random Search as a baseline. Among these algorithms, Q-learning with sequence breaking was submitted to the challenge and ranked 7th in the KDD Cup. |
Tasks | Decision Making, Q-Learning |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.08926v1 |
https://arxiv.org/pdf/1910.08926v1.pdf | |
PWC | https://paperswithcode.com/paper/policy-learning-for-malaria-control |
Repo | |
Framework | |
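To make the limited-observation setting concrete, the toy sketch below compares random search with a bandit-style tabular Q-learning scheme on a hypothetical 5-step (5-year) surrogate environment, using roughly the challenge's budget of 20 episodes; the real KDD Cup simulator, its action space, and the paper's sequence-breaking trick are not reproduced.

```python
# Toy illustration: random search vs. tabular (bandit-style) Q-learning on a
# hypothetical 5-year environment with a small discretized action set.
# The real KDD Cup malaria simulator and the "sequence breaking" heuristic
# from the report are NOT reproduced here.
import numpy as np

rng = np.random.default_rng(0)
N_YEARS, N_ACTIONS, EPISODES = 5, 5, 20

def simulate(policy):
    # Stand-in reward model: an unknown preferred action per year, plus noise.
    preferred = np.array([1, 3, 0, 2, 4])
    return sum(1.0 - 0.2 * abs(int(a) - preferred[t]) + rng.normal(0, 0.05)
               for t, a in enumerate(policy))

# Random search baseline: sample whole 5-year policies, keep the best.
best_pol, best_reward = None, -np.inf
for _ in range(EPISODES):
    pol = rng.integers(N_ACTIONS, size=N_YEARS)
    r = simulate(pol)
    if r > best_reward:
        best_pol, best_reward = pol, r

# Tabular Q-learning with the year index as the state (one probe per year,
# so 5 evaluations per episode, ~100 evaluations over 20 episodes).
Q = np.zeros((N_YEARS, N_ACTIONS))
alpha, eps = 0.5, 0.3
for _ in range(EPISODES):
    for t in range(N_YEARS):
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(Q[t].argmax())
        probe = [a if i == t else int(Q[i].argmax()) for i in range(N_YEARS)]
        Q[t, a] += alpha * (simulate(probe) / N_YEARS - Q[t, a])

print("random search:", best_pol, best_reward)
print("q-learning   :", Q.argmax(axis=1), simulate(Q.argmax(axis=1)))
```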
Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem
Title | Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem |
Authors | John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye |
Abstract | Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace. Hand-crafting heuristic solutions that account for the dynamics in these resource allocation problems is difficult, and may be better handled by an end-to-end machine learning method. Previous works have applied machine learning methods to the problem from a high-level perspective, where the learning method is responsible for either repositioning the drivers or dispatching orders, and as a further simplification, the drivers are considered independent agents maximizing their own reward functions. In this paper we present a deep reinforcement learning approach for tackling the full fleet management and dispatching problems. In addition to treating the drivers as individual agents, we consider the problem from a system-centric perspective, where a central fleet management agent is responsible for decision-making for all drivers. |
Tasks | Decision Making |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11260v1 |
https://arxiv.org/pdf/1911.11260v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-multi-driver |
Repo | |
Framework | |
Deep Learning Training with Simulated Approximate Multipliers
Title | Deep Learning Training with Simulated Approximate Multipliers |
Authors | Issam Hammad, Kamal El-Sankary, Jason Gu |
Abstract | This paper presents by simulation how approximate multipliers can be utilized to enhance the training performance of convolutional neural networks (CNNs). Approximate multipliers have significantly better performance in terms of speed, power, and area compared to exact multipliers. However, approximate multipliers introduce an inaccuracy, which is defined in terms of the Mean Relative Error (MRE). To assess the applicability of approximate multipliers in enhancing CNN training performance, a simulation of the impact of approximate multiplier error on CNN training is presented. The paper demonstrates that using approximate multipliers for CNN training can significantly enhance the performance in terms of speed, power, and area at the cost of a small negative impact on the achieved accuracy. Additionally, the paper proposes a hybrid training method which mitigates this negative impact on the accuracy. Using the proposed hybrid method, training starts with approximate multipliers and then switches to exact multipliers for the last few epochs. Using this method, the performance benefits of approximate multipliers in terms of speed, power, and area can be attained for a large portion of the training stage, while the negative impact on accuracy is diminished by using exact multipliers for the last epochs of training. |
Tasks | |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/2001.00060v1 |
https://arxiv.org/pdf/2001.00060v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-training-with-simulated |
Repo | |
Framework | |
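A common way to study this by simulation is to perturb each elementwise product with a random relative error matching a target MRE, and to switch back to exact products for the final epochs. The sketch below does this on a plain matrix multiply inside a tiny linear-regression loop; the specific approximate multiplier circuits and the CNN setup from the paper are not modeled.

```python
# Sketch: simulate an approximate multiplier with a given Mean Relative Error
# (MRE) by perturbing elementwise products, and switch to exact multiplication
# for the final epochs ("hybrid" training). The multiplier designs studied in
# the paper are not modeled; this is only the simulation idea.
import numpy as np

rng = np.random.default_rng(0)

def approx_matmul(a, b, mre=0.03):
    # Perturb each elementwise product by a random relative error whose mean
    # magnitude approximates the target MRE (half-normal scaling).
    prod = a[:, :, None] * b[None, :, :]            # (m, k, n) exact products
    err = rng.normal(0.0, mre * np.sqrt(np.pi / 2), size=prod.shape)
    return (prod * (1.0 + err)).sum(axis=1)

def matmul(a, b, epoch, total_epochs, exact_last=5, mre=0.03):
    # Hybrid schedule: approximate for most epochs, exact for the last few.
    if epoch >= total_epochs - exact_last:
        return a @ b
    return approx_matmul(a, b, mre)

# Tiny linear-regression training loop using the hybrid multiply.
X, w_true = rng.normal(size=(256, 10)), rng.normal(size=(10, 1))
y = X @ w_true
w, lr, total_epochs = np.zeros((10, 1)), 0.1, 50
for epoch in range(total_epochs):
    pred = matmul(X, w, epoch, total_epochs)
    grad = matmul(X.T, pred - y, epoch, total_epochs) / len(X)
    w -= lr * grad
print("final parameter error:", float(np.linalg.norm(w - w_true)))
```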
Evaluation of Dataflow through layers of Deep Neural Networks in Classification and Regression Problems
Title | Evaluation of Dataflow through layers of Deep Neural Networks in Classification and Regression Problems |
Authors | Ahmad Kalhor, Mohsen Saffar, Melika Kheirieh, Somayyeh Hoseinipoor, Babak N. Araabi |
Abstract | This paper introduces two straightforward, effective indices to evaluate the input data and the data flowing through the layers of a feedforward deep neural network. For classification problems, the separation rate of target labels in the space of dataflow is explained as a key factor indicating the performance of designed layers in improving the generalization of the network. According to this concept, a shapeless distance-based evaluation index is proposed. Similarly, for regression problems, the smoothness rate of target outputs in the space of dataflow is explained as a key factor indicating the performance of designed layers in improving the generalization of the network. According to the explained smoothness concept, a shapeless distance-based smoothness index is proposed for regression problems. To treat the concepts of separation and smoothness more strictly, their extended versions are introduced, and by interpreting a regression problem as a classification problem, it is shown that the separation and smoothness indices are related to each other. Through four case studies, the benefits of using the introduced indices are shown. In the first case study, for classification and regression problems, the difficulty of some well-known input datasets is compared using the proposed separation and smoothness indices, respectively. In the second case study, the quality of dataflow is evaluated through the layers of two pre-trained VGG16 networks in the classification of CIFAR-10 and CIFAR-100. In the third case study, it is shown that the correct classification rate and the separation index are almost equivalent through the layers, particularly as the separation index increases. In the last case study, two multi-layer neural networks, which are designed for the prediction of Boston housing prices, are compared layer by layer using the proposed smoothness index. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05156v1 |
https://arxiv.org/pdf/1906.05156v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluation-of-dataflow-through-layers-of-deep |
Repo | |
Framework | |
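One plausible, simplified instantiation of the two indices, a nearest-neighbor label-agreement score for separation and a nearest-neighbor output-variation score for smoothness, is sketched below; the authors' exact shapeless distance-based definitions may differ.

```python
# Simplified, assumed instantiations of a distance-based separation index
# (classification) and smoothness index (regression) for features flowing
# out of a layer. The paper's exact formulas may differ.
import numpy as np

def separation_index(features, labels):
    # Fraction of samples whose nearest neighbor (in feature space) shares
    # their class label: 1.0 means perfectly separated neighborhoods.
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)
    return float((labels[nn] == labels).mean())

def smoothness_index(features, targets):
    # Average absolute target difference to the nearest feature-space
    # neighbor, normalized by the target spread: lower means smoother.
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)
    return float(np.abs(targets[nn] - targets).mean() / (targets.std() + 1e-12))

# Example: features after a hypothetical layer, with 3 synthetic classes.
rng = np.random.default_rng(0)
means = rng.normal(size=(3, 16)) * 3.0
labels = np.repeat(np.arange(3), 100)
feats = means[labels] + rng.normal(size=(300, 16))
print("separation index:", separation_index(feats, labels))
print("smoothness index:", smoothness_index(feats, labels.astype(float)))
```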
In-Place Zero-Space Memory Protection for CNN
Title | In-Place Zero-Space Memory Protection for CNN |
Authors | Hui Guan, Lin Ning, Zhen Lin, Xipeng Shen, Huiyang Zhou, Seung-Hwan Lim |
Abstract | Convolutional Neural Networks (CNNs) are being actively explored for safety-critical applications such as autonomous vehicles and aerospace, where it is essential to ensure the reliability of inference results in the presence of possible memory faults. Traditional methods such as error correction codes (ECC) and Triple Modular Redundancy (TMR) are CNN-oblivious and incur substantial memory overhead and energy cost. This paper introduces in-place zero-space ECC, assisted by a new training scheme, weight-distribution-oriented training. The new method provides the first known zero-space-cost memory protection for CNNs without compromising the reliability offered by traditional ECC. |
Tasks | Autonomous Vehicles |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14479v1 |
https://arxiv.org/pdf/1910.14479v1.pdf | |
PWC | https://paperswithcode.com/paper/in-place-zero-space-memory-protection-for-cnn |
Repo | |
Framework | |
Bregman Proximal Framework for Deep Linear Neural Networks
Title | Bregman Proximal Framework for Deep Linear Neural Networks |
Authors | Mahesh Chandra Mukkamala, Felix Westerkamp, Emanuel Laude, Daniel Cremers, Peter Ochs |
Abstract | A typical assumption for the analysis of first-order optimization methods is the Lipschitz continuity of the gradient of the objective function. However, for many practical applications this assumption is violated, including for loss functions in deep learning. To overcome this issue, certain extensions based on generalized proximity measures known as Bregman distances were introduced. This initiated the development of the Bregman proximal gradient (BPG) algorithm and an inertial (momentum-based) variant, CoCaIn BPG, which however rely on problem-dependent Bregman distances. In this paper, we develop Bregman distances for using BPG methods to train deep linear neural networks. The main implications of our results are strong convergence guarantees for these algorithms. We also propose several strategies for their efficient implementation, for example, closed-form updates and a closed-form expression for the inertial parameter of CoCaIn BPG. Moreover, the BPG method requires neither diminishing step sizes nor line search, unlike its corresponding Euclidean version. We numerically illustrate the competitiveness of the proposed methods compared to existing state-of-the-art schemes. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03638v1 |
https://arxiv.org/pdf/1910.03638v1.pdf | |
PWC | https://paperswithcode.com/paper/bregman-proximal-framework-for-deep-linear |
Repo | |
Framework | |
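The BPG update replaces the Euclidean proximal step with one based on a Bregman kernel h, solving ∇h(x⁺) = ∇h(x) − τ∇f(x). The minimal 1-D sketch below uses the standard quartic kernel h(x) = x⁴/4 + x²/2 for a quartic objective (with τ ≤ 1/L for relative-smoothness constant L = 1); it illustrates the generic BPG step, not the kernels this paper derives for deep linear networks.

```python
# Minimal 1-D Bregman proximal gradient (BPG) sketch for a quartic objective
# f(x) = (x^2 - 1)^2 / 4 with the standard kernel h(x) = x^4/4 + x^2/2,
# for which f is smooth relative to h with constant L = 1. The Bregman
# kernels derived in the paper for deep linear networks are not reproduced.
from scipy.optimize import brentq

def grad_f(x):      # gradient of f(x) = (x^2 - 1)^2 / 4
    return x**3 - x

def grad_h(x):      # gradient of h(x) = x^4/4 + x^2/2 (strictly increasing)
    return x**3 + x

def bpg_step(x, tau):
    # Mirror update: solve grad_h(y) = grad_h(x) - tau * grad_f(x) for y.
    target = grad_h(x) - tau * grad_f(x)
    hi = abs(target) + 1.0                       # bracket containing the root
    return brentq(lambda y: grad_h(y) - target, -hi, hi)

x, tau = 3.0, 0.5                                # tau <= 1/L with L = 1
for _ in range(50):
    x = bpg_step(x, tau)
print("approximate minimizer:", x)               # converges toward x = 1
```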
Autonomous Vehicles Meet the Physical World: RSS, Variability, Uncertainty, and Proving Safety (Expanded Version)
Title | Autonomous Vehicles Meet the Physical World: RSS, Variability, Uncertainty, and Proving Safety (Expanded Version) |
Authors | Philip Koopman, Beth Osyk, Jack Weast |
Abstract | The Responsibility-Sensitive Safety (RSS) model offers provable safety for vehicle behaviors such as minimum safe following distance. However, handling worst-case variability and uncertainty may significantly lower vehicle permissiveness, and in some situations safety cannot be guaranteed. Digging deeper into Newtonian mechanics, we identify complications that result from considering vehicle status, road geometry and environmental parameters. An especially challenging situation occurs if these parameters change during the course of a collision avoidance maneuver such as hard braking. As part of our analysis, we expand the original RSS following distance equation to account for edge cases involving potential collisions mid-way through a braking process. We additionally propose a Micro-Operational Design Domain (μODD) approach to subdividing the operational space as a way of improving permissiveness. Confining probabilistic aspects of safety to μODD transitions permits proving safety (when possible) under the assumption that the system has transitioned to the correct μODD for the situation. Each μODD can additionally be used to encode system fault responses, take credit for advisory information (e.g., from vehicle-to-vehicle communication), and anticipate likely emergent situations. |
Tasks | Autonomous Vehicles |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1911.01207v1 |
https://arxiv.org/pdf/1911.01207v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-vehicles-meet-the-physical-world |
Repo | |
Framework | |
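For reference, the baseline RSS minimum safe following distance that the paper extends has a closed form: the rear vehicle accelerates at its worst-case rate for the response time ρ and then brakes at its minimum rate, while the lead vehicle brakes at its maximum rate. A direct implementation of that baseline formula (not the paper's mid-braking edge-case extension or μODD machinery) is below.

```python
# Baseline RSS minimum safe longitudinal following distance: the rear car
# reacts after rho seconds of worst-case acceleration, then brakes at its
# minimum rate b_min; the lead car brakes at its maximum rate b_max.
# The paper's extension for parameters changing mid-braking is not reproduced.
def rss_following_distance(v_rear, v_front, rho, a_max_accel, b_min_brake, b_max_brake):
    """Speeds in m/s, accelerations in m/s^2, rho in s; returns meters."""
    v_rear_after_rho = v_rear + rho * a_max_accel
    d = (v_rear * rho
         + 0.5 * a_max_accel * rho ** 2
         + v_rear_after_rho ** 2 / (2.0 * b_min_brake)
         - v_front ** 2 / (2.0 * b_max_brake))
    return max(d, 0.0)

# Example: both cars at 30 m/s (~108 km/h), 0.5 s response time.
print(rss_following_distance(v_rear=30.0, v_front=30.0, rho=0.5,
                             a_max_accel=2.0, b_min_brake=4.0, b_max_brake=8.0))
```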
Model-based Reinforcement Learning for Predictions and Control for Limit Order Books
Title | Model-based Reinforcement Learning for Predictions and Control for Limit Order Books |
Authors | Haoran Wei, Yuanbo Wang, Lidia Mangu, Keith Decker |
Abstract | We build a profitable electronic trading agent with reinforcement learning that places buy and sell orders in the stock market. An environment model is built only with historical observational data, and the RL agent learns the trading policy by interacting with the environment model instead of with the real market, to minimize risk and potential monetary loss. Trained in an unsupervised and self-supervised fashion, our environment model learns a temporal and causal representation of the market in latent space through deep neural networks. We demonstrate that the trading policy trained entirely within the environment model can be transferred back into the real market and maintain its profitability. We believe that this environment model can serve as a robust simulator that predicts market movement as well as trade impact for further studies. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03743v1 |
https://arxiv.org/pdf/1910.03743v1.pdf | |
PWC | https://paperswithcode.com/paper/model-based-reinforcement-learning-for-1 |
Repo | |
Framework | |
Automatic Testing and Falsification with Dynamically Constrained Reinforcement Learning
Title | Automatic Testing and Falsification with Dynamically Constrained Reinforcement Learning |
Authors | Xin Qin, Nikos Aréchiga, Andrew Best, Jyotirmoy Deshmukh |
Abstract | We consider the problem of using reinforcement learning to train adversarial agents for automatic testing and falsification of cyberphysical systems, such as autonomous vehicles, robots, and airplanes. In order to produce useful agents, however, it is important to be able to control the degree of adversariality by specifying rules that an agent must follow. For example, when testing an autonomous vehicle, it is useful to find maximally antagonistic traffic participants that obey traffic rules. We model dynamic constraints as hierarchically ordered rules expressed in Signal Temporal Logic, and show how these can be incorporated into an agent training process. We prove that our agent-centric approach is able to find all dangerous behaviors that can be found by traditional falsification techniques while producing modular and reusable agents. We demonstrate our approach on two case studies from the automotive domain. |
Tasks | Autonomous Vehicles |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13645v2 |
https://arxiv.org/pdf/1910.13645v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-testing-and-falsification-with |
Repo | |
Framework | |
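A standard way to fold STL rules into agent training is through their quantitative robustness, which is positive exactly when a rule is satisfied. The sketch below computes robustness for two simple rules over a speed/gap trace and hints at how a constrained adversary could be rewarded; the paper's hierarchical rule ordering and full training procedure are not reproduced.

```python
# Tiny Signal Temporal Logic (STL) robustness sketch: quantitative scores for
# "always speed <= limit" and "eventually gap <= threshold" over a discrete
# trace. The hierarchically ordered rules and the RL training loop from the
# paper are not reproduced here.
import numpy as np

def always_leq(signal, threshold):
    # Robustness of G(signal <= threshold): worst-case margin over the trace.
    return float(np.min(threshold - np.asarray(signal)))

def eventually_leq(signal, threshold):
    # Robustness of F(signal <= threshold): best-case margin over the trace.
    return float(np.max(threshold - np.asarray(signal)))

speed = [12.0, 13.5, 14.9, 15.2, 14.0]        # adversarial car's speed (m/s)
gap   = [40.0, 31.0, 22.0, 15.0, 9.0]          # distance to the ego car (m)

rho_rule   = always_leq(speed, 15.0)           # obeys the speed limit?  (< 0 here)
rho_danger = eventually_leq(gap, 10.0)         # gets dangerously close? (> 0 here)
# A constrained adversary could, for example, be rewarded for rho_danger only
# when rho_rule >= 0, encouraging dangerous but rule-abiding behavior.
print(rho_rule, rho_danger)
```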
A Distributed Model-Free Algorithm for Multi-hop Ride-sharing using Deep Reinforcement Learning
Title | A Distributed Model-Free Algorithm for Multi-hop Ride-sharing using Deep Reinforcement Learning |
Authors | Ashutosh Singh, Abubakr Alabbasi, Vaneet Aggarwal |
Abstract | The growth of autonomous vehicles, ride-sharing systems, and self-driving technology will bring a shift in the way ride-hailing platforms plan out their services. However, these advances in technology, coupled with road congestion, environmental concerns, fuel usage, vehicle emissions, and the high cost of vehicle usage, have brought more attention to better utilizing vehicles and their capacities. In this paper, we propose a novel multi-hop ride-sharing (MHRS) algorithm that uses deep reinforcement learning to learn optimal vehicle dispatch and matching decisions by interacting with the external environment. By allowing customers to transfer between vehicles, i.e., ride with one vehicle for some time and then transfer to another, MHRS helps attain 30% lower cost and 20% more efficient utilization of fleets, compared to conventional ride-sharing algorithms. The flexibility of the multi-hop feature gives a seamless experience to customers and ride-sharing companies, and thus improves ride-sharing services. |
Tasks | Autonomous Vehicles |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14002v1 |
https://arxiv.org/pdf/1910.14002v1.pdf | |
PWC | https://paperswithcode.com/paper/a-distributed-model-free-algorithm-for-multi |
Repo | |
Framework | |
On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix
Title | On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix |
Authors | Alia Abbara, Antoine Baker, Florent Krzakala, Lenka Zdeborová |
Abstract | In a noiseless linear estimation problem, one aims to reconstruct a vector x* from the knowledge of its linear projections y = Phi x*. Many theoretical works have concentrated on the case where the matrix Phi is random i.i.d., but a body of heuristic evidence suggests that many of these results are universal and extend well beyond this restricted case. Here we revisit this problem through the prism of message passing methods, and consider not only the universality of the l1 transition, as previously addressed, but also that of the optimal Bayesian reconstruction. We observe that the universality extends to the Bayes-optimal minimum mean-squared error (MMSE), and to a range of structured matrices. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04735v1 |
https://arxiv.org/pdf/1906.04735v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-universality-of-noiseless-linear |
Repo | |
Framework | |
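The universality question can be probed empirically by running noiseless l1 reconstruction (basis pursuit) under an i.i.d. Gaussian Phi and under a structured Phi such as randomly subsampled DCT rows. The sketch below does this with a linear-programming formulation of basis pursuit; it only illustrates the experiment, not the paper's message-passing analysis, and the problem sizes are arbitrary.

```python
# Sketch: compare noiseless l1 reconstruction (basis pursuit) under an i.i.d.
# Gaussian measurement matrix and a structured one (random rows of a DCT).
# This empirically probes the universality question; it does not implement
# the paper's message-passing analysis.
import numpy as np
from scipy.fft import dct
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 100, 50, 10                      # signal dim, measurements, sparsity

x_star = np.zeros(n)
x_star[rng.choice(n, k, replace=False)] = rng.normal(size=k)

def basis_pursuit(Phi, y):
    # min ||x||_1  s.t.  Phi x = y, written as an LP in (x, t) with |x| <= t.
    c = np.concatenate([np.zeros(n), np.ones(n)])
    A_ub = np.block([[np.eye(n), -np.eye(n)], [-np.eye(n), -np.eye(n)]])
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([Phi, np.zeros((Phi.shape[0], n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    assert res.success
    return res.x[:n]

Phi_gauss = rng.normal(size=(m, n)) / np.sqrt(m)
rows = rng.choice(n, m, replace=False)
Phi_dct = dct(np.eye(n), norm="ortho", axis=0)[rows]   # subsampled DCT rows

for name, Phi in [("gaussian", Phi_gauss), ("subsampled DCT", Phi_dct)]:
    x_hat = basis_pursuit(Phi, Phi @ x_star)
    print(name, "recovery error:", np.linalg.norm(x_hat - x_star))
```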
A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction
Title | A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction |
Authors | Fan Zhou, Guojing Cong |
Abstract | Reducing communication when training large-scale machine learning applications on distributed platforms remains a major challenge. To address this issue, we propose a distributed hierarchical averaging stochastic gradient descent (Hier-AVG) algorithm with infrequent global reduction, achieved by introducing local reduction. As a general type of parallel SGD, Hier-AVG can reproduce several popular synchronous parallel SGD variants by adjusting its parameters. We show that Hier-AVG with infrequent global reduction can still achieve the standard convergence rate for non-convex optimization problems. In addition, we show that more frequent local averaging with more participants involved can lead to faster training convergence. By comparing Hier-AVG with another popular distributed training algorithm, K-AVG, we show that by deploying local averaging with fewer global averaging steps, Hier-AVG can still achieve comparable training speed while frequently obtaining better test accuracy. This indicates that local averaging can serve as an alternative remedy to effectively reduce communication overhead when the number of learners is large. Experimental results of Hier-AVG with several state-of-the-art deep neural nets on CIFAR-10 and ImageNet-1K are presented to validate our analysis and show its superiority. |
Tasks | |
Published | 2019-03-12 |
URL | https://arxiv.org/abs/1903.05133v2 |
https://arxiv.org/pdf/1903.05133v2.pdf | |
PWC | https://paperswithcode.com/paper/a-distributed-hierarchical-sgd-algorithm-with |
Repo | |
Framework | |
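The communication pattern behind Hier-AVG, i.e. local SGD steps, frequent averaging within a group (local reduction), and only occasional averaging across groups (global reduction), can be simulated in a few lines. The toy sketch below runs this pattern on a noisy quadratic; the periods, learning rate, and objective are arbitrary choices, and neither the real distributed implementation nor the convergence analysis is reproduced.

```python
# Toy simulation of hierarchical averaging SGD (Hier-AVG-style): workers take
# local SGD steps, average within their group every p1 steps (local reduction),
# and average across all groups every p2 local reductions (global reduction).
# This sketches only the communication pattern, not the paper's system.
import numpy as np

rng = np.random.default_rng(0)
n_groups, workers_per_group, dim = 4, 4, 20
p1, p2, lr, steps = 5, 4, 0.05, 400        # local period, global period

w_true = rng.normal(size=dim)
def stoch_grad(w):
    # Noisy gradient of the quadratic 0.5 * ||w - w_true||^2.
    return (w - w_true) + rng.normal(scale=0.5, size=dim)

# workers[g][i] is the parameter vector of worker i in group g.
workers = [[np.zeros(dim) for _ in range(workers_per_group)] for _ in range(n_groups)]

local_reductions = 0
for t in range(1, steps + 1):
    for group in workers:                      # each worker takes a local step
        for i in range(workers_per_group):
            group[i] = group[i] - lr * stoch_grad(group[i])
    if t % p1 == 0:                            # local reduction within each group
        for group in workers:
            avg = np.mean(group, axis=0)
            for i in range(workers_per_group):
                group[i] = avg.copy()
        local_reductions += 1
        if local_reductions % p2 == 0:         # infrequent global reduction
            global_avg = np.mean([g[0] for g in workers], axis=0)
            for group in workers:
                for i in range(workers_per_group):
                    group[i] = global_avg.copy()

final = np.mean([g[0] for g in workers], axis=0)
print("distance to optimum:", np.linalg.norm(final - w_true))
```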