October 18, 2019

Paper Group ANR 593

Expected Policy Gradients for Reinforcement Learning. Early Start Intention Detection of Cyclists Using Motion History Images and a Deep Residual Network. Adaptive Transfer Learning in Deep Neural Networks: Wind Power Prediction using Knowledge Transfer from Region to Region and Between Different Task Domains. Stochastic Second-order Methods for No …

Expected Policy Gradients for Reinforcement Learning


Title	Expected Policy Gradients for Reinforcement Learning
Authors	Kamil Ciosek, Shimon Whiteson
Abstract	We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG) and deterministic policy gradients (DPG) for reinforcement learning. Inspired by expected sarsa, EPG integrates (or sums) across actions when estimating the gradient, instead of relying only on the action in the sampled trajectory. For continuous action spaces, we first derive a practical result for Gaussian policies and quadric critics and then extend it to an analytical method for the universal case, covering a broad class of actors and critics, including Gaussian, exponential families, and reparameterised policies with bounded support. For Gaussian policies, we show that it is optimal to explore using covariance proportional to the matrix exponential of the scaled Hessian of the critic with respect to the actions. EPG also provides a general framework for reasoning about policy gradient methods, which we use to establish a new general policy gradient theorem, of which the stochastic and deterministic policy gradient theorems are special cases. Furthermore, we prove that EPG reduces the variance of the gradient estimates without requiring deterministic policies and with little computational overhead. Finally, we show that EPG outperforms existing approaches on six challenging domains involving the simulated control of physical systems.
Tasks	Policy Gradient Methods
Published	2018-01-10
URL	http://arxiv.org/abs/1801.03326v1
PDF	http://arxiv.org/pdf/1801.03326v1.pdf
PWC	https://paperswithcode.com/paper/expected-policy-gradients-for-reinforcement
Repo
Framework
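
The idea of summing over actions instead of using only the sampled one can be illustrated for a single state with a discrete softmax policy. This is a minimal sketch, not the authors' implementation: the function names, the softmax parameterisation, and the fixed critic values `q_values` are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def spg_estimate(logits, q_values, action):
    """Single-sample stochastic policy gradient: grad log pi(a) * Q(a)."""
    pi = softmax(logits)
    grad_log_pi = -pi            # new array holding -pi
    grad_log_pi[action] += 1.0   # becomes e_a - pi for a softmax policy
    return grad_log_pi * q_values[action]

def epg_estimate(logits, q_values):
    """Expected policy gradient for one state: sum_a grad pi(a) * Q(a).

    For a softmax policy this sum has the closed form pi * (Q - pi.Q).
    """
    pi = softmax(logits)
    baseline = pi @ q_values
    return pi * (q_values - baseline)

# Hypothetical critic values for three actions in one state.
logits = np.array([0.1, -0.3, 0.7])
q_values = np.array([1.0, 2.0, -0.5])
print(epg_estimate(logits, q_values))
```

Averaging `spg_estimate` over actions drawn from the policy gives the same vector as `epg_estimate`, but the latter has no sampling noise over actions, which is the variance reduction the abstract refers to.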

Early Start Intention Detection of Cyclists Using Motion History Images and a Deep Residual Network


Title	Early Start Intention Detection of Cyclists Using Motion History Images and a Deep Residual Network
Authors	Stefan Zernetsch, Viktor Kress, Bernhard Sick, Konrad Doll
Abstract	In this article, we present a novel approach to detect starting motions of cyclists in real world traffic scenarios based on Motion History Images (MHIs). The method uses a deep Convolutional Neural Network (CNN) with a residual network architecture (ResNet), which is commonly used in image classification and detection tasks. By combining MHIs with a ResNet classifier and performing a frame by frame classification of the MHIs, we are able to detect starting motions in image sequences. The detection is performed using a wide angle stereo camera system at an urban intersection. We compare our algorithm to an existing method to detect movement transitions of pedestrians that uses MHIs in combination with a Histograms of Oriented Gradients (HOG) like descriptor and a Support Vector Machine (SVM), which we adapted to cyclists. To train and evaluate the methods a dataset containing MHIs of 394 cyclist starting motions was created. The results show that both methods can be used to detect starting motions of cyclists. Using the SVM approach, we were able to safely detect starting motions 0.506 s on average after the bicycle starts moving with an F1-score of 97.7%. The ResNet approach achieved an F1-score of 100% at an average detection time of 0.144 s. The ResNet approach outperformed the SVM approach in both robustness against false positive detections and detection time.
Tasks	Image Classification
Published	2018-03-06
URL	http://arxiv.org/abs/1803.02242v1
PDF	http://arxiv.org/pdf/1803.02242v1.pdf
PWC	https://paperswithcode.com/paper/early-start-intention-detection-of-cyclists
Repo
Framework
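
For readers unfamiliar with Motion History Images, the sketch below shows a common textbook update rule: pixels with detected motion are set to a maximum value `tau`, and all other pixels decay by one per frame. The abstract does not state the exact variant the authors use, so the frame-differencing step, the parameter names, and the default values are assumptions.

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=10, diff_threshold=30):
    """Update a Motion History Image with one new grayscale frame.

    mhi            : integer array with the current motion history
    prev_frame     : previous grayscale frame (e.g. uint8)
    frame          : current grayscale frame (same shape)
    tau            : value assigned to pixels where motion is detected
    diff_threshold : absolute intensity change counted as motion
    """
    # Cast to a signed type so the subtraction cannot wrap around.
    diff = np.abs(frame.astype(np.int32) - prev_frame.astype(np.int32))
    motion = diff > diff_threshold
    # Pixels without motion fade out by one step, never below zero.
    decayed = np.maximum(mhi - 1, 0)
    return np.where(motion, tau, decayed)

# Tiny example: one pixel changes, then the scene stays still.
prev = np.zeros((2, 2), dtype=np.uint8)
cur = prev.copy()
cur[0, 0] = 255
mhi = update_mhi(np.zeros((2, 2), dtype=np.int64), prev, cur)
mhi = update_mhi(mhi, cur, cur)
print(mhi)
```

Recent motion appears bright and older motion darker, so a single image encodes a short motion sequence that a frame-by-frame classifier can consume.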

Adaptive Transfer Learning in Deep Neural Networks: Wind Power Prediction using Knowledge Transfer from Region to Region and Between Different Task Domains


Title	Adaptive Transfer Learning in Deep Neural Networks: Wind Power Prediction using Knowledge Transfer from Region to Region and Between Different Task Domains
Authors	Aqsa Saeed Qureshi, Asifullah Khan
Abstract	Transfer Learning (TL) in Deep Neural Networks is gaining importance because, in most applications, the labeling of data is costly and time-consuming. Additionally, TL provides an effective weight initialization strategy for Deep Neural Networks. This paper introduces the idea of Adaptive Transfer Learning in Deep Neural Networks (ATL-DNN) for wind power prediction. Specifically, we show for wind power prediction that an adaptive TL system of Deep Neural Networks can be adaptively modified as far as training on a different wind farm is concerned. The proposed ATL-DNN technique is tested for short-term wind power prediction, where continuously arriving information has to be exploited. Adaptive TL not only helps in providing good weight initialization, but also helps to utilize the incoming data for effective learning. Additionally, the proposed ATL-DNN technique is shown to transfer knowledge between different task domains (wind power to wind speed prediction) and from one region to another. The simulation results show that the proposed ATL-DNN technique achieves average values of 0.0637, 0.0986, and 0.0984 for the Mean-Absolute-Error, Root-Mean-Squared-Error, and Standard-Deviation-Error, respectively.
Tasks	Transfer Learning
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12611v2
PDF	http://arxiv.org/pdf/1810.12611v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-transfer-learning-in-deep-neural
Repo
Framework
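
The three error measures quoted in the abstract can be computed as in the sketch below. The abstract does not define Standard-Deviation-Error, so taking it as the population standard deviation of the prediction errors is an assumption, as are the function name and the sample values.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Return (MAE, RMSE, SDE) for two equally shaped arrays.

    SDE is taken here as the population standard deviation of the
    errors, which is an assumption about the paper's definition.
    """
    errors = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    sde = np.std(errors)
    return mae, rmse, sde

# Made-up values purely for illustration.
mae, rmse, sde = error_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(mae, rmse, sde)
```

Under this definition RMSE² equals SDE² plus the squared mean error, so RMSE and SDE being nearly equal (0.0986 and 0.0984 in the abstract) would indicate a small bias.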

Stochastic Second-order Methods for Non-convex Optimization with Inexact Hessian and Gradient


Title	Stochastic Second-order Methods for Non-convex Optimization with Inexact Hessian and Gradient
Authors	Liu Liu, Xuanqing Liu, Cho-Jui Hsieh, Dacheng Tao
Abstract	Trust region and cubic regularization methods have demonstrated good performance in small-scale non-convex optimization, showing the ability to escape from saddle points. Each iteration of these methods involves computation of the gradient, Hessian and function value in order to obtain the search direction and adjust the radius or cubic regularization parameter. However, exactly computing those quantities is too expensive in large-scale problems such as training deep networks. In this paper, we study a family of stochastic trust region and cubic regularization methods when gradient, Hessian and function values are computed inexactly, and show that the iteration complexity to achieve $\epsilon$-approximate second-order optimality is of the same order as in previous work for which gradient and function values are computed exactly. The mild conditions on inexactness can be achieved in finite-sum minimization using random sampling. We show the algorithm performs well on training convolutional neural networks compared with previous second-order methods.
Tasks
Published	2018-09-26
URL	http://arxiv.org/abs/1809.09853v1
PDF	http://arxiv.org/pdf/1809.09853v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-second-order-methods-for-non
Repo
Framework

Inlining External Sources in Answer Set Programs


Title	Inlining External Sources in Answer Set Programs
Authors	Christoph Redl
Abstract	HEX-programs are an extension of answer set programs (ASP) with external sources. To this end, external atoms provide a bidirectional interface between the program and an external source. The traditional evaluation algorithm for HEX-programs is based on guessing truth values of external atoms and verifying them by explicit calls of the external source. The approach was optimized by techniques that reduce the number of necessary verification calls or speed them up, but the remaining external calls are still expensive. In this paper we present an alternative evaluation approach based on inlining of external atoms, motivated by existing but less general approaches for specialized formalisms such as DL-programs. External atoms are then compiled away such that no verification calls are necessary. The approach is implemented in the dlvhex reasoner. Experiments show a significant performance gain. Besides performance improvements, we further exploit inlining for extending previous (semantic) characterizations of program equivalence from ASP to HEX-programs, including those of strong equivalence, uniform equivalence and $\langle H, B \rangle$-equivalence. Finally, based on these equivalence criteria, we also characterize inconsistency of programs with respect to extensions. Since well-known ASP extensions (such as constraint ASP) are special cases of HEX, the results are interesting beyond the particular formalism. Under consideration in Theory and Practice of Logic Programming (TPLP).
Tasks
Published	2018-08-02
URL	http://arxiv.org/abs/1808.00727v1
PDF	http://arxiv.org/pdf/1808.00727v1.pdf
PWC	https://paperswithcode.com/paper/inlining-external-sources-in-answer-set
Repo
Framework

Defend Deep Neural Networks Against Adversarial Examples via Fixed and Dynamic Quantized Activation Functions


Title	Defend Deep Neural Networks Against Adversarial Examples via Fixed and Dynamic Quantized Activation Functions
Authors	Adnan Siraj Rakin, Jinfeng Yi, Boqing Gong, Deliang Fan
Abstract	Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks. To this end, many defense approaches that attempt to improve the robustness of DNNs have been proposed. In a separate and yet related area, recent works have explored quantizing neural network weights and activation functions to low bit-widths to compress model size and reduce computational complexity. In this work, we find that these two different tracks, namely the pursuit of network compactness and robustness, can be merged into one and give rise to networks with both advantages. To the best of our knowledge, this is the first work that uses quantization of activation functions to defend against adversarial examples. We also propose to train robust neural networks by using adaptive quantization techniques for the activation functions. Our proposed Dynamic Quantized Activation (DQA) is verified through a wide range of experiments with the MNIST and CIFAR-10 datasets under different white-box attack methods, including FGSM, PGD, and C&W attacks. Furthermore, Zeroth Order Optimization and substitute model-based black-box attacks are also considered in this work. The experimental results clearly show that the robustness of DNNs could be greatly improved using the proposed DQA.
Tasks	Quantization
Published	2018-07-18
URL	https://arxiv.org/abs/1807.06714v2
PDF	https://arxiv.org/pdf/1807.06714v2.pdf
PWC	https://paperswithcode.com/paper/defend-deep-neural-networks-against
Repo
Framework
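
To make the notion of a quantized activation concrete, the sketch below applies a generic fixed uniform quantizer to a clipped activation. The abstract does not specify the quantizer the authors use, and the dynamic (learned) variant is not shown; the function name, the `[0, 1]` clipping range, and the bit-width are assumptions.

```python
import numpy as np

def quantized_activation(x, bits=2):
    """Clip activations to [0, 1] and round to 2**bits - 1 uniform steps."""
    levels = 2 ** bits - 1
    clipped = np.clip(x, 0.0, 1.0)
    return np.round(clipped * levels) / levels

# Hypothetical pre-activations and a small perturbation of them.
x = np.array([-0.5, 0.1, 0.4, 0.9, 1.7])
print(quantized_activation(x))
print(quantized_activation(x + 0.01))
```

In this example both calls print the same values, which illustrates the intuition behind the defense: perturbations that do not push an activation across a quantization boundary are absorbed by the rounding step.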

Confidence Scoring Using Whitebox Meta-models with Linear Classifier Probes


Title	Confidence Scoring Using Whitebox Meta-models with Linear Classifier Probes
Authors	Tongfei Chen, Jiří Navrátil, Vijay Iyengar, Karthikeyan Shanmugam
Abstract	We propose a novel confidence scoring mechanism for deep neural networks based on a two-model paradigm involving a base model and a meta-model. The confidence score is learned by the meta-model observing the base model succeeding/failing at its task. As features to the meta-model, we investigate linear classifier probes inserted between the various layers of the base model. Our experiments demonstrate that this approach outperforms various baselines in a filtering task, i.e., the task of rejecting samples with low confidence. Experimental results are presented using the CIFAR-10 and CIFAR-100 datasets with and without added noise. We discuss the importance of confidence scoring to bridge the gap between experimental and real-world applications.
Tasks
Published	2018-05-14
URL	http://arxiv.org/abs/1805.05396v2
PDF	http://arxiv.org/pdf/1805.05396v2.pdf
PWC	https://paperswithcode.com/paper/confidence-scoring-using-whitebox-meta-models
Repo
Framework


End-to-End Refinement Guided by Pre-trained Prototypical Classifier


Title	End-to-End Refinement Guided by Pre-trained Prototypical Classifier
Authors	Junwen Bai, Zihang Lai, Runzhe Yang, Yexiang Xue, John Gregoire, Carla Gomes
Abstract	Many real-world tasks involve identifying patterns from data satisfying background or prior knowledge. In domains like materials discovery, due to the flaws and biases in raw experimental data, the identification of X-ray diffraction patterns (XRD) often requires a huge amount of manual work in finding refined phases that are similar to the ideal theoretical ones. Automatically refining the raw XRDs utilizing the simulated theoretical data is thus desirable. We propose imitation refinement, a novel approach to refine imperfect input patterns, guided by a pre-trained classifier incorporating prior knowledge from simulated theoretical data, such that the refined patterns imitate the ideal data. The classifier is trained on the ideal simulated data to classify patterns and learns an embedding space where each class is represented by a prototype. The refiner learns to refine the imperfect patterns with small modifications, such that their embeddings are closer to the corresponding prototypes. We show that the refiner can be trained in both supervised and unsupervised fashions. We further illustrate the effectiveness of the proposed approach both qualitatively and quantitatively in a digit refinement task and an X-ray diffraction pattern refinement task in materials discovery.
Tasks
Published	2018-05-07
URL	http://arxiv.org/abs/1805.08698v2
PDF	http://arxiv.org/pdf/1805.08698v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-refinement-guided-by-pre-trained
Repo
Framework
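
The role of class prototypes in an embedding space can be sketched as follows. The embedding network and the refiner are not shown; computing a prototype as the class mean and using Euclidean distance follow the usual prototypical-classifier setup and are assumptions here, as are the function names and the toy data.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Return the class ids and one prototype (mean embedding) per class."""
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def nearest_prototype(embedding, classes, protos):
    """Classify one embedding by its closest prototype.

    Returns the predicted class and the Euclidean distance to its
    prototype; a refiner would aim to shrink this distance.
    """
    distances = np.linalg.norm(protos - embedding, axis=1)
    i = int(np.argmin(distances))
    return classes[i], distances[i]

# Toy 2-D embeddings for two classes.
emb = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
lab = np.array([0, 0, 1, 1])
classes, protos = class_prototypes(emb, lab)
print(nearest_prototype(np.array([1.0, 1.0]), classes, protos))
```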

Convergence Rates of Gaussian ODE Filters


Title	Convergence Rates of Gaussian ODE Filters
Authors	Hans Kersting, T. J. Sullivan, Philipp Hennig
Abstract	A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution $x$ and its first $q$ derivatives a priori as a Gauss–Markov process $\boldsymbol{X}$, which is then iteratively conditioned on information about $\dot{x}$. This article establishes worst-case local convergence rates of order $q+1$ for a wide range of versions of this Gaussian ODE filter, as well as global convergence rates of order $q$ in the case of $q=1$ and an integrated Brownian motion prior, and analyses how inaccurate information on $\dot{x}$ coming from approximate evaluations of $f$ affects these rates. Moreover, we show that, in the globally convergent case, the posterior credible intervals are well calibrated in the sense that they globally contract at the same rate as the truncation error. We illustrate these theoretical results by numerical experiments which suggest their generalizability to $q \in \{2,3,4,\dots\}$.
Tasks
Published	2018-07-25
URL	https://arxiv.org/abs/1807.09737v2
PDF	https://arxiv.org/pdf/1807.09737v2.pdf
PWC	https://paperswithcode.com/paper/convergence-rates-of-gaussian-ode-filters
Repo
Framework
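
A scalar Gaussian ODE filter with $q=1$ and an integrated Brownian motion prior can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: the vector field is evaluated at the predicted mean, the measurement is treated as noise-free, and the function name, the diffusion scale `sigma`, and the test problem $\dot{x} = -x$ are choices made here.

```python
import numpy as np

def gaussian_ode_filter(f, x0, t_end, h, sigma=1.0):
    """Kalman filter on the state (x, x') for the scalar ODE x' = f(x).

    Prior: once-integrated Brownian motion (q = 1) with scale sigma.
    Data : f evaluated at the predicted mean, assumed noise-free.
    Returns the final posterior mean and covariance of (x, x').
    """
    A = np.array([[1.0, h], [0.0, 1.0]])                  # prior transition
    Q = sigma ** 2 * np.array([[h ** 3 / 3, h ** 2 / 2],
                               [h ** 2 / 2, h]])          # process noise
    m = np.array([x0, f(x0)], dtype=float)                # exact initial state
    P = np.zeros((2, 2))
    for _ in range(int(round(t_end / h))):
        # Predict with the integrated Brownian motion prior.
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q
        # Condition the derivative component on f(predicted x).
        residual = f(m_pred[0]) - m_pred[1]
        S = P_pred[1, 1]
        K = P_pred[:, 1] / S
        m = m_pred + K * residual
        P = P_pred - np.outer(K, K) * S
    return m, P

m, P = gaussian_ode_filter(lambda x: -x, x0=1.0, t_end=1.0, h=0.01)
print(m[0], np.exp(-1.0), np.sqrt(P[0, 0]))
```

The first printed value is the filter's estimate of $x(1)$, the second is the exact solution $e^{-1}$, and the third is the posterior standard deviation, the quantity whose contraction rate the abstract relates to the truncation error.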

AAAI FSS-18: Artificial Intelligence in Government and Public Sector Proceedings


Title	AAAI FSS-18: Artificial Intelligence in Government and Public Sector Proceedings
Authors	Frank Stein, Alun Preece, Mihai Boicu
Abstract	Proceedings of the AAAI Fall Symposium on Artificial Intelligence in Government and Public Sector, Arlington, Virginia, USA, October 18-20, 2018
Tasks
Published	2018-10-14
URL	http://arxiv.org/abs/1810.06018v1
PDF	http://arxiv.org/pdf/1810.06018v1.pdf
PWC	https://paperswithcode.com/paper/aaai-fss-18-artificial-intelligence-in
Repo
Framework

Mapping Road Lanes Using Laser Remission and Deep Neural Networks


Title	Mapping Road Lanes Using Laser Remission and Deep Neural Networks
Authors	Raphael V. Carneiro, Rafael C. Nascimento, Rânik Guidolini, Vinicius B. Cardoso, Thiago Oliveira-Santos, Claudine Badue, Alberto F. De Souza
Abstract	We propose the use of deep neural networks (DNN) for solving the problem of inferring the position and relevant properties of lanes of urban roads with poor or absent horizontal signalization, in order to allow the operation of autonomous cars in such situations. We take a segmentation approach to the problem and use the Efficient Neural Network (ENet) DNN for segmenting LiDAR remission grid maps into road maps. We represent road maps using what we call road grid maps. Road grid maps are square matrices and each element of these matrices represents a small square region of real-world space. The value of each element is a code associated with the semantics of the road map. Our road grid maps contain all information about the roads’ lanes required for building the Road Definition Data Files (RDDFs) that are necessary for the operation of our autonomous car, IARA (Intelligent Autonomous Robotic Automobile). We have built a dataset of tens of kilometers of manually marked road lanes and used part of it to train ENet to segment road grid maps from remission grid maps. After being trained, ENet achieved an average segmentation accuracy of 83.7%. We have tested the use of inferred road grid maps in the real world using IARA on a stretch of 3.7 km of urban roads and it has shown performance equivalent to that of IARA’s previous subsystem that uses a manually generated RDDF.
Tasks
Published	2018-04-27
URL	http://arxiv.org/abs/1804.10662v1
PDF	http://arxiv.org/pdf/1804.10662v1.pdf
PWC	https://paperswithcode.com/paper/mapping-road-lanes-using-laser-remission-and
Repo
Framework

Investigating the Automatic Classification of Algae Using Fusion of Spectral and Morphological Characteristics of Algae via Deep Residual Learning


Title	Investigating the Automatic Classification of Algae Using Fusion of Spectral and Morphological Characteristics of Algae via Deep Residual Learning
Authors	Jason L. Deglint, Chao Jin, Alexander Wong
Abstract	Under the impact of global climate changes and human activities, harmful algae blooms in surface waters have become a growing concern due to negative impacts on water related industries. Therefore, reliable and cost-effective methods of quantifying the type and concentration of threshold levels of algae cells have become critical for ensuring successful water management. In this work, we present SAMSON, an innovative system to automatically classify multiple types of algae from different phyla groups by combining standard morphological features with their multi-wavelength signals. Two phyla with focused investigation in this study are the Cyanophyta phylum (blue-green algae), and the Chlorophyta phylum (green algae). We use a custom-designed microscopy imaging system which is configured to image water samples at two fluorescent wavelengths and seven absorption wavelengths using discrete-wavelength high-powered light emitting diodes (LEDs). Powered by computer vision and machine learning, we investigate the possibility and effectiveness of automatic classification using a deep residual convolutional neural network. More specifically, a classification accuracy of 96% was achieved in an experiment conducted with six different algae types. This high level of accuracy was achieved using a deep residual convolutional neural network that learns the optimal combination of spectral and morphological features. These findings allude to the possibility of leveraging a unique fingerprint of algae cells (i.e. spectral wavelengths and morphological features) to automatically distinguish different algae types. Our work herein demonstrates that, when coupled with multi-band fluorescence microscopy, machine learning algorithms can potentially be used as a robust and cost-effective tool for identifying and enumerating algae cells.
Tasks
Published	2018-10-25
URL	http://arxiv.org/abs/1810.10889v1
PDF	http://arxiv.org/pdf/1810.10889v1.pdf
PWC	https://paperswithcode.com/paper/investigating-the-automatic-classification-of
Repo
Framework

Using an Ancillary Neural Network to Capture Weekends and Holidays in an Adjoint Neural Network Architecture for Intelligent Building Management


Title	Using an Ancillary Neural Network to Capture Weekends and Holidays in an Adjoint Neural Network Architecture for Intelligent Building Management
Authors	Zhicheng Ding, Mehmet Kerem Turkcan, Albert Boulanger
Abstract	The US EIA estimated in 2017 that about 39% of total U.S. energy consumption was by the residential and commercial sectors. Therefore, Intelligent Building Management (IBM) solutions that minimize consumption while maintaining tenant comfort are an important component in addressing climate change. A forecasting capability for accurate prediction of indoor temperatures in a planning horizon of 24 hours is essential to IBM. It should predict the indoor temperature in both short-term (e.g. 15 minutes) and long-term (e.g. 24 hours) periods accurately, including weekends, major holidays, and minor holidays. Other requirements include the ability to predict the maximum and the minimum indoor temperatures precisely and to provide a confidence for each prediction. To meet these requirements, we propose a novel adjoint neural network architecture for time series prediction that uses an ancillary neural network to capture weekend and holiday information. We studied four long short-term memory (LSTM) based time series prediction networks within this architecture. We observed that the ancillary neural network helps to improve the prediction accuracy, the maximum and the minimum temperature prediction, and model reliability for all networks tested.
Tasks	Time Series, Time Series Prediction
Published	2018-12-26
URL	http://arxiv.org/abs/1902.06778v1
PDF	http://arxiv.org/pdf/1902.06778v1.pdf
PWC	https://paperswithcode.com/paper/using-an-ancillary-neural-network-to-capture
Repo
Framework

Learning Dexterous In-Hand Manipulation


Title	Learning Dexterous In-Hand Manipulation
Authors	OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng, Wojciech Zaremba
Abstract	We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system like friction coefficients and an object’s appearance. Our policies transfer to the physical robot despite being trained entirely in simulation. Our method does not rely on any human demonstrations, but many behaviors found in human manipulation emerge naturally, including finger gaiting, multi-finger coordination, and the controlled use of gravity. Our results were obtained using the same distributed RL system that was used to train OpenAI Five. We also include a video of our results: https://youtu.be/jwSbzNHGflM
Tasks
Published	2018-08-01
URL	http://arxiv.org/abs/1808.00177v5
PDF	http://arxiv.org/pdf/1808.00177v5.pdf
PWC	https://paperswithcode.com/paper/learning-dexterous-in-hand-manipulation
Repo
Framework

Spatio-temporal Stacked LSTM for Temperature Prediction in Weather Forecasting


Title	Spatio-temporal Stacked LSTM for Temperature Prediction in Weather Forecasting
Authors	Zahra Karevan, Johan A. K. Suykens
Abstract	Long Short-Term Memory (LSTM) is a well-known method widely used for sequence learning and time series prediction. In this paper we deploy a stacked LSTM model in a weather forecasting application. We propose a 2-layer spatio-temporal stacked LSTM model which consists of independent LSTM models per location in the first LSTM layer. Subsequently, the input of the second LSTM layer is formed based on the combination of the hidden states of the first layer LSTM models. The experiments show that, by utilizing the spatial information, the prediction performance of the stacked LSTM model improves in most cases.
Tasks	Time Series, Time Series Prediction, Weather Forecasting
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06341v1
PDF	http://arxiv.org/pdf/1811.06341v1.pdf
PWC	https://paperswithcode.com/paper/spatio-temporal-stacked-lstm-for-temperature
Repo
Framework