January 28, 2020

2895 words 14 mins read

Paper Group ANR 897

Stochastically Dominant Distributional Reinforcement Learning. Distributed Parameter Estimation in Randomized One-hidden-layer Neural Networks. The Knowledge Within: Methods for Data-Free Model Compression. Understanding LSTM – a tutorial into Long Short-Term Memory Recurrent Neural Networks. Communication-Efficient Distributed Online Learning wit …

Stochastically Dominant Distributional Reinforcement Learning

Title Stochastically Dominant Distributional Reinforcement Learning
Authors John D. Martin, Michal Lyskawinski, Xiaohu Li, Brendan Englot
Abstract We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning (RL) paradigm. Instead of selecting actions according to a single statistic, we propose a distributional method based on the second-order stochastic dominance (SSD) relation. This compares the inherent dispersion of random returns induced by actions, producing a more comprehensive and robust evaluation of the environment’s uncertainty. The necessary conditions for SSD require estimators to predict accurate second moments. To accommodate this, we map the distributional RL problem to a Wasserstein gradient flow, treating the distributional Bellman residual as a potential energy functional. We propose a particle-based algorithm for which we prove optimality and convergence. Our experiments characterize the algorithm performance and demonstrate how uncertainty and performance are better balanced using an SSD policy than with other risk measures.
Tasks Distributional Reinforcement Learning
Published 2019-05-17
URL https://arxiv.org/abs/1905.07318v3
PDF https://arxiv.org/pdf/1905.07318v3.pdf
PWC https://paperswithcode.com/paper/stochastically-dominant-distributional
Repo
Framework
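
The SSD relation the paper builds on has a convenient empirical form: for two equally weighted return samples of the same size, one dominates the other in second order exactly when every partial sum of its sorted values is at least as large. Below is a minimal NumPy sketch of that check, with a naive undominated-action rule standing in for the paper's particle-based SSD policy; the Wasserstein gradient flow machinery is not shown, and the action-selection rule is an illustrative assumption.

```python
import numpy as np

def ssd_dominates(x, y):
    """True if the empirical distribution of `x` second-order stochastically
    dominates that of `y`. For equally weighted samples of the same size this
    reduces to comparing partial sums of the ascending order statistics."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    assert x.shape == y.shape, "sketch assumes equal-size return samples"
    return bool(np.all(np.cumsum(x) >= np.cumsum(y)))

def ssd_action(return_particles):
    """Pick an action whose return particles are not SSD-dominated by any other
    action's particles (a naive stand-in for an SSD policy)."""
    n = len(return_particles)
    for a in range(n):
        if all(b == a or not ssd_dominates(return_particles[b], return_particles[a])
               for b in range(n)):
            return a
    return 0  # no undominated action found; fall back to the first

# toy example: equal means, different dispersion
rng = np.random.default_rng(0)
particles = [rng.normal(1.0, 1.0, size=256), rng.normal(1.0, 0.1, size=256)]
print(ssd_action(particles))  # the low-dispersion action is typically preferred
```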

Distributed Parameter Estimation in Randomized One-hidden-layer Neural Networks

Title Distributed Parameter Estimation in Randomized One-hidden-layer Neural Networks
Authors Yinsong Wang, Shahin Shahrampour
Abstract This paper addresses distributed parameter estimation in randomized one-hidden-layer neural networks. A group of agents sequentially receive measurements of an unknown parameter that is only partially observable to them. In this paper, we present a fully distributed estimation algorithm where agents exchange local estimates with their neighbors to collectively identify the true value of the parameter. We prove that this distributed update provides an asymptotically unbiased estimator of the unknown parameter, i.e., the first moment of the expected global error converges to zero asymptotically. We further analyze the efficiency of the proposed estimation scheme by establishing an asymptotic upper bound on the variance of the global error. Applying our method to a real-world dataset related to appliances energy prediction, we observe that our empirical findings verify the theoretical results.
Tasks
Published 2019-09-20
URL https://arxiv.org/abs/1909.09736v2
PDF https://arxiv.org/pdf/1909.09736v2.pdf
PWC https://paperswithcode.com/paper/190909736
Repo
Framework
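
To make the consensus-plus-innovation flavour of such distributed estimation concrete, here is a hedged NumPy sketch: each agent mixes its neighbours' estimates through a doubly stochastic matrix, then corrects with its own partial, noisy measurement. The observation maps, mixing matrix, noise level, and step size are illustrative assumptions, not the paper's randomized one-hidden-layer setup.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, obs_dim, T = 4, 10, 5, 2000        # agents, parameter dim, per-agent obs dim, rounds
theta_true = rng.normal(size=d)
H = [rng.normal(size=(obs_dim, d)) for _ in range(m)]  # each agent's partial observation map
W = np.full((m, m), 1.0 / m)             # doubly stochastic mixing (fully connected network)
alpha = 0.01                             # innovation step size

theta = [np.zeros(d) for _ in range(m)]
for t in range(T):
    # each agent receives a noisy partial measurement of the unknown parameter
    y = [H[i] @ theta_true + 0.1 * rng.normal(size=obs_dim) for i in range(m)]
    # consensus step: average neighbours' estimates
    mixed = [sum(W[i, j] * theta[j] for j in range(m)) for i in range(m)]
    # innovation step: correct with the local measurement residual
    theta = [mixed[i] + alpha * H[i].T @ (y[i] - H[i] @ theta[i]) for i in range(m)]

print(np.linalg.norm(np.mean(theta, axis=0) - theta_true))  # global error shrinks over the rounds
```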

The Knowledge Within: Methods for Data-Free Model Compression

Title The Knowledge Within: Methods for Data-Free Model Compression
Authors Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry
Abstract Background: Recently, an extensive amount of research has been focused on compressing and accelerating Deep Neural Networks (DNNs). So far, high compression rate algorithms required the entire training dataset, or its subset, for fine-tuning and low precision calibration process. However, this requirement is unacceptable when sensitive data is involved as in medical and biometric use-cases. Contributions: We present three methods for generating synthetic samples from trained models. Then, we demonstrate how these samples can be used to fine-tune or to calibrate quantized models with negligible accuracy degradation compared to the original training set — without using any real data in the process. Furthermore, we suggest that our best performing method, leveraging intrinsic batch normalization layers’ statistics of a trained model, can be used to evaluate data similarity. Our approach opens a path towards genuine data-free model compression, alleviating the need for training data during deployment.
Tasks Calibration, Model Compression
Published 2019-12-03
URL https://arxiv.org/abs/1912.01274v1
PDF https://arxiv.org/pdf/1912.01274v1.pdf
PWC https://paperswithcode.com/paper/the-knowledge-within-methods-for-data-free
Repo
Framework
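
A simplified sketch of the batch-norm-statistics idea: synthesize inputs whose per-layer batch statistics match the BatchNorm running mean and variance stored in a trained model. The choice of ResNet-18, the loss weighting, and the optimizer settings are assumptions for illustration; the paper's exact objective and sample-generation procedure may differ.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18   # any trained network with BatchNorm would do

model = resnet18(weights="IMAGENET1K_V1").eval()  # newer torchvision; older versions use pretrained=True

bn_losses = []
def make_hook(bn):
    def hook(module, inputs, output):
        x = inputs[0]
        mean = x.mean(dim=(0, 2, 3))              # batch statistics of the synthetic inputs
        var = x.var(dim=(0, 2, 3), unbiased=False)
        bn_losses.append(((mean - bn.running_mean) ** 2).mean()
                         + ((var - bn.running_var) ** 2).mean())
    return hook

for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.register_forward_hook(make_hook(m))

x = torch.randn(16, 3, 224, 224, requires_grad=True)   # start from noise, no real data
opt = torch.optim.Adam([x], lr=0.05)
for step in range(200):
    bn_losses.clear()
    model(x)
    loss = torch.stack(bn_losses).sum()                 # match the stored BN statistics
    opt.zero_grad(); loss.backward(); opt.step()
```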

Understanding LSTM – a tutorial into Long Short-Term Memory Recurrent Neural Networks

Title Understanding LSTM – a tutorial into Long Short-Term Memory Recurrent Neural Networks
Authors Ralf C. Staudemeyer, Eric Rothstein Morris
Abstract Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) are one of the most powerful dynamic classifiers publicly known. The network and its learning algorithms are reasonably well documented, enough to get an idea of how they work. This paper sheds more light on how LSTM-RNNs evolved and why they work impressively well, focusing on the early, ground-breaking publications. We significantly improved the documentation and fixed a number of errors and inconsistencies that accumulated in previous publications. To support understanding, we also revised and unified the notation used.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.09586v1
PDF https://arxiv.org/pdf/1909.09586v1.pdf
PWC https://paperswithcode.com/paper/understanding-lstm-a-tutorial-into-long-short
Repo
Framework
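
Since the tutorial centres on the LSTM equations themselves, a compact NumPy version of a single standard LSTM step (input, forget, and output gates plus the candidate cell state) may help. The gate ordering and stacked parameter layout below are one common convention, not the only one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.
    W: (4*hidden, input), U: (4*hidden, hidden), b: (4*hidden,) hold the stacked gates."""
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0*hidden:1*hidden])   # input gate
    f = sigmoid(z[1*hidden:2*hidden])   # forget gate
    o = sigmoid(z[2*hidden:3*hidden])   # output gate
    g = np.tanh(z[3*hidden:4*hidden])   # candidate cell state
    c = f * c_prev + i * g              # new cell state
    h = o * np.tanh(c)                  # new hidden state
    return h, c

# tiny usage example with random parameters, unrolled over a length-10 sequence
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(size=(4*n_hid, n_in))
U = rng.normal(size=(4*n_hid, n_hid))
b = np.zeros(4*n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(10, n_in)):
    h, c = lstm_step(x, h, c, W, U, b)
```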

Communication-Efficient Distributed Online Learning with Kernels

Title Communication-Efficient Distributed Online Learning with Kernels
Authors Michael Kamp, Sebastian Bothe, Mario Boley, Michael Mock
Abstract We propose an efficient distributed online learning protocol for low-latency real-time services. It extends a previously presented protocol to kernelized online learners that represent their models by a support vector expansion. While such learners often achieve higher predictive performance than their linear counterparts, communicating the support vector expansions becomes inefficient for large numbers of support vectors. The proposed extension allows for a larger class of online learning algorithms—including those alleviating the problem above through model compression. In addition, we characterize the quality of the proposed protocol by introducing a novel criterion that requires the communication to be bounded by the loss suffered.
Tasks Model Compression
Published 2019-11-28
URL https://arxiv.org/abs/1911.12899v1
PDF https://arxiv.org/pdf/1911.12899v1.pdf
PWC https://paperswithcode.com/paper/communication-efficient-distributed-online
Repo
Framework
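
A rough sketch of the two ingredients described above: an online learner represented by a support vector expansion, and a synchronization rule that only communicates once enough loss has accumulated. The RBF kernel, the merge-by-averaging step, and the threshold are illustrative assumptions; the paper's protocol and its loss-bounded criterion are more refined than this.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

class KernelOnlineLearner:
    """Online kernel regressor represented by a support vector expansion."""
    def __init__(self, eta=0.5):
        self.sv, self.alpha, self.eta = [], [], eta

    def predict(self, x):
        return sum(a * rbf(x, s) for a, s in zip(self.alpha, self.sv))

    def update(self, x, y):
        err = y - self.predict(x)
        self.sv.append(x); self.alpha.append(self.eta * err)
        return err ** 2                          # squared loss suffered on this example

def maybe_sync(nodes, budgets, threshold):
    """Sync only when some node's accumulated loss exceeds the threshold."""
    if any(b > threshold for b in budgets):
        merged_sv = [s for n in nodes for s in n.sv]
        merged_alpha = [a / len(nodes) for n in nodes for a in n.alpha]  # averaged model
        for n in nodes:                          # broadcast the merged expansion
            n.sv, n.alpha = list(merged_sv), list(merged_alpha)
        return [0.0 for _ in budgets]            # reset loss budgets after syncing
    return budgets

# toy run: three nodes learning a sine target, communicating only when losses build up
nodes, budgets = [KernelOnlineLearner() for _ in range(3)], [0.0, 0.0, 0.0]
rng = np.random.default_rng(0)
for t in range(100):
    for i, n in enumerate(nodes):
        x = rng.normal(size=2)
        budgets[i] += n.update(x, np.sin(x[0]))
    budgets = maybe_sync(nodes, budgets, threshold=5.0)
```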

Atomistic structure learning

Title Atomistic structure learning
Authors Mathias S. Jørgensen, Henrik L. Mortensen, Søren A. Meldgaard, Esben L. Kolsbjerg, Thomas L. Jacobsen, Knud H. Sørensen, Bjørk Hammer
Abstract One endeavour of modern physical chemistry is to use bottom-up approaches to design materials and drugs with desired properties. Here we introduce an atomistic structure learning algorithm (ASLA) that utilizes a convolutional neural network to build 2D compounds and layered structures atom by atom. The algorithm takes no prior data or knowledge on atomic interactions but instead queries a first-principles quantum-mechanical program for physical properties. Using reinforcement learning, the algorithm accumulates knowledge of chemical compound space for a given number and type of atoms and stores this in the neural network, ultimately learning the blueprint for the optimal structural arrangement of the atoms for a given target property. ASLA is demonstrated to work on diverse problems, including grain boundaries in graphene sheets, organic compound formation and a surface oxide structure. This approach to structure prediction is a first step toward direct manipulation of atoms with artificially intelligent first-principles computer codes.
Tasks
Published 2019-02-27
URL http://arxiv.org/abs/1902.10501v1
PDF http://arxiv.org/pdf/1902.10501v1.pdf
PWC https://paperswithcode.com/paper/atomistic-structure-learning
Repo
Framework
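
A heavily simplified sketch of building a structure atom by atom: candidate grid sites are scored with a placeholder Lennard-Jones energy (standing in for the first-principles program) and placement is epsilon-greedy (standing in for ASLA's CNN-guided reinforcement learning). The grid, potential, and exploration rate are all assumptions made for illustration.

```python
import numpy as np

def pair_energy(r, eps=1.0, sigma=1.0):
    """Placeholder Lennard-Jones pair energy, standing in for a first-principles call."""
    return 4 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)

def structure_energy(atoms):
    e = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            r = np.linalg.norm(np.array(atoms[i]) - np.array(atoms[j]))
            e += pair_energy(max(r, 1e-6))
    return e

def build_structure(n_atoms, grid, rng, epsilon=0.1):
    """Place atoms one by one, greedily minimising energy with epsilon-greedy exploration.
    In ASLA the per-site scores come from a CNN trained with reinforcement learning."""
    atoms = []
    for _ in range(n_atoms):
        free = [p for p in grid if p not in atoms]
        if rng.random() < epsilon:
            atoms.append(free[rng.integers(len(free))])
        else:
            atoms.append(min(free, key=lambda p: structure_energy(atoms + [p])))
    return atoms

rng = np.random.default_rng(0)
grid = [(i * 1.1, j * 1.1) for i in range(6) for j in range(6)]
best = build_structure(5, grid, rng)
```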

Morphy: A Datamorphic Software Test Automation Tool

Title Morphy: A Datamorphic Software Test Automation Tool
Authors Hong Zhu, Ian Bayley, Dongmei Liu, Xiaoyu Zheng
Abstract This paper presents an automated tool called Morphy for datamorphic testing. It classifies software test artefacts into test entities and test morphisms, which are mappings on testing entities. In addition to datamorphisms, metamorphisms and seed test case makers, Morphy also employs a set of other test morphisms including test case metrics and filters, test set metrics and filters, test result analysers and test executers to realise test automation. In particular, basic testing activities can be automated by invoking test morphisms. Test strategies can be realised as complex combinations of test morphisms. Test processes can be automated by recording, editing and playing test scripts that invoke test morphisms and strategies. Three types of test strategies have been implemented in Morphy: datamorphism combination strategies, cluster border exploration strategies and strategies for test set optimisation via genetic algorithms. This paper focuses on the datamorphism combination strategies by giving their definitions and implementation algorithms. The paper also illustrates their uses for testing both traditional software and AI applications with three case studies.
Tasks
Published 2019-12-20
URL https://arxiv.org/abs/1912.09881v1
PDF https://arxiv.org/pdf/1912.09881v1.pdf
PWC https://paperswithcode.com/paper/morphy-a-datamorphic-software-test-automation
Repo
Framework
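
The following toy harness illustrates the vocabulary above: a seed test case maker, a few datamorphisms (mappings on test cases), a metamorphism (a relation the outputs must satisfy), and a first-order combination strategy that applies every datamorphism to every seed. The program under test and the specific morphisms are hypothetical, and the strategy is much simpler than Morphy's strategy library.

```python
import itertools
import math

def program_under_test(x):
    """Hypothetical program under test."""
    return math.sqrt(abs(x))

def seed_maker():
    """Seed test case maker: produces the initial test set."""
    return [0.0, 1.0, 4.0, 9.0]

# datamorphisms: mappings from existing test cases to new ones
datamorphisms = [lambda x: x + 1.0, lambda x: x * 4.0, lambda x: -x]

def non_negative_output(x, y_x, m_x, y_mx):
    """Metamorphism: outputs on original and mutated inputs must be non-negative."""
    return y_x >= 0.0 and y_mx >= 0.0

def first_order_combination(seeds, morphisms, metamorphism, put):
    """Apply every datamorphism to every seed and check the metamorphism
    (a simplified stand-in for a datamorphism combination strategy)."""
    failures = []
    for x, m in itertools.product(seeds, morphisms):
        mx = m(x)
        if not metamorphism(x, put(x), mx, put(mx)):
            failures.append((x, mx))
    return failures

print(first_order_combination(seed_maker(), datamorphisms, non_negative_output,
                              program_under_test))
```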

Financial Time Series Forecasting with Deep Learning : A Systematic Literature Review: 2005-2019

Title Financial Time Series Forecasting with Deep Learning : A Systematic Literature Review: 2005-2019
Authors Omer Berat Sezer, Mehmet Ugur Gudelek, Ahmet Murat Ozbayoglu
Abstract Financial time series forecasting is, without a doubt, the top choice of computational intelligence for finance researchers in both academia and the financial industry due to its broad implementation areas and substantial impact. Machine Learning (ML) researchers have come up with various models, and a vast number of studies have been published accordingly. As such, a significant number of surveys exist covering ML for financial time series forecasting. Lately, Deep Learning (DL) models have started appearing within the field, with results that significantly outperform their traditional ML counterparts. Even though there is a growing interest in developing models for financial time series forecasting research, there is a lack of review papers that focus solely on DL for finance. Hence, our motivation in this paper is to provide a comprehensive literature review of DL studies for financial time series forecasting implementations. We not only categorize the studies according to their intended forecasting implementation areas, such as index, forex, and commodity forecasting, but also group them based on their DL model choices, such as Convolutional Neural Networks (CNNs), Deep Belief Networks (DBNs), and Long Short-Term Memory (LSTM). We also try to envision the future of the field by highlighting possible setbacks and opportunities, so that interested researchers can benefit.
Tasks Time Series, Time Series Forecasting
Published 2019-11-29
URL https://arxiv.org/abs/1911.13288v1
PDF https://arxiv.org/pdf/1911.13288v1.pdf
PWC https://paperswithcode.com/paper/financial-time-series-forecasting-with-deep
Repo
Framework

Horizontal Flows and Manifold Stochastics in Geometric Deep Learning

Title Horizontal Flows and Manifold Stochastics in Geometric Deep Learning
Authors Stefan Sommer, Alex Bronstein
Abstract We introduce two constructions in geometric deep learning for 1) transporting orientation-dependent convolutional filters over a manifold in a continuous way and thereby defining a convolution operator that naturally incorporates the rotational effect of holonomy; and 2) allowing efficient evaluation of manifold convolution layers by sampling manifold valued random variables that center around a weighted Brownian motion maximum likelihood mean. Both methods are inspired by stochastics on manifolds and geometric statistics, and provide examples of how stochastic methods – here, horizontal frame bundle flows and non-linear bridge sampling schemes – can be used in geometric deep learning. We outline the theoretical foundation of the two methods, discuss their relation to Euclidean deep networks and existing methodology in geometric deep learning, and establish important properties of the proposed constructions.
Tasks
Published 2019-09-13
URL https://arxiv.org/abs/1909.06397v1
PDF https://arxiv.org/pdf/1909.06397v1.pdf
PWC https://paperswithcode.com/paper/horizontal-flows-and-manifold-stochastics-in
Repo
Framework
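
As a small illustration of manifold-valued stochastics (not of the paper's frame bundle or bridge sampling constructions), here is a geodesic random walk approximating Brownian motion on the unit sphere. The step size, horizon, and the crude extrinsic averaging at the end are assumptions made for the sketch.

```python
import numpy as np

def sphere_brownian_path(p0, n_steps=1000, dt=1e-3, rng=None):
    """Geodesic random walk on the unit sphere S^2, a standard discrete
    approximation of Brownian motion on a manifold."""
    rng = rng or np.random.default_rng(0)
    p = np.asarray(p0, dtype=float)
    path = [p.copy()]
    for _ in range(n_steps):
        v = rng.normal(scale=np.sqrt(dt), size=3)
        v -= np.dot(v, p) * p                  # project the step onto the tangent space at p
        norm = np.linalg.norm(v)
        if norm > 0:                           # exponential map on the sphere
            p = np.cos(norm) * p + np.sin(norm) * (v / norm)
            p /= np.linalg.norm(p)             # guard against numerical drift
        path.append(p.copy())
    return np.array(path)

# sample endpoints of many walks and form a crude extrinsic mean direction
samples = np.array([sphere_brownian_path([0.0, 0.0, 1.0],
                                         rng=np.random.default_rng(s))[-1]
                    for s in range(200)])
mean_direction = samples.mean(axis=0)
mean_direction /= np.linalg.norm(mean_direction)
```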

Dragonfly Algorithm and its Applications in Applied Science – Survey

Title Dragonfly Algorithm and its Applications in Applied Science – Survey
Authors Chnoor M. Rahman, Tarik A. Rashid
Abstract One of the most recently developed heuristic optimization algorithms is the dragonfly algorithm by Mirjalili. The dragonfly algorithm has shown its ability to optimize different real-world problems. It has three variants. In this work, an overview of the algorithm and its variants is presented. Moreover, the hybridization versions of the algorithm are discussed. Furthermore, the results of applications that utilized the dragonfly algorithm in applied science are offered in the following areas: machine learning, image processing, wireless, and networking. It is then compared with some other metaheuristic algorithms. In addition, the algorithm is tested on the CEC-C06 2019 benchmark functions. The results show that the algorithm has great exploration ability and that its convergence rate is better than other algorithms in the literature, such as PSO and GA. In general, this survey discusses the strong and weak points of the algorithm. Furthermore, some future works that would help improve the algorithm's weak points are recommended. This study is conducted with the hope of offering beneficial information about the dragonfly algorithm to researchers who want to study it.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/2001.02292v1
PDF https://arxiv.org/pdf/2001.02292v1.pdf
PWC https://paperswithcode.com/paper/dragonfly-algorithm-and-its-applications-in
Repo
Framework
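
A deliberately simplified sketch of a dragonfly-style swarm update, with separation, alignment, cohesion, food attraction, and enemy distraction terms. The neighbourhood handling, coefficient schedule, and exact update rules of the original algorithm are not reproduced here, so treat the weights, clipping, and test function as assumptions.

```python
import numpy as np

def dragonfly_optimize(obj, dim=2, n=20, iters=200, lb=-5.0, ub=5.0, seed=0):
    """Simplified dragonfly-style swarm optimizer: every individual treats the whole
    swarm as its neighbourhood and the weights are fixed (the original adapts both)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n, dim))    # positions
    dX = np.zeros((n, dim))                   # step vectors
    s, a, c, f, e, w = 0.1, 0.1, 0.3, 1.0, 0.5, 0.9
    for _ in range(iters):
        fit = np.array([obj(x) for x in X])
        food, enemy = X[np.argmin(fit)], X[np.argmax(fit)]
        center, mean_step = X.mean(axis=0), dX.mean(axis=0)
        for i in range(n):
            S = X[i] - center                 # separation: move away from crowding
            A = mean_step                     # alignment: follow the average step
            C = center - X[i]                 # cohesion: move toward the swarm centre
            F = food - X[i]                   # attraction to the best position found
            E = X[i] - enemy                  # distraction away from the worst position
            dX[i] = np.clip(w * dX[i] + s * S + a * A + c * C + f * F + e * E,
                            -(ub - lb) / 10, (ub - lb) / 10)
            X[i] = np.clip(X[i] + dX[i], lb, ub)
    fit = np.array([obj(x) for x in X])
    return X[np.argmin(fit)], float(fit.min())

best_x, best_f = dragonfly_optimize(lambda x: float(np.sum(x ** 2)))  # sphere test function
```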

Foreground-aware Image Inpainting

Title Foreground-aware Image Inpainting
Authors Wei Xiong, Jiahui Yu, Zhe Lin, Jimei Yang, Xin Lu, Connelly Barnes, Jiebo Luo
Abstract Existing image inpainting methods typically fill holes by borrowing information from surrounding pixels. They often produce unsatisfactory results when the holes overlap with or touch foreground objects due to lack of information about the actual extent of foreground and background regions within the holes. These scenarios, however, are very important in practice, especially for applications such as the removal of distracting objects. To address the problem, we propose a foreground-aware image inpainting system that explicitly disentangles structure inference and content completion. Specifically, our model learns to predict the foreground contour first, and then inpaints the missing region using the predicted contour as guidance. We show that by such disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting. Experiments show that our method significantly outperforms existing methods and achieves superior inpainting results on challenging cases with complex compositions.
Tasks Image Inpainting
Published 2019-01-17
URL http://arxiv.org/abs/1901.05945v3
PDF http://arxiv.org/pdf/1901.05945v3.pdf
PWC https://paperswithcode.com/paper/foreground-aware-image-inpainting
Repo
Framework
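
A structural sketch of the two-stage idea in PyTorch: a contour network predicts the foreground boundary inside the hole, and an inpainting network completes the image with that contour as extra guidance. The tiny placeholder networks, channel layout, and masking convention are assumptions; the actual models and training losses in the paper are far more elaborate.

```python
import torch
import torch.nn as nn

class TinyBlock(nn.Sequential):
    """Placeholder encoder-decoder; the real models are much larger."""
    def __init__(self, in_ch, out_ch):
        super().__init__(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

class ForegroundAwareInpainter(nn.Module):
    """Two-stage sketch: predict the foreground contour inside the hole first,
    then complete the image with the predicted contour as extra guidance."""
    def __init__(self):
        super().__init__()
        self.contour_net = TinyBlock(in_ch=4, out_ch=1)   # RGB + hole mask -> contour map
        self.inpaint_net = TinyBlock(in_ch=5, out_ch=3)   # RGB + mask + contour -> RGB

    def forward(self, image, mask):
        masked = image * (1 - mask)                        # zero out the hole region
        contour = torch.sigmoid(self.contour_net(torch.cat([masked, mask], dim=1)))
        completed = self.inpaint_net(torch.cat([masked, mask, contour], dim=1))
        # keep known pixels, fill only the hole
        return image * (1 - mask) + completed * mask, contour

model = ForegroundAwareInpainter()
img = torch.rand(1, 3, 64, 64)
hole = torch.zeros(1, 1, 64, 64); hole[:, :, 16:48, 16:48] = 1.0
out, contour = model(img, hole)
```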

Building a Production Model for Retrieval-Based Chatbots

Title Building a Production Model for Retrieval-Based Chatbots
Authors Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, Tao Lei
Abstract Response suggestion is an important task for building human-computer conversation systems. Recent approaches to conversation modeling have introduced new model architectures with impressive results, but relatively little attention has been paid to whether these models would be practical in a production setting. In this paper, we describe the unique challenges of building a production retrieval-based conversation system, which selects outputs from a whitelist of candidate responses. To address these challenges, we propose a dual encoder architecture which performs rapid inference and scales well with the size of the whitelist. We also introduce and compare two methods for generating whitelists, and we carry out a comprehensive analysis of the model and whitelists. Experimental results on a large, proprietary help desk chat dataset, including both offline metrics and a human evaluation, indicate production-quality performance and illustrate key lessons about conversation modeling in practice.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03209v2
PDF https://arxiv.org/pdf/1906.03209v2.pdf
PWC https://paperswithcode.com/paper/building-a-production-model-for-retrieval
Repo
Framework
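
The serving-time pattern described above can be sketched as follows: whitelist responses are encoded once, and each incoming context is encoded and scored against them with a dot product, which is what keeps inference fast and scaling with the whitelist size manageable. The hashing "encoder" below is a toy stand-in for the learned dual encoder, and the whitelist strings are hypothetical.

```python
import numpy as np

DIM = 64

def word_vec(word):
    """Deterministic (within one run) random projection per word; a toy stand-in
    for a learned encoder in a dual-encoder architecture."""
    rng = np.random.default_rng(abs(hash(word)) % (2 ** 32))
    return rng.normal(size=DIM)

def encode(text):
    v = sum((word_vec(w) for w in text.lower().split()), np.zeros(DIM))
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

# Precompute whitelist response embeddings once; at serving time only the incoming
# context is encoded and scored against them.
whitelist = ["Please restart the application.",
             "You can reset your password from the settings page.",
             "Our support team will reach out shortly."]
whitelist_emb = np.stack([encode(r) for r in whitelist])

def suggest(context, top_k=1):
    scores = whitelist_emb @ encode(context)      # dot-product scoring
    best = np.argsort(scores)[::-1][:top_k]
    return [(whitelist[i], float(scores[i])) for i in best]

print(suggest("I forgot my password"))
```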

Spotting insects from satellites: modeling the presence of Culicoides imicola through Deep CNNs

Title Spotting insects from satellites: modeling the presence of Culicoides imicola through Deep CNNs
Authors Stefano Vincenzi, Angelo Porrello, Pietro Buzzega, Annamaria Conte, Carla Ippoliti, Luca Candeloro, Alessio Di Lorenzo, Andrea Capobianco Dondona, Simone Calderara
Abstract Nowadays, Vector-Borne Diseases (VBDs) pose a severe threat to public health, accounting for a considerable amount of human illness. Recently, several surveillance plans have been put in place to limit the spread of such diseases, typically involving on-field measurements. A systematic and effective plan is still missing, due to the high costs and efforts required to implement it. Ideally, any attempt in this field should consider the vector-host-pathogen triangle, which is strictly linked to environmental and climatic conditions. In this paper, we exploit satellite imagery from the Sentinel-2 mission, as we believe it encodes the environmental factors responsible for the vector’s spread. Our analysis - conducted in a data-driven fashion - couples spectral images with ground-truth information on the abundance of Culicoides imicola. In this respect, we frame our task as a binary classification problem, relying on Convolutional Neural Networks (CNNs) to learn useful representations from multi-band images. Additionally, we provide a multi-instance variant, aimed at extracting temporal patterns from a short sequence of spectral images. Experiments show promising results, providing the foundations for novel supportive tools that could indicate where surveillance and prevention measures should be prioritized.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.10024v1
PDF https://arxiv.org/pdf/1911.10024v1.pdf
PWC https://paperswithcode.com/paper/spotting-insects-from-satellites-modeling-the
Repo
Framework
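
A minimal PyTorch sketch of the kind of classifier involved: a small CNN over multi-band patches producing a presence/absence logit. The 13-band input (Sentinel-2's band count), the patch size, and the architecture are assumptions; the paper's networks and the multi-instance temporal variant are not reproduced here.

```python
import torch
import torch.nn as nn

class MultiBandCNN(nn.Module):
    """Small CNN for binary presence/absence classification from multi-band patches."""
    def __init__(self, in_bands=13):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, 1)       # logit for "vector present"

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = MultiBandCNN()
patch = torch.rand(4, 13, 64, 64)                # batch of 13-band 64x64 patches
loss = nn.BCEWithLogitsLoss()(model(patch).squeeze(1), torch.tensor([0., 1., 1., 0.]))
```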

1D-Convolutional Capsule Network for Hyperspectral Image Classification

Title 1D-Convolutional Capsule Network for Hyperspectral Image Classification
Authors Haitao Zhang, Lingguo Meng, Xian Wei, Xiaoliang Tang, Xuan Tang, Xingping Wang, Bo Jin, Wei Yao
Abstract Recently, convolutional neural networks (CNNs) have achieved excellent performance in many computer vision tasks. Specifically, for hyperspectral image (HSI) classification, CNNs often require very complex structures due to the high dimensionality of HSIs. The complex structure of CNNs results in prohibitive training effort. Moreover, a common situation in the HSI classification task is a lack of labeled samples, which degrades the accuracy of CNNs. In this work, we develop an easy-to-implement capsule network to alleviate the aforementioned problems, i.e., the 1D-convolution capsule network (1D-ConvCapsNet). Firstly, 1D-ConvCapsNet separately extracts spatial and spectral information in the spatial and spectral domains, which is more lightweight than 3D convolution due to fewer parameters. Secondly, 1D-ConvCapsNet utilizes a capsule-wise constraint window method to reduce the parameter count and computational complexity of the conventional capsule network. Finally, 1D-ConvCapsNet obtains accurate predictions with respect to input samples via dynamic routing. The effectiveness of 1D-ConvCapsNet is verified on three representative HSI datasets. Experimental results demonstrate that 1D-ConvCapsNet is superior to state-of-the-art methods in both accuracy and training effort.
Tasks Hyperspectral Image Classification, Image Classification
Published 2019-03-23
URL http://arxiv.org/abs/1903.09834v1
PDF http://arxiv.org/pdf/1903.09834v1.pdf
PWC https://paperswithcode.com/paper/1d-convolutional-capsule-network-for
Repo
Framework
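
The capsule-specific pieces can be illustrated compactly: the squashing non-linearity and routing-by-agreement between two capsule layers. This NumPy sketch omits the 1D spectral/spatial convolutions and the capsule-wise constraint window that define 1D-ConvCapsNet; the shapes and iteration count are assumptions.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-9):
    """Capsule squashing non-linearity: keeps direction, maps length into [0, 1)."""
    norm_sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, iterations=3):
    """Routing-by-agreement between capsule layers.
    u_hat: predictions of shape (n_in, n_out, d_out)."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = np.einsum('ij,ijd->jd', c, u_hat)                 # weighted sum per output capsule
        v = squash(s)                                         # output capsule vectors
        b = b + np.einsum('ijd,jd->ij', u_hat, v)             # agreement update
    return v

# toy usage: 8 input capsules route to 3 output capsules of dimension 4
u_hat = np.random.default_rng(0).normal(size=(8, 3, 4))
out = dynamic_routing(u_hat)
```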

Understanding Spatial Language in Radiology: Representation Framework, Annotation, and Spatial Relation Extraction from Chest X-ray Reports using Deep Learning

Title Understanding Spatial Language in Radiology: Representation Framework, Annotation, and Spatial Relation Extraction from Chest X-ray Reports using Deep Learning
Authors Surabhi Datta, Yuqi Si, Laritza Rodriguez, Sonya E Shooshan, Dina Demner-Fushman, Kirk Roberts
Abstract We define a representation framework for extracting spatial information from radiology reports (Rad-SpRL). We annotated a total of 2000 chest X-ray reports with 4 spatial roles corresponding to the common radiology entities. Our focus is on extracting detailed information of a radiologist’s interpretation containing a radiographic finding, its anatomical location, corresponding probable diagnoses, as well as associated hedging terms. For this, we propose a deep learning-based natural language processing (NLP) method involving both word and character-level encodings. Specifically, we utilize a bidirectional long short-term memory (Bi-LSTM) conditional random field (CRF) model for extracting the spatial roles. The model achieved average F1 measures of 90.28 and 94.61 for extracting the Trajector and Landmark roles respectively whereas the performance was moderate for Diagnosis and Hedge roles with average F1 of 71.47 and 73.27 respectively. The corpus will soon be made available upon request.
Tasks Relation Extraction
Published 2019-08-13
URL https://arxiv.org/abs/1908.04485v1
PDF https://arxiv.org/pdf/1908.04485v1.pdf
PWC https://paperswithcode.com/paper/understanding-spatial-language-in-radiology
Repo
Framework
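
A skeleton of the tagging model family used here: a BiLSTM producing per-token logits over BIO spatial-role tags. The character-level encoding and CRF decoding layer from the paper are omitted for brevity, and the tag inventory, vocabulary size, and hyperparameters below are assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Token-level tagger for spatial roles (Trajector, Landmark, Diagnosis, Hedge)."""
    def __init__(self, vocab_size, n_tags, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h)                          # per-token tag logits

# hypothetical BIO tag set for Rad-SpRL-style roles
tags = ["O", "B-TRAJ", "I-TRAJ", "B-LAND", "I-LAND",
        "B-DIAG", "I-DIAG", "B-HEDGE", "I-HEDGE"]
model = BiLSTMTagger(vocab_size=5000, n_tags=len(tags))
logits = model(torch.randint(1, 5000, (2, 20)))     # batch of 2 sentences, 20 tokens each
loss = nn.CrossEntropyLoss()(logits.reshape(-1, len(tags)),
                             torch.randint(0, len(tags), (2 * 20,)))
```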