October 17, 2019

3247 words 16 mins read

Paper Group ANR 874



Stochastic Gradient Hamiltonian Monte Carlo with Variance Reduction for Bayesian Inference

Title Stochastic Gradient Hamiltonian Monte Carlo with Variance Reduction for Bayesian Inference
Authors Zhize Li, Tianyi Zhang, Shuyu Cheng, Jun Zhu, Jian Li
Abstract Gradient-based Monte Carlo sampling algorithms, like Langevin dynamics and Hamiltonian Monte Carlo, are important methods for Bayesian inference. In large-scale settings, full-gradients are not affordable and thus stochastic gradients evaluated on mini-batches are used as a replacement. In order to reduce the high variance of noisy stochastic gradients, Dubey et al. [2016] applied the standard variance reduction technique on stochastic gradient Langevin dynamics and obtained both theoretical and experimental improvements. In this paper, we apply the variance reduction tricks on Hamiltonian Monte Carlo and achieve better theoretical convergence results compared with the variance-reduced Langevin dynamics. Moreover, we apply the symmetric splitting scheme in our variance-reduced Hamiltonian Monte Carlo algorithms to further improve the theoretical results. The experimental results are also consistent with the theoretical results. As our experiment shows, variance-reduced Hamiltonian Monte Carlo demonstrates better performance than variance-reduced Langevin dynamics in Bayesian regression and classification tasks on real-world datasets.
Tasks Bayesian Inference
Published 2018-03-29
URL https://arxiv.org/abs/1803.11159v3
PDF https://arxiv.org/pdf/1803.11159v3.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-hamiltonian-monte-carlo-1
Repo
Framework
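The SVRG-style control variate at the heart of this paper can be illustrated with a toy sketch. Everything below is hypothetical and illustrative (a 1-D Gaussian-mean posterior, invented step sizes), not the authors' implementation, and the symmetric splitting scheme is omitted:

```python
import math
import random

random.seed(0)

# Toy 1-D Bayesian problem: posterior over the mean of Gaussian data,
# with potential U(theta) = 0.5 * sum_i (theta - x_i)^2.
n = 200
data = [3.0 + random.gauss(0.0, 1.0) for _ in range(n)]

def grad_i(theta, i):
    """Gradient of the i-th summand of U."""
    return theta - data[i]

def svrg_sghmc(num_steps=5000, snapshot_every=50, batch=10, eta=1e-3, alpha=0.1):
    theta, v, samples = 0.0, 0.0, []
    for t in range(num_steps):
        if t % snapshot_every == 0:
            # SVRG anchor: store a snapshot and its full gradient.
            snap = theta
            snap_grad = sum(grad_i(snap, i) for i in range(n))
        idx = random.sample(range(n), batch)
        # Variance-reduced gradient estimate (control variate around snap).
        g = snap_grad + (n / batch) * sum(grad_i(theta, i) - grad_i(snap, i)
                                          for i in idx)
        # SGHMC momentum update: friction alpha and matching injected noise.
        v = (1 - alpha) * v - eta * g + math.sqrt(2 * alpha * eta) * random.gauss(0.0, 1.0)
        theta += v
        samples.append(theta)
    return samples

samples = svrg_sghmc()
posterior_mean = sum(samples[1000:]) / len(samples[1000:])
```

For this quadratic toy potential the control variate happens to cancel the minibatch noise exactly; the point is only to show where the snapshot gradient enters the SGHMC momentum update.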

A Unified Framework for Joint Mobility Prediction and Object Profiling of Drones in UAV Networks

Title A Unified Framework for Joint Mobility Prediction and Object Profiling of Drones in UAV Networks
Authors Han Peng, Abolfazl Razi, Fatemeh Afghah, Jonathan Ashdown
Abstract In recent years, using a network of autonomous and cooperative unmanned aerial vehicles (UAVs) without command and communication from the ground station has become increasingly important, in particular in search-and-rescue operations, disaster management, and other applications where human intervention is limited. In such scenarios, UAVs can make more efficient decisions if they acquire more information about the mobility, sensing, and actuation capabilities of their neighbor nodes. In this paper, we develop an unsupervised online learning algorithm for joint mobility prediction and object profiling of UAVs to facilitate control and communication protocols. The proposed method not only predicts the future locations of the surrounding flying objects, but also classifies them into different groups with similar levels of maneuverability (e.g., rotary- and fixed-wing UAVs) without prior knowledge of these classes. The method is flexible in admitting new object types with unknown mobility profiles, making it applicable to emerging flying ad-hoc networks with heterogeneous nodes.
Tasks
Published 2018-07-31
URL http://arxiv.org/abs/1808.00058v1
PDF http://arxiv.org/pdf/1808.00058v1.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-joint-mobility
Repo
Framework
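A toy rendering of the two ingredients, linear motion extrapolation for prediction plus incremental clustering of a maneuverability feature for profiling. All function names, features, and thresholds here are hypothetical illustrations, not the paper's model:

```python
import math

def predict_next(track):
    # Constant-velocity extrapolation from the last two positions.
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (2 * x1 - x0, 2 * y1 - y0)

def maneuver_feature(track):
    # Mean absolute heading change: a crude maneuverability score that
    # separates straight flyers from agile, zigzagging ones.
    headings = [math.atan2(y1 - y0, x1 - x0)
                for (x0, y0), (x1, y1) in zip(track, track[1:])]
    turns = [abs(b - a) for a, b in zip(headings, headings[1:])]
    return sum(turns) / len(turns)

def online_profile(features, new_cluster_dist=0.3):
    # DP-means-flavored incremental clustering: open a new profile whenever
    # a feature is far from every existing centroid, so previously unseen
    # object types are admitted on the fly.
    centroids, labels = [], []
    for f in features:
        if centroids:
            d, k = min((abs(f - c), k) for k, c in enumerate(centroids))
        else:
            d, k = float("inf"), -1
        if d > new_cluster_dist:
            centroids.append(f)
            k = len(centroids) - 1
        labels.append(k)
    return labels
```

A straight track and a zigzag track then land in different profiles, while prediction is a one-liner per track.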

A note on concentration inequality for vector-valued martingales with weak exponential-type tails

Title A note on concentration inequality for vector-valued martingales with weak exponential-type tails
Authors Chris Junchi Li
Abstract We present novel martingale concentration inequalities for martingale differences with finite Orlicz-$\psi_\alpha$ norms. Such martingale differences with weak exponential-type tails arise in many statistical applications and can have tails heavier than sub-exponential distributions. In the one-dimensional case, we prove in general that for a sequence of scalar-valued supermartingale differences, the tail bound depends solely on the sum of squared Orlicz-$\psi_\alpha$ norms instead of the maximal Orlicz-$\psi_\alpha$ norm, generalizing the results of Lesigne & Volný (2001) and Fan et al. (2012). In the multidimensional case, using a dimension reduction lemma proposed by Kallenberg & Sztencel (1991), we show that essentially the same concentration tail bound holds for vector-valued martingale difference sequences.
Tasks Dimensionality Reduction
Published 2018-09-06
URL https://arxiv.org/abs/1809.02495v3
PDF https://arxiv.org/pdf/1809.02495v3.pdf
PWC https://paperswithcode.com/paper/a-note-on-concentration-inequality-for-vector
Repo
Framework
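For context, the Orlicz-$\psi_\alpha$ norm in its standard convention, together with the aggregate quantity the abstract says controls the tail (schematic; the precise constants and the bound itself are in the paper):

```latex
\|X\|_{\psi_\alpha} = \inf\left\{ c > 0 \;:\; \mathbb{E}\,\exp\!\left(\frac{|X|^\alpha}{c^\alpha}\right) \le 2 \right\},
\qquad
V = \sum_{i=1}^{n} \|\xi_i\|_{\psi_\alpha}^{2},
```

so that the tail bound for $\sum_i \xi_i$ scales with $V$ rather than with $n \max_i \|\xi_i\|_{\psi_\alpha}^{2}$.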

BCR-Net: a neural network based on the nonstandard wavelet form

Title BCR-Net: a neural network based on the nonstandard wavelet form
Authors Yuwei Fan, Cindy Orozco Bohorquez, Lexing Ying
Abstract This paper proposes a novel neural network architecture inspired by the nonstandard form proposed by Beylkin, Coifman, and Rokhlin in [Communications on Pure and Applied Mathematics, 44(2), 141-183]. The nonstandard form is a highly effective wavelet-based compression scheme for linear integral operators. In this work, we first represent the matrix-vector product algorithm of the nonstandard form as a linear neural network where every scale of the multiresolution computation is carried out by a locally connected linear sub-network. In order to address nonlinear problems, we propose an extension, called BCR-Net, by replacing each linear sub-network with a deeper and more powerful nonlinear one. Numerical results demonstrate the efficiency of the new architecture by approximating nonlinear maps that arise in homogenization theory and stochastic computation.
Tasks
Published 2018-10-20
URL http://arxiv.org/abs/1810.08754v1
PDF http://arxiv.org/pdf/1810.08754v1.pdf
PWC https://paperswithcode.com/paper/bcr-net-a-neural-network-based-on-the
Repo
Framework
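The multiresolution matrix-vector structure that BCR-Net mimics can be sketched with a Haar split in which each scale's "sub-network" is reduced to a single scalar multiplier. This is purely illustrative: the actual architecture uses locally connected linear sub-networks, replaced by deeper nonlinear ones in BCR-Net.

```python
def haar_split(v):
    # One level of the Haar transform (up to scaling):
    # local averages s and local details d.
    s = [(a + b) / 2 for a, b in zip(v[0::2], v[1::2])]
    d = [(a - b) / 2 for a, b in zip(v[0::2], v[1::2])]
    return s, d

def bcr_like_apply(v, levels, band):
    # Apply a toy operator scale by scale: act locally on the detail
    # coefficients, recurse on the averages, then reconstruct.
    if levels == 0 or len(v) < 2:
        return [band * x for x in v]
    s, d = haar_split(v)
    d = [band * x for x in d]                  # per-scale local action
    s = bcr_like_apply(s, levels - 1, band)    # coarser scales
    out = []
    for a, b in zip(s, d):
        out.extend([a + b, a - b])             # inverse Haar step
    return out
```

With `band=1.0` the scheme reconstructs the input exactly, confirming the scale-by-scale decomposition loses nothing; a learned network replaces the scalar with trainable local maps.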

A Unified Implicit Dialog Framework for Conversational Search

Title A Unified Implicit Dialog Framework for Conversational Search
Authors Song Feng, R. Chulaka Gunasekara, Sunil Shashidhara, Kshitij P. Fadnis, Lazaros C. Polymenakos
Abstract We propose a unified Implicit Dialog framework for goal-oriented, information-seeking tasks in Conversational Search applications. It aims to enable dialog interactions with domain data without relying on explicitly encoded rules, instead utilizing the underlying data representation to build the components required for dialog interaction, which we refer to as Implicit Dialog in this work. The proposed framework consists of a pipeline of end-to-end trainable modules. A centralized knowledge representation is used to semantically ground multiple dialog modules. An associated set of tools is integrated with the framework to gather end users’ input for continuous improvement of the system. The goal is to facilitate the development of conversational systems by identifying the components and the data that can be adapted and reused across many end-user applications. We demonstrate our approach by creating conversational agents for several independent domains.
Tasks
Published 2018-02-12
URL http://arxiv.org/abs/1802.04358v1
PDF http://arxiv.org/pdf/1802.04358v1.pdf
PWC https://paperswithcode.com/paper/a-unified-implicit-dialog-framework-for
Repo
Framework

Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling

Title Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling
Authors Yunzhe Tao, Qi Sun, Qiang Du, Wei Liu
Abstract Nonlocal neural networks have been proposed and shown to be effective in several computer vision tasks, where the nonlocal operations can directly capture long-range dependencies in the feature space. In this paper, we study the nature of diffusion and damping effect of nonlocal networks by doing spectrum analysis on the weight matrices of the well-trained networks, and then propose a new formulation of the nonlocal block. The new block not only learns the nonlocal interactions but also has stable dynamics, thus allowing deeper nonlocal structures. Moreover, we interpret our formulation from the general nonlocal modeling perspective, where we make connections between the proposed nonlocal network and other nonlocal models, such as nonlocal diffusion process and Markov jump process.
Tasks
Published 2018-06-02
URL http://arxiv.org/abs/1806.00681v4
PDF http://arxiv.org/pdf/1806.00681v4.pdf
PWC https://paperswithcode.com/paper/nonlocal-neural-networks-nonlocal-diffusion
Repo
Framework
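The basic nonlocal operation the paper analyzes can be sketched as a softmax-weighted aggregation over all positions plus a residual connection. The `scale` damping below is a hypothetical stand-in for the stability-oriented reformulation, not the paper's exact block:

```python
import math

def nonlocal_block(x, scale=1.0):
    # x: list of feature vectors (one per position). Affinity
    # f(xi, xj) = exp(xi . xj), normalized over j (embedded-Gaussian form).
    n, dim = len(x), len(x[0])
    out = []
    for i in range(n):
        logits = [sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
        m = max(logits)
        w = [math.exp(l - m) for l in logits]      # stable softmax weights
        z = sum(w)
        agg = [sum(w[j] * x[j][k] for j in range(n)) / z for k in range(dim)]
        # Residual connection; `scale` damps the nonlocal update.
        out.append([xi_k + scale * a_k for xi_k, a_k in zip(x[i], agg)])
    return out
```

Because every output position aggregates over every input position, long-range dependencies are captured in a single layer, which is what the spectral analysis in the paper probes.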

Aggregation using input-output trade-off

Title Aggregation using input-output trade-off
Authors Aurélie Fischer, Mathilde Mougeot
Abstract In this paper, we introduce a new learning strategy based on a seminal idea of Mojirsheibani (1999, 2000, 2002a, 2002b), who proposed a smart method for combining several classifiers, relying on a consensus notion. In many aggregation methods, the prediction for a new observation x is computed by building a linear or convex combination over a collection of basic estimators r1(x), …, rm(x) previously calibrated using a training data set. Mojirsheibani proposed to compute the prediction associated with a new observation by combining selected outputs of the training examples. The output of a training example is selected if some kind of consensus is observed: the predictions computed for the training example with the different machines have to be “similar” to the prediction for the new observation. This approach has recently been extended to the context of regression in Biau et al. (2016). In the original scheme, the agreement condition is required to hold for all individual estimators, which appears inadequate if there is one bad initial estimator. In practice, a few disagreements are allowed; for establishing the theoretical results, the proportion of estimators satisfying the condition is required to tend to 1. In this paper, we propose an alternative procedure, mixing the previous consensus ideas on the predictions with the Euclidean distance computed between entries. This may be seen as an alternative approach allowing one to reduce the effect of a possibly bad estimator in the initial list, using a constraint on the inputs. We prove the consistency of our strategy in classification and in regression. We also provide numerical experiments on simulated and real data to illustrate the benefits of this new aggregation method. On the whole, our practical study shows that our method may perform much better than the original combination technique and, in particular, exhibit far less variance. We also show on simulated examples that this procedure mixing inputs and outputs remains robust to high-dimensional inputs.
Tasks
Published 2018-03-08
URL http://arxiv.org/abs/1803.03166v1
PDF http://arxiv.org/pdf/1803.03166v1.pdf
PWC https://paperswithcode.com/paper/aggregation-using-input-output-trade-off
Repo
Framework
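The consensus rule, including the relaxed agreement fraction and the paper's additional input-distance constraint, can be sketched for 1-D regression as follows. Parameter names and default values are illustrative, not taken from the paper:

```python
def consensus_predict(x, train_X, train_Y, machines,
                      eps=0.6, min_agree=1.0, input_radius=None):
    # Combine machine outputs a la Mojirsheibani: average train_Y[i] over
    # training points whose machine predictions agree (within eps) with the
    # predictions at x for at least a fraction min_agree of the machines,
    # optionally also requiring the inputs themselves to be close.
    preds_x = [m(x) for m in machines]
    selected = []
    for xi, yi in zip(train_X, train_Y):
        agree = sum(abs(m(xi) - px) <= eps for m, px in zip(machines, preds_x))
        if agree / len(machines) < min_agree:
            continue  # too many disagreements: drop this training point
        if input_radius is not None and abs(xi - x) > input_radius:
            continue  # the input-side constraint proposed in the paper
        selected.append(yi)
    return sum(selected) / len(selected) if selected else None
```

With two near-identical machines fitted to y = x, predicting at x = 2.0 averages the training responses whose machine outputs sit within eps of those at 2.0.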

Probabilistic Meta-Representations Of Neural Networks

Title Probabilistic Meta-Representations Of Neural Networks
Authors Theofanis Karaletsos, Peter Dayan, Zoubin Ghahramani
Abstract Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in which units in the network are represented by latent variables, and the weights between units are drawn conditionally on the values of the collection of those variables. This allows rich correlations between related weights, and can be seen as realizing a function prior with a Bayesian complexity regularizer ensuring simple solutions. We illustrate the resulting meta-representations and representations, elucidating the power of this prior.
Tasks
Published 2018-10-01
URL http://arxiv.org/abs/1810.00555v1
PDF http://arxiv.org/pdf/1810.00555v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-meta-representations-of-neural
Repo
Framework
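The conditional weight prior can be sketched by giving each unit a latent code and drawing each weight around a function of the two codes it connects. The dot-product mean and Gaussian noise below are illustrative choices, not the paper's parameterization:

```python
import random

def sample_weights(z_in, z_out, noise=0.1, rng=random):
    # Each weight W[o][i] is drawn around mean f(z_in[i], z_out[o]); here f
    # is a dot product. Weights that share a unit are therefore correlated
    # through that unit's latent code, unlike fully independent priors.
    W = []
    for zo in z_out:
        row = []
        for zi in z_in:
            mean = sum(a * b for a, b in zip(zi, zo))
            row.append(rng.gauss(mean, noise))
        W.append(row)
    return W
```

Learning then targets a posterior over the per-unit latents rather than over every weight independently, which is where the complexity regularization comes from.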

Constructing Deep Neural Networks by Bayesian Network Structure Learning

Title Constructing Deep Neural Networks by Bayesian Network Structure Learning
Authors Raanan Y. Rohekar, Shami Nisimov, Yaniv Gurwicz, Guy Koren, Gal Novik
Abstract We introduce a principled approach for unsupervised structure learning of deep neural networks. We propose a new interpretation for depth and inter-layer connectivity where conditional independencies in the input distribution are encoded hierarchically in the network structure. Thus, the depth of the network is determined inherently. The proposed method casts the problem of neural network structure learning as a problem of Bayesian network structure learning. Then, instead of directly learning the discriminative structure, it learns a generative graph, constructs its stochastic inverse, and then constructs a discriminative graph. We prove that conditional-dependency relations among the latent variables in the generative graph are preserved in the class-conditional discriminative graph. We demonstrate on image classification benchmarks that the deepest layers (convolutional and dense) of common networks can be replaced by significantly smaller learned structures, while maintaining classification accuracy—state-of-the-art on tested benchmarks. Our structure learning algorithm requires a small computational cost and runs efficiently on a standard desktop CPU.
Tasks Image Classification
Published 2018-06-24
URL http://arxiv.org/abs/1806.09141v3
PDF http://arxiv.org/pdf/1806.09141v3.pdf
PWC https://paperswithcode.com/paper/constructing-deep-neural-networks-by-bayesian
Repo
Framework

Real-time Cardiovascular MR with Spatio-temporal Artifact Suppression using Deep Learning - Proof of Concept in Congenital Heart Disease

Title Real-time Cardiovascular MR with Spatio-temporal Artifact Suppression using Deep Learning - Proof of Concept in Congenital Heart Disease
Authors Andreas Hauptmann, Simon Arridge, Felix Lucka, Vivek Muthurangu, Jennifer A. Steeden
Abstract PURPOSE: Real-time assessment of ventricular volumes requires high acceleration factors. Residual convolutional neural networks (CNN) have shown potential for removing artifacts caused by data undersampling. In this study we investigated the effect of different radial sampling patterns on the accuracy of a CNN. We also acquired actual real-time undersampled radial data in patients with congenital heart disease (CHD), and compared CNN reconstruction to Compressed Sensing (CS). METHODS: A 3D (2D plus time) CNN architecture was developed, and trained using 2276 gold-standard paired 3D data sets, with 14x radial undersampling. Four sampling schemes were tested, using 169 previously unseen 3D ‘synthetic’ test data sets. Actual real-time tiny Golden Angle (tGA) radial SSFP data was acquired in 10 new patients (122 3D data sets), and reconstructed using the 3D CNN as well as a CS algorithm, GRASP. RESULTS: Sampling pattern was shown to be important for image quality, and accurate visualisation of cardiac structures. For actual real-time data, overall reconstruction time with CNN (including creation of aliased images) was shown to be more than 5x faster than GRASP. Additionally, CNN image quality and accuracy of biventricular volumes was observed to be superior to GRASP for the same raw data. CONCLUSION: This paper has demonstrated the potential for the use of a 3D CNN for deep de-aliasing of real-time radial data, within the clinical setting. Clinical measures of ventricular volumes using real-time data with CNN reconstruction are not statistically significantly different from the gold-standard, cardiac-gated, breath-hold (BH) techniques.
Tasks De-aliasing
Published 2018-03-14
URL http://arxiv.org/abs/1803.05192v3
PDF http://arxiv.org/pdf/1803.05192v3.pdf
PWC https://paperswithcode.com/paper/real-time-cardiovascular-mr-with-spatio
Repo
Framework

Video Storytelling: Textual Summaries for Events

Title Video Storytelling: Textual Summaries for Events
Authors Junnan Li, Yongkang Wong, Qi Zhao, Mohan S. Kankanhalli
Abstract Bridging vision and natural language is a longstanding goal in computer vision and multimedia research. While earlier works focus on generating a single-sentence description for visual content, recent works have studied paragraph generation. In this work, we introduce the problem of video storytelling, which aims at generating coherent and succinct stories for long videos. Video storytelling introduces new challenges, mainly due to the diversity of the story and the length and complexity of the video. We propose novel methods to address the challenges. First, we propose a context-aware framework for multimodal embedding learning, where we design a Residual Bidirectional Recurrent Neural Network to leverage contextual information from past and future. Second, we propose a Narrator model to discover the underlying storyline. The Narrator is formulated as a reinforcement learning agent which is trained by directly optimizing the textual metric of the generated story. We evaluate our method on the Video Story dataset, a new dataset that we have collected to enable the study. We compare our method with multiple state-of-the-art baselines, and show that our method achieves better performance, in terms of quantitative measures and user study.
Tasks
Published 2018-07-25
URL https://arxiv.org/abs/1807.09418v2
PDF https://arxiv.org/pdf/1807.09418v2.pdf
PWC https://paperswithcode.com/paper/video-storytelling
Repo
Framework

Deterministic Fitting of Multiple Structures using Iterative MaxFS with Inlier Scale Estimation and Subset Updating

Title Deterministic Fitting of Multiple Structures using Iterative MaxFS with Inlier Scale Estimation and Subset Updating
Authors Kwang Hee Lee, Sang Wook Lee
Abstract We present an efficient deterministic hypothesis generation algorithm for robust fitting of multiple structures based on the maximum feasible subsystem (MaxFS) framework. Despite its advantages, a global optimization method such as MaxFS has two main limitations for geometric model fitting. First, its performance is strongly influenced by the user-specified inlier scale. Second, it is computationally inefficient for large data. The presented MaxFS-based algorithm iteratively estimates model parameters and inlier scale, and also overcomes the second limitation by reducing data for the MaxFS problem. Further, it generates hypotheses only with top-n ranked subsets based on matching scores and data fitting residuals. This reduction of data for the MaxFS problem makes the algorithm computationally realistic. Our method, called iterative MaxFS with inlier scale estimation and subset updating (IMaxFS-ISE-SU) in this paper, performs hypothesis generation and fitting alternately until all of the true structures are found. The IMaxFS-ISE-SU algorithm generates substantially more reliable hypotheses than random sampling-based methods, especially as (pseudo-)outlier ratios increase. Experimental results demonstrate that our method can generate more reliable and consistent hypotheses than random sampling-based methods for estimating multiple structures from data with many outliers.
Tasks
Published 2018-07-24
URL http://arxiv.org/abs/1807.09210v1
PDF http://arxiv.org/pdf/1807.09210v1.pdf
PWC https://paperswithcode.com/paper/deterministic-fitting-of-multiple-structures
Repo
Framework
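The alternation between inlier-scale estimation and refitting can be illustrated on 1-D line fitting. This plain least-squares/median-absolute-residual loop is a deliberately simplified stand-in for the paper's MaxFS-based hypothesis generation, with hypothetical thresholds:

```python
import statistics

def fit_line_ls(points):
    # Ordinary least-squares fit of y = a*x + b.
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

def iterative_scale_fit(points, rounds=5, kappa=2.5):
    # Alternate (i) robust inlier-scale estimation from the median absolute
    # residual and (ii) refitting on the current inlier subset.
    inliers = points
    for _ in range(rounds):
        a, b = fit_line_ls(inliers)
        res = [abs(y - (a * x + b)) for x, y in points]
        scale = statistics.median(res) / 0.6745 + 1e-12  # MAD -> sigma
        inliers = [p for p, r in zip(points, res) if r <= kappa * scale]
    return fit_line_ls(inliers)
```

Ten collinear points plus one gross outlier recover the true line once the outlier is shed in the first round.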

Communication Efficient Parallel Algorithms for Optimization on Manifolds

Title Communication Efficient Parallel Algorithms for Optimization on Manifolds
Authors Bayan Saparbayeva, Michael Minyi Zhang, Lizhen Lin
Abstract The last decade has witnessed an explosion in the development of models, theory and computational algorithms for “big data” analysis. In particular, distributed computing has served as a natural and dominating paradigm for statistical inference. However, the existing literature on parallel inference almost exclusively focuses on Euclidean data and parameters. While this assumption is valid for many applications, it is increasingly more common to encounter problems where the data or the parameters lie on a non-Euclidean space, like a manifold for example. Our work aims to fill a critical gap in the literature by generalizing parallel inference algorithms to optimization on manifolds. We show that our proposed algorithm is both communication efficient and carries theoretical convergence guarantees. In addition, we demonstrate the performance of our algorithm on the estimation of Fréchet means on simulated spherical data and on the low-rank matrix completion problem over Grassmann manifolds applied to the Netflix prize data set.
Tasks Low-Rank Matrix Completion, Matrix Completion
Published 2018-10-26
URL http://arxiv.org/abs/1810.11155v3
PDF http://arxiv.org/pdf/1810.11155v3.pdf
PWC https://paperswithcode.com/paper/communication-efficient-parallel-algorithms
Repo
Framework
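As a concrete instance of optimization on a manifold, the Fréchet mean on the circle can be computed by intrinsic gradient descent using the log map. This is a single-worker sketch; the paper's contribution is the communication-efficient parallel scheme built around updates of this kind:

```python
import math

def frechet_mean_circle(angles, steps=100, lr=0.5):
    # Minimize the sum of squared geodesic (angular) distances on S^1.
    mu = angles[0]
    for _ in range(steps):
        # Log map on the circle: signed angular difference wrapped to (-pi, pi].
        grads = [math.atan2(math.sin(a - mu), math.cos(a - mu)) for a in angles]
        mu += lr * sum(grads) / len(grads)      # Riemannian gradient step
    return math.atan2(math.sin(mu), math.cos(mu))  # wrap the result
```

Unlike a naive average of angle values, the log-map formulation stays correct when the data straddle the wrap-around point.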

Parallel Transport Unfolding: A Connection-based Manifold Learning Approach

Title Parallel Transport Unfolding: A Connection-based Manifold Learning Approach
Authors Max Budninskiy, Glorian Yin, Leman Feng, Yiying Tong, Mathieu Desbrun
Abstract Manifold learning offers nonlinear dimensionality reduction of high-dimensional datasets. In this paper, we bring geometry processing to bear on manifold learning by introducing a new approach based on metric connection for generating a quasi-isometric, low-dimensional mapping from a sparse and irregular sampling of an arbitrary manifold embedded in a high-dimensional space. Geodesic distances of discrete paths over the input pointset are evaluated through “parallel transport unfolding” (PTU) to offer robustness to poor sampling and arbitrary topology. Our new geometric procedure exhibits the same strong resilience to noise as one of the staples of manifold learning, the Isomap algorithm, as it also exploits all pairwise geodesic distances to compute a low-dimensional embedding. While Isomap is limited to geodesically-convex sampled domains, parallel transport unfolding does not suffer from this crippling limitation, resulting in an improved robustness to irregularity and voids in the sampling. Moreover, it involves only simple linear algebra, significantly improves the accuracy of all pairwise geodesic distance approximations, and has the same computational complexity as Isomap. Finally, we show that our connection-based distance estimation can be used for faster variants of Isomap such as L-Isomap.
Tasks Dimensionality Reduction
Published 2018-06-23
URL http://arxiv.org/abs/1806.09039v2
PDF http://arxiv.org/pdf/1806.09039v2.pdf
PWC https://paperswithcode.com/paper/parallel-transport-unfolding-a-connection
Repo
Framework
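Both Isomap and PTU start from graph-based geodesic distance estimates on a neighborhood graph; that shared ingredient is easy to sketch (PTU's parallel-transport correction itself is not reproduced here):

```python
import heapq

def knn_graph(points, k):
    # Symmetric k-nearest-neighbor graph with Euclidean edge lengths.
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    n = len(points)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        nbrs = sorted(range(n), key=lambda j: dist(points[i], points[j]))[1:k + 1]
        for j in nbrs:
            d = dist(points[i], points[j])
            adj[i][j] = d
            adj[j][i] = d
    return adj

def geodesic_distances(adj, src):
    # Dijkstra: shortest paths in the graph approximate manifold geodesics.
    dist = {v: float("inf") for v in adj}
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist
```

These all-pairs graph distances are exactly the quantities whose accuracy PTU improves via parallel transport before the final low-dimensional embedding step.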

Sim-to-Real Reinforcement Learning for Deformable Object Manipulation

Title Sim-to-Real Reinforcement Learning for Deformable Object Manipulation
Authors Jan Matas, Stephen James, Andrew J. Davison
Abstract We have seen much recent progress in rigid object manipulation, but interaction with deformable objects has notably lagged behind. Due to the large configuration space of deformable objects, solutions using traditional modelling approaches require significant engineering work. Perhaps, then, bypassing the need for explicit modelling and instead learning the control in an end-to-end manner serves as a better approach? Despite the growing interest in end-to-end robot learning approaches, only a small amount of work has focused on their applicability to deformable object manipulation. Moreover, due to the large amount of data needed to learn these end-to-end solutions, an emerging trend is to learn control policies in simulation and then transfer them to the real world. To date, no work has explored whether it is possible to learn and transfer deformable object policies. We believe that if sim-to-real methods are to be employed further, then it should be possible to learn to interact with a wide variety of objects, and not only rigid objects. In this work, we use a combination of state-of-the-art deep reinforcement learning algorithms to solve the problem of manipulating deformable objects (specifically cloth). We evaluate our approach on three tasks: folding a towel up to a mark, folding a face towel diagonally, and draping a piece of cloth over a hanger. Our agents are fully trained in simulation with domain randomisation, and then successfully deployed in the real world without having seen any real deformable objects.
Tasks Deformable Object Manipulation
Published 2018-06-20
URL http://arxiv.org/abs/1806.07851v2
PDF http://arxiv.org/pdf/1806.07851v2.pdf
PWC https://paperswithcode.com/paper/sim-to-real-reinforcement-learning-for
Repo
Framework
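The domain-randomisation recipe amounts to resampling simulator parameters every episode; a minimal sketch, in which all parameter names and ranges are hypothetical rather than taken from the paper:

```python
import random

def randomized_env_params(rng):
    # Draw a fresh simulation configuration per episode so the policy is
    # trained across the whole spread rather than on one fixed simulator.
    return {
        "cloth_stiffness": rng.uniform(0.5, 2.0),
        "friction": rng.uniform(0.2, 1.0),
        "camera_jitter": rng.gauss(0.0, 0.01),
        "light_intensity": rng.uniform(0.7, 1.3),
    }

# One randomized configuration per training episode:
rng = random.Random(0)
episode_params = [randomized_env_params(rng) for _ in range(3)]
```

If the real world looks like just another draw from this distribution, a policy trained across the spread transfers without ever seeing real data.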