July 28, 2019

3154 words 15 mins read

Paper Group ANR 391


Semiparametric spectral modeling of the Drosophila connectome

Title Semiparametric spectral modeling of the Drosophila connectome
Authors Carey E. Priebe, Youngser Park, Minh Tang, Avanti Athreya, Vince Lyzinski, Joshua T. Vogelstein, Yichen Qin, Ben Cocanougher, Katharina Eichler, Marta Zlatic, Albert Cardona
Abstract We present semiparametric spectral modeling of the complete larval Drosophila mushroom body connectome. Motivated by a thorough exploratory data analysis of the network via Gaussian mixture modeling (GMM) in the adjacency spectral embedding (ASE) representation space, we introduce the latent structure model (LSM) for network modeling and inference. LSM is a generalization of the stochastic block model (SBM) and a special case of the random dot product graph (RDPG) latent position model, and is amenable to semiparametric GMM in the ASE representation space. The resulting connectome code derived via semiparametric GMM composed with ASE captures latent connectome structure and elucidates biologically relevant neuronal properties.
Tasks
Published 2017-05-09
URL http://arxiv.org/abs/1705.03297v1
PDF http://arxiv.org/pdf/1705.03297v1.pdf
PWC https://paperswithcode.com/paper/semiparametric-spectral-modeling-of-the
Repo
Framework
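A minimal sketch of the ASE-plus-GMM pipeline the abstract describes: embed the graph via the top-$d$ scaled singular vectors of the adjacency matrix, then cluster the embedded nodes with a Gaussian mixture. The toy graph, embedding dimension, and component count below are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def adjacency_spectral_embedding(A, d):
    """Embed node i as row i of U_d S_d^{1/2} from the top-d SVD of A."""
    U, S, _ = np.linalg.svd(A)
    return U[:, :d] * np.sqrt(S[:d])

# Toy symmetric adjacency matrix standing in for the connectome graph.
rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(100, 100))
A = np.triu(A, 1)
A = (A + A.T).astype(float)

X = adjacency_spectral_embedding(A, d=6)
gmm = GaussianMixture(n_components=4, covariance_type="full").fit(X)
labels = gmm.predict(X)  # candidate latent-structure clusters
```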

Scalable and Incremental Learning of Gaussian Mixture Models

Title Scalable and Incremental Learning of Gaussian Mixture Models
Authors Rafael Pinto, Paulo Engel
Abstract This work presents a fast and scalable algorithm for incremental learning of Gaussian mixture models. By performing rank-one updates on its precision matrices and determinants, its asymptotic time complexity is $\mathcal{O}(NKD^2)$ for $N$ data points, $K$ Gaussian components and $D$ dimensions. The resulting algorithm can be applied to high-dimensional tasks, and this is confirmed by applying it to the MNIST and CIFAR-10 classification datasets. Additionally, to show the algorithm's applicability to function approximation and control tasks, it is applied to three reinforcement learning tasks and its data efficiency is evaluated.
Tasks
Published 2017-01-14
URL http://arxiv.org/abs/1701.03940v1
PDF http://arxiv.org/pdf/1701.03940v1.pdf
PWC https://paperswithcode.com/paper/scalable-and-incremental-learning-of-gaussian
Repo
Framework
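The rank-one machinery behind the quoted complexity is standard: if a component's covariance receives a rank-one correction $C' = C + \alpha uu^T$, its precision and log-determinant can be refreshed in $O(D^2)$ via the Sherman-Morrison formula and the matrix determinant lemma. A hedged sketch (the update weight $\alpha$ and the usage example are illustrative, not the paper's exact update rule):

```python
import numpy as np

def rank_one_update(P, log_det, u, alpha):
    """Update P = C^{-1} and log|C| when C becomes C + alpha * u u^T."""
    Pu = P @ u
    denom = 1.0 + alpha * (u @ Pu)
    P_new = P - (alpha / denom) * np.outer(Pu, Pu)  # Sherman-Morrison
    log_det_new = log_det + np.log(denom)           # matrix determinant lemma
    return P_new, log_det_new

# Usage: start from the identity covariance and fold in one data direction.
D = 5
P, log_det = np.eye(D), 0.0
u = np.random.default_rng(1).standard_normal(D)
P, log_det = rank_one_update(P, log_det, u, alpha=0.1)
```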

Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality

Title Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality
Authors Woon Sang Cho, Mengdi Wang
Abstract We develop a parameterized Primal-Dual $\pi$ Learning method based on deep neural networks for Markov decision processes with large state spaces and off-policy reinforcement learning. In contrast to the popular Q-learning and actor-critic methods that are based on successive approximations to the nonlinear Bellman equation, our method makes primal-dual updates to the policy and value functions utilizing the fundamental linear Bellman duality. Naive parametrization of the primal-dual $\pi$ learning method using deep neural networks would encounter two major challenges: (1) each update requires computing a probability distribution over the state space and is intractable; (2) the iterates are unstable since the parameterized Lagrangian function is no longer linear. We address these challenges by proposing a relaxed Lagrangian formulation with a regularization penalty using the advantage function. We show that the dual policy update step in our method is equivalent to the policy gradient update in the actor-critic method in a special case, while the value updates differ substantially. The main advantage of the primal-dual $\pi$ learning method is that the value and policy updates are closely coupled through the Bellman duality and are therefore more informative. Experiments on a simple cart-pole problem show that the algorithm significantly outperforms the one-step temporal-difference actor-critic method, the most relevant benchmark method to compare with. We believe that the primal-dual updates to the value and policy functions would expedite the learning process. The proposed method might open the door to more efficient algorithms and sharper theoretical analysis.
Tasks Q-Learning
Published 2017-12-07
URL http://arxiv.org/abs/1712.02467v1
PDF http://arxiv.org/pdf/1712.02467v1.pdf
PWC https://paperswithcode.com/paper/deep-primal-dual-reinforcement-learning
Repo
Framework
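The linear Bellman duality the abstract builds on can be seen in tabular form: the optimal value solves a linear program whose Lagrangian couples values (primal) with state-action occupancies (dual), and a saddle-point iteration descends on one while ascending on the other. A hedged tabular sketch, with a crude simplex projection standing in for the paper's relaxed, regularized, deep-network formulation:

```python
import numpy as np

S, A, gamma, eta = 4, 2, 0.9, 0.05
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = next-state distribution
r = rng.random((S, A))                      # rewards
a0 = np.full(S, 1.0 / S)                    # initial-state distribution

v = np.zeros(S)                             # primal: state values
mu = np.full((S, A), 1.0 / (S * A))         # dual: state-action occupancies

for _ in range(500):
    gap = r + gamma * np.einsum('saj,j->sa', P, v) - v[:, None]  # Bellman residuals
    grad_v = (1 - gamma) * a0 + gamma * np.einsum('saj,sa->j', P, mu) - mu.sum(axis=1)
    v -= eta * grad_v                          # primal descent on values
    mu = np.clip(mu + eta * gap, 1e-12, None)  # dual ascent on occupancies
    mu /= mu.sum()                             # crude projection back to the simplex

policy = mu / mu.sum(axis=1, keepdims=True)    # pi(a|s) recovered from occupancies
```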

Integration of Preferences in Decomposition Multi-Objective Optimization

Title Integration of Preferences in Decomposition Multi-Objective Optimization
Authors Ke Li, Kalyanmoy Deb, Xin Yao
Abstract Most existing studies on evolutionary multi-objective optimization focus on approximating the whole Pareto-optimal front. Nevertheless, rather than the whole front, which demands too many points (especially in a high-dimensional space), the decision maker might only be interested in a partial region, called the region of interest. In this case, solutions outside this region can be noise to the decision-making procedure. Even worse, there is no guarantee that we can find the preferred solutions when tackling problems with complicated properties or a large number of objectives. In this paper, we develop a systematic way to incorporate the decision maker's preference information into decomposition-based evolutionary multi-objective optimization methods. Generally speaking, our basic idea is a non-uniform mapping scheme by which the originally uniformly distributed reference points on a canonical simplex are mapped to new positions close to the aspiration level vector specified by the decision maker. By this means, we are able to steer the search process towards the region of interest either directly or in an interactive manner, and also handle a large number of objectives. Meanwhile, the boundary solutions can be approximated given the decision maker's requirements. Furthermore, the extent of the region of interest is intuitively understandable and controllable in a closed form. Extensive experiments, both proof-of-principle and on a variety of problems with 3 to 10 objectives, fully demonstrate the effectiveness of our proposed method for approximating the preferred solutions in the region of interest.
Tasks Decision Making
Published 2017-01-20
URL http://arxiv.org/abs/1701.05935v1
PDF http://arxiv.org/pdf/1701.05935v1.pdf
PWC https://paperswithcode.com/paper/integration-of-preferences-in-decomposition
Repo
Framework
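The non-uniform mapping idea can be sketched directly: pull uniformly distributed reference points on the simplex toward the decision maker's aspiration level vector, with a scalar controlling the extent of the region of interest. The convex-combination rule below is an illustrative assumption, not the paper's exact closed form.

```python
import numpy as np

def map_toward_aspiration(W, aspiration, extent=0.3):
    """Shrink simplex points W (n x m) toward `aspiration`.

    Smaller `extent` concentrates the points, i.e. a tighter region of interest.
    """
    asp = aspiration / aspiration.sum()  # normalize onto the simplex
    return asp + extent * (W - asp)      # convex combination: stays on the simplex

# Usage: 3-objective weight vectors pulled toward an aspiration of (0.7, 0.2, 0.1).
rng = np.random.default_rng(0)
W = rng.dirichlet(np.ones(3), size=10)
W_roi = map_toward_aspiration(W, np.array([0.7, 0.2, 0.1]), extent=0.2)
```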

TensorLog: Deep Learning Meets Probabilistic DBs

Title TensorLog: Deep Learning Meets Probabilistic DBs
Authors William W. Cohen, Fan Yang, Kathryn Rivard Mazaitis
Abstract We present an implementation of a probabilistic first-order logic called TensorLog, in which classes of logical queries are compiled into differentiable functions in a neural-network infrastructure such as TensorFlow or Theano. This leads to a close integration of probabilistic logical reasoning with deep-learning infrastructure: in particular, it enables high-performance deep learning frameworks to be used for tuning the parameters of a probabilistic logic. Experimental results show that TensorLog scales to problems involving hundreds of thousands of knowledge-base triples and tens of thousands of examples.
Tasks
Published 2017-07-17
URL http://arxiv.org/abs/1707.05390v1
PDF http://arxiv.org/pdf/1707.05390v1.pdf
PWC https://paperswithcode.com/paper/tensorlog-deep-learning-meets-probabilistic
Repo
Framework
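The core compilation idea is easy to illustrate: a knowledge-base relation becomes a matrix over entities, and answering a chain rule such as grandparent(X, Y) :- parent(X, Z), parent(Z, Y) becomes a differentiable sequence of matrix-vector products. The tiny knowledge base below is an illustrative assumption, not TensorLog's API.

```python
import numpy as np

entities = ["alice", "bob", "carol"]
idx = {e: i for i, e in enumerate(entities)}

# The parent relation as a (dense, for clarity) matrix over entities.
parent = np.zeros((3, 3))
parent[idx["alice"], idx["bob"]] = 1.0  # parent(alice, bob)
parent[idx["bob"], idx["carol"]] = 1.0  # parent(bob, carol)

def grandparent_of(x):
    """Distribution over Y for grandparent(x, Y): two relation hops."""
    v = np.zeros(3)
    v[idx[x]] = 1.0
    return v @ parent @ parent  # compose parent with itself

print(grandparent_of("alice"))  # all mass lands on carol
```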

MR image reconstruction using deep density priors

Title MR image reconstruction using deep density priors
Authors Kerem C. Tezcan, Christian F. Baumgartner, Roger Luechinger, Klaas P. Pruessmann, Ender Konukoglu
Abstract Algorithms for Magnetic Resonance (MR) image reconstruction from undersampled measurements exploit prior information to compensate for missing k-space data. Deep learning (DL) provides a powerful framework for extracting such information from existing image datasets, through learning, and then using it for reconstruction. Leveraging this, recent methods employed DL to learn mappings from undersampled to fully sampled images using paired datasets of undersampled and corresponding fully sampled images, integrating prior knowledge implicitly. In this article, we propose an alternative approach that learns the probability distribution of fully sampled MR images using unsupervised DL, specifically Variational Autoencoders (VAE), and uses this as an explicit prior term in reconstruction, completely decoupling the encoding operation from the prior. The resulting reconstruction algorithm enjoys a powerful image prior to compensate for missing k-space data without requiring paired datasets for training, and without being prone to the associated sensitivities, such as deviations between the undersampling patterns used at training and test time, or different coil settings. We evaluated the proposed method with T1-weighted images from a publicly available dataset, multi-coil complex images acquired from healthy volunteers (N=8), and images with white matter lesions. The proposed algorithm, using the VAE prior, produced visually high-quality reconstructions and achieved low RMSE values, outperforming most of the alternative methods on the same dataset. On multi-coil complex data, the algorithm yielded accurate magnitude and phase reconstruction results. In the experiments on images with white matter lesions, the method faithfully reconstructed the lesions. Keywords: Reconstruction, MRI, prior probability, machine learning, deep learning, unsupervised learning, density estimation
Tasks Density Estimation, Image Reconstruction
Published 2017-11-30
URL http://arxiv.org/abs/1711.11386v4
PDF http://arxiv.org/pdf/1711.11386v4.pdf
PWC https://paperswithcode.com/paper/mr-image-reconstruction-using-deep-density
Repo
Framework
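The decoupling the abstract emphasizes suggests a simple reconstruction loop: alternate gradient steps on a data-consistency term $\|Ex - y\|^2$ and on a learned prior term supplied by the pretrained VAE. In the sketch below, `vae_prior_grad` is a placeholder for the gradient of the VAE's approximate log-density; the step sizes and initialization are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def reconstruct(y, E, vae_prior_grad, steps=100, lam=0.1, lr=0.05):
    """y: measured k-space data; E: undersampled Fourier encoding matrix."""
    x = E.conj().T @ y                        # zero-filled initialization
    for _ in range(steps):
        grad_data = E.conj().T @ (E @ x - y)  # data-consistency gradient
        grad_prior = -vae_prior_grad(x)       # push toward high prior density
        x = x - lr * (grad_data + lam * grad_prior)
    return x
```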

Distributed Statistical Machine Learning in Adversarial Settings: Byzantine Gradient Descent

Title Distributed Statistical Machine Learning in Adversarial Settings: Byzantine Gradient Descent
Authors Yudong Chen, Lili Su, Jiaming Xu
Abstract We consider the problem of distributed statistical machine learning in adversarial settings, where some unknown and time-varying subset of working machines may be compromised and behave arbitrarily to prevent an accurate model from being learned. This setting captures the potential adversarial attacks faced by Federated Learning – a modern machine learning paradigm proposed by Google researchers that has been intensively studied for ensuring user privacy. Formally, we focus on a distributed system consisting of a parameter server and $m$ working machines. Each working machine keeps $N/m$ data samples, where $N$ is the total number of samples. The goal is to collectively learn the underlying true model parameter of dimension $d$. In classical batch gradient descent methods, the gradients reported to the server by the working machines are aggregated via simple averaging, which is vulnerable to a single Byzantine failure. In this paper, we propose a Byzantine gradient descent method based on the geometric median of means of the gradients. We show that our method can tolerate $q \le (m-1)/2$ Byzantine failures, and the parameter estimate converges in $O(\log N)$ rounds with an estimation error of $\sqrt{d(2q+1)/N}$, hence approaching the optimal error rate $\sqrt{d/N}$ of the centralized and failure-free setting. The total computational complexity of our algorithm is $O((Nd/m) \log N)$ at each working machine and $O(md + kd \log^3 N)$ at the central server, and the total communication cost is $O(m d \log N)$. We further provide an application of our general results to the linear regression problem. A key challenge in the above problem is that Byzantine failures create arbitrary and unspecified dependency among the iterations and the aggregated gradients. We prove that the aggregated gradient converges uniformly to the true gradient function.
Tasks
Published 2017-05-16
URL http://arxiv.org/abs/1705.05491v2
PDF http://arxiv.org/pdf/1705.05491v2.pdf
PWC https://paperswithcode.com/paper/distributed-statistical-machine-learning-in
Repo
Framework
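The server-side aggregation rule is concrete enough to sketch: partition the $m$ reported gradients into groups, average within each group, then take the geometric median of the group means via Weiszfeld's algorithm. The group count and iteration budget below are illustrative assumptions.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    """Weiszfeld iteration: the point minimizing summed Euclidean distances."""
    z = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - z, axis=1)
        w = 1.0 / np.maximum(d, eps)
        z = (w[:, None] * points).sum(axis=0) / w.sum()
    return z

def byzantine_robust_aggregate(grads, k):
    """grads: (m, d) worker gradients, up to q of them arbitrary."""
    groups = np.array_split(grads, k)                 # means-of-groups step
    means = np.stack([g.mean(axis=0) for g in groups])
    return geometric_median(means)                    # robust center
```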

Deep Reinforcement Learning for Sepsis Treatment

Title Deep Reinforcement Learning for Sepsis Treatment
Authors Aniruddh Raghu, Matthieu Komorowski, Imran Ahmed, Leo Celi, Peter Szolovits, Marzyeh Ghassemi
Abstract Sepsis is a leading cause of mortality in intensive care units and costs hospitals billions annually. Treating a septic patient is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. In this work, we propose an approach to deduce treatment policies for septic patients by using continuous state-space models and deep reinforcement learning. Our model learns clinically interpretable treatment policies, similar in important aspects to the treatment policies of physicians. The learned policies could be used to aid intensive care clinicians in medical decision making and improve the likelihood of patient survival.
Tasks Decision Making
Published 2017-11-27
URL http://arxiv.org/abs/1711.09602v1
PDF http://arxiv.org/pdf/1711.09602v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-sepsis
Repo
Framework

The RepEval 2017 Shared Task: Multi-Genre Natural Language Inference with Sentence Representations

Title The RepEval 2017 Shared Task: Multi-Genre Natural Language Inference with Sentence Representations
Authors Nikita Nangia, Adina Williams, Angeliki Lazaridou, Samuel R. Bowman
Abstract This paper presents the results of the RepEval 2017 Shared Task, which evaluated neural network sentence representation learning models on the Multi-Genre Natural Language Inference corpus (MultiNLI) recently introduced by Williams et al. (2017). All of the five participating teams beat the bidirectional LSTM (BiLSTM) and continuous bag of words baselines reported in Williams et al. The best single model used stacked BiLSTMs with residual connections to extract sentence features and reached 74.5% accuracy on the genre-matched test set. Surprisingly, the results of the competition were fairly consistent across the genre-matched and genre-mismatched test sets, and across subsets of the test data representing a variety of linguistic phenomena, suggesting that all of the submitted systems learned reasonably domain-independent representations for sentence meaning.
Tasks Natural Language Inference, Representation Learning
Published 2017-07-25
URL http://arxiv.org/abs/1707.08172v1
PDF http://arxiv.org/pdf/1707.08172v1.pdf
PWC https://paperswithcode.com/paper/the-repeval-2017-shared-task-multi-genre
Repo
Framework
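The best single model's shape is easy to reproduce in outline: stacked bidirectional LSTMs whose later layers add residual connections, max-pooled over time into a fixed sentence vector. A hedged PyTorch sketch with illustrative layer sizes, not the winning team's exact configuration:

```python
import torch
import torch.nn as nn

class ResidualBiLSTMEncoder(nn.Module):
    def __init__(self, emb_dim=300, hidden=300, layers=3):
        super().__init__()
        self.stack = nn.ModuleList(
            nn.LSTM(emb_dim if i == 0 else 2 * hidden, hidden,
                    batch_first=True, bidirectional=True)
            for i in range(layers))

    def forward(self, x):                   # x: (batch, seq, emb_dim)
        h = x
        for i, lstm in enumerate(self.stack):
            out, _ = lstm(h)
            # Residual from the 2nd layer on, where widths match (2 * hidden).
            h = out if i == 0 else out + h
        return h.max(dim=1).values          # max-pool over time

enc = ResidualBiLSTMEncoder()
vec = enc(torch.randn(2, 7, 300))           # (2, 600) sentence vectors
```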

EDEN: Evolutionary Deep Networks for Efficient Machine Learning

Title EDEN: Evolutionary Deep Networks for Efficient Machine Learning
Authors Emmanuel Dufourq, Bruce A. Bassett
Abstract Deep neural networks continue to show improved performance with increasing depth, an encouraging trend that implies an explosion in the possible permutations of network architectures and hyperparameters for which there is little intuitive guidance. To address this increasing complexity, we propose Evolutionary DEep Networks (EDEN), a computationally efficient neuro-evolutionary algorithm which interfaces to any deep neural network platform, such as TensorFlow. We show that EDEN evolves simple yet successful architectures built from embedding, 1D and 2D convolutional, max pooling and fully connected layers along with their hyperparameters. Evaluation of EDEN across seven image and sentiment classification datasets shows that it reliably finds good networks – and in three cases achieves state-of-the-art results – even on a single GPU, in just 6-24 hours. Our study provides a first attempt at applying neuro-evolution to the creation of 1D convolutional networks for sentiment analysis, including the optimisation of the embedding layer.
Tasks Sentiment Analysis
Published 2017-09-26
URL http://arxiv.org/abs/1709.09161v1
PDF http://arxiv.org/pdf/1709.09161v1.pdf
PWC https://paperswithcode.com/paper/eden-evolutionary-deep-networks-for-efficient
Repo
Framework
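In outline, the neuro-evolutionary loop is: encode a network as a genome (a layer list plus hyperparameters), score genomes with a brief training run, keep the fittest, and mutate them. The mutation set and fitness stub below are illustrative assumptions; EDEN's actual operators and its TensorFlow interface are richer.

```python
import random

LAYER_CHOICES = ["conv1d", "conv2d", "maxpool", "dense", "dropout"]

def random_genome():
    n = random.randint(2, 6)
    return {"layers": [random.choice(LAYER_CHOICES) for _ in range(n)],
            "lr": 10 ** random.uniform(-4, -2)}

def mutate(g):
    child = {"layers": list(g["layers"]), "lr": g["lr"] * random.uniform(0.5, 2.0)}
    i = random.randrange(len(child["layers"]))
    child["layers"][i] = random.choice(LAYER_CHOICES)  # point mutation
    return child

def fitness(g):
    # Placeholder: EDEN would build the network and train it briefly here.
    return -abs(len(g["layers"]) - 4) + random.random()

population = [random_genome() for _ in range(10)]
for generation in range(5):
    population.sort(key=fitness, reverse=True)
    survivors = population[:5]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(5)]
best = max(population, key=fitness)
```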

Unbiased Online Recurrent Optimization

Title Unbiased Online Recurrent Optimization
Authors Corentin Tallec, Yann Ollivier
Abstract The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows for online learning of general recurrent computational graphs such as recurrent network models. It works in a streaming fashion and avoids backtracking through past activations and inputs. UORO is computationally as costly as Truncated Backpropagation Through Time (truncated BPTT), a widespread algorithm for online learning of recurrent networks. UORO is a modification of NoBackTrack that bypasses the need for model sparsity and makes implementation easy in current deep learning frameworks, even for complex models. Like NoBackTrack, UORO provides unbiased gradient estimates; unbiasedness is the core hypothesis in stochastic gradient descent theory, without which convergence to a local optimum is not guaranteed. In contrast, truncated BPTT does not provide this property, leading to possible divergence. On synthetic tasks where truncated BPTT is shown to diverge, UORO converges. For instance, when a parameter has a positive short-term but negative long-term influence, truncated BPTT diverges unless the truncation span is significantly longer than the intrinsic temporal range of the interactions, while UORO performs well thanks to the unbiasedness of its gradients.
Tasks
Published 2017-02-16
URL http://arxiv.org/abs/1702.05043v3
PDF http://arxiv.org/pdf/1702.05043v3.pdf
PWC https://paperswithcode.com/paper/unbiased-online-recurrent-optimization
Repo
Framework
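UORO's trick is to carry a rank-one, randomly-signed approximation of the full state-parameter Jacobian forward in time. A hedged sketch on a tiny tanh RNN with only the recurrent matrix treated as learned; the toy task, dimensions, and epsilon smoothing are illustrative assumptions (see the paper for the exact normalization):

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 4, 1e-7
W = 0.1 * rng.standard_normal((n, n))
s = np.zeros(n)
s_tilde = np.zeros(n)        # state-shaped half of the rank-one pair
W_tilde = np.zeros((n, n))   # parameter-shaped half; their product tracks ds/dW

for t in range(200):
    x = rng.standard_normal(n)
    s_new = np.tanh(W @ s + x)
    J_s = (1 - s_new**2)[:, None] * W           # dF/ds at this step
    nu = rng.choice([-1.0, 1.0], size=n)        # fresh random signs
    fwd = J_s @ s_tilde                         # propagate old approximation
    direct = np.outer(nu * (1 - s_new**2), s)   # nu^T dF/dW for this step
    rho0 = np.sqrt((np.linalg.norm(W_tilde) + eps) / (np.linalg.norm(fwd) + eps))
    rho1 = np.sqrt((np.linalg.norm(direct) + eps) / (np.linalg.norm(nu) + eps))
    s_tilde = rho0 * fwd + rho1 * nu            # variance-balanced recombination
    W_tilde = W_tilde / rho0 + direct / rho1
    s = s_new

# Unbiased online estimate of dL/dW for, e.g., L = ||s||^2:
grad_W = (2 * s @ s_tilde) * W_tilde
```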

Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks

Title Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks
Authors Hao Zhang, Shizhen Xu, Graham Neubig, Wei Dai, Qirong Ho, Guangwen Yang, Eric P. Xing
Abstract Recent deep learning (DL) models have moved beyond static network architectures to dynamic ones, handling data where the network structure changes with every example, such as sequences of variable lengths, trees, and graphs. Existing dataflow-based programming models for DL—both static and dynamic declaration—either cannot readily express these dynamic models, or are inefficient due to repeated dataflow graph construction and processing, and difficulties in batched execution. We present Cavs, a vertex-centric programming interface and optimized system implementation for dynamic DL models. Cavs represents dynamic network structure as a static vertex function $\mathcal{F}$ and a dynamic instance-specific graph $\mathcal{G}$, and performs backpropagation by scheduling the execution of $\mathcal{F}$ following the dependencies in $\mathcal{G}$. Cavs bypasses expensive graph construction and preprocessing overhead, allows for the use of static graph optimization techniques on pre-defined operations in $\mathcal{F}$, and naturally exposes batched execution opportunities over different graphs. Experiments comparing Cavs to two state-of-the-art frameworks for dynamic NNs (TensorFlow Fold and DyNet) demonstrate the efficacy of this approach: Cavs achieves a near one order of magnitude speedup on training of various dynamic NN architectures, and ablations demonstrate the contribution of our proposed batching and memory management strategies.
Tasks Graph Construction
Published 2017-12-11
URL http://arxiv.org/abs/1712.04048v1
PDF http://arxiv.org/pdf/1712.04048v1.pdf
PWC https://paperswithcode.com/paper/cavs-a-vertex-centric-programming-interface
Repo
Framework
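The separation of the static vertex function $\mathcal{F}$ from the dynamic graph $\mathcal{G}$ can be sketched as a scheduler: compute each vertex's depth in its instance graph, then evaluate same-depth vertices together, which is where the batching opportunities arise. The toy $\mathcal{F}$ and graph encoding below are illustrative assumptions, not the Cavs API.

```python
from collections import defaultdict

def schedule_by_depth(children):
    """children[v] = list of v's input vertices; returns same-depth batches."""
    depth = {}
    def d(v):
        if v not in depth:
            depth[v] = 0 if not children[v] else 1 + max(d(c) for c in children[v])
        return depth[v]
    buckets = defaultdict(list)
    for v in children:
        buckets[d(v)].append(v)
    return [buckets[k] for k in sorted(buckets)]

def evaluate(children, leaf_value, F):
    states = {}
    for batch in schedule_by_depth(children):  # each batch: one fused launch
        for v in batch:
            inputs = [states[c] for c in children[v]]
            states[v] = F(inputs) if inputs else leaf_value(v)
    return states

# Usage: a tiny tree; F sums child states, leaves start at 1.
tree = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
print(evaluate(tree, lambda v: 1, sum)[0])     # root state -> 3
```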

Explaining Trained Neural Networks with Semantic Web Technologies: First Steps

Title Explaining Trained Neural Networks with Semantic Web Technologies: First Steps
Authors Md Kamruzzaman Sarker, Ning Xie, Derek Doran, Michael Raymer, Pascal Hitzler
Abstract The ever-increasing prevalence of publicly available structured data on the World Wide Web enables new applications in a variety of domains. In this paper, we provide a conceptual approach that leverages such data in order to explain the input-output behavior of trained artificial neural networks. We apply existing Semantic Web technologies in order to provide an experimental proof of concept.
Tasks
Published 2017-10-11
URL http://arxiv.org/abs/1710.04324v1
PDF http://arxiv.org/pdf/1710.04324v1.pdf
PWC https://paperswithcode.com/paper/explaining-trained-neural-networks-with
Repo
Framework

Saliency Detection for Stereoscopic Images Based on Depth Confidence Analysis and Multiple Cues Fusion

Title Saliency Detection for Stereoscopic Images Based on Depth Confidence Analysis and Multiple Cues Fusion
Authors Runmin Cong, Jianjun Lei, Changqing Zhang, Qingming Huang, Xiaochun Cao, Chunping Hou
Abstract Stereoscopic perception is an important part of the human visual system that allows the brain to perceive depth. However, depth information has not been well explored in existing saliency detection models. In this letter, a novel saliency detection method for stereoscopic images is proposed. First, we propose a measure to evaluate the reliability of the depth map, and use it to reduce the influence of a poor depth map on saliency detection. Then, the input image is represented as a graph, and the depth information is introduced into graph construction. After that, a new definition of compactness using color and depth cues is put forward to compute the compactness saliency map. To compensate for the detection errors of compactness saliency when the salient regions have appearances similar to the background, a foreground saliency map is calculated based on a depth-refined foreground seed selection mechanism and multiple-cue contrast. Finally, these two saliency maps are integrated into a final saliency map through a weighted-sum method according to their importance. Experiments on two publicly available stereo datasets demonstrate that the proposed method performs better than 10 other state-of-the-art approaches.
Tasks Graph Construction, Saliency Detection
Published 2017-10-14
URL http://arxiv.org/abs/1710.05174v1
PDF http://arxiv.org/pdf/1710.05174v1.pdf
PWC https://paperswithcode.com/paper/saliency-detection-for-stereoscopic-images
Repo
Framework
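The final fusion step is simple enough to sketch: the compactness and foreground saliency maps are combined by a weighted sum and renormalized. The fixed weight below is an illustrative assumption; the paper sets the weights according to each map's importance.

```python
import numpy as np

def fuse_saliency(compactness_map, foreground_map, w_c=0.5):
    """Weighted-sum fusion of two saliency maps, rescaled to [0, 1]."""
    s = w_c * compactness_map + (1.0 - w_c) * foreground_map
    return (s - s.min()) / (s.max() - s.min() + 1e-8)
```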

GraphMatch: Efficient Large-Scale Graph Construction for Structure from Motion

Title GraphMatch: Efficient Large-Scale Graph Construction for Structure from Motion
Authors Qiaodong Cui, Victor Fragoso, Chris Sweeney, Pradeep Sen
Abstract We present GraphMatch, an approximate yet efficient method for building the matching graph for large-scale structure-from-motion (SfM) pipelines. Unlike modern SfM pipelines that use vocabulary (Voc.) trees to quickly build the matching graph and avoid a costly brute-force search of matching image pairs, GraphMatch does not require an expensive offline pre-processing phase to construct a Voc. tree. Instead, GraphMatch leverages two priors that can predict which image pairs are likely to match, thereby making the matching process for SfM much more efficient. The first is a score computed from the distance between the Fisher vectors of any two images. The second prior is based on the graph distance between vertices in the underlying matching graph. GraphMatch combines these two priors into an iterative “sample-and-propagate” scheme similar to the PatchMatch algorithm. Its sampling stage uses Fisher similarity priors to guide the search for matching image pairs, while its propagation stage explores neighbors of matched pairs to find new ones with a high image similarity score. Our experiments show that GraphMatch finds the most image pairs compared to competing approximate methods while at the same time being the most efficient.
Tasks Graph Construction
Published 2017-10-04
URL http://arxiv.org/abs/1710.01602v1
PDF http://arxiv.org/pdf/1710.01602v1.pdf
PWC https://paperswithcode.com/paper/graphmatch-efficient-large-scale-graph
Repo
Framework
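The sample-and-propagate loop can be sketched compactly: score candidate pairs with the Fisher-vector prior, verify the most promising ones, then propagate by testing neighbors of already-matched pairs. Below, `verify_match` is a placeholder for expensive feature matching and geometric verification, and the dot-product similarity is an illustrative assumption.

```python
from collections import defaultdict
import numpy as np

def graphmatch(fishers, verify_match, rounds=5, k_sample=4):
    """fishers: (n, d) Fisher vectors; verify_match(i, j) -> bool (expensive)."""
    n = len(fishers)
    sim = fishers @ fishers.T                      # cheap similarity prior
    matched = set()
    for _ in range(rounds):
        # Sampling stage: test each image against its most similar candidates.
        for i in range(n):
            for j in np.argsort(-sim[i])[:k_sample]:
                j = int(j)
                if i < j and verify_match(i, j):
                    matched.add((i, j))
        # Propagation stage: neighbors of matched pairs are promising.
        neighbors = defaultdict(set)
        for i, j in matched:
            neighbors[i].add(j)
            neighbors[j].add(i)
        for i in list(neighbors):
            for j in list(neighbors[i]):
                for k in list(neighbors[j]):
                    a, b = min(i, k), max(i, k)
                    if a != b and (a, b) not in matched and verify_match(a, b):
                        matched.add((a, b))
    return matched
```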