July 27, 2019

Paper Group ANR 554

Identification of Probabilities. Multi-Task Learning of Keyphrase Boundary Classification. Automatic Estimation of Ice Bottom Surfaces from Radar Imagery. Recurrent Segmentation for Variable Computational Budgets. Cosmological model discrimination with Deep Learning. Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions. Graph Co …

Identification of Probabilities

Title Identification of Probabilities
Authors Paul M. B. Vitanyi, Nick Chater
Abstract Within psychology, neuroscience and artificial intelligence, there has been increasing interest in the proposal that the brain builds probabilistic models of sensory and linguistic input: that is, to infer a probabilistic model from a sample. The practical problems of such inference are substantial: the brain has limited data and restricted computational resources. But there is a more fundamental question: is the problem of inferring a probabilistic model from a sample possible even in principle? We explore this question and find some surprisingly positive and general results. First, for a broad class of probability distributions characterised by computability restrictions, we specify a learning algorithm that will almost surely identify a probability distribution in the limit given a finite i.i.d. sample of sufficient but unknown length. This is similarly shown to hold for sequences generated by a broad class of Markov chains, subject to computability assumptions. The technical tool is the strong law of large numbers. Second, for a large class of dependent sequences, we specify an algorithm which identifies in the limit a computable measure for which the sequence is typical, in the sense of Martin-Löf (there may be more than one such measure). The technical tool is the theory of Kolmogorov complexity. We analyse the associated predictions in both cases. We also briefly consider special cases, including language learning, and wider theoretical implications for psychology.
Tasks
Published 2017-08-04
URL http://arxiv.org/abs/1708.01611v1
PDF http://arxiv.org/pdf/1708.01611v1.pdf
PWC https://paperswithcode.com/paper/identification-of-probabilities
Repo
Framework
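
The first result in the abstract above leans on the strong law of large numbers: empirical frequencies converge to the true probabilities, so a learner's guess eventually stops changing. The following is a minimal toy sketch of that idea only, assuming a small finite list of Bernoulli parameters as the hypothesis class (the paper treats a far broader computable class). The names `identify_bernoulli`, `draw_sample` and `candidates` are hypothetical and do not come from the paper.

```python
import random


def identify_bernoulli(sample, candidates):
    """Return the candidate parameter closest to the empirical frequency of 1s."""
    freq = sum(sample) / len(sample)
    return min(candidates, key=lambda p: abs(p - freq))


def draw_sample(p, n, seed=0):
    """Draw n i.i.d. Bernoulli(p) outcomes with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


# Hypothetical hypothesis class: four candidate coin biases.
candidates = [0.1, 0.3, 0.5, 0.7]
sample = draw_sample(p=0.3, n=5000)
guess = identify_bernoulli(sample, candidates)
print(guess)
```

With a sample this large, the empirical frequency sits far closer to the true parameter than to any other candidate, so the guess stabilises on the generating value.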

Multi-Task Learning of Keyphrase Boundary Classification

Title Multi-Task Learning of Keyphrase Boundary Classification
Authors Isabelle Augenstein, Anders Søgaard
Abstract Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to predefined types. Although important in practice, this task is so far underexplored, partly due to the lack of labelled data. To overcome this, we explore several auxiliary tasks, including semantic super-sense tagging and identification of multi-word expressions, and cast the task as a multi-task learning problem with deep recurrent neural networks. Our multi-task models perform significantly better than previous state of the art approaches on two scientific KBC datasets, particularly for long keyphrases.
Tasks Multi-Task Learning
Published 2017-04-03
URL http://arxiv.org/abs/1704.00514v2
PDF http://arxiv.org/pdf/1704.00514v2.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-of-keyphrase-boundary
Repo
Framework

Automatic Estimation of Ice Bottom Surfaces from Radar Imagery

Title Automatic Estimation of Ice Bottom Surfaces from Radar Imagery
Authors Mingze Xu, David J Crandall, Geoffrey C Fox, John D Paden
Abstract Ground-penetrating radar on planes and satellites now makes it practical to collect 3D observations of the subsurface structure of the polar ice sheets, providing crucial data for understanding and tracking global climate change. But converting these noisy readings into useful observations is generally done by hand, which is impractical at a continental scale. In this paper, we propose a computer vision-based technique for extracting 3D ice-bottom surfaces by viewing the task as an inference problem on a probabilistic graphical model. We first generate a seed surface subject to a set of constraints, and then incorporate additional sources of evidence to refine it via discrete energy minimization. We evaluate the performance of the tracking algorithm on 7 topographic sequences (each with over 3000 radar images) collected from the Canadian Arctic Archipelago with respect to human-labeled ground truth.
Tasks
Published 2017-12-21
URL http://arxiv.org/abs/1712.07758v1
PDF http://arxiv.org/pdf/1712.07758v1.pdf
PWC https://paperswithcode.com/paper/automatic-estimation-of-ice-bottom-surfaces
Repo
Framework
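
The abstract above describes refining a surface by discrete energy minimization. As a rough illustration of that kind of objective, the sketch below tracks a single surface line through one 2D slice with dynamic programming, trading a per-pixel cost against a smoothness penalty. This is an assumption-laden stand-in: the abstract does not specify the paper's graphical model or solver, and `track_surface`, `unary` and `smooth_weight` are hypothetical names.

```python
def track_surface(unary, smooth_weight):
    """Pick one row per column minimizing unary cost plus a smoothness penalty.

    unary[c][r] is a (hypothetical) cost of placing the surface at row r
    in column c; jumps between adjacent columns cost smooth_weight * |dr|.
    """
    n_cols, n_rows = len(unary), len(unary[0])
    best = [list(unary[0])]  # best[c][r]: minimal energy of a path ending at (c, r)
    back = []                # back[c - 1][r]: predecessor row for column c
    for c in range(1, n_cols):
        col_costs, col_back = [], []
        for r in range(n_rows):
            cost, prev = min(
                (best[-1][p] + smooth_weight * abs(r - p), p)
                for p in range(n_rows)
            )
            col_costs.append(unary[c][r] + cost)
            col_back.append(prev)
        best.append(col_costs)
        back.append(col_back)

    # Backtrack from the cheapest row in the last column.
    r = min(range(n_rows), key=lambda i: best[-1][i])
    energy = best[-1][r]
    path = [r]
    for col_back in reversed(back):
        r = col_back[r]
        path.append(r)
    path.reverse()
    return path, energy


costs = [[5, 0, 5], [5, 0, 5], [0, 5, 5]]
print(track_surface(costs, smooth_weight=1))
print(track_surface(costs, smooth_weight=10))
```

A small smoothness weight lets the path follow the cheapest pixels, while a large one keeps it flat even where the local evidence disagrees.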

Recurrent Segmentation for Variable Computational Budgets

Title Recurrent Segmentation for Variable Computational Budgets
Authors Lane McIntosh, Niru Maheswaranathan, David Sussillo, Jonathon Shlens
Abstract State-of-the-art systems for semantic image segmentation use feed-forward pipelines with fixed computational costs. Building an image segmentation system that works across a range of computational budgets is challenging and time-intensive as new architectures must be designed and trained for every computational setting. To address this problem we develop a recurrent neural network that successively improves prediction quality with each iteration. Importantly, the RNN may be deployed across a range of computational budgets by merely running the model for a variable number of iterations. We find that this architecture is uniquely suited for efficiently segmenting videos. By exploiting the segmentation of past frames, the RNN can perform video segmentation at similar quality but reduced computational cost compared to state-of-the-art image segmentation methods. When applied to static images in the PASCAL VOC 2012 and Cityscapes segmentation datasets, the RNN traces out a speed-accuracy curve that saturates near the performance of state-of-the-art segmentation methods.
Tasks Semantic Segmentation, Video Semantic Segmentation
Published 2017-11-28
URL http://arxiv.org/abs/1711.10151v2
PDF http://arxiv.org/pdf/1711.10151v2.pdf
PWC https://paperswithcode.com/paper/recurrent-segmentation-for-variable
Repo
Framework

Cosmological model discrimination with Deep Learning

Title Cosmological model discrimination with Deep Learning
Authors Jorit Schmelzle, Aurelien Lucchi, Tomasz Kacprzak, Adam Amara, Raphael Sgier, Alexandre Réfrégier, Thomas Hofmann
Abstract We demonstrate the potential of Deep Learning methods for measurements of cosmological parameters from density fields, focusing on the extraction of non-Gaussian information. We consider weak lensing mass maps as our dataset. We aim for our method to be able to distinguish between five models, which were chosen to lie along the $\sigma_8$ - $\Omega_m$ degeneracy, and have nearly the same two-point statistics. We design and implement a Deep Convolutional Neural Network (DCNN) which learns the relation between five cosmological models and the mass maps they generate. We develop a new training strategy which ensures the good performance of the network for high levels of noise. We compare the performance of this approach to commonly used non-Gaussian statistics, namely the skewness and kurtosis of the convergence maps. We find that our implementation of DCNN outperforms the skewness and kurtosis statistics, especially for high noise levels. The network maintains the mean discrimination efficiency greater than $85\%$ even for noise levels corresponding to ground based lensing observations, while the other statistics perform worse in this setting, achieving efficiency less than $70\%$. This demonstrates the ability of CNN-based methods to efficiently break the $\sigma_8$ - $\Omega_m$ degeneracy with weak lensing mass maps alone. We discuss the potential of this method to be applied to the analysis of real weak lensing data and other datasets.
Tasks
Published 2017-07-17
URL http://arxiv.org/abs/1707.05167v2
PDF http://arxiv.org/pdf/1707.05167v2.pdf
PWC https://paperswithcode.com/paper/cosmological-model-discrimination-with-deep
Repo
Framework
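
The baseline statistics named in the abstract above, the skewness and kurtosis of a convergence map, can be computed from the central moments of the pixel values. The sketch below is a plain-Python illustration under the assumption that a map is given as a list of rows; `skewness_and_kurtosis` and `kappa_map` are hypothetical names, and excess kurtosis (kurtosis minus 3) is used so that a Gaussian field scores zero.

```python
def skewness_and_kurtosis(kappa_map):
    """Return (skewness, excess kurtosis) of all pixel values in a 2D map."""
    pixels = [v for row in kappa_map for v in row]
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in pixels) / n
    m3 = sum((v - mean) ** 3 for v in pixels) / n
    m4 = sum((v - mean) ** 4 for v in pixels) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


symmetric_map = [[-2.0, -1.0, 0.0, 1.0, 2.0]]
skewed_map = [[0.0, 0.0], [0.0, 4.0]]
print(skewness_and_kurtosis(symmetric_map))
print(skewness_and_kurtosis(skewed_map))
```

Both statistics reduce a whole map to a single number each, which is one reason a network that sees the full map may retain more non-Gaussian information.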

Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions

Title Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions
Authors Nadav Cohen, Ronen Tamari, Amnon Shashua
Abstract The driving force behind deep networks is their ability to compactly represent rich classes of functions. The primary notion for formally reasoning about this phenomenon is expressive efficiency, which refers to a situation where one network must grow unfeasibly large in order to realize (or approximate) functions of another. To date, expressive efficiency analyses focused on the architectural feature of depth, showing that deep networks are representationally superior to shallow ones. In this paper we study the expressive efficiency brought forth by connectivity, motivated by the observation that modern networks interconnect their layers in elaborate ways. We focus on dilated convolutional networks, a family of deep models delivering state of the art performance in sequence processing tasks. By introducing and analyzing the concept of mixed tensor decompositions, we prove that interconnecting dilated convolutional networks can lead to expressive efficiency. In particular, we show that even a single connection between intermediate layers can already lead to an almost quadratic gap, which in large-scale settings typically makes the difference between a model that is practical and one that is not. Empirical evaluation demonstrates how the expressive efficiency of connectivity, similarly to that of depth, translates into gains in accuracy. This leads us to believe that expressive efficiency may serve a key role in the development of new tools for deep network design.
Tasks
Published 2017-03-20
URL http://arxiv.org/abs/1703.06846v3
PDF http://arxiv.org/pdf/1703.06846v3.pdf
PWC https://paperswithcode.com/paper/boosting-dilated-convolutional-networks-with
Repo
Framework

Graph Convolution: A High-Order and Adaptive Approach

Title Graph Convolution: A High-Order and Adaptive Approach
Authors Zhenpeng Zhou, Xiaocheng Li
Abstract In this paper, we presented a novel convolutional neural network framework for graph modeling, with the introduction of two new modules specially designed for graph-structured data: the $k$-th order convolution operator and the adaptive filtering module. Importantly, our framework of High-order and Adaptive Graph Convolutional Network (HA-GCN) is a general-purpose architecture that fits various applications, both node-centric and graph-centric, as well as graph generative models. We conducted extensive experiments demonstrating the advantages of our framework. In particular, our HA-GCN outperforms the state-of-the-art models on node classification and molecule property prediction tasks. It also generates 32% more real molecules on the molecule generation task. Both results will significantly benefit real-world applications such as material design and drug screening.
Tasks Node Classification
Published 2017-06-29
URL http://arxiv.org/abs/1706.09916v2
PDF http://arxiv.org/pdf/1706.09916v2.pdf
PWC https://paperswithcode.com/paper/graph-convolution-a-high-order-and-adaptive
Repo
Framework
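
To make the idea of a higher-order graph operator concrete, the sketch below aggregates features over every node within k hops instead of only direct neighbours. It is a generic illustration, not the HA-GCN operator: the exact definition in the paper is not given in the abstract, there are no learned weights here, and `k_hop_reachability` and `k_hop_mean` are hypothetical names.

```python
def k_hop_reachability(adj, k):
    """Binary matrix whose (i, j) entry is 1 if j is within k hops of i (self included)."""
    n = len(adj)
    reach = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        nxt = [row[:] for row in reach]
        for i in range(n):
            for m in range(n):
                if reach[i][m]:
                    # Extend every already-reachable node m by one more edge.
                    for j in range(n):
                        if adj[m][j]:
                            nxt[i][j] = 1
        reach = nxt
    return reach


def k_hop_mean(adj, features, k):
    """Average a scalar feature over each node's k-hop neighbourhood."""
    out = []
    for row in k_hop_reachability(adj, k):
        members = [features[j] for j, flag in enumerate(row) if flag]
        out.append(sum(members) / len(members))
    return out


# Path graph 0 - 1 - 2 - 3 with one scalar feature per node.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
features = [0.0, 10.0, 20.0, 30.0]
print(k_hop_mean(adj, features, k=2))
```

Raising k widens each node's receptive field in a single layer, an effect that would otherwise require stacking several first-order layers.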

Bayesian Semisupervised Learning with Deep Generative Models

Title Bayesian Semisupervised Learning with Deep Generative Models
Authors Jonathan Gordon, José Miguel Hernández-Lobato
Abstract Neural network based generative models with discriminative components are a powerful approach for semi-supervised learning. However, these techniques a) cannot account for model uncertainty in the estimation of the model’s discriminative component and b) lack flexibility to capture complex stochastic patterns in the label generation process. To avoid these problems, we first propose to use a discriminative component with stochastic inputs for increased noise flexibility. We show how an efficient Gibbs sampling procedure can marginalize the stochastic inputs when inferring missing labels in this model. Following this, we extend the discriminative component to be fully Bayesian and produce estimates of uncertainty in its parameter values. This opens the door for semi-supervised Bayesian active learning.
Tasks Active Learning
Published 2017-06-29
URL http://arxiv.org/abs/1706.09751v1
PDF http://arxiv.org/pdf/1706.09751v1.pdf
PWC https://paperswithcode.com/paper/bayesian-semisupervised-learning-with-deep
Repo
Framework

Machine learning for graph-based representations of three-dimensional discrete fracture networks

Title Machine learning for graph-based representations of three-dimensional discrete fracture networks
Authors Manuel Valera, Zhengyang Guo, Priscilla Kelly, Sean Matz, Vito Adrian Cantu, Allon G. Percus, Jeffrey D. Hyman, Gowri Srinivasan, Hari S. Viswanathan
Abstract Structural and topological information play a key role in modeling flow and transport through fractured rock in the subsurface. Discrete fracture network (DFN) computational suites such as dfnWorks are designed to simulate flow and transport in such porous media. Flow and transport calculations reveal that a small backbone of fractures exists, where most flow and transport occurs. Restricting the flowing fracture network to this backbone provides a significant reduction in the network’s effective size. However, the particle tracking simulations needed to determine the reduction are computationally intensive. Such methods may be impractical for large systems or for robust uncertainty quantification of fracture networks, where thousands of forward simulations are needed to bound system behavior. In this paper, we develop an alternative network reduction approach to characterizing transport in DFNs, by combining graph theoretical and machine learning methods. We consider a graph representation where nodes signify fractures and edges denote their intersections. Using random forest and support vector machines, we rapidly identify a subnetwork that captures the flow patterns of the full DFN, based primarily on node centrality features in the graph. Our supervised learning techniques train on particle-tracking backbone paths found by dfnWorks, but run in negligible time compared to those simulations. We find that our predictions can reduce the network to approximately 20% of its original size, while still generating breakthrough curves consistent with those of the original network.
Tasks
Published 2017-05-27
URL http://arxiv.org/abs/1705.09866v4
PDF http://arxiv.org/pdf/1705.09866v4.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-graph-based
Repo
Framework
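
The graph representation described in the abstract above (nodes are fractures, edges are intersections) and one simple centrality feature can be sketched as follows. This covers only graph construction and degree centrality; the abstract does not list which centrality features the paper uses, and the classifier stage (random forest, support vector machines) is omitted. The fracture labels and function names are hypothetical.

```python
def build_fracture_graph(intersections):
    """Build an undirected adjacency map from pairs of intersecting fractures."""
    graph = {}
    for a, b in intersections:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other fractures that each fracture intersects."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


# Hypothetical intersection list for a five-fracture network.
intersections = [("f1", "f2"), ("f2", "f3"), ("f2", "f4"), ("f4", "f5")]
graph = build_fracture_graph(intersections)
centrality = degree_centrality(graph)
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality[most_connected])
```

In a supervised setting, per-node values like these would form the feature vector, with labels taken from whether a fracture lies on a backbone found by particle tracking.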

A statistical model for aggregating judgments by incorporating peer predictions

Title A statistical model for aggregating judgments by incorporating peer predictions
Authors John McCoy, Drazen Prelec
Abstract We propose a probabilistic model to aggregate the answers of respondents answering multiple-choice questions. The model does not assume that everyone has access to the same information, and so does not assume that the consensus answer is correct. Instead, it infers the most probable world state, even if only a minority vote for it. Each respondent is modeled as receiving a signal contingent on the actual world state, and as using this signal to both determine their own answer and predict the answers given by others. By incorporating respondents’ predictions of others’ answers, the model infers latent parameters corresponding to the prior over world states and the probability of different signals being received in all possible world states, including counterfactual ones. Unlike other probabilistic models for aggregation, our model applies to both single and multiple questions, in which case it estimates each respondent’s expertise. The model shows good performance, compared to a number of other probabilistic models, on data from seven studies covering different types of expertise.
Tasks
Published 2017-03-14
URL http://arxiv.org/abs/1703.04778v1
PDF http://arxiv.org/pdf/1703.04778v1.pdf
PWC https://paperswithcode.com/paper/a-statistical-model-for-aggregating-judgments
Repo
Framework

Sum-Product Graphical Models

Title Sum-Product Graphical Models
Authors Mattia Desana, Christoph Schnörr
Abstract This paper introduces a new probabilistic architecture called Sum-Product Graphical Model (SPGM). SPGMs combine traits from Sum-Product Networks (SPNs) and Graphical Models (GMs): Like SPNs, SPGMs always enable tractable inference using a class of models that incorporate context specific independence. Like GMs, SPGMs provide a high-level model interpretation in terms of conditional independence assumptions and corresponding factorizations. Thus, the new architecture represents a class of probability distributions that combines, for the first time, the semantics of graphical models with the evaluation efficiency of SPNs. We also propose a novel algorithm for learning both the structure and the parameters of SPGMs. A comparative empirical evaluation demonstrates competitive performances of our approach in density estimation.
Tasks Density Estimation
Published 2017-08-21
URL http://arxiv.org/abs/1708.06438v1
PDF http://arxiv.org/pdf/1708.06438v1.pdf
PWC https://paperswithcode.com/paper/sum-product-graphical-models
Repo
Framework

A Joint 3D-2D based Method for Free Space Detection on Roads

Title A Joint 3D-2D based Method for Free Space Detection on Roads
Authors Suvam Patra, Pranjal Maheshwari, Shashank Yadav, Chetan Arora, Subhashis Banerjee
Abstract In this paper, we address the problem of road segmentation and free space detection in the context of autonomous driving. Traditional methods use either 3-dimensional (3D) cues such as point clouds obtained from LIDAR, RADAR or stereo cameras, or 2-dimensional (2D) cues such as lane markings, road boundaries and object detection. Typical 3D point clouds do not have enough resolution to detect fine differences in height, such as between road and pavement. Image-based 2D cues fail when encountering uneven road textures caused by shadows, potholes, lane markings or road restoration. We propose a novel free road space detection technique combining both 2D and 3D cues. In particular, we use CNN-based road segmentation from 2D images and plane/box fitting on sparse depth data obtained from SLAM as priors to formulate an energy minimization using a conditional random field (CRF) for road pixel classification. While the CNN learns the road texture and is unaffected by depth boundaries, the 3D information helps in overcoming texture-based classification failures. Finally, we use the obtained road segmentation with the 3D depth data from monocular SLAM to detect the free space for navigation purposes. Our experiments on the KITTI odometry dataset, the Camvid dataset, as well as videos captured by us, validate the superiority of the proposed approach over the state of the art.
Tasks Autonomous Driving, Object Detection
Published 2017-11-06
URL http://arxiv.org/abs/1711.02144v3
PDF http://arxiv.org/pdf/1711.02144v3.pdf
PWC https://paperswithcode.com/paper/a-joint-3d-2d-based-method-for-free-space
Repo
Framework

On the (Statistical) Detection of Adversarial Examples

Title On the (Statistical) Detection of Adversarial Examples
Authors Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, Patrick McDaniel
Abstract Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards understanding adversarial examples, we show that they are not drawn from the same distribution as the original data, and can thus be detected using statistical tests. Using this knowledge, we introduce a complementary approach to identify specific inputs that are adversarial. Specifically, we augment our ML model with an additional output, in which the model is trained to classify all adversarial inputs. We evaluate our approach on multiple adversarial example crafting methods (including the fast gradient sign and saliency map methods) with several datasets. The statistical test flags sample sets containing adversarial inputs confidently at sample sizes between 10 and 100 data points. Furthermore, our augmented model either detects adversarial examples as outliers with high accuracy (> 80%) or increases the adversary’s cost - the perturbation added - by more than 150%. In this way, we show that statistical properties of adversarial examples are essential to their detection.
Tasks Intrusion Detection, Malware Classification, Network Intrusion Detection
Published 2017-02-21
URL http://arxiv.org/abs/1702.06280v2
PDF http://arxiv.org/pdf/1702.06280v2.pdf
PWC https://paperswithcode.com/paper/on-the-statistical-detection-of-adversarial
Repo
Framework
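
The abstract above argues that adversarial inputs come from a different distribution than clean data and can therefore be flagged by a two-sample statistical test. The sketch below shows the general shape of such a test as a permutation test on scalar scores. It is a simplified stand-in: the absolute difference of means is an assumed statistic chosen for brevity, not the paper's, and `permutation_test` and the toy samples are hypothetical.

```python
import random


def permutation_test(sample_a, sample_b, n_permutations=200, seed=0):
    """Estimate a p-value for the hypothesis that both samples share a distribution."""
    rng = random.Random(seed)

    def statistic(a, b):
        # Assumed test statistic: absolute difference of sample means.
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = statistic(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


clean_scores = [0.0] * 20
shifted_scores = [1.0] * 20
p_shifted = permutation_test(clean_scores, shifted_scores)
p_same = permutation_test([0.0, 1.0] * 10, [0.0, 1.0] * 10)
print(p_shifted, p_same)
```

A small p-value says the two sample sets are unlikely to share a distribution. Note that it flags a batch as a whole rather than pointing at individual adversarial inputs, which is the gap the abstract's augmented model is meant to fill.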

Patch-based adaptive weighting with segmentation and scale (PAWSS) for visual tracking

Title Patch-based adaptive weighting with segmentation and scale (PAWSS) for visual tracking
Authors Xiaofei Du, Alessio Dore, Danail Stoyanov
Abstract Tracking-by-detection algorithms are widely used for visual tracking; they treat the problem as a classification task in which an object model is updated over time using online learning techniques. In challenging conditions where an object undergoes deformation or scale variations, the update step is prone to include background information in the model appearance or to lack the ability to estimate the scale change, which degrades the performance of the classifier. In this paper, we incorporate a Patch-based Adaptive Weighting with Segmentation and Scale (PAWSS) tracking framework that tackles both the scale and background problems. A simple but effective colour-based segmentation model is used to suppress background information, and multi-scale samples are extracted to enrich the training pool, which allows the tracker to handle both incremental and abrupt scale variations between frames. Experimentally, we evaluate our approach on the online tracking benchmark (OTB) dataset and Visual Object Tracking (VOT) challenge datasets. The results show that our approach outperforms recent state-of-the-art trackers and especially improves the success rate score on the OTB dataset, while on the VOT datasets PAWSS ranks among the top trackers while operating at real-time frame rates.
Tasks Object Tracking, Visual Object Tracking, Visual Tracking
Published 2017-08-03
URL http://arxiv.org/abs/1708.01179v1
PDF http://arxiv.org/pdf/1708.01179v1.pdf
PWC https://paperswithcode.com/paper/patch-based-adaptive-weighting-with
Repo
Framework

Stein Variational Policy Gradient

Title Stein Variational Policy Gradient
Authors Yang Liu, Prajit Ramachandran, Qiang Liu, Jian Peng
Abstract Policy gradient methods have been successfully applied to many complex reinforcement learning problems. However, policy gradient methods suffer from high variance, slow convergence, and inefficient exploration. In this work, we introduce a maximum entropy policy optimization framework which explicitly encourages parameter exploration, and show that this framework can be reduced to a Bayesian inference problem. We then propose a novel Stein variational policy gradient method (SVPG) which combines existing policy gradient methods and a repulsive functional to generate a set of diverse but well-behaved policies. SVPG is robust to initialization and can easily be implemented in a parallel manner. On continuous control problems, we find that implementing SVPG on top of REINFORCE and advantage actor-critic algorithms improves both average return and data efficiency.
Tasks Bayesian Inference, Continuous Control, Policy Gradient Methods
Published 2017-04-07
URL http://arxiv.org/abs/1704.02399v1
PDF http://arxiv.org/pdf/1704.02399v1.pdf
PWC https://paperswithcode.com/paper/stein-variational-policy-gradient
Repo
Framework
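
The "repulsive functional" mentioned in the abstract above is in the spirit of Stein variational gradient descent, where each particle follows a kernel-weighted gradient of the log target plus a kernel-gradient term that pushes particles apart. The sketch below applies that generic update to scalar particles with an RBF kernel and a standard normal target. These are toy assumptions: in SVPG the particles would be policy parameter vectors and the gradient would come from a policy gradient estimator. `svgd_step`, `grad_log_p` and the constants are hypothetical.

```python
import math


def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """One Stein variational gradient descent update on scalar particles."""
    n = len(particles)
    updated = []
    for x_i in particles:
        phi = 0.0
        for x_j in particles:
            k = math.exp(-((x_j - x_i) ** 2) / (2.0 * bandwidth ** 2))
            # Derivative of the RBF kernel with respect to x_j (repulsive term).
            grad_k = -(x_j - x_i) / bandwidth ** 2 * k
            # Kernel-weighted attraction towards high-density regions plus repulsion.
            phi += k * grad_log_p(x_j) + grad_k
        updated.append(x_i + step_size * phi / n)
    return updated


# Toy target: standard normal, whose log-density gradient is -x.
particles = [3.0, 3.5, 4.0, 4.5, 5.0]
for _ in range(1000):
    particles = svgd_step(particles, grad_log_p=lambda x: -x)

mean = sum(particles) / len(particles)
spread = max(particles) - min(particles)
print(mean, spread)
```

The first term alone would collapse every particle onto the mode; the kernel-gradient term keeps them spread out, which is what yields a diverse set of solutions.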