January 31, 2020

Paper Group AWR 417

Ordered Memory. PlantDoc: A Dataset for Visual Plant Disease Detection. Depth Completion via Deep Basis Fitting. An Adaptive Empirical Bayesian Method for Sparse Deep Learning. PDP: A General Neural Framework for Learning Constraint Satisfaction Solvers. Privacy Risks of Securing Machine Learning Models against Adversarial Examples. Control Regular …

Ordered Memory

Title Ordered Memory
Authors Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron Courville
Abstract Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains a problem obstructing the widespread use of such models. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operations of the memory. We also introduce a new Gated Recursive Cell to compose lower-level representations into higher-level representations. We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps task (Nangia and Bowman, 2018). We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparably with the state-of-the-art methods in the literature.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13466v2
PDF https://arxiv.org/pdf/1910.13466v2.pdf
PWC https://paperswithcode.com/paper/ordered-memory
Repo https://github.com/yikangshen/Ordered-Memory
Framework pytorch
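
The gating idea in the abstract is compact enough to sketch. Below is a minimal, hypothetical PyTorch illustration of using the cumulative probability of an attention distribution as a soft erase/write mask over a stack of memory slots; the projection matrix and the exact update rule are illustrative assumptions, not the paper's equations (see the linked repo for the real model).

```python
import torch
import torch.nn.functional as F

def gated_memory_write(memory, query, new_value, proj):
    """memory: (slots, dim); query, new_value: (dim,); proj: (dim, dim).
    Illustrative only: cumsum of the attention acts as a soft mask."""
    scores = memory @ proj @ query               # (slots,) attention logits
    p = F.softmax(scores, dim=0)                 # where to write
    cum = torch.cumsum(p, dim=0)                 # cumulative probability
    keep = 1.0 - cum                             # slots below the write point kept
    return keep.unsqueeze(-1) * memory + cum.unsqueeze(-1) * new_value

mem = torch.zeros(5, 8)
out = gated_memory_write(mem, torch.randn(8), torch.randn(8), torch.eye(8))
```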

PlantDoc: A Dataset for Visual Plant Disease Detection

Title PlantDoc: A Dataset for Visual Plant Disease Detection
Authors Davinder Singh, Naman Jain, Pranjali Jain, Pratik Kayal, Sudhakar Kumawat, Nipun Batra
Abstract India loses 35% of its annual crop yield to plant diseases. Early detection of plant diseases remains difficult due to the lack of lab infrastructure and expertise. In this paper, we explore the possibility of computer vision approaches for scalable and early plant disease detection. The lack of a sufficiently large-scale non-lab dataset remains a major challenge to enabling vision-based plant disease detection. Against this background, we present PlantDoc: a dataset for visual plant disease detection. Our dataset contains 2,598 data points in total across 13 plant species and up to 17 classes of diseases, involving approximately 300 human hours of effort in annotating internet-scraped images. To show the efficacy of our dataset, we learn 3 models for the task of plant disease classification. Our results show that modelling using our dataset can increase the classification accuracy by up to 31%. We believe that our dataset can help reduce the entry barrier of computer vision techniques in plant disease detection.
Tasks
Published 2019-11-23
URL https://arxiv.org/abs/1911.10317v1
PDF https://arxiv.org/pdf/1911.10317v1.pdf
PWC https://paperswithcode.com/paper/plantdoc-a-dataset-for-visual-plant-disease
Repo https://github.com/pratikkayal/PlantDoc-Dataset
Framework none
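
As a rough sketch of the kind of baseline such a dataset supports, here is transfer learning with an ImageNet-pretrained CNN on folder-organized images. The directory layout, model choice, and hyperparameters below are assumptions for illustration, not the authors' exact training recipe.

```python
import torch
import torchvision
from torchvision import transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
# Hypothetical path: one subfolder per disease class, ImageFolder-style
data = torchvision.datasets.ImageFolder("PlantDoc-Dataset/train", transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = torchvision.models.resnet50(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, len(data.classes))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

for images, labels in loader:                    # one fine-tuning epoch
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    opt.step()
```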

Depth Completion via Deep Basis Fitting

Title Depth Completion via Deep Basis Fitting
Authors Chao Qu, Ty Nguyen, Camillo J. Taylor
Abstract In this paper we consider the task of image-guided depth completion where our system must infer the depth at every pixel of an input image based on the image content and a sparse set of depth measurements. We propose a novel approach that builds upon the strengths of modern deep learning techniques and classical optimization algorithms and significantly improves performance. The proposed method replaces the final $1\times 1$ convolutional layer employed in most depth completion networks with a least squares fitting module which computes weights by fitting the implicit depth bases to the given sparse depth measurements. In addition, we show how our proposed method can be naturally extended to a multi-scale formulation for improved self-supervised training. We demonstrate through extensive experiments on various datasets that our approach achieves consistent improvements over state-of-the-art baseline methods with small computational overhead.
Tasks Depth Completion
Published 2019-12-21
URL https://arxiv.org/abs/1912.10336v1
PDF https://arxiv.org/pdf/1912.10336v1.pdf
PWC https://paperswithcode.com/paper/depth-completion-via-deep-basis-fitting
Repo https://github.com/versatran01/svpy
Framework none
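
The core replacement of the final $1\times 1$ convolution can be sketched directly: treat the network's last feature map as per-pixel depth bases and solve a least-squares problem against the sparse measurements. A toy, unweighted version under assumed variable names:

```python
import torch

def fit_depth_bases(bases, sparse_depth, mask):
    """bases: (C, H, W) implicit depth bases from a backbone network;
    sparse_depth: (H, W), valid where mask is True."""
    C, H, W = bases.shape
    A = bases.reshape(C, -1).T[mask.reshape(-1)]      # (N, C) bases at sparse pts
    d = sparse_depth.reshape(-1)[mask.reshape(-1)]    # (N,) measurements
    # least-squares weights replacing the learned 1x1 conv
    w = torch.linalg.lstsq(A, d.unsqueeze(-1)).solution.squeeze(-1)
    return (w @ bases.reshape(C, -1)).reshape(H, W)   # dense depth prediction

bases = torch.randn(16, 48, 64)
mask = torch.rand(48, 64) < 0.05                      # ~5% sparse measurements
depth = fit_depth_bases(bases, torch.rand(48, 64), mask)
```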

An Adaptive Empirical Bayesian Method for Sparse Deep Learning

Title An Adaptive Empirical Bayesian Method for Sparse Deep Learning
Authors Wei Deng, Xiao Zhang, Faming Liang, Guang Lin
Abstract We propose a novel adaptive empirical Bayesian (AEB) method for sparse deep learning, where the sparsity is ensured via a class of self-adaptive spike-and-slab priors. The proposed method alternates between sampling from an adaptive hierarchical posterior distribution using stochastic gradient Markov chain Monte Carlo (MCMC) and smoothly optimizing the hyperparameters using stochastic approximation (SA). We further prove the convergence of the proposed method to the asymptotically correct distribution under mild conditions. Empirical applications of the proposed method lead to state-of-the-art performance on MNIST and Fashion MNIST with shallow convolutional neural networks (CNNs) and state-of-the-art compression performance on CIFAR-10 with residual networks. The proposed method also improves resistance to adversarial attacks.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10791v1
PDF https://arxiv.org/pdf/1910.10791v1.pdf
PWC https://paperswithcode.com/paper/an-adaptive-empirical-bayesian-method-for
Repo https://github.com/WayneDW/Bayesian-Sparse-Deep-Learning
Framework pytorch
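
A heavily simplified sketch of the alternating scheme: one SGLD (stochastic gradient MCMC) step on the weights followed by a stochastic-approximation update of a prior hyperparameter. The plain Gaussian penalty and the SA target below are simplifying stand-ins, not the paper's spike-and-slab hierarchy.

```python
import torch

def sgld_sa_step(params, loss_fn, sigma, lr=1e-4, sa_rate=1e-3):
    loss = loss_fn(params)
    prior = 0.5 * (params ** 2).sum() / sigma ** 2    # Gaussian "slab" part only
    grad, = torch.autograd.grad(loss + prior, params)
    noise = torch.randn_like(params) * (2 * lr) ** 0.5
    params = (params - lr * grad + noise).detach().requires_grad_()  # SGLD step
    # SA: nudge the hyperparameter toward a posterior statistic of the weights
    stat = params.detach().pow(2).mean().sqrt().item()
    sigma = sigma + sa_rate * (stat - sigma)
    return params, sigma

w = torch.randn(100, requires_grad=True)
w, sigma = sgld_sa_step(w, lambda p: (p ** 2).sum(), sigma=1.0)
```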

PDP: A General Neural Framework for Learning Constraint Satisfaction Solvers

Title PDP: A General Neural Framework for Learning Constraint Satisfaction Solvers
Authors Saeed Amizadeh, Sergiy Matusevych, Markus Weimer
Abstract There have been recent efforts to incorporate Graph Neural Network models for learning full-stack solvers for constraint satisfaction problems (CSPs), and particularly Boolean satisfiability (SAT). Despite the unique representational power of these neural embedding models, it is not clear how the search strategy in the learned models actually works. On the other hand, by fixing the search strategy (e.g. greedy search), we would effectively deprive the neural models of learning better strategies than those given. In this paper, we propose a generic neural framework for learning CSP solvers that can be described in terms of probabilistic inference and yet learn search strategies beyond greedy search. Our framework is based on the idea of propagation, decimation and prediction (hence the name PDP) in graphical models, and can be trained directly toward solving CSPs in a fully unsupervised manner via energy minimization, as shown in the paper. Our experimental results demonstrate the effectiveness of our framework for SAT solving compared to both neural and state-of-the-art baselines.
Tasks
Published 2019-03-05
URL http://arxiv.org/abs/1903.01969v1
PDF http://arxiv.org/pdf/1903.01969v1.pdf
PWC https://paperswithcode.com/paper/pdp-a-general-neural-framework-for-learning
Repo https://github.com/Microsoft/PDP-Solver
Framework pytorch
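
To make the propagation/decimation loop concrete, here is a structural skeleton for SAT with hand-written placeholder messages. In PDP these message functions are neural networks trained by unsupervised energy minimization, which this sketch does not reproduce; it only shows the shape of the iteration.

```python
import torch

def pdp_solve(clauses, n_vars, steps=50):
    """clauses: list of lists of signed literals, e.g. [[1, -2], [2, 3]]."""
    beliefs = torch.zeros(n_vars)                    # soft assignment in [-1, 1]
    for _ in range(steps):
        msgs = torch.zeros(n_vars)
        for clause in clauses:                       # propagation: clause -> var
            sat = max(float(beliefs[abs(l) - 1]) * (1 if l > 0 else -1)
                      for l in clause)
            for l in clause:                         # unsatisfied clauses push
                msgs[abs(l) - 1] += (1 - sat) * (1 if l > 0 else -1)
        beliefs = torch.tanh(beliefs + 0.1 * msgs)
        i = int(beliefs.abs().argmax())              # decimation: freeze the
        beliefs[i] = torch.sign(beliefs[i])          # most confident variable
    return beliefs > 0                               # prediction: assignment

print(pdp_solve([[1, -2], [2, 3], [-1, 3]], n_vars=3))
```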

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

Title Privacy Risks of Securing Machine Learning Models against Adversarial Examples
Authors Liwei Song, Reza Shokri, Prateek Mittal
Abstract The arms race between attacks and defenses for machine learning models has come to the forefront in recent years, in both the security community and the privacy community. However, one big limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards resolving this limitation by combining the two domains. In particular, we measure the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples (i.e., evasion attacks). Membership inference attacks determine whether or not an individual data record has been part of a model’s training set. The accuracy of such attacks reflects the information leakage of training algorithms about individual members of the training set. Defense methods against adversarial examples influence the model’s decision boundaries such that model predictions remain unchanged in a small area around each input. However, this objective is optimized on the training data, so individual records in the training set have a significant influence on robust models, which makes the models more vulnerable to inference attacks. To perform the membership inference attacks, we leverage existing inference methods that exploit model predictions. We also propose two new inference methods that exploit structural properties of robust models on adversarially perturbed data. Our experimental evaluation demonstrates that, compared with the natural (undefended) training approach, adversarial defense methods can indeed increase the target model’s risk against membership inference attacks.
Tasks Adversarial Defense, Inference Attack
Published 2019-05-24
URL https://arxiv.org/abs/1905.10291v3
PDF https://arxiv.org/pdf/1905.10291v3.pdf
PWC https://paperswithcode.com/paper/privacy-risks-of-securing-machine-learning
Repo https://github.com/inspire-group/privacy-vs-robustness
Framework tf
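
The simplest prediction-based membership inference baseline the paper builds on fits in a few lines: members tend to receive higher confidence on their true label, so threshold that confidence. This sketch shows only that baseline; the paper's two new attacks, which also use predictions on adversarially perturbed inputs, are not reproduced here.

```python
import torch

def membership_scores(model, x, y):
    """Confidence the model assigns to each example's true label."""
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=-1)
    return probs[torch.arange(len(y)), y]

# predicted_member = membership_scores(model, x, y) > tau, with the threshold
# tau picked on shadow data; unusually confident samples are flagged as
# likely training-set members.
```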

Control Regularization for Reduced Variance Reinforcement Learning

Title Control Regularization for Reduced Variance Reinforcement Learning
Authors Richard Cheng, Abhinav Verma, Gabor Orosz, Swarat Chaudhuri, Yisong Yue, Joel W. Burdick
Abstract Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
Tasks Continuous Control
Published 2019-05-14
URL https://arxiv.org/abs/1905.05380v1
PDF https://arxiv.org/pdf/1905.05380v1.pdf
PWC https://paperswithcode.com/paper/control-regularization-for-reduced-variance
Repo https://github.com/rcheng805/CORE-RL
Framework tf
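
The functional regularization itself reduces to a convex combination of the learned policy's action and the prior controller's action; a minimal sketch (the paper's adaptive tuning of the mixing weight is omitted):

```python
import numpy as np

def regularized_action(u_rl, u_prior, lambda_):
    """Mix the RL action toward the control prior in function space."""
    return (u_rl + lambda_ * u_prior) / (1.0 + lambda_)

u = regularized_action(np.array([0.8]), np.array([0.2]), lambda_=4.0)
# lambda_ -> infinity recovers the stable prior; lambda_ = 0 is pure RL,
# which is the bias-variance trade-off the paper tunes adaptively.
```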

PointHop: An Explainable Machine Learning Method for Point Cloud Classification

Title PointHop: An Explainable Machine Learning Method for Point Cloud Classification
Authors Min Zhang, Haoxuan You, Pranav Kadam, Shan Liu, C. -C. Jay Kuo
Abstract An explainable machine learning method for point cloud classification, called the PointHop method, is proposed in this work. The PointHop method consists of two stages: 1) local-to-global attribute building through iterative one-hop information exchange, and 2) classification and ensembles. In the attribute building stage, we address the problem of unordered point cloud data using a space partitioning procedure and develop a robust descriptor that characterizes the relationship between a point and its one-hop neighbors in a PointHop unit. When we put multiple PointHop units in cascade, the attributes of a point grow by iteratively taking its relationship with one-hop neighbor points into account. Furthermore, to control the rapid dimension growth of the attribute vector associated with a point, we use the Saab transform to reduce the attribute dimension in each PointHop unit. In the classification and ensemble stage, we feed the feature vector obtained from multiple PointHop units to a classifier. We explore ensemble methods to further improve the classification performance. It is shown by experimental results that the PointHop method offers classification performance comparable with state-of-the-art methods while demanding much lower training complexity.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1907.12766v2
PDF https://arxiv.org/pdf/1907.12766v2.pdf
PWC https://paperswithcode.com/paper/pointhop-an-explainable-machine-learning
Repo https://github.com/minzhang-1/PointHop
Framework pytorch
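
A rough sketch of one PointHop unit: aggregate each point's one-hop neighborhood into an attribute vector, then reduce its dimension with a PCA-style projection standing in for the Saab transform (whose bias terms are omitted). The octant space partitioning is simplified here to a plain neighborhood mean.

```python
import numpy as np

def pointhop_unit(points, feats, k=8, out_dim=4):
    """points: (N, 3); feats: (N, C) current per-point attributes."""
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)   # (N, N)
    nbrs = np.argsort(d, axis=1)[:, 1:k + 1]        # k nearest neighbors
    # one-hop exchange: concatenate own attrs with aggregated neighbor attrs
    agg = np.concatenate([feats, feats[nbrs].mean(axis=1)], axis=1)
    agg = agg - agg.mean(axis=0)
    # PCA projection in place of the Saab transform, to cap dimension growth
    _, _, vt = np.linalg.svd(agg, full_matrices=False)
    return agg @ vt[:out_dim].T                     # (N, out_dim)

pts = np.random.rand(128, 3)
attrs = pointhop_unit(pts, pts.copy())              # first unit: xyz as attrs
```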

What Does BERT Look At? An Analysis of BERT’s Attention

Title What Does BERT Look At? An Analysis of BERT’s Attention
Authors Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
Abstract Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data. Most recent analysis has focused on model outputs (e.g., language model surprisal) or internal vector representations (e.g., probing classifiers). Complementary to these works, we propose methods for analyzing the attention mechanisms of pre-trained models and apply them to BERT. BERT’s attention heads exhibit patterns such as attending to delimiter tokens, specific positional offsets, or broadly attending over the whole sentence, with heads in the same layer often exhibiting similar behaviors. We further show that certain attention heads correspond well to linguistic notions of syntax and coreference. For example, we find heads that attend to the direct objects of verbs, determiners of nouns, objects of prepositions, and coreferent mentions with remarkably high accuracy. Lastly, we propose an attention-based probing classifier and use it to further demonstrate that substantial syntactic information is captured in BERT’s attention.
Tasks Language Modelling
Published 2019-06-11
URL https://arxiv.org/abs/1906.04341v1
PDF https://arxiv.org/pdf/1906.04341v1.pdf
PWC https://paperswithcode.com/paper/what-does-bert-look-at-an-analysis-of-berts
Repo https://github.com/clarkkev/attention-analysis
Framework tf
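
This style of analysis is straightforward to reproduce with the transformers library: request attention maps and measure, for example, per-head attention mass on the [SEP] delimiter. A sketch in that spirit, not the authors' code:

```python
import torch
from transformers import BertTokenizer, BertModel

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tok("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions     # tuple of (1, heads, seq, seq)

sep = (inputs["input_ids"][0] == tok.sep_token_id).nonzero().item()
for layer, att in enumerate(attentions):
    to_sep = att[0, :, :, sep].mean(dim=-1)     # per-head mass on [SEP]
    print(f"layer {layer}: attention to [SEP] per head = {to_sep.tolist()}")
```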

Mercator: uncovering faithful hyperbolic embeddings of complex networks

Title Mercator: uncovering faithful hyperbolic embeddings of complex networks
Authors Guillermo García-Pérez, Antoine Allard, M. Ángeles Serrano, Marián Boguñá
Abstract We introduce Mercator, a reliable embedding method to map real complex networks into their hyperbolic latent geometry. The method assumes that the structure of networks is well described by the Popularity$\times$Similarity $\mathbb{S}^1/\mathbb{H}^2$ static geometric network model, which can accommodate arbitrary degree distributions and reproduces many pivotal properties of real networks, including self-similarity patterns. The algorithm mixes machine learning and maximum likelihood approaches to infer the coordinates of the nodes in the underlying hyperbolic disk with the best match between the observed network topology and the geometric model. In its fast mode, Mercator uses a model-adjusted machine learning technique performing dimensionality reduction to produce a fast and accurate map, whose quality already outperforms that of other embedding algorithms in the literature. In the refined Mercator mode, the fast-mode embedding is taken as the initial condition of a maximum likelihood estimation, which significantly improves the quality of the final embedding. Apart from its accuracy as an embedding tool, Mercator has the clear advantage of systematically inferring not only node orderings, or angular positions, but also the hidden degrees and global model parameters, and it can embed networks with arbitrary degree distributions. Overall, our results suggest that mixing machine learning and maximum likelihood techniques in a model-dependent framework can boost the meaningful mapping of complex networks.
Tasks
Published 2019-04-24
URL http://arxiv.org/abs/1904.10814v1
PDF http://arxiv.org/pdf/1904.10814v1.pdf
PWC https://paperswithcode.com/paper/mercator-uncovering-faithful-hyperbolic
Repo https://github.com/networkgeometry/mercator
Framework none
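
For intuition, here is the $\mathbb{S}^1$ model that Mercator inverts, under its standard parameterization (assumed here, not taken from the paper's code): nodes carry angular positions and hidden degrees, and connect with a probability that decays with an effective distance. Embedding then means choosing the node coordinates and the global parameters mu and beta that maximize the likelihood of the observed edges.

```python
import numpy as np

def s1_connection_prob(theta_i, theta_j, kappa_i, kappa_j, mu, beta, n):
    """Connection probability in the S1 model for a network of n nodes."""
    d_angle = np.pi - abs(np.pi - abs(theta_i - theta_j))   # arc distance
    x = (n * d_angle) / (2 * np.pi * mu * kappa_i * kappa_j)
    return 1.0 / (1.0 + x ** beta)

p = s1_connection_prob(0.1, 2.5, kappa_i=4.0, kappa_j=2.0,
                       mu=0.05, beta=2.5, n=1000)
```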

RayTracer.jl: A Differentiable Renderer that supports Parameter Optimization for Scene Reconstruction

Title RayTracer.jl: A Differentiable Renderer that supports Parameter Optimization for Scene Reconstruction
Authors Avik Pal
Abstract In this paper, we present RayTracer.jl, a renderer in Julia that is fully differentiable using source-to-source Automatic Differentiation (AD). This means that RayTracer not only renders 2D images from 3D scene parameters, but it can be used to optimize for model parameters that generate a target image in a Differentiable Programming (DP) pipeline. We interface our renderer with the deep learning library Flux for use in combination with neural networks. We demonstrate the use of this differentiable renderer in rendering tasks and in solving inverse graphics problems.
Tasks
Published 2019-07-16
URL https://arxiv.org/abs/1907.07198v3
PDF https://arxiv.org/pdf/1907.07198v3.pdf
PWC https://paperswithcode.com/paper/raytracerjl-a-differentiable-renderer-that
Repo https://github.com/avik-pal/RayTracer.jl
Framework none
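
RayTracer.jl itself is written in Julia; as a language-neutral toy of the same inverse-graphics idea, the sketch below recovers a scene parameter by gradient descent through a differentiable stand-in "renderer". Everything here is illustrative, not the package's API.

```python
import torch

def render(radius):
    """A 1-D soft 'disk' image, differentiable w.r.t. the scene parameter."""
    xs = torch.linspace(-1.0, 1.0, 64)
    return torch.sigmoid(50 * (radius - xs.abs()))

target = render(torch.tensor(0.6))                # image from the unknown scene
radius = torch.tensor(0.2, requires_grad=True)    # initial guess
opt = torch.optim.Adam([radius], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = ((render(radius) - target) ** 2).mean()
    loss.backward()                               # grads flow through render()
    opt.step()
# radius converges toward 0.6, the parameter that generated the target image
```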

Self-boosted Time-series Forecasting with Multi-task and Multi-view Learning

Title Self-boosted Time-series Forecasting with Multi-task and Multi-view Learning
Authors Long H. Nguyen, Zhenhe Pan, Opeyemi Openiyi, Hashim Abu-gellban, Mahdi Moghadasi, Fang Jin
Abstract A robust model for time series forecasting is highly important in many domains, including but not limited to financial forecasting, air temperature, and electricity consumption. To improve forecasting performance, traditional approaches usually require additional feature sets. However, adding feature sets from different sources of data is not always feasible due to limited accessibility. In this paper, we propose a novel self-boosted mechanism in which the original time series is decomposed into multiple time series. These time series play the role of additional features: the closely related group is fed into a multi-task learning model, while the loosely related group is fed into a multi-view learning part to exploit its complementary information. We use three real-world datasets to validate our model and show the superiority of our proposed method over existing state-of-the-art baseline methods.
Tasks Multi-Task Learning, Multi-View Learning, Time Series, Time Series Forecasting
Published 2019-09-17
URL https://arxiv.org/abs/1909.08181v1
PDF https://arxiv.org/pdf/1909.08181v1.pdf
PWC https://paperswithcode.com/paper/self-boosted-time-series-forecasting-with
Repo https://github.com/KurochkinAlexey/Self-boosted-Time-series-Forecasting
Framework pytorch
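
A sketch of the self-boosting step under simplifying assumptions: a moving-average trend/residual split stands in for the paper's decomposition, and a plain correlation threshold stands in for its grouping of components into closely and loosely related sets.

```python
import numpy as np

def self_boost(series, window=12, corr_threshold=0.5):
    """Derive extra feature series from the series itself, then group them."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode="same")
    components = [trend, series - trend]
    close, loose = [], []
    for c in components:
        r = np.corrcoef(series, c)[0, 1]
        (close if abs(r) >= corr_threshold else loose).append(c)
    return close, loose    # feed to the multi-task / multi-view branches

close, loose = self_boost(np.sin(np.linspace(0, 20, 240)))
```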

Efficient Plane-Based Optimization of Geometry and Texture for Indoor RGB-D Reconstruction

Title Efficient Plane-Based Optimization of Geometry and Texture for Indoor RGB-D Reconstruction
Authors Chao Wang, Xiaohu Guo
Abstract We propose a novel approach to reconstructing RGB-D indoor scenes based on plane primitives. Our approach takes as input an RGB-D sequence and a dense coarse mesh reconstructed from it, and generates a lightweight, low-polygon mesh with clear face textures and sharp features, without losing geometric detail from the original scene. Compared to existing methods, which only cover large planar regions in the scene, our method builds the entire scene from adaptive planes while preserving geometric detail and sharp features in the mesh. Experiments show that our method generates textured meshes from RGB-D data more efficiently than the state of the art.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1905.08853v1
PDF https://arxiv.org/pdf/1905.08853v1.pdf
PWC https://paperswithcode.com/paper/efficient-plane-based-optimization-of
Repo https://github.com/chaowang15/plane-opt-rgbd
Framework none
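
The geometric building block behind plane-primitive reconstruction is ordinary least-squares plane fitting, for instance via SVD. How the method grows adaptive planes and optimizes textures is beyond this sketch.

```python
import numpy as np

def fit_plane(points):
    """points: (N, 3). Returns (unit normal n, offset d) with n.x + d = 0."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                       # direction of least variance
    return normal, -normal @ centroid

pts = np.random.randn(100, 3) * np.array([1.0, 1.0, 0.01])  # near the z=0 plane
n, d = fit_plane(pts)                     # n is approximately (0, 0, +-1), d is near 0
```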

Deep Active Localization

Title Deep Active Localization
Authors Sai Krishna, Keehong Seo, Dhaivat Bhatt, Vincent Mai, Krishna Murthy, Liam Paull
Abstract Active localization is the problem of generating robot actions that allow the robot to maximally disambiguate its pose within a reference map. Traditional approaches use an information-theoretic criterion for action selection together with hand-crafted perceptual models. In this work we propose an end-to-end differentiable method for learning to take informative actions that is trainable entirely in simulation and then transferable to real robot hardware with zero refinement. The system is composed of two modules: a convolutional neural network for perception, and a planning module trained with deep reinforcement learning. We introduce a multi-scale approach to the learned perceptual model, since the accuracy needed to perform action selection with reinforcement learning is much lower than the accuracy needed for robot control. We demonstrate that the resulting system outperforms the traditional approach for either perception or planning. We also demonstrate our approach's robustness to different map configurations and other nuisance parameters through the use of domain randomization in training. The code is compatible with the OpenAI Gym framework as well as the Gazebo simulator.
Tasks
Published 2019-03-05
URL http://arxiv.org/abs/1903.01669v1
PDF http://arxiv.org/pdf/1903.01669v1.pdf
PWC https://paperswithcode.com/paper/deep-active-localization
Repo https://github.com/montrealrobotics/dal
Framework pytorch

Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models

Title Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
Authors Michael Oberst, David Sontag
Abstract We introduce an off-policy evaluation procedure for highlighting episodes where applying a reinforcement-learned (RL) policy is likely to have produced a substantially different outcome than the observed policy. In particular, we introduce a class of structural causal models (SCMs) for generating counterfactual trajectories in finite partially observable Markov decision processes (POMDPs). We see this as a useful procedure for off-policy “debugging” in high-risk settings (e.g., healthcare); by decomposing the expected difference in reward between the RL and observed policies into specific episodes, we can identify episodes where the counterfactual difference in reward is most dramatic. This in turn can be used to facilitate review of specific episodes by domain experts. We demonstrate the utility of this procedure with a synthetic environment of sepsis management.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.05824v3
PDF https://arxiv.org/pdf/1905.05824v3.pdf
PWC https://paperswithcode.com/paper/counterfactual-off-policy-evaluation-with
Repo https://github.com/clinicalml/gumbel-max-scm
Framework none
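
The counterfactual step has a well-known closed-form sampling procedure, sketched below for a single categorical transition: posterior-sample the Gumbel noise consistent with the observed outcome (top-down sampling with truncated Gumbels), then replay that same noise under the counterfactual policy's logits. Variable names are illustrative.

```python
import numpy as np

def truncated_gumbel(phi, bound, rng):
    """Gumbel(phi) conditioned to be <= bound, via inverse-CDF sampling."""
    return phi - np.log(np.exp(phi - bound) - np.log(rng.uniform()))

def counterfactual_outcome(logits_obs, logits_cf, observed, rng):
    # max of all Gumbel(phi_k) is Gumbel(logsumexp(phi)); the observed
    # category attains it, the rest are truncated below it
    top = np.log(np.exp(logits_obs).sum()) + rng.gumbel()
    g = np.array([top if k == observed else truncated_gumbel(phi, top, rng)
                  for k, phi in enumerate(logits_obs)])
    noise = g - logits_obs                    # exogenous noise of the SCM
    return int(np.argmax(logits_cf + noise))  # replay under new logits

rng = np.random.default_rng(0)
cf = counterfactual_outcome(np.log([0.7, 0.2, 0.1]),
                            np.log([0.1, 0.2, 0.7]), observed=0, rng=rng)
```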