Paper Group AWR 15
Gaussian Process Behaviour in Wide Deep Neural Networks. Marginal Policy Gradients: A Unified Family of Estimators for Bounded Action Spaces with Applications. Automated Directed Fairness Testing. Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication. Modeling Irregularly Sampled Clinical Time Series. Polarimetric Convolutional Network for PolSAR Image Classification. DropLasso: A robust variant of Lasso for single cell RNA-seq data. Adversarial Attacks on Node Embeddings via Graph Poisoning. A Multi-Objective Anytime Rule Mining System to Ease Iterative Feedback from Domain Experts. Reverse engineering of CAD models via clustering and approximate implicitization. Reproducing AmbientGAN: Generative models from lossy measurements. ProxQuant: Quantized Neural Networks via Proximal Operators. Revisiting Spatial-Temporal Similarity: A Deep Learning Framework for Traffic Prediction. Convolutional Neural Networks In Convolution. Approximating the solution to wave propagation using deep neural networks.
Gaussian Process Behaviour in Wide Deep Neural Networks
Title | Gaussian Process Behaviour in Wide Deep Neural Networks |
Authors | Alexander G. de G. Matthews, Mark Rowland, Jiri Hron, Richard E. Turner, Zoubin Ghahramani |
Abstract | Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward networks with more than one hidden layer and Gaussian processes with a recursive kernel definition. We show that, under broad conditions, as we make the architecture increasingly wide, the implied random function converges in distribution to a Gaussian process, formalising and extending existing results by Neal (1996) to deep networks. To evaluate convergence rates empirically, we use maximum mean discrepancy. We then compare finite Bayesian deep networks from the literature to Gaussian processes in terms of the key predictive quantities of interest, finding that in some cases the agreement can be very close. We discuss the desirability of Gaussian process behaviour and review non-Gaussian alternative models from the literature. |
Tasks | Gaussian Processes |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11271v2 |
http://arxiv.org/pdf/1804.11271v2.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-process-behaviour-in-wide-deep |
Repo | https://github.com/widedeepnetworks/widedeepnetworks |
Framework | pytorch |
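The recursion behind the limiting Gaussian process depends on the network's nonlinearity; as a minimal sketch (not the paper's code), the widely used closed form for ReLU units can be iterated layer by layer as below. The weight/bias variances and the depth are illustrative parameters.

```python
import numpy as np

def nngp_relu_kernel(X, depth=3, sigma_w2=1.0, sigma_b2=0.0):
    """Recursive NNGP kernel for a fully connected ReLU network.

    X: (n, d) array of inputs. Returns the (n, n) kernel implied by `depth`
    hidden layers, using the closed-form arc-cosine recursion for ReLU.
    """
    # Base case: kernel of the affine input layer.
    K = sigma_b2 + sigma_w2 * (X @ X.T) / X.shape[1]
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        norm = np.outer(diag, diag)
        cos_theta = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # E[relu(u) relu(v)] for a centred bivariate Gaussian with covariance K
        K = sigma_b2 + sigma_w2 * norm * (
            np.sin(theta) + (np.pi - theta) * cos_theta) / (2 * np.pi)
    return K

# Example: kernel matrix for a few random inputs after four hidden layers.
K = nngp_relu_kernel(np.random.randn(5, 10), depth=4)
```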
Marginal Policy Gradients: A Unified Family of Estimators for Bounded Action Spaces with Applications
Title | Marginal Policy Gradients: A Unified Family of Estimators for Bounded Action Spaces with Applications |
Authors | Carson Eisenach, Haichuan Yang, Ji Liu, Han Liu |
Abstract | Many complex domains, such as robotics control and real-time strategy (RTS) games, require an agent to learn a continuous control. In the former, an agent learns a policy over $\mathbb{R}^d$ and in the latter, over a discrete set of actions each of which is parametrized by a continuous parameter. Such problems are naturally solved using policy based reinforcement learning (RL) methods, but unfortunately these often suffer from high variance leading to instability and slow convergence. Unnecessary variance is introduced whenever policies over bounded action spaces are modeled using distributions with unbounded support by applying a transformation $T$ to the sampled action before execution in the environment. Recently, the variance reduced clipped action policy gradient (CAPG) was introduced for actions in bounded intervals, but to date no variance reduced methods exist when the action is a direction, something often seen in RTS games. To this end we introduce the angular policy gradient (APG), a stochastic policy gradient method for directional control. With the marginal policy gradients family of estimators we present a unified analysis of the variance reduction properties of APG and CAPG; our results provide a stronger guarantee than existing analyses for CAPG. Experimental results on a popular RTS game and a navigation task show that the APG estimator offers a substantial improvement over the standard policy gradient. |
Tasks | Continuous Control |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.05134v3 |
http://arxiv.org/pdf/1806.05134v3.pdf | |
PWC | https://paperswithcode.com/paper/marginal-policy-gradients-a-unified-family-of |
Repo | https://github.com/ceisenach/MPG |
Framework | pytorch |
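For the interval-bounded case analysed alongside APG, the executed action is clip(a, lo, hi) and its marginal distribution places point masses at the bounds; a variance-reduced estimator can therefore score the clipped action with the marginal log-density rather than the pre-clipping Gaussian one. Below is a minimal PyTorch sketch of that marginal log-probability under illustrative bounds; the directional (APG) case, which instead uses the density of the normalised action, is not shown.

```python
import torch
from torch.distributions import Normal

def clipped_gaussian_log_prob(mu, sigma, action, lo=-1.0, hi=1.0):
    """Log-density of clip(a, lo, hi) where a ~ N(mu, sigma^2).

    Interior actions use the Gaussian pdf; actions at the bounds use the
    probability mass that clipping piles onto lo or hi.
    """
    dist = Normal(mu, sigma)
    log_pdf = dist.log_prob(action)
    log_mass_lo = torch.log(dist.cdf(torch.as_tensor(lo)).clamp_min(1e-12))
    log_mass_hi = torch.log((1 - dist.cdf(torch.as_tensor(hi))).clamp_min(1e-12))
    at_lo = action <= lo
    at_hi = action >= hi
    return torch.where(at_lo, log_mass_lo, torch.where(at_hi, log_mass_hi, log_pdf))

# REINFORCE-style surrogate: advantage * log-prob of the executed (clipped) action.
mu = torch.zeros(4, requires_grad=True)
a = torch.clamp(Normal(mu, 1.0).sample(), -1.0, 1.0)
loss = -(1.0 * clipped_gaussian_log_prob(mu, torch.ones(4), a)).sum()
loss.backward()
```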
Automated Directed Fairness Testing
Title | Automated Directed Fairness Testing |
Authors | Sakshi Udeshi, Pryanshu Arora, Sudipta Chattopadhyay |
Abstract | Fairness is a critical trait in decision making. As machine-learning models are increasingly being used in sensitive application domains (e.g. education and employment) for decision making, it is crucial that the decisions computed by such models are free of unintended bias. But how can we automatically validate the fairness of arbitrary machine-learning models? For a given machine-learning model and a set of sensitive input parameters, our AEQUITAS approach automatically discovers discriminatory inputs that highlight fairness violations. At the core of AEQUITAS are three novel strategies to employ probabilistic search over the input space with the objective of uncovering fairness violations. Our AEQUITAS approach leverages the inherent robustness property of common machine-learning models to design and implement scalable test generation methodologies. An appealing feature of our generated test inputs is that they can be systematically added to the training set of the underlying model and improve its fairness. To this end, we design a fully automated module that guarantees to improve the fairness of the underlying model. We implemented AEQUITAS and we have evaluated it on six state-of-the-art classifiers, including a classifier that was designed with fairness constraints. We show that AEQUITAS effectively generates inputs to uncover fairness violations in all the subject classifiers and systematically improves the fairness of the respective models using the generated test inputs. In our evaluation, AEQUITAS generates up to 70% discriminatory inputs (w.r.t. the total number of inputs generated) and leverages these inputs to improve the fairness by up to 94%. |
Tasks | Decision Making |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00468v2 |
http://arxiv.org/pdf/1807.00468v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-directed-fairness-testing |
Repo | https://github.com/sakshiudeshi/Aequitas |
Framework | none |
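The search at the heart of AEQUITAS looks for individuals whose prediction changes when only a sensitive attribute is perturbed. The following is a hypothetical sketch of just the global random phase of such a search; `predict`, the input bounds and the sensitive attribute index are placeholders, and the paper's local perturbation strategies and retraining module are omitted.

```python
import numpy as np

def random_fairness_search(predict, lows, highs, sensitive_idx,
                           sensitive_values, n_trials=1000, seed=0):
    """Return inputs whose prediction flips when only the sensitive
    attribute is changed (discriminatory inputs)."""
    rng = np.random.default_rng(seed)
    lows, highs = np.asarray(lows, float), np.asarray(highs, float)
    found = []
    for _ in range(n_trials):
        x = rng.uniform(lows, highs)
        preds = set()
        for v in sensitive_values:
            x_v = x.copy()
            x_v[sensitive_idx] = v
            preds.add(int(predict(x_v[None, :])[0]))
        if len(preds) > 1:          # same individual, different decisions
            found.append(x)
    return np.array(found)

# Usage with any scikit-learn-style classifier `clf` (hypothetical):
# disc = random_fairness_search(clf.predict, lows, highs, sensitive_idx=3,
#                               sensitive_values=[0, 1])
```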
Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication
Title | Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication |
Authors | Jayadev Acharya, Ziteng Sun, Huanyu Zhang |
Abstract | We study the problem of estimating $k$-ary distributions under $\varepsilon$-local differential privacy. $n$ samples are distributed across users who send privatized versions of their sample to a central server. All previously known sample optimal algorithms require linear (in $k$) communication from each user in the high privacy regime $(\varepsilon=O(1))$, and run in time that grows as $n\cdot k$, which can be prohibitive for large domain size $k$. We propose Hadamard Response (HR), a local privatization scheme that requires no shared randomness and is symmetric with respect to the users. Our scheme has order optimal sample complexity for all $\varepsilon$, a communication of at most $\log k+2$ bits per user, and nearly linear running time of $\tilde{O}(n + k)$. Our encoding and decoding are based on Hadamard matrices, and are simple to implement. The statistical performance relies on the coding-theoretic aspects of Hadamard matrices, i.e., the large Hamming distance between the rows. An efficient implementation of the algorithm using the Fast Walsh-Hadamard transform gives the computational gains. We compare our approach with Randomized Response (RR), RAPPOR, and subset-selection mechanisms (SS), both theoretically and experimentally. For $k=10000$, our algorithm runs about 100x faster than SS and RAPPOR. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04705v2 |
http://arxiv.org/pdf/1802.04705v2.pdf | |
PWC | https://paperswithcode.com/paper/hadamard-response-estimating-distributions |
Repo | https://github.com/jlyx417353617/hadamard_response |
Framework | none |
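As a simplified sketch of the core idea (not the paper's full construction with its block structure for large ε or the Walsh-Hadamard-transform decoder): map symbol x to row x+1 of a Sylvester Hadamard matrix, report a random column biased toward that row's +1 entries, and decode using the fact that the +1 sets of two distinct nonzero rows overlap in exactly a quarter of the columns.

```python
import numpy as np
from scipy.linalg import hadamard

def hr_encode(x, H, eps, rng):
    """Privatize symbol x (0..k-1) using row x+1 of Hadamard matrix H."""
    cols_plus = np.flatnonzero(H[x + 1] == 1)        # C_x
    cols_minus = np.flatnonzero(H[x + 1] == -1)
    p_stay = np.exp(eps) / (np.exp(eps) + 1)
    pool = cols_plus if rng.random() < p_stay else cols_minus
    return rng.choice(pool)

def hr_decode(reports, k, H, eps):
    """Unbiased estimate of the k-ary distribution from privatized reports."""
    reports = np.asarray(reports)
    scale = 2 * (np.exp(eps) + 1) / (np.exp(eps) - 1)
    p_hat = np.empty(k)
    for x in range(k):
        in_Cx = np.isin(reports, np.flatnonzero(H[x + 1] == 1)).mean()
        p_hat[x] = scale * (in_Cx - 0.5)   # Pr[report in C_x] = 1/2 + p_x / scale
    return p_hat

rng = np.random.default_rng(0)
k, eps = 6, 1.0
K = 1 << k.bit_length()                    # smallest power of two with K >= k + 1
H = hadamard(K)
samples = rng.choice(k, size=20000, p=np.ones(k) / k)
reports = [hr_encode(x, H, eps, rng) for x in samples]
print(hr_decode(reports, k, H, eps))
```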
Modeling Irregularly Sampled Clinical Time Series
Title | Modeling Irregularly Sampled Clinical Time Series |
Authors | Satya Narayan Shukla, Benjamin M. Marlin |
Abstract | While the volume of electronic health records (EHR) data continues to grow, it remains rare for hospital systems to capture dense physiological data streams, even in the data-rich intensive care unit setting. Instead, typical EHR records consist of sparse and irregularly observed multivariate time series, which are well understood to present particularly challenging problems for machine learning methods. In this paper, we present a new deep learning architecture for addressing this problem based on the use of a semi-parametric interpolation network followed by the application of a prediction network. The interpolation network allows for information to be shared across multiple dimensions during the interpolation stage, while any standard deep learning model can be used for the prediction network. We investigate the performance of this architecture on the problems of mortality and length of stay prediction. |
Tasks | Length-of-Stay prediction, Time Series |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00531v1 |
http://arxiv.org/pdf/1812.00531v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-irregularly-sampled-clinical-time |
Repo | https://github.com/mlds-lab/interp-net |
Framework | tf |
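The paper's interpolation network is semi-parametric and shares information across dimensions; a single-channel sketch of the underlying idea, kernel-smoothing irregular observations onto a fixed grid of reference time points that a standard sequence model can then consume, is shown below. The bandwidth, grid and example values are placeholders.

```python
import numpy as np

def kernel_interpolate(t_obs, y_obs, t_ref, bandwidth=1.0):
    """Squared-exponential kernel smoother: irregular (t_obs, y_obs)
    -> values at regular reference points t_ref."""
    w = np.exp(-((t_ref[:, None] - t_obs[None, :]) ** 2) / (2 * bandwidth ** 2))
    w = w / (w.sum(axis=1, keepdims=True) + 1e-12)
    return w @ y_obs            # (len(t_ref),)

# Irregularly sampled vitals for one patient, interpolated onto an hourly grid
# that a GRU/LSTM prediction network could consume.
t_obs = np.array([0.3, 1.7, 2.1, 5.9, 8.4])
y_obs = np.array([72.0, 75.0, 74.0, 90.0, 88.0])
grid = kernel_interpolate(t_obs, y_obs, t_ref=np.arange(0, 10, 1.0), bandwidth=1.5)
```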
Polarimetric Convolutional Network for PolSAR Image Classification
Title | Polarimetric Convolutional Network for PolSAR Image Classification |
Authors | Xu Liu, Licheng Jiao, Xu Tang, Qigong Sun, Dan Zhang |
Abstract | The approaches for analyzing the polarimetric scattering matrix of polarimetric synthetic aperture radar (PolSAR) data have always been the focus of PolSAR image classification. Generally, the polarization coherency matrix and the covariance matrix obtained from the polarimetric scattering matrix convey only a limited amount of polarimetric information. To solve this problem, we propose a sparse scattering coding scheme to handle the polarimetric scattering matrix and obtain a nearly complete feature representation. This encoding also fully preserves the polarimetric information of the scattering matrix. In view of this encoding, we design a corresponding classification algorithm based on a convolutional network to exploit this feature. Based on sparse scattering coding and convolutional neural networks, the polarimetric convolutional network is proposed to classify PolSAR images by making full use of polarimetric information. We perform experiments on PolSAR images acquired by AIRSAR and RADARSAT-2 to verify the proposed method. The experimental results demonstrate that the proposed method achieves better results and has great potential for PolSAR data classification. Source code for sparse scattering coding is available at https://github.com/liuxuvip/Polarimetric-Scattering-Coding. |
Tasks | Image Classification |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.02975v2 |
http://arxiv.org/pdf/1807.02975v2.pdf | |
PWC | https://paperswithcode.com/paper/polarimetric-convolutional-network-for-polsar |
Repo | https://github.com/liuxuvip/Polarimetric-Scattering-Coding |
Framework | none |
DropLasso: A robust variant of Lasso for single cell RNA-seq data
Title | DropLasso: A robust variant of Lasso for single cell RNA-seq data |
Authors | Beyrem Khalfaoui, Jean-Philippe Vert |
Abstract | Single-cell RNA sequencing (scRNA-seq) is a fast growing approach to measure the genome-wide transcriptome of many individual cells in parallel, but results in noisy data with many dropout events. Existing methods to learn molecular signatures from bulk transcriptomic data may therefore not be adapted to scRNA-seq data, in order to automatically classify individual cells into predefined classes. We propose a new method called DropLasso to learn a molecular signature from scRNA-seq data. DropLasso extends the dropout regularisation technique, popular in neural network training, to estimate sparse linear models. It is well adapted to data corrupted by dropout noise, such as scRNA-seq data, and we clarify how it relates to elastic net regularisation. We provide promising results on simulated and real scRNA-seq data, suggesting that DropLasso may be better adapted than standard regularisations to infer molecular signatures from scRNA-seq data. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09381v1 |
http://arxiv.org/pdf/1802.09381v1.pdf | |
PWC | https://paperswithcode.com/paper/droplasso-a-robust-variant-of-lasso-for |
Repo | https://github.com/jpvert/droplasso |
Framework | none |
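A DropLasso-style estimator combines dropout noise on the inputs with an L1 penalty on the weights. The sketch below uses proximal stochastic gradient descent with a squared loss, which is one straightforward way to fit such a model; it illustrates the idea rather than reproducing the authors' R implementation, and the simulated data is a placeholder.

```python
import numpy as np

def soft_threshold(w, t):
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def droplasso(X, y, keep_prob=0.5, lam=0.1, lr=0.01, epochs=200, seed=0):
    """Proximal SGD sketch of a DropLasso-style estimator with squared loss:
    dropout noise on inputs (scaled by 1/keep_prob) plus an L1 penalty on w."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            mask = rng.random(d) < keep_prob
            x_i = X[i] * mask / keep_prob            # dropout-corrupted input
            grad = (x_i @ w - y[i]) * x_i            # gradient of 1/2 (x^T w - y)^2
            w = soft_threshold(w - lr * grad, lr * lam)   # prox step for lam * ||w||_1
    return w

# Simulated scRNA-seq-like counts with a sparse true signature.
rng = np.random.default_rng(1)
X = rng.poisson(1.0, size=(200, 50)).astype(float)
w_true = np.zeros(50); w_true[:5] = 1.0
y = X @ w_true + 0.1 * rng.standard_normal(200)
w_hat = droplasso(X, y, keep_prob=0.6, lam=0.05)
```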
Adversarial Attacks on Node Embeddings via Graph Poisoning
Title | Adversarial Attacks on Node Embeddings via Graph Poisoning |
Authors | Aleksandar Bojchevski, Stephan Günnemann |
Abstract | The goal of network representation learning is to learn low-dimensional node embeddings that capture the graph structure and are useful for solving downstream tasks. However, despite the proliferation of such methods, there is currently no study of their robustness to adversarial attacks. We provide the first adversarial vulnerability analysis on the widely used family of methods based on random walks. We derive efficient adversarial perturbations that poison the network structure and have a negative effect on both the quality of the embeddings and the downstream tasks. We further show that our attacks are transferable since they generalize to many models and are successful even when the attacker is restricted. |
Tasks | Representation Learning |
Published | 2018-09-04 |
URL | https://arxiv.org/abs/1809.01093v3 |
https://arxiv.org/pdf/1809.01093v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-attacks-on-node-embeddings |
Repo | https://github.com/abojchevski/node_embedding_attack |
Framework | tf |
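The paper derives a closed-form surrogate for the embedding loss of random-walk methods; as a rough stand-in for the general idea of ranking candidate edge flips by their spectral impact on the graph, one can use first-order eigenvalue perturbation of the adjacency matrix (flipping edge (i, j) changes eigenvalue lambda_k by roughly ±2·u_k[i]·u_k[j]). This is explicitly not the paper's scoring function, only a simplified proxy.

```python
import numpy as np

def score_edge_flips(A, candidates, top=8):
    """Rank candidate edge flips by first-order change in the top adjacency
    eigenvalues: flipping (i, j) perturbs lambda_k by about s * 2 * u_k[i] * u_k[j],
    with s = -1 for removals and +1 for insertions."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(-np.abs(vals))[:top]         # most influential eigenpairs
    scores = []
    for i, j in candidates:
        s = -1.0 if A[i, j] else 1.0
        dlam = s * 2.0 * vecs[i, idx] * vecs[j, idx]
        scores.append(np.abs(dlam).sum())
    order = np.argsort(-np.asarray(scores))
    return [candidates[o] for o in order]

# Toy undirected graph: rank all possible single-edge flips.
rng = np.random.default_rng(0)
A = (rng.random((20, 20)) < 0.15).astype(float)
A = np.triu(A, 1); A = A + A.T
cands = [(i, j) for i in range(20) for j in range(i + 1, 20)]
worst_first = score_edge_flips(A, cands)
```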
A Multi-Objective Anytime Rule Mining System to Ease Iterative Feedback from Domain Experts
Title | A Multi-Objective Anytime Rule Mining System to Ease Iterative Feedback from Domain Experts |
Authors | Tobias Baum, Steffen Herbold, Kurt Schneider |
Abstract | Data extracted from software repositories is used intensively in Software Engineering research, for example, to predict defects in source code. In our research in this area, with data from open source projects as well as an industrial partner, we noticed several shortcomings of conventional data mining approaches for classification problems: (1) Domain experts’ acceptance is of critical importance, and domain experts can provide valuable input, but it is hard to use this feedback. (2) The evaluation of the model is not a simple matter of calculating AUC or accuracy. Instead, there are multiple objectives of varying importance, but their importance cannot be easily quantified. Furthermore, the performance of the model cannot be evaluated on a per-instance level in our case, because it shares aspects with the set cover problem. To overcome these problems, we take a holistic approach and develop a rule mining system that simplifies iterative feedback from domain experts and can easily incorporate the domain-specific evaluation needs. A central part of the system is a novel multi-objective anytime rule mining algorithm. The algorithm is based on the GRASP-PR meta-heuristic but extends it with ideas from several other approaches. We successfully applied the system in the industrial context. In the current article, we focus on the description of the algorithm and the concepts of the system. We provide an implementation of the system for reuse. |
Tasks | |
Published | 2018-12-23 |
URL | http://arxiv.org/abs/1812.09746v1 |
http://arxiv.org/pdf/1812.09746v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-objective-anytime-rule-mining-system |
Repo | https://github.com/tobiasbaum/GIMO-m |
Framework | none |
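The system's algorithm extends the GRASP-PR meta-heuristic; stripped of the rule-specific details and the path-relinking step, a generic anytime GRASP loop (randomized greedy construction followed by local search, repeated until a time budget expires) looks roughly like this. The construction, local search and objective below are toy placeholders.

```python
import random
import time

def grasp(construct_greedy_randomized, local_search, objective,
          time_budget_s=5.0, alpha=0.3, seed=0):
    """Generic anytime GRASP loop: keep the best solution found so far and
    return it when the time budget is exhausted."""
    rng = random.Random(seed)
    best, best_val = None, float("-inf")
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline:
        candidate = construct_greedy_randomized(rng, alpha)   # randomized greedy build
        candidate = local_search(candidate)                   # improve locally
        val = objective(candidate)
        if val > best_val:
            best, best_val = candidate, val
    return best, best_val

# Toy usage: maximize -(x - 3)^2 over integers 0..100.
obj = lambda x: -(x - 3) ** 2
build = lambda rng, a: rng.randint(0, 100)
def improve(x):
    while obj(x + 1) > obj(x): x += 1
    while obj(x - 1) > obj(x): x -= 1
    return x
print(grasp(build, improve, obj, time_budget_s=0.1))
```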
Reverse engineering of CAD models via clustering and approximate implicitization
Title | Reverse engineering of CAD models via clustering and approximate implicitization |
Authors | Andrea Raffo, Oliver J. D. Barrowclough, Georg Muntingh |
Abstract | In applications like computer aided design, geometric models are often represented numerically as polynomial splines or NURBS, even when they originate from primitive geometry. For purposes such as redesign and isogeometric analysis, it is of interest to extract information about the underlying geometry through reverse engineering. In this work we develop a novel method to determine these primitive shapes by combining clustering analysis with approximate implicitization. The proposed method is automatic and can recover algebraic hypersurfaces of any degree in any dimension. In exact arithmetic, the algorithm returns exact results. All the required parameters, such as the implicit degree of the patches and the number of clusters of the model, are inferred using numerical approaches in order to obtain an algorithm that requires as little manual input as possible. The effectiveness, efficiency and robustness of the method are shown both in a theoretical analysis and in numerical examples implemented in Python. |
Tasks | |
Published | 2018-10-17 |
URL | https://arxiv.org/abs/1810.07451v2 |
https://arxiv.org/pdf/1810.07451v2.pdf | |
PWC | https://paperswithcode.com/paper/reverse-engineering-of-cad-models-via |
Repo | https://github.com/georgmuntingh/ImplicitClustering |
Framework | none |
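Approximate implicitization, one of the paper's two building blocks, fits an implicit polynomial that approximately vanishes on sampled points by taking the smallest right singular vector of a collocation matrix of monomials. A minimal 2D sketch under that formulation (degree and sampling are illustrative) is shown below.

```python
import numpy as np

def approx_implicitize(points, degree=2):
    """Coefficients c_ij (i + j <= degree) of an implicit polynomial
    f(x, y) = sum c_ij x^i y^j that approximately vanishes on the points,
    via the smallest right singular vector of the monomial collocation matrix."""
    x, y = points[:, 0], points[:, 1]
    exps = [(i, j) for i in range(degree + 1) for j in range(degree + 1 - i)]
    M = np.column_stack([x ** i * y ** j for i, j in exps])
    _, _, Vt = np.linalg.svd(M)
    return dict(zip(exps, Vt[-1]))        # coefficients of the best-fit implicit form

# Points on the unit circle: the recovered polynomial is ~ x^2 + y^2 - 1 (up to scale).
t = np.linspace(0, 2 * np.pi, 200)
coeffs = approx_implicitize(np.column_stack([np.cos(t), np.sin(t)]), degree=2)
```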
Reproducing AmbientGAN: Generative models from lossy measurements
Title | Reproducing AmbientGAN: Generative models from lossy measurements |
Authors | Mehdi Ahmadi, Timothy Nest, Mostafa Abdelnaim, Thanh-Dung Le |
Abstract | In recent years, Generative Adversarial Networks (GANs) have shown substantial progress in modeling complex distributions of data. These networks have received tremendous attention since they can generate implicit probabilistic models that produce realistic data using a stochastic procedure. While such models have proven highly effective in diverse scenarios, they require a large set of fully-observed training samples. In many applications access to such samples is difficult or even impractical, and only noisy or partial observations of the desired distribution are available. Recent research has tried to address the problem of incompletely observed samples to recover the distribution of the data. Zhu et al. (2017) and Yeh et al. (2016) proposed methods to solve ill-posed inverse problems using cycle-consistency and latent-space mappings in adversarial networks, respectively. Bora et al. (2017) and Kabkab et al. (2018) have applied similar adversarial approaches to the problem of compressed sensing. In this work, we focus on a new variant of GAN models called AmbientGAN, which incorporates a measurement process (e.g. adding noise, data removal and projection) into the GAN training. While in the standard GAN the discriminator distinguishes a generated image from a real image, in the AmbientGAN model the discriminator has to separate a real measurement from a simulated measurement of a generated image. The results shown by Bora et al. (2018) are quite promising for the problem of incomplete data, and have potentially important implications for generative approaches to compressed sensing and ill-posed problems. |
Tasks | |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.10108v1 |
http://arxiv.org/pdf/1810.10108v1.pdf | |
PWC | https://paperswithcode.com/paper/reproducing-ambientgan-generative-models-from |
Repo | https://github.com/AshishBora/ambient-gan |
Framework | tf |
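The defining change in AmbientGAN is that the measurement process is applied to the generator's output before it reaches the discriminator, so the discriminator only ever compares measurements. A minimal PyTorch sketch of one training step with a random-pixel-dropping measurement follows; the tiny fully connected G and D, the image size and the stand-in data batch are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn

def measure(x, keep_prob=0.7):
    """Lossy measurement: randomly zero out pixels (one simple choice)."""
    return x * (torch.rand_like(x) < keep_prob).float()

dim, z_dim = 28 * 28, 64
G = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, dim), nn.Tanh())
D = nn.Sequential(nn.Linear(dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_measurements):               # we only ever observe measurements
    n = real_measurements.size(0)
    z = torch.randn(n, z_dim)
    fake_measurements = measure(G(z))            # measure the *generated* images
    # Discriminator: real measurement vs. simulated measurement of a fake image.
    d_loss = bce(D(real_measurements), torch.ones(n, 1)) + \
             bce(D(fake_measurements.detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator through the measurement process.
    g_loss = bce(D(measure(G(z))), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

train_step(measure(torch.rand(32, dim) * 2 - 1))  # stand-in batch of measurements
```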
ProxQuant: Quantized Neural Networks via Proximal Operators
Title | ProxQuant: Quantized Neural Networks via Proximal Operators |
Authors | Yu Bai, Yu-Xiang Wang, Edo Liberty |
Abstract | To make deep neural networks feasible in resource-constrained environments (such as mobile devices), it is beneficial to quantize models by using low-precision weights. One common technique for quantizing neural networks is the straight-through gradient method, which enables back-propagation through the quantization mapping. Despite its empirical success, little is understood about why the straight-through gradient method works. Building upon a novel observation that the straight-through gradient method is in fact identical to the well-known Nesterov’s dual-averaging algorithm on a quantization constrained optimization problem, we propose a more principled alternative approach, called ProxQuant, that formulates quantized network training as a regularized learning problem instead and optimizes it via the prox-gradient method. ProxQuant does back-propagation on the underlying full-precision vector and applies an efficient prox-operator in between stochastic gradient steps to encourage quantizedness. For quantizing ResNets and LSTMs, ProxQuant outperforms state-of-the-art results on binary quantization and is on par with state-of-the-art on multi-bit quantization. For binary quantization, our analysis shows both theoretically and experimentally that ProxQuant is more stable than the straight-through gradient method (i.e. BinaryConnect), challenging the indispensability of the straight-through gradient method and providing a powerful alternative. |
Tasks | Quantization |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00861v3 |
http://arxiv.org/pdf/1810.00861v3.pdf | |
PWC | https://paperswithcode.com/paper/proxquant-quantized-neural-networks-via |
Repo | https://github.com/allenbai01/ProxQuant |
Framework | pytorch |
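For binary quantization, the regularizer behind ProxQuant pulls each weight toward the nearer of ±1, and its proximal operator has a simple closed form: move |w| toward 1 by the step size without overshooting. A sketch of that prox interleaved with ordinary gradient steps on the full-precision weights is given below; the loss and the schedule for the regularization strength are placeholders.

```python
import torch

def soft_threshold(z, t):
    return torch.sign(z) * torch.clamp(z.abs() - t, min=0)

def prox_binary(w, reg):
    """Prox of reg * min(|w - 1|, |w + 1|): shrink |w| toward 1 by `reg`,
    never crossing it, keeping the sign of w."""
    return torch.sign(w) * (1 + soft_threshold(w.abs() - 1, reg))

# Sketch of a training loop: gradient step on full-precision weights, then prox.
w = torch.randn(10, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)
for step in range(100):
    loss = ((w - torch.linspace(-2, 2, 10)) ** 2).sum()   # placeholder loss
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        w.copy_(prox_binary(w, reg=0.01 * step))          # growing reg -> near-binary w
final = torch.sign(w)                                      # quantize at the end
```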
Revisiting Spatial-Temporal Similarity: A Deep Learning Framework for Traffic Prediction
Title | Revisiting Spatial-Temporal Similarity: A Deep Learning Framework for Traffic Prediction |
Authors | Huaxiu Yao, Xianfeng Tang, Hua Wei, Guanjie Zheng, Zhenhui Li |
Abstract | Traffic prediction has drawn increasing attention in AI research field due to the increasing availability of large-scale traffic data and its importance in the real world. For example, an accurate taxi demand prediction can assist taxi companies in pre-allocating taxis. The key challenge of traffic prediction lies in how to model the complex spatial dependencies and temporal dynamics. Although both factors have been considered in modeling, existing works make strong assumptions about spatial dependence and temporal dynamics, i.e., spatial dependence is stationary in time, and temporal dynamics is strictly periodical. However, in practice, the spatial dependence could be dynamic (i.e., changing from time to time), and the temporal dynamics could have some perturbation from one period to another period. In this paper, we make two important observations: (1) the spatial dependencies between locations are dynamic; and (2) the temporal dependency follows daily and weekly pattern but it is not strictly periodic for its dynamic temporal shifting. To address these two issues, we propose a novel Spatial-Temporal Dynamic Network (STDN), in which a flow gating mechanism is introduced to learn the dynamic similarity between locations, and a periodically shifted attention mechanism is designed to handle long-term periodic temporal shifting. To the best of our knowledge, this is the first work that tackles both issues in a unified framework. Our experimental results on real-world traffic datasets verify the effectiveness of the proposed method. |
Tasks | Traffic Prediction |
Published | 2018-03-03 |
URL | http://arxiv.org/abs/1803.01254v2 |
http://arxiv.org/pdf/1803.01254v2.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-spatial-temporal-similarity-a-deep |
Repo | https://github.com/tpepin96/NYCDatasetProcessing |
Framework | none |
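The periodically shifted attention in STDN attends, for the current prediction time, over a small window of time steps centred on the same clock time in each of the previous few days, so a daily pattern that drifts by a few steps can still be matched. The following is a bare dot-product-attention sketch of that idea over precomputed per-time-step features; the flow-gated CNN features and the paper's exact attention parametrization are not reproduced.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def periodically_shifted_attention(h, t, days=3, steps_per_day=48, window=2):
    """Attend from the representation at time t over windows centred on the
    same time-of-day in each of the previous `days` days.
    h: (T, d) array of per-time-step representations."""
    q = h[t]
    contexts = []
    for day in range(1, days + 1):
        centre = t - day * steps_per_day
        idx = np.arange(centre - window, centre + window + 1)
        keys = h[idx]                              # shifted window on a past day
        attn = softmax(keys @ q)                   # dot-product attention weights
        contexts.append(attn @ keys)               # one context vector per past day
    return np.stack(contexts)                      # combined downstream by an LSTM

h = np.random.randn(48 * 7, 32)                    # a week of half-hour features
ctx = periodically_shifted_attention(h, t=48 * 6 + 20)
```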
Convolutional Neural Networks In Convolution
Title | Convolutional Neural Networks In Convolution |
Authors | Xiaobo Huang |
Abstract | Currently, increasingly deep neural networks have been applied to improve accuracy. In contrast, we propose a novel, wider Convolutional Neural Network (CNN) architecture, motivated by the Multi-column Deep Neural Networks and the Network In Network (NIN), aiming for higher accuracy without input data transmutation. In our architecture, namely “CNN In Convolution” (CNNIC), a small CNN, instead of the original generalized linear model (GLM) based filters, is convolved as a kernel over the original image, serving as the feature-extracting layer of the network. Further classification is then carried out by a global average pooling layer and a softmax layer. Dropout and orthonormal initialization are applied to overcome training difficulties including slow convergence and over-fitting. Persuasive classification performance is demonstrated on MNIST. |
Tasks | |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.03946v1 |
http://arxiv.org/pdf/1810.03946v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-in-convolution |
Repo | https://github.com/MyWorkShop/Convolutional-Neural-Networks-in-Convolution |
Framework | tf |
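The core construction, a small shared network applied to every image patch in place of a plain linear filter, can be sketched in PyTorch with nn.Unfold: extract patches, run a tiny MLP over each patch, and global-average-pool the per-patch outputs into class logits. Patch size, stride and layer widths below are illustrative, not the paper's MNIST configuration.

```python
import torch
import torch.nn as nn

class CNNInConvolution(nn.Module):
    """A tiny network convolved over patches as the 'kernel', then GAP + softmax."""
    def __init__(self, patch=5, stride=2, channels=1, n_classes=10, hidden=32):
        super().__init__()
        self.unfold = nn.Unfold(kernel_size=patch, stride=stride)
        self.patch_net = nn.Sequential(               # shared across all patches
            nn.Linear(channels * patch * patch, hidden), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden, n_classes))

    def forward(self, x):
        patches = self.unfold(x)                       # (B, C*p*p, L)
        patches = patches.transpose(1, 2)              # (B, L, C*p*p)
        logits_per_patch = self.patch_net(patches)     # (B, L, n_classes)
        return logits_per_patch.mean(dim=1)            # global average pooling

model = CNNInConvolution()
out = model(torch.randn(8, 1, 28, 28))                 # (8, 10) class logits
```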
Approximating the solution to wave propagation using deep neural networks
Title | Approximating the solution to wave propagation using deep neural networks |
Authors | Wilhelm E. Sorteberg, Stef Garasto, Alison S. Pouplin, Chris D. Cantwell, Anil A. Bharath |
Abstract | Humans gain an implicit understanding of physical laws through observing and interacting with the world. Endowing an autonomous agent with an understanding of physical laws through experience and observation is seldom practical: we should seek alternatives. Fortunately, many of the laws of behaviour of the physical world can be derived from prior knowledge of dynamical systems, expressed through the use of partial differential equations. In this work, we suggest a neural network capable of understanding a specific physical phenomenon: wave propagation in a two-dimensional medium. We define 'understanding' in this context as the ability to predict the future evolution of the spatial patterns of rendered wave amplitude from a relatively small set of initial observations. The inherent complexity of the wave equations – together with the existence of reflections and interference – makes the prediction problem non-trivial. A network capable of making approximate predictions also unlocks the opportunity to speed up numerical simulations for wave propagation. To this aim, we created a novel dataset of simulated wave motion and built a predictive deep neural network comprising three main blocks: an encoder, a propagator made of three LSTMs, and a decoder. Results show reasonable predictions for as long as 80 time steps into the future on a dataset not seen during training. Furthermore, the network is able to generalize to an initial condition that is qualitatively different from those seen during training. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01609v1 |
http://arxiv.org/pdf/1812.01609v1.pdf | |
PWC | https://paperswithcode.com/paper/approximating-the-solution-to-wave |
Repo | https://github.com/stathius/wave_propagation |
Framework | pytorch |
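The described architecture (an encoder, a propagator made of three LSTMs, and a decoder) can be sketched in PyTorch roughly as below, predicting the next frame of the wave field from a short history of frames; the field size, latent width and layer shapes are placeholders rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class WavePredictor(nn.Module):
    """Encoder -> 3-layer LSTM propagator -> decoder, predicting the next frame
    of a 2-D wave field from a short history of frames."""
    def __init__(self, size=64, latent=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # size/2
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # size/4
            nn.Flatten(), nn.Linear(32 * (size // 4) ** 2, latent))
        self.propagator = nn.LSTM(latent, latent, num_layers=3, batch_first=True)
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32 * (size // 4) ** 2),
            nn.Unflatten(1, (32, size // 4, size // 4)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, frames):                       # frames: (B, T, 1, H, W)
        B, T = frames.shape[:2]
        z = self.encoder(frames.reshape(B * T, *frames.shape[2:])).reshape(B, T, -1)
        h, _ = self.propagator(z)                    # latent dynamics over time
        return self.decoder(h[:, -1])                # predicted next frame (B, 1, H, W)

pred = WavePredictor()(torch.randn(2, 5, 1, 64, 64))
```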