February 2, 2020

Paper Group AWR 60

On Evaluating Adversarial Robustness

Title On Evaluating Adversarial Robustness
Authors Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, Alexey Kurakin
Abstract Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect. We believe a large contributing factor is the difficulty of performing security evaluations. In this paper, we discuss the methodological foundations, review commonly accepted best practices, and suggest new methods for evaluating defenses to adversarial examples. We hope that both researchers developing defenses as well as readers and reviewers who wish to understand the completeness of an evaluation consider our advice in order to avoid common pitfalls.
Tasks Adversarial Attack, Adversarial Defense
Published 2019-02-18
URL http://arxiv.org/abs/1902.06705v2
PDF http://arxiv.org/pdf/1902.06705v2.pdf
PWC https://paperswithcode.com/paper/on-evaluating-adversarial-robustness
Repo https://github.com/locuslab/fast_adversarial
Framework pytorch

Diffprivlib: The IBM Differential Privacy Library

Title Diffprivlib: The IBM Differential Privacy Library
Authors Naoise Holohan, Stefano Braghin, Pól Mac Aonghusa, Killian Levacher
Abstract Since its conception in 2006, differential privacy has emerged as the de-facto standard in data privacy, owing to its robust mathematical guarantees, generalised applicability and rich body of literature. Over the years, researchers have studied differential privacy and its applicability to an ever-widening field of topics. Mechanisms have been created to optimise the process of achieving differential privacy, for various data types and scenarios. Until this work, however, all previous work on differential privacy has been conducted on an ad-hoc basis, without a single, unifying codebase to implement results. In this work, we present the IBM Differential Privacy Library, a general purpose, open source library for investigating, experimenting and developing differential privacy applications in the Python programming language. The library includes a host of mechanisms, the building blocks of differential privacy, alongside a number of applications to machine learning and other data analytics tasks. Simplicity and accessibility have been prioritised in developing the library, making it suitable to a wide audience of users, from those using the library for their first investigations in data privacy, to the privacy experts looking to contribute their own models and mechanisms for others to use.
Tasks
Published 2019-07-04
URL https://arxiv.org/abs/1907.02444v1
PDF https://arxiv.org/pdf/1907.02444v1.pdf
PWC https://paperswithcode.com/paper/diffprivlib-the-ibm-differential-privacy
Repo https://github.com/IBM/differential-privacy-library
Framework none
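
The "mechanisms" that the Diffprivlib abstract calls the building blocks of differential privacy can be illustrated with the Laplace mechanism. The sketch below uses only the Python standard library and is not Diffprivlib's API; the names `laplace_mechanism` and `private_count` and the example data are assumptions made here for illustration, and a plain floating-point sampler like this one is not hardened for production use.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Add Laplace noise with scale sensitivity / epsilon to true_value."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


def private_count(values, predicate, epsilon, rng=random):
    """Noisy count; adding or removing one record changes a count by at most 1."""
    true_count = sum(1 for v in values if predicate(v))
    return laplace_mechanism(true_count, sensitivity=1.0, epsilon=epsilon, rng=rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67]  # hypothetical data
    print(private_count(ages, lambda a: a >= 40, epsilon=1.0))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy.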

Fast-SCNN: Fast Semantic Segmentation Network

Title Fast-SCNN: Fast Semantic Segmentation Network
Authors Rudra P K Poudel, Stephan Liwicki, Roberto Cipolla
Abstract The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. Since the rise in autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce fast segmentation convolutional neural network (Fast-SCNN), an above real-time semantic segmentation model on high resolution image data (1024x2048px) suited to efficient computation on embedded devices with low memory. Building on existing two-branch methods for fast segmentation, we introduce our 'learning to downsample' module which computes low-level features for multiple resolution branches simultaneously. Our network combines spatial detail at high resolution with deep features extracted at lower resolution, yielding an accuracy of 68.0% mean intersection over union at 123.5 frames per second on Cityscapes. We also show that large scale pre-training is unnecessary. We thoroughly validate our metric in experiments with ImageNet pre-training and the coarse labeled data of Cityscapes. Finally, we show even faster computation with competitive results on subsampled inputs, without any network modifications.
Tasks Real-Time Semantic Segmentation, Semantic Segmentation
Published 2019-02-12
URL http://arxiv.org/abs/1902.04502v1
PDF http://arxiv.org/pdf/1902.04502v1.pdf
PWC https://paperswithcode.com/paper/fast-scnn-fast-semantic-segmentation-network
Repo https://github.com/SkyWa7ch3r/ImageSegmentation
Framework tf
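
The Fast-SCNN abstract reports accuracy as mean intersection over union (mIoU). As a rough illustration of how that metric is computed, here is a minimal sketch over flattened label lists; the function name `mean_iou` and the toy labels are assumptions made here, and benchmark tooling such as the Cityscapes evaluation accumulates counts over a whole dataset rather than a single list.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU over classes present in the labels or predictions."""
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1
    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # skip classes that never occur
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Class 0 has IoU 1/2 and class 1 has IoU 2/3 on this toy example.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```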

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

Title Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
Authors Giancarlo Kerg, Kyle Goyette, Maximilian Puelma Touzel, Gauthier Gidel, Eugene Vorontsov, Yoshua Bengio, Guillaume Lajoie
Abstract A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This ensures eigenvalues with unit norm and thus stable dynamics and training. However, this comes at the cost of reduced expressivity due to the limited variety of orthogonal transformations. We propose a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts. This allows us to parametrize matrices with unit-norm eigenspectra without orthogonality constraints on eigenbases. The resulting architecture ensures access to a larger space of spectrally constrained matrices, of which orthogonal matrices are a subset. This crucial difference retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.12080v2
PDF https://arxiv.org/pdf/1905.12080v2.pdf
PWC https://paperswithcode.com/paper/non-normal-recurrent-neural-network-nnrnn
Repo https://github.com/nnRNN/nnRNN_release
Framework pytorch
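
The idea in the nnRNN abstract, unit-norm eigenvalues without an orthogonality constraint, can be checked on a tiny example. The sketch below is a simplification and not the paper's parametrization: it builds a complex upper-triangular (Schur-form) matrix directly, with the change of basis taken to be the identity, and the helper names are assumptions made here. Because the eigenvalues of a triangular matrix are its diagonal entries, the diagonal alone fixes the spectrum, while the strictly upper triangle makes the matrix non-normal.

```python
import cmath


def schur_form(thetas, coupling):
    """Upper-triangular matrix with unit-modulus diagonal entries.

    The diagonal (normal part) sets the eigenvalues; the strictly upper
    triangle (non-normal part) is filled with `coupling`.
    """
    n = len(thetas)
    return [
        [
            cmath.exp(1j * thetas[i]) if i == j
            else (complex(coupling) if j > i else 0j)
            for j in range(n)
        ]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def conj_transpose(a):
    """Conjugate transpose A^H."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def normality_gap(a):
    """Largest |entry| of A A^H - A^H A; zero exactly when A is normal."""
    ah = conj_transpose(a)
    left, right = matmul(a, ah), matmul(ah, a)
    n = len(a)
    return max(abs(left[i][j] - right[i][j]) for i in range(n) for j in range(n))


if __name__ == "__main__":
    t = schur_form([0.3, 1.2, 2.5], coupling=0.5)
    print([abs(t[i][i]) for i in range(3)])  # eigenvalue moduli
    print(normality_gap(t))                  # positive: not normal, hence not unitary
```

Setting `coupling` to zero recovers a diagonal unitary matrix, which is the normal special case.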

Graphonomy: Universal Human Parsing via Graph Transfer Learning

Title Graphonomy: Universal Human Parsing via Graph Transfer Learning
Authors Ke Gong, Yiming Gao, Xiaodan Liang, Xiaohui Shen, Meng Wang, Liang Lin
Abstract Prior highly-tuned human parsing models tend to fit towards each dataset in a specific domain or with discrepant label granularity, and can hardly be adapted to other human parsing tasks without extensive re-training. In this paper, we aim to learn a single universal human parsing model that can tackle all kinds of human parsing needs by unifying label annotations from different domains or at various levels of granularity. This poses many fundamental learning challenges, e.g. discovering underlying semantic structures among different label granularity, performing proper transfer learning across different image domains, and identifying and utilizing label redundancies across related tasks. To address these challenges, we propose a new universal human parsing agent, named “Graphonomy”, which incorporates hierarchical graph transfer learning upon the conventional parsing network to encode the underlying label semantic structures and propagate relevant semantic information. In particular, Graphonomy first learns and propagates compact high-level graph representation among the labels within one dataset via Intra-Graph Reasoning, and then transfers semantic information across multiple datasets via Inter-Graph Transfer. Various graph transfer dependencies (e.g., similarity, linguistic knowledge) between different datasets are analyzed and encoded to enhance graph transfer capability. By distilling universal semantic graph representation to each specific task, Graphonomy is able to predict all levels of parsing labels in one system without piling up the complexity. Experimental results show Graphonomy effectively achieves the state-of-the-art results on three human parsing benchmarks as well as advantageous universal human parsing performance.
Tasks Human Parsing, Transfer Learning
Published 2019-04-09
URL http://arxiv.org/abs/1904.04536v1
PDF http://arxiv.org/pdf/1904.04536v1.pdf
PWC https://paperswithcode.com/paper/graphonomy-universal-human-parsing-via-graph
Repo https://github.com/Gaoyiminggithub/Graphonomy
Framework pytorch

Self-Regulated Interactive Sequence-to-Sequence Learning

Title Self-Regulated Interactive Sequence-to-Sequence Learning
Authors Julia Kreutzer, Stefan Riezler
Abstract Not all types of supervision signals are created equal: Different types of feedback have different costs and effects on learning. We show how self-regulation strategies that decide when to ask for which kind of feedback from a teacher (or from oneself) can be cast as a learning-to-learn problem leading to improved cost-aware sequence-to-sequence learning. In experiments on interactive neural machine translation, we find that the self-regulator discovers an $\epsilon$-greedy strategy for the optimal cost-quality trade-off by mixing different feedback types including corrections, error markups, and self-supervision. Furthermore, we demonstrate its robustness under domain shift and identify it as a promising alternative to active learning.
Tasks Active Learning, Machine Translation
Published 2019-07-11
URL https://arxiv.org/abs/1907.05190v2
PDF https://arxiv.org/pdf/1907.05190v2.pdf
PWC https://paperswithcode.com/paper/self-regulated-interactive-sequence-to
Repo https://github.com/juliakreutzer/joeynmt
Framework pytorch
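
For readers unfamiliar with the $\epsilon$-greedy strategy mentioned in the abstract above, here is a minimal sketch of such a rule applied to choosing a feedback type. It is not the paper's self-regulator, which is learned; the function name `choose_feedback`, the feedback labels, and the value estimates are hypothetical placeholders.

```python
import random


def choose_feedback(estimated_value, epsilon, rng=random):
    """Explore a uniformly random feedback type with probability epsilon,
    otherwise exploit the type with the highest estimated value."""
    actions = list(estimated_value)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=estimated_value.get)


if __name__ == "__main__":
    # Hypothetical estimates of quality gain per unit cost for each feedback type.
    values = {"correction": 0.2, "markup": 0.5, "self_supervision": 0.1}
    print(choose_feedback(values, epsilon=0.1))
```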

Learning Discrete Structures for Graph Neural Networks

Title Learning Discrete Structures for Graph Neural Networks
Authors Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He
Abstract Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
Tasks Music Genre Recognition, Node Classification
Published 2019-03-28
URL https://arxiv.org/abs/1903.11960v3
PDF https://arxiv.org/pdf/1903.11960v3.pdf
PWC https://paperswithcode.com/paper/learning-discrete-structures-for-graph-neural
Repo https://github.com/lucfra/LDS-GNN
Framework tf

Learning Counterfactual Representations for Estimating Individual Dose-Response Curves

Title Learning Counterfactual Representations for Estimating Individual Dose-Response Curves
Authors Patrick Schwab, Lorenz Linhardt, Stefan Bauer, Joachim M. Buhmann, Walter Karlen
Abstract Estimating what would be an individual’s potential response to varying levels of exposure to a treatment is of high practical relevance for several important fields, such as healthcare, economics and public policy. However, existing methods for learning to estimate counterfactual outcomes from observational data are either focused on estimating average dose-response curves, or limited to settings with only two treatments that do not have an associated dosage parameter. Here, we present a novel machine-learning approach towards learning counterfactual representations for estimating individual dose-response curves for any number of treatments with continuous dosage parameters with neural networks. Building on the established potential outcomes framework, we introduce performance metrics, model selection criteria, model architectures, and open benchmarks for estimating individual dose-response curves. Our experiments show that the methods developed in this work set a new state-of-the-art in estimating individual dose-response.
Tasks Model Selection
Published 2019-02-03
URL https://arxiv.org/abs/1902.00981v2
PDF https://arxiv.org/pdf/1902.00981v2.pdf
PWC https://paperswithcode.com/paper/learning-counterfactual-representations-for
Repo https://github.com/d909b/drnet
Framework none

Variable-lag Granger Causality for Time Series Analysis

Title Variable-lag Granger Causality for Time Series Analysis
Authors Chainarong Amornbunchornvej, Elena Zheleva, Tanya Y. Berger-Wolf
Abstract Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make a strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. However, the assumption of the fixed time delay does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop variable-lag Granger causality, a generalization of Granger causality that relaxes the assumption of the fixed time delay and allows causes to influence effects with arbitrary time delays. In addition, we propose a method for inferring variable-lag Granger causality relations. We demonstrate our approach on an application for studying coordinated collective behavior and show that it performs better than several existing methods in both simulated and real-world datasets. Our approach can be applied in any domain of time series analysis.
Tasks Causal Inference, Time Series, Time Series Analysis
Published 2019-12-18
URL https://arxiv.org/abs/1912.10829v1
PDF https://arxiv.org/pdf/1912.10829v1.pdf
PWC https://paperswithcode.com/paper/variable-lag-granger-causality-for-time
Repo https://github.com/DarkEyes/VLTimeSeriesCausality
Framework none
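
The fixed-lag assumption that the abstract above sets out to relax can be made concrete with a small baseline. The sketch below is a plain lag-1 Granger comparison, not the variable-lag method of the paper: it fits `y[t] ~ a*y[t-1]` and `y[t] ~ a*y[t-1] + b*x[t-1]` by least squares without intercepts (so it assumes roughly zero-mean series) and compares the residuals with an F statistic. All function names and the synthetic data are assumptions made here.

```python
import random


def rss_restricted(y):
    """Residual sum of squares of y[t] ~ a * y[t-1]."""
    ts = range(1, len(y))
    den = sum(y[t - 1] ** 2 for t in ts)
    a = sum(y[t] * y[t - 1] for t in ts) / den if den else 0.0
    return sum((y[t] - a * y[t - 1]) ** 2 for t in ts)


def rss_full(x, y):
    """Residual sum of squares of y[t] ~ a * y[t-1] + b * x[t-1]."""
    ts = range(1, len(y))
    syy = sum(y[t - 1] ** 2 for t in ts)
    sxx = sum(x[t - 1] ** 2 for t in ts)
    sxy = sum(x[t - 1] * y[t - 1] for t in ts)
    ry = sum(y[t] * y[t - 1] for t in ts)
    rx = sum(y[t] * x[t - 1] for t in ts)
    det = syy * sxx - sxy ** 2
    if det == 0:
        raise ValueError("lagged regressors are collinear")
    # Solve the 2x2 normal equations with Cramer's rule.
    a = (ry * sxx - rx * sxy) / det
    b = (rx * syy - ry * sxy) / det
    return sum((y[t] - a * y[t - 1] - b * x[t - 1]) ** 2 for t in ts)


def granger_f_stat(x, y):
    """F statistic for adding x[t-1] to an AR(1) model of y (1 extra parameter)."""
    n = len(y) - 1  # number of usable time points
    r0, r1 = rss_restricted(y), rss_full(x, y)
    return (r0 - r1) / (r1 / (n - 2))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0, 1) for _ in range(500)]
    # y follows x with a fixed delay of exactly one step, plus noise.
    y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 500)]
    print(granger_f_stat(x, y), granger_f_stat(y, x))
```

If the true delay between `x` and `y` drifted over time, this fixed-lag comparison would lose power, which is the situation the variable-lag generalization targets.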

Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time

Title Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time
Authors Christoph Heindl, Thomas Pönitz, Gernot Stübl, Andreas Pichler, Josef Scharinger
Abstract Commodity RGB-D sensors capture color images along with dense pixel-wise depth information in real-time. Typical RGB-D sensors are provided with a factory calibration and exhibit erratic depth readings due to coarse calibration values, ageing and thermal influence effects. This limits their applicability in computer vision and robotics. We propose a novel method to accurately calibrate depth considering spatial and thermal influences jointly. Our work is based on Gaussian Process Regression in a four dimensional Cartesian and thermal domain. We propose to leverage modern GPUs for dense depth map correction in real-time. For reproducibility we make our dataset and source code publicly available.
Tasks Calibration, Gaussian Processes
Published 2019-07-01
URL https://arxiv.org/abs/1907.00549v1
PDF https://arxiv.org/pdf/1907.00549v1.pdf
PWC https://paperswithcode.com/paper/spatio-thermal-depth-correction-of-rgb-d
Repo https://github.com/cheind/rgbd-correction
Framework tf

Explainability and Adversarial Robustness for RNNs

Title Explainability and Adversarial Robustness for RNNs
Authors Alexander Hartl, Maximilian Bachl, Joachim Fabini, Tanja Zseby
Abstract Recurrent Neural Networks (RNNs) yield attractive properties for constructing Intrusion Detection Systems (IDSs) for network data. With the rise of ubiquitous Machine Learning (ML) systems, malicious actors have been catching up quickly to find new ways to exploit ML vulnerabilities for profit. Recently developed adversarial ML techniques focus on computer vision and their applicability to network traffic is not straightforward: Network packets expose fewer features than an image, are sequential and impose several constraints on their features. We show that despite these completely different characteristics, adversarial samples can be generated reliably for RNNs. To understand a classifier’s potential for misclassification, we extend existing explainability techniques and propose new ones, suitable particularly for sequential data. Applying them shows that already the first packets of a communication flow are of crucial importance and are likely to be targeted by attackers. Feature importance methods show that even relatively unimportant features can be effectively abused to generate adversarial samples. Since traditional evaluation metrics such as accuracy are not sufficient for quantifying the adversarial threat, we propose the Adversarial Robustness Score (ARS) for comparing IDSs, capturing a common notion of adversarial robustness, and show that an adversarial training procedure can significantly and successfully reduce the attack surface.
Tasks Feature Importance, Intrusion Detection
Published 2019-12-20
URL https://arxiv.org/abs/1912.09855v2
PDF https://arxiv.org/pdf/1912.09855v2.pdf
PWC https://paperswithcode.com/paper/explainability-and-adversarial-robustness-for
Repo https://github.com/CN-TU/adversarial-recurrent-ids
Framework none

Minimal Achievable Sufficient Statistic Learning

Title Minimal Achievable Sufficient Statistic Learning
Authors Milan Cvitkovic, Günther Koliander
Abstract We introduce Minimal Achievable Sufficient Statistic (MASS) Learning, a training method for machine learning models that attempts to produce minimal sufficient statistics with respect to a class of functions (e.g. deep networks) being optimized over. In deriving MASS Learning, we also introduce Conserved Differential Information (CDI), an information-theoretic quantity that - unlike standard mutual information - can be usefully applied to deterministically-dependent continuous random variables like the input and output of a deep network. In a series of experiments, we show that deep networks trained with MASS Learning achieve competitive performance on supervised learning and uncertainty quantification benchmarks.
Tasks
Published 2019-05-19
URL https://arxiv.org/abs/1905.07822v2
PDF https://arxiv.org/pdf/1905.07822v2.pdf
PWC https://paperswithcode.com/paper/minimal-achievable-sufficient-statistic
Repo https://github.com/mwcvitkovic/MASS-Learning
Framework pytorch

The NN-Stacking: Feature weighted linear stacking through neural networks

Title The NN-Stacking: Feature weighted linear stacking through neural networks
Authors Victor Coscrato, Marco Henrique de Almeida Inácio, Rafael Izbicki
Abstract Stacking methods improve the prediction performance of regression models. A simple way to stack base regression estimators is by combining them linearly, as done by Breiman (1996). Even though this approach is useful from an interpretative perspective, it often does not lead to high predictive power. We propose the NN-Stacking method (NNS), which generalizes Breiman’s method by allowing the linear parameters to vary with input features. This improvement enables NNS to take advantage of the fact that distinct base models often perform better at different regions of the feature space. Our method uses neural networks to estimate the stacking coefficients. We show that while our approach keeps the interpretative features of Breiman’s method at a local level, it leads to better predictive power, especially in datasets with large sample sizes.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.09735v1
PDF https://arxiv.org/pdf/1906.09735v1.pdf
PWC https://paperswithcode.com/paper/the-nn-stacking-feature-weighted-linear
Repo https://github.com/randommm/nnstacking
Framework pytorch
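
The difference between linear stacking with constant coefficients and stacking with coefficients that vary with the input, as described in the NN-Stacking abstract, can be sketched in a few lines. In the paper the coefficients are estimated by a neural network; here `weight_fn` is a hand-written stand-in, and `stack_predict`, the base models, and the weight functions are all hypothetical names chosen for illustration.

```python
def stack_predict(x, base_models, weight_fn):
    """Return sum_i theta_i(x) * g_i(x) for base models g_i and weights theta_i(x)."""
    weights = weight_fn(x)
    if len(weights) != len(base_models):
        raise ValueError("need exactly one weight per base model")
    return sum(w * g(x) for w, g in zip(weights, base_models))


if __name__ == "__main__":
    base_models = [lambda x: x, lambda x: x * x]  # two toy base regressors

    def constant_weights(x):
        # Constant coefficients: the same linear combination everywhere.
        return [0.5, 0.5]

    def local_weights(x):
        # Feature-dependent coefficients: trust a different model per region.
        return [1.0, 0.0] if x < 0 else [0.0, 1.0]

    print(stack_predict(2.0, base_models, constant_weights))
    print(stack_predict(-2.0, base_models, local_weights))
    print(stack_predict(3.0, base_models, local_weights))
```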

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Title Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation
Authors Zequn Sun, Chengming Wang, Wei Hu, Muhao Chen, Jian Dai, Wei Zhang, Yuzhong Qu
Abstract Graph neural networks (GNNs) have emerged as a powerful paradigm for embedding-based entity alignment due to their capability of identifying isomorphic subgraphs. However, in real knowledge graphs (KGs), the counterpart entities usually have non-isomorphic neighborhood structures, which easily causes GNNs to yield different representations for them. To tackle this problem, we propose a new KG alignment network, namely AliNet, aiming at mitigating the non-isomorphism of neighborhood structures in an end-to-end manner. As the direct neighbors of counterpart entities are usually dissimilar due to the schema heterogeneity, AliNet introduces distant neighbors to expand the overlap between their neighborhood structures. It employs an attention mechanism to highlight helpful distant neighbors and reduce noise. Then, it controls the aggregation of both direct and distant neighborhood information using a gating mechanism. We further propose a relation loss to refine entity representations. We perform thorough experiments with detailed ablation studies and analyses on five entity alignment datasets, demonstrating the effectiveness of AliNet.
Tasks Entity Alignment, Knowledge Graphs
Published 2019-11-20
URL https://arxiv.org/abs/1911.08936v1
PDF https://arxiv.org/pdf/1911.08936v1.pdf
PWC https://paperswithcode.com/paper/knowledge-graph-alignment-network-with-gated
Repo https://github.com/nju-websoft/AliNet
Framework tf

Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability

Title Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability
Authors Shubhomoy Das, Md Rakibul Islam, Nitthilan Kannappan Jayakodi, Janardhan Rao Doppa
Abstract The anomaly detection (AD) task corresponds to identifying the true anomalies from a given set of data instances. AD algorithms score the data instances and produce a ranked list of candidate anomalies, which are then analyzed by a human to discover the true anomalies. However, this process can be laborious for the human analyst when the number of false positives is very high. Therefore, in many real-world AD applications including computer security and fraud prevention, the anomaly detector must be configurable by the human analyst to minimize the effort on false positives. In this paper, we study the problem of active learning to automatically tune an ensemble of anomaly detectors to maximize the number of true anomalies discovered. We make four main contributions towards this goal. First, we present an important insight that explains the practical successes of AD ensembles and how ensembles are naturally suited for active learning. Second, we present several algorithms for active learning with tree-based AD ensembles. These algorithms help us to improve the diversity of discovered anomalies, generate rule sets for improved interpretability of anomalous instances, and adapt to streaming data settings in a principled manner. Third, we present a novel algorithm called GLocalized Anomaly Detection (GLAD) for active learning with generic AD ensembles. GLAD allows end-users to retain the use of simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances using label feedback. Fourth, we present extensive experiments to evaluate our insights and algorithms. Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup.
Tasks Active Learning, Anomaly Detection
Published 2019-01-23
URL http://arxiv.org/abs/1901.08930v1
PDF http://arxiv.org/pdf/1901.08930v1.pdf
PWC https://paperswithcode.com/paper/active-anomaly-detection-via-ensembles-1
Repo https://github.com/shubhomoydas/ad_examples
Framework tf