Paper Group AWR 60
On Evaluating Adversarial Robustness. Diffprivlib: The IBM Differential Privacy Library. Fast-SCNN: Fast Semantic Segmentation Network. Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics. Graphonomy: Universal Human Parsing via Graph Transfer Learning. Self-Regulated Interactive Sequence-to-Sequence Learning. Learning Discrete Structures for Graph Neural Networks. Learning Counterfactual Representations for Estimating Individual Dose-Response Curves. Variable-lag Granger Causality for Time Series Analysis. Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time. Explainability and Adversarial Robustness for RNNs. Minimal Achievable Sufficient Statistic Learning. The NN-Stacking: Feature weighted linear stacking through neural networks. Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation. Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability.
On Evaluating Adversarial Robustness
Title | On Evaluating Adversarial Robustness |
Authors | Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, Alexey Kurakin |
Abstract | Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect. We believe a large contributing factor is the difficulty of performing security evaluations. In this paper, we discuss the methodological foundations, review commonly accepted best practices, and suggest new methods for evaluating defenses to adversarial examples. We hope that both researchers developing defenses as well as readers and reviewers who wish to understand the completeness of an evaluation consider our advice in order to avoid common pitfalls. |
Tasks | Adversarial Attack, Adversarial Defense |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06705v2 |
PDF | http://arxiv.org/pdf/1902.06705v2.pdf |
PWC | https://paperswithcode.com/paper/on-evaluating-adversarial-robustness |
Repo | https://github.com/locuslab/fast_adversarial |
Framework | pytorch |
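One concrete instance of the adaptive, gradient-based evaluation the paper recommends is a projected gradient descent (PGD) attack. The sketch below is a generic PyTorch implementation, not code from the paper or the linked repo; `model`, the epsilon budget, step size, and step count are placeholder assumptions.

```python
# L-infinity PGD (Madry et al.): iteratively ascend the loss, then project
# back into the eps-ball around the clean input. Hyperparameters are
# illustrative placeholders.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()       # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0, 1)                 # keep valid pixel range
    return x_adv.detach()
```

Robust accuracy is then measured as accuracy on `pgd_attack(model, x, y)`; the paper's checklist stresses adapting such attacks to the specific defense rather than reusing them off the shelf.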
Diffprivlib: The IBM Differential Privacy Library
Title | Diffprivlib: The IBM Differential Privacy Library |
Authors | Naoise Holohan, Stefano Braghin, Pól Mac Aonghusa, Killian Levacher |
Abstract | Since its conception in 2006, differential privacy has emerged as the de-facto standard in data privacy, owing to its robust mathematical guarantees, generalised applicability and rich body of literature. Over the years, researchers have studied differential privacy and its applicability to an ever-widening field of topics. Mechanisms have been created to optimise the process of achieving differential privacy, for various data types and scenarios. Until this work, however, all previous work on differential privacy has been conducted on an ad-hoc basis, without a single, unifying codebase to implement results. In this work, we present the IBM Differential Privacy Library, a general-purpose, open source library for investigating, experimenting and developing differential privacy applications in the Python programming language. The library includes a host of mechanisms, the building blocks of differential privacy, alongside a number of applications to machine learning and other data analytics tasks. Simplicity and accessibility have been prioritised in developing the library, making it suitable for a wide audience of users, from those using the library for their first investigations in data privacy, to privacy experts looking to contribute their own models and mechanisms for others to use. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02444v1 |
PDF | https://arxiv.org/pdf/1907.02444v1.pdf |
PWC | https://paperswithcode.com/paper/diffprivlib-the-ibm-differential-privacy |
Repo | https://github.com/IBM/differential-privacy-library |
Framework | none |
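As a quick orientation, here is a minimal usage sketch of the library's two layers: a raw mechanism and a drop-in scikit-learn-style model. Parameter names follow recent diffprivlib releases and may differ across versions; the data is synthetic.

```python
# Minimal diffprivlib usage sketch on synthetic data.
import numpy as np
from diffprivlib.mechanisms import Laplace
from diffprivlib.models import GaussianNB

# Low-level building block: a Laplace mechanism calibrated to sensitivity 1.
mech = Laplace(epsilon=0.5, sensitivity=1.0)
noisy_value = mech.randomise(42.0)

# High-level application: a differentially private scikit-learn-style model.
X = np.random.rand(100, 4)
y = np.random.randint(0, 2, size=100)
clf = GaussianNB(epsilon=1.0, bounds=(0.0, 1.0))
clf.fit(X, y)
print(noisy_value, clf.predict(X[:3]))
```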
Fast-SCNN: Fast Semantic Segmentation Network
Title | Fast-SCNN: Fast Semantic Segmentation Network |
Authors | Rudra P K Poudel, Stephan Liwicki, Roberto Cipolla |
Abstract | The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. Since the rise in autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce fast segmentation convolutional neural network (Fast-SCNN), an above-real-time semantic segmentation model on high resolution image data (1024x2048px) suited to efficient computation on embedded devices with low memory. Building on existing two-branch methods for fast segmentation, we introduce our 'learning to downsample' module which computes low-level features for multiple resolution branches simultaneously. Our network combines spatial detail at high resolution with deep features extracted at lower resolution, yielding an accuracy of 68.0% mean intersection over union at 123.5 frames per second on Cityscapes. We also show that large scale pre-training is unnecessary. We thoroughly validate our metric in experiments with ImageNet pre-training and the coarse labeled data of Cityscapes. Finally, we show even faster computation with competitive results on subsampled inputs, without any network modifications. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04502v1 |
PDF | http://arxiv.org/pdf/1902.04502v1.pdf |
PWC | https://paperswithcode.com/paper/fast-scnn-fast-semantic-segmentation-network |
Repo | https://github.com/SkyWa7ch3r/ImageSegmentation |
Framework | tf |
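A rough tf.keras sketch of the 'learning to downsample' idea: a strided convolution followed by depthwise-separable convolutions reduces the input eightfold while producing low-level features shared by both branches. Filter counts and layer choices here are illustrative assumptions, not the paper's exact configuration.

```python
# "Learning to downsample" sketch: 8x spatial reduction via strided
# (separable) convolutions, producing features shared by both branches.
import tensorflow as tf
from tensorflow.keras import layers

def learning_to_downsample(x):
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.SeparableConv2D(48, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.SeparableConv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    return x  # 1/8-resolution features for both the detail and context paths

inp = tf.keras.Input(shape=(1024, 2048, 3))
feat = learning_to_downsample(inp)  # -> (128, 256, 64)
```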
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
Title | Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics |
Authors | Giancarlo Kerg, Kyle Goyette, Maximilian Puelma Touzel, Gauthier Gidel, Eugene Vorontsov, Yoshua Bengio, Guillaume Lajoie |
Abstract | A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This ensures eigenvalues with unit norm and thus stable dynamics and training. However, this comes at the cost of reduced expressivity due to the limited variety of orthogonal transformations. We propose a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts. This allows us to parametrize matrices with unit-norm eigenspectra without orthogonality constraints on eigenbases. The resulting architecture ensures access to a larger space of spectrally constrained matrices, of which orthogonal matrices are a subset. This crucial difference retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12080v2 |
PDF | https://arxiv.org/pdf/1905.12080v2.pdf |
PWC | https://paperswithcode.com/paper/non-normal-recurrent-neural-network-nnrnn |
Repo | https://github.com/nnRNN/nnRNN_release |
Framework | pytorch |
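The Schur-based split can be illustrated numerically: with an orthogonal basis P, rotation blocks R encoding unit-modulus eigenvalues, and a strictly upper-triangular T (zeroed inside the 2x2 blocks), W = P(R + T)Pᵀ keeps all eigenvalue moduli at 1 while being non-normal. The PyTorch sketch below only checks the spectrum; it is not the authors' trainable parametrization.

```python
# W = P (R + T) P^T has unit-modulus eigenvalues even though T makes it
# non-normal. Sizes and scales are arbitrary.
import torch

def rotation_blocks(thetas):
    """Block-diagonal matrix of 2x2 rotations, eigenvalues exp(+-i*theta)."""
    n = 2 * len(thetas)
    R = torch.zeros(n, n)
    for k, t in enumerate(thetas):
        c, s = torch.cos(t), torch.sin(t)
        R[2*k, 2*k], R[2*k, 2*k+1] = c, -s
        R[2*k+1, 2*k], R[2*k+1, 2*k+1] = s, c
    return R

n = 6
thetas = torch.rand(n // 2) * torch.pi
P, _ = torch.linalg.qr(torch.randn(n, n))      # orthogonal eigenbasis
T = torch.triu(torch.randn(n, n), diagonal=1)  # strictly upper = non-normal part
for k in range(n // 2):
    T[2*k, 2*k+1] = 0.0                        # keep the 2x2 blocks intact
W = P @ (rotation_blocks(thetas) + 0.3 * T) @ P.T
print(torch.linalg.eigvals(W).abs())           # all moduli ~= 1
```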
Graphonomy: Universal Human Parsing via Graph Transfer Learning
Title | Graphonomy: Universal Human Parsing via Graph Transfer Learning |
Authors | Ke Gong, Yiming Gao, Xiaodan Liang, Xiaohui Shen, Meng Wang, Liang Lin |
Abstract | Prior highly-tuned human parsing models tend to fit towards each dataset in a specific domain or with discrepant label granularity, and can hardly be adapted to other human parsing tasks without extensive re-training. In this paper, we aim to learn a single universal human parsing model that can tackle all kinds of human parsing needs by unifying label annotations from different domains or at various levels of granularity. This poses many fundamental learning challenges, e.g. discovering underlying semantic structures among different label granularity, performing proper transfer learning across different image domains, and identifying and utilizing label redundancies across related tasks. To address these challenges, we propose a new universal human parsing agent, named “Graphonomy”, which incorporates hierarchical graph transfer learning upon the conventional parsing network to encode the underlying label semantic structures and propagate relevant semantic information. In particular, Graphonomy first learns and propagates compact high-level graph representation among the labels within one dataset via Intra-Graph Reasoning, and then transfers semantic information across multiple datasets via Inter-Graph Transfer. Various graph transfer dependencies (e.g., similarity, linguistic knowledge) between different datasets are analyzed and encoded to enhance graph transfer capability. By distilling universal semantic graph representation to each specific task, Graphonomy is able to predict all levels of parsing labels in one system without piling up the complexity. Experimental results show Graphonomy effectively achieves the state-of-the-art results on three human parsing benchmarks as well as advantageous universal human parsing performance. |
Tasks | Human Parsing, Transfer Learning |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04536v1 |
PDF | http://arxiv.org/pdf/1904.04536v1.pdf |
PWC | https://paperswithcode.com/paper/graphonomy-universal-human-parsing-via-graph |
Repo | https://github.com/Gaoyiminggithub/Graphonomy |
Framework | pytorch |
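A schematic PyTorch sketch of the Intra-Graph Reasoning step: label-node features are propagated through a learnable, row-normalized label graph. The adjacency, dimensions, and normalization below are illustrative placeholders, not the paper's exact formulation.

```python
# Label nodes exchange information over a learnable label graph.
import torch
import torch.nn as nn

class IntraGraphReasoning(nn.Module):
    def __init__(self, num_labels, dim):
        super().__init__()
        self.adj = nn.Parameter(torch.eye(num_labels))  # learnable label graph
        self.lin = nn.Linear(dim, dim)

    def forward(self, h):                      # h: (num_labels, dim) features
        a = torch.softmax(self.adj, dim=-1)    # row-normalize the graph
        return torch.relu(self.lin(a @ h))     # propagate, then transform

module = IntraGraphReasoning(num_labels=20, dim=128)
out = module(torch.randn(20, 128))
```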
Self-Regulated Interactive Sequence-to-Sequence Learning
Title | Self-Regulated Interactive Sequence-to-Sequence Learning |
Authors | Julia Kreutzer, Stefan Riezler |
Abstract | Not all types of supervision signals are created equal: Different types of feedback have different costs and effects on learning. We show how self-regulation strategies that decide when to ask for which kind of feedback from a teacher (or from oneself) can be cast as a learning-to-learn problem leading to improved cost-aware sequence-to-sequence learning. In experiments on interactive neural machine translation, we find that the self-regulator discovers an ε-greedy strategy for the optimal cost-quality trade-off by mixing different feedback types including corrections, error markups, and self-supervision. Furthermore, we demonstrate its robustness under domain shift and identify it as a promising alternative to active learning. |
Tasks | Active Learning, Machine Translation |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05190v2 |
PDF | https://arxiv.org/pdf/1907.05190v2.pdf |
PWC | https://paperswithcode.com/paper/self-regulated-interactive-sequence-to |
Repo | https://github.com/juliakreutzer/joeynmt |
Framework | pytorch |
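The ε-greedy behaviour the regulator is reported to discover can be illustrated with a toy selector over feedback types; the feedback costs and value estimates below are hypothetical placeholders, not numbers from the paper.

```python
# Epsilon-greedy choice over feedback types: exploit the best estimated
# quality-per-cost, explore uniformly with probability epsilon.
import random

FEEDBACK_COST = {"full_correction": 5.0, "error_markup": 2.0, "self_supervision": 0.5}

def choose_feedback(value_per_cost, epsilon=0.1):
    if random.random() < epsilon:                        # explore
        return random.choice(list(FEEDBACK_COST))
    return max(value_per_cost, key=value_per_cost.get)   # exploit

estimates = {"full_correction": 0.8, "error_markup": 0.9, "self_supervision": 0.4}
print(choose_feedback(estimates))
```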
Learning Discrete Structures for Graph Neural Networks
Title | Learning Discrete Structures for Graph Neural Networks |
Authors | Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He |
Abstract | Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin. |
Tasks | Music Genre Recognition, Node Classification |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.11960v3 |
PDF | https://arxiv.org/pdf/1903.11960v3.pdf |
PWC | https://paperswithcode.com/paper/learning-discrete-structures-for-graph-neural |
Repo | https://github.com/lucfra/LDS-GNN |
Framework | tf |
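The core idea, learning a Bernoulli probability per possible edge and differentiating the training objective with respect to those probabilities, can be sketched with a straight-through estimator. The paper's actual method solves a bilevel program with hypergradients; this PyTorch snippet is a simplified stand-in.

```python
# Sample a discrete graph from per-edge Bernoulli probabilities, propagate
# once, and backpropagate to the edge parameters (straight-through).
import torch

n, d, c = 8, 16, 4
logits = torch.zeros(n, n, requires_grad=True)  # one edge parameter per pair
x = torch.randn(n, d)
w = torch.randn(d, c, requires_grad=True)

probs = torch.sigmoid(logits)
sample = torch.bernoulli(probs.detach())        # discrete graph for this step
adj = sample + probs - probs.detach()           # discrete forward, soft backward
loss = (adj @ x @ w).pow(2).mean()              # stand-in for the GCN objective
loss.backward()                                 # gradients reach `logits` too
print(logits.grad.abs().sum())
```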
Learning Counterfactual Representations for Estimating Individual Dose-Response Curves
Title | Learning Counterfactual Representations for Estimating Individual Dose-Response Curves |
Authors | Patrick Schwab, Lorenz Linhardt, Stefan Bauer, Joachim M. Buhmann, Walter Karlen |
Abstract | Estimating what would be an individual’s potential response to varying levels of exposure to a treatment is of high practical relevance for several important fields, such as healthcare, economics and public policy. However, existing methods for learning to estimate counterfactual outcomes from observational data are either focused on estimating average dose-response curves, or limited to settings with only two treatments that do not have an associated dosage parameter. Here, we present a novel machine-learning approach towards learning counterfactual representations for estimating individual dose-response curves for any number of treatments with continuous dosage parameters, using neural networks. Building on the established potential outcomes framework, we introduce performance metrics, model selection criteria, model architectures, and open benchmarks for estimating individual dose-response curves. Our experiments show that the methods developed in this work set a new state-of-the-art in estimating individual dose-response. |
Tasks | Model Selection |
Published | 2019-02-03 |
URL | https://arxiv.org/abs/1902.00981v2 |
PDF | https://arxiv.org/pdf/1902.00981v2.pdf |
PWC | https://paperswithcode.com/paper/learning-counterfactual-representations-for |
Repo | https://github.com/d909b/drnet |
Framework | none |
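A simplified PyTorch stand-in for the kind of architecture described: a shared representation of the covariates feeds one head per treatment, and each head also receives the continuous dosage. The authors' DRNet additionally stratifies the dosage range per head; all dimensions here are placeholders.

```python
# Shared covariate representation, one head per treatment, dosage appended
# to the head input.
import torch
import torch.nn as nn

class DoseResponseNet(nn.Module):
    def __init__(self, x_dim, num_treatments, hidden=64):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden + 1, hidden), nn.ReLU(),
                          nn.Linear(hidden, 1))
            for _ in range(num_treatments)
        ])

    def forward(self, x, t, dose):
        h = torch.cat([self.shared(x), dose.unsqueeze(-1)], dim=-1)
        outs = torch.stack([head(h) for head in self.heads], dim=1)  # (B, T, 1)
        return outs.gather(1, t.view(-1, 1, 1)).squeeze()            # head of t

net = DoseResponseNet(x_dim=10, num_treatments=3)
y_hat = net(torch.randn(4, 10), torch.tensor([0, 2, 1, 0]), torch.rand(4))
```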
Variable-lag Granger Causality for Time Series Analysis
Title | Variable-lag Granger Causality for Time Series Analysis |
Authors | Chainarong Amornbunchornvej, Elena Zheleva, Tanya Y. Berger-Wolf |
Abstract | Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make a strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. However, the assumption of the fixed time delay does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop variable-lag Granger causality, a generalization of Granger causality that relaxes the assumption of the fixed time delay and allows causes to influence effects with arbitrary time delays. In addition, we propose a method for inferring variable-lag Granger causality relations. We demonstrate our approach on an application for studying coordinated collective behavior and show that it performs better than several existing methods in both simulated and real-world datasets. Our approach can be applied in any domain of time series analysis. |
Tasks | Causal Inference, Time Series, Time Series Analysis |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.10829v1 |
PDF | https://arxiv.org/pdf/1912.10829v1.pdf |
PWC | https://paperswithcode.com/paper/variable-lag-granger-causality-for-time |
Repo | https://github.com/DarkEyes/VLTimeSeriesCausality |
Framework | none |
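For context, the fixed-lag regression that the method generalizes fits in a few lines of numpy; the variable-lag variant would replace the single fixed shift with an alignment (e.g., from dynamic time warping), which is elided here.

```python
# Fixed-lag Granger check: does adding x_{t-lag} reduce the residual
# variance of an autoregression of y?
import numpy as np

def granger_gain(x, y, lag=1):
    y_t, y_past, x_past = y[lag:], y[:-lag], x[:-lag]
    resid_self = y_t - np.polyval(np.polyfit(y_past, y_t, 1), y_past)
    A = np.column_stack([y_past, x_past, np.ones_like(y_past)])
    beta, *_ = np.linalg.lstsq(A, y_t, rcond=None)
    resid_full = y_t - A @ beta
    return 1 - resid_full.var() / resid_self.var()  # > 0: x helps predict y

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = 0.8 * np.roll(x, 2) + 0.2 * rng.standard_normal(500)  # y lags x by 2
print(granger_gain(x, y, lag=2))  # large gain at the true lag
```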
Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time
Title | Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time |
Authors | Christoph Heindl, Thomas Pönitz, Gernot Stübl, Andreas Pichler, Josef Scharinger |
Abstract | Commodity RGB-D sensors capture color images along with dense pixel-wise depth information in real-time. Typical RGB-D sensors are provided with a factory calibration and exhibit erratic depth readings due to coarse calibration values, ageing and thermal influence effects. This limits their applicability in computer vision and robotics. We propose a novel method to accurately calibrate depth considering spatial and thermal influences jointly. Our work is based on Gaussian Process Regression in a four-dimensional Cartesian and thermal domain. We propose to leverage modern GPUs for dense depth map correction in real-time. For reproducibility we make our dataset and source code publicly available. |
Tasks | Calibration, Gaussian Processes |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00549v1 |
PDF | https://arxiv.org/pdf/1907.00549v1.pdf |
PWC | https://paperswithcode.com/paper/spatio-thermal-depth-correction-of-rgb-d |
Repo | https://github.com/cheind/rgbd-correction |
Framework | tf |
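A hedged sketch of the regression setup: a GP maps a four-dimensional input (pixel coordinates, raw depth, sensor temperature) to a depth correction. scikit-learn is used for brevity and the data is synthetic; the paper targets real-time GPU inference, and its kernel choices may differ.

```python
# GP regression over a 4-D spatio-thermal domain -> depth correction.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 4))          # (x, y, depth, temperature), normalized
err = 0.05 * X[:, 2] * (X[:, 3] - 0.5)  # synthetic spatio-thermal depth error

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5) + WhiteKernel(1e-4))
gp.fit(X, err)
correction, std = gp.predict(X[:5], return_std=True)  # per-pixel correction
```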
Explainability and Adversarial Robustness for RNNs
Title | Explainability and Adversarial Robustness for RNNs |
Authors | Alexander Hartl, Maximilian Bachl, Joachim Fabini, Tanja Zseby |
Abstract | Recurrent Neural Networks (RNNs) yield attractive properties for constructing Intrusion Detection Systems (IDSs) for network data. With the rise of ubiquitous Machine Learning (ML) systems, malicious actors have been catching up quickly to find new ways to exploit ML vulnerabilities for profit. Recently developed adversarial ML techniques focus on computer vision and their applicability to network traffic is not straightforward: Network packets expose fewer features than an image, are sequential and impose several constraints on their features. We show that despite these completely different characteristics, adversarial samples can be generated reliably for RNNs. To understand a classifier’s potential for misclassification, we extend existing explainability techniques and propose new ones, suitable particularly for sequential data. Applying them shows that the first packets of a communication flow are already of crucial importance and are likely to be targeted by attackers. Feature importance methods show that even relatively unimportant features can be effectively abused to generate adversarial samples. Since traditional evaluation metrics such as accuracy are not sufficient for quantifying the adversarial threat, we propose the Adversarial Robustness Score (ARS) for comparing IDSs, capturing a common notion of adversarial robustness, and show that an adversarial training procedure can significantly and successfully reduce the attack surface. |
Tasks | Feature Importance, Intrusion Detection |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09855v2 |
PDF | https://arxiv.org/pdf/1912.09855v2.pdf |
PWC | https://paperswithcode.com/paper/explainability-and-adversarial-robustness-for |
Repo | https://github.com/CN-TU/adversarial-recurrent-ids |
Framework | none |
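An occlusion-style probe over time steps illustrates the kind of sequential analysis involved: zero out one packet at a time and record the drop in the class score. The tiny model and data below are placeholders; this is neither the paper's exact method nor the ARS metric.

```python
# Per-time-step importance by occlusion: score drop when packet t is zeroed.
import torch

def step_importance(model, x, y):
    """x: (T, F) one flow; returns the score drop per occluded time step."""
    with torch.no_grad():
        base = model(x.unsqueeze(0))[0, y].item()
        drops = []
        for t in range(x.shape[0]):
            x_occ = x.clone()
            x_occ[t] = 0.0                 # occlude packet t
            drops.append(base - model(x_occ.unsqueeze(0))[0, y].item())
    return drops

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(5 * 3, 2))
print(step_importance(model, torch.randn(5, 3), y=1))
```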
Minimal Achievable Sufficient Statistic Learning
Title | Minimal Achievable Sufficient Statistic Learning |
Authors | Milan Cvitkovic, Günther Koliander |
Abstract | We introduce Minimal Achievable Sufficient Statistic (MASS) Learning, a training method for machine learning models that attempts to produce minimal sufficient statistics with respect to a class of functions (e.g. deep networks) being optimized over. In deriving MASS Learning, we also introduce Conserved Differential Information (CDI), an information-theoretic quantity that - unlike standard mutual information - can be usefully applied to deterministically-dependent continuous random variables like the input and output of a deep network. In a series of experiments, we show that deep networks trained with MASS Learning achieve competitive performance on supervised learning and uncertainty quantification benchmarks. |
Tasks | |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07822v2 |
PDF | https://arxiv.org/pdf/1905.07822v2.pdf |
PWC | https://paperswithcode.com/paper/minimal-achievable-sufficient-statistic |
Repo | https://github.com/mwcvitkovic/MASS-Learning |
Framework | pytorch |
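Only the overall training pattern is easy to sketch: a supervised loss plus a regularizer that pushes the representation toward minimality. The penalty below is an explicit placeholder; the paper's actual regularizer is built from Conserved Differential Information and is not reproduced here.

```python
# Schematic loss shape only: task loss + minimality penalty (placeholder).
import torch
import torch.nn.functional as F

def mass_style_loss(logits, y, z, beta=1e-3):
    task = F.cross_entropy(logits, y)  # supervised objective
    minimality = z.pow(2).mean()       # placeholder, NOT the CDI-based term
    return task + beta * minimality

loss = mass_style_loss(torch.randn(8, 10), torch.randint(0, 10, (8,)),
                       torch.randn(8, 32))
```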
The NN-Stacking: Feature weighted linear stacking through neural networks
Title | The NN-Stacking: Feature weighted linear stacking through neural networks |
Authors | Victor Coscrato, Marco Henrique de Almeida Inácio, Rafael Izbicki |
Abstract | Stacking methods improve the prediction performance of regression models. A simple way to stack base regression estimators is by combining them linearly, as done by Breiman (1996). Even though this approach is useful from an interpretative perspective, it often does not lead to high predictive power. We propose the NN-Stacking method (NNS), which generalizes Breiman’s method by allowing the linear parameters to vary with input features. This improvement enables NNS to take advantage of the fact that distinct base models often perform better at different regions of the feature space. Our method uses neural networks to estimate the stacking coefficients. We show that while our approach keeps the interpretative features of Breiman’s method at a local level, it leads to better predictive power, especially in datasets with large sample sizes. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09735v1 |
PDF | https://arxiv.org/pdf/1906.09735v1.pdf |
PWC | https://paperswithcode.com/paper/the-nn-stacking-feature-weighted-linear |
Repo | https://github.com/randommm/nnstacking |
Framework | pytorch |
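The method's central object is easy to sketch: a small network maps features to per-model stacking coefficients, so the linear combination of base predictions varies over the feature space. Dimensions below are illustrative.

```python
# Feature-dependent linear stacking: coefficients come from a neural net.
import torch
import torch.nn as nn

class NNStacker(nn.Module):
    def __init__(self, x_dim, n_models, hidden=32):
        super().__init__()
        self.coef_net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_models)
        )

    def forward(self, x, base_preds):       # base_preds: (B, n_models)
        w = self.coef_net(x)                # feature-dependent coefficients
        return (w * base_preds).sum(dim=1)  # locally linear stacking

stacker = NNStacker(x_dim=10, n_models=3)
y_hat = stacker(torch.randn(4, 10), torch.randn(4, 3))
```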
Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation
Title | Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation |
Authors | Zequn Sun, Chengming Wang, Wei Hu, Muhao Chen, Jian Dai, Wei Zhang, Yuzhong Qu |
Abstract | Graph neural networks (GNNs) have emerged as a powerful paradigm for embedding-based entity alignment due to their capability of identifying isomorphic subgraphs. However, in real knowledge graphs (KGs), the counterpart entities usually have non-isomorphic neighborhood structures, which easily causes GNNs to yield different representations for them. To tackle this problem, we propose a new KG alignment network, namely AliNet, aiming at mitigating the non-isomorphism of neighborhood structures in an end-to-end manner. As the direct neighbors of counterpart entities are usually dissimilar due to the schema heterogeneity, AliNet introduces distant neighbors to expand the overlap between their neighborhood structures. It employs an attention mechanism to highlight helpful distant neighbors and reduce noise. Then, it controls the aggregation of both direct and distant neighborhood information using a gating mechanism. We further propose a relation loss to refine entity representations. We perform thorough experiments with detailed ablation studies and analyses on five entity alignment datasets, demonstrating the effectiveness of AliNet. |
Tasks | Entity Alignment, Knowledge Graphs |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08936v1 |
PDF | https://arxiv.org/pdf/1911.08936v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-graph-alignment-network-with-gated |
Repo | https://github.com/nju-websoft/AliNet |
Framework | tf |
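The gating mechanism can be sketched compactly (in PyTorch for brevity, although the released code is TensorFlow): a learned gate interpolates between the 1-hop aggregate and the attention-weighted distant-neighbor aggregate. Shapes and the gate's exact inputs are assumptions.

```python
# Gated combination of direct (1-hop) and distant neighborhood aggregates.
import torch
import torch.nn as nn

class GatedAggregation(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, dim)

    def forward(self, h1, h2):            # 1-hop and distant aggregates, (N, dim)
        g = torch.sigmoid(self.gate(h2))  # how much distant context to admit
        return g * h2 + (1 - g) * h1

agg = GatedAggregation(dim=64)
h = agg(torch.randn(5, 64), torch.randn(5, 64))
```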
Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability
Title | Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability |
Authors | Shubhomoy Das, Md Rakibul Islam, Nitthilan Kannappan Jayakodi, Janardhan Rao Doppa |
Abstract | The anomaly detection (AD) task corresponds to identifying the true anomalies from a given set of data instances. AD algorithms score the data instances and produce a ranked list of candidate anomalies, which are then analyzed by a human to discover the true anomalies. However, this process can be laborious for the human analyst when the number of false-positives is very high. Therefore, in many real-world AD applications including computer security and fraud prevention, the anomaly detector must be configurable by the human analyst to minimize the effort on false positives. In this paper, we study the problem of active learning to automatically tune an ensemble of anomaly detectors to maximize the number of true anomalies discovered. We make four main contributions towards this goal. First, we present an important insight that explains the practical successes of AD ensembles and how ensembles are naturally suited for active learning. Second, we present several algorithms for active learning with tree-based AD ensembles. These algorithms help us to improve the diversity of discovered anomalies, generate rule sets for improved interpretability of anomalous instances, and adapt to streaming data settings in a principled manner. Third, we present a novel algorithm called GLocalized Anomaly Detection (GLAD) for active learning with generic AD ensembles. GLAD allows end-users to retain the use of simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances using label feedback. Fourth, we present extensive experiments to evaluate our insights and algorithms. Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup. |
Tasks | Active Learning, Anomaly Detection |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.08930v1 |
PDF | http://arxiv.org/pdf/1901.08930v1.pdf |
PWC | https://paperswithcode.com/paper/active-anomaly-detection-via-ensembles-1 |
Repo | https://github.com/shubhomoydas/ad_examples |
Framework | tf |
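A toy feedback loop in the spirit described: score with an ensemble, query the top-ranked instance, and reweight members by agreement with the analyst's label. IsolationForest and the multiplicative update below are stand-ins for the paper's tree-based algorithms, and an oracle replaces the human analyst.

```python
# Toy active anomaly detection: ensemble scoring, top-1 query, reweighting.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 0.5, (5, 2))])
members = [IsolationForest(random_state=s).fit(X) for s in range(5)]
scores = np.stack([-m.score_samples(X) for m in members])  # higher = anomalous
w = np.ones(len(members)) / len(members)

for _ in range(3):                       # three analyst-feedback rounds
    idx = int(np.argmax(w @ scores))     # query the top-ranked instance
    is_anomaly = idx >= 200              # oracle stands in for the analyst
    agree = scores[:, idx] > np.median(scores, axis=1)
    w *= np.where(agree == is_anomaly, 1.1, 0.9)  # reward agreeing members
    w /= w.sum()
    scores[:, idx] = -np.inf             # never query the same instance twice
```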