Paper Group AWR 60
On Evaluating Adversarial Robustness. Diffprivlib: The IBM Differential Privacy Library. Fast-SCNN: Fast Semantic Segmentation Network. Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics. Graphonomy: Universal Human Parsing via Graph Transfer Learning. Self-Regulated Interactive Sequence-to-Sequence Learning. Learning Discrete Structures for Graph Neural Networks. Learning Counterfactual Representations for Estimating Individual Dose-Response Curves. Variable-lag Granger Causality for Time Series Analysis. Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time. Explainability and Adversarial Robustness for RNNs. Minimal Achievable Sufficient Statistic Learning. The NN-Stacking: Feature weighted linear stacking through neural networks. Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation. Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability.
On Evaluating Adversarial Robustness
Title | On Evaluating Adversarial Robustness |
Authors | Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, Alexey Kurakin |
Abstract | Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect. We believe a large contributing factor is the difficulty of performing security evaluations. In this paper, we discuss the methodological foundations, review commonly accepted best practices, and suggest new methods for evaluating defenses to adversarial examples. We hope that both researchers developing defenses as well as readers and reviewers who wish to understand the completeness of an evaluation consider our advice in order to avoid common pitfalls. |
Tasks | Adversarial Attack, Adversarial Defense |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06705v2 |
PDF | http://arxiv.org/pdf/1902.06705v2.pdf |
PWC | https://paperswithcode.com/paper/on-evaluating-adversarial-robustness |
Repo | https://github.com/locuslab/fast_adversarial |
Framework | pytorch |
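One concrete instance of the adaptive, gradient-based evaluation the paper recommends is a projected gradient descent (PGD) attack. The sketch below is a generic PyTorch implementation, not code from the paper or the linked repo; `model`, the epsilon budget, step size, and step count are placeholder assumptions.

```python
# L-infinity PGD (Madry et al.): iteratively ascend the loss, then project
# back into the eps-ball around the clean input. Hyperparameters are
# illustrative placeholders.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()       # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0, 1)                 # keep valid pixel range
    return x_adv.detach()
```

Robust accuracy is then measured as accuracy on `pgd_attack(model, x, y)`; the paper's checklist stresses adapting such attacks to the specific defense rather than reusing them off the shelf.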
Diffprivlib: The IBM Differential Privacy Library
Title | Diffprivlib: The IBM Differential Privacy Library |
Authors | Naoise Holohan, Stefano Braghin, Pól Mac Aonghusa, Killian Levacher |
Abstract | Since its conception in 2006, differential privacy has emerged as the de-facto standard in data privacy, owing to its robust mathematical guarantees, generalised applicability and rich body of literature. Over the years, researchers have studied differential privacy and its applicability to an ever-widening field of topics. Mechanisms have been created to optimise the process of achieving differential privacy, for various data types and scenarios. Until this work, however, all previous work on differential privacy has been conducted on an ad-hoc basis, without a single, unifying codebase to implement results. In this work, we present the IBM Differential Privacy Library, a general-purpose, open source library for investigating, experimenting and developing differential privacy applications in the Python programming language. The library includes a host of mechanisms, the building blocks of differential privacy, alongside a number of applications to machine learning and other data analytics tasks. Simplicity and accessibility have been prioritised in developing the library, making it suitable for a wide audience of users, from those using the library for their first investigations in data privacy, to privacy experts looking to contribute their own models and mechanisms for others to use. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02444v1 |
PDF | https://arxiv.org/pdf/1907.02444v1.pdf |
PWC | https://paperswithcode.com/paper/diffprivlib-the-ibm-differential-privacy |
Repo | https://github.com/IBM/differential-privacy-library |
Framework | none |
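As a quick orientation, here is a minimal usage sketch of the library's two layers: a raw mechanism and a drop-in scikit-learn-style model. Parameter names follow recent diffprivlib releases and may differ across versions; the data is synthetic.

```python
# Minimal diffprivlib usage sketch on synthetic data.
import numpy as np
from diffprivlib.mechanisms import Laplace
from diffprivlib.models import GaussianNB

# Low-level building block: a Laplace mechanism calibrated to sensitivity 1.
mech = Laplace(epsilon=0.5, sensitivity=1.0)
noisy_value = mech.randomise(42.0)

# High-level application: a differentially private scikit-learn-style model.
X = np.random.rand(100, 4)
y = np.random.randint(0, 2, size=100)
clf = GaussianNB(epsilon=1.0, bounds=(0.0, 1.0))
clf.fit(X, y)
print(noisy_value, clf.predict(X[:3]))
```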
Fast-SCNN: Fast Semantic Segmentation Network
Title | Fast-SCNN: Fast Semantic Segmentation Network |
Authors | Rudra P K Poudel, Stephan Liwicki, Roberto Cipolla |
Abstract | The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. Since the rise in autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce fast segmentation convolutional neural network (Fast-SCNN), an above-real-time semantic segmentation model on high resolution image data (1024x2048px) suited to efficient computation on embedded devices with low memory. Building on existing two-branch methods for fast segmentation, we introduce our 'learning to downsample' module which computes low-level features for multiple resolution branches simultaneously. Our network combines spatial detail at high resolution with deep features extracted at lower resolution, yielding an accuracy of 68.0% mean intersection over union at 123.5 frames per second on Cityscapes. We also show that large scale pre-training is unnecessary. We thoroughly validate our metric in experiments with ImageNet pre-training and the coarse labeled data of Cityscapes. Finally, we show even faster computation with competitive results on subsampled inputs, without any network modifications. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04502v1 |
PDF | http://arxiv.org/pdf/1902.04502v1.pdf |
PWC | https://paperswithcode.com/paper/fast-scnn-fast-semantic-segmentation-network |
Repo | https://github.com/SkyWa7ch3r/ImageSegmentation |
Framework | tf |
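A rough tf.keras sketch of the 'learning to downsample' idea: a strided convolution followed by depthwise-separable convolutions reduces the input eightfold while producing low-level features shared by both branches. Filter counts and layer choices here are illustrative assumptions, not the paper's exact configuration.

```python
# "Learning to downsample" sketch: 8x spatial reduction via strided
# (separable) convolutions, producing features shared by both branches.
import tensorflow as tf
from tensorflow.keras import layers

def learning_to_downsample(x):
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.SeparableConv2D(48, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.SeparableConv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    return x  # 1/8-resolution features for both the detail and context paths

inp = tf.keras.Input(shape=(1024, 2048, 3))
feat = learning_to_downsample(inp)  # -> (128, 256, 64)
```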
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
Title | Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics |
Authors | Giancarlo Kerg, Kyle Goyette, Maximilian Puelma Touzel, Gauthier Gidel, Eugene Vorontsov, Yoshua Bengio, Guillaume Lajoie |
Abstract | A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This ensures eigenvalues with unit norm and thus stable dynamics and training. However, this comes at the cost of reduced expressivity due to the limited variety of orthogonal transformations. We propose a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts. This allows us to parametrize matrices with unit-norm eigenspectra without orthogonality constraints on eigenbases. The resulting architecture ensures access to a larger space of spectrally constrained matrices, of which orthogonal matrices are a subset. This crucial difference retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12080v2 |
PDF | https://arxiv.org/pdf/1905.12080v2.pdf |
PWC | https://paperswithcode.com/paper/non-normal-recurrent-neural-network-nnrnn |
Repo | https://github.com/nnRNN/nnRNN_release |
Framework | pytorch |
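The Schur-based split can be illustrated numerically: with an orthogonal basis P, rotation blocks R encoding unit-modulus eigenvalues, and a strictly upper-triangular T (zeroed inside the 2x2 blocks), W = P(R + T)Pᵀ keeps all eigenvalue moduli at 1 while being non-normal. The PyTorch sketch below only checks the spectrum; it is not the authors' trainable parametrization.

```python
# W = P (R + T) P^T has unit-modulus eigenvalues even though T makes it
# non-normal. Sizes and scales are arbitrary.
import torch

def rotation_blocks(thetas):
    """Block-diagonal matrix of 2x2 rotations, eigenvalues exp(+-i*theta)."""
    n = 2 * len(thetas)
    R = torch.zeros(n, n)
    for k, t in enumerate(thetas):
        c, s = torch.cos(t), torch.sin(t)
        R[2*k, 2*k], R[2*k, 2*k+1] = c, -s
        R[2*k+1, 2*k], R[2*k+1, 2*k+1] = s, c
    return R

n = 6
thetas = torch.rand(n // 2) * torch.pi
P, _ = torch.linalg.qr(torch.randn(n, n))      # orthogonal eigenbasis
T = torch.triu(torch.randn(n, n), diagonal=1)  # strictly upper = non-normal part
for k in range(n // 2):
    T[2*k, 2*k+1] = 0.0                        # keep the 2x2 blocks intact
W = P @ (rotation_blocks(thetas) + 0.3 * T) @ P.T
print(torch.linalg.eigvals(W).abs())           # all moduli ~= 1
```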
Graphonomy: Universal Human Parsing via Graph Transfer Learning
Title | Graphonomy: Universal Human Parsing via Graph Transfer Learning |
Authors | Ke Gong, Yiming Gao, Xiaodan Liang, Xiaohui Shen, Meng Wang, Liang Lin |
Abstract | Prior highly-tuned human parsing models tend to fit towards each dataset in a specific domain or with discrepant label granularity, and can hardly be adapted to other human parsing tasks without extensive re-training. In this paper, we aim to learn a single universal human parsing model that can tackle all kinds of human parsing needs by unifying label annotations from different domains or at various levels of granularity. This poses many fundamental learning challenges, e.g. discovering underlying semantic structures among different label granularity, performing proper transfer learning across different image domains, and identifying and utilizing label redundancies across related tasks. To address these challenges, we propose a new universal human parsing agent, named “Graphonomy”, which incorporates hierarchical graph transfer learning upon the conventional parsing network to encode the underlying label semantic structures and propagate relevant semantic information. In particular, Graphonomy first learns and propagates compact high-level graph representation among the labels within one dataset via Intra-Graph Reasoning, and then transfers semantic information across multiple datasets via Inter-Graph Transfer. Various graph transfer dependencies (e.g., similarity, linguistic knowledge) between different datasets are analyzed and encoded to enhance graph transfer capability. By distilling universal semantic graph representation to each specific task, Graphonomy is able to predict all levels of parsing labels in one system without piling up the complexity. Experimental results show Graphonomy effectively achieves the state-of-the-art results on three human parsing benchmarks as well as advantageous universal human parsing performance. |
Tasks | Human Parsing, Transfer Learning |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04536v1 |
PDF | http://arxiv.org/pdf/1904.04536v1.pdf |
PWC | https://paperswithcode.com/paper/graphonomy-universal-human-parsing-via-graph |
Repo | https://github.com/Gaoyiminggithub/Graphonomy |
Framework | pytorch |
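A schematic PyTorch sketch of the Intra-Graph Reasoning step: label-node features are propagated through a learnable, row-normalized label graph. The adjacency, dimensions, and normalization below are illustrative placeholders, not the paper's exact formulation.

```python
# Label nodes exchange information over a learnable label graph.
import torch
import torch.nn as nn

class IntraGraphReasoning(nn.Module):
    def __init__(self, num_labels, dim):
        super().__init__()
        self.adj = nn.Parameter(torch.eye(num_labels))  # learnable label graph
        self.lin = nn.Linear(dim, dim)

    def forward(self, h):                      # h: (num_labels, dim) features
        a = torch.softmax(self.adj, dim=-1)    # row-normalize the graph
        return torch.relu(self.lin(a @ h))     # propagate, then transform

module = IntraGraphReasoning(num_labels=20, dim=128)
out = module(torch.randn(20, 128))
```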
Self-Regulated Interactive Sequence-to-Sequence Learning
Title | Self-Regulated Interactive Sequence-to-Sequence Learning |
Authors | Julia Kreutzer, Stefan Riezler |
Abstract | Not all types of supervision signals are created equal: Different types of feedback have different costs and effects on learning. We show how self-regulation strategies that decide when to ask for which kind of feedback from a teacher (or from oneself) can be cast as a learning-to-learn problem leading to improved cost-aware sequence-to-sequence learning. In experiments on interactive neural machine translation, we find that the self-regulator discovers an ε-greedy strategy for the optimal cost-quality trade-off by mixing different feedback types including corrections, error markups, and self-supervision. Furthermore, we demonstrate its robustness under domain shift and identify it as a promising alternative to active learning. |
Tasks | Active Learning, Machine Translation |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05190v2 |
PDF | https://arxiv.org/pdf/1907.05190v2.pdf |
PWC | https://paperswithcode.com/paper/self-regulated-interactive-sequence-to |
Repo | https://github.com/juliakreutzer/joeynmt |
Framework | pytorch |
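The ε-greedy behaviour the regulator is reported to discover can be illustrated with a toy selector over feedback types; the feedback costs and value estimates below are hypothetical placeholders, not numbers from the paper.

```python
# Epsilon-greedy choice over feedback types: exploit the best estimated
# quality-per-cost, explore uniformly with probability epsilon.
import random

FEEDBACK_COST = {"full_correction": 5.0, "error_markup": 2.0, "self_supervision": 0.5}

def choose_feedback(value_per_cost, epsilon=0.1):
    if random.random() < epsilon:                        # explore
        return random.choice(list(FEEDBACK_COST))
    return max(value_per_cost, key=value_per_cost.get)   # exploit

estimates = {"full_correction": 0.8, "error_markup": 0.9, "self_supervision": 0.4}
print(choose_feedback(estimates))
```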
Learning Discrete Structures for Graph Neural Networks
Title | Learning Discrete Structures for Graph Neural Networks |
Authors | Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He |
Abstract | Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin. |
Tasks | Music Genre Recognition, Node Classification |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.11960v3 |
PDF | https://arxiv.org/pdf/1903.11960v3.pdf |
PWC | https://paperswithcode.com/paper/learning-discrete-structures-for-graph-neural |
Repo | https://github.com/lucfra/LDS-GNN |
Framework | tf |
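The core idea, learning a Bernoulli probability per possible edge and differentiating the training objective with respect to those probabilities, can be sketched with a straight-through estimator. The paper's actual method solves a bilevel program with hypergradients; this PyTorch snippet is a simplified stand-in.

```python
# Sample a discrete graph from per-edge Bernoulli probabilities, propagate
# once, and backpropagate to the edge parameters (straight-through).
import torch

n, d, c = 8, 16, 4
logits = torch.zeros(n, n, requires_grad=True)  # one edge parameter per pair
x = torch.randn(n, d)
w = torch.randn(d, c, requires_grad=True)

probs = torch.sigmoid(logits)
sample = torch.bernoulli(probs.detach())        # discrete graph for this step
adj = sample + probs - probs.detach()           # discrete forward, soft backward
loss = (adj @ x @ w).pow(2).mean()              # stand-in for the GCN objective
loss.backward()                                 # gradients reach `logits` too
print(logits.grad.abs().sum())
```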
Learning Counterfactual Representations for Estimating Individual Dose-Response Curves
Title | Learning Counterfactual Representations for Estimating Individual Dose-Response Curves |
Authors | Patrick Schwab, Lorenz Linhardt, Stefan Bauer, Joachim M. Buhmann, Walter Karlen |
Abstract | Estimating what would be an individual’s potential response to varying levels of exposure to a treatment is of high practical relevance for several important fields, such as healthcare, economics and public policy. However, existing methods for learning to estimate counterfactual outcomes from observational data are either focused on estimating average dose-response curves, or limited to settings with only two treatments that do not have an associated dosage parameter. Here, we present a novel machine-learning approach towards learning counterfactual representations for estimating individual dose-response curves for any number of treatments with continuous dosage parameters, using neural networks. Building on the established potential outcomes framework, we introduce performance metrics, model selection criteria, model architectures, and open benchmarks for estimating individual dose-response curves. Our experiments show that the methods developed in this work set a new state-of-the-art in estimating individual dose-response. |
Tasks | Model Selection |
Published | 2019-02-03 |
URL | https://arxiv.org/abs/1902.00981v2 |
PDF | https://arxiv.org/pdf/1902.00981v2.pdf |
PWC | https://paperswithcode.com/paper/learning-counterfactual-representations-for |
Repo | https://github.com/d909b/drnet |
Framework | none |
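A simplified PyTorch stand-in for the kind of architecture described: a shared representation of the covariates feeds one head per treatment, and each head also receives the continuous dosage. The authors' DRNet additionally stratifies the dosage range per head; all dimensions here are placeholders.

```python
# Shared covariate representation, one head per treatment, dosage appended
# to the head input.
import torch
import torch.nn as nn

class DoseResponseNet(nn.Module):
    def __init__(self, x_dim, num_treatments, hidden=64):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden + 1, hidden), nn.ReLU(),
                          nn.Linear(hidden, 1))
            for _ in range(num_treatments)
        ])

    def forward(self, x, t, dose):
        h = torch.cat([self.shared(x), dose.unsqueeze(-1)], dim=-1)
        outs = torch.stack([head(h) for head in self.heads], dim=1)  # (B, T, 1)
        return outs.gather(1, t.view(-1, 1, 1)).squeeze()            # head of t

net = DoseResponseNet(x_dim=10, num_treatments=3)
y_hat = net(torch.randn(4, 10), torch.tensor([0, 2, 1, 0]), torch.rand(4))
```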
Variable-lag Granger Causality for Time Series Analysis
Title | Variable-lag Granger Causality for Time Series Analysis |
Authors | Chainarong Amornbunchornvej, Elena Zheleva, Tanya Y. Berger-Wolf |
Abstract | Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make a strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. However, the assumption of the fixed time delay does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop variable-lag Granger causality, a generalization of Granger causality that relaxes the assumption of the fixed time delay and allows causes to influence effects with arbitrary time delays. In addition, we propose a method for inferring variable-lag Granger causality relations. We demonstrate our approach on an application for studying coordinated collective behavior and show that it performs better than several existing methods in both simulated and real-world datasets. Our approach can be applied in any domain of time series analysis. |
Tasks | Causal Inference, Time Series, Time Series Analysis |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.10829v1 |
PDF | https://arxiv.org/pdf/1912.10829v1.pdf |
PWC | https://paperswithcode.com/paper/variable-lag-granger-causality-for-time |
Repo | https://github.com/DarkEyes/VLTimeSeriesCausality |
Framework | none |
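For context, the fixed-lag regression that the method generalizes fits in a few lines of numpy; the variable-lag variant would replace the single fixed shift with an alignment (e.g., from dynamic time warping), which is elided here.

```python
# Fixed-lag Granger check: does adding x_{t-lag} reduce the residual
# variance of an autoregression of y?
import numpy as np

def granger_gain(x, y, lag=1):
    y_t, y_past, x_past = y[lag:], y[:-lag], x[:-lag]
    resid_self = y_t - np.polyval(np.polyfit(y_past, y_t, 1), y_past)
    A = np.column_stack([y_past, x_past, np.ones_like(y_past)])
    beta, *_ = np.linalg.lstsq(A, y_t, rcond=None)
    resid_full = y_t - A @ beta
    return 1 - resid_full.var() / resid_self.var()  # > 0: x helps predict y

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = 0.8 * np.roll(x, 2) + 0.2 * rng.standard_normal(500)  # y lags x by 2
print(granger_gain(x, y, lag=2))  # large gain at the true lag
```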
Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time
Title | Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time |
Authors | Christoph Heindl, Thomas Pönitz, Gernot Stübl, Andreas Pichler, Josef Scharinger |
Abstract | Commodity RGB-D sensors capture color images along with dense pixel-wise depth information in real-time. Typical RGB-D sensors are provided with a factory calibration and exhibit erratic depth readings due to coarse calibration values, ageing and thermal influence effects. This limits their applicability in computer vision and robotics. We propose a novel method to accurately calibrate depth considering spatial and thermal influences jointly. Our work is based on Gaussian Process Regression in a four-dimensional Cartesian and thermal domain. We propose to leverage modern GPUs for dense depth map correction in real-time. For reproducibility we make our dataset and source code publicly available. |
Tasks | Calibration, Gaussian Processes |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00549v1 |
PDF | https://arxiv.org/pdf/1907.00549v1.pdf |
PWC | https://paperswithcode.com/paper/spatio-thermal-depth-correction-of-rgb-d |
Repo | https://github.com/cheind/rgbd-correction |
Framework | tf |
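A hedged sketch of the regression setup: a GP maps a four-dimensional input (pixel coordinates, raw depth, sensor temperature) to a depth correction. scikit-learn is used for brevity and the data is synthetic; the paper targets real-time GPU inference, and its kernel choices may differ.

```python
# GP regression over a 4-D spatio-thermal domain -> depth correction.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 4))          # (x, y, depth, temperature), normalized
err = 0.05 * X[:, 2] * (X[:, 3] - 0.5)  # synthetic spatio-thermal depth error

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5) + WhiteKernel(1e-4))
gp.fit(X, err)
correction, std = gp.predict(X[:5], return_std=True)  # per-pixel correction
```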
Explainability and Adversarial Robustness for RNNs
Title | Explainability and Adversarial Robustness for RNNs |
Authors | Alexander Hartl, Maximilian Bachl, Joachim Fabini, Tanja Zseby |
Abstract | Recurrent Neural Networks (RNNs) yield attractive properties for constructing Intrusion Detection Systems (IDSs) for network data. With the rise of ubiquitous Machine Learning (ML) systems, malicious actors have been catching up quickly to find new ways to exploit ML vulnerabilities for profit. Recently developed adversarial ML techniques focus on computer vision and their applicability to network traffic is not straightforward: Network packets expose fewer features than an image, are sequential and impose several constraints on their features. We show that despite these completely different characteristics, adversarial samples can be generated reliably for RNNs. To understand a classifier’s potential for misclassification, we extend existing explainability techniques and propose new ones, suitable particularly for sequential data. Applying them shows that the first packets of a communication flow are already of crucial importance and are likely to be targeted by attackers. Feature importance methods show that even relatively unimportant features can be effectively abused to generate adversarial samples. Since traditional evaluation metrics such as accuracy are not sufficient for quantifying the adversarial threat, we propose the Adversarial Robustness Score (ARS) for comparing IDSs, capturing a common notion of adversarial robustness, and show that an adversarial training procedure can significantly and successfully reduce the attack surface. |
Tasks | Feature Importance, Intrusion Detection |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09855v2 |
PDF | https://arxiv.org/pdf/1912.09855v2.pdf |
PWC | https://paperswithcode.com/paper/explainability-and-adversarial-robustness-for |
Repo | https://github.com/CN-TU/adversarial-recurrent-ids |
Framework | none |
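An occlusion-style probe over time steps illustrates the kind of sequential analysis involved: zero out one packet at a time and record the drop in the class score. The tiny model and data below are placeholders; this is neither the paper's exact method nor the ARS metric.

```python
# Per-time-step importance by occlusion: score drop when packet t is zeroed.
import torch

def step_importance(model, x, y):
    """x: (T, F) one flow; returns the score drop per occluded time step."""
    with torch.no_grad():
        base = model(x.unsqueeze(0))[0, y].item()
        drops = []
        for t in range(x.shape[0]):
            x_occ = x.clone()
            x_occ[t] = 0.0                 # occlude packet t
            drops.append(base - model(x_occ.unsqueeze(0))[0, y].item())
    return drops

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(5 * 3, 2))
print(step_importance(model, torch.randn(5, 3), y=1))
```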
Minimal Achievable Sufficient Statistic Learning
Title | Minimal Achievable Sufficient Statistic Learning |
Authors | Milan Cvitkovic, Günther Koliander |
Abstract | We introduce Minimal Achievable Sufficient Statistic (MASS) Learning, a training method for machine learning models that attempts to produce minimal sufficient statistics with respect to a class of functions (e.g. deep networks) being optimized over. In deriving MASS Learning, we also introduce Conserved Differential Information (CDI), an information-theoretic quantity that - unlike standard mutual information - can be usefully applied to deterministically-dependent continuous random variables like the input and output of a deep network. In a series of experiments, we show that deep networks trained with MASS Learning achieve competitive performance on supervised learning and uncertainty quantification benchmarks. |
Tasks | |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07822v2 |
PDF | https://arxiv.org/pdf/1905.07822v2.pdf |
PWC | https://paperswithcode.com/paper/minimal-achievable-sufficient-statistic |
Repo | https://github.com/mwcvitkovic/MASS-Learning |
Framework | pytorch |
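Only the overall training pattern is easy to sketch: a supervised loss plus a regularizer that pushes the representation toward minimality. The penalty below is an explicit placeholder; the paper's actual regularizer is built from Conserved Differential Information and is not reproduced here.

```python
# Schematic loss shape only: task loss + minimality penalty (placeholder).
import torch
import torch.nn.functional as F

def mass_style_loss(logits, y, z, beta=1e-3):
    task = F.cross_entropy(logits, y)  # supervised objective
    minimality = z.pow(2).mean()       # placeholder, NOT the CDI-based term
    return task + beta * minimality

loss = mass_style_loss(torch.randn(8, 10), torch.randint(0, 10, (8,)),
                       torch.randn(8, 32))
```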
The NN-Stacking: Feature weighted linear stacking through neural networks
Title | The NN-Stacking: Feature weighted linear stacking through neural networks |
Authors | Victor Coscrato, Marco Henrique de Almeida Inácio, Rafael Izbicki |
Abstract | Stacking methods improve the prediction performance of regression models. A simple way to stack base regression estimators is by combining them linearly, as done by Breiman (1996). Even though this approach is useful from an interpretative perspective, it often does not lead to high predictive power. We propose the NN-Stacking method (NNS), which generalizes Breiman’s method by allowing the linear parameters to vary with input features. This improvement enables NNS to take advantage of the fact that distinct base models often perform better at different regions of the feature space. Our method uses neural networks to estimate the stacking coefficients. We show that while our approach keeps the interpretative features of Breiman’s method at a local level, it leads to better predictive power, especially in datasets with large sample sizes. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09735v1 |
PDF | https://arxiv.org/pdf/1906.09735v1.pdf |
PWC | https://paperswithcode.com/paper/the-nn-stacking-feature-weighted-linear |
Repo | https://github.com/randommm/nnstacking |
Framework | pytorch |
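The method's central object is easy to sketch: a small network maps features to per-model stacking coefficients, so the linear combination of base predictions varies over the feature space. Dimensions below are illustrative.

```python
# Feature-dependent linear stacking: coefficients come from a neural net.
import torch
import torch.nn as nn

class NNStacker(nn.Module):
    def __init__(self, x_dim, n_models, hidden=32):
        super().__init__()
        self.coef_net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_models)
        )

    def forward(self, x, base_preds):       # base_preds: (B, n_models)
        w = self.coef_net(x)                # feature-dependent coefficients
        return (w * base_preds).sum(dim=1)  # locally linear stacking

stacker = NNStacker(x_dim=10, n_models=3)
y_hat = stacker(torch.randn(4, 10), torch.randn(4, 3))
```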
Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation
Title | Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation |
Authors | Zequn Sun, Chengming Wang, Wei Hu, Muhao Chen, Jian Dai, Wei Zhang, Yuzhong Qu |
Abstract | Graph neural networks (GNNs) have emerged as a powerful paradigm for embedding-based entity alignment due to their capability of identifying isomorphic subgraphs. However, in real knowledge graphs (KGs), the counterpart entities usually have non-isomorphic neighborhood structures, which easily causes GNNs to yield different representations for them. To tackle this problem, we propose a new KG alignment network, namely AliNet, aiming at mitigating the non-isomorphism of neighborhood structures in an end-to-end manner. As the direct neighbors of counterpart entities are usually dissimilar due to the schema heterogeneity, AliNet introduces distant neighbors to expand the overlap between their neighborhood structures. It employs an attention mechanism to highlight helpful distant neighbors and reduce noise. Then, it controls the aggregation of both direct and distant neighborhood information using a gating mechanism. We further propose a relation loss to refine entity representations. We perform thorough experiments with detailed ablation studies and analyses on five entity alignment datasets, demonstrating the effectiveness of AliNet. |
Tasks | Entity Alignment, Knowledge Graphs |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08936v1 |
PDF | https://arxiv.org/pdf/1911.08936v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-graph-alignment-network-with-gated |
Repo | https://github.com/nju-websoft/AliNet |
Framework | tf |
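The gating mechanism can be sketched compactly (in PyTorch for brevity, although the released code is TensorFlow): a learned gate interpolates between the 1-hop aggregate and the attention-weighted distant-neighbor aggregate. Shapes and the gate's exact inputs are assumptions.

```python
# Gated combination of direct (1-hop) and distant neighborhood aggregates.
import torch
import torch.nn as nn

class GatedAggregation(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, dim)

    def forward(self, h1, h2):            # 1-hop and distant aggregates, (N, dim)
        g = torch.sigmoid(self.gate(h2))  # how much distant context to admit
        return g * h2 + (1 - g) * h1

agg = GatedAggregation(dim=64)
h = agg(torch.randn(5, 64), torch.randn(5, 64))
```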
Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability
Title | Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability |
Authors | Shubhomoy Das, Md Rakibul Islam, Nitthilan Kannappan Jayakodi, Janardhan Rao Doppa |
Abstract | The anomaly detection (AD) task corresponds to identifying the true anomalies from a given set of data instances. AD algorithms score the data instances and produce a ranked list of candidate anomalies, which are then analyzed by a human to discover the true anomalies. However, this process can be laborious for the human analyst when the number of false-positives is very high. Therefore, in many real-world AD applications including computer security and fraud prevention, the anomaly detector must be configurable by the human analyst to minimize the effort on false positives. In this paper, we study the problem of active learning to automatically tune an ensemble of anomaly detectors to maximize the number of true anomalies discovered. We make four main contributions towards this goal. First, we present an important insight that explains the practical successes of AD ensembles and how ensembles are naturally suited for active learning. Second, we present several algorithms for active learning with tree-based AD ensembles. These algorithms help us to improve the diversity of discovered anomalies, generate rule sets for improved interpretability of anomalous instances, and adapt to streaming data settings in a principled manner. Third, we present a novel algorithm called GLocalized Anomaly Detection (GLAD) for active learning with generic AD ensembles. GLAD allows end-users to retain the use of simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances using label feedback. Fourth, we present extensive experiments to evaluate our insights and algorithms. Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup. |
Tasks | Active Learning, Anomaly Detection |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.08930v1 |
PDF | http://arxiv.org/pdf/1901.08930v1.pdf |
PWC | https://paperswithcode.com/paper/active-anomaly-detection-via-ensembles-1 |
Repo | https://github.com/shubhomoydas/ad_examples |
Framework | tf |
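A toy feedback loop in the spirit described: score with an ensemble, query the top-ranked instance, and reweight members by agreement with the analyst's label. IsolationForest and the multiplicative update below are stand-ins for the paper's tree-based algorithms, and an oracle replaces the human analyst.

```python
# Toy active anomaly detection: ensemble scoring, top-1 query, reweighting.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 0.5, (5, 2))])
members = [IsolationForest(random_state=s).fit(X) for s in range(5)]
scores = np.stack([-m.score_samples(X) for m in members])  # higher = anomalous
w = np.ones(len(members)) / len(members)

for _ in range(3):                       # three analyst-feedback rounds
    idx = int(np.argmax(w @ scores))     # query the top-ranked instance
    is_anomaly = idx >= 200              # oracle stands in for the analyst
    agree = scores[:, idx] > np.median(scores, axis=1)
    w *= np.where(agree == is_anomaly, 1.1, 0.9)  # reward agreeing members
    w /= w.sum()
    scores[:, idx] = -np.inf             # never query the same instance twice
```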