April 1, 2020

Paper Group NANR 46


A Simple Technique to Enable Saliency Methods to Pass the Sanity Checks

Title A Simple Technique to Enable Saliency Methods to Pass the Sanity Checks
Authors Anonymous
Abstract Saliency methods attempt to explain a deep net’s decision by assigning a score to each feature/pixel in the input, often doing this credit assignment via the gradient of the output with respect to the input. Recently, Adebayo et al. questioned the validity of many of these methods, since they do not pass simple sanity checks that test whether the scores shift or vanish when layers of the trained net are randomized, or when the net is retrained using random labels for the inputs. We propose a simple fix to existing saliency methods that helps them pass sanity checks, which we call competition for pixels. This involves computing saliency maps for all possible labels in the classification task, and using a simple competition among them to identify and remove less relevant pixels from the map. Some theoretical justification is provided for the fix, and its performance is demonstrated empirically on several popular methods.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BJeGZxrFvS
PDF https://openreview.net/pdf?id=BJeGZxrFvS
PWC https://paperswithcode.com/paper/a-simple-technique-to-enable-saliency-methods
Repo
Framework
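
The competition idea is concrete enough to sketch. Below is a minimal, hedged reading of it on top of gradient-times-input saliency: compute one map per label, then keep a pixel's score for the label of interest only where that label wins the per-pixel competition. The toy model, shapes, and the signed argmax are illustrative assumptions, not the authors' exact procedure.

```python
import torch
import torch.nn as nn

def saliency_maps_all_labels(model, x, num_classes):
    """Gradient-times-input saliency map for every class logit."""
    maps = []
    for c in range(num_classes):
        x_in = x.clone().requires_grad_(True)
        logit = model(x_in.unsqueeze(0))[0, c]
        grad, = torch.autograd.grad(logit, x_in)
        maps.append(grad * x_in)                 # one map per label
    return torch.stack(maps).detach()            # (C, *x.shape)

def competitive_saliency(model, x, target, num_classes):
    """Keep target-class scores only at pixels the target class wins."""
    maps = saliency_maps_all_labels(model, x, num_classes)
    winners = maps.argmax(dim=0)                 # per-pixel winning label
    out = maps[target].clone()
    out[winners != target] = 0.0                 # competition zeroes the rest
    return out

# Toy usage: random "image", linear classifier standing in for a deep net.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x = torch.randn(3, 8, 8)
sal = competitive_saliency(model, x, target=3, num_classes=10)
```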

Unsupervised Learning of Graph Hierarchical Abstractions with Differentiable Coarsening and Optimal Transport

Title Unsupervised Learning of Graph Hierarchical Abstractions with Differentiable Coarsening and Optimal Transport
Authors Anonymous
Abstract Hierarchical abstractions are a methodology for solving large-scale graph problems in various disciplines. Coarsening is one such approach: it generates a pyramid of graphs in which each level is a structural summary of the one before it. With a long history in scientific computing, many coarsening strategies have been developed based on mathematically driven heuristics. Recently, interest has resurged in deep learning in designing hierarchical methods that are learnable through differentiable parameterization. These approaches are paired with downstream tasks for supervised learning. In this work, we propose an unsupervised approach, coined OTCoarsening, based on optimal transport. Both the coarsening matrix and the transport cost matrix are parameterized, so that an optimal coarsening strategy can be learned and tailored to a given set of graphs. We demonstrate that the proposed approach produces meaningful coarse graphs and yields competitive performance compared with supervised methods for graph classification.
Tasks Graph Classification
Published 2020-01-01
URL https://openreview.net/forum?id=Bkf4XgrKvS
PDF https://openreview.net/pdf?id=Bkf4XgrKvS
PWC https://paperswithcode.com/paper/unsupervised-learning-of-graph-hierarchical
Repo
Framework
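
A loose sketch of the two ingredients, assuming a soft assignment matrix for the coarsening and plain pairwise feature distances as the transport cost (the paper parameterizes the cost matrix as well; everything named here is illustrative):

```python
import torch

def coarsen(A, X, S_logits):
    """Soft-assign n nodes to k clusters; return coarse graph and features."""
    S = torch.softmax(S_logits, dim=1)   # (n, k) learnable assignment
    A_c = S.t() @ A @ S                  # coarse adjacency (k, k)
    X_c = S.t() @ X                      # coarse node features (k, d)
    return A_c, X_c, S

def sinkhorn_cost(C, a, b, eps=0.1, iters=50):
    """Entropy-regularized OT cost between histograms a and b (Sinkhorn)."""
    K = torch.exp(-C / eps)
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u + 1e-9)
        u = a / (K @ v + 1e-9)
    P = u[:, None] * K * v[None, :]      # transport plan
    return (P * C).sum()

# Toy usage: 6 nodes coarsened to 2, trained so the coarse features stay
# close (in OT cost) to the originals. The cost here is raw feature
# distance -- an assumption, not the paper's parameterized cost.
n, k, d = 6, 2, 4
A = (torch.rand(n, n) > 0.5).float()
A = ((A + A.t()) > 0).float()                   # symmetrize
X = torch.randn(n, d)
S_logits = torch.randn(n, k, requires_grad=True)
opt = torch.optim.Adam([S_logits], lr=0.05)
for _ in range(100):
    A_c, X_c, S = coarsen(A, X, S_logits)
    C = torch.cdist(X, X_c)                     # (n, k) transport cost
    a = torch.full((n,), 1.0 / n)
    b = torch.full((k,), 1.0 / k)
    loss = sinkhorn_cost(C, a, b)
    opt.zero_grad(); loss.backward(); opt.step()
```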

How can we generalise learning distributed representations of graphs?

Title How can we generalise learning distributed representations of graphs?
Authors Anonymous
Abstract We propose a general framework for constructing unsupervised models capable of learning distributed representations of discrete structures such as graphs, drawing on R-convolution kernels and research in distributed semantics. Our framework combines the insights of Deep Graph Kernels and Graph2Vec into a unified methodology for similarity learning on graphs of arbitrary size. This is exemplified by our own instance, G2DR, which extends Graph2Vec from labelled to unlabelled graphs and tackles diagonal dominance by pruning the subgraph vocabulary from which graphs are composed. These changes produce new state-of-the-art results when G2DR embeddings are used for downstream graph classification with an off-the-shelf support vector machine, from binary classification on datasets of small labelled graphs to multi-class classification on large unlabelled graphs.
Tasks Graph Classification
Published 2020-01-01
URL https://openreview.net/forum?id=r1xI-gHFDH
PDF https://openreview.net/pdf?id=r1xI-gHFDH
PWC https://paperswithcode.com/paper/how-can-we-generalise-learning-distributed
Repo
Framework
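
To make the pipeline concrete, here is a rough sketch of its front end: Weisfeiler-Lehman relabeling turns each graph into a bag of rooted-subgraph "words", and the vocabulary is pruned before the graph documents are fed to a doc2vec-style embedding model. The hashing scheme, the degree-based initial labels for unlabelled graphs, and the pruning rule (dropping words too rare to be shared across graphs, which otherwise inflate self-similarity) are assumptions for illustration.

```python
from collections import Counter
import hashlib

def wl_words(adj, labels, iterations=2):
    """Bag of WL subtree identifiers for one graph (adjacency dict)."""
    words = list(labels.values())
    cur = dict(labels)
    for _ in range(iterations):
        nxt = {}
        for v in adj:
            sig = cur[v] + "|" + ",".join(sorted(cur[u] for u in adj[v]))
            nxt[v] = hashlib.md5(sig.encode()).hexdigest()[:8]
        cur = nxt
        words.extend(cur.values())
    return words

def prune_vocabulary(docs, min_df=2):
    """Drop words that appear in fewer than min_df graphs."""
    df = Counter(w for doc in docs for w in set(doc))
    keep = {w for w, c in df.items() if c >= min_df}
    return [[w for w in doc if w in keep] for doc in docs]

# Two toy graphs (triangle, path) with degree-based initial labels
# standing in for unlabelled graphs, as G2DR does.
tri = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
docs = [wl_words(g, {v: str(len(g[v])) for v in g}) for g in (tri, path)]
docs = prune_vocabulary(docs)
# `docs` would now be fed to a doc2vec-style model: one "document" per
# graph, WL words as its tokens, yielding one embedding per graph.
```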

Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification

Title Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification
Authors Anonymous
Abstract Graph Neural Nets (GNNs) have received increasing attention, partially due to their superior performance in many node and graph classification tasks. However, there is a lack of understanding of what they are learning and how sophisticated the learned graph functions are. In this work, we dissect GNNs for graph classification into two parts: 1) the graph filtering part, where graph-based neighbor aggregations are performed, and 2) the set function part, where a set of hidden node features is composed for prediction. To study the importance of both parts, we propose to linearize them separately. Linearizing the graph filtering function first yields the Graph Feature Network (GFN), a simple lightweight neural net defined on a set of graph-augmented features. Further linearizing GFN’s set function yields the Graph Linear Network (GLN), which is a linear function. Empirically, we evaluate on common graph classification benchmarks. To our surprise, we find that, despite the simplification, GFN matches or exceeds the best accuracies produced by recently proposed GNNs (at a fraction of the computation cost), while GLN underperforms significantly. Our results demonstrate the importance of the non-linear set function, and suggest that linear graph filtering with a non-linear set function is an efficient and powerful scheme for modeling existing graph classification benchmarks.
Tasks Graph Classification
Published 2020-01-01
URL https://openreview.net/forum?id=BJxQxeBYwH
PDF https://openreview.net/pdf?id=BJxQxeBYwH
PWC https://paperswithcode.com/paper/are-powerful-graph-neural-nets-necessary-a
Repo
Framework
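
Since the dissection is architectural, a compact sketch helps: linear graph filtering amounts to stacking propagated features [X, AX, A²X], and the non-linear set function is an MLP applied per node followed by sum pooling. Hidden sizes and the number of propagation steps below are arbitrary; a normalized adjacency would typically replace the raw one.

```python
import torch
import torch.nn as nn

class GFN(nn.Module):
    def __init__(self, in_dim, hidden, classes, hops=2):
        super().__init__()
        self.hops = hops
        self.phi = nn.Sequential(               # per-node feature MLP
            nn.Linear(in_dim * (hops + 1), hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.rho = nn.Linear(hidden, classes)   # readout after pooling

    def forward(self, A, X):
        feats = [X]
        for _ in range(self.hops):              # linear filtering: no
            feats.append(A @ feats[-1])         # weights, no non-linearity
        H = self.phi(torch.cat(feats, dim=1))   # set function on node set
        return self.rho(H.sum(dim=0))           # sum-pool, then classify

# Toy usage: one random graph with 5 nodes, 3 features, 2 classes.
A = (torch.rand(5, 5) > 0.5).float()
A = ((A + A.t()) > 0).float()                   # symmetrize
X = torch.randn(5, 3)
logits = GFN(in_dim=3, hidden=16, classes=2)(A, X)
```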

Attacking Graph Convolutional Networks via Rewiring

Title Attacking Graph Convolutional Networks via Rewiring
Authors Anonymous
Abstract Graph Neural Networks (GNNs) have boosted performance on many graph-related tasks such as node classification and graph classification. Recent research shows that graph neural networks are vulnerable to adversarial attacks, which deliberately add carefully crafted, unnoticeable perturbations to the graph structure. The perturbation is usually created by adding or deleting a few edges, which can be noticeable even when the number of modified edges is small. In this paper, we propose a graph rewiring operation that affects the graph in a less noticeable way than adding or deleting edges. We then use reinforcement learning to learn an attack strategy based on the proposed rewiring operation. Experiments on real-world graphs demonstrate the effectiveness of the proposed framework. To understand the framework, we further analyze how the perturbations it generates to the graph structure affect the output of the target model.
Tasks Graph Classification, Node Classification
Published 2020-01-01
URL https://openreview.net/forum?id=B1eXygBFPH
PDF https://openreview.net/pdf?id=B1eXygBFPH
PWC https://paperswithcode.com/paper/attacking-graph-convolutional-networks-via-1
Repo
Framework
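
The rewiring primitive itself is simple to write down. The sketch below removes an edge (u, v) and adds (u, w), preserving the degree of u and the total edge count; restricting w to the two-hop neighborhood of u via v is an assumption about what keeps the change unnoticeable, and the reinforcement-learned selection of which rewiring to apply is omitted.

```python
import numpy as np

def rewire(A, u, v, w):
    """Remove undirected edge (u, v), add (u, w). Returns a new matrix."""
    assert A[u, v] == 1 and A[u, w] == 0 and w != u
    B = A.copy()
    B[u, v] = B[v, u] = 0
    B[u, w] = B[w, u] = 1
    return B

def candidate_rewirings(A, u):
    """All (u, v, w) with v a neighbor of u and w a non-neighbor via v."""
    neigh = set(np.flatnonzero(A[u]))
    moves = []
    for v in neigh:
        for w in np.flatnonzero(A[v]):          # 2-hop candidates via v
            if w != u and w not in neigh:
                moves.append((u, v, int(w)))
    return moves

# Toy usage on a 4-node graph: enumerate legal rewirings around node 0.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])
print(candidate_rewirings(A, 0))   # e.g. (0, 1, 3): drop (0,1), add (0,3)
```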

Feature-based Augmentation for Semi-Supervised Learning

Title Feature-based Augmentation for Semi-Supervised Learning
Authors Anonymous
Abstract In this paper, we propose feature-based augmentation, a simple and efficient method for semi-supervised learning, where only a small part of the data is labeled. In semi-supervised learning, input image augmentation is a standard technique for ensuring generalization to unlabeled data. However, unlike general input augmentation (translation, flip, Gaussian noise, etc.), our method adds noise to the features that contribute most to the prediction, generating augmented features. We call this method “feature-based augmentation” because the noise is determined by the network weights themselves and augmentation is carried out at the feature level. The prediction from the augmented features is used as the target for unlabeled data. The target is stable because the noise is derived from the extracted features. Feature-based augmentation is applied to semi-supervised learning on the SVHN and CIFAR-10 datasets, where it achieves a state-of-the-art error rate. In particular, the performance gap over other methods grows as the number of labeled examples shrinks.
Tasks Image Augmentation
Published 2020-01-01
URL https://openreview.net/forum?id=BkgE2yHYDr
PDF https://openreview.net/pdf?id=BkgE2yHYDr
PWC https://paperswithcode.com/paper/feature-based-augmentation-for-semi
Repo
Framework
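
A hedged sketch of the mechanism: pick the hidden features that contribute most to the current prediction (gradient magnitude is one plausible proxy for "determined by the network weight itself"), perturb only those, and take the resulting prediction as the consistency target. The tiny encoder, the top-k rule, and the noise scale are all illustrative choices.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 64), nn.ReLU())
head = nn.Linear(64, 10)

def augmented_target(x, top_k=8, scale=0.1):
    feats = encoder(x)                          # (B, 64) hidden features
    feats_d = feats.detach().requires_grad_(True)
    score = head(feats_d).max(dim=1).values.sum()
    score.backward()                            # contribution per feature
    idx = feats_d.grad.abs().topk(top_k, dim=1).indices
    noise = torch.zeros_like(feats)             # noise only on top features
    noise.scatter_(1, idx, scale * torch.randn(feats.size(0), top_k))
    with torch.no_grad():                       # target from noisy features
        return torch.softmax(head(feats + noise), dim=1)

x_unlabeled = torch.randn(4, 3, 32, 32)
target = augmented_target(x_unlabeled)          # train the model toward this
```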

Knowledge Graph Embedding: A Probabilistic Perspective and Generalization Bounds

Title Knowledge Graph Embedding: A Probabilistic Perspective and Generalization Bounds
Authors Anonymous
Abstract We study theoretical properties of embedding methods for knowledge graph completion under the missing completely at random assumption. We prove generalization error bounds for this setting. Even though the missing completely at random setting may seem naive, it is actually how knowledge graph embedding methods are typically benchmarked in the literature. Our results provide, to a certain extent, an explanation for why knowledge graph embedding methods work (as much as classical learning theory results provide explanations for classical learning from i.i.d. data).
Tasks Graph Embedding, Knowledge Graph Completion, Knowledge Graph Embedding
Published 2020-01-01
URL https://openreview.net/forum?id=SJg2j0VFPB
PDF https://openreview.net/pdf?id=SJg2j0VFPB
PWC https://paperswithcode.com/paper/knowledge-graph-embedding-a-probabilistic
Repo
Framework

Generative Hierarchical Models for Parts, Objects, and Scenes

Title Generative Hierarchical Models for Parts, Objects, and Scenes
Authors Anonymous
Abstract Hierarchical structures, such as part-whole relationships in objects and scenes, are among the most inherent structures in natural scenes. Learning such representations via unsupervised learning can provide benefits such as interpretability, compositionality, and transferability, which are important in many downstream tasks. In this paper, we propose the first hierarchical generative model for learning multiple latent part-whole relationships in a scene. During inference, our model takes a top-down approach: it infers the representation of the more abstract concept (e.g., an object) and then infers those of the more specific concepts (e.g., its parts) by conditioning on the corresponding abstract concept. This lets the model avoid the difficult problem of routing between parts and wholes. In experiments on images containing multiple objects with different shapes and part compositions, we demonstrate that our model can learn the latent hierarchical structure between parts and wholes and generate imaginary scenes.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SkeATxrKwH
PDF https://openreview.net/pdf?id=SkeATxrKwH
PWC https://paperswithcode.com/paper/generative-hierarchical-models-for-parts
Repo
Framework
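
The top-down inference order is the part worth illustrating: the abstract latent is sampled first and each part latent is conditioned on it, so no bottom-up routing between parts and wholes is needed. The sketch below is a bare-bones conditional prior with arbitrary dimensions, not the paper's architecture.

```python
import torch
import torch.nn as nn

obj_dim, part_dim, n_parts = 16, 8, 3
part_prior = nn.Linear(obj_dim, 2 * part_dim)   # p(z_part | z_obj)

def sample_scene():
    z_obj = torch.randn(obj_dim)                # abstract concept first
    stats = part_prior(z_obj)
    mu, logvar = stats[:part_dim], stats[part_dim:]
    z_parts = [mu + (0.5 * logvar).exp() * torch.randn(part_dim)
               for _ in range(n_parts)]         # then its specific parts
    return z_obj, z_parts                       # decoders would render these
```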

Analyzing Privacy Loss in Updates of Natural Language Models

Title Analyzing Privacy Loss in Updates of Natural Language Models
Authors Anonymous
Abstract To continuously improve quality and reflect changes in data, machine learning-based services have to regularly re-train and update their core models. In the setting of language models, we show that a comparative analysis of model snapshots before and after an update can reveal a surprising amount of detailed information about the changes in the data used for training before and after the update. We discuss the privacy implications of our findings, propose mitigation strategies and evaluate their effect.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=B1xoserKPH
PDF https://openreview.net/pdf?id=B1xoserKPH
PWC https://paperswithcode.com/paper/analyzing-privacy-loss-in-updates-of-natural
Repo
Framework
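
The comparative analysis lends itself to a toy illustration: score candidate phrases by how much more likely the updated snapshot finds them than the old one; large jumps point at content that entered the training data between snapshots. Bigram lookup tables stand in for real language model snapshots here, and the scoring rule is an assumption in the spirit of the paper, not its exact method.

```python
import math

def log_prob(model, tokens):
    """Log-probability of a token sequence under a bigram table (dict)."""
    return sum(math.log(model.get((a, b), 1e-6))
               for a, b in zip(tokens, tokens[1:]))

def snapshot_diff(old, new, phrases):
    """Rank phrases by log p_new - log p_old (higher = more revealing)."""
    scores = {p: log_prob(new, p.split()) - log_prob(old, p.split())
              for p in phrases}
    return sorted(scores.items(), key=lambda kv: -kv[1])

old = {("my", "number"): 0.1, ("number", "is"): 0.2}
new = {("my", "number"): 0.1, ("number", "is"): 0.2, ("is", "12345"): 0.3}
# The phrase containing the newly learned continuation ranks first.
print(snapshot_diff(old, new, ["my number is 12345", "my number is"]))
```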

Wyner VAE: A Variational Autoencoder with Succinct Common Representation Learning

Title Wyner VAE: A Variational Autoencoder with Succinct Common Representation Learning
Authors Anonymous
Abstract A new variational autoencoder (VAE) model is proposed that learns a succinct common representation of two correlated data variables for conditional and joint generation tasks. The proposed Wyner VAE model is based on two information-theoretic problems—distributed simulation and channel synthesis—in which Wyner’s common information arises as the fundamental limit on the succinctness of the common representation. The Wyner VAE decomposes a pair of correlated data variables into their common representation (e.g., a shared concept) and local representations that capture the remaining randomness (e.g., texture and style) in the respective data variables, by imposing the mutual information between the data variables and the common representation as a regularization term. The utility of the proposed approach is demonstrated through experiments on joint and conditional generation with and without style control, using synthetic data and real images. Experimental results show that learning a succinct common representation achieves better generative performance, and that the proposed model outperforms existing VAE variants and the variational information bottleneck method.
Tasks Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=r1g1CAEKDH
PDF https://openreview.net/pdf?id=r1g1CAEKDH
PWC https://paperswithcode.com/paper/wyner-vae-a-variational-autoencoder-with
Repo
Framework

Pre-trained Contextual Embedding of Source Code

Title Pre-trained Contextual Embedding of Source Code
Authors Anonymous
Abstract The source code of a program not only serves as a formal description of an executable task, but it also serves to communicate developer intent in a human-readable form. To facilitate this, developers use meaningful identifier names and natural-language documentation. This makes it possible to successfully apply sequence-modeling approaches, shown to be effective in natural-language processing, to source code. A major advancement in natural-language understanding has been the use of pre-trained token embeddings; BERT and other works have further shown that pre-trained contextual embeddings can be extremely powerful and can be finetuned effectively for a variety of downstream supervised tasks. Inspired by these developments, we present the first attempt to replicate this success on source code. We curate a massive corpus of Python programs from GitHub to pre-train a BERT model, which we call Code Understanding BERT (CuBERT). We also pre-train Word2Vec embeddings on the same dataset. We create a benchmark of five classification tasks and compare finetuned CuBERT against sequence models trained with and without the Word2Vec embeddings. Our results show that CuBERT outperforms the baseline methods by a margin of 2.9-22%. We also show its superiority when finetuned with smaller datasets, and over fewer epochs.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=rygoURNYvS
PDF https://openreview.net/pdf?id=rygoURNYvS
PWC https://paperswithcode.com/paper/pre-trained-contextual-embedding-of-source
Repo
Framework
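
The fine-tuning recipe is the standard BERT one, so a generic sketch conveys it. Note the checkpoint name below is a placeholder (a `transformers` hub identifier for CuBERT is not assumed to exist); any BERT-style checkpoint fits the same pattern, with source code treated as text.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)                    # e.g. buggy / ok

snippet = "def add(a, b):\n    return a - b"              # code as text
batch = tok(snippet, return_tensors="pt", truncation=True)
labels = torch.tensor([1])

out = model(**batch, labels=labels)                       # loss + logits
out.loss.backward()                                       # one finetune step
```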

Irrationality can help reward inference

Title Irrationality can help reward inference
Authors Anonymous
Abstract Specifying reward functions is difficult, which motivates the area of reward inference: learning rewards from human behavior. The starting assumption in the area is that human behavior is optimal given the desired reward function, but in reality people exhibit many forms of irrationality, from noise to myopia to risk aversion and beyond. This fact seems strictly harmful to reward inference: it is already hard to infer the reward from rational behavior, and noise and systematic biases make actions bear a less direct relationship to the reward. Our insight in this work is that, contrary to expectations, irrationality can actually help rather than hinder reward inference. For some types and amounts of irrationality, the expert produces more varied policies than rational behavior would, which helps disambiguate among reward parameters that would otherwise correspond to the same rational behavior. We put this to the test in a systematic analysis of the effect of irrationality on reward inference. We start by covering the space of irrationalities as deviations from the Bellman update, simulate expert behavior, and measure the accuracy of inference to contrast the different types and study the gains and losses. We provide a mutual-information-based analysis of our findings, and wrap up by discussing the need to accurately model irrationality, as well as to what extent we might expect (or be able to train) real people to exhibit helpful irrationalities when teaching rewards to learners.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BJlo91BYPr
PDF https://openreview.net/pdf?id=BJlo91BYPr
PWC https://paperswithcode.com/paper/irrationality-can-help-reward-inference
Repo
Framework
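
A toy example makes the counterintuitive claim tangible: two reward hypotheses with the same optimal action are indistinguishable from a perfectly rational expert, but a Boltzmann-noisy expert (one simple deviation from the exact argmax/Bellman update) visits the suboptimal action at a rate that depends on the reward gap, which a posterior can pick up. The bandit, the candidate rewards, and the known temperature are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two reward hypotheses with the SAME optimal arm: a fully rational
# (argmax) expert behaves identically under both.
hypotheses = [np.array([1.0, 0.9, 0.0]), np.array([1.0, 0.1, 0.0])]

def boltzmann_policy(r, beta):
    """Softmax-noisy expert: beta controls how rational it is."""
    p = np.exp(beta * r)
    return p / p.sum()

def posterior(actions, beta):
    """Uniform-prior posterior over hypotheses given observed actions."""
    like = np.array([np.prod(boltzmann_policy(r, beta)[actions])
                     for r in hypotheses])
    return like / like.sum()

true_r = hypotheses[0]
for beta in (1.0, 5.0, 50.0):                # larger beta = more rational
    pi = boltzmann_policy(true_r, beta)
    actions = rng.choice(3, size=200, p=pi)  # simulated expert behavior
    print(beta, posterior(actions, beta))
# The frequency of the suboptimal arm is what separates the hypotheses;
# a perfectly rational expert would reveal nothing about the gap.
```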

Improving the robustness of ImageNet classifiers using elements of human visual cognition

Title Improving the robustness of ImageNet classifiers using elements of human visual cognition
Authors Anonymous
Abstract We investigate the robustness properties of image recognition models equipped with two features inspired by human vision, an explicit episodic memory and a shape bias, at the ImageNet scale. As reported in previous work, we show that an explicit episodic memory improves the robustness of image recognition models against small-norm adversarial perturbations under some threat models. It does not, however, improve robustness against more natural, and typically larger, perturbations. Learning more robust features during training appears to be necessary for robustness in this second sense. We show that features derived from a model that was encouraged to learn global, shape-based representations (Geirhos et al., 2019) not only improve robustness against natural perturbations, but, when used in conjunction with an episodic memory, also provide additional robustness against adversarial perturbations. Finally, we address three important design choices for the episodic memory: memory size, the dimensionality of the memories, and the retrieval method. We show that to make the episodic memory more compact, it is preferable to reduce the number of memories by clustering them, rather than to reduce their dimensionality.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HylYtaVtwS
PDF https://openreview.net/pdf?id=HylYtaVtwS
PWC https://paperswithcode.com/paper/improving-the-robustness-of-imagenet-1
Repo
Framework
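
The memory-compaction finding is easy to sketch: classify by retrieving the nearest stored feature, and shrink the memory by clustering each class's features and keeping centroids, rather than by reducing dimensionality. Random vectors stand in for CNN features below, and 1-NN retrieval is a simplification of the cache-based retrieval studied in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 64))            # stand-in for CNN features
labels = rng.integers(0, 10, size=1000)

def compact_memory(feats, labels, per_class=8):
    """Replace each class's memories with its k-means centroids."""
    mem_f, mem_y = [], []
    for c in np.unique(labels):
        km = KMeans(n_clusters=per_class, n_init=10).fit(feats[labels == c])
        mem_f.append(km.cluster_centers_)
        mem_y.extend([c] * per_class)
    return np.vstack(mem_f), np.array(mem_y)

def predict(query, mem_f, mem_y):
    """Label of the nearest stored memory (1-NN retrieval)."""
    d = np.linalg.norm(mem_f - query, axis=1)
    return mem_y[d.argmin()]

mem_f, mem_y = compact_memory(feats, labels)   # 1000 -> 80 memories
print(predict(rng.normal(size=64), mem_f, mem_y))
```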

Contrastive Learning of Structured World Models

Title Contrastive Learning of Structured World Models
Authors Anonymous
Abstract A structured understanding of our world in terms of objects, relations, and hierarchies is an important component of human cognition. Learning such a structured world model from raw sensory data remains a challenge. As a step towards this goal, we introduce Contrastively-trained Structured World Models (C-SWMs). C-SWMs utilize a contrastive approach for representation learning in environments with compositional structure. We structure each state embedding as a set of object representations and their relations, modeled by a graph neural network. This allows objects to be discovered from raw pixel observations without direct supervision as part of the learning process. We evaluate C-SWMs on compositional environments involving multiple interacting objects that can be manipulated independently by an agent, simple Atari games, and a multi-object physics simulation. Our experiments demonstrate that C-SWMs can overcome limitations of models based on pixel reconstruction and outperform typical representatives of this model class in highly structured environments, while learning interpretable object-based representations.
Tasks Atari Games, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=H1gax6VtDB
PDF https://openreview.net/pdf?id=H1gax6VtDB
PWC https://paperswithcode.com/paper/contrastive-learning-of-structured-world
Repo
Framework
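
The contrastive training signal condenses well into code: an encoder and a transition model are trained so the predicted next-state embedding lands near the true one while a corrupted (negative) state stays at least a margin away. The sketch collapses the object-factored GNN into plain MLPs, so it shows the loss structure, not the full model.

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 4))
trans = nn.Sequential(nn.Linear(4 + 2, 32), nn.ReLU(), nn.Linear(32, 4))

def cswm_loss(s, a, s_next, s_neg, margin=1.0):
    z, z_next, z_neg = enc(s), enc(s_next), enc(s_neg)
    z_pred = z + trans(torch.cat([z, a], dim=1))   # predicted next state
    pos = ((z_pred - z_next) ** 2).sum(dim=1)      # pull prediction close
    neg = ((z_neg - z_next) ** 2).sum(dim=1)       # push negatives away
    return (pos + torch.relu(margin - neg)).mean()

# Toy usage on random transitions (state dim 10, action dim 2).
s, s_next, s_neg = (torch.randn(8, 10) for _ in range(3))
a = torch.randn(8, 2)
loss = cswm_loss(s, a, s_next, s_neg)
loss.backward()
```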

Efficient Systolic Array Based on Decomposable MAC for Quantized Deep Neural Networks

Title Efficient Systolic Array Based on Decomposable MAC for Quantized Deep Neural Networks
Authors Ning-Chi Huang, Huan-Jan Chou, Kai-Chiang Wu
Abstract Deep Neural Networks (DNNs) have achieved high accuracy in various machine learning applications in recent years. As the recognition accuracy of deep learning applications increases, reducing the complexity of these neural networks and performing DNN computation on embedded systems or mobile devices becomes an emerging and crucial challenge. Quantization has been proposed to reduce the utilization of computational resources by compressing the input data and weights from floating-point numbers to integers with shorter bit-widths. For practical power reduction, it is necessary to operate these DNNs with quantized parameters on appropriate hardware; systolic arrays are therefore adopted as the major computation units for matrix multiplication in DNN accelerators. To obtain a better tradeoff between precision/accuracy and power consumption, using parameters with various bit-widths in different layers of a DNN is an advanced quantization method. In this paper, we propose a novel decomposition strategy to construct a low-power decomposable multiplier-accumulator (MAC) for energy-efficient quantized DNNs. In our experiments, when 65% of the multiplication operations of VGG-16 are performed at a shorter bit-width with at most 1% accuracy loss on the CIFAR-10 dataset, our decomposable MAC achieves a 50% energy reduction compared with a non-decomposable MAC.
Tasks Quantization
Published 2020-01-01
URL https://openreview.net/forum?id=Hye-p0VFPB
PDF https://openreview.net/pdf?id=Hye-p0VFPB
PWC https://paperswithcode.com/paper/efficient-systolic-array-based-on
Repo
Framework
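
The decomposition such a MAC builds on is ordinary radix arithmetic: a 16-by-16-bit product is the weighted sum of four 8-by-8-bit partial products, so an array of narrow multipliers can either be composed into one wide multiply or run narrow quantized multiplies with the unused partial products gated off. A quick numeric check of the identity (not the paper's circuit):

```python
def split8(x):
    """Split a 16-bit value into (high byte, low byte)."""
    return x >> 8, x & 0xFF

def mul16_from_8(a, b):
    """Compose a 16x16 multiply from four 8x8 partial products."""
    ah, al = split8(a)
    bh, bl = split8(b)
    return ((ah * bh) << 16) + ((ah * bl + al * bh) << 8) + al * bl

a, b = 51234, 60000
assert mul16_from_8(a, b) == a * b           # composed wide multiply
# In 8-bit mode the array would instead compute al * bl directly,
# gating off the other three partial products for quantized layers.
print(mul16_from_8(a, b), a * b)
```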