April 1, 2020

Paper Group NAWR 6

Encoder-Agnostic Adaptation for Conditional Language Generation. LEARNED STEP SIZE QUANTIZATION. Deep Graph Translation. RPGAN: random paths as a latent space for GAN interpretability. Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization. Composition-based Multi-Relational Graph Convolutional Networks. Argus: Efficient …

Encoder-Agnostic Adaptation for Conditional Language Generation

Title Encoder-Agnostic Adaptation for Conditional Language Generation
Authors Anonymous
Abstract Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work, we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretrained transformer models are sensitive to large parameter changes during tuning. Therefore, we propose an adaptation that directly injects arbitrary conditioning into self attention, an approach we call pseudo self attention. Through experiments on four diverse conditional text generation tasks, we show that this encoder-agnostic technique outperforms strong baselines, produces coherent generations, and is data-efficient.
Tasks Language Modelling, Text Generation
Published 2020-01-01
URL https://openreview.net/forum?id=B1xq264YvH
PDF https://openreview.net/pdf?id=B1xq264YvH
PWC https://paperswithcode.com/paper/encoder-agnostic-adaptation-for-conditional-1
Repo https://github.com/anon37234/encoder-agnostic-adaptation
Framework pytorch
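
The abstract describes pseudo self attention only as injecting conditioning directly into self attention. One plausible reading, sketched below in plain Python, is that the conditioning input is projected by new matrices and prepended to the keys and values of an existing self-attention layer, so the pretrained projections stay untouched. The names `u_k` and `u_v`, the single-head layout, and the omission of causal masking are assumptions made for illustration, not details confirmed by this listing.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_self_attention(x, y, w_q, w_k, w_v, u_k, u_v):
    """Single-head attention where conditioning rows y join the keys and values.

    x: target-side hidden states (rows = positions)
    y: conditioning vectors from any encoder (rows = conditioning items)
    w_q, w_k, w_v: stand-ins for pretrained projections
    u_k, u_v: hypothetical new projections for the conditioning input
    """
    queries = matmul(x, w_q)
    keys = matmul(y, u_k) + matmul(x, w_k)      # list concatenation: conditioning rows first
    values = matmul(y, u_v) + matmul(x, w_v)
    scale = math.sqrt(len(queries[0]))
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / scale for k_row in keys]
              for q_row in queries]
    weights = [softmax(row) for row in scores]
    return matmul(weights, values), weights
```

Each target position attends over the conditioning rows plus the usual sequence positions, so the attention weight rows have length `len(y) + len(x)`. A real language model would also apply a causal mask to the sequence positions.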

LEARNED STEP SIZE QUANTIZATION

Title LEARNED STEP SIZE QUANTIZATION
Authors Anonymous
Abstract Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Our approach builds upon existing methods for learning weights in quantized networks by improving how the quantizer itself is configured. Specifically, we introduce a novel means to estimate and scale the task loss gradient at each weight and activation layer’s quantizer step size, such that it can be learned in conjunction with other network parameters. This approach works using different levels of precision as needed for a given system and requires only a simple modification of existing training code.
Tasks Quantization
Published 2020-01-01
URL https://openreview.net/forum?id=rkgO66VKDS
PDF https://openreview.net/pdf?id=rkgO66VKDS
PWC https://paperswithcode.com/paper/learned-step-size-quantization-1
Repo https://github.com/hustzxd/LSQuantization
Framework pytorch
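
As a rough illustration of the quantizer the abstract refers to, the sketch below quantizes a scalar with a step size `s` and returns the straight-through gradient of the quantized value with respect to `s`, together with a gradient scale. The clipping bounds, the piecewise gradient, and the `1 / sqrt(N * Q_P)` scale follow the formulation as recalled from the published paper; they are not stated in the abstract above and should be checked against the paper or the linked repository. Note that Python's `round` sends ties to the nearest even integer.

```python
import math


def _bounds(bits, signed):
    """Integer clipping range (Q_N, Q_P) for the given precision."""
    if signed:
        return 2 ** (bits - 1), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def lsq_quantize(v, s, bits, signed=True):
    """Quantize v onto a grid of step s: clip v/s, round, rescale."""
    q_n, q_p = _bounds(bits, signed)
    clipped = max(-q_n, min(q_p, v / s))
    return round(clipped) * s


def lsq_step_grad(v, s, bits, signed=True):
    """Straight-through estimate of d(quantized v)/ds."""
    q_n, q_p = _bounds(bits, signed)
    ratio = v / s
    if ratio <= -q_n:
        return float(-q_n)
    if ratio >= q_p:
        return float(q_p)
    return round(ratio) - ratio


def grad_scale(num_elements, q_p):
    """Scale applied to the step-size gradient (assumed form: 1/sqrt(N * Q_P))."""
    return 1.0 / math.sqrt(num_elements * q_p)
```

For example, `lsq_quantize(0.3, 0.25, bits=2)` clips `0.3 / 0.25 = 1.2` to the signed 2-bit maximum of 1 and returns `0.25`.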

Deep Graph Translation

Title Deep Graph Translation
Authors Anonymous
Abstract Deep graph generation models have achieved great successes recently. Most of them, however, are unconditioned generative models that offer no control over the target graphs given an input graph. In this paper, we propose a novel Graph-Translation-Generative-Adversarial-Networks (GT-GAN) that transforms the input graphs into their target output graphs. GT-GAN consists of a graph translator equipped with innovative graph convolution and deconvolution layers to learn the translation mapping considering both global and local features, and a new conditional graph discriminator to classify target graphs by conditioning on input graphs. Extensive experiments on multiple synthetic and real-world datasets demonstrate that our proposed GT-GAN significantly outperforms other baseline methods in terms of both effectiveness and scalability. For instance, GT-GAN achieves at least 10X and 15X faster runtimes than GraphRNN and RandomVAE, respectively, when the size of the graph is around 50.
Tasks Graph Generation
Published 2020-01-01
URL https://openreview.net/forum?id=r1e0G04Kvr
PDF https://openreview.net/pdf?id=r1e0G04Kvr
PWC https://paperswithcode.com/paper/deep-graph-translation-1
Repo https://github.com/anonymous1025/Deep-Graph-Translation-
Framework pytorch

RPGAN: random paths as a latent space for GAN interpretability

Title RPGAN: random paths as a latent space for GAN interpretability
Authors Anonymous
Abstract In this paper, we introduce Random Path Generative Adversarial Network (RPGAN), an alternative GAN design that can serve as a tool for generative model analysis. While the latent space of a typical GAN consists of input vectors, randomly sampled from the standard Gaussian distribution, the latent space of RPGAN consists of random paths in a generator network. As we show, this design makes it possible to associate different layers of the generator with different regions of the latent space, providing their natural interpretability. With experiments on standard benchmarks, we demonstrate that RPGAN reveals several interesting insights about the roles that different layers play in the image generation process. Aside from interpretability, the RPGAN model also provides competitive generation quality and allows efficient incremental learning on new data.
Tasks Image Generation
Published 2020-01-01
URL https://openreview.net/forum?id=BJgctpEKwr
PDF https://openreview.net/pdf?id=BJgctpEKwr
PWC https://paperswithcode.com/paper/rpgan-random-paths-as-a-latent-space-for-gan
Repo https://github.com/rpgan-ICLR2020/RPGAN
Framework pytorch

Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization

Title Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization
Authors Anonymous
Abstract Transferring knowledge across tasks to improve data-efficiency is one of the key open challenges in the area of global optimization algorithms. Readily available algorithms are typically designed to be universal optimizers and are thus often suboptimal for specific tasks. We propose a novel transfer learning method to obtain customized optimizers within the well-established framework of Bayesian optimization, allowing our algorithm to utilize the proven generalization capabilities of Gaussian processes. Using reinforcement learning to meta-train an acquisition function (AF) on a set of related tasks, the proposed method learns to extract implicit structural information and to exploit it for improved data-efficiency. We present experiments on a sim-to-real transfer task as well as on several simulated functions and two hyperparameter search problems. The results show that our algorithm (1) automatically identifies structural properties of objective functions from available source tasks or simulations, (2) performs favourably in settings with both scarce and abundant source data, and (3) falls back to the performance level of general AFs if no structure is present.
Tasks Gaussian Processes, Meta-Learning, Transfer Learning
Published 2020-01-01
URL https://openreview.net/forum?id=ryeYpJSKwr
PDF https://openreview.net/pdf?id=ryeYpJSKwr
PWC https://paperswithcode.com/paper/meta-learning-acquisition-functions-for-1
Repo https://github.com/metabo-iclr2020/MetaBO
Framework none

Composition-based Multi-Relational Graph Convolutional Networks

Title Composition-based Multi-Relational Graph Convolutional Networks
Authors Anonymous
Abstract Graph Convolutional Networks (GCNs) have recently been shown to be quite successful in modeling graph-structured data. However, the primary focus has been on handling simple undirected graphs. Multi-relational graphs are a more general and prevalent form of graphs where each edge has a label and direction associated with it. Most of the existing approaches to handle such graphs suffer from over-parameterization and are restricted to learning representations of nodes only. In this paper, we propose CompGCN, a novel Graph Convolutional framework which jointly embeds both nodes and relations in a relational graph. CompGCN leverages a variety of entity-relation composition operations from Knowledge Graph Embedding techniques and scales with the number of relations. It also generalizes several of the existing multi-relational GCN methods. We evaluate our proposed method on multiple tasks such as node classification, link prediction, and graph classification, and achieve demonstrably superior results. We make the source code of CompGCN available to foster reproducible research.
Tasks Graph Classification, Graph Embedding, Knowledge Graph Embedding, Link Prediction, Node Classification
Published 2020-01-01
URL https://openreview.net/forum?id=BylA_C4tPr
PDF https://openreview.net/pdf?id=BylA_C4tPr
PWC https://paperswithcode.com/paper/composition-based-multi-relational-graph-1
Repo https://github.com/malllabiisc/CompGCN
Framework pytorch
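
The abstract mentions entity-relation composition operations borrowed from knowledge graph embedding without naming them. The sketch below shows three operators commonly used for this purpose: subtraction (TransE-style), element-wise multiplication (DistMult-style), and circular correlation (HolE-style). Which of these CompGCN actually uses is an assumption here and should be checked against the linked repository; the function names are hypothetical.

```python
def compose_sub(entity, relation):
    """Subtraction: entity - relation, element by element."""
    return [e - r for e, r in zip(entity, relation)]


def compose_mult(entity, relation):
    """Element-wise (Hadamard) product of entity and relation embeddings."""
    return [e * r for e, r in zip(entity, relation)]


def compose_corr(entity, relation):
    """Circular correlation: out[k] = sum_i entity[i] * relation[(i + k) mod d]."""
    d = len(entity)
    return [sum(entity[i] * relation[(i + k) % d] for i in range(d)) for k in range(d)]
```

All three return a vector with the same dimension as the inputs, which is what lets a layer swap one composition for another without adding relation-specific weight matrices.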

Argus: Efficient Activity Detection System for Extended Video Analysis

Title Argus: Efficient Activity Detection System for Extended Video Analysis
Authors Wenhe Liu, Guoliang Kang, Po-Yao Huang, Xiaojun Chang, Yijun Qian, Junwei Liang, Liangke Gui, Jing Wen, Peng Chen
Abstract We propose an Efficient Activity Detection System, Argus, for Extended Video Analysis in the surveillance scenario. For spatial-temporal event detection in surveillance video, we first generate video proposals by applying object detection and tracking algorithms that share the detection features. After that, we extract several different features and apply sequential activity classification with them. Finally, we eliminate inaccurate events and fuse all the predictions from different features. The proposed system wins the TRECVID Activities in Extended Video (ActEV) challenge 2019. It achieves first place with 60.5 mean weighted Pmiss, outperforming the second-place system by 14.5 and the baseline R-C3D by 29.0. In the TRECVID 2019 Challenge, the proposed system wins first place with a pAUDC@0.2tfa of 0.48407.
Tasks Action Detection, Activity Detection, Multi-Object Tracking, Object Detection, Video Object Detection, Video Object Tracking
Published 2020-03-02
URL http://openaccess.thecvf.com/content_WACVW_2020/html/w5/Liu_Argus_Efficient_Activity_Detection_System_for_Extended_Video_Analysis_WACVW_2020_paper.html
PDF http://openaccess.thecvf.com/content_WACVW_2020/papers/w5/Liu_Argus_Efficient_Activity_Detection_System_for_Extended_Video_Analysis_WACVW_2020_paper.pdf
PWC https://paperswithcode.com/paper/argus-efficient-activity-detection-system-for
Repo https://github.com/JunweiLiang/Object_Detection_Tracking
Framework tf

Multi-scale Attributed Node Embedding

Title Multi-scale Attributed Node Embedding
Authors Anonymous
Abstract We present network embedding algorithms that capture information about a node from the local distribution over node attributes around it, as observed over random walks following an approach similar to Skip-gram. Observations from neighborhoods of different sizes are either pooled (AE) or encoded distinctly in a multi-scale approach (MUSAE). Capturing attribute-neighborhood relationships over multiple scales is useful for a diverse range of applications, including latent feature identification across disconnected networks with similar attributes. We prove theoretically that matrices of node-feature pointwise mutual information are implicitly factorized by the embeddings. Experiments show that our algorithms are robust, computationally efficient and outperform comparable models on social, web and citation network datasets.
Tasks Network Embedding
Published 2020-01-01
URL https://openreview.net/forum?id=HJxiMAVtPH
PDF https://openreview.net/pdf?id=HJxiMAVtPH
PWC https://paperswithcode.com/paper/multi-scale-attributed-node-embedding-1
Repo https://github.com/iclr2020/MUSAE
Framework none
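
The abstract states that the embeddings implicitly factorize matrices of node-feature pointwise mutual information (PMI). As a small worked example of what such a matrix contains, the sketch below computes PMI from a table of node-feature co-occurrence counts. How the counts are gathered (for instance over random walks of a given length) is left out, and the function name is hypothetical.

```python
import math


def pmi_matrix(counts):
    """PMI(node, feature) = log( p(node, feature) / (p(node) * p(feature)) ).

    counts[i][j] is how often node i co-occurred with feature j.
    Pairs that never co-occur get -inf, matching the log of a zero probability.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    return [
        [
            math.log(c * total / (row_sums[i] * col_sums[j])) if c > 0 else float("-inf")
            for j, c in enumerate(row)
        ]
        for i, row in enumerate(counts)
    ]
```

When a node and a feature co-occur exactly as often as independence would predict, the entry is zero; positive entries mark features that are over-represented around a node.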

Deep symbolic regression

Title Deep symbolic regression
Authors Anonymous
Abstract Discovering the underlying mathematical expressions describing a dataset is a core challenge for artificial intelligence. This is the problem of symbolic regression. Despite recent advances in training neural networks to solve complex tasks, deep learning approaches to symbolic regression are lacking. We propose a framework that combines deep learning with symbolic regression via a simple idea: use a large model to search the space of small models. More specifically, we use a recurrent neural network to emit a distribution over tractable mathematical expressions, and employ reinforcement learning to train the network to generate better-fitting expressions. Our algorithm significantly outperforms standard genetic programming-based symbolic regression in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. More broadly, our contributions include a framework that can be applied to optimize hierarchical, variable-length objects under a black-box performance metric, with the ability to incorporate a priori constraints in situ.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=S1gKA6NtPS
PDF https://openreview.net/pdf?id=S1gKA6NtPS
PWC https://paperswithcode.com/paper/deep-symbolic-regression
Repo https://github.com/brendenpetersen/deep-symbolic-regression
Framework none

GraphSAINT: Graph Sampling Based Inductive Learning Method

Title GraphSAINT: Graph Sampling Based Inductive Learning Method
Authors Anonymous
Abstract Graph Convolutional Networks (GCNs) are powerful models for learning representations of attributed graphs. To scale GCNs to large graphs, state-of-the-art methods use various layer sampling techniques to alleviate the “neighbor explosion” problem during minibatch training. We propose GraphSAINT, a graph sampling based inductive learning method that improves training efficiency and accuracy in a fundamentally different way. By changing perspective, GraphSAINT constructs minibatches by sampling the training graph, rather than the nodes or edges across GCN layers. In each iteration, a complete GCN is built from the properly sampled subgraph. Thus, we ensure a fixed number of well-connected nodes in all layers. We further propose a normalization technique to eliminate bias, and sampling algorithms for variance reduction. Importantly, we can decouple the sampling from the forward and backward propagation, and extend GraphSAINT with many architecture variants (e.g., graph attention, jumping connection). GraphSAINT demonstrates superior performance in both accuracy and training time on five large graphs, and achieves new state-of-the-art F1 scores for PPI (0.995) and Reddit (0.970).
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BJe8pkHFwS
PDF https://openreview.net/pdf?id=BJe8pkHFwS
PWC https://paperswithcode.com/paper/graphsaint-graph-sampling-based-inductive-1
Repo https://github.com/GraphSAINT/GraphSAINT
Framework tf
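
To make the sampling idea concrete, the sketch below builds a minibatch as the subgraph induced by a random set of training nodes, then estimates per-node loss weights from how often each node is sampled. The uniform node sampler and the inverse-frequency weighting are simplifying assumptions; the paper's actual samplers and normalization constants may differ, and both function names are hypothetical.

```python
import random


def sample_subgraph(adj, num_nodes, rng):
    """Return the subgraph induced by num_nodes uniformly sampled nodes.

    adj maps each node to a list of its neighbours.
    """
    nodes = sorted(rng.sample(sorted(adj), num_nodes))
    keep = set(nodes)
    return {u: [v for v in adj[u] if v in keep] for u in nodes}


def estimate_loss_weights(adj, num_nodes, num_samples, rng):
    """Weight each node by the inverse of its empirical sampling frequency.

    Nodes that are sampled rarely get a larger weight, which counteracts the
    bias introduced by training only on sampled subgraphs.
    """
    counts = {u: 0 for u in adj}
    for _ in range(num_samples):
        for u in sample_subgraph(adj, num_nodes, rng):
            counts[u] += 1
    return {u: num_samples / max(c, 1) for u, c in counts.items()}
```

A full GCN forward and backward pass would then run on each sampled subgraph, so every layer sees the same fixed set of nodes rather than an expanding neighbourhood.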

Restricting the Flow: Information Bottlenecks for Attribution

Title Restricting the Flow: Information Bottlenecks for Attribution
Authors Anonymous
Abstract Attribution methods provide insights into the decision-making of machine learning models like artificial neural networks. For a given input sample, they assign a relevance score to each individual input variable, such as the pixels of an image. In this work we adapt the information bottleneck concept for attribution. By adding noise to intermediate feature maps we restrict the flow of information and can quantify (in bits) how much information image regions provide. We compare our method against ten baselines using three different metrics on VGG-16 and ResNet-50, and find that our methods outperform all baselines in five out of six settings. The method’s information-theoretic foundation provides an absolute frame of reference for attribution values (bits) and a guarantee that regions scored close to zero are not required for the network’s decision.
Tasks Decision Making
Published 2020-01-01
URL https://openreview.net/forum?id=S1xWh1rYwB
PDF https://openreview.net/pdf?id=S1xWh1rYwB
PWC https://paperswithcode.com/paper/restricting-the-flow-information-bottlenecks
Repo https://github.com/attribution-bottleneck/attribution-bottleneck-pytorch
Framework pytorch
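
The noise-injection step in the abstract can be sketched for a single feature value. Below, a mask value `lam` in [0, 1) blends the feature `r` with Gaussian noise drawn from the feature's own mean and standard deviation, and the information that passes through is measured as a KL divergence to the pure-noise distribution, converted to bits. The blending formula and the Gaussian assumption are one common way to set up such a bottleneck and are assumptions here, not details given in the abstract.

```python
import math
import random


def bottleneck(r, mu, sigma, lam, rng):
    """Blend feature r with noise: lam = 0 is pure noise, lam near 1 keeps r."""
    noise = rng.gauss(mu, sigma)
    return lam * r + (1.0 - lam) * noise


def information_bits(r, mu, sigma, lam):
    """KL( N(lam*r + (1-lam)*mu, (1-lam)^2 sigma^2) || N(mu, sigma^2) ) in bits.

    Requires 0 <= lam < 1, since lam = 1 removes all noise and the KL diverges.
    """
    mean_shift = lam * (r - mu) / sigma          # mean in normalized units
    var = (1.0 - lam) ** 2                       # variance in normalized units
    kl_nats = 0.5 * (var + mean_shift ** 2 - 1.0 - math.log(var))
    return kl_nats / math.log(2.0)
```

With `lam = 0` the output is pure noise and the measure is exactly zero bits, which matches the abstract's point that regions scored near zero carry no information the network needs.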

Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks

Title Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks
Authors Anonymous
Abstract While tasks could come with varying numbers of instances and classes in realistic settings, the existing meta-learning approaches for few-shot classification assume that the number of instances per task and class is fixed. Due to this restriction, they learn to utilize the meta-knowledge equally across all tasks, even when the number of instances per task and class varies widely. Moreover, they do not consider distributional differences in unseen tasks, on which the meta-knowledge may be less useful depending on the task relatedness. To overcome these limitations, we propose a novel meta-learning model that adaptively balances the effect of the meta-learning and task-specific learning within each task. Through the learning of the balancing variables, we can decide whether to obtain a solution by relying on the meta-knowledge or task-specific learning. We formulate this objective into a Bayesian inference framework and tackle it using variational inference. We validate our Bayesian Task-Adaptive Meta-Learning (Bayesian TAML) on two realistic task- and class-imbalanced datasets, on which it significantly outperforms existing meta-learning approaches. A further ablation study confirms the effectiveness of each balancing component and the Bayesian learning framework.
Tasks Bayesian Inference, Meta-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rkeZIJBYvr
PDF https://openreview.net/pdf?id=rkeZIJBYvr
PWC https://paperswithcode.com/paper/learning-to-balance-bayesian-meta-learning-1
Repo https://github.com/haebeom-lee/l2b
Framework tf

Deep Bayesian Structure Networks

Title Deep Bayesian Structure Networks
Authors Anonymous
Abstract Bayesian neural networks (BNNs) introduce uncertainty estimation to deep networks by performing Bayesian inference on network weights. However, such models make inference challenging, and BNNs with weight uncertainty rarely achieve superior performance to standard models. In this paper, we investigate a new line of Bayesian deep learning by performing Bayesian reasoning on the structure of deep neural networks. Drawing inspiration from neural architecture search, we define the network structure as random weights on the redundant operations between computational nodes, and apply stochastic variational inference techniques to learn the structure distributions of networks. Empirically, the proposed method substantially surpasses advanced deep neural networks across a range of classification and segmentation tasks. More importantly, our approach also preserves the benefits of Bayesian principles, producing better uncertainty estimation than strong baselines including MC dropout and variational BNN algorithms (e.g. noisy EK-FAC).
Tasks Bayesian Inference, Neural Architecture Search
Published 2020-01-01
URL https://openreview.net/forum?id=B1gXR3NtwS
PDF https://openreview.net/pdf?id=B1gXR3NtwS
PWC https://paperswithcode.com/paper/deep-bayesian-structure-networks
Repo https://github.com/anonymousest/DBSN
Framework tf

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation

Title U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
Authors Anonymous
Abstract We propose a novel method for unsupervised image-to-image translation, which incorporates a new attention module and a new learnable normalization function in an end-to-end manner. The attention module guides our model to focus on more important regions distinguishing between source and target domains based on the attention map obtained by the auxiliary classifier. Unlike previous attention-based methods, which cannot handle geometric changes between domains, our model can translate both images requiring holistic changes and images requiring large shape changes. Moreover, our new AdaLIN (Adaptive Layer-Instance Normalization) function helps our attention-guided model to flexibly control the amount of change in shape and texture by learned parameters depending on datasets. Experimental results show the superiority of the proposed method compared to the existing state-of-the-art models with a fixed network architecture and hyper-parameters.
Tasks Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published 2020-01-01
URL https://openreview.net/forum?id=BJlZ5ySKPH
PDF https://openreview.net/pdf?id=BJlZ5ySKPH
PWC https://paperswithcode.com/paper/u-gat-it-unsupervised-generative-attentional-1
Repo https://github.com/taki0112/UGATIT
Framework tf
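
The name AdaLIN suggests a learned blend of instance normalization and layer normalization. A minimal sketch of that idea follows, assuming a mixing parameter `rho` clipped to [0, 1] and scalar `gamma` and `beta`; in the actual model these are likely per-channel and produced by the network, so treat the signature as illustrative only.

```python
import math


def adalin(x, rho, gamma, beta, eps=1e-5):
    """Blend instance norm and layer norm on one feature map.

    x: list of channels, each a flat list of activations.
    rho: mixing weight; 1.0 gives pure instance norm, 0.0 pure layer norm.
    """
    rho = min(1.0, max(0.0, rho))

    # Layer-norm statistics: one mean and variance over all channels.
    flat = [v for channel in x for v in channel]
    ln_mean = sum(flat) / len(flat)
    ln_var = sum((v - ln_mean) ** 2 for v in flat) / len(flat)

    out = []
    for channel in x:
        # Instance-norm statistics: one mean and variance per channel.
        in_mean = sum(channel) / len(channel)
        in_var = sum((v - in_mean) ** 2 for v in channel) / len(channel)
        out.append([
            gamma * (rho * (v - in_mean) / math.sqrt(in_var + eps)
                     + (1.0 - rho) * (v - ln_mean) / math.sqrt(ln_var + eps))
            + beta
            for v in channel
        ])
    return out
```

With `rho` near 1 each channel is normalized independently, and with `rho` near 0 the relative scale between channels is kept. A learned `rho` lets the model choose a mix of the two per dataset.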

Destruction from sky: Weakly supervised approach for destruction detection in satellite imagery

Title Destruction from sky: Weakly supervised approach for destruction detection in satellite imagery
Authors Muhammad Usman Ali, Waqas Sultani, Mohsen Ali
Abstract Natural and man-made disasters cause huge damage to built infrastructure and result in loss of human lives. Rehabilitation efforts and rescue operations are hampered by the non-availability of accurate and timely information regarding the location of damaged infrastructure and its extent. In this paper, we model the destruction in satellite imagery using a deep learning model employing a weakly-supervised approach. In stark contrast to previous approaches, instead of solving the problem as change detection (using pre- and post-event images), we identify destruction itself using a single post-event image. To overcome the challenge of collecting the pixel-level ground truth data mostly used during training, we only assume image-level labels, indicating whether or not destruction is present (at any location) in a given image. The proposed attention-based mechanism learns to identify the image patches with destruction automatically under the sparsity constraint. Furthermore, to reduce false positives and improve segmentation quality, a hard negative mining technique is proposed that results in considerable improvement over the baseline. To validate our approach, we have collected a new dataset containing destruction and non-destruction images from Indonesia, Yemen, Japan, and Pakistan. On the test dataset, we obtained excellent destruction detection results with pixel-level accuracy of 93% and patch-level accuracy of 91%. The source code and dataset will be made publicly available.
Tasks
Published 2020-04-01
URL http://im.itu.edu.pk/destruction-detection/
PDF http://im.itu.edu.pk/wp-content/uploads/2020/02/id_compressed.pdf
PWC https://paperswithcode.com/paper/destruction-from-sky-weakly-supervised
Repo https://github.com/usmanali414/Destruction-Detection-in-Satellite-Imagery
Framework tf