April 1, 2020


Paper Group NAWR 6

Encoder-Agnostic Adaptation for Conditional Language Generation


Title	Encoder-Agnostic Adaptation for Conditional Language Generation
Authors	Anonymous
Abstract	Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work, we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretrained transformer models are sensitive to large parameter changes during tuning. Therefore, we propose an adaptation that directly injects arbitrary conditioning into self attention, an approach we call pseudo self attention. Through experiments on four diverse conditional text generation tasks, we show that this encoder-agnostic technique outperforms strong baselines, produces coherent generations, and is data-efficient.
Tasks	Language Modelling, Text Generation
Published	2020-01-01
URL	https://openreview.net/forum?id=B1xq264YvH
PDF	https://openreview.net/pdf?id=B1xq264YvH
PWC	https://paperswithcode.com/paper/encoder-agnostic-adaptation-for-conditional-1
Repo	https://github.com/anon37234/encoder-agnostic-adaptation
Framework	pytorch
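
As a rough illustration of injecting conditioning into self attention, the sketch below stacks keys and values projected from the conditioning input in front of those of the language model's own tokens, so that every token attends over both. This is one plausible reading of the abstract, not code from the linked repository: the function names, the single-head layout, and the projection matrices (`u_k`, `u_v` for the conditioning; `w_q`, `w_k`, `w_v` for the pretrained model) are assumptions, and causal masking is omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_self_attention(cond, tokens, u_k, u_v, w_q, w_k, w_v):
    """Single-head attention whose keys and values also cover the conditioning.

    cond:     conditioning vectors from any encoder, shape (m, d_c)
    tokens:   language-model hidden states, shape (n, d)
    u_k, u_v: hypothetical new projections for the conditioning, shape (d_c, d)
    w_q, w_k, w_v: projections of the pretrained model, shape (d, d)
    """
    queries = matmul(tokens, w_q)
    # List concatenation stacks the conditioning rows above the token rows.
    keys = matmul(cond, u_k) + matmul(tokens, w_k)
    values = matmul(cond, u_v) + matmul(tokens, w_v)
    scale = math.sqrt(len(queries[0]))
    scores = matmul(queries, list(zip(*keys)))  # queries times keys transposed
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, values), weights
```

Under this reading, only `u_k` and `u_v` are newly initialized; the pretrained projections are reused unchanged, which fits the abstract's observation that pretrained transformers are sensitive to large parameter changes.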

Learned Step Size Quantization


Title	Learned Step Size Quantization
Authors	Anonymous
Abstract	Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Our approach builds upon existing methods for learning weights in quantized networks by improving how the quantizer itself is configured. Specifically, we introduce a novel means to estimate and scale the task loss gradient at each weight and activation layer’s quantizer step size, such that it can be learned in conjunction with other network parameters. This approach works using different levels of precision as needed for a given system and requires only a simple modification of existing training code.
Tasks	Quantization
Published	2020-01-01
URL	https://openreview.net/forum?id=rkgO66VKDS
PDF	https://openreview.net/pdf?id=rkgO66VKDS
PWC	https://paperswithcode.com/paper/learned-step-size-quantization-1
Repo	https://github.com/hustzxd/LSQuantization
Framework	pytorch
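
A minimal sketch of a uniform quantizer with a learnable step size may help make the idea concrete. It assumes a symmetric signed integer range and a straight-through estimator, in which `round()` is treated as the identity during the backward pass. The function names are hypothetical, and the gradient scaling mentioned in the abstract is not shown.

```python
def quant_range(bits, signed=True):
    """Return (q_n, q_p): the number of negative and positive integer levels."""
    if signed:
        return 2 ** (bits - 1), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

def quantize(v, step, bits, signed=True):
    """Scale by the step size, clip to the integer range, round, rescale."""
    q_n, q_p = quant_range(bits, signed)
    level = round(max(-q_n, min(q_p, v / step)))
    return level * step

def step_size_grad(v, step, bits, signed=True):
    """Derivative of quantize() with respect to `step` (assumed estimator).

    Inside the clipping range: d/ds [round(v/s) * s] = round(v/s) - v/s,
    treating round() as identity when differentiating. Outside the range
    the output is a clipped level times `step`, so the derivative is that level.
    """
    q_n, q_p = quant_range(bits, signed)
    ratio = v / step
    if ratio <= -q_n:
        return float(-q_n)
    if ratio >= q_p:
        return float(q_p)
    return round(ratio) - ratio
```

For example, with 2 bits and a step of 0.25, the representable values are -0.5, -0.25, 0.0 and 0.25, so `quantize(0.3, 0.25, 2)` clips to the top level and `quantize(-0.9, 0.25, 2)` clips to the bottom one.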

Deep Graph Translation


Title	Deep Graph Translation
Authors	Anonymous
Abstract	Deep graph generation models have achieved great success recently. Most of them, however, are unconditioned generative models that have no control over the target graphs when given an input graph. In this paper, we propose a novel Graph-Translation-Generative-Adversarial-Networks (GT-GAN) that transforms the input graphs into their target output graphs. GT-GAN consists of a graph translator equipped with innovative graph convolution and deconvolution layers to learn the translation mapping considering both global and local features, and a new conditional graph discriminator to classify target graphs by conditioning on input graphs. Extensive experiments on multiple synthetic and real-world datasets demonstrate that our proposed GT-GAN significantly outperforms other baseline methods in terms of both effectiveness and scalability. For instance, GT-GAN achieves at least 10X and 15X faster runtimes than GraphRNN and RandomVAE, respectively, when the size of the graph is around 50.
Tasks	Graph Generation
Published	2020-01-01
URL	https://openreview.net/forum?id=r1e0G04Kvr
PDF	https://openreview.net/pdf?id=r1e0G04Kvr
PWC	https://paperswithcode.com/paper/deep-graph-translation-1
Repo	https://github.com/anonymous1025/Deep-Graph-Translation-
Framework	pytorch

RPGAN: random paths as a latent space for GAN interpretability


Title	RPGAN: random paths as a latent space for GAN interpretability
Authors	Anonymous
Abstract	In this paper, we introduce Random Path Generative Adversarial Network (RPGAN), an alternative GAN design that can serve as a tool for generative model analysis. While the latent space of a typical GAN consists of input vectors, randomly sampled from the standard Gaussian distribution, the latent space of RPGAN consists of random paths in a generator network. As we show, this design makes it possible to associate different layers of the generator with different regions of the latent space, making their roles naturally interpretable. With experiments on standard benchmarks, we demonstrate that RPGAN reveals several interesting insights about the roles that different layers play in the image generation process. Aside from interpretability, the RPGAN model also provides competitive generation quality and allows efficient incremental learning on new data.
Tasks	Image Generation
Published	2020-01-01
URL	https://openreview.net/forum?id=BJgctpEKwr
PDF	https://openreview.net/pdf?id=BJgctpEKwr
PWC	https://paperswithcode.com/paper/rpgan-random-paths-as-a-latent-space-for-gan
Repo	https://github.com/rpgan-ICLR2020/RPGAN
Framework	pytorch

Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization


Title	Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization
Authors	Anonymous
Abstract	Transferring knowledge across tasks to improve data-efficiency is one of the open key challenges in the area of global optimization algorithms. Readily available algorithms are typically designed to be universal optimizers and, thus, often suboptimal for specific tasks. We propose a novel transfer learning method to obtain customized optimizers within the well-established framework of Bayesian optimization, allowing our algorithm to utilize the proven generalization capabilities of Gaussian processes. Using reinforcement learning to meta-train an acquisition function (AF) on a set of related tasks, the proposed method learns to extract implicit structural information and to exploit it for improved data-efficiency. We present experiments on a sim-to-real transfer task as well as on several simulated functions and two hyperparameter search problems. The results show that our algorithm (1) automatically identifies structural properties of objective functions from available source tasks or simulations, (2) performs favourably in settings with both scarce and abundant source data, and (3) falls back to the performance level of general AFs if no structure is present.
Tasks	Gaussian Processes, Meta-Learning, Transfer Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=ryeYpJSKwr
PDF	https://openreview.net/pdf?id=ryeYpJSKwr
PWC	https://paperswithcode.com/paper/meta-learning-acquisition-functions-for-1
Repo	https://github.com/metabo-iclr2020/MetaBO
Framework	none

Composition-based Multi-Relational Graph Convolutional Networks


Title	Composition-based Multi-Relational Graph Convolutional Networks
Authors	Anonymous
Abstract	Graph Convolutional Networks (GCNs) have recently been shown to be quite successful in modeling graph-structured data. However, the primary focus has been on handling simple undirected graphs. Multi-relational graphs are a more general and prevalent form of graphs where each edge has a label and direction associated with it. Most of the existing approaches to handle such graphs suffer from over-parameterization and are restricted to learning representations of nodes only. In this paper, we propose CompGCN, a novel Graph Convolutional framework which jointly embeds both nodes and relations in a relational graph. CompGCN leverages a variety of entity-relation composition operations from Knowledge Graph Embedding techniques and scales with the number of relations. It also generalizes several of the existing multi-relational GCN methods. We evaluate our proposed method on multiple tasks such as node classification, link prediction, and graph classification, and achieve demonstrably superior results. We make the source code of CompGCN available to foster reproducible research.
Tasks	Graph Classification, Graph Embedding, Knowledge Graph Embedding, Link Prediction, Node Classification
Published	2020-01-01
URL	https://openreview.net/forum?id=BylA_C4tPr
PDF	https://openreview.net/pdf?id=BylA_C4tPr
PWC	https://paperswithcode.com/paper/composition-based-multi-relational-graph-1
Repo	https://github.com/malllabiisc/CompGCN
Framework	pytorch
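
The abstract does not name the composition operations, so the sketch below is an assumption: it shows three operators that are common in knowledge graph embedding models (subtraction as in TransE, elementwise multiplication as in DistMult, and circular correlation as in HolE). Each combines an entity embedding with a relation embedding into a vector of the same size. The function names are hypothetical.

```python
def compose_sub(entity, relation):
    """Subtraction: translate the entity by the negated relation vector."""
    return [e - r for e, r in zip(entity, relation)]

def compose_mult(entity, relation):
    """Elementwise (Hadamard) product of entity and relation."""
    return [e * r for e, r in zip(entity, relation)]

def compose_corr(entity, relation):
    """Circular correlation: out[k] = sum_i entity[i] * relation[(i + k) % d]."""
    d = len(entity)
    return [
        sum(entity[i] * relation[(i + k) % d] for i in range(d))
        for k in range(d)
    ]
```

None of the three operators adds relation-specific weight matrices, so the only per-relation parameters are the relation embeddings themselves. That is one way the over-parameterization noted in the abstract could be avoided.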

Argus: Efficient Activity Detection System for Extended Video Analysis


Title	Argus: Efficient Activity Detection System for Extended Video Analysis
Authors	Wenhe Liu, Guoliang Kang, Po-Yao Huang, Xiaojun Chang, Yijun Qian, Junwei Liang, Liangke Gui, Jing Wen, Peng Chen
Abstract	We propose an Efficient Activity Detection System, Argus, for Extended Video Analysis in the surveillance scenario. For spatial-temporal event detection in surveillance video, we first generate video proposals by applying object detection and tracking algorithms that share the detection features. After that, we extract several different features and apply sequential activity classification with them. Finally, we eliminate inaccurate events and fuse all the predictions from different features. The proposed system won the TRECVID Activities in Extended Video (ActEV) challenge 2019. It achieves first place with 60.5 mean weighted Pmiss, outperforming the second-place system by 14.5 and the baseline R-C3D by 29.0. In the TRECVID 2019 Challenge, the proposed system wins first place with a pAUDC@0.2tfa of 0.48407.
Tasks	Action Detection, Activity Detection, Multi-Object Tracking, Object Detection, Video Object Detection, Video Object Tracking
Published	2020-03-02
URL	http://openaccess.thecvf.com/content_WACVW_2020/html/w5/Liu_Argus_Efficient_Activity_Detection_System_for_Extended_Video_Analysis_WACVW_2020_paper.html
PDF	http://openaccess.thecvf.com/content_WACVW_2020/papers/w5/Liu_Argus_Efficient_Activity_Detection_System_for_Extended_Video_Analysis_WACVW_2020_paper.pdf
PWC	https://paperswithcode.com/paper/argus-efficient-activity-detection-system-for
Repo	https://github.com/JunweiLiang/Object_Detection_Tracking
Framework	tf

Multi-scale Attributed Node Embedding


Title	Multi-scale Attributed Node Embedding
Authors	Anonymous
Abstract	We present network embedding algorithms that capture information about a node from the local distribution over node attributes around it, as observed over random walks following an approach similar to Skip-gram. Observations from neighborhoods of different sizes are either pooled (AE) or encoded distinctly in a multi-scale approach (MUSAE). Capturing attribute-neighborhood relationships over multiple scales is useful for a diverse range of applications, including latent feature identification across disconnected networks with similar attributes. We prove theoretically that matrices of node-feature pointwise mutual information are implicitly factorized by the embeddings. Experiments show that our algorithms are robust, computationally efficient and outperform comparable models on social, web and citation network datasets.
Tasks	Network Embedding
Published	2020-01-01
URL	https://openreview.net/forum?id=HJxiMAVtPH
PDF	https://openreview.net/pdf?id=HJxiMAVtPH
PWC	https://paperswithcode.com/paper/multi-scale-attributed-node-embedding-1
Repo	https://github.com/iclr2020/MUSAE
Framework	none

Deep symbolic regression


Title	Deep symbolic regression
Authors	Anonymous
Abstract	Discovering the underlying mathematical expressions describing a dataset is a core challenge for artificial intelligence. This is the problem of symbolic regression. Despite recent advances in training neural networks to solve complex tasks, deep learning approaches to symbolic regression are lacking. We propose a framework that combines deep learning with symbolic regression via a simple idea: use a large model to search the space of small models. More specifically, we use a recurrent neural network to emit a distribution over tractable mathematical expressions, and employ reinforcement learning to train the network to generate better-fitting expressions. Our algorithm significantly outperforms standard genetic programming-based symbolic regression in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. More broadly, our contributions include a framework that can be applied to optimize hierarchical, variable-length objects under a black-box performance metric, with the ability to incorporate a priori constraints in situ.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=S1gKA6NtPS
PDF	https://openreview.net/pdf?id=S1gKA6NtPS
PWC	https://paperswithcode.com/paper/deep-symbolic-regression
Repo	https://github.com/brendenpetersen/deep-symbolic-regression
Framework	none
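
To make "emit a distribution over expressions" concrete, the sketch below builds an expression one token at a time in prefix order, tracking how many operands are still missing, and then evaluates the result. It is a simplification under stated assumptions: a uniform random choice stands in for the recurrent network, no reinforcement learning is performed, and the token set and function names are hypothetical.

```python
import math
import random

# Number of operands each token requires; 0 marks a terminal.
ARITY = {"add": 2, "mul": 2, "sin": 1, "x": 0, "1": 0}

def sample_expression(rng, max_len=15):
    """Sample a complete prefix-order expression with at most max_len tokens."""
    tokens, open_slots = [], 1
    while open_slots > 0:
        # Allow operators only while the finished tree can still fit the budget.
        if len(tokens) + open_slots + 2 > max_len:
            choices = [t for t, arity in ARITY.items() if arity == 0]
        else:
            choices = list(ARITY)
        token = rng.choice(choices)
        tokens.append(token)
        open_slots += ARITY[token] - 1
    return tokens

def evaluate(tokens, x):
    """Evaluate a prefix-order token list at the input value x."""
    def walk(pos):
        token = tokens[pos]
        if token == "x":
            return x, pos + 1
        if token == "1":
            return 1.0, pos + 1
        left, pos = walk(pos + 1)
        if token == "sin":
            return math.sin(left), pos
        right, pos = walk(pos)
        return (left + right if token == "add" else left * right), pos
    value, _ = walk(0)
    return value
```

In a full system, the fit of each sampled expression to the data would serve as the reward used to update the sampling distribution.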

GraphSAINT: Graph Sampling Based Inductive Learning Method


Title	GraphSAINT: Graph Sampling Based Inductive Learning Method
Authors	Anonymous
Abstract	Graph Convolutional Networks (GCNs) are powerful models for learning representations of attributed graphs. To scale GCNs to large graphs, state-of-the-art methods use various layer sampling techniques to alleviate the “neighbor explosion” problem during minibatch training. We propose GraphSAINT, a graph sampling based inductive learning method that improves training efficiency and accuracy in a fundamentally different way. By changing perspective, GraphSAINT constructs minibatches by sampling the training graph, rather than the nodes or edges across GCN layers. In each iteration, a complete GCN is built from the properly sampled subgraph. Thus, we ensure a fixed number of well-connected nodes in all layers. We further propose a normalization technique to eliminate bias, and sampling algorithms for variance reduction. Importantly, we can decouple the sampling from the forward and backward propagation, and extend GraphSAINT with many architecture variants (e.g., graph attention, jumping connection). GraphSAINT demonstrates superior performance in both accuracy and training time on five large graphs, and achieves new state-of-the-art F1 scores for PPI (0.995) and Reddit (0.970).
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=BJe8pkHFwS
PDF	https://openreview.net/pdf?id=BJe8pkHFwS
PWC	https://paperswithcode.com/paper/graphsaint-graph-sampling-based-inductive-1
Repo	https://github.com/GraphSAINT/GraphSAINT
Framework	tf
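
The contrast with layer sampling can be sketched as follows: sample a set of nodes once, keep only the edges between them, and treat the induced subgraph as the minibatch on which a complete GCN would be run. The uniform node sampler and the function name are assumptions for illustration; the paper's samplers and bias-correcting normalization are not reproduced here.

```python
import random

def sample_subgraph(edges, num_nodes, sample_size, rng):
    """Return a random node set and the edges of the subgraph it induces.

    edges: list of (u, v) pairs with node ids in range(num_nodes)
    """
    nodes = set(rng.sample(range(num_nodes), sample_size))
    # Keep an edge only if both endpoints were sampled.
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, sub_edges
```

Because every layer operates on the same sampled node set, the number of nodes per layer stays fixed instead of growing with depth.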

Restricting the Flow: Information Bottlenecks for Attribution


Title	Restricting the Flow: Information Bottlenecks for Attribution
Authors	Anonymous
Abstract	Attribution methods provide insights into the decision-making of machine learning models like artificial neural networks. For a given input sample, they assign a relevance score to each individual input variable, such as the pixels of an image. In this work we adapt the information bottleneck concept for attribution. By adding noise to intermediate feature maps we restrict the flow of information and can quantify (in bits) how much information image regions provide. We compare our method against ten baselines using three different metrics on VGG-16 and ResNet-50, and find that our methods outperform all baselines in five out of six settings. The method’s information-theoretic foundation provides an absolute frame of reference for attribution values (bits) and a guarantee that regions scored close to zero are not required for the network’s decision.
Tasks	Decision Making
Published	2020-01-01
URL	https://openreview.net/forum?id=S1xWh1rYwB
PDF	https://openreview.net/pdf?id=S1xWh1rYwB
PWC	https://paperswithcode.com/paper/restricting-the-flow-information-bottlenecks
Repo	https://github.com/attribution-bottleneck/attribution-bottleneck-pytorch
Framework	pytorch
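
A small numeric sketch can show how added noise yields a cost in bits. It assumes a single scalar feature `r` that is mixed with Gaussian noise as `z = lam * r + (1 - lam) * eps`, where `eps` has mean `mu` and standard deviation `sigma`, and measures information as the KL divergence between the distribution of `z` and the pure-noise distribution. These specifics and the function name are assumptions; the abstract states only that noise is added to feature maps and that information is quantified in bits.

```python
import math

def bottleneck_bits(r, lam, mu, sigma):
    """KL divergence, in bits, between N(mean_z, std_z^2) and N(mu, sigma^2).

    lam = 0 replaces the feature with noise entirely (0 bits pass through);
    values of lam closer to 1 let more of the feature through.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1) so the noise variance stays positive")
    mean_z = lam * r + (1.0 - lam) * mu
    std_z = (1.0 - lam) * sigma
    kl_nats = (
        math.log(sigma / std_z)
        + (std_z ** 2 + (mean_z - mu) ** 2) / (2.0 * sigma ** 2)
        - 0.5
    )
    return kl_nats / math.log(2.0)
```

Under these assumptions, a region whose features can be fully replaced by noise without changing the prediction transmits zero bits, which matches the abstract's claim about regions scored close to zero.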

Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks


Title	Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks
Authors	Anonymous
Abstract	While tasks can come with varying numbers of instances and classes in realistic settings, existing meta-learning approaches for few-shot classification assume that the number of instances per task and class is fixed. Due to this restriction, they learn to utilize the meta-knowledge equally across all tasks, even when the number of instances per task and class varies widely. Moreover, they do not consider distributional differences in unseen tasks, for which the meta-knowledge may be less useful depending on task relatedness. To overcome these limitations, we propose a novel meta-learning model that adaptively balances the effect of meta-learning and task-specific learning within each task. By learning the balancing variables, we can decide whether to obtain a solution by relying on the meta-knowledge or on task-specific learning. We formulate this objective within a Bayesian inference framework and tackle it using variational inference. We validate our Bayesian Task-Adaptive Meta-Learning (Bayesian TAML) on two realistic task- and class-imbalanced datasets, on which it significantly outperforms existing meta-learning approaches. A further ablation study confirms the effectiveness of each balancing component and the Bayesian learning framework.
Tasks	Bayesian Inference, Meta-Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=rkeZIJBYvr
PDF	https://openreview.net/pdf?id=rkeZIJBYvr
PWC	https://paperswithcode.com/paper/learning-to-balance-bayesian-meta-learning-1
Repo	https://github.com/haebeom-lee/l2b
Framework	tf

Deep Bayesian Structure Networks


Title	Deep Bayesian Structure Networks
Authors	Anonymous
Abstract	Bayesian neural networks (BNNs) introduce uncertainty estimation to deep networks by performing Bayesian inference on network weights. However, such models make inference challenging, and BNNs with weight uncertainty rarely achieve superior performance to standard models. In this paper, we investigate a new line of Bayesian deep learning by performing Bayesian reasoning on the structure of deep neural networks. Drawing inspiration from neural architecture search, we define the network structure as random weights on the redundant operations between computational nodes, and apply stochastic variational inference techniques to learn the structure distributions of networks. Empirically, the proposed method substantially surpasses advanced deep neural networks across a range of classification and segmentation tasks. More importantly, our approach also preserves the benefits of Bayesian principles, producing better uncertainty estimates than strong baselines including MC dropout and variational BNN algorithms (e.g. noisy EK-FAC).
Tasks	Bayesian Inference, Neural Architecture Search
Published	2020-01-01
URL	https://openreview.net/forum?id=B1gXR3NtwS
PDF	https://openreview.net/pdf?id=B1gXR3NtwS
PWC	https://paperswithcode.com/paper/deep-bayesian-structure-networks
Repo	https://github.com/anonymousest/DBSN
Framework	tf

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation


Title	U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
Authors	Anonymous
Abstract	We propose a novel method for unsupervised image-to-image translation, which incorporates a new attention module and a new learnable normalization function in an end-to-end manner. The attention module guides our model to focus on more important regions distinguishing between source and target domains based on the attention map obtained by the auxiliary classifier. Unlike previous attention-based methods, which cannot handle geometric changes between domains, our model can translate both images requiring holistic changes and images requiring large shape changes. Moreover, our new AdaLIN (Adaptive Layer-Instance Normalization) function helps our attention-guided model to flexibly control the amount of change in shape and texture by learned parameters depending on datasets. Experimental results show the superiority of the proposed method compared to the existing state-of-the-art models with a fixed network architecture and hyper-parameters.
Tasks	Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published	2020-01-01
URL	https://openreview.net/forum?id=BJlZ5ySKPH
PDF	https://openreview.net/pdf?id=BJlZ5ySKPH
PWC	https://paperswithcode.com/paper/u-gat-it-unsupervised-generative-attentional-1
Repo	https://github.com/taki0112/UGATIT
Framework	tf
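
The name Adaptive Layer-Instance Normalization suggests a learned blend of instance normalization and layer normalization. The sketch below is written under that assumption: a mixing weight `rho` in [0, 1] interpolates between the two normalized activations, followed by a scale `gamma` and a shift `beta`. Treating `rho`, `gamma` and `beta` as plain scalars, and the function names themselves, are simplifications for illustration and are not taken from the linked repository.

```python
import math

def _mean_var(values):
    """Population mean and variance of a flat list."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def _normalize(values, mean, var, eps):
    """Standardize values with the given statistics."""
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def adalin(feature_map, rho, gamma, beta, eps=1e-5):
    """Blend instance-normalized and layer-normalized activations.

    feature_map: list of channels, each a flat list of spatial activations
    rho: 1.0 gives pure instance norm, 0.0 gives pure layer norm
    """
    flat = [v for channel in feature_map for v in channel]
    layer_mean, layer_var = _mean_var(flat)
    out = []
    for channel in feature_map:
        chan_mean, chan_var = _mean_var(channel)
        inst = _normalize(channel, chan_mean, chan_var, eps)     # per-channel statistics
        layer = _normalize(channel, layer_mean, layer_var, eps)  # statistics over all channels
        out.append([gamma * (rho * a + (1.0 - rho) * b) + beta
                    for a, b in zip(inst, layer)])
    return out
```

Instance normalization removes per-channel statistics while layer normalization keeps the relative scale between channels, so a learned `rho` gives the model a knob between the two behaviors.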

Destruction from sky: Weakly supervised approach for destruction detection in satellite imagery


Title	Destruction from sky: Weakly supervised approach for destruction detection in satellite imagery
Authors	Muhammad Usman Ali, Waqas Sultani, Mohsen Ali
Abstract	Natural and man-made disasters cause huge damage to built infrastructure and result in loss of human lives. Rehabilitation efforts and rescue operations are hampered by the lack of accurate and timely information regarding the location of damaged infrastructure and its extent. In this paper, we model the destruction in satellite imagery using a deep learning model employing a weakly-supervised approach. In stark contrast to previous approaches, instead of solving the problem as change detection (using pre- and post-event images), we identify destruction itself using a single post-event image. To overcome the challenge of collecting the pixel-level ground truth data typically used during training, we assume only image-level labels, indicating whether destruction is present (at any location) in a given image. The proposed attention-based mechanism learns to identify the image patches with destruction automatically under a sparsity constraint. Furthermore, to reduce false positives and improve segmentation quality, we propose a hard negative mining technique that results in considerable improvement over the baseline. To validate our approach, we have collected a new dataset containing destruction and non-destruction images from Indonesia, Yemen, Japan, and Pakistan. On the test dataset, we obtained excellent destruction detection results, with pixel-level accuracy of 93% and patch-level accuracy of 91%. The source code and dataset will be made publicly available.
Tasks
Published	2020-04-01
URL	http://im.itu.edu.pk/destruction-detection/
PDF	http://im.itu.edu.pk/wp-content/uploads/2020/02/id_compressed.pdf
PWC	https://paperswithcode.com/paper/destruction-from-sky-weakly-supervised
Repo	https://github.com/usmanali414/Destruction-Detection-in-Satellite-Imagery
Framework	tf