April 1, 2020

Paper Group NANR 127

Learning scalable and transferable multi-robot/machine sequential assignment planning via graph embedding

Title Learning scalable and transferable multi-robot/machine sequential assignment planning via graph embedding
Authors Anonymous
Abstract Can the success of reinforcement learning methods for simple combinatorial optimization problems be extended to multi-robot sequential assignment planning? In addition to the challenge of achieving near-optimal performance in large problems, transferability to an unseen number of robots and tasks is another key challenge for real-world applications. In this paper, we suggest a method that achieves the first success in both challenges for robot/machine scheduling problems. Our method comprises three components. First, we show that any robot scheduling problem can be expressed as a random probabilistic graphical model (PGM). We develop a mean-field inference method for random PGMs and use it for Q-function inference. Second, we show that transferability can be achieved by carefully designing a two-step sequential encoding of the problem state. Third, we resolve the computational scalability issue of fitted Q-iteration by suggesting a heuristic auction-based Q-iteration fitting method enabled by the transferability we achieved. We apply our method to discrete-time, discrete-space problems (Multi-Robot Reward Collection (MRRC)) and scalably achieve 97% optimality with transferability. This optimality is maintained under stochastic contexts. By extending our method to a continuous-time, continuous-space formulation, we claim to be the first learning-based method with scalable performance in any type of multi-machine scheduling problem; our method scalably achieves performance comparable to popular metaheuristics in Identical Parallel Machine Scheduling (IPMS) problems.
Tasks Combinatorial Optimization, Graph Embedding
Published 2020-01-01
URL https://openreview.net/forum?id=rJxRJeStvB
PDF https://openreview.net/pdf?id=rJxRJeStvB
PWC https://paperswithcode.com/paper/learning-scalable-and-transferable-multi
Repo
Framework

Learning Generative Models using Denoising Density Estimators

Title Learning Generative Models using Denoising Density Estimators
Authors Anonymous
Abstract Learning generative probabilistic models that can estimate the continuous density given a set of samples, and that can sample from that density is one of the fundamental challenges in unsupervised machine learning. In this paper we introduce a new approach to obtain such models based on what we call denoising density estimators (DDEs). A DDE is a scalar function, parameterized by a neural network, that is efficiently trained to represent a kernel density estimator of the data. In addition, we show how to leverage DDEs to develop a novel approach to obtain generative models that sample from given densities. We prove that our algorithms to obtain both DDEs and generative models are guaranteed to converge to the correct solutions. Advantages of our approach include that we do not require specific network architectures like in normalizing flows, ODE solvers as in continuous normalizing flows, nor do we require adversarial training as in generative adversarial networks (GANs). Finally, we provide experimental results that demonstrate practical applications of our technique.
Tasks Denoising
Published 2020-01-01
URL https://openreview.net/forum?id=Skl1HCNKDr
PDF https://openreview.net/pdf?id=Skl1HCNKDr
PWC https://paperswithcode.com/paper/learning-generative-models-using-denoising
Repo
Framework

Invertible generative models for inverse problems: mitigating representation error and dataset bias

Title Invertible generative models for inverse problems: mitigating representation error and dataset bias
Authors Anonymous
Abstract Trained generative models have shown remarkable performance as priors for inverse problems in imaging. For example, Generative Adversarial Network priors permit recovery of test images from 5-10x fewer measurements than sparsity priors. Unfortunately, these models may be unable to represent any particular image because of architectural choices, mode collapse, and bias in the training dataset. In this paper, we demonstrate that invertible neural networks, which have zero representation error by design, can be effective natural signal priors for inverse problems such as denoising, compressive sensing, and inpainting. Our formulation is an empirical risk minimization that does not directly optimize the likelihood of images, as one would expect. Instead, we optimize the likelihood of the latent representation of images as a proxy, as this is empirically easier. For compressive sensing, our formulation can yield higher accuracy than sparsity priors across almost all undersampling ratios. For the same accuracy on test images, invertible priors can use 10-20x fewer measurements. We demonstrate that invertible priors can yield better reconstructions than sparsity priors for images that have rare features of variation within the biased training set, including out-of-distribution natural images.
Tasks Compressive Sensing, Denoising
Published 2020-01-01
URL https://openreview.net/forum?id=BJgkbyHKDS
PDF https://openreview.net/pdf?id=BJgkbyHKDS
PWC https://paperswithcode.com/paper/invertible-generative-models-for-inverse-1
Repo
Framework
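
The abstract above says the method optimizes the likelihood of the latent representation rather than of the image itself. One way to write such an objective is sketched below. The symbols are assumptions for illustration and do not come from the abstract: $A$ is the measurement operator, $y$ the measurements, $G$ the invertible generator, $\gamma$ a penalty weight, and the latent prior is taken to be a standard Gaussian, so that $-\log p(z)$ equals $\tfrac{1}{2}\|z\|_2^2$ up to a constant. The paper's exact penalty may differ.

```latex
\hat{z} \;=\; \arg\min_{z} \; \| A\,G(z) - y \|_2^2 \;+\; \gamma\, \| z \|_2^2 ,
\qquad
\hat{x} \;=\; G(\hat{z}).
```

Because $G$ is invertible, every image $x$ has a latent code $z = G^{-1}(x)$, which is the sense in which the abstract describes zero representation error.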

XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings

Title XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings
Authors Anonymous
Abstract Current state-of-the-art results in multilingual natural language inference (NLI) are based on tuning XLM (a pre-trained polyglot language model) separately for each language involved, resulting in multiple models. We reach significantly higher NLI results with a single model for all languages via multilingual tuning. Furthermore, we introduce cross-lingual knowledge distillation (XD), where the same polyglot model is used both as teacher and student across languages to improve its sentence representations without using the end-task labels. When used alone, XD beats multilingual tuning for some languages, and the combination of the two results in a new state of the art of 79.2% on the XNLI dataset, surpassing the previous result by an absolute 2.5%. The models and code for reproducing our experiments will be made publicly available after de-anonymization.
Tasks Language Modelling, Natural Language Inference, Sentence Embeddings
Published 2020-01-01
URL https://openreview.net/forum?id=BkePneStwH
PDF https://openreview.net/pdf?id=BkePneStwH
PWC https://paperswithcode.com/paper/xd-cross-lingual-knowledge-distillation-for
Repo
Framework

A Bilingual Generative Transformer for Semantic Sentence Embedding

Title A Bilingual Generative Transformer for Semantic Sentence Embedding
Authors Anonymous
Abstract Semantic sentence embedding models take natural language sentences and turn them into vectors, such that similar vectors indicate similarity in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model’s posterior to predict sentence embeddings for monolingual data at test time. Second, we use high-capacity transformers as both data generating distributions and inference networks – contrasting with most past work on sentence embeddings. In experiments, our approach substantially outperforms the state-of-the-art on a standard suite of semantic similarity evaluations. Further, we demonstrate that our approach yields the largest gains on more difficult subsets of test where simple word overlap is not a good indicator of similarity.
Tasks Sentence Embedding, Sentence Embeddings
Published 2020-01-01
URL https://openreview.net/forum?id=SkgS2lBFPS
PDF https://openreview.net/pdf?id=SkgS2lBFPS
PWC https://paperswithcode.com/paper/a-bilingual-generative-transformer-for
Repo
Framework

Learning Entailment-Based Sentence Embeddings from Natural Language Inference

Title Learning Entailment-Based Sentence Embeddings from Natural Language Inference
Authors Anonymous
Abstract Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences. But in many such models the embeddings computed by the sentence encoder go through an MLP-based interaction layer before the label is predicted, and thus some of the information about textual entailment is encoded in the interpretation of sentence embeddings given by this parameterised MLP. In this work we propose a simple interaction layer based on predefined entailment and contradiction scores applied directly to the sentence embeddings. This parameter-free interaction model achieves results on natural language inference competitive with MLP-based models, demonstrating that the trained sentence embeddings directly represent the information needed for textual entailment, and the inductive bias of this model leads to better generalisation to other related datasets.
Tasks Natural Language Inference, Sentence Embeddings
Published 2020-01-01
URL https://openreview.net/forum?id=BkxackSKvH
PDF https://openreview.net/pdf?id=BkxackSKvH
PWC https://paperswithcode.com/paper/learning-entailment-based-sentence-embeddings
Repo
Framework

Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation

Title Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
Authors Anonymous
Abstract We focus on temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, the temporal relationships in the generated data are much less explored. This is crucial for sequential generation tasks, e.g. video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as L2 over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving the learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve the long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without depressing detailed features. We also propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirms the rankings computed with these metrics.
Tasks Super-Resolution, Video Generation, Video Super-Resolution
Published 2020-01-01
URL https://openreview.net/forum?id=r1ltgp4FwS
PDF https://openreview.net/pdf?id=r1ltgp4FwS
PWC https://paperswithcode.com/paper/learning-temporal-coherence-via-self
Repo
Framework
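
The abstract above names a Ping-Pong loss but does not define it. The sketch below assumes one plausible form: run a recurrent generator over the frames forward and then backward, and penalize the difference between the two outputs produced for the same frame. The function `ping_pong_loss`, the `generator(frame, state)` interface, and the toy generators are hypothetical and only illustrate the idea.

```python
import numpy as np

def ping_pong_loss(frames, generator, init_state):
    """Bidirectional consistency penalty (assumed form, not taken from the abstract).

    frames:    array of shape (n, ...) holding the input sequence.
    generator: hypothetical callable (frame, state) -> (output, new_state).
    """
    n = len(frames)
    # Visit frames 0..n-1 and then n-2..0, carrying the recurrent state throughout.
    order = list(range(n)) + list(range(n - 2, -1, -1))
    state = init_state
    outputs = []
    for idx in order:
        out, state = generator(frames[idx], state)
        outputs.append(out)
    # Compare the forward and backward outputs generated for the same input frame.
    loss = 0.0
    for t in range(n - 1):
        forward = outputs[t]
        backward = outputs[len(order) - 1 - t]
        loss += float(np.linalg.norm(forward - backward))
    return loss

def stateless(frame, state):
    # Ignores the recurrent state, so both passes agree exactly.
    return 2.0 * frame, state

def leaky(frame, state):
    # Feeds half of the previous output back in, so errors can accumulate.
    out = frame + 0.5 * state
    return out, out

frames = np.arange(4, dtype=float).reshape(4, 1)
loss_stateless = ping_pong_loss(frames, stateless, np.zeros(1))
loss_leaky = ping_pong_loss(frames, leaky, np.zeros(1))
```

A generator whose output depends only on the current frame gets zero penalty, while one that lets its recurrent state drift is penalized, which matches the stated goal of preventing artifacts from accumulating over time.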

You Only Train Once: Loss-Conditional Training of Deep Networks

Title You Only Train Once: Loss-Conditional Training of Deep Networks
Authors Anonymous
Abstract In many machine learning problems, loss functions are weighted sums of several terms. A typical approach to dealing with these is to train multiple separate models with different selections of weights and then either choose the best one according to some criterion or keep multiple models if it is desirable to maintain a diverse set of solutions. This is inefficient both at training and at inference time. We propose a method that allows replacing multiple models trained on one loss function each by a single model trained on a distribution of losses. At test time a model trained this way can be conditioned to generate outputs corresponding to any loss from the training distribution of losses. We demonstrate this approach on three tasks with parametrized losses: beta-VAE, learned image compression, and fast style transfer.
Tasks Image Compression, Style Transfer
Published 2020-01-01
URL https://openreview.net/forum?id=HyxY6JHKwr
PDF https://openreview.net/pdf?id=HyxY6JHKwr
PWC https://paperswithcode.com/paper/you-only-train-once-loss-conditional-training
Repo
Framework
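
The loss-conditional training idea in the abstract above can be illustrated on a toy problem. This is a minimal sketch, not the paper's architecture: the two-term loss, the log-uniform sampling range, the linear model, and the feature choice in `features` are all assumptions made for illustration. At every step a batch of loss weights is sampled, and a single model that takes the weight as input is updated on the correspondingly weighted loss.

```python
import numpy as np

def features(lam):
    """Conditioning features of the loss weight (hypothetical choice)."""
    lam = np.asarray(lam, dtype=float)
    return np.stack([np.ones_like(lam), np.log(lam)], axis=-1)

def predict(theta, lam):
    """Output of the single conditional model for loss weight(s) lam."""
    return features(lam) @ theta

def loss(w, lam):
    # Toy two-term loss: a fit term plus lam times a penalty term.
    return (w - 1.0) ** 2 + lam * w ** 2

def train(steps=2000, batch=64, lr=0.005, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(steps):
        # Sample loss weights from a log-uniform distribution on [0.1, 10].
        lam = np.exp(rng.uniform(np.log(0.1), np.log(10.0), size=batch))
        phi = features(lam)
        w = phi @ theta
        # Gradient of the weighted loss with respect to the model output.
        dloss_dw = 2.0 * (w - 1.0) + 2.0 * lam * w
        theta -= lr * (phi * dloss_dw[:, None]).mean(axis=0)
    return theta

theta = train()
grid = np.array([0.1, 1.0, 10.0])
w_grid = predict(theta, grid)        # one model, three different trade-offs
mean_loss = float(np.mean(loss(w_grid, grid)))
```

At test time the same parameters `theta` are queried with different weights: a small weight yields an output close to the fit target, while a large weight shrinks the output toward zero, without training a separate model per weight.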

DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression

Title DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression
Authors Anonymous
Abstract We propose a new architecture for distributed image compression from a group of distributed data sources. The work is motivated by practical needs of data-driven codec design, low power consumption, robustness, and data privacy. The proposed architecture, which we refer to as Distributed Recurrent Autoencoder for Scalable Image Compression (DRASIC), is able to train distributed encoders and one joint decoder on correlated data sources. Its compression capability is much better than the method of training codecs separately. Meanwhile, for 10 distributed sources, our distributed system remarkably performs within 2 dB peak signal-to-noise ratio (PSNR) of a single codec trained with all data sources. We experiment with distributed sources with different correlations and show how well our methodology matches the Slepian-Wolf Theorem in Distributed Source Coding (DSC). Our method is also shown to be robust to the absence of encoded data from a number of distributed sources. Moreover, it is scalable in the sense that codes can be decoded simultaneously at more than one compression quality level. To the best of our knowledge, this is the first data-driven DSC framework for general distributed code design with deep learning.
Tasks Image Compression
Published 2020-01-01
URL https://openreview.net/forum?id=SyxBxCNFwr
PDF https://openreview.net/pdf?id=SyxBxCNFwr
PWC https://paperswithcode.com/paper/drasic-distributed-recurrent-autoencoder-for
Repo
Framework
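
The abstract above reports results in terms of peak signal-to-noise ratio (PSNR). For reference, PSNR can be computed with its standard definition, 10 log10(MAX^2 / MSE). The function name `psnr` and the 8-bit peak value of 255 are assumptions here, since the abstract does not state the pixel range used.

```python
import numpy as np

def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0.0:
        # Identical images: the ratio is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))

original = np.full((8, 8), 100.0)
decoded = original + 1.0          # every pixel off by one, so MSE = 1
score = psnr(original, decoded)   # equals 20 * log10(255) for this input
```

Because the scale is logarithmic, a 2 dB gap corresponds to a ratio of about 1.58 in mean squared error.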

Deep Audio Prior

Title Deep Audio Prior
Authors Anonymous
Abstract Deep convolutional neural networks are known to specialize in distilling a compact and robust prior from a large amount of data. We are interested in applying deep networks in the absence of a training dataset. In this paper, we introduce deep audio prior (DAP), which leverages the structure of a network and the temporal information in a single audio file. Specifically, we demonstrate that a randomly-initialized neural network can be used with a carefully designed audio prior to tackle challenging audio problems such as universal blind source separation, interactive audio editing, audio texture synthesis, and audio co-separation. To understand the robustness of the deep audio prior, we construct a benchmark dataset, Universal-150, for universal sound source separation with a diverse set of sources. We show audio results superior to previous work in both qualitative and quantitative evaluations. We also perform a thorough ablation study to validate our design choices.
Tasks Texture Synthesis
Published 2020-01-01
URL https://openreview.net/forum?id=B1l1qnEFwH
PDF https://openreview.net/pdf?id=B1l1qnEFwH
PWC https://paperswithcode.com/paper/deep-audio-prior
Repo
Framework

Is There Mode Collapse? A Case Study on Face Generation and Its Black-box Calibration

Title Is There Mode Collapse? A Case Study on Face Generation and Its Black-box Calibration
Authors Anonymous
Abstract Generative adversarial networks (GANs) nowadays are capable of producing images of incredible realism. One concern raised is whether the state-of-the-art GAN’s learned distribution still suffers from mode collapse. Existing evaluation metrics for image synthesis focus on low-level perceptual quality. Diversity tests of samples from GANs are usually conducted qualitatively on a small scale. In this work, we devise a set of statistical tools that are broadly applicable to quantitatively measuring the mode collapse of GANs. Strikingly, we consistently observe strong mode collapse on several state-of-the-art GANs using our toolset. We analyze possible causes, and for the first time present two simple yet effective “black-box” methods to calibrate the GAN learned distribution, without accessing either model parameters or the original training data.
Tasks Calibration, Face Generation, Image Generation
Published 2020-01-01
URL https://openreview.net/forum?id=ryxUMREYPr
PDF https://openreview.net/pdf?id=ryxUMREYPr
PWC https://paperswithcode.com/paper/is-there-mode-collapse-a-case-study-on-face
Repo
Framework

A bi-diffusion based layer-wise sampling method for deep learning in large graphs

Title A bi-diffusion based layer-wise sampling method for deep learning in large graphs
Authors Anonymous
Abstract The Graph Convolutional Network (GCN) and its variants are powerful models for graph representation learning and have recently achieved great success on many graph-based applications. However, most of them target shallow models (e.g. 2 layers) on relatively small graphs. Although many acceleration methods have recently been developed for GCN training, scaling GCN-like models to larger graphs and deeper layers remains a severe challenge due to the over-expansion of neighborhoods across layers. In this paper, to address this challenge, we propose a novel layer-wise sampling strategy, which samples the nodes layer by layer conditionally based on the factors of the bi-directional diffusion between layers. In this way, we potentially restrict the time complexity to be linear in the number of layers, and construct a mini-batch of nodes with high local bi-directional influence (correlation). Further, we apply the self-attention mechanism to flexibly learn suitable weights for the sampled nodes, which allows the model to incorporate both the first-order and higher-order proximities during a single layer propagation process without extra recursive propagation or skip connection. Extensive experiments on three large benchmark graphs demonstrate the effectiveness and efficiency of the proposed model.
Tasks Graph Representation Learning, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=B1xRGkHYDS
PDF https://openreview.net/pdf?id=B1xRGkHYDS
PWC https://paperswithcode.com/paper/a-bi-diffusion-based-layer-wise-sampling
Repo
Framework

Probabilistic View of Multi-agent Reinforcement Learning: A Unified Approach

Title Probabilistic View of Multi-agent Reinforcement Learning: A Unified Approach
Authors Anonymous
Abstract Formulating the reinforcement learning (RL) problem in the framework of probabilistic inference not only offers a new perspective about RL, but also yields practical algorithms that are more robust and easier to train. While this connection between RL and probabilistic inference has been extensively studied in the single-agent setting, it has not yet been fully understood in the multi-agent setup. In this paper, we pose the problem of multi-agent reinforcement learning as the problem of performing inference in a particular graphical model. We model the environment, as seen by each of the agents, using separate but related Markov decision processes. We derive a practical off-policy maximum-entropy actor-critic algorithm that we call Multi-agent Soft Actor-Critic (MA-SAC) for performing approximate inference in the proposed model using variational inference. MA-SAC can be employed in both cooperative and competitive settings. Through experiments, we demonstrate that MA-SAC outperforms a strong baseline on several multi-agent scenarios. While MA-SAC is one resultant multi-agent RL algorithm that can be derived from the proposed probabilistic framework, our work provides a unified view of maximum-entropy algorithms in the multi-agent setting.
Tasks Multi-agent Reinforcement Learning
Published 2020-01-01
URL https://openreview.net/forum?id=S1ef6JBtPr
PDF https://openreview.net/pdf?id=S1ef6JBtPr
PWC https://paperswithcode.com/paper/probabilistic-view-of-multi-agent
Repo
Framework

Integrative Tensor-based Anomaly Detection System For Satellites

Title Integrative Tensor-based Anomaly Detection System For Satellites
Authors Youjin Shin, Sangyup Lee, Shahroz Tariq, Myeong Shin Lee, Okchul Jung, Daewon Chung, Simon Woo
Abstract Detecting anomalies is of growing importance for various industrial applications and mission-critical infrastructures, including satellite systems. Although there have been several studies in detecting anomalies based on rule-based or machine learning-based approaches for satellite systems, a tensor-based decomposition method has not been extensively explored for anomaly detection. In this work, we introduce an Integrative Tensor-based Anomaly Detection (ITAD) framework to detect anomalies in a satellite system. Because of the high risk and cost, detecting anomalies in a satellite system is crucial. We construct 3rd-order tensors with telemetry data collected from Korea Multi-Purpose Satellite-2 (KOMPSAT-2) and calculate the anomaly score using one of the component matrices obtained by applying CANDECOMP/PARAFAC decomposition to detect anomalies. Our result shows that our tensor-based approach can be effective in achieving higher accuracy and reducing false positives in detecting anomalies as compared to other existing approaches.
Tasks Anomaly Detection
Published 2020-01-01
URL https://openreview.net/forum?id=HJeg46EKPr
PDF https://openreview.net/pdf?id=HJeg46EKPr
PWC https://paperswithcode.com/paper/integrative-tensor-based-anomaly-detection
Repo
Framework
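
The pipeline described in the abstract above (build a 3rd-order tensor, apply CANDECOMP/PARAFAC decomposition, score anomalies from one component matrix) can be sketched as follows. This is an illustration under stated assumptions, not the ITAD implementation: the alternating-least-squares routine `cp_als`, the use of the time-mode factor, the median-distance score in `anomaly_scores`, and the synthetic data are all hypothetical.

```python
import numpy as np

def cp_als(X, rank, n_iter=50, seed=0):
    """Fit a rank-`rank` CP model to a 3rd-order tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        # Each update is the least-squares solution with the other two factors fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def anomaly_scores(time_factors):
    """Distance of each time step's factor row from the median row (assumed score)."""
    center = np.median(time_factors, axis=0)
    return np.linalg.norm(time_factors - center, axis=1)

# Synthetic stand-in for telemetry: time x sensor x feature, with one abnormal time step.
rng = np.random.default_rng(1)
a = np.ones(30)
a[17] = 5.0                                   # planted anomaly at time index 17
b = np.linspace(1.0, 2.0, 5)
c = np.linspace(0.5, 1.5, 4)
X = np.einsum('i,j,k->ijk', a, b, c) + 0.01 * rng.standard_normal((30, 5, 4))

A, B, C = cp_als(X, rank=1)
scores = anomaly_scores(A)
flagged = int(np.argmax(scores))
```

Scoring a single factor matrix reduces each time step to a row with `rank` entries, so the anomaly test runs on a far smaller object than the raw tensor.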

Iterative Deep Graph Learning for Graph Neural Networks

Title Iterative Deep Graph Learning for Graph Neural Networks
Authors Anonymous
Abstract In this paper, we propose an end-to-end graph learning framework, namely Iterative Deep Graph Learning (IDGL), for jointly learning graph structure and graph embedding. We first cast the graph structure learning problem as a similarity metric learning problem and leverage an adapted graph regularization for controlling smoothness, connectivity and sparsity of the generated graph. We further propose a novel iterative method for searching for a hidden graph structure that augments the initial graph structure. Our iterative method dynamically stops when the learned graph structure is close enough to the ground truth graph. Our extensive experiments demonstrate that the proposed IDGL model can consistently outperform or match state-of-the-art baselines in terms of both classification accuracy and computational time. The proposed approach can cope with both transductive training and inductive training.
Tasks Graph Embedding, Metric Learning
Published 2020-01-01
URL https://openreview.net/forum?id=Bkl2UlrFwr
PDF https://openreview.net/pdf?id=Bkl2UlrFwr
PWC https://paperswithcode.com/paper/iterative-deep-graph-learning-for-graph
Repo
Framework
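
The abstract above casts graph structure learning as similarity metric learning. A minimal sketch of that idea is to score node pairs with a weighted cosine similarity and keep only sufficiently similar pairs as edges. The function `similarity_graph`, the per-dimension `weight` vector (which would be learned in practice), and the threshold `eps` are assumptions for illustration; the abstract does not specify the metric or the sparsification rule.

```python
import numpy as np

def similarity_graph(X, weight, eps=0.5):
    """Build a sparse weighted adjacency matrix from node features.

    X:      (n_nodes, n_features) node feature matrix.
    weight: (n_features,) reweighting of feature dimensions (hypothetical, learnable).
    eps:    similarities below this threshold are dropped to keep the graph sparse.
    """
    Z = X * weight
    norms = np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    Z = Z / norms
    S = Z @ Z.T                      # pairwise cosine similarity
    return np.where(S >= eps, S, 0.0)

X = np.array([[1.0, 0.0],
              [1.0, 0.1],
              [0.0, 1.0]])
adjacency = similarity_graph(X, weight=np.ones(2))
```

In an iterative scheme of the kind the abstract describes, the node embeddings produced by a graph neural network on this adjacency would be fed back in as `X` to refine the graph until it stops changing appreciably.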