Paper Group AWR 49
Reinforcement Learning Framework for Deep Brain Stimulation Study. Few-shot Natural Language Generation for Task-Oriented Dialog. Bringing Stories Alive: Generating Interactive Fiction Worlds. Graph Constrained Reinforcement Learning for Natural Language Action Spaces. Deterministic Decoding for Discrete Data in Variational Autoencoders. Multi-class Gaussian Process Classification with Noisy Inputs. Community Detection in Bipartite Networks with Stochastic Blockmodels. A unified framework for 21cm tomography sample generation and parameter inference with Progressively Growing GANs. DDSP: Differentiable Digital Signal Processing. Untangling in Invariant Speech Recognition. TEASER: Fast and Certifiable Point Cloud Registration. Self-supervised Image Enhancement Network: Training with Low Light Images Only. Can We Use Split Learning on 1D CNN Models for Privacy Preserving Training? Output Diversified Initialization for Adversarial Attacks. Federated Learning with Matched Averaging.
Reinforcement Learning Framework for Deep Brain Stimulation Study
Title | Reinforcement Learning Framework for Deep Brain Stimulation Study |
Authors | Dmitrii Krylov, Remi Tachet, Romain Laroche, Michael Rosenblum, Dmitry V. Dylov |
Abstract | Malfunctioning neurons in the brain sometimes operate synchronously, reportedly causing many neurological diseases, e.g. Parkinson’s. Suppression and control of this collective synchronous activity are therefore of great importance for neuroscience, and can only rely on limited engineering trials due to the need to experiment with live human brains. We present the first Reinforcement Learning gym framework that emulates this collective behavior of neurons and allows us to find suppression parameters for the environment of synthetic degenerate models of neurons. We successfully suppress synchrony via RL for three pathological signaling regimes, characterize the framework’s stability to noise, and further remove the unwanted oscillations by engaging multiple PPO agents. |
Tasks | |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.10948v1 |
PDF | https://arxiv.org/pdf/2002.10948v1.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-framework-for-deep |
Repo | https://github.com/cviaai/RL-DBS |
Framework | none |
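
To make the setup concrete, here is a minimal gym-style environment in the same spirit: a Kuramoto ensemble of coupled oscillators whose mean field the agent observes and perturbs to suppress synchrony. This is an illustrative sketch, not the RL-DBS API; the class name, dynamics, and reward are assumptions.

```python
# Illustrative sketch only -- not the RL-DBS API. A Kuramoto ensemble whose
# mean field the agent observes; the action is a scalar stimulus, and the
# reward penalizes synchrony (the order-parameter magnitude).
import numpy as np
import gym
from gym import spaces

class SynchronySuppressionEnv(gym.Env):  # hypothetical name
    def __init__(self, n=100, coupling=1.0, dt=0.05):
        self.n, self.k, self.dt = n, coupling, dt
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)
        self.reset()

    def reset(self):
        self.phase = np.random.uniform(0, 2 * np.pi, self.n)
        self.omega = np.random.normal(1.0, 0.1, self.n)  # natural frequencies
        return self._obs()

    def _obs(self):
        z = np.exp(1j * self.phase).mean()  # Kuramoto order parameter
        return np.array([z.real, z.imag], dtype=np.float32)

    def step(self, action):
        a = float(np.squeeze(action))
        z = np.exp(1j * self.phase).mean()
        r, psi = np.abs(z), np.angle(z)
        # Mean-field Kuramoto update plus the external stimulus.
        dphi = self.omega + self.k * r * np.sin(psi - self.phase) + a * np.cos(self.phase)
        self.phase = (self.phase + self.dt * dphi) % (2 * np.pi)
        reward = -float(np.abs(np.exp(1j * self.phase).mean()))  # low synchrony is good
        return self._obs(), reward, False, {}
```

A PPO agent (e.g., from a library such as stable-baselines) can then be trained on such an environment exactly as on any other gym task.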
Few-shot Natural Language Generation for Task-Oriented Dialog
Title | Few-shot Natural Language Generation for Task-Oriented Dialog |
Authors | Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao |
Abstract | As a crucial component in task-oriented dialog systems, the Natural Language Generation (NLG) module converts a dialog act represented in a semantic form into a response in natural language. The success of traditional template-based or statistical models typically relies on heavily annotated data, which is infeasible for new domains. Therefore, it is pivotal for an NLG system to generalize well with limited labelled data in real applications. To this end, we present FewShotWoz, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems. Further, we develop the SC-GPT model. It is pre-trained on a large annotated NLG corpus to acquire controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains. Experiments on FewShotWoz and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods, measured by various automatic metrics and human evaluations. |
Tasks | Few-Shot Learning, Text Generation |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12328v1 |
PDF | https://arxiv.org/pdf/2002.12328v1.pdf |
PWC | https://paperswithcode.com/paper/few-shot-natural-language-generation-for-task |
Repo | https://github.com/pengbaolin/SC-GPT |
Framework | none |
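
The recipe, pre-train a GPT-2-style LM on dialog-act-conditioned responses and then fine-tune on a handful of in-domain pairs, can be sketched with Hugging Face transformers. The training pairs and the separator format below are placeholders, not SC-GPT's actual corpus or preprocessing.

```python
# Hedged sketch of the fine-tuning step: teach a GPT-2 LM to continue a
# semantic dialog act with its surface realization. Data format is assumed.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# A few domain-specific (dialog act, response) pairs -- invented examples.
pairs = [
    ("inform(name=seven_days; food=chinese)", "Seven Days serves Chinese food."),
    ("request(area)", "Which area of town are you looking in?"),
]

model.train()
for act, response in pairs:
    text = act + " & " + response + tokenizer.eos_token  # separator is a guess
    ids = tokenizer(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # standard causal-LM objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```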
Bringing Stories Alive: Generating Interactive Fiction Worlds
Title | Bringing Stories Alive: Generating Interactive Fiction Worlds |
Authors | Prithviraj Ammanabrolu, Wesley Cheung, Dan Tu, William Broniec, Mark O. Riedl |
Abstract | World building forms the foundation of any task that requires narrative intelligence. In this work, we focus on procedurally generating interactive fiction worlds—text-based worlds that players “see” and “talk to” using natural language. Generating these worlds requires referencing everyday and thematic commonsense priors in addition to being semantically consistent, interesting, and coherent throughout. Using existing story plots as inspiration, we present a method that first extracts a partial knowledge graph encoding basic information regarding world structure such as locations and objects. This knowledge graph is then automatically completed utilizing thematic knowledge and used to guide a neural language generation model that fleshes out the rest of the world. We perform human participant-based evaluations, testing our neural model’s ability to extract and fill in a knowledge graph and to generate language conditioned on it against rule-based and human-made baselines. Our code is available at https://github.com/rajammanabrolu/WorldGeneration. |
Tasks | Text Generation |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10161v1 |
PDF | https://arxiv.org/pdf/2001.10161v1.pdf |
PWC | https://paperswithcode.com/paper/bringing-stories-alive-generating-interactive |
Repo | https://github.com/rajammanabrolu/WorldGeneration |
Framework | none |
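
The pipeline's first stage — a partial knowledge graph of locations and objects, later completed with thematic knowledge — can be pictured with a toy networkx graph. Entities, relations, and the "thematic prior" below are invented for illustration; the paper completes the graph with learned models.

```python
# Toy picture of the two-stage pipeline: a partial KG extracted from a plot,
# then completed from thematic knowledge. All content here is made up.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("castle", "courtyard", relation="connects_to")    # world structure
kg.add_edge("courtyard", "rusty sword", relation="contains")  # object placement

# Stand-in for learned thematic completion: suggest objects for sparse rooms.
thematic_prior = {"castle": ["throne", "banner"], "courtyard": ["fountain"]}
for loc, objs in thematic_prior.items():
    for obj in objs:
        kg.add_edge(loc, obj, relation="contains")

# The completed graph then conditions a neural LM that writes descriptions.
print(sorted(kg.successors("castle")))
```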
Graph Constrained Reinforcement Learning for Natural Language Action Spaces
Title | Graph Constrained Reinforcement Learning for Natural Language Action Spaces |
Authors | Prithviraj Ammanabrolu, Matthew Hausknecht |
Abstract | Interactive Fiction games are text-based simulations in which an agent interacts with the world purely through natural language. They are ideal environments for studying how to extend reinforcement learning agents to meet the challenges of natural language understanding, partial observability, and action generation in combinatorially-large text-based action spaces. We present KG-A2C, an agent that builds a dynamic knowledge graph while exploring and generates actions using a template-based action space. We contend that the dual uses of the knowledge graph to reason about game state and to constrain natural language generation are the keys to scalable exploration of combinatorially large natural language actions. Results across a wide variety of IF games show that KG-A2C outperforms current IF agents despite the exponential increase in action space size. |
Tasks | Text Generation |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08837v1 |
PDF | https://arxiv.org/pdf/2001.08837v1.pdf |
PWC | https://paperswithcode.com/paper/graph-constrained-reinforcement-learning-for-1 |
Repo | https://github.com/rajammanabrolu/KG-A2C |
Framework | pytorch |
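
The key mechanism — filling verb templates only with entities currently in the knowledge graph — is easy to sketch. Templates and entity sets below are invented; KG-A2C learns to pick templates and fillers with its actor-critic heads rather than enumerating them.

```python
# Sketch of KG-constrained template grounding: the action space shrinks from
# all vocabulary combinations to entities the agent has actually seen.
import itertools
import re

templates = ["take {obj}", "put {obj} in {container}", "go {dir}"]  # assumed
entities = {"obj": ["lamp", "sword"], "container": ["chest"], "dir": ["north"]}

def grounded_actions(template):
    slots = re.findall(r"\{(\w+)\}", template)
    for combo in itertools.product(*(entities[s] for s in slots)):
        yield template.format(**dict(zip(slots, combo)))

actions = [a for t in templates for a in grounded_actions(t)]
# ['take lamp', 'take sword', 'put lamp in chest', 'put sword in chest', 'go north']
```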
Deterministic Decoding for Discrete Data in Variational Autoencoders
Title | Deterministic Decoding for Discrete Data in Variational Autoencoders |
Authors | Daniil Polykovskiy, Dmitry Vetrov |
Abstract | Variational autoencoders are prominent generative models for modeling discrete data. However, with flexible decoders, they tend to ignore the latent codes. In this paper, we study a VAE model with a deterministic decoder (DD-VAE) for sequential data that selects the highest-scoring tokens instead of sampling. Deterministic decoding relies on latent codes as the only source of diversity in generated objects, which improves the structure of the learned manifold. To implement DD-VAE, we propose a new class of bounded-support proposal distributions and derive the Kullback-Leibler divergence for Gaussian and uniform priors. We also study a continuous relaxation of the deterministic decoding objective and analyze the relation between reconstruction accuracy and relaxation parameters. We demonstrate the performance of DD-VAE on multiple datasets, including molecular generation and optimization problems. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02174v1 |
PDF | https://arxiv.org/pdf/2003.02174v1.pdf |
PWC | https://paperswithcode.com/paper/deterministic-decoding-for-discrete-data-in |
Repo | https://github.com/insilicomedicine/DD-VAE |
Framework | pytorch |
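
The central change — decode by taking the highest-scoring token rather than sampling — is a one-line difference at the decoder output. A schematic contrast, with the decoder network abstracted away:

```python
# Schematic contrast between standard stochastic decoding and DD-VAE's
# deterministic (argmax) decoding; `logits` stands in for decoder outputs.
import torch

def decode(logits, deterministic=True):
    # logits: (seq_len, vocab_size) unnormalized scores from the decoder
    if deterministic:
        return logits.argmax(dim=-1)  # DD-VAE: always the top-scoring token
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)  # standard VAE
```

With sampling removed, all diversity must come from the latent code, which is what pushes the model to structure its latent manifold.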
Multi-class Gaussian Process Classification with Noisy Inputs
Title | Multi-class Gaussian Process Classification with Noisy Inputs |
Authors | Carlos Villacampa-Calvo, Bryan Zaldivar, Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato |
Abstract | It is a common practice in the supervised machine learning community to assume that the observed data are noise-free in the input attributes. Nevertheless, scenarios with input noise are common in real problems, as measurements are never perfectly accurate. If this input noise is not taken into account, a supervised machine learning method is expected to perform sub-optimally. In this paper, we focus on multi-class classification problems and use Gaussian processes (GPs) as the underlying classifier. Motivated by a dataset coming from the astrophysics domain, we hypothesize that the observed data may contain noise in the inputs. Therefore, we devise several multi-class GP classifiers that can account for input noise. Such classifiers can be efficiently trained using variational inference to approximate the posterior distribution of the latent variables of the model. Moreover, in some situations, the amount of noise can be known beforehand. If this is the case, it can be readily introduced in the proposed methods. This prior information is expected to lead to better performance results. We have evaluated the proposed methods by carrying out several experiments, involving synthetic and real data. These data include several datasets from the UCI repository, the MNIST dataset and a dataset coming from astrophysics. The results obtained show that, although the classification error is similar across methods, the predictive distribution of the proposed methods is better, in terms of the test log-likelihood, than the predictive distribution of a classifier based on GPs that ignores input noise. |
Tasks | Gaussian Processes |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10523v2 |
PDF | https://arxiv.org/pdf/2001.10523v2.pdf |
PWC | https://paperswithcode.com/paper/multi-class-gaussian-process-classification-1 |
Repo | https://github.com/cvillacampa/GPInputNoise |
Framework | tf |
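
A crude way to convey the idea: if the input-noise level is known, the predictive distribution can be averaged over samples of the noisy input. This Monte-Carlo smoothing is only a stand-in for the paper's variational treatment, and `predict_proba` is a placeholder for any classifier's predictive function.

```python
# Monte-Carlo stand-in for input-noise-aware prediction: average the
# class probabilities over plausible true inputs. Not the paper's method.
import numpy as np

def noisy_input_predict(predict_proba, x, input_std, n_samples=100, seed=0):
    # predict_proba: maps a 1-D input vector to a vector of class probabilities
    rng = np.random.default_rng(seed)
    xs = x + rng.normal(0.0, input_std, size=(n_samples,) + x.shape)
    return np.mean([predict_proba(xi) for xi in xs], axis=0)
```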
Community Detection in Bipartite Networks with Stochastic Blockmodels
Title | Community Detection in Bipartite Networks with Stochastic Blockmodels |
Authors | Tzu-Chi Yen, Daniel B. Larremore |
Abstract | In bipartite networks, community structures are restricted to being disassortative, in that nodes of one type are grouped according to common patterns of connection with nodes of the other type. This makes the stochastic block model (SBM), a highly flexible generative model for networks with block structure, an intuitive choice for bipartite community detection. However, typical formulations of the SBM do not make use of the special structure of bipartite networks. In this work, we introduce a Bayesian nonparametric formulation of the SBM and a corresponding algorithm to efficiently find communities in bipartite networks without overfitting. The biSBM improves community detection results over general SBMs when data are noisy, improves the model resolution limit by a factor of $\sqrt{2}$, and expands our understanding of the complicated optimization landscape associated with community detection tasks. A direct comparison of certain terms of the prior distributions in the biSBM and a related high-resolution hierarchical SBM also reveals a counterintuitive regime of community detection problems, populated by smaller and sparser networks, where non-hierarchical models outperform their more flexible counterpart. |
Tasks | Community Detection |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.11818v1 |
PDF | https://arxiv.org/pdf/2001.11818v1.pdf |
PWC | https://paperswithcode.com/paper/community-detection-in-bipartite-networks |
Repo | https://github.com/junipertcy/bipartiteSBM |
Framework | none |
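
To make "bipartite SBM" concrete, here is the (Poisson, non-degree-corrected) log-likelihood such methods score partitions with. This is schematic only; the linked repo implements a Bayesian nonparametric version with an efficient search, not this function.

```python
# Schematic Poisson bipartite-SBM log-likelihood for a fixed partition,
# dropping constant terms. Community labels must be 0-based integers.
import numpy as np

def bisbm_loglike(edges, part_a, part_b):
    # edges: (i, j) pairs with i a type-a node and j a type-b node
    # part_a / part_b: dicts mapping nodes to community labels
    m = {}
    for i, j in edges:
        key = (part_a[i], part_b[j])
        m[key] = m.get(key, 0) + 1
    n_a = np.bincount(list(part_a.values()))  # community sizes, type a
    n_b = np.bincount(list(part_b.values()))  # community sizes, type b
    ll = 0.0
    for (r, s), m_rs in m.items():
        lam = m_rs / (n_a[r] * n_b[s])        # maximum-likelihood block rate
        ll += m_rs * np.log(lam) - n_a[r] * n_b[s] * lam
    return ll
```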
A unified framework for 21cm tomography sample generation and parameter inference with Progressively Growing GANs
Title | A unified framework for 21cm tomography sample generation and parameter inference with Progressively Growing GANs |
Authors | Florian List, Geraint F. Lewis |
Abstract | Creating a database of 21cm brightness temperature signals from the Epoch of Reionisation (EoR) for an array of reionisation histories is a complex and computationally expensive task, given the range of astrophysical processes involved and the possibly high-dimensional parameter space that is to be probed. We utilise a specific type of neural network, a Progressively Growing Generative Adversarial Network (PGGAN), to produce realistic tomography images of the 21cm brightness temperature during the EoR, covering a continuous three-dimensional parameter space that models varying X-ray emissivity, Lyman band emissivity, and ratio between hard and soft X-rays. The GPU-trained network generates new samples at a resolution of $\sim 3'$ in a second (on a laptop CPU), and the resulting global 21cm signal, power spectrum, and pixel distribution function agree well with those of the training data, taken from the 21SSD catalogue (Semelin et al., 2017). Finally, we showcase how a trained PGGAN can be leveraged for the converse task of inferring parameters from 21cm tomography samples via Approximate Bayesian Computation. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.07940v1 |
PDF | https://arxiv.org/pdf/2002.07940v1.pdf |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-21cm-tomography |
Repo | https://github.com/FloList/21cmGAN |
Framework | tf |
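
The "converse task" in the last sentence is ordinary ABC rejection with the trained generator as the simulator. A generic sketch, where `generator`, `summary`, and `prior_sample` are placeholders for the trained PGGAN, a summary statistic (e.g., the power spectrum), and the prior over the three astrophysical parameters:

```python
# Generic ABC rejection sampler; the trained GAN plays the simulator.
# All three callables are placeholders, not 21cmGAN's actual interface.
import numpy as np

def abc_rejection(observed, generator, summary, prior_sample, eps, n_draws=10000):
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample()                  # draw candidate parameters
        sim = generator(theta)                  # cheap GAN forward pass
        if np.linalg.norm(summary(sim) - s_obs) < eps:
            accepted.append(theta)              # close enough: keep theta
    return np.array(accepted)                   # approximate posterior draws
```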
DDSP: Differentiable Digital Signal Processing
Title | DDSP: Differentiable Digital Signal Processing |
Authors | Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, Adam Roberts |
Abstract | Most generative models of audio directly generate samples in one of two domains: time or frequency. While sufficient to express any signal, these representations are inefficient, as they do not utilize existing knowledge of how sound is generated and perceived. A third approach (vocoders/synthesizers) successfully incorporates strong domain knowledge of signal processing and perception, but has been less actively researched due to limited expressivity and difficulty integrating with modern auto-differentiation-based machine learning methods. In this paper, we introduce the Differentiable Digital Signal Processing (DDSP) library, which enables direct integration of classic signal processing elements with deep learning methods. Focusing on audio synthesis, we achieve high-fidelity generation without the need for large autoregressive models or adversarial losses, demonstrating that DDSP enables utilizing strong inductive biases without losing the expressive power of neural networks. Further, we show that combining interpretable modules permits manipulation of each separate model component, with applications such as independent control of pitch and loudness, realistic extrapolation to pitches not seen during training, blind dereverberation of room acoustics, transfer of extracted room acoustics to new environments, and transformation of timbre between disparate sources. In short, DDSP enables an interpretable and modular approach to generative modeling, without sacrificing the benefits of deep learning. The library is publicly available at https://github.com/magenta/ddsp and we welcome further contributions from the community and domain experts. |
Tasks | Audio Generation |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04643v1 |
PDF | https://arxiv.org/pdf/2001.04643v1.pdf |
PWC | https://paperswithcode.com/paper/ddsp-differentiable-digital-signal-processing-1 |
Repo | https://github.com/magenta/ddsp |
Framework | tf |
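
The flavor of a DDSP module — classic DSP written so a network can drive its parameters — can be shown with a plain NumPy harmonic synthesizer. The real library implements this (and filtered noise, reverb, etc.) with differentiable TensorFlow ops; this sketch only illustrates the signal model.

```python
# A bare-bones harmonic synthesizer: per-sample fundamental frequency plus
# harmonic amplitudes -> audio. DDSP's version is differentiable end to end.
import numpy as np

def harmonic_synth(f0, amps, sr=16000):
    # f0: (n_samples,) fundamental in Hz; amps: (n_harmonics,) weights
    phase = 2 * np.pi * np.cumsum(f0) / sr           # instantaneous phase
    k = np.arange(1, len(amps) + 1)[:, None]         # harmonic numbers
    return (amps[:, None] * np.sin(k * phase)).sum(axis=0)

# One second of a 220 Hz tone with three decaying harmonics.
audio = harmonic_synth(np.full(16000, 220.0), np.array([1.0, 0.5, 0.25]))
```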
Untangling in Invariant Speech Recognition
Title | Untangling in Invariant Speech Recognition |
Authors | Cory Stephenson, Jenelle Feather, Suchismita Padhy, Oguz Elibol, Hanlin Tang, Josh McDermott, SueYeon Chung |
Abstract | Encouraged by the success of deep neural networks on a variety of visual tasks, much theoretical and experimental work has been aimed at understanding and interpreting how vision networks operate. Meanwhile, deep neural networks have also achieved impressive performance in audio processing applications, both as sub-components of larger systems and as complete end-to-end systems by themselves. Despite their empirical successes, comparatively little is understood about how these audio models accomplish these tasks. In this work, we employ a recently developed statistical mechanical theory that connects geometric properties of network representations and the separability of classes to probe how information is untangled within neural networks trained to recognize speech. We observe that speaker-specific nuisance variations are discarded by the network’s hierarchy, whereas task-relevant properties such as words and phonemes are untangled in later layers. Higher level concepts such as parts-of-speech and context dependence also emerge in the later layers of the network. Finally, we find that the deep representations carry out significant temporal untangling by efficiently extracting task-relevant features at each time step of the computation. Taken together, these findings shed light on how deep auditory models process time dependent input signals to achieve invariant speech recognition, and show how different concepts emerge through the layers of the network. |
Tasks | Speech Recognition |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01787v1 |
PDF | https://arxiv.org/pdf/2003.01787v1.pdf |
PWC | https://paperswithcode.com/paper/untangling-in-invariant-speech-recognition-1 |
Repo | https://github.com/schung039/neural_manifolds_replicaMFT |
Framework | pytorch |
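
A rough operational proxy for "untangling" is to fit a linear probe on each layer's activations: if task-relevant classes become linearly separable with depth, probe accuracy rises. This sketch uses scikit-learn and stands in for the paper's replica mean-field capacity analysis (implemented in the linked repo).

```python
# Layer-wise linear probes as a crude proxy for manifold untangling:
# higher cross-validated accuracy at deeper layers = more untangled classes.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_accuracies(features_per_layer, labels):
    # features_per_layer: {layer_name: (n_samples, dim) activation matrix}
    clf = LogisticRegression(max_iter=1000)
    return {
        name: cross_val_score(clf, X, labels, cv=5).mean()
        for name, X in features_per_layer.items()
    }
```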
TEASER: Fast and Certifiable Point Cloud Registration
Title | TEASER: Fast and Certifiable Point Cloud Registration |
Authors | Heng Yang, Jingnan Shi, Luca Carlone |
Abstract | We propose the first fast and certifiable algorithm for the registration of two sets of 3D points in the presence of large amounts of outlier correspondences. Towards this goal, we first reformulate the registration problem using a Truncated Least Squares (TLS) cost that makes the estimation insensitive to spurious correspondences. Then, we provide a general graph-theoretic framework to decouple scale, rotation, and translation estimation, which allows solving in cascade for the three transformations. Despite the fact that each subproblem is still non-convex and combinatorial in nature, we show that (i) TLS scale and (component-wise) translation estimation can be solved in polynomial time via an adaptive voting scheme, (ii) TLS rotation estimation can be relaxed to a semidefinite program (SDP) and the relaxation is tight, even in the presence of extreme outlier rates. We name the resulting algorithm TEASER (Truncated least squares Estimation And SEmidefinite Relaxation). While solving large SDP relaxations is typically slow, we develop a second certifiable algorithm, named TEASER++, that circumvents the need to solve an SDP and runs in milliseconds. For both algorithms, we provide theoretical bounds on the estimation errors, which are the first of their kind for robust registration problems. Moreover, we test their performance on standard benchmarks, object detection datasets, and the 3DMatch scan matching dataset, and show that (i) both algorithms dominate the state of the art (e.g., RANSAC, branch-&-bound, heuristics) and are robust to more than 99% outliers, (ii) TEASER++ can run in milliseconds and it is currently the fastest robust registration algorithm, (iii) TEASER++ is so robust it can also solve problems without correspondences (e.g., hypothesizing all-to-all correspondences) where it largely outperforms ICP. We release a fast open-source C++ implementation of TEASER++. |
Tasks | Object Detection, Point Cloud Registration |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07715v1 |
PDF | https://arxiv.org/pdf/2001.07715v1.pdf |
PWC | https://paperswithcode.com/paper/teaser-fast-and-certifiable-point-cloud |
Repo | https://github.com/MIT-SPARK/TEASER-plusplus |
Framework | none |
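
Of the three decoupled subproblems, the component-wise TLS translation step is the easiest to show: with each candidate measurement carrying an inlier interval, adaptive voting amounts to finding the point covered by the most intervals. A minimal sketch under that reading, simplified from the paper:

```python
# Interval-stabbing sketch of TEASER's adaptive voting for one translation
# component: each measurement m votes for [m - beta, m + beta]; the estimate
# lies where the most intervals overlap.
def tls_component_vote(measurements, beta):
    events = [(m - beta, +1) for m in measurements] + \
             [(m + beta, -1) for m in measurements]
    events.sort(key=lambda e: (e[0], -e[1]))  # interval opens before closes
    best = count = 0
    best_x = None
    for x, delta in events:
        count += delta
        if count > best:
            best, best_x = count, x
    return best_x, best  # consensus point and its inlier count
```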
Self-supervised Image Enhancement Network: Training with Low Light Images Only
Title | Self-supervised Image Enhancement Network: Training with Low Light Images Only |
Authors | Yu Zhang, Xiaoguang Di, Bin Zhang, Chunhui Wang |
Abstract | This paper proposes a self-supervised low-light image enhancement method based on deep learning. Inspired by information entropy theory and the Retinex model, we propose a maximum-entropy-based Retinex model. With this model, a very simple network can separate the illumination and reflectance, and the network can be trained with low-light images only. To achieve self-supervised learning, we introduce the constraint that the maximum channel of the reflectance conforms to the maximum channel of the low-light image and that its entropy is maximal. Our model is very simple and does not rely on any carefully curated dataset (training can be completed with even a single low-light image). The network needs only minute-level training to achieve image enhancement. Experiments show that the proposed method reaches the state of the art in both processing speed and enhancement quality. |
Tasks | Image Enhancement, Low-Light Image Enhancement |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11300v1 |
PDF | https://arxiv.org/pdf/2002.11300v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-image-enhancement-network |
Repo | https://github.com/hitzhangyu/Self-supervised-Image-Enhancement-Network-Training-With-Low-Light-Images-Only |
Framework | none |
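
The self-supervision described in the abstract boils down to a loss with two terms: the reflectance's maximum channel should match the low-light image's maximum channel, and its entropy should be large. A PyTorch-style sketch under assumptions (the loss weight and the hard-histogram entropy are ours) follows; note `torch.histc` carries no gradient, so the paper necessarily enforces the entropy term differently.

```python
# Schematic of the maximum-entropy Retinex objective. The entropy term uses
# a hard histogram purely to make the idea concrete -- it is NOT trainable
# as written; the weight 0.1 is also an assumption.
import torch

def self_supervised_loss(reflectance, low_light):
    r_max = reflectance.max(dim=1).values   # max over RGB channels, (B, H, W)
    i_max = low_light.max(dim=1).values
    fidelity = torch.mean((r_max - i_max) ** 2)   # max channels should agree
    hist = torch.histc(r_max, bins=256, min=0.0, max=1.0)
    p = hist / hist.sum()
    entropy = -(p[p > 0] * p[p > 0].log()).sum()  # should be large
    return fidelity - 0.1 * entropy
```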
Can We Use Split Learning on 1D CNN Models for Privacy Preserving Training?
Title | Can We Use Split Learning on 1D CNN Models for Privacy Preserving Training? |
Authors | Sharif Abuadbba, Kyuyeon Kim, Minki Kim, Chandra Thapa, Seyit A. Camtepe, Yansong Gao, Hyoungshick Kim, Surya Nepal |
Abstract | A new collaborative learning scheme, called split learning, was recently introduced, aiming to protect user data privacy without revealing raw input data to a server. It collaboratively runs a deep neural network model where the model is split into two parts, one for the client and the other for the server. Therefore, the server has no direct access to raw data processed at the client. Until now, split learning has been believed to be a promising approach to protect the client’s raw data; for example, the client’s data was protected in healthcare image applications using 2D convolutional neural network (CNN) models. However, it is still unclear whether split learning can be applied to other deep learning models, in particular, 1D CNN. In this paper, we examine whether split learning can be used to perform privacy-preserving training for 1D CNN models. To answer this, we first design and implement a 1D CNN model under split learning and validate its efficacy in detecting heart abnormalities using medical ECG data. We observed that the 1D CNN model under split learning achieves the same 98.9% accuracy as the original (non-split) model. However, our evaluation demonstrates that split learning may fail to protect the raw data privacy on 1D CNN models. To address the observed privacy leakage in split learning, we adopt two privacy leakage mitigation techniques: 1) adding more hidden layers to the client side and 2) applying differential privacy. Although those mitigation techniques are helpful in reducing privacy leakage, they have a significant impact on model accuracy. Hence, based on those results, we conclude that split learning alone would not be sufficient to maintain the confidentiality of raw sequential data in 1D CNN models. |
Tasks | |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.12365v1 |
PDF | https://arxiv.org/pdf/2003.12365v1.pdf |
PWC | https://paperswithcode.com/paper/can-we-use-split-learning-on-1d-cnn-models |
Repo | https://github.com/SharifAbuadbba/split-learning-1D |
Framework | none |
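
The split itself is just a partition of the network's layers across two parties. A minimal PyTorch sketch, with an invented architecture and ECG-like input shape:

```python
# Minimal split of a 1D CNN: the client runs the early layers and ships only
# the intermediate ("smashed") activations; the server never sees raw signals.
# Layer sizes and the 5-class output are illustrative, not the paper's model.
import torch
import torch.nn as nn

client_net = nn.Sequential(                       # on the client device
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2))
server_net = nn.Sequential(                       # on the server
    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 5))

x = torch.randn(8, 1, 128)        # batch of raw 1D signals (ECG-like)
smashed = client_net(x)           # only this tensor crosses the network
logits = server_net(smashed)      # gradients flow back across the split
```

The paper's finding is that `smashed` can still leak enough to reconstruct `x`, which is why deeper client-side stacks and differential privacy are evaluated as mitigations.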
Output Diversified Initialization for Adversarial Attacks
Title | Output Diversified Initialization for Adversarial Attacks |
Authors | Yusuke Tashiro, Yang Song, Stefano Ermon |
Abstract | Adversarial examples are often constructed by iteratively refining a randomly perturbed input. To improve diversity and thus also the success rates of attacks, we propose Output Diversified Initialization (ODI), a novel random initialization strategy that can be combined with most existing white-box adversarial attacks. Instead of using uniform perturbations in the input space, we seek diversity in the output logits space of the target model. Empirically, we demonstrate that existing $\ell_\infty$ and $\ell_2$ adversarial attacks with ODI become much more efficient on several datasets including MNIST, CIFAR-10 and ImageNet, reducing the accuracy of recently proposed defense models by 1–17%. Moreover, PGD attack with ODI outperforms current state-of-the-art attacks against robust models, while also being roughly 50 times faster on CIFAR-10. The code is available on https://github.com/ermongroup/ODI/. |
Tasks | |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.06878v1 |
PDF | https://arxiv.org/pdf/2003.06878v1.pdf |
PWC | https://paperswithcode.com/paper/output-diversified-initialization-for |
Repo | https://github.com/ermongroup/ODI |
Framework | tf |
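
ODI replaces the uniform random start of an attack with a few steps that push the logits along a random direction, after which the usual PGD iterations run. A hedged sketch (the step count, uniform direction sampling, and the absence of a pixel-range clamp are simplifications):

```python
# Sketch of Output Diversified Initialization for an l_inf attack: take a
# few sign-gradient steps maximizing a random logit-space direction w,
# staying inside the eps-ball, then hand x_adv to standard PGD.
import torch

def odi_init(model, x, eps, steps=2):
    n_classes = model(x).shape[-1]
    w = torch.empty(n_classes, device=x.device).uniform_(-1.0, 1.0)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = (model(x_adv) * w).sum()           # diversify in output space
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + eps * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project to the eps-ball
    return x_adv
```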
Federated Learning with Matched Averaging
Title | Federated Learning with Matched Averaging |
Authors | Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, Yasaman Khazaeni |
Abstract | Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud. We propose the Federated Matched Averaging (FedMA) algorithm, designed for federated learning of modern neural network architectures, e.g., convolutional neural networks (CNNs) and LSTMs. FedMA constructs the shared global model in a layer-wise manner by matching and averaging hidden elements (i.e. channels for convolution layers; hidden states for LSTM; neurons for fully connected layers) with similar feature extraction signatures. Our experiments indicate that FedMA not only outperforms popular state-of-the-art federated learning algorithms on deep CNN and LSTM architectures trained on real world datasets, but also reduces the overall communication burden. |
Tasks | |
Published | 2020-02-15 |
URL | https://arxiv.org/abs/2002.06440v1 |
PDF | https://arxiv.org/pdf/2002.06440v1.pdf |
PWC | https://paperswithcode.com/paper/federated-learning-with-matched-averaging-1 |
Repo | https://github.com/IBM/FedMA |
Framework | pytorch |
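
The "matched averaging" step can be pictured for a single layer and two clients: find the neuron matching that best aligns the clients' weight vectors, then average matched pairs. FedMA's actual matching is a Bayesian nonparametric (BBP-MAP) objective that also lets the global layer grow; the Hungarian-assignment sketch below only conveys the permute-then-average idea.

```python
# Toy matched averaging of one fully connected layer across two clients:
# align neurons by weight similarity before averaging, instead of averaging
# coordinate-wise as in FedAvg. Simplified stand-in for FedMA's matching.
import numpy as np
from scipy.optimize import linear_sum_assignment

def matched_average(w_a, w_b):
    # w_a, w_b: (n_neurons, fan_in) weights of the same layer on two clients
    cost = ((w_a[:, None, :] - w_b[None, :, :]) ** 2).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)  # optimal neuron matching
    return 0.5 * (w_a[rows] + w_b[cols])      # average matched neurons
```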