April 1, 2020

2969 words 14 mins read

Paper Group NANR 111


Natural- to formal-language generation using Tensor Product Representations. Towards trustworthy predictions from deep neural networks with fast adversarial calibration. Deep Randomized Least Squares Value Iteration. Language-independent Cross-lingual Contextual Representations. Safe Policy Learning for Continuous Control. Lattice Representation Le …

Natural- to formal-language generation using Tensor Product Representations

Title Natural- to formal-language generation using Tensor Product Representations
Authors Anonymous
Abstract Generating formal language represented by relational tuples, such as Lisp programs or mathematical expressions, from natural-language input is an extremely challenging task because it requires explicitly capturing discrete symbolic structural information from the input in order to generate the output. Most state-of-the-art neural sequence models do not explicitly capture such structural information, and thus do not perform well on these tasks. In this paper, we propose a new encoder-decoder model based on Tensor Product Representations (TPRs) for Natural- to Formal-language generation, called TP-N2F. The encoder of TP-N2F employs TPR ‘binding’ to encode natural-language symbolic structure in vector space, and the decoder uses TPR ‘unbinding’ to generate, in symbolic space, a sequence of relational tuples, each consisting of a relation (or operation) and a number of arguments. TP-N2F considerably outperforms LSTM-based Seq2Seq models, setting new state-of-the-art results on two benchmarks: the MathQA dataset for math problem solving and the AlgoList dataset for program synthesis. Ablation studies show that the improvements are mainly attributable to the use of TPRs in both the encoder and the decoder to explicitly capture relational structure information for symbolic reasoning.
Tasks Program Synthesis, Text Generation
Published 2020-01-01
URL https://openreview.net/forum?id=BylPSkHKvB
PDF https://openreview.net/pdf?id=BylPSkHKvB
PWC https://paperswithcode.com/paper/natural-to-formal-language-generation-using
Repo
Framework
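The TP-N2F abstract above hinges on TPR ‘binding’ and ‘unbinding’. The snippet below is a minimal NumPy sketch of that mechanism in isolation: fillers (symbols) are bound to orthonormal roles via outer products, summed into one tensor, and recovered by contraction. The dimensions and the orthonormal-role assumption are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_filler, n_slots = 8, 3

fillers = rng.normal(size=(n_slots, d_filler))                 # symbol (filler) embeddings
roles = np.linalg.qr(rng.normal(size=(4, 4)))[0][:n_slots]     # illustrative orthonormal role vectors

# Binding: the sum of outer products filler_i (x) role_i gives one order-2 TPR tensor.
T = sum(np.outer(f, r) for f, r in zip(fillers, roles))

# Unbinding: contracting with a role's dual vector (here the role itself, since roles
# are orthonormal) recovers the filler bound to that role.
recovered = T @ roles[1]
print(np.allclose(recovered, fillers[1]))   # True
```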

Towards trustworthy predictions from deep neural networks with fast adversarial calibration

Title Towards trustworthy predictions from deep neural networks with fast adversarial calibration
Authors Anonymous
Abstract To facilitate widespread acceptance of AI systems guiding decision making in real-world applications, trustworthiness of deployed models is key. That is, it is crucial for predictive models to be uncertainty-aware and to yield well-calibrated (and thus trustworthy) predictions for in-domain samples as well as under domain shift. Recent efforts to account for predictive uncertainty include post-processing steps for trained neural networks, Bayesian neural networks, and alternative non-Bayesian approaches such as ensembles and evidential deep learning. Here, we propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift. We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions for a wide range of perturbations. We comprehensively evaluate previously proposed approaches on different data modalities, a large range of data sets, network architectures and perturbation strategies, and observe that our modelling approach substantially outperforms existing state-of-the-art approaches, yielding well-calibrated predictions for both in-domain and out-of-domain samples.
Tasks Calibration, Decision Making
Published 2020-01-01
URL https://openreview.net/forum?id=rygePJHYPH
PDF https://openreview.net/pdf?id=rygePJHYPH
PWC https://paperswithcode.com/paper/towards-trustworthy-predictions-from-deep
Repo
Framework
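As a rough illustration of combining an entropy-encouraging term with an adversarial calibration term, the following PyTorch training-step sketch uses a generic FGSM-style perturbation and a confidence-accuracy gap as the calibration surrogate. These specific loss choices and the weights lambda_ent and lambda_adv are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def training_step(model, x, y, eps=0.05, lambda_ent=0.1, lambda_adv=1.0):
    # Note: loss terms and hyper-parameters are illustrative assumptions.
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # Entropy-encouraging term: penalize over-confident predictive distributions.
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()

    # Adversarial calibration surrogate: on an FGSM-style perturbation of the batch,
    # penalize the squared gap between confidence and accuracy.
    x_adv = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
    probs_pert = model(x + eps * grad.sign()).softmax(dim=-1)
    conf, pred = probs_pert.max(dim=-1)
    calib_penalty = (conf - (pred == y).float()).pow(2).mean()

    return ce - lambda_ent * entropy + lambda_adv * calib_penalty
```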

Deep Randomized Least Squares Value Iteration

Title Deep Randomized Least Squares Value Iteration
Authors Guy Adam, Tom Zahavy, Oron Anschel, Nahum Shimkin
Abstract Exploration while learning representations is one of the main challenges Deep Reinforcement Learning (DRL) faces today. Since the learned representation depends on the observed data, the exploration strategy plays a crucial role. The popular DQN algorithm has significantly improved the ability of Reinforcement Learning (RL) algorithms to learn state representations from raw data, yet it uses a naive exploration strategy that is statistically inefficient. The Randomized Least Squares Value Iteration (RLSVI) algorithm (Osband et al., 2016), on the other hand, explores and generalizes efficiently via linearly parameterized value functions. However, it relies on hand-designed state representations that require prior engineering work for every environment. In this paper, we propose a deep learning adaptation of RLSVI. Rather than using a hand-designed state representation, we use a state representation learned directly from the data by a DQN agent. Because the representation is optimized during the learning process, a key component of the proposed method is a likelihood-matching mechanism that adapts to the changing representation. We demonstrate the importance of the various properties of our algorithm on a toy problem and show that our method outperforms DQN on five Atari benchmarks, reaching results competitive with the Rainbow algorithm.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Syetja4KPH
PDF https://openreview.net/pdf?id=Syetja4KPH
PWC https://paperswithcode.com/paper/deep-randomized-least-squares-value-iteration
Repo
Framework
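The core RLSVI idea, exploration through posterior sampling over a linear value function on top of (here fixed) features, can be sketched with a Bayesian linear regression. The feature dimensions, noise variance, and per-action features below are made up, and the paper's likelihood-matching mechanism is not shown.

```python
import numpy as np

def rlsvi_sample_weights(Phi, targets, sigma2=1.0, prior_var=1.0, rng=None):
    """Sample value-function weights from the Bayesian linear-regression posterior.

    Phi: (n, d) state(-action) features, targets: (n,) regression targets.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = Phi.shape[1]
    precision = Phi.T @ Phi / sigma2 + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ targets / sigma2
    return rng.multivariate_normal(mean, cov)      # one posterior sample

# Exploration comes from acting greedily w.r.t. the sampled weights (toy data below).
rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 16))
targets = Phi @ rng.normal(size=16) + 0.1 * rng.normal(size=100)
w = rlsvi_sample_weights(Phi, targets, rng=rng)
action_features = rng.normal(size=(4, 16))         # hypothetical per-action features
print(int(np.argmax(action_features @ w)))
```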

Language-independent Cross-lingual Contextual Representations

Title Language-independent Cross-lingual Contextual Representations
Authors Anonymous
Abstract Contextual representation models like BERT have achieved state-of-the-art performance on a diverse range of NLP tasks. We propose a cross-lingual contextual representation model that generates language-independent contextual representations. This helps enable zero-shot cross-lingual transfer of a wide range of NLP models built on top of contextual representation models like BERT. We provide a formulation of language-independent cross-lingual contextual representation based on mono-lingual representations. Our formulation takes three steps to align sequences of vectors: transform, extract, and reorder. We present a detailed discussion of the process of learning cross-lingual contextual representations, as well as of its performance in cross-lingual transfer learning and the implications thereof.
Tasks Cross-Lingual Transfer, Transfer Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HylvleBtPB
PDF https://openreview.net/pdf?id=HylvleBtPB
PWC https://paperswithcode.com/paper/language-independent-cross-lingual-contextual
Repo
Framework
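The abstract's "transform" step aligns vector spaces across languages. One standard way to perform such an alignment, shown purely as an illustration and not necessarily the paper's method, is the orthogonal Procrustes solution between row-aligned monolingual vectors:

```python
import numpy as np

def procrustes_transform(X_src, Y_tgt):
    """Orthogonal W minimizing ||X_src @ W.T - Y_tgt||_F for row-aligned vectors."""
    U, _, Vt = np.linalg.svd(Y_tgt.T @ X_src)
    return U @ Vt

# Toy check on noiseless data generated from a known rotation (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))
true_W = np.linalg.qr(rng.normal(size=(32, 32)))[0]
W = procrustes_transform(X, X @ true_W.T)
print(np.allclose(W, true_W))   # True: the rotation is recovered
```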

Safe Policy Learning for Continuous Control

Title Safe Policy Learning for Continuous Control
Authors Anonymous
Abstract We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that keep the agent in desirable situations, both during training and at convergence. We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve them. Our algorithms can use any standard policy gradient (PG) method, such as deep deterministic policy gradient (DDPG) or proximal policy optimization (PPO), to train a neural network policy, while guaranteeing near-constraint satisfaction for every policy update by projecting either the policy parameter or the selected action onto the set of feasible solutions induced by the state-dependent linearized Lyapunov constraints. Compared to the existing constrained PG algorithms, ours are more data efficient as they are able to utilize both on-policy and off-policy data. Moreover, our action-projection algorithm often leads to less conservative policy updates and allows for natural integration into an end-to-end PG training pipeline. We evaluate our algorithms and compare them with the state-of-the-art baselines on several simulated (MuJoCo) tasks, as well as a real-world robot obstacle-avoidance problem, demonstrating their effectiveness in terms of balancing performance and constraint satisfaction.
Tasks Continuous Control
Published 2020-01-01
URL https://openreview.net/forum?id=HkxeThNFPH
PDF https://openreview.net/pdf?id=HkxeThNFPH
PWC https://paperswithcode.com/paper/safe-policy-learning-for-continuous-control
Repo
Framework
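The action-projection step can be illustrated with a single linearized constraint g·a <= c, for which the Euclidean projection onto the feasible half-space has a closed form; the actual state-dependent Lyapunov constraints in the paper are more involved than this toy case.

```python
import numpy as np

def project_action(a, g, c):
    """Project action a onto the half-space {a : g @ a <= c} with minimal L2 change."""
    violation = g @ a - c
    if violation <= 0.0:
        return a                               # already feasible: keep the proposed action
    return a - violation * g / (g @ g)         # move along g just enough to reach g @ a = c

# Toy numbers (illustrative): the proposed action violates the constraint by 1.
print(project_action(np.array([1.0, 2.0]), g=np.array([1.0, 1.0]), c=2.0))   # [0.5 1.5]
```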

Lattice Representation Learning

Title Lattice Representation Learning
Authors Anonymous
Abstract We introduce the notion of lattice representation learning, in which the representation for some object of interest (e.g. a sentence or an image) is a lattice point in a Euclidean space. Our main contribution is a result for replacing an objective function that employs lattice quantization with an objective function in which quantization is absent, thus allowing optimization techniques based on gradient descent to apply; we call the resulting algorithms dithered stochastic gradient descent algorithms, as they are designed explicitly to allow for an optimization procedure where only local information is employed. We also argue that a technique commonly used in Variational Auto-Encoders (Gaussian priors and Gaussian approximate posteriors) is tightly connected with the idea of lattice representations, as the quantization error in good high-dimensional lattices can be modeled as a Gaussian distribution. We use a traditional encoder/decoder architecture to explore the idea of lattice-valued representations, and provide experimental evidence of the potential of using lattice representations by modifying the OpenNMT-py generic seq2seq architecture so that it can implement not only Gaussian dithering of representations, but also the well-known straight-through estimator and its application to vector quantization.
Tasks Quantization, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rJlwAa4YwS
PDF https://openreview.net/pdf?id=rJlwAa4YwS
PWC https://paperswithcode.com/paper/lattice-representation-learning
Repo
Framework
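Two generic ingredients mentioned in the abstract, dithering and the straight-through estimator, can be sketched for a scalar (integer-lattice) quantizer as follows; the paper's lattice formulation is more general than this toy case.

```python
import torch

def dithered_quantize(z, training=True):
    # Training: replace rounding by additive uniform noise over the quantization cell.
    if training:
        return z + (torch.rand_like(z) - 0.5)
    return torch.round(z)

def straight_through_quantize(z):
    # Forward: round(z); backward: identity gradient (straight-through estimator).
    return z + (torch.round(z) - z).detach()

z = torch.tensor([0.2, 1.7, -0.6], requires_grad=True)
straight_through_quantize(z).sum().backward()
print(z.grad)   # tensor([1., 1., 1.]): the gradient passed straight through the rounding
```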

Meta Learning via Learned Loss

Title Meta Learning via Learned Loss
Authors Anonymous
Abstract We present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures. We develop a pipeline for training such loss functions, targeted at maximizing the performance of model learning with them. We observe that the loss landscape produced by our learned losses significantly improves upon the original task-specific losses in both supervised and reinforcement learning tasks. Furthermore, we show that our meta-learning framework is flexible enough to incorporate additional information at meta-train time. This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time.
Tasks Meta-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=ryesZANKPB
PDF https://openreview.net/pdf?id=ryesZANKPB
PWC https://paperswithcode.com/paper/meta-learning-via-learned-loss-1
Repo
Framework
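A bare-bones sketch of the learned-loss idea: one differentiable inner update uses the parametric loss, and the outer update trains the loss parameters so that the post-update model does well under the true task loss. The tiny linear model, the loss-network architecture, and the optimizers here are placeholders, not the paper's setup.

```python
import torch

torch.manual_seed(0)
# Hypothetical learned loss: a small MLP over (prediction, target) pairs.
loss_net = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
meta_opt = torch.optim.Adam(loss_net.parameters(), lr=1e-3)

x, y = torch.randn(64, 3), torch.randn(64, 1)
for _ in range(100):
    w = torch.zeros(3, 1, requires_grad=True)                  # fresh inner (linear) model
    learned_loss = loss_net(torch.cat([x @ w, y], dim=-1)).mean()
    grad_w, = torch.autograd.grad(learned_loss, w, create_graph=True)
    w_new = w - 0.1 * grad_w                                    # one differentiable inner step
    task_loss = ((x @ w_new - y) ** 2).mean()                   # true task loss after the step
    meta_opt.zero_grad()
    task_loss.backward()                                        # meta-gradient flows into loss_net
    meta_opt.step()
```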

Efficient Exploration via State Marginal Matching

Title Efficient Exploration via State Marginal Matching
Authors Anonymous
Abstract Reinforcement learning agents need to explore their unknown environments to solve the tasks given to them. The Bayes optimal solution to exploration is intractable for complex environments, and while several exploration methods have been proposed as approximations, it remains unclear what underlying objective is being optimized by existing exploration methods, or how they can be altered to incorporate prior knowledge about the task. Moreover, it is unclear how to acquire a single exploration strategy that will be useful for solving multiple downstream tasks. We address these shortcomings by learning a single exploration policy that can quickly solve a suite of downstream tasks in a multi-task setting, amortizing the cost of learning to explore. We recast exploration as a problem of State Marginal Matching (SMM), where we aim to learn a policy for which the state marginal distribution matches a given target state distribution, which can incorporate prior knowledge about the task. We optimize the objective by reducing it to a two-player, zero-sum game between a state density model and a parametric policy. Our theoretical analysis of this approach suggests that prior exploration methods do not learn a policy that does distribution matching, but acquire a replay buffer that performs distribution matching, an observation that potentially explains these prior methods’ success in single-task settings. On both simulated and real-world tasks, we demonstrate that our algorithm explores faster and adapts more quickly than prior methods.
Tasks Efficient Exploration
Published 2020-01-01
URL https://openreview.net/forum?id=Hkla1eHFvS
PDF https://openreview.net/pdf?id=Hkla1eHFvS
PWC https://paperswithcode.com/paper/efficient-exploration-via-state-marginal-1
Repo
Framework
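State marginal matching is commonly reduced to an intrinsic reward of roughly log p_target(s) - log rho_pi(s), where rho_pi is estimated by a density model over visited states. The sketch below uses a Gaussian KDE as a stand-in density model; the paper's two-player game formulation and actual density model may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde

def smm_reward(states_visited, s, log_p_target):
    """Intrinsic reward log p_target(s) - log rho_pi(s); rho_pi is a KDE stand-in."""
    rho = gaussian_kde(states_visited.T)        # density model over visited states
    return log_p_target(s) - np.log(rho(s)[0] + 1e-12)

# Toy data: the policy has only visited states near the origin, so a far-out state that
# the (broader) target distribution still covers gets a high exploration reward.
rng = np.random.default_rng(0)
visited = rng.normal(scale=0.5, size=(500, 2))
log_p_target = lambda s: -0.5 * np.sum(s ** 2)  # unnormalized log-density of a unit Gaussian
print(smm_reward(visited, np.array([1.5, 1.5]), log_p_target))
```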

Learning Compact Embedding Layers via Differentiable Product Quantization

Title Learning Compact Embedding Layers via Differentiable Product Quantization
Authors Anonymous
Abstract Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings. Despite their effectiveness, the number of parameters in an embedding layer increases linearly with the number of symbols and poses a critical challenge on memory and storage constraints. In this work, we propose a generic and end-to-end learnable compression framework termed differentiable product quantization (DPQ). We present two instantiations of DPQ that leverage different approximation techniques to enable differentiability in end-to-end learning. Our method can readily serve as a drop-in alternative for any existing embedding layer. Empirically, DPQ offers significant compression ratios (14-238x) at negligible or no performance cost on 10 datasets across three different language tasks.
Tasks Quantization
Published 2020-01-01
URL https://openreview.net/forum?id=BJxbOlSKPr
PDF https://openreview.net/pdf?id=BJxbOlSKPr
PWC https://paperswithcode.com/paper/learning-compact-embedding-layers-via
Repo
Framework
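A simplified sketch of product quantization made differentiable with a softmax relaxation: each embedding is split into groups, each group is softly assigned to a small codebook, and the quantized embedding is the concatenation of the resulting codewords. The codebook sizes, temperature, and module layout are illustrative assumptions rather than the paper's exact DPQ instantiations.

```python
import torch

class SoftProductQuantizedEmbedding(torch.nn.Module):
    # Hypothetical module: sizes and the softmax relaxation are illustrative choices.
    def __init__(self, vocab, dim, groups=4, codes=16, tau=1.0):
        super().__init__()
        assert dim % groups == 0
        self.groups, self.sub, self.tau = groups, dim // groups, tau
        self.query = torch.nn.Embedding(vocab, dim)                    # per-token query vectors
        self.codebooks = torch.nn.Parameter(torch.randn(groups, codes, self.sub))

    def forward(self, ids):
        q = self.query(ids).view(*ids.shape, self.groups, self.sub)
        scores = torch.einsum('...gs,gks->...gk', q, self.codebooks)   # similarity to codewords
        probs = torch.softmax(scores / self.tau, dim=-1)               # soft, differentiable assignment
        quantized = torch.einsum('...gk,gks->...gs', probs, self.codebooks)
        return quantized.reshape(*ids.shape, -1)

emb = SoftProductQuantizedEmbedding(vocab=1000, dim=64)
print(emb(torch.randint(0, 1000, (2, 5))).shape)    # torch.Size([2, 5, 64])
```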

{COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery

Title {COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery
Authors Anonymous
Abstract We release the largest public ECG dataset of continuous raw signals for representation learning, containing over 11k patients and 2 billion labelled beats. Our goal is to enable the development of semi-supervised ECG models as well as the discovery of unknown arrhythmia subtypes and anomalous ECG signal events. To this end, we propose an unsupervised representation learning task, evaluated in a semi-supervised fashion. We provide a set of baselines for different feature extractors that can be built upon. Additionally, we perform qualitative evaluations on results from PCA embeddings, where we identify some clustering of known subtypes, indicating the potential for representation learning in arrhythmia subtype discovery.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=BkgqL0EtPH
PDF https://openreview.net/pdf?id=BkgqL0EtPH
PWC https://paperswithcode.com/paper/companyname11k-an-unsupervised-representation
Repo
Framework

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

Title CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
Authors Anonymous
Abstract A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others’ success, and credit-assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The complete architecture, called CM3, learns significantly faster than direct adaptations of existing algorithms on three challenging multi-goal multi-agent problems: cooperative navigation in difficult formations, negotiating multi-vehicle lane changes in the SUMO traffic simulator, and strategic cooperation in a Checkers environment.
Tasks Efficient Exploration, Multi-agent Reinforcement Learning
Published 2020-01-01
URL https://openreview.net/forum?id=S1lEX04tPr
PDF https://openreview.net/pdf?id=S1lEX04tPr
PWC https://paperswithcode.com/paper/cm3-cooperative-multi-goal-multi-stage-multi-1
Repo
Framework

Learning World Graph Decompositions To Accelerate Reinforcement Learning

Title Learning World Graph Decompositions To Accelerate Reinforcement Learning
Authors Anonymous
Abstract Efficiently learning to solve tasks in complex environments is a key challenge for reinforcement learning (RL) agents. We propose to decompose a complex environment using a task-agnostic world graph, an abstraction that accelerates learning by enabling agents to focus exploration on a subspace of the environment. The nodes of a world graph are important waypoint states, and edges represent feasible traversals between them. Our framework has two learning phases: 1) identifying world graph nodes and edges by training a binary recurrent variational auto-encoder (VAE) on trajectory data, and 2) training a hierarchical RL agent that leverages structural and connectivity knowledge from the learned world graph to bias exploration towards task-relevant waypoints and regions. We show that our approach significantly accelerates RL on a suite of challenging 2D grid world tasks: compared to baselines, world graph integration doubles achieved rewards on simpler tasks, e.g. MultiGoal, and manages to solve more challenging tasks, e.g. Door-Key, where baselines fail.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BkgRe1SFDS
PDF https://openreview.net/pdf?id=BkgRe1SFDS
PWC https://paperswithcode.com/paper/learning-world-graph-decompositions-to
Repo
Framework
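A toy picture of the world-graph abstraction: waypoint states as nodes, feasible traversals as edges, and a high-level route computed over the graph (here with breadth-first search) before a low-level policy executes each hop. The waypoints below are invented; in the paper they are discovered from trajectory data with a recurrent VAE.

```python
from collections import deque

world_graph = {                       # adjacency list over hypothetical waypoint states
    "start": ["hall"],
    "hall": ["start", "key_room", "door"],
    "key_room": ["hall"],
    "door": ["hall", "goal"],
    "goal": ["door"],
}

def waypoint_plan(graph, source, target):
    """Shortest waypoint-to-waypoint route, to be handed to a low-level policy."""
    queue, parent = deque([source]), {source: None}
    while queue:
        node = queue.popleft()
        if node == target:
            path = [node]
            while parent[node] is not None:
                node = parent[node]
                path.append(node)
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

print(waypoint_plan(world_graph, "start", "goal"))   # ['start', 'hall', 'door', 'goal']
```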

Group-Transformer: Towards A Lightweight Character-level Language Model

Title Group-Transformer: Towards A Lightweight Character-level Language Model
Authors Anonymous
Abstract Character-level language modeling is an essential but challenging task in Natural Language Processing. Prior works have focused on identifying long-term dependencies between characters and have built deeper and wider networks for better performance. However, their models require substantial computational resources, which hinders the usability of character-level language models in applications with limited resources. In this paper, we propose a lightweight model, called Group-Transformer, that reduces the resource requirements for a Transformer, a promising method for modeling sequences with long-term dependencies. Specifically, the proposed method partitions linear operations to reduce the number of parameters and computational cost. As a result, Group-Transformer only uses 18.2% of the parameters of the best performing LSTM-based model, while providing better performance on two benchmark tasks, enwik8 and text8. When compared to Transformers with a comparable number of parameters and time complexity, the proposed model shows better performance. The implementation code will be available.
Tasks Language Modelling
Published 2020-01-01
URL https://openreview.net/forum?id=rkxdexBYPB
PDF https://openreview.net/pdf?id=rkxdexBYPB
PWC https://paperswithcode.com/paper/group-transformer-towards-a-lightweight
Repo
Framework
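The parameter saving from partitioning linear operations can be seen with a grouped linear layer: splitting the feature dimension into G groups, each with its own small weight matrix, cuts parameters roughly by a factor of G relative to a dense layer. How Group-Transformer wires such layers into its attention and feed-forward blocks is not shown here.

```python
import torch

class GroupLinear(torch.nn.Module):
    # Illustrative grouped linear layer, not the paper's exact module.
    def __init__(self, dim, groups):
        super().__init__()
        assert dim % groups == 0
        self.groups, self.sub = groups, dim // groups
        self.weight = torch.nn.Parameter(torch.randn(groups, self.sub, self.sub) / self.sub ** 0.5)

    def forward(self, x):                       # x: (..., dim)
        xg = x.view(*x.shape[:-1], self.groups, self.sub)
        yg = torch.einsum('...gi,gio->...go', xg, self.weight)
        return yg.reshape(*x.shape)

dense = torch.nn.Linear(512, 512, bias=False)
grouped = GroupLinear(512, groups=4)
print(sum(p.numel() for p in dense.parameters()),      # 262144
      sum(p.numel() for p in grouped.parameters()))    # 65536: 4x fewer parameters
```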

Learning DNA folding patterns with Recurrent Neural Networks

Title Learning DNA folding patterns with Recurrent Neural Networks
Authors Anonymous
Abstract The recent expansion of machine learning applications into molecular biology has made a significant contribution to our understanding of biological systems, and of genome functioning in particular. Technological advances have enabled the collection of large epigenetic datasets, including information about various DNA binding factors (ChIP-Seq) and DNA spatial structure (Hi-C). Several studies have confirmed the correlation between DNA binding factors and Topologically Associating Domains (TADs) in DNA structure. However, the information about physical proximity represented by genomic coordinates has not yet been used to improve the prediction models. In this research, we focus on machine learning methods for predicting DNA folding patterns in the classical model organism Drosophila melanogaster. The paper considers linear models with four types of regularization, gradient boosting, and recurrent neural networks for the prediction of chromatin folding patterns from epigenetic marks. The bidirectional LSTM model outperformed all other models and achieved the best prediction scores. This demonstrates the value of complex models and the importance of memory over sequential DNA states for chromatin folding. We identify informative epigenetic features, which leads to further conclusions about their biological significance.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Bkel6ertwS
PDF https://openreview.net/pdf?id=Bkel6ertwS
PWC https://paperswithcode.com/paper/learning-dna-folding-patterns-with-recurrent
Repo
Framework
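A minimal bidirectional-LSTM sketch for this kind of per-bin sequence labeling: each genomic bin carries a vector of epigenetic marks and the model outputs a folding-related score per bin. The input dimensions, number of marks, and output target are illustrative; the paper's preprocessing and exact targets are not reproduced.

```python
import torch

class ChromatinBiLSTM(torch.nn.Module):
    # Hypothetical architecture: sizes and the scalar per-bin output are assumptions.
    def __init__(self, n_marks=10, hidden=64):
        super().__init__()
        self.rnn = torch.nn.LSTM(n_marks, hidden, batch_first=True, bidirectional=True)
        self.head = torch.nn.Linear(2 * hidden, 1)

    def forward(self, x):                      # x: (batch, bins, n_marks)
        out, _ = self.rnn(x)
        return self.head(out).squeeze(-1)      # per-bin folding score: (batch, bins)

x = torch.randn(8, 200, 10)                    # 8 regions, 200 bins, 10 epigenetic marks
print(ChromatinBiLSTM()(x).shape)              # torch.Size([8, 200])
```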

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Title TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing
Authors Anonymous
Abstract Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing programs. Being able to process programming language code automatically and provide summaries of code functionality accurately can significantly help developers to reduce time spent in code navigation and understanding, and thus increase productivity. Different from natural language articles, source code in programming languages often follows rigid syntactical structures and there can exist dependencies among code elements that are located far away from each other through complex control flows and data flows. Existing studies on tree-based convolutional neural networks (TBCNN) and gated graph neural networks (GGNN) are not able to capture essential semantic dependencies among code elements accurately. In this paper, we propose novel tree-based capsule networks (TreeCaps) and relevant techniques for processing program code in an automated way that encodes code syntactical structures and captures code dependencies more accurately. Based on evaluation on programs written in different programming languages, we show that our TreeCaps-based approach can outperform other approaches in classifying the functionalities of many programs.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SJgXs1HtwH
PDF https://openreview.net/pdf?id=SJgXs1HtwH
PWC https://paperswithcode.com/paper/treecaps-tree-structured-capsule-networks-for
Repo
Framework