April 1, 2020

2969 words 14 mins read

Paper Group NANR 111


Natural- to formal-language generation using Tensor Product Representations. Towards trustworthy predictions from deep neural networks with fast adversarial calibration. Deep Randomized Least Squares Value Iteration. Language-independent Cross-lingual Contextual Representations. Safe Policy Learning for Continuous Control. Lattice Representation Le …

Natural- to formal-language generation using Tensor Product Representations

Title Natural- to formal-language generation using Tensor Product Representations
Authors Anonymous
Abstract Generating formal language represented by relational tuples, such as Lisp programs or mathematical expressions, from natural-language input is an extremely challenging task because it requires explicitly capturing discrete symbolic structural information from the input in order to generate the output. Most state-of-the-art neural sequence models do not explicitly capture such structural information, and thus do not perform well on these tasks. In this paper, we propose a new encoder-decoder model based on Tensor Product Representations (TPRs) for Natural- to Formal-language generation, called TP-N2F. The encoder of TP-N2F employs TPR ‘binding’ to encode natural-language symbolic structure in vector space, and the decoder uses TPR ‘unbinding’ to generate, in symbolic space, a sequence of relational tuples, each consisting of a relation (or operation) and a number of arguments. TP-N2F considerably outperforms LSTM-based Seq2Seq models, setting new state-of-the-art results on two benchmarks: the MathQA dataset for math problem solving and the AlgoList dataset for program synthesis. Ablation studies show that the improvements are mainly attributable to the use of TPRs in both the encoder and the decoder to explicitly capture relational structure information for symbolic reasoning.
Tasks Program Synthesis, Text Generation
Published 2020-01-01
URL https://openreview.net/forum?id=BylPSkHKvB
PDF https://openreview.net/pdf?id=BylPSkHKvB
PWC https://paperswithcode.com/paper/natural-to-formal-language-generation-using
Repo
Framework
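The TP-N2F abstract above hinges on TPR ‘binding’ and ‘unbinding’. The snippet below is a minimal NumPy sketch of that mechanism in isolation: fillers (symbols) are bound to orthonormal roles via outer products, summed into one tensor, and recovered by contraction. The dimensions and the orthonormal-role assumption are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_filler, n_slots = 8, 3

fillers = rng.normal(size=(n_slots, d_filler))                 # symbol (filler) embeddings
roles = np.linalg.qr(rng.normal(size=(4, 4)))[0][:n_slots]     # illustrative orthonormal role vectors

# Binding: the sum of outer products filler_i (x) role_i gives one order-2 TPR tensor.
T = sum(np.outer(f, r) for f, r in zip(fillers, roles))

# Unbinding: contracting with a role's dual vector (here the role itself, since roles
# are orthonormal) recovers the filler bound to that role.
recovered = T @ roles[1]
print(np.allclose(recovered, fillers[1]))   # True
```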

Towards trustworthy predictions from deep neural networks with fast adversarial calibration

Title Towards trustworthy predictions from deep neural networks with fast adversarial calibration
Authors Anonymous
Abstract To facilitate widespread acceptance of AI systems guiding decision making in real-world applications, trustworthiness of deployed models is key. That is, it is crucial for predictive models to be uncertainty-aware and to yield well-calibrated (and thus trustworthy) predictions for in-domain samples as well as under domain shift. Recent efforts to account for predictive uncertainty include post-processing steps for trained neural networks, Bayesian neural networks, and alternative non-Bayesian approaches such as ensembles and evidential deep learning. Here, we propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift. We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions for a wide range of perturbations. We comprehensively evaluate previously proposed approaches on different data modalities, a large range of data sets, network architectures and perturbation strategies, and observe that our modelling approach substantially outperforms existing state-of-the-art approaches, yielding well-calibrated predictions for both in-domain and out-of-domain samples.
Tasks Calibration, Decision Making
Published 2020-01-01
URL https://openreview.net/forum?id=rygePJHYPH
PDF https://openreview.net/pdf?id=rygePJHYPH
PWC https://paperswithcode.com/paper/towards-trustworthy-predictions-from-deep
Repo
Framework
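As a rough illustration of combining an entropy-encouraging term with an adversarial calibration term, the following PyTorch training-step sketch uses a generic FGSM-style perturbation and a confidence-accuracy gap as the calibration surrogate. These specific loss choices and the weights lambda_ent and lambda_adv are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def training_step(model, x, y, eps=0.05, lambda_ent=0.1, lambda_adv=1.0):
    # Note: loss terms and hyper-parameters are illustrative assumptions.
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # Entropy-encouraging term: penalize over-confident predictive distributions.
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()

    # Adversarial calibration surrogate: on an FGSM-style perturbation of the batch,
    # penalize the squared gap between confidence and accuracy.
    x_adv = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
    probs_pert = model(x + eps * grad.sign()).softmax(dim=-1)
    conf, pred = probs_pert.max(dim=-1)
    calib_penalty = (conf - (pred == y).float()).pow(2).mean()

    return ce - lambda_ent * entropy + lambda_adv * calib_penalty
```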

Deep Randomized Least Squares Value Iteration

Title Deep Randomized Least Squares Value Iteration
Authors Guy Adam, Tom Zahavy, Oron Anschel, Nahum Shimkin
Abstract Exploration while learning representations is one of the main challenges Deep Reinforcement Learning (DRL) faces today. Since the learned representation depends on the observed data, the exploration strategy plays a crucial role. The popular DQN algorithm has significantly improved the ability of Reinforcement Learning (RL) algorithms to learn state representations from raw data, yet it uses a naive exploration strategy that is statistically inefficient. The Randomized Least Squares Value Iteration (RLSVI) algorithm (Osband et al., 2016), on the other hand, explores and generalizes efficiently via linearly parameterized value functions. However, it relies on hand-designed state representations that require prior engineering work for every environment. In this paper, we propose a deep learning adaptation of RLSVI. Rather than using a hand-designed state representation, we use a state representation learned directly from the data by a DQN agent. Because the representation is optimized during the learning process, a key component of the proposed method is a likelihood-matching mechanism that adapts to the changing representation. We demonstrate the importance of the various properties of our algorithm on a toy problem and show that our method outperforms DQN on five Atari benchmarks, reaching results competitive with the Rainbow algorithm.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Syetja4KPH
PDF https://openreview.net/pdf?id=Syetja4KPH
PWC https://paperswithcode.com/paper/deep-randomized-least-squares-value-iteration
Repo
Framework
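The core RLSVI idea, exploration through posterior sampling over a linear value function on top of (here fixed) features, can be sketched with a Bayesian linear regression. The feature dimensions, noise variance, and per-action features below are made up, and the paper's likelihood-matching mechanism is not shown.

```python
import numpy as np

def rlsvi_sample_weights(Phi, targets, sigma2=1.0, prior_var=1.0, rng=None):
    """Sample value-function weights from the Bayesian linear-regression posterior.

    Phi: (n, d) state(-action) features, targets: (n,) regression targets.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = Phi.shape[1]
    precision = Phi.T @ Phi / sigma2 + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ targets / sigma2
    return rng.multivariate_normal(mean, cov)      # one posterior sample

# Exploration comes from acting greedily w.r.t. the sampled weights (toy data below).
rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 16))
targets = Phi @ rng.normal(size=16) + 0.1 * rng.normal(size=100)
w = rlsvi_sample_weights(Phi, targets, rng=rng)
action_features = rng.normal(size=(4, 16))         # hypothetical per-action features
print(int(np.argmax(action_features @ w)))
```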

Language-independent Cross-lingual Contextual Representations

Title Language-independent Cross-lingual Contextual Representations
Authors Anonymous
Abstract Contextual representation models like BERT have achieved state-of-the-art performance on a diverse range of NLP tasks. We propose a cross-lingual contextual representation model that generates language-independent contextual representations. This helps enable zero-shot cross-lingual transfer of a wide range of NLP models built on top of contextual representation models like BERT. We provide a formulation of language-independent cross-lingual contextual representation based on mono-lingual representations. Our formulation takes three steps to align sequences of vectors: transform, extract, and reorder. We present a detailed discussion of the process of learning cross-lingual contextual representations, as well as of its performance in cross-lingual transfer learning and the implications thereof.
Tasks Cross-Lingual Transfer, Transfer Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HylvleBtPB
PDF https://openreview.net/pdf?id=HylvleBtPB
PWC https://paperswithcode.com/paper/language-independent-cross-lingual-contextual
Repo
Framework
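The abstract's "transform" step aligns vector spaces across languages. One standard way to perform such an alignment, shown purely as an illustration and not necessarily the paper's method, is the orthogonal Procrustes solution between row-aligned monolingual vectors:

```python
import numpy as np

def procrustes_transform(X_src, Y_tgt):
    """Orthogonal W minimizing ||X_src @ W.T - Y_tgt||_F for row-aligned vectors."""
    U, _, Vt = np.linalg.svd(Y_tgt.T @ X_src)
    return U @ Vt

# Toy check on noiseless data generated from a known rotation (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))
true_W = np.linalg.qr(rng.normal(size=(32, 32)))[0]
W = procrustes_transform(X, X @ true_W.T)
print(np.allclose(W, true_W))   # True: the rotation is recovered
```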

Safe Policy Learning for Continuous Control

Title Safe Policy Learning for Continuous Control
Authors Anonymous
Abstract We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that keep the agent in desirable situations, both during training and at convergence. We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve them. Our algorithms can use any standard policy gradient (PG) method, such as deep deterministic policy gradient (DDPG) or proximal policy optimization (PPO), to train a neural network policy, while guaranteeing near-constraint satisfaction for every policy update by projecting either the policy parameter or the selected action onto the set of feasible solutions induced by the state-dependent linearized Lyapunov constraints. Compared to the existing constrained PG algorithms, ours are more data efficient as they are able to utilize both on-policy and off-policy data. Moreover, our action-projection algorithm often leads to less conservative policy updates and allows for natural integration into an end-to-end PG training pipeline. We evaluate our algorithms and compare them with the state-of-the-art baselines on several simulated (MuJoCo) tasks, as well as a real-world robot obstacle-avoidance problem, demonstrating their effectiveness in terms of balancing performance and constraint satisfaction.
Tasks Continuous Control
Published 2020-01-01
URL https://openreview.net/forum?id=HkxeThNFPH
PDF https://openreview.net/pdf?id=HkxeThNFPH
PWC https://paperswithcode.com/paper/safe-policy-learning-for-continuous-control
Repo
Framework
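The action-projection step can be illustrated with a single linearized constraint g·a <= c, for which the Euclidean projection onto the feasible half-space has a closed form; the actual state-dependent Lyapunov constraints in the paper are more involved than this toy case.

```python
import numpy as np

def project_action(a, g, c):
    """Project action a onto the half-space {a : g @ a <= c} with minimal L2 change."""
    violation = g @ a - c
    if violation <= 0.0:
        return a                               # already feasible: keep the proposed action
    return a - violation * g / (g @ g)         # move along g just enough to reach g @ a = c

# Toy numbers (illustrative): the proposed action violates the constraint by 1.
print(project_action(np.array([1.0, 2.0]), g=np.array([1.0, 1.0]), c=2.0))   # [0.5 1.5]
```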

Lattice Representation Learning

Title Lattice Representation Learning
Authors Anonymous
Abstract We introduce the notion of lattice representation learning, in which the representation for some object of interest (e.g. a sentence or an image) is a lattice point in a Euclidean space. Our main contribution is a result for replacing an objective function that employs lattice quantization with an objective function in which quantization is absent, thus allowing optimization techniques based on gradient descent to apply; we call the resulting algorithms dithered stochastic gradient descent algorithms, as they are designed explicitly to allow for an optimization procedure where only local information is employed. We also argue that a technique commonly used in Variational Auto-Encoders (Gaussian priors and Gaussian approximate posteriors) is tightly connected with the idea of lattice representations, as the quantization error in good high-dimensional lattices can be modeled as a Gaussian distribution. We use a traditional encoder/decoder architecture to explore the idea of lattice-valued representations, and provide experimental evidence of the potential of using lattice representations by modifying the OpenNMT-py generic seq2seq architecture so that it can implement not only Gaussian dithering of representations, but also the well-known straight-through estimator and its application to vector quantization.
Tasks Quantization, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rJlwAa4YwS
PDF https://openreview.net/pdf?id=rJlwAa4YwS
PWC https://paperswithcode.com/paper/lattice-representation-learning
Repo
Framework
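Two generic ingredients mentioned in the abstract, dithering and the straight-through estimator, can be sketched for a scalar (integer-lattice) quantizer as follows; the paper's lattice formulation is more general than this toy case.

```python
import torch

def dithered_quantize(z, training=True):
    # Training: replace rounding by additive uniform noise over the quantization cell.
    if training:
        return z + (torch.rand_like(z) - 0.5)
    return torch.round(z)

def straight_through_quantize(z):
    # Forward: round(z); backward: identity gradient (straight-through estimator).
    return z + (torch.round(z) - z).detach()

z = torch.tensor([0.2, 1.7, -0.6], requires_grad=True)
straight_through_quantize(z).sum().backward()
print(z.grad)   # tensor([1., 1., 1.]): the gradient passed straight through the rounding
```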

Meta Learning via Learned Loss

Title Meta Learning via Learned Loss
Authors Anonymous
Abstract We present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures. We develop a pipeline for training such loss functions, targeted at maximizing the performance of model learning with them. We observe that the loss landscape produced by our learned losses significantly improves upon the original task-specific losses in both supervised and reinforcement learning tasks. Furthermore, we show that our meta-learning framework is flexible enough to incorporate additional information at meta-train time. This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time.
Tasks Meta-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=ryesZANKPB
PDF https://openreview.net/pdf?id=ryesZANKPB
PWC https://paperswithcode.com/paper/meta-learning-via-learned-loss-1
Repo
Framework
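A bare-bones sketch of the learned-loss idea: one differentiable inner update uses the parametric loss, and the outer update trains the loss parameters so that the post-update model does well under the true task loss. The tiny linear model, the loss-network architecture, and the optimizers here are placeholders, not the paper's setup.

```python
import torch

torch.manual_seed(0)
# Hypothetical learned loss: a small MLP over (prediction, target) pairs.
loss_net = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
meta_opt = torch.optim.Adam(loss_net.parameters(), lr=1e-3)

x, y = torch.randn(64, 3), torch.randn(64, 1)
for _ in range(100):
    w = torch.zeros(3, 1, requires_grad=True)                  # fresh inner (linear) model
    learned_loss = loss_net(torch.cat([x @ w, y], dim=-1)).mean()
    grad_w, = torch.autograd.grad(learned_loss, w, create_graph=True)
    w_new = w - 0.1 * grad_w                                    # one differentiable inner step
    task_loss = ((x @ w_new - y) ** 2).mean()                   # true task loss after the step
    meta_opt.zero_grad()
    task_loss.backward()                                        # meta-gradient flows into loss_net
    meta_opt.step()
```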

Efficient Exploration via State Marginal Matching

Title Efficient Exploration via State Marginal Matching
Authors Anonymous
Abstract Reinforcement learning agents need to explore their unknown environments to solve the tasks given to them. The Bayes optimal solution to exploration is intractable for complex environments, and while several exploration methods have been proposed as approximations, it remains unclear what underlying objective is being optimized by existing exploration methods, or how they can be altered to incorporate prior knowledge about the task. Moreover, it is unclear how to acquire a single exploration strategy that will be useful for solving multiple downstream tasks. We address these shortcomings by learning a single exploration policy that can quickly solve a suite of downstream tasks in a multi-task setting, amortizing the cost of learning to explore. We recast exploration as a problem of State Marginal Matching (SMM), where we aim to learn a policy for which the state marginal distribution matches a given target state distribution, which can incorporate prior knowledge about the task. We optimize the objective by reducing it to a two-player, zero-sum game between a state density model and a parametric policy. Our theoretical analysis of this approach suggests that prior exploration methods do not learn a policy that does distribution matching, but acquire a replay buffer that performs distribution matching, an observation that potentially explains these prior methods’ success in single-task settings. On both simulated and real-world tasks, we demonstrate that our algorithm explores faster and adapts more quickly than prior methods.
Tasks Efficient Exploration
Published 2020-01-01
URL https://openreview.net/forum?id=Hkla1eHFvS
PDF https://openreview.net/pdf?id=Hkla1eHFvS
PWC https://paperswithcode.com/paper/efficient-exploration-via-state-marginal-1
Repo
Framework
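State marginal matching is commonly reduced to an intrinsic reward of roughly log p_target(s) - log rho_pi(s), where rho_pi is estimated by a density model over visited states. The sketch below uses a Gaussian KDE as a stand-in density model; the paper's two-player game formulation and actual density model may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde

def smm_reward(states_visited, s, log_p_target):
    """Intrinsic reward log p_target(s) - log rho_pi(s); rho_pi is a KDE stand-in."""
    rho = gaussian_kde(states_visited.T)        # density model over visited states
    return log_p_target(s) - np.log(rho(s)[0] + 1e-12)

# Toy data: the policy has only visited states near the origin, so a far-out state that
# the (broader) target distribution still covers gets a high exploration reward.
rng = np.random.default_rng(0)
visited = rng.normal(scale=0.5, size=(500, 2))
log_p_target = lambda s: -0.5 * np.sum(s ** 2)  # unnormalized log-density of a unit Gaussian
print(smm_reward(visited, np.array([1.5, 1.5]), log_p_target))
```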

Learning Compact Embedding Layers via Differentiable Product Quantization

Title Learning Compact Embedding Layers via Differentiable Product Quantization
Authors Anonymous
Abstract Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings. Despite their effectiveness, the number of parameters in an embedding layer increases linearly with the number of symbols and poses a critical challenge on memory and storage constraints. In this work, we propose a generic and end-to-end learnable compression framework termed differentiable product quantization (DPQ). We present two instantiations of DPQ that leverage different approximation techniques to enable differentiability in end-to-end learning. Our method can readily serve as a drop-in alternative for any existing embedding layer. Empirically, DPQ offers significant compression ratios (14-238x) at negligible or no performance cost on 10 datasets across three different language tasks.
Tasks Quantization
Published 2020-01-01
URL https://openreview.net/forum?id=BJxbOlSKPr
PDF https://openreview.net/pdf?id=BJxbOlSKPr
PWC https://paperswithcode.com/paper/learning-compact-embedding-layers-via
Repo
Framework
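A simplified sketch of product quantization made differentiable with a softmax relaxation: each embedding is split into groups, each group is softly assigned to a small codebook, and the quantized embedding is the concatenation of the resulting codewords. The codebook sizes, temperature, and module layout are illustrative assumptions rather than the paper's exact DPQ instantiations.

```python
import torch

class SoftProductQuantizedEmbedding(torch.nn.Module):
    # Hypothetical module: sizes and the softmax relaxation are illustrative choices.
    def __init__(self, vocab, dim, groups=4, codes=16, tau=1.0):
        super().__init__()
        assert dim % groups == 0
        self.groups, self.sub, self.tau = groups, dim // groups, tau
        self.query = torch.nn.Embedding(vocab, dim)                    # per-token query vectors
        self.codebooks = torch.nn.Parameter(torch.randn(groups, codes, self.sub))

    def forward(self, ids):
        q = self.query(ids).view(*ids.shape, self.groups, self.sub)
        scores = torch.einsum('...gs,gks->...gk', q, self.codebooks)   # similarity to codewords
        probs = torch.softmax(scores / self.tau, dim=-1)               # soft, differentiable assignment
        quantized = torch.einsum('...gk,gks->...gs', probs, self.codebooks)
        return quantized.reshape(*ids.shape, -1)

emb = SoftProductQuantizedEmbedding(vocab=1000, dim=64)
print(emb(torch.randint(0, 1000, (2, 5))).shape)    # torch.Size([2, 5, 64])
```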

{COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery

Title {COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery
Authors Anonymous
Abstract We release the largest public ECG dataset of continuous raw signals for representation learning, containing over 11k patients and 2 billion labelled beats. Our goal is to enable the development of semi-supervised ECG models as well as the discovery of unknown arrhythmia subtypes and anomalous ECG signal events. To this end, we propose an unsupervised representation learning task, evaluated in a semi-supervised fashion. We provide a set of baselines for different feature extractors that can be built upon. Additionally, we perform qualitative evaluations on results from PCA embeddings, where we identify some clustering of known subtypes, indicating the potential for representation learning in arrhythmia subtype discovery.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=BkgqL0EtPH
PDF https://openreview.net/pdf?id=BkgqL0EtPH
PWC https://paperswithcode.com/paper/companyname11k-an-unsupervised-representation
Repo
Framework

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

Title CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
Authors Anonymous
Abstract A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others’ success, and credit-assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The complete architecture, called CM3, learns significantly faster than direct adaptations of existing algorithms on three challenging multi-goal multi-agent problems: cooperative navigation in difficult formations, negotiating multi-vehicle lane changes in the SUMO traffic simulator, and strategic cooperation in a Checkers environment.
Tasks Efficient Exploration, Multi-agent Reinforcement Learning
Published 2020-01-01
URL https://openreview.net/forum?id=S1lEX04tPr
PDF https://openreview.net/pdf?id=S1lEX04tPr
PWC https://paperswithcode.com/paper/cm3-cooperative-multi-goal-multi-stage-multi-1
Repo
Framework

Learning World Graph Decompositions To Accelerate Reinforcement Learning

Title Learning World Graph Decompositions To Accelerate Reinforcement Learning
Authors Anonymous
Abstract Efficiently learning to solve tasks in complex environments is a key challenge for reinforcement learning (RL) agents. We propose to decompose a complex environment using a task-agnostic world graph, an abstraction that accelerates learning by enabling agents to focus exploration on a subspace of the environment. The nodes of a world graph are important waypoint states, and edges represent feasible traversals between them. Our framework has two learning phases: 1) identifying world graph nodes and edges by training a binary recurrent variational auto-encoder (VAE) on trajectory data, and 2) training a hierarchical RL agent that leverages structural and connectivity knowledge from the learned world graph to bias exploration towards task-relevant waypoints and regions. We show that our approach significantly accelerates RL on a suite of challenging 2D grid world tasks: compared to baselines, world graph integration doubles achieved rewards on simpler tasks, e.g. MultiGoal, and manages to solve more challenging tasks, e.g. Door-Key, where baselines fail.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BkgRe1SFDS
PDF https://openreview.net/pdf?id=BkgRe1SFDS
PWC https://paperswithcode.com/paper/learning-world-graph-decompositions-to
Repo
Framework
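A toy picture of the world-graph abstraction: waypoint states as nodes, feasible traversals as edges, and a high-level route computed over the graph (here with breadth-first search) before a low-level policy executes each hop. The waypoints below are invented; in the paper they are discovered from trajectory data with a recurrent VAE.

```python
from collections import deque

world_graph = {                       # adjacency list over hypothetical waypoint states
    "start": ["hall"],
    "hall": ["start", "key_room", "door"],
    "key_room": ["hall"],
    "door": ["hall", "goal"],
    "goal": ["door"],
}

def waypoint_plan(graph, source, target):
    """Shortest waypoint-to-waypoint route, to be handed to a low-level policy."""
    queue, parent = deque([source]), {source: None}
    while queue:
        node = queue.popleft()
        if node == target:
            path = [node]
            while parent[node] is not None:
                node = parent[node]
                path.append(node)
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

print(waypoint_plan(world_graph, "start", "goal"))   # ['start', 'hall', 'door', 'goal']
```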

Group-Transformer: Towards A Lightweight Character-level Language Model

Title Group-Transformer: Towards A Lightweight Character-level Language Model
Authors Anonymous
Abstract Character-level language modeling is an essential but challenging task in Natural Language Processing. Prior works have focused on identifying long-term dependencies between characters and have built deeper and wider networks for better performance. However, their models require substantial computational resources, which hinders the usability of character-level language models in applications with limited resources. In this paper, we propose a lightweight model, called Group-Transformer, that reduces the resource requirements for a Transformer, a promising method for modeling sequences with long-term dependencies. Specifically, the proposed method partitions linear operations to reduce the number of parameters and computational cost. As a result, Group-Transformer only uses 18.2% of the parameters of the best performing LSTM-based model, while providing better performance on two benchmark tasks, enwik8 and text8. When compared to Transformers with a comparable number of parameters and time complexity, the proposed model shows better performance. The implementation code will be available.
Tasks Language Modelling
Published 2020-01-01
URL https://openreview.net/forum?id=rkxdexBYPB
PDF https://openreview.net/pdf?id=rkxdexBYPB
PWC https://paperswithcode.com/paper/group-transformer-towards-a-lightweight
Repo
Framework
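The parameter saving from partitioning linear operations can be seen with a grouped linear layer: splitting the feature dimension into G groups, each with its own small weight matrix, cuts parameters roughly by a factor of G relative to a dense layer. How Group-Transformer wires such layers into its attention and feed-forward blocks is not shown here.

```python
import torch

class GroupLinear(torch.nn.Module):
    # Illustrative grouped linear layer, not the paper's exact module.
    def __init__(self, dim, groups):
        super().__init__()
        assert dim % groups == 0
        self.groups, self.sub = groups, dim // groups
        self.weight = torch.nn.Parameter(torch.randn(groups, self.sub, self.sub) / self.sub ** 0.5)

    def forward(self, x):                       # x: (..., dim)
        xg = x.view(*x.shape[:-1], self.groups, self.sub)
        yg = torch.einsum('...gi,gio->...go', xg, self.weight)
        return yg.reshape(*x.shape)

dense = torch.nn.Linear(512, 512, bias=False)
grouped = GroupLinear(512, groups=4)
print(sum(p.numel() for p in dense.parameters()),      # 262144
      sum(p.numel() for p in grouped.parameters()))    # 65536: 4x fewer parameters
```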

Learning DNA folding patterns with Recurrent Neural Networks

Title Learning DNA folding patterns with Recurrent Neural Networks
Authors Anonymous
Abstract The recent expansion of machine learning applications into molecular biology has made a significant contribution to our understanding of biological systems, and of genome functioning in particular. Technological advances have enabled the collection of large epigenetic datasets, including information about various DNA binding factors (ChIP-Seq) and DNA spatial structure (Hi-C). Several studies have confirmed the correlation between DNA binding factors and Topologically Associating Domains (TADs) in DNA structure. However, the information about physical proximity represented by genomic coordinates has not yet been used to improve the prediction models. In this research, we focus on machine learning methods for predicting DNA folding patterns in the classical model organism Drosophila melanogaster. The paper considers linear models with four types of regularization, gradient boosting, and recurrent neural networks for the prediction of chromatin folding patterns from epigenetic marks. The bidirectional LSTM model outperformed all other models and achieved the best prediction scores. This demonstrates the value of complex models and the importance of memory over sequential DNA states for chromatin folding. We identify informative epigenetic features, which leads to further conclusions about their biological significance.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Bkel6ertwS
PDF https://openreview.net/pdf?id=Bkel6ertwS
PWC https://paperswithcode.com/paper/learning-dna-folding-patterns-with-recurrent
Repo
Framework
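A minimal bidirectional-LSTM sketch for this kind of per-bin sequence labeling: each genomic bin carries a vector of epigenetic marks and the model outputs a folding-related score per bin. The input dimensions, number of marks, and output target are illustrative; the paper's preprocessing and exact targets are not reproduced.

```python
import torch

class ChromatinBiLSTM(torch.nn.Module):
    # Hypothetical architecture: sizes and the scalar per-bin output are assumptions.
    def __init__(self, n_marks=10, hidden=64):
        super().__init__()
        self.rnn = torch.nn.LSTM(n_marks, hidden, batch_first=True, bidirectional=True)
        self.head = torch.nn.Linear(2 * hidden, 1)

    def forward(self, x):                      # x: (batch, bins, n_marks)
        out, _ = self.rnn(x)
        return self.head(out).squeeze(-1)      # per-bin folding score: (batch, bins)

x = torch.randn(8, 200, 10)                    # 8 regions, 200 bins, 10 epigenetic marks
print(ChromatinBiLSTM()(x).shape)              # torch.Size([8, 200])
```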

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Title TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing
Authors Anonymous
Abstract Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing programs. Being able to process programming language code automatically and provide summaries of code functionality accurately can significantly help developers to reduce time spent in code navigation and understanding, and thus increase productivity. Different from natural language articles, source code in programming languages often follows rigid syntactical structures and there can exist dependencies among code elements that are located far away from each other through complex control flows and data flows. Existing studies on tree-based convolutional neural networks (TBCNN) and gated graph neural networks (GGNN) are not able to capture essential semantic dependencies among code elements accurately. In this paper, we propose novel tree-based capsule networks (TreeCaps) and relevant techniques for processing program code in an automated way that encodes code syntactical structures and captures code dependencies more accurately. Based on evaluation on programs written in different programming languages, we show that our TreeCaps-based approach can outperform other approaches in classifying the functionalities of many programs.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SJgXs1HtwH
PDF https://openreview.net/pdf?id=SJgXs1HtwH
PWC https://paperswithcode.com/paper/treecaps-tree-structured-capsule-networks-for
Repo
Framework