[PAPER]@Telematika | Policy Gradient Methods

Policy Gradient Methods

Paper Group ANR 64

April 3, 2020

Regularization Helps with Mitigating Poisoning Attacks: Distributionally-Robust Machine Learning Using the Wasserstein Distance. 3D Gated Recurrent Fusion for Semantic Scene Completion. Learning Queuing Networks by Recurrent Neural Networks. 3D Deep Learning on Medical Images: A Review. SHX: Search History Driven Crossover for Real-Coded Genetic Al …

Paper Group AWR 28

April 3, 2020

Using Reinforcement Learning in the Algorithmic Trading Problem. Hyperspectral Classification Based on 3D Asymmetric Inception Network with Data Fusion Transfer Learning. Variational Inference with Vine Copulas: An efficient Approach for Bayesian Computer Model Calibration. Ensemble neural network forecasts with singular value decomposition. Error …

Paper Group ANR 131

April 2, 2020

FlexiBO: Cost-Aware Multi-Objective Optimization of Deep Neural Networks. Joint Event Extraction along Shortest Dependency Paths using Graph Convolutional Networks. Optimizing Revenue while showing Relevant Assortments at Scale. Classification of Hyperspectral and LiDAR Data Using Coupled CNNs. Differential Dynamic Programming Neural Optimizer. Ill …

Paper Group ANR 168

April 2, 2020

A Multilingual View of Unsupervised Machine Translation. Lost in Embedding Space: Explaining Cross-Lingual Task Performance with Eigenvalue Divergence. Suphx: Mastering Mahjong with Deep Reinforcement Learning. Equivalence of Dataflow Graphs via Rewrite Rules Using a Graph-to-Sequence Neural Model. Segmentation of Cellular Patterns in Confocal Imag …

Paper Group ANR 318

April 2, 2020

Limits of Detecting Text Generated by Large-Scale Language Models. Multimodal Matching Transformer for Live Commenting. Exploiting Database Management Systems and Treewidth for Counting. Learning Contact-Rich Manipulation Tasks with Rigid Position-Controlled Robots: Learning to Force Control. Citation Text Generation. Ten Research Challenge Areas i …

Paper Group ANR 412

April 1, 2020

Ising-based Consensus Clustering on Specialized Hardware. Restore from Restored: Single Image Denoising with Pseudo Clean Image. Advaita: Bug Duplicity Detection System. Algebraic and Analytic Approaches for Parameter Learning in Mixture Models. Beyond without Forgetting: Multi-Task Learning for Classification with Disjoint Datasets. A machine lear …

Paper Group NANR 101

April 1, 2020

Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling. Discovering Motor Programs by Recomposing Demonstrations. Representing Unordered Data Using Multiset Automata and Complex Numbers. Evaluating Lossy Compression Rates of Deep Generative Models. Attentive Sequential Neural Processes. Controlling generative models with continuou …

Paper Group NANR 126

April 1, 2020

Guided Adaptive Credit Assignment for Sample Efficient Policy Optimization. Reasoning-Aware Graph Convolutional Network for Visual Question Answering. Attention on Abstract Visual Reasoning. CLEVRER: Collision Events for Video Representation and Reasoning. Deep End-to-end Unsupervised Anomaly Detection. Attentive Weights Generation for Few Shot Lea …

Paper Group NANR 133

April 1, 2020

Winning the Lottery with Continuous Sparsification. Model Architecture Controls Gradient Descent Dynamics: A Combinatorial Path-Based Formula. Reweighted Proximal Pruning for Large-Scale Language Representation. Distilled embedding: non-linear embedding factorization using knowledge distillation. Soft Token Matching for Interpretable Low-Resource C …

Paper Group NANR 138

April 1, 2020

Generalized Bayesian Posterior Expectation Distillation for Deep Neural Networks. AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING. Sample Efficient Policy Gradient Methods with Recursive Variance Reduction. Topological Autoencoders. Policy Optimization with Stochastic Mirror Descent. Generative Latent Flow. Neural Phrase-to-P …

Paper Group NANR 39

April 1, 2020

Projected Canonical Decomposition for Knowledge Base Completion. Learning a Spatio-Temporal Embedding for Video Instance Segmentation. A Stochastic Derivative Free Optimization Method with Momentum. Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks. Non-linear System Identification from Partial Observations via Iterative Smo …

Paper Group NANR 54

April 1, 2020

Theory and Evaluation Metrics for Learning Disentangled Representations. LSTOD: Latent Spatial-Temporal Origin-Destination prediction model and its applications in ride-sharing platforms. TransINT: Embedding Implication Rules in Knowledge Graphs with Isomorphic Intersections of Linear Subspaces. Unsupervised Model Selection for Variational Disentan …

Paper Group NANR 90

April 1, 2020

SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning. Transferable Perturbations of Deep Feature Distributions. Mutual Information Maximization for Robust Plannable Representations. Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation. Non-Sequential Melody Generation. Smart Ternary Quantizatio …

Paper Group NANR 93

April 1, 2020

PNAT: Non-autoregressive Transformer by Position Learning. Diagnosing the Environment Bias in Vision-and-Language Navigation. Differentially Private Meta-Learning. Learning Underlying Physical Properties From Observations For Trajectory Prediction. Universal Approximation with Certified Networks. Neural Policy Gradient Methods: Global Optimality an …

Paper Group NAWR 3

April 1, 2020

Why ADAM Beats SGD for Attention Models. Hindsight Trust Region Policy Optimization. Decentralized Distributed PPO: Mastering PointGoal Navigation. ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning. Bootstrapping the Expressivity with Model-based Planning. Automated Relational Meta-learning. Empirical Bayes Transductive Meta-Learn …