February 1, 2020

3142 words 15 mins read

Paper Group AWR 182

Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

Title Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
Authors Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang
Abstract Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle these problems, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing on the idea underlying regularized Anderson acceleration (RAA), an effective approach to accelerating the solution of fixed-point problems with perturbations. Specifically, we first explain how Anderson acceleration can be applied directly to policy iteration. Then we extend RAA to the case of deep RL by introducing a regularization term to control the impact of perturbations induced by function approximation errors. We further propose two strategies, i.e., progressive update and adaptive restart, to enhance the performance. The effectiveness of our method is evaluated on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and final performance of state-of-the-art deep RL algorithms.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.03245v2
PDF https://arxiv.org/pdf/1909.03245v2.pdf
PWC https://paperswithcode.com/paper/regularized-anderson-acceleration-for-off
Repo https://github.com/shiwj16/raa-drl
Framework pytorch
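
As a rough illustration of the numerical idea behind the paper (not its RL algorithm), here is a minimal NumPy sketch of a regularized Anderson update on a toy fixed-point problem; the window size, regularization weight, and the toy problem x = cos(x) are assumptions for demonstration.

```python
import numpy as np

def regularized_anderson_step(g, xs, lam=1e-8):
    """One regularized Anderson step over the window of past iterates xs.

    With residuals f_i = g(x_i) - x_i stacked into F, the weights alpha
    minimize ||F @ alpha||^2 + lam * ||alpha||^2 subject to sum(alpha) = 1.
    """
    gs = [g(x) for x in xs]
    F = np.stack([gi - xi for gi, xi in zip(gs, xs)], axis=1)
    m = F.shape[1]
    A = F.T @ F + lam * np.eye(m)        # regularization controls perturbations
    w = np.linalg.solve(A, np.ones(m))   # closed-form constrained minimizer
    alpha = w / w.sum()                  # enforce the sum-to-one constraint
    return sum(a * gi for a, gi in zip(alpha, gs))

# Toy fixed-point problem x = cos(x), solved with a window of 5 past iterates.
history = [np.zeros(3)]
for _ in range(15):
    history.append(regularized_anderson_step(np.cos, history[-5:]))
print(history[-1])  # every coordinate approaches the fixed point ~0.739
```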

Kernel Mean Matching for Content Addressability of GANs

Title Kernel Mean Matching for Content Addressability of GANs
Authors Wittawat Jitkrittum, Patsorn Sangkloy, Muhammad Waleed Gondal, Amit Raj, James Hays, Bernhard Schölkopf
Abstract We propose a novel procedure which adds “content-addressability” to any given unconditional implicit model, e.g., a generative adversarial network (GAN). The procedure allows users to control the generative process by specifying a set (of arbitrary size) of desired examples, based on which similar samples are generated from the model. The proposed approach, based on kernel mean matching, is applicable to any generative model that transforms latent vectors into samples, and does not require retraining of the model. Experiments on various high-dimensional image generation problems (CelebA-HQ; LSUN bedroom, bridge, tower) show that our approach is able to generate images consistent with the input set, while retaining the image quality of the original model. To our knowledge, this is the first work that attempts to construct, at test time, a content-addressable generative model from a trained marginal model.
Tasks Image Generation
Published 2019-05-14
URL https://arxiv.org/abs/1905.05882v1
PDF https://arxiv.org/pdf/1905.05882v1.pdf
PWC https://paperswithcode.com/paper/kernel-mean-matching-for-content
Repo https://github.com/wittawatj/cadgan
Framework pytorch
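
Kernel mean matching here means optimizing latent codes so that the kernel mean embedding of generated samples matches that of the user's examples, with the model itself frozen. A minimal sketch follows, using tiny linear stand-ins for the pretrained generator and feature extractor (assumptions for self-containment) and a Gaussian-kernel MMD loss.

```python
import torch
import torch.nn as nn

def mmd2(x, y, bandwidth=1.0):
    """Squared MMD between two feature sets under a Gaussian kernel."""
    k = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Stand-ins for a pretrained generator and feature extractor (assumptions).
generator = nn.Sequential(nn.Linear(16, 64), nn.Tanh())
extractor = nn.Linear(64, 32)
targets = torch.randn(8, 64)                 # the user-supplied example set

z = torch.randn(4, 16, requires_grad=True)   # only the latents are optimized
opt = torch.optim.Adam([z], lr=0.05)
with torch.no_grad():
    t_feat = extractor(targets)
for _ in range(200):
    loss = mmd2(extractor(generator(z)), t_feat)
    opt.zero_grad()
    loss.backward()
    opt.step()
samples = generator(z).detach()  # samples matching the targets' mean embedding
```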

The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development

Title The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development
Authors Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni
Abstract As machine learning is applied more widely, data scientists often struggle to find or create end-to-end machine learning systems for specific tasks. The proliferation of libraries and frameworks and the complexity of the tasks have led to the emergence of “pipeline jungles” - brittle, ad hoc ML systems. To address these problems, we introduce the Machine Learning Bazaar, a new approach to developing machine learning and automated machine learning software systems. First, we introduce ML primitives, a unified API and specification for data processing and ML components from different software libraries. Next, we compose primitives into usable ML pipelines, abstracting away glue code, data flow, and data storage. We further pair these pipelines with a hierarchy of AutoML strategies - Bayesian optimization and bandit learning. We use these components to create a general-purpose, multi-task, end-to-end AutoML system that provides solutions for a variety of data modalities (image, text, graph, tabular, relational, etc.) and problem types (classification, regression, anomaly detection, graph matching, etc.). We present an evaluation suite of 456 real-world ML tasks and describe the characteristics of 2.5 million pipelines searched over this task suite. Finally, we demonstrate 5 real-world use cases and 2 case studies of our approach.
Tasks Anomaly Detection, AutoML, Graph Matching
Published 2019-05-22
URL https://arxiv.org/abs/1905.08942v3
PDF https://arxiv.org/pdf/1905.08942v3.pdf
PWC https://paperswithcode.com/paper/the-machine-learning-bazaar-harnessing-the-ml
Repo https://github.com/HDI-Project/MLBlocks
Framework none
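
To make the primitive/pipeline idea concrete without guessing the MLBlocks API, here is a toy spec-driven pipeline runner over scikit-learn components; the spec format and helper functions are invented for illustration, not the library's actual annotation format.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# A toy "primitive" spec: each step names a component and its hyperparameters,
# mirroring the unified-annotation idea (not the actual MLBlocks JSON format).
PIPELINE_SPEC = [
    {"class": StandardScaler, "hyperparameters": {}},
    {"class": RandomForestClassifier, "hyperparameters": {"n_estimators": 50}},
]

def fit_pipeline(spec, X, y):
    """Fit each primitive in order, threading the transformed data through."""
    fitted = []
    for block in spec[:-1]:
        step = block["class"](**block["hyperparameters"]).fit(X, y)
        X = step.transform(X)
        fitted.append(step)
    fitted.append(spec[-1]["class"](**spec[-1]["hyperparameters"]).fit(X, y))
    return fitted

def predict(fitted, X):
    for step in fitted[:-1]:
        X = step.transform(X)
    return fitted[-1].predict(X)

X, y = load_iris(return_X_y=True)
print(predict(fit_pipeline(PIPELINE_SPEC, X, y), X[:5]))
```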

RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information

Title RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information
Authors Helmut Mayer
Abstract A core component of all Structure from Motion (SfM) approaches is bundle adjustment. As the latter is a computational bottleneck for larger blocks, parallel bundle adjustment has become an active area of research. Particularly, consensus-based optimization methods have been shown to be suitable for this task. We have extended them using covariance information derived from the adjustment of individual three-dimensional (3D) points, i.e., “triangulation” or “intersection”. This not only leads to much better convergence behavior, but also avoids having to tune the penalty parameter of standard consensus-based approaches. The corresponding novel approach can also be seen as a variant of resection/intersection schemes in which, during intersection, we adjust a number of sub-blocks directly related to the number of threads available on the computer, each containing a fraction of the block’s cameras. We show that our novel approach is suitable for robust parallel bundle adjustment and demonstrate its capabilities in comparison to the basic consensus-based approach as well as a state-of-the-art parallel implementation of bundle adjustment. Code for our novel approach is available on GitHub: https://github.com/helmayer/RPBA
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.08138v1
PDF https://arxiv.org/pdf/1910.08138v1.pdf
PWC https://paperswithcode.com/paper/rpba-robust-parallel-bundle-adjustment-based
Repo https://github.com/helmayer/RPBA
Framework none
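
The covariance idea can be illustrated by how per-sub-block 3D point estimates would be fused: inverse-covariance weighting lets better-constrained estimates dominate, removing the need for a hand-tuned penalty parameter. A minimal sketch of that fusion step (not RPBA's actual update):

```python
import numpy as np

def fuse_point_estimates(estimates, covariances):
    """Fuse per-sub-block 3D point estimates by inverse-covariance weighting."""
    infos = [np.linalg.inv(C) for C in covariances]
    H = sum(infos)                                   # combined information matrix
    b = sum(I @ x for I, x in zip(infos, estimates))
    return np.linalg.solve(H, b), np.linalg.inv(H)   # fused point and covariance

# Two sub-blocks disagree on a point; the better-constrained estimate dominates.
x1, C1 = np.array([1.0, 2.0, 3.0]), np.diag([0.01, 0.01, 0.04])
x2, C2 = np.array([1.1, 2.1, 3.2]), np.diag([0.25, 0.25, 1.00])
x, C = fuse_point_estimates([x1, x2], [C1, C2])
print(x)  # close to x1, which carries far more information
```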

Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception

Title Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception
Authors Braden Hurl, Krzysztof Czarnecki, Steven Waslander
Abstract We introduce the Precise Synthetic Image and LiDAR (PreSIL) dataset for autonomous vehicle perception. Grand Theft Auto V (GTA V), a commercial video game, has a large detailed world with realistic graphics, which provides a diverse data collection environment. Existing works creating synthetic LiDAR data for autonomous driving with GTA V have not released their datasets, rely on an in-game raycasting function which represents people as cylinders, and can fail to capture vehicles past 30 metres. Our work creates a precise LiDAR simulator within GTA V that collides with detailed models for all entities, regardless of their type or position. The PreSIL dataset consists of over 50,000 frames and includes high-definition images with full resolution depth information, semantic segmentation (images), point-wise segmentation (point clouds), and detailed annotations for all vehicles and people. Collecting additional data with our framework is entirely automatic and requires no human annotation of any kind. We demonstrate the effectiveness of our dataset by showing an improvement of up to 5% average precision on the KITTI 3D Object Detection benchmark challenge when state-of-the-art 3D object detection networks are pre-trained with our data. The data and code are available at https://tinyurl.com/y3tb9sxy
Tasks 3D Object Detection, Autonomous Driving, Object Detection, Semantic Segmentation
Published 2019-05-01
URL https://arxiv.org/abs/1905.00160v2
PDF https://arxiv.org/pdf/1905.00160v2.pdf
PWC https://paperswithcode.com/paper/precise-synthetic-image-and-lidar-presil
Repo https://github.com/bradenhurl/DeepGTAV-PreSIL
Framework none
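
Since the dataset is evaluated against KITTI 3D object detection, a consumer of its annotations will typically parse KITTI-format label lines; the sketch below assumes PreSIL follows that convention (an assumption), and the example label line is illustrative.

```python
from dataclasses import dataclass

@dataclass
class Object3DLabel:
    """One KITTI-format object annotation (type, 2-D bbox, 3-D size/pose)."""
    cls: str
    bbox: tuple   # (x1, y1, x2, y2) in image pixels
    dims: tuple   # (height, width, length) in metres
    loc: tuple    # (x, y, z) in camera coordinates
    rot_y: float  # rotation around the camera's vertical axis

def parse_label_line(line):
    f = line.split()  # fields 1-3 (truncation, occlusion, alpha) are skipped here
    return Object3DLabel(f[0], tuple(map(float, f[4:8])),
                         tuple(map(float, f[8:11])),
                         tuple(map(float, f[11:14])), float(f[14]))

label = parse_label_line(
    "Car 0.00 0 -1.58 587.0 173.3 614.1 200.1 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59")
print(label.loc)
```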

Deep Visual Template-Free Form Parsing

Title Deep Visual Template-Free Form Parsing
Authors Brian Davis, Bryan Morse, Scott Cohen, Brian Price, Chris Tensmeyer
Abstract Automatic, template-free extraction of information from form images is challenging due to the variety of form layouts. This is even more challenging for historical forms due to noise and degradation. A crucial part of the extraction process is associating input text with pre-printed labels. We present a learned, template-free solution to detecting pre-printed text and input text/handwriting and predicting pair-wise relationships between them. While previous approaches to this problem have been focused on clean images and clear layouts, we show our approach is effective in the domain of noisy, degraded, and varied form images. We introduce a new dataset of historical form images (late 1800s, early 1900s) for training and validating our approach. Our method uses a convolutional network to detect pre-printed text and input text lines. We pool features from the detection network to classify possible relationships in a language-agnostic way. We show that our proposed pairing method outperforms heuristic rules and that visual features are critical to obtaining high accuracy.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02576v2
PDF https://arxiv.org/pdf/1909.02576v2.pdf
PWC https://paperswithcode.com/paper/deep-visual-template-free-form-parsing
Repo https://github.com/herobd/NAF_dataset
Framework none
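
The pairing step scores candidate (pre-printed label, input text) pairs from pooled detection features. A generic sketch of that idea follows, with random features and a toy MLP scorer standing in for the paper's pooled detection features and classifier (all assumptions).

```python
import torch
import torch.nn as nn

# Toy pairwise scorer: concatenated features of a (pre-printed, input) candidate
# pair are mapped to a relationship score; features here are random placeholders.
scorer = nn.Sequential(nn.Linear(2 * 64, 64), nn.ReLU(), nn.Linear(64, 1))

label_feats = torch.randn(5, 64)   # detected pre-printed text lines
input_feats = torch.randn(7, 64)   # detected input text/handwriting lines

pairs = torch.cat([
    label_feats.unsqueeze(1).expand(-1, 7, -1),
    input_feats.unsqueeze(0).expand(5, -1, -1),
], dim=-1)                          # (5, 7, 128): every candidate pair
scores = scorer(pairs).squeeze(-1)  # (5, 7) relationship scores
match = scores.argmax(dim=1)        # greedy pairing: best input per label
```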

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning

Title Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning
Authors Gregory Farquhar, Shimon Whiteson, Jakob Foerster
Abstract Gradient-based methods for optimisation of objectives in stochastic settings with unknown or intractable dynamics require estimators of derivatives. We derive an objective that, under automatic differentiation, produces low-variance unbiased estimators of derivatives at any order. Our objective is compatible with arbitrary advantage estimators, which allows the control of the bias and variance of any-order derivatives when using function approximation. Furthermore, we propose a method to trade off bias and variance of higher order derivatives by discounting the impact of more distant causal dependencies. We demonstrate the correctness and utility of our objective in analytically tractable MDPs and in meta-reinforcement-learning for continuous control.
Tasks Continuous Control
Published 2019-09-23
URL https://arxiv.org/abs/1909.10549v1
PDF https://arxiv.org/pdf/1909.10549v1.pdf
PWC https://paperswithcode.com/paper/loaded-dice-trading-off-bias-and-variance-in
Repo https://github.com/oxwhirl/loaded-dice
Framework none
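
The paper's full objective additionally discounts distant causal dependencies with a decay parameter; the sketch below is the undecayed case, built on the DiCE "MagicBox" operator, with dummy tensors standing in for a real trajectory.

```python
import torch

def magic_box(x):
    """DiCE 'MagicBox': evaluates to 1 in the forward pass, while its gradient
    reintroduces the score-function term through x's dependence on theta."""
    return torch.exp(x - x.detach())

# Dummy trajectory: per-step log pi(a_t|s_t) (differentiable), advantages (fixed).
logp = torch.randn(10, requires_grad=True)
adv = torch.randn(10)

deps = torch.cumsum(logp, dim=0)     # causal dependencies up to each timestep
obj = (magic_box(deps) * adv).sum()  # forward value is just adv.sum()
obj.backward()                       # grad w.r.t. logp[j] equals sum_{t>=j} adv[t]
print(logp.grad)
```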

FlipTest: Fairness Testing via Optimal Transport

Title FlipTest: Fairness Testing via Optimal Transport
Authors Emily Black, Samuel Yeom, Matt Fredrikson
Abstract We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, creating similar pairs of in-distribution samples. We show how to use these instances to detect discrimination by constructing a “flipset”: the set of individuals whose classifier output changes post-translation, which corresponds to the set of people who may be harmed because of their group membership. To shed light on why the model treats a given subgroup differently, FlipTest produces a “transparency report”: a ranking of features that are most associated with the model’s behavior on the flipset. Evaluating the approach on three case studies, we show that this provides a computationally inexpensive way to identify subgroups that may be harmed by model discrimination, including in cases where the model satisfies group fairness criteria.
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09218v4
PDF https://arxiv.org/pdf/1906.09218v4.pdf
PWC https://paperswithcode.com/paper/fliptest-fairness-auditing-via-optimal
Repo https://github.com/samuel-yeom/fliptest
Framework tf
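
FlipTest's matching is an optimal transport map between the two group distributions; as a rough stand-in, the sketch below uses a one-to-one minimum-cost assignment (discrete OT on equal-sized samples) and collects the flipset. The toy classifier and data are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def flipset(predict, X_a, X_b):
    """Match group-A rows to group-B rows by minimum-cost assignment and return
    the group-A indices whose prediction flips under the mapping."""
    cost = np.linalg.norm(X_a[:, None, :] - X_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)          # one-to-one matching
    flipped = predict(X_a[rows]) != predict(X_b[cols])
    return rows[flipped]

rng = np.random.default_rng(0)
X_a, X_b = rng.normal(0, 1, (50, 4)), rng.normal(0.3, 1, (50, 4))
model = lambda X: (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy classifier
print(flipset(model, X_a, X_b))  # individuals possibly harmed by group membership
```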

Self-organized inductive reasoning with NeMuS

Title Self-organized inductive reasoning with NeMuS
Authors Leonardo Barreto, Edjard Mota
Abstract Neural Multi-Space (NeMuS) is a weighted multi-space representation for a portion of first-order logic designed for use with machine learning and neural network methods. It has been demonstrated that it can be used to perform reasoning based on regions forming patterns of refutation, and also in the process of inductive learning in an ILP-like style. Initial experiments were carried out to investigate whether a self-organizing approach is suitable for generating similar concept regions according to the attributes that form such concepts. We present the results and analyze the suitability of the method in the process of inductive learning with NeMuS.
Tasks
Published 2019-06-16
URL https://arxiv.org/abs/1906.06761v1
PDF https://arxiv.org/pdf/1906.06761v1.pdf
PWC https://paperswithcode.com/paper/self-organized-inductive-reasoning-with-nemus
Repo https://github.com/JustGlowing/minisom
Framework none
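
The linked repo is MiniSom, a generic self-organizing map library. A minimal sketch of using a SOM to group attribute vectors into concept regions follows; the random vectors stand in for NeMuS encodings and are an assumption.

```python
import numpy as np
from minisom import MiniSom

# Random attribute vectors standing in for encoded concepts; NeMuS itself uses
# a weighted multi-space encoding of first-order logic elements.
concepts = np.random.rand(40, 8)

som = MiniSom(6, 6, 8, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(concepts, 500)                 # self-organize the concept space
regions = [som.winner(c) for c in concepts]     # one grid cell per concept
print(regions[:5])  # nearby cells correspond to similar concept regions
```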

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Title Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Authors Kenton Murray, Jeffery Kinnison, Toan Q. Nguyen, Walter Scheirer, David Chiang
Abstract Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.
Tasks Machine Translation
Published 2019-10-01
URL https://arxiv.org/abs/1910.06717v1
PDF https://arxiv.org/pdf/1910.06717v1.pdf
PWC https://paperswithcode.com/paper/auto-sizing-the-transformer-network-improving
Repo https://github.com/KentonMurray/ProxGradPytorch
Framework pytorch
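
Auto-sizing deletes neurons during training via group regularizers optimized with proximal gradient steps (the linked ProxGradPytorch repo implements such operators). Below is a generic sketch of a row-wise group-L2 proximal step, not the repo's API; the layer and constants are placeholders.

```python
import torch

def prox_group_l2(W, strength):
    """Proximal step for a row-wise group-L2 penalty: each row's norm shrinks by
    `strength`, and rows falling below it are zeroed, i.e. neurons are deleted."""
    norms = W.norm(dim=1, keepdim=True).clamp_min(1e-12)
    return W * torch.clamp(1.0 - strength / norms, min=0.0)

layer = torch.nn.Linear(512, 512)
lr, lam = 0.1, 1e-2
# ... after each ordinary optimizer step on the task loss:
with torch.no_grad():
    layer.weight.copy_(prox_group_l2(layer.weight, lr * lam))
print(int((layer.weight.norm(dim=1) == 0).sum()), "neurons deleted so far")
```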

Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume

Title Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume
Authors Qingshan Xu, Wenbing Tao
Abstract Deep learning has been shown to be effective for depth inference in multi-view stereo (MVS). However, scalability and accuracy remain open problems in this domain. This can be attributed to the memory-consuming cost volume representation and inappropriate depth inference. Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume. This not only reduces memory consumption but also reduces the computational burden of cost volume filtering. Based on our effective cost volume representation, we propose a cascade 3D U-Net module to regularize the cost volume and further boost performance. Unlike previous methods that treat multi-view depth inference as a depth regression problem or an inverse depth classification problem, we recast multi-view depth inference as an inverse depth regression task. This allows our network to achieve sub-pixel estimation and to be applicable to large-scale scenes. Through extensive experiments on the DTU and Tanks and Temples datasets, we show that our proposed network with Correlation cost volume and Inverse DEpth Regression (CIDER) achieves state-of-the-art results, demonstrating its superior scalability and accuracy.
Tasks Stereo Matching
Published 2019-12-26
URL https://arxiv.org/abs/1912.11746v1
PDF https://arxiv.org/pdf/1912.11746v1.pdf
PWC https://paperswithcode.com/paper/learning-inverse-depth-regression-for-multi
Repo https://github.com/GhiXu/CIDER
Framework none
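
The average group-wise correlation is the core similarity measure: channels are split into groups and each group contributes one correlation channel, shrinking the cost volume. The sketch below computes a single cost-volume slice for one depth hypothesis; the homography warping of source features is omitted for brevity (an assumption), and shapes are illustrative.

```python
import torch

def avg_groupwise_correlation(ref, src, groups=8):
    """Average group-wise correlation between reference and source feature maps
    of shape (B, C, H, W); returns a (B, groups, H, W) similarity slice."""
    B, C, H, W = ref.shape
    ref = ref.view(B, groups, C // groups, H, W)
    src = src.view(B, groups, C // groups, H, W)
    return (ref * src).mean(dim=2)  # one correlation channel per group

ref = torch.randn(1, 32, 16, 16)   # reference-view features
src = torch.randn(1, 32, 16, 16)   # source features, pre-warped to one hypothesis
slice_ = avg_groupwise_correlation(ref, src)  # 8 channels instead of 32
```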

Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues

Title Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues
Authors Or Levi, Pedram Hosseini, Mona Diab, David A. Broniatowski
Abstract The blurry line between nefarious fake news and protected-speech satire has been a notorious struggle for social media platforms. In addition to efforts to reduce exposure to misinformation on social media, purveyors of fake news have begun to masquerade as satire sites to avoid being demoted. In this work, we address the challenge of automatically classifying fake news versus satire. Previous work has studied whether fake news and satire can be distinguished based on language differences. In contrast to fake news, satire stories are usually humorous and carry some political or social message. We hypothesize that these nuances could be identified using semantic and linguistic cues. Consequently, we train a machine learning method using semantic representation, with a state-of-the-art contextual language model, and with linguistic features based on textual coherence metrics. Empirical evaluation attests to the merits of our approach compared to the language-based baseline and sheds light on the nuances between fake news and satire. As avenues for future work, we consider studying additional linguistic features related to the humor aspect, and enriching the data with current news events, to help identify a political or social message.
Tasks Language Modelling
Published 2019-10-02
URL https://arxiv.org/abs/1910.01160v2
PDF https://arxiv.org/pdf/1910.01160v2.pdf
PWC https://paperswithcode.com/paper/identifying-nuances-in-fake-news-vs-satire
Repo https://github.com/adverifai/Satire_vs_Fake
Framework none
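
A minimal sketch of the semantic-representation half of such a classifier: contextual embeddings from a pretrained language model feeding a linear classifier. The coherence-metric features are omitted, and the specific model choice and placeholder texts are assumptions, not the paper's exact setup.

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """[CLS] embedding per text, used as the semantic representation."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return bert(**batch).last_hidden_state[:, 0].numpy()

texts = ["placeholder fake-news headline", "another fake-news headline",
         "placeholder satire headline", "another satire headline"]
labels = [0, 0, 1, 1]  # 0 = fake news, 1 = satire
clf = LogisticRegression().fit(embed(texts), labels)
print(clf.predict(embed(["a new unseen headline"])))
```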

Differentiable Convex Optimization Layers

Title Differentiable Convex Optimization Layers
Authors Akshay Agrawal, Brandon Amos, Shane Barratt, Stephen Boyd, Steven Diamond, Zico Kolter
Abstract Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver’s solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program. We implement our methodology in version 1.1 of CVXPY, a popular Python-embedded DSL for convex optimization, and additionally implement differentiable layers for disciplined convex programs in PyTorch and TensorFlow 2.0. Our implementation significantly lowers the barrier to using convex optimization problems in differentiable programs. We present applications in linear machine learning models and in stochastic control, and we show that our layer is competitive (in execution time) compared to specialized differentiable solvers from past work.
Tasks
Published 2019-10-28
URL https://arxiv.org/abs/1910.12430v1
PDF https://arxiv.org/pdf/1910.12430v1.pdf
PWC https://paperswithcode.com/paper/differentiable-convex-optimization-layers
Repo https://github.com/cvxgrp/cvxpylayers
Framework pytorch
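
Since the linked repo is cvxpylayers itself, a small end-to-end example in its documented PyTorch interface makes the idea concrete: a DPP-compliant CVXPY problem becomes a layer whose solution is differentiable with respect to the problem's parameters. The specific least-absolute-deviations problem is illustrative.

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# A disciplined parametrized program: minimize ||Ax - b||_1 subject to x >= 0.
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.pnorm(A @ x - b, p=1)), [x >= 0])
assert problem.is_dpp()  # required for affine-solver-affine canonicalization

layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])
A_t = torch.randn(m, n, requires_grad=True)
b_t = torch.randn(m, requires_grad=True)
solution, = layer(A_t, b_t)   # solve the problem in the forward pass
solution.sum().backward()     # differentiate through the solver
print(A_t.grad)
```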

Intrinsic dimension of data representations in deep neural networks

Title Intrinsic dimension of data representations in deep neural networks
Authors Alessio Ansuini, Alessandro Laio, Jakob H. Macke, Davide Zoccolan
Abstract Deep neural networks progressively transform their inputs across multiple processing layers. What are the geometrical properties of the representations learned by these networks? Here we study the intrinsic dimensionality (ID) of data representations, i.e., the minimal number of parameters needed to describe a representation. We find that, in a trained network, the ID is orders of magnitude smaller than the number of units in each layer. Across layers, the ID first increases and then progressively decreases in the final layers. Remarkably, the ID of the last hidden layer predicts classification accuracy on the test set. These results cannot be reproduced by linear dimensionality estimates (e.g., principal component analysis), nor are they found in representations that have been artificially linearized. They also do not appear in untrained networks or in networks trained on randomized labels. This suggests that neural networks that generalize are those that transform the data into low-dimensional, but not necessarily flat, manifolds.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12784v2
PDF https://arxiv.org/pdf/1905.12784v2.pdf
PWC https://paperswithcode.com/paper/intrinsic-dimension-of-data-representations
Repo https://github.com/ansuini/IntrinsicDimDeep
Framework pytorch
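
Intrinsic dimension can be estimated from nearest-neighbor distance ratios; below is a minimal TwoNN-style estimator in its simplest maximum-likelihood form (treating that as the estimator of choice is an assumption, and refinements such as outlier trimming are omitted), checked on a synthetic 2-D manifold.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def twonn_id(X):
    """TwoNN-style intrinsic-dimension estimate from the ratio of each point's
    second to first nearest-neighbor distance (maximum-likelihood form)."""
    d, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
    mu = d[:, 2] / d[:, 1]          # column 0 is the point itself (distance 0)
    return len(X) / np.log(mu).sum()

# A 2-D manifold linearly embedded in 10 dimensions: the estimate should be ~2,
# even though PCA would report a higher linear dimensionality.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, (1000, 2))
X = np.hstack([np.sin(theta), np.cos(theta), theta]) @ rng.normal(size=(6, 10))
print(twonn_id(X))
```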

Exploiting Temporal Relationships in Video Moment Localization with Natural Language

Title Exploiting Temporal Relationships in Video Moment Localization with Natural Language
Authors Songyang Zhang, Jinsong Su, Jiebo Luo
Abstract We address the problem of video moment localization with natural language, i.e. localizing a video segment described by a natural language sentence. While most prior work focuses on grounding the query as a whole, temporal dependencies and reasoning between events within the text are not fully considered. In this paper, we propose a novel Temporal Compositional Modular Network (TCMN) where a tree attention network first automatically decomposes a sentence into three descriptions with respect to the main event, context event and temporal signal. Two modules are then utilized to measure the visual similarity and location similarity between each segment and the decomposed descriptions. Moreover, since the main event and context event may rely on different modalities (RGB or optical flow), we use late fusion to form an ensemble of four models, where each model is independently trained by one combination of the visual input. Experiments show that our model outperforms the state-of-the-art methods on the TEMPO dataset.
Tasks Optical Flow Estimation
Published 2019-08-11
URL https://arxiv.org/abs/1908.03846v1
PDF https://arxiv.org/pdf/1908.03846v1.pdf
PWC https://paperswithcode.com/paper/exploiting-temporal-relationships-in-video
Repo https://github.com/Sy-Zhang/TCMN-Release
Framework pytorch
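
A minimal sketch of the late-fusion step over the four independently trained models, one per combination of visual inputs; the scores are random placeholders and the segment count is illustrative.

```python
import torch

# Scores from four models (RGB/flow for the main and context events),
# each assigning a score to every candidate video segment.
model_scores = [torch.randn(100) for _ in range(4)]  # 100 candidate segments

fused = torch.stack(model_scores).mean(dim=0)        # late fusion by averaging
best_segment = int(fused.argmax())                   # localized moment index
```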