February 1, 2020


Paper Group AWR 182


Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning


Title	Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
Authors	Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang
Abstract	Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle this problem, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing on the idea underlying regularized Anderson acceleration (RAA), which is an effective approach to accelerating the solving of fixed point problems with perturbations. Specifically, we first explain how policy iteration can be applied directly with Anderson acceleration. Then we extend RAA to the case of deep RL by introducing a regularization term to control the impact of perturbation induced by function approximation errors. We further propose two strategies, i.e., progressive update and adaptive restart, to enhance the performance. The effectiveness of our method is evaluated on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and final performance of state-of-the-art deep RL algorithms.
Tasks
Published	2019-09-07
URL	https://arxiv.org/abs/1909.03245v2
PDF	https://arxiv.org/pdf/1909.03245v2.pdf
PWC	https://paperswithcode.com/paper/regularized-anderson-acceleration-for-off
Repo	https://github.com/shiwj16/raa-drl
Framework	pytorch
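
The abstract describes regularized Anderson acceleration as a way to speed up fixed-point problems under perturbation. The following is a minimal NumPy sketch of that idea on a generic fixed-point map x = g(x), not the authors' deep RL implementation. The function name, the memory size, and the choice to scale the regularization weight by the trace of the residual Gram matrix are assumptions made for illustration.

```python
import numpy as np

def regularized_anderson(g, x0, memory=5, lam=1e-8, iters=50, tol=1e-10):
    """Hypothetical helper: Anderson acceleration with Tikhonov regularization.

    Mixes the last `memory` evaluations g(x_i) with weights alpha that
    minimize ||F alpha||^2 + reg * ||alpha||^2 subject to sum(alpha) = 1,
    where column i of F is the residual g(x_i) - x_i.
    """
    x = np.asarray(x0, dtype=float)
    xs, gs = [], []
    for _ in range(iters):
        gx = g(x)
        if np.linalg.norm(gx - x) < tol:
            return gx
        xs = (xs + [x])[-memory:]
        gs = (gs + [gx])[-memory:]
        F = np.stack([gi - xi for xi, gi in zip(xs, gs)], axis=1)  # (n, k)
        k = F.shape[1]
        A = F.T @ F
        # Assumption: scale the regularizer relative to the residual size.
        A = A + lam * (np.trace(A) / k) * np.eye(k)
        w = np.linalg.solve(A, np.ones(k))
        alpha = w / w.sum()  # closed-form solution of the constrained problem
        x = np.stack(gs, axis=1) @ alpha
    return x

# Usage: a linear contraction g(v) = M v + b with a known fixed point.
M = np.array([[0.5, 0.2, 0.0], [0.2, 0.3, 0.1], [0.0, 0.1, 0.6]])
b = np.array([1.0, 2.0, 3.0])
x_fixed = regularized_anderson(lambda v: M @ v + b, np.zeros(3))
```

In the RL setting the abstract refers to, g would correspond to a value-update operator and the residuals would be noisy because of function approximation, which is the case the regularization term is meant to handle.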

Kernel Mean Matching for Content Addressability of GANs


Title	Kernel Mean Matching for Content Addressability of GANs
Authors	Wittawat Jitkrittum, Patsorn Sangkloy, Muhammad Waleed Gondal, Amit Raj, James Hays, Bernhard Schölkopf
Abstract	We propose a novel procedure which adds “content-addressability” to any given unconditional implicit model, e.g., a generative adversarial network (GAN). The procedure allows users to control the generative process by specifying a set (of arbitrary size) of desired examples, based on which similar samples are generated from the model. The proposed approach, based on kernel mean matching, is applicable to any generative model that transforms latent vectors to samples, and does not require retraining of the model. Experiments on various high-dimensional image generation problems (CelebA-HQ, LSUN bedroom, bridge, tower) show that our approach is able to generate images which are consistent with the input set, while retaining the image quality of the original model. To our knowledge, this is the first work that attempts to construct, at test time, a content-addressable generative model from a trained marginal model.
Tasks	Image Generation
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05882v1
PDF	https://arxiv.org/pdf/1905.05882v1.pdf
PWC	https://paperswithcode.com/paper/kernel-mean-matching-for-content
Repo	https://github.com/wittawatj/cadgan
Framework	pytorch
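
Kernel mean matching compares two sets of points through the distance between their kernel mean embeddings. As a rough illustration, the sketch below computes a (biased) squared maximum mean discrepancy with a Gaussian kernel in NumPy. The function names and the bandwidth are assumptions; the paper's actual procedure optimizes latent vectors of a trained generator against such a criterion, which is not shown here.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Squared distance between the kernel mean embeddings of x and y."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean())

# Usage: a set shifted far away is a worse match than a slightly shifted one.
rng = np.random.default_rng(0)
target = rng.normal(size=(50, 2))
near, far = target + 0.1, target + 3.0
```

A content-addressable sampler in the spirit of the abstract would treat `target` as features of the user-supplied examples and adjust latent vectors until the generated set's discrepancy to it is small.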

The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development


Title	The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development
Authors	Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni
Abstract	As machine learning is applied more widely, data scientists often struggle to find or create end-to-end machine learning systems for specific tasks. The proliferation of libraries and frameworks and the complexity of the tasks have led to the emergence of “pipeline jungles” - brittle, ad hoc ML systems. To address these problems, we introduce the Machine Learning Bazaar, a new approach to developing machine learning and automated machine learning software systems. First, we introduce ML primitives, a unified API and specification for data processing and ML components from different software libraries. Next, we compose primitives into usable ML pipelines, abstracting away glue code, data flow, and data storage. We further pair these pipelines with a hierarchy of AutoML strategies - Bayesian optimization and bandit learning. We use these components to create a general-purpose, multi-task, end-to-end AutoML system that provides solutions to a variety of data modalities (image, text, graph, tabular, relational, etc.) and problem types (classification, regression, anomaly detection, graph matching, etc.). We present an evaluation suite of 456 real-world ML tasks and describe the characteristics of 2.5 million pipelines searched over this task suite. Finally, we demonstrate 5 real-world use cases and 2 case studies of our approach.
Tasks	Anomaly Detection, AutoML, Graph Matching
Published	2019-05-22
URL	https://arxiv.org/abs/1905.08942v3
PDF	https://arxiv.org/pdf/1905.08942v3.pdf
PWC	https://paperswithcode.com/paper/the-machine-learning-bazaar-harnessing-the-ml
Repo	https://github.com/HDI-Project/MLBlocks
Framework	none

RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information


Title	RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information
Authors	Helmut Mayer
Abstract	A core component of all Structure from Motion (SfM) approaches is bundle adjustment. As the latter is a computational bottleneck for larger blocks, parallel bundle adjustment has become an active area of research. Particularly, consensus-based optimization methods have been shown to be suitable for this task. We have extended them using covariance information derived by the adjustment of individual three-dimensional (3D) points, i.e., “triangulation” or “intersection”. This not only leads to much better convergence behavior, but also avoids fiddling with the penalty parameter of standard consensus-based approaches. The corresponding novel approach can also be seen as a variant of resection / intersection schemes, where we adjust during intersection a number of sub-blocks directly related to the number of threads available on a computer, each containing a fraction of the cameras of the block. We show that our novel approach is suitable for robust parallel bundle adjustment and demonstrate its capabilities in comparison to the basic consensus-based approach as well as a state-of-the-art parallel implementation of bundle adjustment. Code for our novel approach is available on GitHub: https://github.com/helmayer/RPBA
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08138v1
PDF	https://arxiv.org/pdf/1910.08138v1.pdf
PWC	https://paperswithcode.com/paper/rpba-robust-parallel-bundle-adjustment-based
Repo	https://github.com/helmayer/RPBA
Framework	none

Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception


Title	Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception
Authors	Braden Hurl, Krzysztof Czarnecki, Steven Waslander
Abstract	We introduce the Precise Synthetic Image and LiDAR (PreSIL) dataset for autonomous vehicle perception. Grand Theft Auto V (GTA V), a commercial video game, has a large detailed world with realistic graphics, which provides a diverse data collection environment. Existing works creating synthetic LiDAR data for autonomous driving with GTA V have not released their datasets, rely on an in-game raycasting function which represents people as cylinders, and can fail to capture vehicles past 30 metres. Our work creates a precise LiDAR simulator within GTA V which collides with detailed models for all entities no matter the type or position. The PreSIL dataset consists of over 50,000 frames and includes high-definition images with full resolution depth information, semantic segmentation (images), point-wise segmentation (point clouds), and detailed annotations for all vehicles and people. Collecting additional data with our framework is entirely automatic and requires no human annotation of any kind. We demonstrate the effectiveness of our dataset by showing an improvement of up to 5% average precision on the KITTI 3D Object Detection benchmark challenge when state-of-the-art 3D object detection networks are pre-trained with our data. The data and code are available at https://tinyurl.com/y3tb9sxy
Tasks	3D Object Detection, Autonomous Driving, Object Detection, Semantic Segmentation
Published	2019-05-01
URL	https://arxiv.org/abs/1905.00160v2
PDF	https://arxiv.org/pdf/1905.00160v2.pdf
PWC	https://paperswithcode.com/paper/precise-synthetic-image-and-lidar-presil
Repo	https://github.com/bradenhurl/DeepGTAV-PreSIL
Framework	none

Deep Visual Template-Free Form Parsing


Title	Deep Visual Template-Free Form Parsing
Authors	Brian Davis, Bryan Morse, Scott Cohen, Brian Price, Chris Tensmeyer
Abstract	Automatic, template-free extraction of information from form images is challenging due to the variety of form layouts. This is even more challenging for historical forms due to noise and degradation. A crucial part of the extraction process is associating input text with pre-printed labels. We present a learned, template-free solution to detecting pre-printed text and input text/handwriting and predicting pair-wise relationships between them. While previous approaches to this problem have been focused on clean images and clear layouts, we show our approach is effective in the domain of noisy, degraded, and varied form images. We introduce a new dataset of historical form images (late 1800s, early 1900s) for training and validating our approach. Our method uses a convolutional network to detect pre-printed text and input text lines. We pool features from the detection network to classify possible relationships in a language-agnostic way. We show that our proposed pairing method outperforms heuristic rules and that visual features are critical to obtaining high accuracy.
Tasks
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02576v2
PDF	https://arxiv.org/pdf/1909.02576v2.pdf
PWC	https://paperswithcode.com/paper/deep-visual-template-free-form-parsing
Repo	https://github.com/herobd/NAF_dataset
Framework	none

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning


Title	Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning
Authors	Gregory Farquhar, Shimon Whiteson, Jakob Foerster
Abstract	Gradient-based methods for optimisation of objectives in stochastic settings with unknown or intractable dynamics require estimators of derivatives. We derive an objective that, under automatic differentiation, produces low-variance unbiased estimators of derivatives at any order. Our objective is compatible with arbitrary advantage estimators, which allows the control of the bias and variance of any-order derivatives when using function approximation. Furthermore, we propose a method to trade off bias and variance of higher order derivatives by discounting the impact of more distant causal dependencies. We demonstrate the correctness and utility of our objective in analytically tractable MDPs and in meta-reinforcement-learning for continuous control.
Tasks	Continuous Control
Published	2019-09-23
URL	https://arxiv.org/abs/1909.10549v1
PDF	https://arxiv.org/pdf/1909.10549v1.pdf
PWC	https://paperswithcode.com/paper/loaded-dice-trading-off-bias-and-variance-in
Repo	https://github.com/oxwhirl/loaded-dice
Framework	none

FlipTest: Fairness Testing via Optimal Transport


Title	FlipTest: Fairness Testing via Optimal Transport
Authors	Emily Black, Samuel Yeom, Matt Fredrikson
Abstract	We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, creating similar pairs of in-distribution samples. We show how to use these instances to detect discrimination by constructing a “flipset”: the set of individuals whose classifier output changes post-translation, which corresponds to the set of people who may be harmed because of their group membership. To shed light on why the model treats a given subgroup differently, FlipTest produces a “transparency report”: a ranking of features that are most associated with the model’s behavior on the flipset. Evaluating the approach on three case studies, we show that this provides a computationally inexpensive way to identify subgroups that may be harmed by model discrimination, including in cases where the model satisfies group fairness criteria.
Tasks
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09218v4
PDF	https://arxiv.org/pdf/1906.09218v4.pdf
PWC	https://paperswithcode.com/paper/fliptest-fairness-auditing-via-optimal
Repo	https://github.com/samuel-yeom/fliptest
Framework	tf
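
The flipset construction in the abstract can be illustrated on toy data: match individuals across two groups with an optimal transport assignment, then collect those whose predicted label differs from that of their match. The sketch below is an assumption-laden illustration, not the authors' method as implemented. It solves the assignment by brute force over permutations, which only works for a handful of points, and all function names are hypothetical.

```python
from itertools import permutations

import numpy as np

def optimal_matching(group_a, group_b):
    """Brute-force minimum-cost one-to-one matching (toy sizes only)."""
    cost = ((group_a[:, None, :] - group_b[None, :, :]) ** 2).sum(axis=-1)
    n = len(group_a)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i, p[i]] for i in range(n)))
    return np.array(best)

def flipset(classifier, group_a, group_b):
    """Indices in group_a whose prediction changes on their matched counterpart."""
    match = optimal_matching(group_a, group_b)
    return np.nonzero(classifier(group_a) != classifier(group_b[match]))[0]

# Usage with a hypothetical threshold classifier on two features.
group_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
group_b = np.array([[2.1, 1.0], [0.1, 1.0], [1.1, 1.0]])
classify = lambda x: (x.sum(axis=1) > 1.5).astype(int)
flipped = flipset(classify, group_a, group_b)
```

For realistic group sizes the assignment would need a proper optimal transport solver or a learned approximation of the transport map.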

Self-organized inductive reasoning with NeMuS


Title	Self-organized inductive reasoning with NeMuS
Authors	Leonardo Barreto, Edjard Mota
Abstract	Neural Multi-Space (NeMuS) is a weighted multi-space representation for a portion of first-order logic designed for use with machine learning and neural network methods. It was demonstrated that it can be used to perform reasoning based on regions forming patterns of refutation and also in the process of inductive learning in ILP-like style. Initial experiments were carried out to investigate whether a self-organizing approach is suitable to generate similar concept regions according to the attributes that form such concepts. We present the results and analyze the suitability of the method in the process of inductive learning with NeMuS.
Tasks
Published	2019-06-16
URL	https://arxiv.org/abs/1906.06761v1
PDF	https://arxiv.org/pdf/1906.06761v1.pdf
PWC	https://paperswithcode.com/paper/self-organized-inductive-reasoning-with-nemus
Repo	https://github.com/JustGlowing/minisom
Framework	none

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation


Title	Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Authors	Kenton Murray, Jeffery Kinnison, Toan Q. Nguyen, Walter Scheirer, David Chiang
Abstract	Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.
Tasks	Machine Translation
Published	2019-10-01
URL	https://arxiv.org/abs/1910.06717v1
PDF	https://arxiv.org/pdf/1910.06717v1.pdf
PWC	https://paperswithcode.com/paper/auto-sizing-the-transformer-network-improving
Repo	https://github.com/KentonMurray/ProxGradPytorch
Framework	pytorch
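
The abstract says auto-sizing uses regularization to delete neurons during training, without naming the regularizer. One common way to get that effect is a group penalty over each neuron's weight row, applied through a proximal step after the gradient update. The sketch below shows such a step for a group L2 penalty in NumPy; the choice of penalty, the function name, and the threshold are assumptions for illustration.

```python
import numpy as np

def prox_group_l2(weights, threshold):
    """Row-wise soft thresholding: shrink each row's norm by `threshold`.

    Rows (neurons) whose L2 norm is at most `threshold` become exactly zero,
    which is what lets a whole neuron be removed from the layer.
    """
    norms = np.linalg.norm(weights, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(norms, 1e-12))
    return weights * scale

# Usage: one strong neuron survives (shrunk), one weak neuron is deleted.
layer = np.array([[3.0, 4.0], [0.1, 0.1], [0.0, 0.0]])
pruned = prox_group_l2(layer, threshold=1.0)
alive = np.linalg.norm(pruned, axis=1) > 0
```

In a training loop, this step would follow each optimizer update, with the threshold set to the learning rate times the regularization strength.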

Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume


Title	Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume
Authors	Qingshan Xu, Wenbing Tao
Abstract	Deep learning has been shown to be effective for depth inference in multi-view stereo (MVS). However, scalability and accuracy remain open problems in this domain. This can be attributed to the memory-consuming cost volume representation and inappropriate depth inference. Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume. This not only reduces the memory consumption but also reduces the computational burden of cost volume filtering. Based on our effective cost volume representation, we propose a cascade 3D U-Net module to regularize the cost volume to further boost the performance. Unlike previous methods that treat multi-view depth inference as a depth regression problem or an inverse depth classification problem, we recast multi-view depth inference as an inverse depth regression task. This allows our network to achieve sub-pixel estimation and be applicable to large-scale scenes. Through extensive experiments on the DTU dataset and the Tanks and Temples dataset, we show that our proposed network with Correlation cost volume and Inverse DEpth Regression (CIDER) achieves state-of-the-art results, demonstrating its superior scalability and accuracy.
Tasks	Stereo Matching
Published	2019-12-26
URL	https://arxiv.org/abs/1912.11746v1
PDF	https://arxiv.org/pdf/1912.11746v1.pdf
PWC	https://paperswithcode.com/paper/learning-inverse-depth-regression-for-multi
Repo	https://github.com/GhiXu/CIDER
Framework	none
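
Two ideas from the abstract lend themselves to a short sketch: an average group-wise correlation between reference and source feature maps, and regression of depth as an expectation over inverse depth hypotheses. The NumPy code below is one plausible reading of those descriptions, not the released CIDER code. Array layouts, function names, and the uniform spacing in inverse depth are assumptions.

```python
import numpy as np

def group_correlation(ref_feat, src_feat, groups):
    """Per-group mean of channel products; inputs (C, H, W), output (groups, H, W)."""
    c, h, w = ref_feat.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    prod = (ref_feat * src_feat).reshape(groups, c // groups, h, w)
    return prod.mean(axis=1)

def average_group_correlation(ref_feat, src_feats, groups):
    """Average the group-wise correlation over all source views."""
    return np.mean([group_correlation(ref_feat, s, groups) for s in src_feats], axis=0)

def regress_inverse_depth(prob, depth_min, depth_max):
    """prob: (D, H, W) probabilities over D hypotheses spaced uniformly in 1/depth."""
    inv_depths = np.linspace(1.0 / depth_max, 1.0 / depth_min, prob.shape[0])
    expected_inv = np.tensordot(inv_depths, prob, axes=(0, 0))  # (H, W)
    return 1.0 / expected_inv

# Usage on tiny constant feature maps with two source views.
ref = np.ones((4, 2, 2))
sources = [2.0 * np.ones((4, 2, 2)), 4.0 * np.ones((4, 2, 2))]
similarity = average_group_correlation(ref, sources, groups=2)
```

Reducing C channels to a few groups is what keeps the cost volume small, and taking an expectation rather than an argmax over hypotheses is what permits sub-pixel estimates.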

Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues


Title	Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues
Authors	Or Levi, Pedram Hosseini, Mona Diab, David A. Broniatowski
Abstract	The blurry line between nefarious fake news and protected-speech satire has been a notorious struggle for social media platforms. Further to the efforts of reducing exposure to misinformation on social media, purveyors of fake news have begun to masquerade as satire sites to avoid being demoted. In this work, we address the challenge of automatically classifying fake news versus satire. Previous work has studied whether fake news and satire can be distinguished based on language differences. Contrary to fake news, satire stories are usually humorous and carry some political or social message. We hypothesize that these nuances could be identified using semantic and linguistic cues. Consequently, we train a machine learning method using semantic representation, with a state-of-the-art contextual language model, and with linguistic features based on textual coherence metrics. Empirical evaluation attests to the merits of our approach compared to the language-based baseline and sheds light on the nuances between fake news and satire. As avenues for future work, we consider studying additional linguistic features related to the humor aspect, and enriching the data with current news events, to help identify a political or social message.
Tasks	Language Modelling
Published	2019-10-02
URL	https://arxiv.org/abs/1910.01160v2
PDF	https://arxiv.org/pdf/1910.01160v2.pdf
PWC	https://paperswithcode.com/paper/identifying-nuances-in-fake-news-vs-satire
Repo	https://github.com/adverifai/Satire_vs_Fake
Framework	none

Differentiable Convex Optimization Layers


Title	Differentiable Convex Optimization Layers
Authors	Akshay Agrawal, Brandon Amos, Shane Barratt, Stephen Boyd, Steven Diamond, Zico Kolter
Abstract	Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver’s solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program. We implement our methodology in version 1.1 of CVXPY, a popular Python-embedded DSL for convex optimization, and additionally implement differentiable layers for disciplined convex programs in PyTorch and TensorFlow 2.0. Our implementation significantly lowers the barrier to using convex optimization problems in differentiable programs. We present applications in linear machine learning models and in stochastic control, and we show that our layer is competitive (in execution time) compared to specialized differentiable solvers from past work.
Tasks
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12430v1
PDF	https://arxiv.org/pdf/1910.12430v1.pdf
PWC	https://paperswithcode.com/paper/differentiable-convex-optimization-layers
Repo	https://github.com/cvxgrp/cvxpylayers
Framework	pytorch

Intrinsic dimension of data representations in deep neural networks


Title	Intrinsic dimension of data representations in deep neural networks
Authors	Alessio Ansuini, Alessandro Laio, Jakob H. Macke, Davide Zoccolan
Abstract	Deep neural networks progressively transform their inputs across multiple processing layers. What are the geometrical properties of the representations learned by these networks? Here we study the intrinsic dimensionality (ID) of data-representations, i.e. the minimal number of parameters needed to describe a representation. We find that, in a trained network, the ID is orders of magnitude smaller than the number of units in each layer. Across layers, the ID first increases and then progressively decreases in the final layers. Remarkably, the ID of the last hidden layer predicts classification accuracy on the test set. These results can neither be found by linear dimensionality estimates (e.g., with principal component analysis), nor in representations that had been artificially linearized. They are neither found in untrained networks, nor in networks that are trained on randomized labels. This suggests that neural networks that can generalize are those that transform the data into low-dimensional, but not necessarily flat manifolds.
Tasks
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12784v2
PDF	https://arxiv.org/pdf/1905.12784v2.pdf
PWC	https://paperswithcode.com/paper/intrinsic-dimension-of-data-representations
Repo	https://github.com/ansuini/IntrinsicDimDeep
Framework	pytorch
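
The abstract contrasts intrinsic dimension with linear estimates such as PCA but does not say which estimator is used. As an illustration of how a nonlinear, nearest-neighbor-based estimate can work, the sketch below implements a two-nearest-neighbor maximum likelihood estimator in NumPy. Treating this as the estimator behind the reported results is an assumption, and the function name is hypothetical.

```python
import numpy as np

def two_nn_intrinsic_dimension(points):
    """Estimate intrinsic dimension from ratios of 2nd to 1st neighbor distances.

    Under a locally uniform density on a d-dimensional manifold, the ratio
    mu = r2 / r1 follows a Pareto law with exponent d, so the maximum
    likelihood estimate is N / sum(log(mu)). Assumes no duplicate points.
    """
    sq_norms = (points ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    d2 = np.maximum(d2, 0.0)
    np.fill_diagonal(d2, np.inf)  # exclude each point's distance to itself
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])  # r1, r2 per point
    mu = nearest[:, 1] / nearest[:, 0]
    return len(points) / np.log(mu).sum()

# Usage: a 2-D sheet embedded linearly in a 10-D ambient space.
rng = np.random.default_rng(0)
latent = rng.uniform(size=(1000, 2))
embedded = latent @ rng.normal(size=(2, 10))
estimate = two_nn_intrinsic_dimension(embedded)
```

Applied to layer activations, `points` would hold one row per input example, and the estimate can be compared against the number of units in that layer.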

Exploiting Temporal Relationships in Video Moment Localization with Natural Language


Title	Exploiting Temporal Relationships in Video Moment Localization with Natural Language
Authors	Songyang Zhang, Jinsong Su, Jiebo Luo
Abstract	We address the problem of video moment localization with natural language, i.e. localizing a video segment described by a natural language sentence. While most prior work focuses on grounding the query as a whole, temporal dependencies and reasoning between events within the text are not fully considered. In this paper, we propose a novel Temporal Compositional Modular Network (TCMN) where a tree attention network first automatically decomposes a sentence into three descriptions with respect to the main event, context event and temporal signal. Two modules are then utilized to measure the visual similarity and location similarity between each segment and the decomposed descriptions. Moreover, since the main event and context event may rely on different modalities (RGB or optical flow), we use late fusion to form an ensemble of four models, where each model is independently trained by one combination of the visual input. Experiments show that our model outperforms the state-of-the-art methods on the TEMPO dataset.
Tasks	Optical Flow Estimation
Published	2019-08-11
URL	https://arxiv.org/abs/1908.03846v1
PDF	https://arxiv.org/pdf/1908.03846v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-temporal-relationships-in-video
Repo	https://github.com/Sy-Zhang/TCMN-Release
Framework	pytorch