Paper Group AWR 182
Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning. Kernel Mean Matching for Content Addressability of GANs. The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development. RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information. Precise Synthetic Image and LiDAR (PreSIL) Datase …
Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
Title | Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning |
Authors | Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang |
Abstract | Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle this problem, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing on the idea underlying regularized Anderson acceleration (RAA), which is an effective approach to accelerating the solution of fixed-point problems with perturbations. Specifically, we first explain how policy iteration can be applied directly with Anderson acceleration. Then we extend RAA to the case of deep RL by introducing a regularization term to control the impact of perturbation induced by function approximation errors. We further propose two strategies, i.e., progressive update and adaptive restart, to enhance the performance. The effectiveness of our method is evaluated on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and final performance of state-of-the-art deep RL algorithms. |
Tasks | |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03245v2 |
https://arxiv.org/pdf/1909.03245v2.pdf | |
PWC | https://paperswithcode.com/paper/regularized-anderson-acceleration-for-off |
Repo | https://github.com/shiwj16/raa-drl |
Framework | pytorch |
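The abstract above describes regularized Anderson acceleration as a way to speed up fixed-point problems under perturbation. A minimal sketch of the idea on a toy linear fixed-point map follows. It is not the authors' deep RL implementation: the function names, the toy map `g`, the memory size, and the value of `lam` are all assumptions chosen for illustration. The mixing weights minimize the norm of the combined residual plus `lam` times the squared norm of the weights, subject to the weights summing to one.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def raa_weights(residuals, lam):
    """Weights alpha = (F^T F + lam I)^-1 1, normalized to sum to one."""
    k = len(residuals)
    gram = [
        [
            sum(p * q for p, q in zip(residuals[i], residuals[j]))
            + (lam if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]
    z = solve_linear(gram, [1.0] * k)
    total = sum(z)
    return [value / total for value in z]


def anderson_fixed_point(g, x0, memory=3, lam=1e-8, tol=1e-9, max_iter=500):
    """Iterate x <- sum_i alpha_i g(x_i) over the last `memory` iterates."""
    xs, gs = [], []
    x = list(x0)
    for iteration in range(max_iter):
        gx = g(x)
        residual = [a - b for a, b in zip(gx, x)]
        if max(abs(r) for r in residual) < tol:
            return x, iteration
        xs = (xs + [x])[-memory:]
        gs = (gs + [gx])[-memory:]
        residuals = [[a - b for a, b in zip(gi, xi)] for gi, xi in zip(gs, xs)]
        alpha = raa_weights(residuals, lam)
        x = [sum(w * gi[d] for w, gi in zip(alpha, gs)) for d in range(len(x))]
    return x, max_iter


def g(x):
    # Hypothetical toy contraction; its fixed point is (16/3, 20/3).
    return [0.5 * x[0] + 0.25 * x[1] + 1.0, 0.25 * x[0] + 0.5 * x[1] + 2.0]


solution, iterations = anderson_fixed_point(g, [0.0, 0.0])
print(solution, iterations)
```

The regularization term keeps the small linear system well conditioned when stored residuals are nearly collinear, which is the role the abstract attributes to it when function approximation errors perturb the iterates.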
Kernel Mean Matching for Content Addressability of GANs
Title | Kernel Mean Matching for Content Addressability of GANs |
Authors | Wittawat Jitkrittum, Patsorn Sangkloy, Muhammad Waleed Gondal, Amit Raj, James Hays, Bernhard Schölkopf |
Abstract | We propose a novel procedure which adds “content-addressability” to any given unconditional implicit model, e.g., a generative adversarial network (GAN). The procedure allows users to control the generative process by specifying a set (of arbitrary size) of desired examples, based on which similar samples are generated from the model. The proposed approach, based on kernel mean matching, is applicable to any generative model that transforms latent vectors to samples, and does not require retraining of the model. Experiments on various high-dimensional image generation problems (CelebA-HQ, LSUN bedroom, bridge, tower) show that our approach is able to generate images which are consistent with the input set, while retaining the image quality of the original model. To our knowledge, this is the first work that attempts to construct, at test time, a content-addressable generative model from a trained marginal model. |
Tasks | Image Generation |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05882v1 |
https://arxiv.org/pdf/1905.05882v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-mean-matching-for-content |
Repo | https://github.com/wittawatj/cadgan |
Framework | pytorch |
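Kernel mean matching, as named in the abstract above, compares two sample sets through the distance between their kernel mean embeddings. The sketch below computes a (biased) squared maximum mean discrepancy with a Gaussian kernel on small hand-written point sets. It only illustrates the matching objective: the kernel choice, bandwidth, and data are assumptions, and the paper's procedure of optimizing GAN latent vectors against this kind of objective is not reproduced here.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared distance between kernel mean embeddings."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Hypothetical data: a user-specified input set and two candidate sample sets.
target = [[0.0, 0.0], [1.0, 0.0]]
near = [[0.1, 0.0], [0.9, 0.1]]
far = [[5.0, 5.0], [6.0, 5.0]]

print(mmd_squared(target, near), mmd_squared(target, far))
```

A candidate set whose embedding is closer to that of the input set yields a smaller value, so minimizing this quantity over generated samples pulls them toward the specified examples.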
The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development
Title | The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development |
Authors | Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni |
Abstract | As machine learning is applied more widely, data scientists often struggle to find or create end-to-end machine learning systems for specific tasks. The proliferation of libraries and frameworks and the complexity of the tasks have led to the emergence of “pipeline jungles” - brittle, ad hoc ML systems. To address these problems, we introduce the Machine Learning Bazaar, a new approach to developing machine learning and automated machine learning software systems. First, we introduce ML primitives, a unified API and specification for data processing and ML components from different software libraries. Next, we compose primitives into usable ML pipelines, abstracting away glue code, data flow, and data storage. We further pair these pipelines with a hierarchy of AutoML strategies - Bayesian optimization and bandit learning. We use these components to create a general-purpose, multi-task, end-to-end AutoML system that provides solutions to a variety of data modalities (image, text, graph, tabular, relational, etc.) and problem types (classification, regression, anomaly detection, graph matching, etc.). We present an evaluation suite of 456 real-world ML tasks and describe the characteristics of 2.5 million pipelines searched over this task suite. Finally, we demonstrate 5 real-world use cases and 2 case studies of our approach. |
Tasks | Anomaly Detection, AutoML, Graph Matching |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.08942v3 |
https://arxiv.org/pdf/1905.08942v3.pdf | |
PWC | https://paperswithcode.com/paper/the-machine-learning-bazaar-harnessing-the-ml |
Repo | https://github.com/HDI-Project/MLBlocks |
Framework | none |
RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information
Title | RPBA – Robust Parallel Bundle Adjustment Based on Covariance Information |
Authors | Helmut Mayer |
Abstract | A core component of all Structure from Motion (SfM) approaches is bundle adjustment. As the latter is a computational bottleneck for larger blocks, parallel bundle adjustment has become an active area of research. In particular, consensus-based optimization methods have been shown to be suitable for this task. We have extended them using covariance information derived by the adjustment of individual three-dimensional (3D) points, i.e., “triangulation” or “intersection”. This not only leads to much better convergence behavior, but also avoids fiddling with the penalty parameter of standard consensus-based approaches. The corresponding novel approach can also be seen as a variant of resection / intersection schemes, where we adjust during intersection a number of sub-blocks directly related to the number of threads available on a computer, each containing a fraction of the cameras of the block. We show that our novel approach is suitable for robust parallel bundle adjustment and demonstrate its capabilities in comparison to the basic consensus-based approach as well as a state-of-the-art parallel implementation of bundle adjustment. Code for our novel approach is available on GitHub: https://github.com/helmayer/RPBA |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08138v1 |
https://arxiv.org/pdf/1910.08138v1.pdf | |
PWC | https://paperswithcode.com/paper/rpba-robust-parallel-bundle-adjustment-based |
Repo | https://github.com/helmayer/RPBA |
Framework | none |
Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception
Title | Precise Synthetic Image and LiDAR (PreSIL) Dataset for Autonomous Vehicle Perception |
Authors | Braden Hurl, Krzysztof Czarnecki, Steven Waslander |
Abstract | We introduce the Precise Synthetic Image and LiDAR (PreSIL) dataset for autonomous vehicle perception. Grand Theft Auto V (GTA V), a commercial video game, has a large detailed world with realistic graphics, which provides a diverse data collection environment. Existing works creating synthetic LiDAR data for autonomous driving with GTA V have not released their datasets, rely on an in-game raycasting function which represents people as cylinders, and can fail to capture vehicles past 30 metres. Our work creates a precise LiDAR simulator within GTA V which collides with detailed models for all entities no matter the type or position. The PreSIL dataset consists of over 50,000 frames and includes high-definition images with full resolution depth information, semantic segmentation (images), point-wise segmentation (point clouds), and detailed annotations for all vehicles and people. Collecting additional data with our framework is entirely automatic and requires no human annotation of any kind. We demonstrate the effectiveness of our dataset by showing an improvement of up to 5% average precision on the KITTI 3D Object Detection benchmark challenge when state-of-the-art 3D object detection networks are pre-trained with our data. The data and code are available at https://tinyurl.com/y3tb9sxy |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection, Semantic Segmentation |
Published | 2019-05-01 |
URL | https://arxiv.org/abs/1905.00160v2 |
https://arxiv.org/pdf/1905.00160v2.pdf | |
PWC | https://paperswithcode.com/paper/precise-synthetic-image-and-lidar-presil |
Repo | https://github.com/bradenhurl/DeepGTAV-PreSIL |
Framework | none |
Deep Visual Template-Free Form Parsing
Title | Deep Visual Template-Free Form Parsing |
Authors | Brian Davis, Bryan Morse, Scott Cohen, Brian Price, Chris Tensmeyer |
Abstract | Automatic, template-free extraction of information from form images is challenging due to the variety of form layouts. This is even more challenging for historical forms due to noise and degradation. A crucial part of the extraction process is associating input text with pre-printed labels. We present a learned, template-free solution to detecting pre-printed text and input text/handwriting and predicting pair-wise relationships between them. While previous approaches to this problem have been focused on clean images and clear layouts, we show our approach is effective in the domain of noisy, degraded, and varied form images. We introduce a new dataset of historical form images (late 1800s, early 1900s) for training and validating our approach. Our method uses a convolutional network to detect pre-printed text and input text lines. We pool features from the detection network to classify possible relationships in a language-agnostic way. We show that our proposed pairing method outperforms heuristic rules and that visual features are critical to obtaining high accuracy. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02576v2 |
https://arxiv.org/pdf/1909.02576v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-visual-template-free-form-parsing |
Repo | https://github.com/herobd/NAF_dataset |
Framework | none |
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning
Title | Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning |
Authors | Gregory Farquhar, Shimon Whiteson, Jakob Foerster |
Abstract | Gradient-based methods for optimisation of objectives in stochastic settings with unknown or intractable dynamics require estimators of derivatives. We derive an objective that, under automatic differentiation, produces low-variance unbiased estimators of derivatives at any order. Our objective is compatible with arbitrary advantage estimators, which allows the control of the bias and variance of any-order derivatives when using function approximation. Furthermore, we propose a method to trade off bias and variance of higher order derivatives by discounting the impact of more distant causal dependencies. We demonstrate the correctness and utility of our objective in analytically tractable MDPs and in meta-reinforcement-learning for continuous control. |
Tasks | Continuous Control |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10549v1 |
https://arxiv.org/pdf/1909.10549v1.pdf | |
PWC | https://paperswithcode.com/paper/loaded-dice-trading-off-bias-and-variance-in |
Repo | https://github.com/oxwhirl/loaded-dice |
Framework | none |
FlipTest: Fairness Testing via Optimal Transport
Title | FlipTest: Fairness Testing via Optimal Transport |
Authors | Emily Black, Samuel Yeom, Matt Fredrikson |
Abstract | We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, creating similar pairs of in-distribution samples. We show how to use these instances to detect discrimination by constructing a “flipset”: the set of individuals whose classifier output changes post-translation, which corresponds to the set of people who may be harmed because of their group membership. To shed light on why the model treats a given subgroup differently, FlipTest produces a “transparency report”: a ranking of features that are most associated with the model’s behavior on the flipset. Evaluating the approach on three case studies, we show that this provides a computationally inexpensive way to identify subgroups that may be harmed by model discrimination, including in cases where the model satisfies group fairness criteria. |
Tasks | |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09218v4 |
https://arxiv.org/pdf/1906.09218v4.pdf | |
PWC | https://paperswithcode.com/paper/fliptest-fairness-auditing-via-optimal |
Repo | https://github.com/samuel-yeom/fliptest |
Framework | tf |
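The flipset idea in the abstract above can be sketched on a tiny example: match each individual in one group to a counterpart in the other group by minimizing total transport cost, then collect the individuals whose classifier output differs from that of their match. The sketch below uses brute-force search over permutations as a stand-in for the optimal transport mapping, which is exact only for equal-sized groups with uniform weights and practical only for very small sets. The data, the cost function, and the threshold classifier are hypothetical.

```python
from itertools import permutations


def squared_distance(x, y):
    """Squared Euclidean cost between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def optimal_matching(group_a, group_b):
    """Return match[i] = index in group_b assigned to group_a[i] at minimal total cost."""
    indices = range(len(group_b))
    best = min(
        permutations(indices),
        key=lambda perm: sum(
            squared_distance(a, group_b[j]) for a, j in zip(group_a, perm)
        ),
    )
    return list(best)


def flipset(group_a, group_b, classifier):
    """Indices in group_a whose prediction changes when mapped to their match."""
    match = optimal_matching(group_a, group_b)
    return [
        i
        for i, j in enumerate(match)
        if classifier(group_a[i]) != classifier(group_b[j])
    ]


def classifier(x):
    # Hypothetical black-box model: positive when the second feature is at least 1.0.
    return 1 if x[1] >= 1.0 else 0


group_a = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
group_b = [(1.2, 2.1), (3.1, 2.9), (2.1, 0.4)]

print(optimal_matching(group_a, group_b))
print(flipset(group_a, group_b, classifier))
```

Because the classifier is only called, never inspected, the procedure stays black-box, which matches the setting the abstract describes.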
Self-organized inductive reasoning with NeMuS
Title | Self-organized inductive reasoning with NeMuS |
Authors | Leonardo Barreto, Edjard Mota |
Abstract | Neural Multi-Space (NeMuS) is a weighted multi-space representation for a portion of first-order logic designed for use with machine learning and neural network methods. It has been demonstrated that it can be used to perform reasoning based on regions forming patterns of refutation and also in the process of inductive learning in an ILP-like style. Initial experiments were carried out to investigate whether a self-organizing approach is suitable for generating similar concept regions according to the attributes that form such concepts. We present the results and analyze the suitability of the method in the process of inductive learning with NeMuS. |
Tasks | |
Published | 2019-06-16 |
URL | https://arxiv.org/abs/1906.06761v1 |
https://arxiv.org/pdf/1906.06761v1.pdf | |
PWC | https://paperswithcode.com/paper/self-organized-inductive-reasoning-with-nemus |
Repo | https://github.com/JustGlowing/minisom |
Framework | none |
Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Title | Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation |
Authors | Kenton Murray, Jeffery Kinnison, Toan Q. Nguyen, Walter Scheirer, David Chiang |
Abstract | Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model. |
Tasks | Machine Translation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.06717v1 |
https://arxiv.org/pdf/1910.06717v1.pdf | |
PWC | https://paperswithcode.com/paper/auto-sizing-the-transformer-network-improving |
Repo | https://github.com/KentonMurray/ProxGradPytorch |
Framework | pytorch |
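The abstract above says auto-sizing uses regularization to delete neurons during training. One common way to obtain that effect, assumed here for illustration, is a group penalty on each neuron's weight row followed by a proximal step that shrinks the row's norm and sets the whole row to zero once the norm falls below a threshold. The function name, the choice of an L2 group norm, and the numbers below are assumptions, not details taken from the abstract.

```python
import math


def prox_group_l2(weights, threshold):
    """Shrink each row's L2 norm by `threshold`; rows with smaller norm become zero.

    `weights` is a list of rows, one row per neuron. `threshold` plays the role
    of learning rate times regularization strength in a proximal gradient step.
    """
    result = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        if norm <= threshold:
            # The whole neuron is removed in one step.
            result.append([0.0] * len(row))
        else:
            scale = 1.0 - threshold / norm
            result.append([scale * w for w in row])
    return result


# Hypothetical layer with two neurons: one strong, one weak.
layer = [[3.0, 4.0], [0.3, 0.4]]
pruned = prox_group_l2(layer, threshold=1.0)
print(pruned)
```

In this example the first row (norm 5) is scaled by 0.8, while the second row (norm 0.5) is zeroed entirely, which is how an entire neuron can disappear within a single training run.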
Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume
Title | Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume |
Authors | Qingshan Xu, Wenbing Tao |
Abstract | Deep learning has been shown to be effective for depth inference in multi-view stereo (MVS). However, scalability and accuracy remain open problems in this domain. This can be attributed to the memory-consuming cost volume representation and inappropriate depth inference. Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume. This not only reduces the memory consumption but also reduces the computational burden in the cost volume filtering. Based on our effective cost volume representation, we propose a cascade 3D U-Net module to regularize the cost volume to further boost the performance. Unlike previous methods that treat multi-view depth inference as a depth regression problem or an inverse depth classification problem, we recast multi-view depth inference as an inverse depth regression task. This allows our network to achieve sub-pixel estimation and be applicable to large-scale scenes. Through extensive experiments on the DTU and Tanks and Temples datasets, we show that our proposed network with Correlation cost volume and Inverse DEpth Regression (CIDER) achieves state-of-the-art results, demonstrating its superior performance on scalability and accuracy. |
Tasks | Stereo Matching |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11746v1 |
https://arxiv.org/pdf/1912.11746v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-inverse-depth-regression-for-multi |
Repo | https://github.com/GhiXu/CIDER |
Framework | none |
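Inverse depth regression, as described in the abstract above, can be illustrated for a single pixel: sample depth hypotheses uniformly in inverse depth, turn per-hypothesis scores into probabilities, take the expected inverse depth, and invert it. The sketch below assumes a softmax over scores and a plain expectation; the function names, depth range, and scores are hypothetical, and the network that would produce the scores is out of scope.

```python
import math


def inverse_depth_hypotheses(d_min, d_max, count):
    """Sample `count` values uniformly in inverse depth from 1/d_min to 1/d_max."""
    near, far = 1.0 / d_min, 1.0 / d_max
    return [near + (far - near) * i / (count - 1) for i in range(count)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def regress_depth(scores, d_min, d_max):
    """Expected inverse depth under softmax(scores), converted back to depth."""
    inv_depths = inverse_depth_hypotheses(d_min, d_max, len(scores))
    probs = softmax(scores)
    expected_inverse = sum(p * inv for p, inv in zip(probs, inv_depths))
    return 1.0 / expected_inverse


# Hypothetical per-pixel scores over four hypotheses between 1 m and 4 m.
print(inverse_depth_hypotheses(1.0, 4.0, 4))
print(regress_depth([0.0, 0.0, 50.0, 0.0], 1.0, 4.0))
```

Because the output is a probability-weighted average rather than the index of the best hypothesis, the estimate can fall between sampled values, which is what makes sub-pixel (sub-hypothesis) precision possible.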
Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues
Title | Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues |
Authors | Or Levi, Pedram Hosseini, Mona Diab, David A. Broniatowski |
Abstract | The blurry line between nefarious fake news and protected-speech satire has been a notorious struggle for social media platforms. Further to the efforts of reducing exposure to misinformation on social media, purveyors of fake news have begun to masquerade as satire sites to avoid being demoted. In this work, we address the challenge of automatically classifying fake news versus satire. Previous work has studied whether fake news and satire can be distinguished based on language differences. Contrary to fake news, satire stories are usually humorous and carry some political or social message. We hypothesize that these nuances could be identified using semantic and linguistic cues. Consequently, we train a machine learning method using semantic representation, with a state-of-the-art contextual language model, and with linguistic features based on textual coherence metrics. Empirical evaluation attests to the merits of our approach compared to the language-based baseline and sheds light on the nuances between fake news and satire. As avenues for future work, we consider studying additional linguistic features related to the humor aspect, and enriching the data with current news events, to help identify a political or social message. |
Tasks | Language Modelling |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01160v2 |
https://arxiv.org/pdf/1910.01160v2.pdf | |
PWC | https://paperswithcode.com/paper/identifying-nuances-in-fake-news-vs-satire |
Repo | https://github.com/adverifai/Satire_vs_Fake |
Framework | none |
Differentiable Convex Optimization Layers
Title | Differentiable Convex Optimization Layers |
Authors | Akshay Agrawal, Brandon Amos, Shane Barratt, Stephen Boyd, Steven Diamond, Zico Kolter |
Abstract | Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver’s solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program. We implement our methodology in version 1.1 of CVXPY, a popular Python-embedded DSL for convex optimization, and additionally implement differentiable layers for disciplined convex programs in PyTorch and TensorFlow 2.0. Our implementation significantly lowers the barrier to using convex optimization problems in differentiable programs. We present applications in linear machine learning models and in stochastic control, and we show that our layer is competitive (in execution time) compared to specialized differentiable solvers from past work. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12430v1 |
https://arxiv.org/pdf/1910.12430v1.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-convex-optimization-layers |
Repo | https://github.com/cvxgrp/cvxpylayers |
Framework | pytorch |
Intrinsic dimension of data representations in deep neural networks
Title | Intrinsic dimension of data representations in deep neural networks |
Authors | Alessio Ansuini, Alessandro Laio, Jakob H. Macke, Davide Zoccolan |
Abstract | Deep neural networks progressively transform their inputs across multiple processing layers. What are the geometrical properties of the representations learned by these networks? Here we study the intrinsic dimensionality (ID) of data-representations, i.e. the minimal number of parameters needed to describe a representation. We find that, in a trained network, the ID is orders of magnitude smaller than the number of units in each layer. Across layers, the ID first increases and then progressively decreases in the final layers. Remarkably, the ID of the last hidden layer predicts classification accuracy on the test set. These results can neither be found by linear dimensionality estimates (e.g., with principal component analysis), nor in representations that had been artificially linearized. They are neither found in untrained networks, nor in networks that are trained on randomized labels. This suggests that neural networks that can generalize are those that transform the data into low-dimensional, but not necessarily flat manifolds. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12784v2 |
https://arxiv.org/pdf/1905.12784v2.pdf | |
PWC | https://paperswithcode.com/paper/intrinsic-dimension-of-data-representations |
Repo | https://github.com/ansuini/IntrinsicDimDeep |
Framework | pytorch |
Exploiting Temporal Relationships in Video Moment Localization with Natural Language
Title | Exploiting Temporal Relationships in Video Moment Localization with Natural Language |
Authors | Songyang Zhang, Jinsong Su, Jiebo Luo |
Abstract | We address the problem of video moment localization with natural language, i.e. localizing a video segment described by a natural language sentence. While most prior work focuses on grounding the query as a whole, temporal dependencies and reasoning between events within the text are not fully considered. In this paper, we propose a novel Temporal Compositional Modular Network (TCMN) where a tree attention network first automatically decomposes a sentence into three descriptions with respect to the main event, context event and temporal signal. Two modules are then utilized to measure the visual similarity and location similarity between each segment and the decomposed descriptions. Moreover, since the main event and context event may rely on different modalities (RGB or optical flow), we use late fusion to form an ensemble of four models, where each model is independently trained by one combination of the visual input. Experiments show that our model outperforms the state-of-the-art methods on the TEMPO dataset. |
Tasks | Optical Flow Estimation |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03846v1 |
https://arxiv.org/pdf/1908.03846v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-temporal-relationships-in-video |
Repo | https://github.com/Sy-Zhang/TCMN-Release |
Framework | pytorch |