April 3, 2020

3127 words 15 mins read

Paper Group AWR 72

Robust, Occlusion-aware Pose Estimation for Objects Grasped by Adaptive Hands

Title Robust, Occlusion-aware Pose Estimation for Objects Grasped by Adaptive Hands
Authors Bowen Wen, Chaitanya Mitash, Sruthi Soorian, Andrew Kimmel, Avishai Sintov, Kostas E. Bekris
Abstract Many manipulation tasks, such as placement or within-hand manipulation, require the object’s pose relative to a robot hand. The task is difficult when the hand significantly occludes the object. It is especially hard for adaptive hands, for which it is not easy to detect the finger’s configuration. In addition, RGB-only approaches face issues with texture-less objects or when the hand and the object look similar. This paper presents a depth-based framework, which aims for robust pose estimation and short response times. The approach detects the adaptive hand’s state via efficient parallel search given the highest overlap between the hand’s model and the point cloud. The hand’s point cloud is pruned and robust global registration is performed to generate object pose hypotheses, which are clustered. False hypotheses are pruned via physical reasoning. The remaining poses’ quality is evaluated given agreement with observed data. Extensive evaluation on synthetic and real data demonstrates the accuracy and computational efficiency of the framework when applied on challenging, highly-occluded scenarios for different object types. An ablation study identifies how the framework’s components help in performance. This work also provides a dataset for in-hand 6D object pose estimation. Code and dataset are available at: https://github.com/wenbowen123/icra20-hand-object-pose
Tasks 6D Pose Estimation using RGB, Pose Estimation
Published 2020-03-07
URL https://arxiv.org/abs/2003.03518v1
PDF https://arxiv.org/pdf/2003.03518v1.pdf
PWC https://paperswithcode.com/paper/robust-occlusion-aware-pose-estimation-for
Repo https://github.com/wenbowen123/icra20-hand-object-pose
Framework none
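
The final step of the pipeline ranks pose hypotheses by how well they agree with the observed depth data. As a rough illustration of that kind of scoring (not the authors' implementation; the inlier threshold and point clouds below are made up), one can compute the fraction of transformed model points lying close to the scene cloud:

```python
# Minimal sketch of scoring pose hypotheses by agreement with an observed
# point cloud (inlier ratio). Not the authors' implementation; the threshold
# and data below are illustrative placeholders.
import numpy as np
from scipy.spatial import cKDTree

def score_pose(model_points, observed_cloud, R, t, inlier_thresh=0.005):
    """Fraction of transformed model points within inlier_thresh (m) of the scene."""
    transformed = model_points @ R.T + t           # apply the hypothesised pose
    tree = cKDTree(observed_cloud)
    dists, _ = tree.query(transformed, k=1)
    return np.mean(dists < inlier_thresh)

# Toy data: a random "object model" and a scene containing it shifted by t_true.
rng = np.random.default_rng(0)
model = rng.uniform(-0.05, 0.05, size=(500, 3))
t_true = np.array([0.10, 0.02, 0.30])
scene = np.vstack([model + t_true, rng.uniform(-0.5, 0.5, size=(2000, 3))])

good = score_pose(model, scene, np.eye(3), t_true)
bad = score_pose(model, scene, np.eye(3), t_true + 0.05)
print(f"inlier ratio (correct pose): {good:.2f}, (wrong pose): {bad:.2f}")
```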

TensorFlow Quantum: A Software Framework for Quantum Machine Learning

Title TensorFlow Quantum: A Software Framework for Quantum Machine Learning
Authors Michael Broughton, Guillaume Verdon, Trevor McCourt, Antonio J. Martinez, Jae Hyeon Yoo, Sergei V. Isakov, Philip Massey, Murphy Yuezhen Niu, Ramin Halavati, Evan Peters, Martin Leib, Andrea Skolik, Michael Streif, David Von Dollen, Jarrod R. McClean, Sergio Boixo, Dave Bacon, Alan K. Ho, Hartmut Neven, Masoud Mohseni
Abstract We introduce TensorFlow Quantum (TFQ), an open source library for the rapid prototyping of hybrid quantum-classical models for classical or quantum data. This framework offers high-level abstractions for the design and training of both discriminative and generative quantum models under TensorFlow and supports high-performance quantum circuit simulators. We provide an overview of the software architecture and building blocks through several examples and review the theory of hybrid quantum-classical neural networks. We illustrate TFQ functionalities via several basic applications including supervised learning for quantum classification, quantum control, and quantum approximate optimization. Moreover, we demonstrate how one can apply TFQ to tackle advanced quantum learning tasks including meta-learning, Hamiltonian learning, and sampling thermal states. We hope this framework provides the necessary tools for the quantum computing and machine learning research communities to explore models of both natural and artificial quantum systems, and ultimately discover new quantum algorithms which could potentially yield a quantum advantage.
Tasks Meta-Learning, Quantum Machine Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.02989v1
PDF https://arxiv.org/pdf/2003.02989v1.pdf
PWC https://paperswithcode.com/paper/tensorflow-quantum-a-software-framework-for
Repo https://github.com/tensorflow/quantum
Framework tf
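
For a flavour of the library, the canonical TFQ pattern wraps a parameterized Cirq circuit in a Keras layer so its parameters are trained with ordinary TensorFlow optimizers. A minimal sketch in the style of the TFQ tutorials (exact API details may vary across library versions):

```python
# Minimal hybrid quantum-classical model in the spirit of the TFQ tutorials.
# A single-qubit parameterized circuit is trained to output <Z> = -1.
import cirq, sympy
import tensorflow as tf
import tensorflow_quantum as tfq

qubit = cirq.GridQubit(0, 0)
theta = sympy.Symbol('theta')
model_circuit = cirq.Circuit(cirq.rx(theta)(qubit))   # trainable rotation
readout = cirq.Z(qubit)                               # measured observable

circuit_input = tf.keras.Input(shape=(), dtype=tf.string)  # circuits enter as strings
expectation = tfq.layers.PQC(model_circuit, readout)(circuit_input)
model = tf.keras.Model(inputs=circuit_input, outputs=expectation)

# Train on a single empty input circuit with target expectation value -1.
inputs = tfq.convert_to_tensor([cirq.Circuit()])
targets = tf.constant([[-1.0]])
model.compile(optimizer=tf.keras.optimizers.Adam(0.1), loss='mse')
model.fit(inputs, targets, epochs=50, verbose=0)
print(model(inputs).numpy())   # moves toward [[-1.]]
```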

MixPath: A Unified Approach for One-shot Neural Architecture Search

Title MixPath: A Unified Approach for One-shot Neural Architecture Search
Authors Xiangxiang Chu, Xudong Li, Shun Lu, Bo Zhang, Jixiang Li
Abstract Blending multiple convolutional kernels has proven advantageous in neural architecture design. However, current neural architecture search approaches are mainly limited to stacked single-path search spaces, and how the one-shot doctrine can search for multi-path models remains unresolved. Specifically, we are motivated to train a multi-path supernet to accurately evaluate the candidate architectures. In this paper, we discover that in the studied search space, feature vectors summed from multiple paths are nearly multiples of those from a single path, which perturbs supernet training and its ranking ability. In this regard, we propose a novel mechanism called Shadow Batch Normalization (SBN) to regularize the disparate feature statistics. Extensive experiments prove that SBN is capable of stabilizing the training and improving the ranking performance (e.g. Kendall Tau 0.597 tested on NAS-Bench-101). We call our unified multi-path one-shot approach MixPath, which generates a series of models that achieve state-of-the-art results on ImageNet.
Tasks AutoML, Neural Architecture Search
Published 2020-01-16
URL https://arxiv.org/abs/2001.05887v3
PDF https://arxiv.org/pdf/2001.05887v3.pdf
PWC https://paperswithcode.com/paper/mixpath-a-unified-approach-for-one-shot
Repo https://github.com/xiaomi-automl/MixPath
Framework pytorch
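
The paper's Shadow Batch Normalization keeps separate BN statistics for each possible number of activated paths, so the summed features are always normalized by matching statistics. A rough PyTorch sketch of that idea (layer sizes and the path-sampling policy are illustrative, not the authors' code):

```python
# Rough sketch of Shadow Batch Normalization: one BatchNorm per possible
# number of active paths, selected at runtime. An interpretation of the
# MixPath idea, not the authors' implementation.
import torch
import torch.nn as nn

class ShadowBatchNorm2d(nn.Module):
    def __init__(self, channels, max_paths):
        super().__init__()
        # bns[k-1] normalizes features produced by summing k paths
        self.bns = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(max_paths))

    def forward(self, x, num_active_paths):
        return self.bns[num_active_paths - 1](x)

class MixBlock(nn.Module):
    """Sums a sampled subset of parallel convolutions, then applies the matching SBN."""
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes)
        self.sbn = ShadowBatchNorm2d(channels, max_paths=len(kernel_sizes))

    def forward(self, x, active):            # `active` = indices of sampled paths
        out = sum(self.paths[i](x) for i in active)
        return self.sbn(out, len(active))

block = MixBlock(16)
x = torch.randn(2, 16, 32, 32)
y = block(x, active=[0, 2])                  # two paths active -> second shadow BN
print(y.shape)
```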

Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus

Title Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus
Authors Bang Liu, Haojie Wei, Di Niu, Haolan Chen, Yancheng He
Abstract The ability to ask questions is important in both human and machine intelligence. Learning to ask questions helps knowledge acquisition, improves question-answering and machine reading comprehension tasks, and helps a chatbot to keep the conversation flowing with a human. Existing question generation models are ineffective at generating a large amount of high-quality question-answer pairs from unstructured text, since given an answer and an input passage, question generation is inherently a one-to-many mapping. In this paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which aims at automatically generating high-quality and diverse question-answer pairs from an unlabeled text corpus at scale by imitating the way a human asks questions. Our system consists of: i) an information extractor, which samples from the text multiple types of assistive information to guide question generation; ii) neural question generators, which generate diverse and controllable questions, leveraging the extracted assistive information; and iii) a neural quality controller, which removes low-quality generated data based on text entailment. We compare our question generation models with existing approaches and resort to voluntary human evaluation to assess the quality of the generated question-answer pairs. The evaluation results suggest that our system dramatically outperforms state-of-the-art neural question generation models in terms of generation quality, while being scalable in the meantime. With models trained on a relatively small amount of data, we can generate 2.8 million quality-assured question-answer pairs from a million sentences found in Wikipedia.
Tasks Chatbot, Machine Reading Comprehension, Question Answering, Question Generation, Reading Comprehension
Published 2020-01-27
URL https://arxiv.org/abs/2002.00748v2
PDF https://arxiv.org/pdf/2002.00748v2.pdf
PWC https://paperswithcode.com/paper/asking-questions-the-human-way-scalable
Repo https://github.com/bangliu/ACS-QG
Framework none

Learning Meta Face Recognition in Unseen Domains

Title Learning Meta Face Recognition in Unseen Domains
Authors Jianzhu Guo, Xiangyu Zhu, Chenxu Zhao, Dong Cao, Zhen Lei, Stan Z. Li
Abstract Face recognition systems are usually faced with unseen domains in real-world applications and show unsatisfactory performance due to their poor generalization. For example, a well-trained model on webface data cannot deal with the ID vs. Spot task in a surveillance scenario. In this paper, we aim to learn a generalized model that can directly handle new unseen domains without any model updating. To this end, we propose a novel face recognition method via meta-learning named Meta Face Recognition (MFR). MFR synthesizes the source/target domain shift with a meta-optimization objective, which requires the model to learn effective representations not only on synthesized source domains but also on synthesized target domains. Specifically, we build domain-shift batches through a domain-level sampling strategy and get back-propagated gradients/meta-gradients on synthesized source/target domains by optimizing multi-domain distributions. The gradients and meta-gradients are further combined to update the model to improve generalization. Besides, we propose two benchmarks for generalized face recognition evaluation. Experiments on our benchmarks validate the generalization of our method compared to several baselines and other state-of-the-art methods. The proposed benchmarks will be available at https://github.com/cleardusk/MFR.
Tasks Face Recognition, Meta-Learning
Published 2020-03-17
URL https://arxiv.org/abs/2003.07733v2
PDF https://arxiv.org/pdf/2003.07733v2.pdf
PWC https://paperswithcode.com/paper/learning-meta-face-recognition-in-unseen
Repo https://github.com/cleardusk/MFR
Framework none

Extending Maps with Semantic and Contextual Object Information for Robot Navigation: a Learning-Based Framework using Visual and Depth Cues

Title Extending Maps with Semantic and Contextual Object Information for Robot Navigation: a Learning-Based Framework using Visual and Depth Cues
Authors Renato Martins, Dhiego Bersan, Mario F. M. Campos, Erickson R. Nascimento
Abstract This paper addresses the problem of building augmented metric representations of scenes with semantic information from RGB-D images. We propose a complete framework to create an enhanced map representation of the environment with object-level information to be used in several applications such as human-robot interaction, assistive robotics, visual navigation, or in manipulation tasks. Our formulation leverages a CNN-based object detector (Yolo) with a 3D model-based segmentation technique to perform instance semantic segmentation, and to localize, identify, and track different classes of objects in the scene. The tracking and positioning of semantic classes is done with a dictionary of Kalman filters in order to combine sensor measurements over time, thereby providing more accurate maps. The formulation is designed to identify and to disregard dynamic objects in order to obtain a medium-term invariant map representation. The proposed method was evaluated with collected and publicly available RGB-D data sequences acquired in different indoor scenes. Experimental results show the potential of the technique to produce augmented semantic maps containing several objects (notably doors). We also provide the community with a dataset composed of annotated object classes (doors, fire extinguishers, benches, water fountains) and their positioning, as well as the source code as ROS packages.
Tasks Robot Navigation, Semantic Segmentation, Visual Navigation
Published 2020-03-13
URL https://arxiv.org/abs/2003.06336v1
PDF https://arxiv.org/pdf/2003.06336v1.pdf
PWC https://paperswithcode.com/paper/extending-maps-with-semantic-and-contextual
Repo https://github.com/verlab/3d-object-semantic-mapping
Framework tf
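
The object-tracking component, a dictionary of Kalman filters that fuses noisy 3D detections per object over time, can be illustrated with a simple constant-position filter per object ID. A toy numpy sketch (not the authors' ROS implementation; noise values and detections are placeholders):

```python
# Toy dictionary of constant-position Kalman filters, one per detected object,
# fusing noisy 3D position measurements over time. Illustrative only.
import numpy as np

class PositionKF:
    def __init__(self, z0, meas_var=0.05**2, process_var=0.01**2):
        self.x = np.asarray(z0, dtype=float)   # state: 3D position
        self.P = np.eye(3) * meas_var          # state covariance
        self.Q = np.eye(3) * process_var       # process noise (slow drift)
        self.R = np.eye(3) * meas_var          # measurement noise

    def update(self, z):
        self.P = self.P + self.Q                        # predict (static motion model)
        K = self.P @ np.linalg.inv(self.P + self.R)     # Kalman gain
        self.x = self.x + K @ (np.asarray(z) - self.x)  # correct with the measurement
        self.P = (np.eye(3) - K) @ self.P

filters = {}   # object id -> its Kalman filter
detections = [("door_1", [2.0, 0.1, 1.0]), ("door_1", [2.1, 0.0, 1.1]),
              ("bench_3", [5.0, -1.0, 0.4])]
for obj_id, z in detections:
    if obj_id not in filters:
        filters[obj_id] = PositionKF(z)
    else:
        filters[obj_id].update(z)

for obj_id, kf in filters.items():
    print(obj_id, np.round(kf.x, 3))
```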

Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems

Title Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems
Authors Patrick Knöbelreiter, Christian Sormann, Alexander Shekhovtsov, Friedrich Fraundorfer, Thomas Pock
Abstract It has been proposed by many researchers that combining deep neural networks with graphical models can create more efficient and better regularized composite models. The main difficulties in implementing this in practice are associated with a discrepancy in suitable learning objectives as well as with the necessity of approximations for the inference. In this work we take one of the simplest inference methods, a truncated max-product Belief Propagation, and add what is necessary to make it a proper component of a deep learning model: We connect it to learning formulations with losses on marginals and compute the backprop operation. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs), allowing us to design a hierarchical model composing BP inference and CNNs at different scale levels. The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
Tasks Optical Flow Estimation, Semantic Segmentation
Published 2020-03-13
URL https://arxiv.org/abs/2003.06258v1
PDF https://arxiv.org/pdf/2003.06258v1.pdf
PWC https://paperswithcode.com/paper/belief-propagation-reloaded-learning-bp
Repo https://github.com/VLOGroup/bp-layers
Framework none
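
On a chain-structured model, max-product belief propagation in the log domain reduces to a min-sum dynamic-programming sweep, which is the kind of inference a BP-Layer wraps and backpropagates through. A small numpy illustration on a toy 1D labeling problem (not the paper's implementation; the truncated-pairwise and learned-cost details are omitted):

```python
# Toy min-sum (max-product in the log domain) belief propagation on a chain,
# the kind of inference a BP-Layer wraps. Not the paper's implementation.
import numpy as np

def chain_min_sum(unary, lam=1.0):
    """unary: (N, L) per-node label costs; pairwise cost = lam * |l - l'|.
    Returns the label minimizing each node's min-marginal (MAP on a chain)."""
    N, L = unary.shape
    labels = np.arange(L)
    pair = lam * np.abs(labels[:, None] - labels[None, :])   # (L, L) smoothness cost
    fwd = np.zeros((N, L))      # messages passed left -> right
    bwd = np.zeros((N, L))      # messages passed right -> left
    for i in range(1, N):
        fwd[i] = np.min(fwd[i - 1] + unary[i - 1] + pair.T, axis=1)
        j = N - 1 - i
        bwd[j] = np.min(bwd[j + 1] + unary[j + 1] + pair, axis=1)
    beliefs = unary + fwd + bwd                               # min-marginals
    return beliefs.argmin(axis=1)

unary = np.array([[0.0, 2.0], [0.1, 1.0], [2.0, 0.0], [1.5, 0.2]])  # noisy costs
print(chain_min_sum(unary, lam=0.5))    # smoothed labeling, e.g. [0 0 1 1]
```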

A Framework for Interdomain and Multioutput Gaussian Processes

Title A Framework for Interdomain and Multioutput Gaussian Processes
Authors Mark van der Wilk, Vincent Dutordoir, ST John, Artem Artemev, Vincent Adam, James Hensman
Abstract One obstacle to the use of Gaussian processes (GPs) in large-scale problems, and as a component in deep learning systems, is the need for bespoke derivations and implementations for small variations in the model or inference. In order to improve the utility of GPs we need a modular system that allows rapid implementation and testing, as seen in the neural network community. We present a mathematical and software framework for scalable approximate inference in GPs, which combines interdomain approximations and multiple outputs. Our framework, implemented in GPflow, provides a unified interface for many existing multioutput models, as well as more recent convolutional structures. This simplifies the creation of deep models with GPs, and we hope that this work will encourage more interest in this approach.
Tasks Gaussian Processes
Published 2020-03-02
URL https://arxiv.org/abs/2003.01115v1
PDF https://arxiv.org/pdf/2003.01115v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-interdomain-and-multioutput
Repo https://github.com/GPflow/GPflow
Framework tf
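
This framework is what ships as GPflow's multioutput/interdomain module; a minimal use of it looks roughly like the following (GPflow 2.x style; class names and signatures may differ slightly between versions):

```python
# Minimal multioutput SVGP in GPflow's interdomain/multioutput framework
# (GPflow 2.x; class names may differ slightly between versions).
import numpy as np
import gpflow

P, M, D = 3, 20, 1                       # outputs, inducing points, input dim
X = np.random.rand(100, D)
Y = np.hstack([np.sin(6 * X + i) for i in range(P)]) + 0.1 * np.random.randn(100, P)

# One shared SquaredExponential kernel across the P outputs ...
kernel = gpflow.kernels.SharedIndependent(gpflow.kernels.SquaredExponential(), output_dim=P)
# ... with one shared set of inducing inputs.
Z = np.linspace(0, 1, M)[:, None]
iv = gpflow.inducing_variables.SharedIndependentInducingVariables(
    gpflow.inducing_variables.InducingPoints(Z))

model = gpflow.models.SVGP(kernel, gpflow.likelihoods.Gaussian(), iv, num_latent_gps=P)
opt = gpflow.optimizers.Scipy()
opt.minimize(model.training_loss_closure((X, Y)), model.trainable_variables,
             options=dict(maxiter=100))

mean, var = model.predict_f(np.linspace(0, 1, 5)[:, None])
print(mean.numpy().shape)    # (5, 3): one predictive mean per output
```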

PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

Title PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees
Authors Jonas Rothfuss, Vincent Fortuin, Andreas Krause
Abstract Meta-learning can successfully acquire useful inductive biases from data, especially when a large number of meta-tasks are available. Yet, its generalization properties to unseen tasks are poorly understood. Particularly if the number of meta-tasks is small, this raises concerns for potential overfitting. We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning with unbounded loss functions and Bayesian base learners. Using these bounds, we develop a class of PAC-optimal meta-learning algorithms with performance guarantees and a principled meta-regularization. When instantiating our PAC-optimal hyper-posterior (PACOH) with Gaussian processes as base learners, the resulting approach consistently outperforms several popular meta-learning methods, both in terms of predictive accuracy and the quality of its uncertainty estimates.
Tasks Gaussian Processes, Meta-Learning
Published 2020-02-13
URL https://arxiv.org/abs/2002.05551v1
PDF https://arxiv.org/pdf/2002.05551v1.pdf
PWC https://paperswithcode.com/paper/pacoh-bayes-optimal-meta-learning-with-pac
Repo https://github.com/jonasrothfuss/meta_learning_pacoh
Framework pytorch

Scalable Hyperparameter Optimization with Lazy Gaussian Processes

Title Scalable Hyperparameter Optimization with Lazy Gaussian Processes
Authors Raju Ram, Sabine Müller, Franz-Josef Pfreundt, Nicolas R. Gauger, Janis Keuper
Abstract Most machine learning methods require careful selection of hyper-parameters in order to train a high performing model with good generalization abilities. Hence, several automatic selection algorithms have been introduced to overcome tedious manual (trial and error) tuning of these parameters. Due to its very high sample efficiency, Bayesian Optimization over a Gaussian Process model of the parameter space has become the method of choice. Unfortunately, this approach suffers from cubic compute complexity due to the underlying Cholesky factorization, which makes it very hard to scale beyond a small number of sampling steps. In this paper, we present a novel, highly accurate approximation of the underlying Gaussian Process. Reducing its computational complexity from cubic to quadratic allows an efficient strong scaling of Bayesian Optimization while outperforming the previous approach in optimization accuracy. First experiments show speedups of a factor of 162 on a single node and a further speedup by a factor of 5 in a parallel environment.
Tasks Gaussian Processes, Hyperparameter Optimization
Published 2020-01-16
URL https://arxiv.org/abs/2001.05726v1
PDF https://arxiv.org/pdf/2001.05726v1.pdf
PWC https://paperswithcode.com/paper/scalable-hyperparameter-optimization-with-1
Repo https://github.com/cc-hpc-itwm/HPO_LazyGPR
Framework none
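
For context, the cubic cost comes from the Cholesky factorization of the n x n kernel matrix in exact GP inference, which standard Bayesian optimization pays at every step. The baseline computation the paper replaces looks like this generic sketch (numpy; this is the standard exact GP, not the paper's lazy approximation):

```python
# The O(n^3) step the paper targets: exact GP posterior via Cholesky of the
# n x n kernel matrix. Generic baseline sketch, not the lazy variant.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf(A, B, lengthscale=0.2):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    K = rbf(X, X) + noise * np.eye(len(X))
    L = cho_factor(K)                      # Cholesky: the O(n^3) bottleneck
    alpha = cho_solve(L, y)
    Ks = rbf(Xs, X)
    mean = Ks @ alpha
    var = rbf(Xs, Xs).diagonal() - np.einsum('ij,ji->i', Ks, cho_solve(L, Ks.T))
    return mean, var

X = np.random.rand(50, 1)
y = np.sin(8 * X[:, 0]) + 0.05 * np.random.randn(50)
mean, var = gp_posterior(X, y, np.linspace(0, 1, 10)[:, None])
print(mean.round(2))
```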

Neuroevolution of Neural Network Architectures Using CoDeepNEAT and Keras

Title Neuroevolution of Neural Network Architectures Using CoDeepNEAT and Keras
Authors Jonas da Silveira Bohrer, Bruno Iochins Grisci, Marcio Dorn
Abstract Machine learning is a huge field of study in computer science and statistics dedicated to the execution of computational tasks through algorithms that do not require explicit instructions but instead rely on learning patterns from data samples to automate inferences. A large portion of the work involved in a machine learning project is to define the best type of algorithm to solve a given problem. Neural networks - especially deep neural networks - are the predominant type of solution in the field. However, the networks themselves can produce very different results according to the architectural choices made for them. Finding the optimal network topology and configurations for a given problem is a challenge that requires domain knowledge and testing efforts due to a large number of parameters that need to be considered. The purpose of this work is to propose an adapted implementation of a well-established evolutionary technique from the neuroevolution field that manages to automate the tasks of topology and hyperparameter selection. It uses a popular and accessible machine learning framework - Keras - as the back-end, presenting results and proposed changes concerning the original algorithm. The implementation is available at GitHub (https://github.com/sbcblab/Keras-CoDeepNEAT) with documentation and examples to reproduce the experiments performed for this work.
Tasks
Published 2020-02-11
URL https://arxiv.org/abs/2002.04634v1
PDF https://arxiv.org/pdf/2002.04634v1.pdf
PWC https://paperswithcode.com/paper/neuroevolution-of-neural-network
Repo https://github.com/sbcblab/Keras-CoDeepNEAT
Framework tf
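
CoDeepNEAT co-evolves populations of modules and blueprints; as a much smaller illustration of the general idea of evolving Keras topologies and hyperparameters, one can mutate simple architecture genomes and keep the fittest. A toy sketch (deliberately far simpler than CoDeepNEAT, and not the authors' code; data and mutation rules are placeholders):

```python
# Toy evolutionary search over Keras MLP topologies: mutate (depth, width)
# genomes and keep the fittest. Far simpler than CoDeepNEAT; illustration only.
import random
import numpy as np
import tensorflow as tf

X = np.random.rand(500, 10)
y = (X.sum(axis=1) > 5).astype(np.float32)

def build(genome):
    model = tf.keras.Sequential([tf.keras.layers.Input(shape=(10,))])
    for units in genome:                         # genome = list of layer widths
        model.add(tf.keras.layers.Dense(units, activation='relu'))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

def fitness(genome):
    hist = build(genome).fit(X, y, epochs=3, validation_split=0.2, verbose=0)
    return hist.history['val_accuracy'][-1]

def mutate(genome):
    g = [max(4, u + random.choice([-8, 0, 8])) for u in genome]   # perturb widths
    if random.random() < 0.3:                                     # sometimes add/remove a layer
        g = g + [16] if random.random() < 0.5 or len(g) == 1 else g[:-1]
    return g

population = [[16], [32, 16]]
for generation in range(3):
    best_fit, best_genome = max((fitness(g), g) for g in population)
    print(f"gen {generation}: best genome {best_genome} val_acc {best_fit:.2f}")
    population = [best_genome] + [mutate(best_genome) for _ in range(3)]
```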

Re-Examining Linear Embeddings for High-Dimensional Bayesian Optimization

Title Re-Examining Linear Embeddings for High-Dimensional Bayesian Optimization
Authors Benjamin Letham, Roberto Calandra, Akshara Rai, Eytan Bakshy
Abstract Bayesian optimization (BO) is a popular approach to optimize expensive-to-evaluate black-box functions. A significant challenge in BO is to scale to high-dimensional parameter spaces while retaining sample efficiency. A solution considered in existing literature is to embed the high-dimensional space in a lower-dimensional manifold, often via a random linear embedding. In this paper, we identify several crucial issues and misconceptions about the use of linear embeddings for BO. We study the properties of linear embeddings from the literature and show that some of the design choices in current approaches adversely impact their performance. We show empirically that properly addressing these issues significantly improves the efficacy of linear embeddings for BO on a range of problems, including learning a gait policy for robot locomotion.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2001.11659v1
PDF https://arxiv.org/pdf/2001.11659v1.pdf
PWC https://paperswithcode.com/paper/re-examining-linear-embeddings-for-high-1
Repo https://github.com/facebookresearch/alebo
Framework pytorch
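
The construction under study is the random linear embedding: candidate points are chosen in a low-dimensional space, mapped into the high-dimensional box through a random matrix, and clipped when they leave the box. A minimal numpy sketch of that construction (with random search standing in for the Bayesian-optimization loop, so this is not ALEBO itself; the objective and dimensions are made up):

```python
# Minimal random-linear-embedding sketch for high-dimensional optimization:
# search a d-dim space, lift points to D dims via a random matrix, clip to the
# box. Random search stands in for the BO loop; this is not ALEBO itself.
import numpy as np

rng = np.random.default_rng(0)
D, d = 100, 4                         # ambient and embedding dimensions
A = rng.normal(size=(D, d))           # random linear embedding

def objective(x):                     # toy objective: only 2 of the D dims matter
    return (x[0] - 0.3) ** 2 + (x[7] + 0.5) ** 2

def lift(z):
    """Map a low-dimensional point into the high-dimensional box [-1, 1]^D."""
    return np.clip(A @ z, -1.0, 1.0)

best_val, best_z = np.inf, None
for _ in range(2000):                 # a BO acquisition step would go here instead
    z = rng.uniform(-1.0, 1.0, size=d)
    val = objective(lift(z))
    if val < best_val:
        best_val, best_z = val, z

print(f"best value {best_val:.4f} at x[0]={lift(best_z)[0]:.2f}, x[7]={lift(best_z)[7]:.2f}")
```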

Meta-learning Extractors for Music Source Separation

Title Meta-learning Extractors for Music Source Separation
Authors David Samuel, Aditya Ganeshan, Jason Naradowsky
Abstract We propose a hierarchical meta-learning-inspired model for music source separation (Meta-TasNet) in which a generator model is used to predict the weights of individual extractor models. This enables efficient parameter-sharing, while still allowing for instrument-specific parameterization. Meta-TasNet is shown to be more effective than models trained independently or in a multi-task setting, and achieves performance comparable to state-of-the-art methods. In comparison to the latter, our extractors contain fewer parameters and have faster run-time performance. We discuss important architectural considerations, and explore the costs and benefits of this approach.
Tasks Meta-Learning, Music Source Separation
Published 2020-02-17
URL https://arxiv.org/abs/2002.07016v1
PDF https://arxiv.org/pdf/2002.07016v1.pdf
PWC https://paperswithcode.com/paper/meta-learning-extractors-for-music-source
Repo https://github.com/pfnet-research/meta-tasnet
Framework pytorch
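
The central architectural idea, a generator network that predicts the weights of instrument-specific extractors, is essentially a hypernetwork. A stripped-down PyTorch sketch of that pattern (toy dimensions and a plain linear extractor; not Meta-TasNet itself):

```python
# Stripped-down hypernetwork: a generator predicts the weights of a small
# linear "extractor" from an instrument embedding. Toy sketch, not Meta-TasNet.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightGenerator(nn.Module):
    def __init__(self, num_instruments, feat_dim, emb_dim=32):
        super().__init__()
        self.embed = nn.Embedding(num_instruments, emb_dim)
        # Predict a (feat_dim x feat_dim) weight matrix and a bias per instrument.
        self.to_weight = nn.Linear(emb_dim, feat_dim * feat_dim)
        self.to_bias = nn.Linear(emb_dim, feat_dim)
        self.feat_dim = feat_dim

    def forward(self, instrument_id):
        e = self.embed(instrument_id)
        W = self.to_weight(e).view(self.feat_dim, self.feat_dim)
        b = self.to_bias(e)
        return W, b

generator = WeightGenerator(num_instruments=4, feat_dim=64)
mixture_features = torch.randn(8, 64)              # e.g. encoded audio frames

for instrument in range(4):                        # one generated extractor per source
    W, b = generator(torch.tensor(instrument))
    separated = F.linear(mixture_features, W, b)   # extractor applied with generated weights
    print(instrument, separated.shape)
```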

Vehicle Driving Assistant

Title Vehicle Driving Assistant
Authors Akanksha Dwivedi, Anoop Toffy, Athul Suresh, Tarini Chandrashekhar
Abstract Autonomous vehicles have become a common term in our day-to-day life, with car manufacturers like Tesla shipping cars that are SAE Level 3. While these vehicles include a slew of features such as parking assistance and cruise control, they have mostly been tailored to foreign roads. Potholes, and the abundance of them, are something unique to our Indian roads. We believe that successful detection of potholes from visual images can be applied in a variety of scenarios. Moreover, the sheer variety in the color, shape and size of potholes makes this problem an apt candidate to be solved using modern machine learning and image processing techniques.
Tasks Autonomous Vehicles
Published 2020-02-10
URL https://arxiv.org/abs/2002.03556v1
PDF https://arxiv.org/pdf/2002.03556v1.pdf
PWC https://paperswithcode.com/paper/vehicle-driving-assistant
Repo https://github.com/crunchbang/MP_Project
Framework none

Picking Winning Tickets Before Training by Preserving Gradient Flow

Title Picking Winning Tickets Before Training by Preserving Gradient Flow
Authors Chaoqi Wang, Guodong Zhang, Roger Grosse
Abstract Overparameterization has been shown to benefit both the optimization and generalization of neural networks, but large networks are resource hungry at both training and test time. Network pruning can reduce test-time resource requirements, but is typically applied to trained networks and therefore cannot avoid the expensive training process. We aim to prune networks at initialization, thereby saving resources at training time as well. Specifically, we argue that efficient training requires preserving the gradient flow through the network. This leads to a simple but effective pruning criterion we term Gradient Signal Preservation (GraSP). We empirically investigate the effectiveness of the proposed method with extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and ImageNet, using VGGNet and ResNet architectures. Our method can prune 80% of the weights of a VGG-16 network on ImageNet at initialization, with only a 1.6% drop in top-1 accuracy. Moreover, our method achieves significantly better performance than the baseline at extreme sparsity levels.
Tasks Network Pruning
Published 2020-02-18
URL https://arxiv.org/abs/2002.07376v1
PDF https://arxiv.org/pdf/2002.07376v1.pdf
PWC https://paperswithcode.com/paper/picking-winning-tickets-before-training-by-1
Repo https://github.com/alecwangcq/GraSP
Framework pytorch
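
The GraSP criterion scores each weight by how its removal would change the gradient norm, using a Hessian-gradient product obtained from a double backward pass. A condensed single-batch sketch of that computation (toy model and data; see the linked repo for the full method, including its score normalization and multi-batch accumulation):

```python
# Condensed single-batch sketch of the GraSP score -theta * (Hg), where Hg is
# a Hessian-gradient product from a double backward pass. Toy model/data; the
# linked repo adds score normalization and multi-batch accumulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(128, 20), torch.randint(0, 10, (128,))
weights = [p for p in model.parameters() if p.dim() > 1]   # score weight matrices only

loss = F.cross_entropy(model(x), y)
grads = torch.autograd.grad(loss, weights, create_graph=True)
# Differentiating g . stop_grad(g) w.r.t. the weights yields the product H g.
gTg = sum((g * g.detach()).sum() for g in grads)
Hg = torch.autograd.grad(gTg, weights)

score_list = [-w.detach() * hg for w, hg in zip(weights, Hg)]
all_scores = torch.cat([s.flatten() for s in score_list])
keep_ratio = 0.2                                           # keep 20% of the weights
threshold = torch.quantile(all_scores, keep_ratio)         # highest scores get pruned
masks = [s <= threshold for s in score_list]               # True = weight is kept
for w, m in zip(weights, masks):
    print(tuple(w.shape), f"kept {m.float().mean().item():.1%}")
```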