April 3, 2020

3164 words 15 mins read

Paper Group AWR 58

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills. Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients. Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data. Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network. RL agents Implicitly Learning Human …

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

Title Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Authors Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres
Abstract Acquiring abilities in the absence of a task-oriented reward function is at the frontier of reinforcement learning research. This problem has been studied through the lens of empowerment, which draws a connection between option discovery and information theory. Information-theoretic skill discovery methods have garnered much interest from the community, but little research has been conducted in understanding their limitations. Through theoretical analysis and empirical evidence, we show that existing algorithms suffer from a common limitation – they discover options that provide a poor coverage of the state space. In light of this, we propose ‘Explore, Discover and Learn’ (EDL), an alternative approach to information-theoretic skill discovery. Crucially, EDL optimizes the same information-theoretic objective derived from the empowerment literature, but addresses the optimization problem using different machinery. We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned. Code is publicly available at https://github.com/victorcampos7/edl.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.03647v3
PDF https://arxiv.org/pdf/2002.03647v3.pdf
PWC https://paperswithcode.com/paper/explore-discover-and-learn-unsupervised
Repo https://github.com/victorcampos7/edl
Framework pytorch
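
The information-theoretic objective the abstract refers to is typically optimized with a variational skill discriminator. Below is a minimal sketch of that generic reward computation (a DIAYN-style r(s, z) = log q(z|s) - log p(z)); it is not EDL's actual training pipeline, and the network sizes and names are illustrative assumptions.

```python
# Sketch of the variational skill-discovery reward underlying empowerment-style
# methods (the objective family EDL also optimizes). The discriminator q(z|s)
# and the uniform skill prior p(z) are assumptions for illustration; EDL's
# actual procedure differs (see the paper and repo).
import torch
import torch.nn as nn
import torch.nn.functional as F

n_skills, state_dim = 8, 4

discriminator = nn.Sequential(          # q(z|s): infers the active skill from the state
    nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_skills)
)

def intrinsic_reward(state, skill_idx):
    """r(s, z) = log q(z|s) - log p(z), with p(z) uniform over skills."""
    log_q = F.log_softmax(discriminator(state), dim=-1)[skill_idx]
    log_p = -torch.log(torch.tensor(float(n_skills)))
    return (log_q - log_p).item()

state = torch.randn(state_dim)
print(intrinsic_reward(state, skill_idx=3))
```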

Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients

Title Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients
Authors Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski
Abstract In this paper, we introduce a novel form of value function, $Q(s, s’)$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still learning off-policy. We highlight the benefits of this approach in terms of value function transfer, learning within redundant action spaces, and learning off-policy from state observations generated by sub-optimal or completely random policies. Code and videos are available at \url{sites.google.com/view/qss-paper}.
Tasks Imitation Learning, Transfer Learning
Published 2020-02-21
URL https://arxiv.org/abs/2002.09505v1
PDF https://arxiv.org/pdf/2002.09505v1.pdf
PWC https://paperswithcode.com/paper/estimating-qss-with-deep-deterministic
Repo https://github.com/uber-research/D3G
Framework pytorch
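
As a rough illustration of the Q(s, s') formulation, here is a minimal PyTorch sketch in which a critic scores state-to-state transitions and a forward model proposes the next state that maximizes that critic. This is a hedged sketch under assumed network shapes, not the authors' D3G code; the real implementation is in the linked repo.

```python
# Minimal sketch of the Q(s, s') idea: a critic scores transitions between
# states, and a forward model m(s) proposes the value-maximizing next state.
# Layer sizes and variable names are illustrative assumptions.
import torch
import torch.nn as nn

state_dim = 6
critic = nn.Sequential(nn.Linear(2 * state_dim, 64), nn.ReLU(), nn.Linear(64, 1))     # Q(s, s')
model = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, state_dim))  # m(s)

def td_target(reward, next_state, gamma=0.99):
    """y = r + gamma * Q(s', m(s')): bootstrap through the proposed best next-next state."""
    with torch.no_grad():
        proposed = model(next_state)
        return reward + gamma * critic(torch.cat([next_state, proposed], dim=-1))

s, s_next, r = torch.randn(1, state_dim), torch.randn(1, state_dim), torch.tensor([[1.0]])
q = critic(torch.cat([s, s_next], dim=-1))
loss = nn.functional.mse_loss(q, td_target(r, s_next))
```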

Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data

Title Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data
Authors Yuxiao Zhou, Marc Habermann, Weipeng Xu, Ikhsanul Habibie, Christian Theobalt, Feng Xu
Abstract We present a novel method for monocular hand shape and pose estimation at unprecedented runtime performance of 100 fps and at state-of-the-art accuracy. This is enabled by a new learning-based architecture designed such that it can make use of all the sources of available hand training data: image data with either 2D or 3D annotations, as well as stand-alone 3D animations without corresponding image data. It features a 3D hand joint detection module and an inverse kinematics module which not only regresses 3D joint positions but also maps them to joint rotations in a single feed-forward pass. This output makes the method more directly usable for applications in computer vision and graphics compared to only regressing 3D joint positions. We demonstrate that our architectural design leads to a significant quantitative and qualitative improvement over the state of the art on several challenging benchmarks. Our model is publicly available for future research.
Tasks Motion Capture, Pose Estimation
Published 2020-03-21
URL https://arxiv.org/abs/2003.09572v1
PDF https://arxiv.org/pdf/2003.09572v1.pdf
PWC https://paperswithcode.com/paper/monocular-real-time-hand-shape-and-motion
Repo https://github.com/CalciferZh/minimal-hand
Framework none

Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network

Title Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network
Authors Jungkyu Lee, Taeryun Won, Tae Kwan Lee, Hyemin Lee, Geonmo Gu, Kiho Hong
Abstract Recent studies in image classification have demonstrated a variety of techniques for improving the performance of Convolutional Neural Networks (CNNs). However, attempts to combine existing techniques to create a practical model are still uncommon. In this study, we carry out extensive experiments to validate that carefully assembling these techniques and applying them to basic CNN models (e.g. ResNet and MobileNet) can improve the accuracy and robustness of the models while minimizing the loss of throughput. Our proposed assembled ResNet-50 shows improvements in top-1 accuracy from 76.3% to 82.78%, mCE from 76.0% to 48.9% and mFR from 57.7% to 32.3% on the ILSVRC2012 validation set, while inference throughput decreases only from 536 to 312. To verify the performance improvement in transfer learning, we tested fine-grained classification and image retrieval tasks on several public datasets and found that the improved backbone network significantly boosts transfer learning performance. Our approach achieved 1st place in the iFood fine-grained visual recognition competition at CVPR 2019, and the source code and trained models are available at https://github.com/clovaai/assembled-cnn
Tasks Fine-Grained Image Classification, Fine-Grained Visual Recognition, Image Classification, Image Retrieval, Transfer Learning
Published 2020-01-17
URL https://arxiv.org/abs/2001.06268v2
PDF https://arxiv.org/pdf/2001.06268v2.pdf
PWC https://paperswithcode.com/paper/compounding-the-performance-improvements-of
Repo https://github.com/clovaai/assembled-cnn
Framework tf

RL agents Implicitly Learning Human Preferences

Title RL agents Implicitly Learning Human Preferences
Authors Nevan Wichers
Abstract In the real world, RL agents should be rewarded for fulfilling human preferences. We show that RL agents implicitly learn the preferences of humans in their environment. Training a classifier to predict whether a simulated human’s preferences are fulfilled, based on the activations of an RL agent’s neural network, achieves 0.93 AUC; training a classifier on the raw environment state achieves only 0.8 AUC. Training the classifier on the RL agent’s activations also does much better than training on activations from an autoencoder. The human preference classifier can be used as the reward function of an RL agent to make the agent more beneficial for humans.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.06137v1
PDF https://arxiv.org/pdf/2002.06137v1.pdf
PWC https://paperswithcode.com/paper/rl-agents-implicitly-learning-human
Repo https://github.com/arunraja-hub/Preference_Extraction
Framework none
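
A minimal sketch of the evaluation idea described above: fit one classifier on agent activations and one on the raw state, and compare AUC. The arrays below are random placeholders standing in for logged rollouts, and the logistic-regression choice is an assumption rather than the paper's exact setup.

```python
# Sketch of the comparison: predict whether a (simulated) human's preference is
# fulfilled, once from an agent's hidden activations and once from the raw
# environment state, and compare AUC. Data here is a synthetic placeholder.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)        # 1 = preference fulfilled
activations = rng.normal(size=(1000, 64))     # agent hidden activations (placeholder)
raw_state = rng.normal(size=(1000, 16))       # raw environment observations (placeholder)

def auc_from_features(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

print("AUC from activations:", auc_from_features(activations, labels))
print("AUC from raw state:  ", auc_from_features(raw_state, labels))
```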

FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques

Title FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques
Authors Tai Vu, Leon Tran
Abstract Reinforcement learning is one of the most popular approaches to automated game playing. This method allows an agent to estimate the expected utility of its state in order to take optimal actions in an unknown environment. We seek to apply reinforcement learning algorithms to the game Flappy Bird. We implement SARSA and Q-Learning with some modifications such as an $\epsilon$-greedy policy, discretization and backward updates. We find that SARSA and Q-Learning outperform the baseline, regularly achieving scores of 1400+, with the highest in-game score of 2069.
Tasks Q-Learning
Published 2020-03-21
URL https://arxiv.org/abs/2003.09579v1
PDF https://arxiv.org/pdf/2003.09579v1.pdf
PWC https://paperswithcode.com/paper/flapai-bird-training-an-agent-to-play-flappy
Repo https://github.com/taivu1998/FlapAI-Bird
Framework pytorch
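
The core ingredients named in the abstract (state discretization, an epsilon-greedy policy, tabular updates) fit in a few lines of Python. The grid size, action encoding, and update shown below are illustrative assumptions, not the authors' exact settings.

```python
# Tabular Q-learning sketch in the spirit of the paper: discretize the
# continuous bird/pipe offsets into grid cells and update with an
# epsilon-greedy policy.
import random
from collections import defaultdict

Q = defaultdict(float)           # Q[(state, action)], defaults to 0.0
alpha, gamma, epsilon = 0.1, 0.95, 0.1
ACTIONS = (0, 1)                 # 0 = do nothing, 1 = flap

def discretize(dx, dy, grid=10):
    """Map continuous horizontal/vertical distance to the next pipe onto grid cells."""
    return (int(dx // grid), int(dy // grid))

def choose_action(state):
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```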

Deep Video Super-Resolution using HR Optical Flow Estimation

Title Deep Video Super-Resolution using HR Optical Flow Estimation
Authors Longguang Wang, Yulan Guo, Li Liu, Zaiping Lin, Xinpu Deng, Wei An
Abstract Video super-resolution (SR) aims at generating a sequence of high-resolution (HR) frames with plausible and temporally consistent details from their low-resolution (LR) counterparts. The key challenge for video SR lies in the effective exploitation of temporal dependency between consecutive frames. Existing deep learning-based methods commonly estimate optical flows between LR frames to provide temporal dependency. However, the resolution conflict between LR optical flows and HR outputs hinders the recovery of fine details. In this paper, we propose an end-to-end video SR network to super-resolve both optical flows and images. Optical flow SR from LR frames provides accurate temporal dependency and ultimately improves video SR performance. Specifically, we first propose an optical flow reconstruction network (OFRnet) to infer HR optical flows in a coarse-to-fine manner. Then, motion compensation is performed using HR optical flows to encode temporal dependency. Finally, compensated LR inputs are fed to a super-resolution network (SRnet) to generate SR results. Extensive experiments have been conducted to demonstrate the effectiveness of HR optical flows for SR performance improvement. Comparative results on the Vid4 and DAVIS-10 datasets show that our network achieves state-of-the-art performance.
Tasks Motion Compensation, Optical Flow Estimation, Super-Resolution, Video Super-Resolution
Published 2020-01-06
URL https://arxiv.org/abs/2001.02129v1
PDF https://arxiv.org/pdf/2001.02129v1.pdf
PWC https://paperswithcode.com/paper/deep-video-super-resolution-using-hr-optical
Repo https://github.com/LongguangWang/SOF-VSR
Framework pytorch
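
The motion-compensation step described above (warping a neighboring frame towards the reference frame with an HR optical flow) can be illustrated with a generic backward-warping routine built on grid_sample. This is a hedged sketch, not the authors' OFRnet/SRnet code.

```python
# Generic backward warping of a frame with a dense optical flow field.
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (B,C,H,W) with `flow` (B,2,H,W) given in pixel units."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0)   # (1,2,H,W), x first
    coords = base + flow                                       # where each pixel samples from
    # normalize coordinates to [-1, 1] as required by grid_sample
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((coords_x, coords_y), dim=-1)           # (B,H,W,2)
    return F.grid_sample(frame, grid, align_corners=True)

frame = torch.randn(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)    # zero flow: output equals input
compensated = warp(frame, flow)
```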

G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features

Title G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
Authors Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Ales Leonardis
Abstract In this paper, we propose a novel real-time 6D object pose estimation framework, named G2L-Net. Our network operates on point clouds from RGB-D detection in a divide-and-conquer fashion. Specifically, our network consists of three steps. First, we extract the coarse object point cloud from the RGB-D image by 2D detection. Second, we feed the coarse object point cloud to a translation localization network to perform 3D segmentation and object translation prediction. Third, via the predicted segmentation and translation, we transfer the fine object point cloud into a local canonical coordinate system, in which we train a rotation localization network to estimate the initial object rotation. In the third step, we define point-wise embedding vector features to capture viewpoint-aware information. To calculate a more accurate rotation, we adopt a rotation residual estimator to estimate the residual between the initial rotation and the ground truth, which can boost initial pose estimation performance. Our proposed G2L-Net is real-time despite the fact that multiple steps are stacked in the proposed coarse-to-fine framework. Extensive experiments on two benchmark datasets show that G2L-Net achieves state-of-the-art performance in terms of both accuracy and speed.
Tasks 6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published 2020-03-24
URL https://arxiv.org/abs/2003.11089v2
PDF https://arxiv.org/pdf/2003.11089v2.pdf
PWC https://paperswithcode.com/paper/g2l-net-global-to-local-network-for-real-time
Repo https://github.com/DC1991/G2L_Net
Framework pytorch
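
A very rough sketch of the coarse-to-fine pipeline in the abstract, with the two localization networks replaced by trivial placeholders; it only shows how the predicted segmentation and translation move the point cloud into a local canonical frame before rotation estimation. Everything below is an assumption for illustration, not the real G2L-Net architecture.

```python
# Coarse-to-fine 6D pose pipeline sketch following the abstract's three steps.
import numpy as np

def segment_and_translate(points):
    """Placeholder for the translation-localization network: returns a
    foreground mask and a predicted object translation."""
    mask = np.ones(len(points), dtype=bool)   # assume all points belong to the object
    t_pred = points.mean(axis=0)              # crude translation estimate
    return mask, t_pred

def estimate_pose(points):
    mask, t_pred = segment_and_translate(points)
    canonical = points[mask] - t_pred         # move into local canonical coordinates
    R_pred = np.eye(3)                        # placeholder for the rotation-localization output
    return R_pred, t_pred

pts = np.random.rand(1024, 3)
R, t = estimate_pose(pts)
```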

Correcting Knowledge Base Assertions

Title Correcting Knowledge Base Assertions
Authors Jiaoyan Chen, Xi Chen, Ian Horrocks, Ernesto Jimenez-Ruiz, Erik B. Myklebus
Abstract The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.
Tasks
Published 2020-01-19
URL https://arxiv.org/abs/2001.06917v1
PDF https://arxiv.org/pdf/2001.06917v1.pdf
PWC https://paperswithcode.com/paper/correcting-knowledge-base-assertions
Repo https://github.com/ChenJiaoyan/KG_Curation
Framework none
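
The lexical-matching component mentioned in the abstract can be illustrated with simple string similarity over a toy entity list; the full framework also relies on semantic embeddings, constraint mining, and consistency checking, which are not shown here. The entity list and names are assumptions.

```python
# Candidate generation by lexical similarity: propose replacement entities for
# a suspect assertion object. Toy data for illustration only.
from difflib import get_close_matches

entities = ["Barack Obama", "Michelle Obama", "Baracoa", "Ohio"]

def lexical_candidates(wrong_object, n=3, cutoff=0.6):
    """Return the n most lexically similar entity labels above the cutoff."""
    return get_close_matches(wrong_object, entities, n=n, cutoff=cutoff)

print(lexical_candidates("Barak Obama"))   # e.g. ['Barack Obama', ...]
```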

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Title UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Authors Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon
Abstract We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
Tasks Language Modelling
Published 2020-02-28
URL https://arxiv.org/abs/2002.12804v1
PDF https://arxiv.org/pdf/2002.12804v1.pdf
PWC https://paperswithcode.com/paper/unilmv2-pseudo-masked-language-models-for
Repo https://github.com/microsoft/unilm
Framework pytorch

Improved Fitness-Dependent Optimizer Algorithm

Title Improved Fitness-Dependent Optimizer Algorithm
Authors Danial A. Muhammed, Soran AM. Saeed, Tarik A. Rashid
Abstract The fitness-dependent optimizer (FDO) algorithm was recently introduced in 2019. An improved FDO (IFDO) algorithm is presented in this work, and this algorithm contributes considerably to refining the ability of the original FDO to address complicated optimization problems. To improve the FDO, the IFDO calculates the alignment and cohesion and then uses these behaviors with the pace at which the FDO updates its position. Moreover, in determining the weights, the FDO uses the weight factor (wf), which is zero in most cases and one in only a few cases. Conversely, the IFDO performs wf randomization in the [0-1] range and then minimizes the range when a better fitness weight value is achieved. In this work, the IFDO algorithm and its method of converging on the optimal solution are demonstrated. Additionally, 19 classical standard test function groups are utilized to test the IFDO, and then the FDO and three other well-known algorithms, namely, the particle swarm algorithm (PSO), dragonfly algorithm (DA), and genetic algorithm (GA), are selected to evaluate the IFDO results. Furthermore, the CECC06 2019 Competition, which is the set of IEEE Congress of Evolutionary Computation benchmark test functions, is utilized to test the IFDO, and then, the FDO and three recent algorithms, namely, the salp swarm algorithm (SSA), DA and whale optimization algorithm (WOA), are chosen to gauge the IFDO results. The results show that IFDO is practical in some cases, and its results are improved in most cases. Finally, to prove the practicability of the IFDO, it is used in real-world applications.
Tasks
Published 2020-01-16
URL https://arxiv.org/abs/2001.11820v1
PDF https://arxiv.org/pdf/2001.11820v1.pdf
PWC https://paperswithcode.com/paper/improved-fitness-dependent-optimizer
Repo https://github.com/Jaza-Abdullah/FDO-Java
Framework none
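
A loose, illustrative reading of the weight-factor handling described in the abstract: sample wf uniformly in [0, 1] and narrow the sampling range when a better fitness is found. The shrinking rule below is an assumption; the exact IFDO update is in the paper.

```python
# Illustrative wf randomization with range narrowing around better solutions.
import random

wf_low, wf_high = 0.0, 1.0
best_fitness = float("inf")

def sample_wf():
    return random.uniform(wf_low, wf_high)

def on_new_fitness(fitness, wf):
    """Shrink the wf sampling range around a wf value that produced a better fitness."""
    global wf_low, wf_high, best_fitness
    if fitness < best_fitness:
        best_fitness = fitness
        half = (wf_high - wf_low) / 4
        wf_low, wf_high = max(0.0, wf - half), min(1.0, wf + half)
```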

Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation

Title Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation
Authors Sunwoo Kim, Haici Yang, Minje Kim
Abstract Speech enhancement tasks have seen significant improvements with the advance of deep learning technology, but with the cost of increased computational complexity. In this study, we propose an adaptive boosting approach to learning locality sensitive hash codes, which represent audio spectra efficiently. We use the learned hash codes for single-channel speech denoising tasks as an alternative to a complex machine learning model, particularly to address the resource-constrained environments. Our adaptive boosting algorithm learns simple logistic regressors as the weak learners. Once trained, their binary classification results transform each spectrum of test noisy speech into a bit string. Simple bitwise operations calculate Hamming distance to find the K-nearest matching frames in the dictionary of training noisy speech spectra, whose associated ideal binary masks are averaged to estimate the denoising mask for that test mixture. Our proposed learning algorithm differs from AdaBoost in the sense that the projections are trained to minimize the distances between the self-similarity matrix of the hash codes and that of the original spectra, rather than the misclassification rate. We evaluate our discriminative hash codes on the TIMIT corpus with various noise types, and show comparative performance to deep learning methods in terms of denoising performance and complexity.
Tasks Denoising, Speech Enhancement
Published 2020-02-14
URL https://arxiv.org/abs/2002.06239v1
PDF https://arxiv.org/pdf/2002.06239v1.pdf
PWC https://paperswithcode.com/paper/boosted-locality-sensitive-hashing
Repo https://github.com/sunwookimiub/BLSH
Framework pytorch
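
The inference procedure in the abstract (hash the noisy test spectrum into a bit string, find the K nearest training frames by Hamming distance, average their ideal binary masks) is easy to sketch with NumPy. The projections and dictionary below are random placeholders standing in for the learned weak learners and training data.

```python
# Hash-based denoising-mask lookup sketch.
import numpy as np

rng = np.random.default_rng(0)
n_bits, n_freq, n_train = 32, 257, 5000

projections = rng.normal(size=(n_bits, n_freq))              # stand-in for learned weak learners
train_spectra = np.abs(rng.normal(size=(n_train, n_freq)))   # noisy training spectra (placeholder)
train_ibm = rng.integers(0, 2, size=(n_train, n_freq))       # associated ideal binary masks

train_codes = (train_spectra @ projections.T > 0).astype(np.uint8)

def hash_code(spectrum):
    return (projections @ spectrum > 0).astype(np.uint8)     # one bit per weak learner

def denoising_mask(test_spectrum, k=10):
    code = hash_code(test_spectrum)
    hamming = np.count_nonzero(train_codes != code, axis=1)  # bitwise disagreement per frame
    nearest = np.argsort(hamming)[:k]
    return train_ibm[nearest].mean(axis=0)                   # averaged ideal binary masks

mask = denoising_mask(np.abs(rng.normal(size=n_freq)))
```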

Real or Not Real, that is the Question

Title Real or Not Real, that is the Question
Authors Yuanbo Xiangli, Yubin Deng, Bo Dai, Chen Change Loy, Dahua Lin
Abstract While generative adversarial networks (GAN) have been widely adopted in various topics, in this paper we generalize the standard GAN to a new perspective by treating realness as a random variable that can be estimated from multiple angles. In this generalized framework, referred to as RealnessGAN, the discriminator outputs a distribution as the measure of realness. While RealnessGAN shares similar theoretical guarantees with the standard GAN, it provides more insights on adversarial learning. Compared to multiple baselines, RealnessGAN provides stronger guidance for the generator, achieving improvements on both synthetic and real-world datasets. Moreover, it enables the basic DCGAN architecture to generate realistic images at 1024*1024 resolution when trained from scratch.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05512v1
PDF https://arxiv.org/pdf/2002.05512v1.pdf
PWC https://paperswithcode.com/paper/real-or-not-real-that-is-the-question-1
Repo https://github.com/kam1107/RealnessGAN
Framework pytorch
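
A sketch of the central idea: the discriminator outputs a distribution over discrete "realness" outcomes instead of a scalar and is trained to match anchor distributions for real and generated samples via KL divergence. The anchors and bin count below are illustrative choices, not the paper's exact settings.

```python
# Realness-as-a-distribution sketch: KL between anchor distributions and the
# discriminator's outcome distribution.
import torch
import torch.nn.functional as F

n_outcomes = 10
A1 = F.softmax(torch.linspace(-1.0, 1.0, n_outcomes), dim=0)   # anchor for real images
A0 = F.softmax(torch.linspace(1.0, -1.0, n_outcomes), dim=0)   # anchor for generated images

def realness_kl(d_logits, anchor):
    """KL(anchor || D(x)), where D(x) is the softmax over the discriminator's outcome logits."""
    log_d = F.log_softmax(d_logits, dim=-1)
    return torch.sum(anchor * (torch.log(anchor) - log_d), dim=-1).mean()

d_real_logits = torch.randn(8, n_outcomes)    # discriminator outputs on a real batch (placeholder)
d_fake_logits = torch.randn(8, n_outcomes)    # discriminator outputs on a generated batch
d_loss = realness_kl(d_real_logits, A1) + realness_kl(d_fake_logits, A0)
```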

SUOD: Toward Scalable Unsupervised Outlier Detection

Title SUOD: Toward Scalable Unsupervised Outlier Detection
Authors Yue Zhao, Xueying Ding, Jianing Yang, Haoping Bai
Abstract Outlier detection is a key field of machine learning for identifying abnormal data objects. Due to the high expense of acquiring ground truth, unsupervised models are often chosen in practice. To compensate for the unstable nature of unsupervised algorithms, practitioners from high-stakes fields like finance, health, and security prefer to build a large number of models for further combination and analysis. However, this poses scalability challenges on high-dimensional, large datasets. In this study, we propose a three-module acceleration framework called SUOD to expedite the training and prediction of a large number of unsupervised detection models. SUOD’s Random Projection module can generate lower-dimensional subspaces for high-dimensional datasets while preserving their distance relationships. The Balanced Parallel Scheduling module can forecast the training and prediction cost of models with high confidence, so that the task scheduler can assign a nearly equal amount of work to each worker for efficient parallelization. SUOD also comes with a Pseudo-supervised Approximation module, which can approximate fitted unsupervised models with lower-complexity supervised regressors for fast prediction on unseen data; it may be considered an unsupervised knowledge distillation process. Notably, all three modules are independent with great flexibility to “mix and match”; a combination of modules can be chosen based on use cases. Extensive experiments on more than 30 benchmark datasets have shown the efficacy of SUOD, and a comprehensive future development plan is also presented.
Tasks Outlier Detection
Published 2020-02-08
URL https://arxiv.org/abs/2002.03222v1
PDF https://arxiv.org/pdf/2002.03222v1.pdf
PWC https://paperswithcode.com/paper/suod-toward-scalable-unsupervised-outlier
Repo https://github.com/yzhao062/SUOD
Framework none
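
Two of the three modules can be illustrated on toy data with scikit-learn: random projection before fitting detectors, and pseudo-supervised approximation of a fitted detector's scores with a fast supervised regressor. The specific detector (IsolationForest) and regressor below are assumptions for illustration, not SUOD's fixed options.

```python
# Random projection + pseudo-supervised approximation sketch.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.ensemble import IsolationForest, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 200))                      # high-dimensional toy data

# Module 1: project to a lower subspace that roughly preserves pairwise distances
X_low = GaussianRandomProjection(n_components=30, random_state=0).fit_transform(X)

# Fit an unsupervised detector on the projected data
detector = IsolationForest(random_state=0).fit(X_low)
scores = -detector.score_samples(X_low)               # higher = more anomalous

# Module 3: approximate the detector with a supervised regressor for fast prediction
approximator = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_low, scores)
fast_scores = approximator.predict(X_low)
```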

Two Routes to Scalable Credit Assignment without Weight Symmetry

Title Two Routes to Scalable Credit Assignment without Weight Symmetry
Authors Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jon Bloom, Daniel L. K. Yamins
Abstract The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport - the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another. Until recently, attempts to create local learning rules that avoid weight transport have typically failed in the large-scale learning scenarios where backpropagation shines, e.g. ImageNet categorization with deep convolutional networks. Here, we investigate a recently proposed local learning rule that yields competitive performance with backpropagation and find that it is highly sensitive to metaparameter choices, requiring laborious tuning that does not transfer across network architectures. Our analysis indicates the underlying mathematical reason for this instability, allowing us to identify a more robust local learning rule that better transfers without metaparameter tuning. Nonetheless, we find a performance and stability gap between this local rule and backpropagation that widens with increasing model depth. We then investigate several non-local learning rules that relax the need for instantaneous weight transport into a more biologically-plausible “weight estimation” process, showing that these rules match state-of-the-art performance on deep networks and operate effectively in the presence of noisy updates. Taken together, our results suggest two routes towards the discovery of neural implementations for credit assignment without weight symmetry: further improvement of local rules so that they perform consistently across architectures, and the identification of biological implementations for non-local learning mechanisms.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2003.01513v1
PDF https://arxiv.org/pdf/2003.01513v1.pdf
PWC https://paperswithcode.com/paper/two-routes-to-scalable-credit-assignment
Repo https://github.com/neuroailab/Neural-Alignment
Framework tf
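
For context on learning without weight symmetry, here is a minimal feedback-alignment sketch in which the backward pass uses a fixed random matrix instead of the transposed forward weights, so no weight transport is needed. It illustrates the problem setting the paper studies, not the specific local or non-local rules it proposes; sizes and names are assumptions.

```python
# Feedback alignment on a tiny two-layer network.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 8, 16, 4

W1, W2 = rng.normal(size=(n_hidden, n_in)), rng.normal(size=(n_out, n_hidden))
B2 = rng.normal(size=(n_hidden, n_out))      # fixed random feedback weights (replaces W2.T)

def step(x, y, lr=0.01):
    global W1, W2
    h = np.tanh(W1 @ x)
    y_hat = W2 @ h
    e = y_hat - y                             # output error
    delta_h = (B2 @ e) * (1 - h ** 2)         # error routed backwards through B2, not W2.T
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(delta_h, x)

step(rng.normal(size=n_in), rng.normal(size=n_out))
```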