April 3, 2020

Paper Group AWR 58

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills


Title	Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Authors	Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres
Abstract	Acquiring abilities in the absence of a task-oriented reward function is at the frontier of reinforcement learning research. This problem has been studied through the lens of empowerment, which draws a connection between option discovery and information theory. Information-theoretic skill discovery methods have garnered much interest from the community, but little research has been conducted in understanding their limitations. Through theoretical analysis and empirical evidence, we show that existing algorithms suffer from a common limitation – they discover options that provide a poor coverage of the state space. In light of this, we propose ‘Explore, Discover and Learn’ (EDL), an alternative approach to information-theoretic skill discovery. Crucially, EDL optimizes the same information-theoretic objective derived from the empowerment literature, but addresses the optimization problem using different machinery. We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned. Code is publicly available at https://github.com/victorcampos7/edl.
Tasks
Published	2020-02-10
URL	https://arxiv.org/abs/2002.03647v3
PDF	https://arxiv.org/pdf/2002.03647v3.pdf
PWC	https://paperswithcode.com/paper/explore-discover-and-learn-unsupervised
Repo	https://github.com/victorcampos7/edl
Framework	pytorch
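
The abstract above refers to an information-theoretic objective without writing it out. As a hedged reading, skill discovery methods of this kind commonly maximize the mutual information between visited states and a latent skill variable. The symbols $S$ (states) and $Z$ (skills) are assumptions introduced here for illustration; the two lines are the standard equivalent decompositions of mutual information, not a formula quoted from the paper.

```latex
\begin{align}
I(S; Z) &= H(Z) - H(Z \mid S) \\
        &= H(S) - H(S \mid Z)
\end{align}
```

Read this way, the second form makes the state entropy $H(S)$ explicit, which is one way to see why an objective of this kind is tied to how well the discovered skills cover the state space.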

Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients


Title	Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients
Authors	Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski
Abstract	In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still learning off-policy. We highlight the benefits of this approach in terms of value function transfer, learning within redundant action spaces, and learning off-policy from state observations generated by sub-optimal or completely random policies. Code and videos are available at sites.google.com/view/qss-paper.
Tasks	Imitation Learning, Transfer Learning
Published	2020-02-21
URL	https://arxiv.org/abs/2002.09505v1
PDF	https://arxiv.org/pdf/2002.09505v1.pdf
PWC	https://paperswithcode.com/paper/estimating-qss-with-deep-deterministic
Repo	https://github.com/uber-research/D3G
Framework	pytorch
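
The definition in the abstract (the value of moving from $s$ to a neighboring $s'$ and acting optimally afterwards) suggests a Bellman-style recursion over state pairs. The sketch below is one plausible way to write it, assuming deterministic transitions; the reward $r(s, s')$, discount $\gamma$, neighbor set $N(s')$ and forward model $\tau$ are notation introduced here, not taken from the abstract.

```latex
\begin{align}
Q(s, s') &= r(s, s') + \gamma \max_{s'' \in N(s')} Q(s', s'') \\
\tau(s)  &\approx \arg\max_{s' \in N(s)} Q(s, s')
\end{align}
```

Under these assumptions, the first line carries no action variable at all, which matches the abstract's claim that the formulation decouples actions from values; the second line is the role the abstract assigns to the forward dynamics model.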


Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data


Title	Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data
Authors	Yuxiao Zhou, Marc Habermann, Weipeng Xu, Ikhsanul Habibie, Christian Theobalt, Feng Xu
Abstract	We present a novel method for monocular hand shape and pose estimation at unprecedented runtime performance of 100fps and at state-of-the-art accuracy. This is enabled by a new learning based architecture designed such that it can make use of all the sources of available hand training data: image data with either 2D or 3D annotations, as well as stand-alone 3D animations without corresponding image data. It features a 3D hand joint detection module and an inverse kinematics module which regresses not only 3D joint positions but also maps them to joint rotations in a single feed-forward pass. This output makes the method more directly usable for applications in computer vision and graphics compared to only regressing 3D joint positions. We demonstrate that our architectural design leads to a significant quantitative and qualitative improvement over the state of the art on several challenging benchmarks. Our model is publicly available for future research.
Tasks	Motion Capture, Pose Estimation
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09572v1
PDF	https://arxiv.org/pdf/2003.09572v1.pdf
PWC	https://paperswithcode.com/paper/monocular-real-time-hand-shape-and-motion
Repo	https://github.com/CalciferZh/minimal-hand
Framework	none

Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network


Title	Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network
Authors	Jungkyu Lee, Taeryun Won, Tae Kwan Lee, Hyemin Lee, Geonmo Gu, Kiho Hong
Abstract	Recent studies in image classification have demonstrated a variety of techniques for improving the performance of Convolutional Neural Networks (CNNs). However, attempts to combine existing techniques to create a practical model are still uncommon. In this study, we carry out extensive experiments to validate that carefully assembling these techniques and applying them to basic CNN models (e.g. ResNet and MobileNet) can improve the accuracy and robustness of the models while minimizing the loss of throughput. Our proposed assembled ResNet-50 shows improvements in top-1 accuracy from 76.3% to 82.78%, mCE from 76.0% to 48.9% and mFR from 57.7% to 32.3% on ILSVRC2012 validation set. With these improvements, inference throughput only decreases from 536 to 312. To verify the performance improvement in transfer learning, fine grained classification and image retrieval tasks were tested on several public datasets and showed that the improvement to backbone network performance boosted transfer learning performance significantly. Our approach achieved 1st place in the iFood Competition Fine-Grained Visual Recognition at CVPR 2019, and the source code and trained models are available at https://github.com/clovaai/assembled-cnn
Tasks	Fine-Grained Image Classification, Fine-Grained Visual Recognition, Image Classification, Image Retrieval, Transfer Learning
Published	2020-01-17
URL	https://arxiv.org/abs/2001.06268v2
PDF	https://arxiv.org/pdf/2001.06268v2.pdf
PWC	https://paperswithcode.com/paper/compounding-the-performance-improvements-of
Repo	https://github.com/clovaai/assembled-cnn
Framework	tf

RL agents Implicitly Learning Human Preferences


Title	RL agents Implicitly Learning Human Preferences
Authors	Nevan Wichers
Abstract	In the real world, RL agents should be rewarded for fulfilling human preferences. We show that RL agents implicitly learn the preferences of humans in their environment. A classifier trained to predict whether a simulated human’s preferences are fulfilled, based on the activations of an RL agent’s neural network, achieves 0.93 AUC. A classifier trained on the raw environment state achieves only 0.8 AUC. Training the classifier on the RL agent’s activations also does much better than training on activations from an autoencoder. The human preference classifier can be used as the reward function of an RL agent to make RL agents more beneficial for humans.
Tasks
Published	2020-02-14
URL	https://arxiv.org/abs/2002.06137v1
PDF	https://arxiv.org/pdf/2002.06137v1.pdf
PWC	https://paperswithcode.com/paper/rl-agents-implicitly-learning-human
Repo	https://github.com/arunraja-hub/Preference_Extraction
Framework	none

FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques


Title	FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques
Authors	Tai Vu, Leon Tran
Abstract	Reinforcement learning is one of the most popular approaches for automated game playing. This method allows an agent to estimate the expected utility of its state in order to take optimal actions in an unknown environment. We seek to apply reinforcement learning algorithms to the game Flappy Bird. We implement SARSA and Q-Learning with some modifications such as an $\epsilon$-greedy policy, discretization and backward updates. We find that SARSA and Q-Learning outperform the baseline, regularly achieving scores of 1400+, with the highest in-game score of 2069.
Tasks	Q-Learning
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09579v1
PDF	https://arxiv.org/pdf/2003.09579v1.pdf
PWC	https://paperswithcode.com/paper/flapai-bird-training-an-agent-to-play-flappy
Repo	https://github.com/taivu1998/FlapAI-Bird
Framework	pytorch
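
To make the Q-Learning and $\epsilon$-greedy ingredients named in the abstract concrete, here is a minimal tabular sketch. It is not the authors' implementation: the toy chain environment, the function names and the hyperparameters are all hypothetical, and the discretization and backward-update modifications mentioned in the abstract are not shown.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so an all-zero table still explores.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_learning_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """Move Q(s, a) toward the target r + gamma * max_a' Q(s_next, a')."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train_chain(n_states=5, episodes=200, epsilon=0.2, seed=0):
    """Hypothetical toy task: action 1 moves right, action 0 moves left.

    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            a = epsilon_greedy(q[s], epsilon, rng)
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            q_learning_update(q, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return q
```

Calling `train_chain()` returns a table in which moving right has the higher value in every non-terminal state. SARSA would differ only in the target, which uses the value of the action actually taken in `s_next` instead of the maximum.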

Deep Video Super-Resolution using HR Optical Flow Estimation


Title	Deep Video Super-Resolution using HR Optical Flow Estimation
Authors	Longguang Wang, Yulan Guo, Li Liu, Zaiping Lin, Xinpu Deng, Wei An
Abstract	Video super-resolution (SR) aims at generating a sequence of high-resolution (HR) frames with plausible and temporally consistent details from their low-resolution (LR) counterparts. The key challenge for video SR lies in the effective exploitation of temporal dependency between consecutive frames. Existing deep learning based methods commonly estimate optical flows between LR frames to provide temporal dependency. However, the resolution conflict between LR optical flows and HR outputs hinders the recovery of fine details. In this paper, we propose an end-to-end video SR network to super-resolve both optical flows and images. Optical flow SR from LR frames provides accurate temporal dependency and ultimately improves video SR performance. Specifically, we first propose an optical flow reconstruction network (OFRnet) to infer HR optical flows in a coarse-to-fine manner. Then, motion compensation is performed using HR optical flows to encode temporal dependency. Finally, compensated LR inputs are fed to a super-resolution network (SRnet) to generate SR results. Extensive experiments have been conducted to demonstrate the effectiveness of HR optical flows for SR performance improvement. Comparative results on the Vid4 and DAVIS-10 datasets show that our network achieves the state-of-the-art performance.
Tasks	Motion Compensation, Optical Flow Estimation, Super-Resolution, Video Super-Resolution
Published	2020-01-06
URL	https://arxiv.org/abs/2001.02129v1
PDF	https://arxiv.org/pdf/2001.02129v1.pdf
PWC	https://paperswithcode.com/paper/deep-video-super-resolution-using-hr-optical
Repo	https://github.com/LongguangWang/SOF-VSR
Framework	pytorch

G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features


Title	G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
Authors	Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Ales Leonardis
Abstract	In this paper, we propose a novel real-time 6D object pose estimation framework, named G2L-Net. Our network operates on point clouds from RGB-D detection in a divide-and-conquer fashion. Specifically, our network consists of three steps. First, we extract the coarse object point cloud from the RGB-D image by 2D detection. Second, we feed the coarse object point cloud to a translation localization network to perform 3D segmentation and object translation prediction. Third, via the predicted segmentation and translation, we transfer the fine object point cloud into a local canonical coordinate, in which we train a rotation localization network to estimate initial object rotation. In the third step, we define point-wise embedding vector features to capture viewpoint-aware information. To calculate more accurate rotation, we adopt a rotation residual estimator to estimate the residual between initial rotation and ground truth, which can boost initial pose estimation performance. Our proposed G2L-Net is real-time despite the fact that multiple steps are stacked via the proposed coarse-to-fine framework. Extensive experiments on two benchmark datasets show that G2L-Net achieves state-of-the-art performance in terms of both accuracy and speed.
Tasks	6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published	2020-03-24
URL	https://arxiv.org/abs/2003.11089v2
PDF	https://arxiv.org/pdf/2003.11089v2.pdf
PWC	https://paperswithcode.com/paper/g2l-net-global-to-local-network-for-real-time
Repo	https://github.com/DC1991/G2L_Net
Framework	pytorch

Correcting Knowledge Base Assertions


Title	Correcting Knowledge Base Assertions
Authors	Jiaoyan Chen, Xi Chen, Ian Horrocks, Ernesto Jimenez-Ruiz, Erik B. Myklebust
Abstract	The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.
Tasks
Published	2020-01-19
URL	https://arxiv.org/abs/2001.06917v1
PDF	https://arxiv.org/pdf/2001.06917v1.pdf
PWC	https://paperswithcode.com/paper/correcting-knowledge-base-assertions
Repo	https://github.com/ChenJiaoyan/KG_Curation
Framework	none

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training


Title	UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Authors	Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon
Abstract	We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
Tasks	Language Modelling
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12804v1
PDF	https://arxiv.org/pdf/2002.12804v1.pdf
PWC	https://paperswithcode.com/paper/unilmv2-pseudo-masked-language-models-for
Repo	https://github.com/microsoft/unilm
Framework	pytorch

Improved Fitness-Dependent Optimizer Algorithm


Title	Improved Fitness-Dependent Optimizer Algorithm
Authors	Danial A. Muhammed, Soran AM. Saeed, Tarik A. Rashid
Abstract	The fitness-dependent optimizer (FDO) algorithm was introduced in 2019. An improved FDO (IFDO) algorithm is presented in this work, and this algorithm contributes considerably to refining the ability of the original FDO to address complicated optimization problems. To improve the FDO, the IFDO calculates the alignment and cohesion and then uses these behaviors with the pace at which the FDO updates its position. Moreover, in determining the weights, the FDO uses the weight factor (wf), which is zero in most cases and one in only a few cases. Conversely, the IFDO performs wf randomization in the [0, 1] range and then minimizes the range when a better fitness weight value is achieved. In this work, the IFDO algorithm and its method of converging on the optimal solution are demonstrated. Additionally, 19 classical standard test function groups are utilized to test the IFDO, and then the FDO and three other well-known algorithms, namely, the particle swarm algorithm (PSO), dragonfly algorithm (DA), and genetic algorithm (GA), are selected to evaluate the IFDO results. Furthermore, the CEC-C06 2019 Competition, which is the set of IEEE Congress of Evolutionary Computation benchmark test functions, is utilized to test the IFDO, and then, the FDO and three recent algorithms, namely, the salp swarm algorithm (SSA), DA and whale optimization algorithm (WOA), are chosen to gauge the IFDO results. The results show that IFDO is practical in some cases, and its results are improved in most cases. Finally, to prove the practicability of the IFDO, it is used in real-world applications.
Tasks
Published	2020-01-16
URL	https://arxiv.org/abs/2001.11820v1
PDF	https://arxiv.org/pdf/2001.11820v1.pdf
PWC	https://paperswithcode.com/paper/improved-fitness-dependent-optimizer
Repo	https://github.com/Jaza-Abdullah/FDO-Java
Framework	none

Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation


Title	Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation
Authors	Sunwoo Kim, Haici Yang, Minje Kim
Abstract	Speech enhancement tasks have seen significant improvements with the advance of deep learning technology, but at the cost of increased computational complexity. In this study, we propose an adaptive boosting approach to learning locality sensitive hash codes, which represent audio spectra efficiently. We use the learned hash codes for single-channel speech denoising tasks as an alternative to a complex machine learning model, particularly to address resource-constrained environments. Our adaptive boosting algorithm learns simple logistic regressors as the weak learners. Once trained, their binary classification results transform each spectrum of test noisy speech into a bit string. Simple bitwise operations calculate Hamming distance to find the K-nearest matching frames in the dictionary of training noisy speech spectra, whose associated ideal binary masks are averaged to estimate the denoising mask for that test mixture. Our proposed learning algorithm differs from AdaBoost in the sense that the projections are trained to minimize the distances between the self-similarity matrix of the hash codes and that of the original spectra, rather than the misclassification rate. We evaluate our discriminative hash codes on the TIMIT corpus with various noise types, and show comparable performance to deep learning methods in terms of denoising performance and complexity.
Tasks	Denoising, Speech Enhancement
Published	2020-02-14
URL	https://arxiv.org/abs/2002.06239v1
PDF	https://arxiv.org/pdf/2002.06239v1.pdf
PWC	https://paperswithcode.com/paper/boosted-locality-sensitive-hashing
Repo	https://github.com/sunwookimiub/BLSH
Framework	pytorch
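
The lookup step described in the abstract (Hamming distance over bit strings, K nearest dictionary frames, averaged ideal binary masks) can be sketched in a few lines. This is an illustration only, assuming hash codes are stored as Python integers and masks as lists of 0/1 values; the function names are hypothetical, and the boosted training of the projections is not shown.

```python
def hamming(a, b):
    """Hamming distance between two bit strings stored as integers."""
    return bin(a ^ b).count("1")  # XOR marks differing bits, then count them


def estimate_mask(query_code, dict_codes, dict_masks, k):
    """Average the binary masks of the k dictionary frames closest to the query.

    dict_codes[i] is the hash code of training frame i and dict_masks[i] is
    its ideal binary mask (one 0/1 entry per frequency bin).
    """
    order = sorted(range(len(dict_codes)),
                   key=lambda i: hamming(query_code, dict_codes[i]))
    nearest = order[:k]
    n_bins = len(dict_masks[0])
    return [sum(dict_masks[i][f] for i in nearest) / k for f in range(n_bins)]


# Tiny made-up dictionary: three frames with 4-bit codes and 2-bin masks.
codes = [0b0000, 0b0001, 0b1111]
masks = [[1, 0], [1, 1], [0, 0]]
soft_mask = estimate_mask(0b0000, codes, masks, k=2)
```

Here the two nearest frames are the first two, so `soft_mask` is `[1.0, 0.5]`. Averaging binary masks yields a soft mask with values between 0 and 1.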

Real or Not Real, that is the Question


Title	Real or Not Real, that is the Question
Authors	Yuanbo Xiangli, Yubin Deng, Bo Dai, Chen Change Loy, Dahua Lin
Abstract	While generative adversarial networks (GAN) have been widely adopted in various topics, in this paper we generalize the standard GAN to a new perspective by treating realness as a random variable that can be estimated from multiple angles. In this generalized framework, referred to as RealnessGAN, the discriminator outputs a distribution as the measure of realness. While RealnessGAN shares similar theoretical guarantees with the standard GAN, it provides more insights on adversarial learning. Compared to multiple baselines, RealnessGAN provides stronger guidance for the generator, achieving improvements on both synthetic and real-world datasets. Moreover, it enables the basic DCGAN architecture to generate realistic images at 1024*1024 resolution when trained from scratch.
Tasks
Published	2020-02-12
URL	https://arxiv.org/abs/2002.05512v1
PDF	https://arxiv.org/pdf/2002.05512v1.pdf
PWC	https://paperswithcode.com/paper/real-or-not-real-that-is-the-question-1
Repo	https://github.com/kam1107/RealnessGAN
Framework	pytorch

SUOD: Toward Scalable Unsupervised Outlier Detection


Title	SUOD: Toward Scalable Unsupervised Outlier Detection
Authors	Yue Zhao, Xueying Ding, Jianing Yang, Haoping Bai
Abstract	Outlier detection is a key field of machine learning for identifying abnormal data objects. Due to the high expense of acquiring ground truth, unsupervised models are often chosen in practice. To compensate for the unstable nature of unsupervised algorithms, practitioners from high-stakes fields like finance, health, and security prefer to build a large number of models for further combination and analysis. However, this poses scalability challenges in high-dimensional large datasets. In this study, we propose a three-module acceleration framework called SUOD to expedite the training and prediction with a large number of unsupervised detection models. SUOD’s Random Projection module can generate lower-dimensional subspaces for high-dimensional datasets while preserving their distance relationships. The Balanced Parallel Scheduling module can forecast the training and prediction cost of models with high confidence, so the task scheduler can assign a nearly equal amount of taskload among workers for efficient parallelization. SUOD also comes with a Pseudo-supervised Approximation module, which can approximate fitted unsupervised models by supervised regressors of lower time complexity for fast prediction on unseen data. It may be considered as an unsupervised model knowledge distillation process. Notably, all three modules are independent with great flexibility to “mix and match”; a combination of modules can be chosen based on use cases. Extensive experiments on more than 30 benchmark datasets have shown the efficacy of SUOD, and a comprehensive future development plan is also presented.
Tasks	Outlier Detection
Published	2020-02-08
URL	https://arxiv.org/abs/2002.03222v1
PDF	https://arxiv.org/pdf/2002.03222v1.pdf
PWC	https://paperswithcode.com/paper/suod-toward-scalable-unsupervised-outlier
Repo	https://github.com/yzhao062/SUOD
Framework	none
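
The idea behind a random projection module, mapping high-dimensional points to fewer dimensions while roughly keeping pairwise distances, can be illustrated with a generic Gaussian projection. This is a sketch under stated assumptions, not SUOD's actual code or API: the function names are hypothetical, and the projection variant SUOD uses may differ.

```python
import math
import random


def gaussian_projection_matrix(d_in, d_out, seed=0):
    """Random d_out x d_in matrix with N(0, 1/d_out) entries.

    The 1/sqrt(d_out) scaling keeps expected squared distances unchanged.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, x):
    """Map a d_in-dimensional point to d_out dimensions."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


# Compare one pairwise distance before and after projecting 200 -> 50 dims.
rng = random.Random(1)
x = [rng.random() for _ in range(200)]
y = [rng.random() for _ in range(200)]
m = gaussian_projection_matrix(200, 50)
ratio = euclidean(project(m, x), project(m, y)) / euclidean(x, y)
```

The `ratio` should land near 1, with the spread shrinking as the target dimension grows. Detectors that rely on distances can then be fitted on the smaller representation at lower cost.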

Two Routes to Scalable Credit Assignment without Weight Symmetry


Title	Two Routes to Scalable Credit Assignment without Weight Symmetry
Authors	Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jon Bloom, Daniel L. K. Yamins
Abstract	The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport - the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another. Until recently, attempts to create local learning rules that avoid weight transport have typically failed in the large-scale learning scenarios where backpropagation shines, e.g. ImageNet categorization with deep convolutional networks. Here, we investigate a recently proposed local learning rule that yields competitive performance with backpropagation and find that it is highly sensitive to metaparameter choices, requiring laborious tuning that does not transfer across network architecture. Our analysis indicates the underlying mathematical reason for this instability, allowing us to identify a more robust local learning rule that better transfers without metaparameter tuning. Nonetheless, we find a performance and stability gap between this local rule and backpropagation that widens with increasing model depth. We then investigate several non-local learning rules that relax the need for instantaneous weight transport into a more biologically-plausible “weight estimation” process, showing that these rules match state-of-the-art performance on deep networks and operate effectively in the presence of noisy updates. Taken together, our results suggest two routes towards the discovery of neural implementations for credit assignment without weight symmetry: further improvement of local rules so that they perform consistently across architectures and the identification of biological implementations for non-local learning mechanisms.
Tasks
Published	2020-02-28
URL	https://arxiv.org/abs/2003.01513v1
PDF	https://arxiv.org/pdf/2003.01513v1.pdf
PWC	https://paperswithcode.com/paper/two-routes-to-scalable-credit-assignment
Repo	https://github.com/neuroailab/Neural-Alignment
Framework	tf