Paper Group AWR 58
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Title | Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills |
Authors | Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres |
Abstract | Acquiring abilities in the absence of a task-oriented reward function is at the frontier of reinforcement learning research. This problem has been studied through the lens of empowerment, which draws a connection between option discovery and information theory. Information-theoretic skill discovery methods have garnered much interest from the community, but little research has been conducted in understanding their limitations. Through theoretical analysis and empirical evidence, we show that existing algorithms suffer from a common limitation – they discover options that provide a poor coverage of the state space. In light of this, we propose ‘Explore, Discover and Learn’ (EDL), an alternative approach to information-theoretic skill discovery. Crucially, EDL optimizes the same information-theoretic objective derived from the empowerment literature, but addresses the optimization problem using different machinery. We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned. Code is publicly available at https://github.com/victorcampos7/edl. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03647v3 |
https://arxiv.org/pdf/2002.03647v3.pdf | |
PWC | https://paperswithcode.com/paper/explore-discover-and-learn-unsupervised |
Repo | https://github.com/victorcampos7/edl |
Framework | pytorch |
Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients
Title | Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients |
Authors | Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski |
Abstract | In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still learning off-policy. We highlight the benefits of this approach in terms of value function transfer, learning within redundant action spaces, and learning off-policy from state observations generated by sub-optimal or completely random policies. Code and videos are available at sites.google.com/view/qss-paper. |
Tasks | Imitation Learning, Transfer Learning |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09505v1 |
https://arxiv.org/pdf/2002.09505v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-qss-with-deep-deterministic |
Repo | https://github.com/uber-research/D3G |
Framework | pytorch |
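The abstract describes $Q(s, s')$ only in words. One way to write the recursion it implies is sketched below. The symbols $r(s, s')$ (a reward that depends only on the transition), $\gamma$ (a discount factor), $N(s')$ (the set of states reachable from $s'$ in one step) and $\tau$ (the forward dynamics model) are notation assumed here for illustration and do not appear in the abstract.

```latex
% Value of moving from s to s' and acting optimally afterwards (assumed notation).
Q(s, s') = r(s, s') + \gamma \max_{s'' \in N(s')} Q(s', s'')

% The forward dynamics model proposes the neighboring state that maximizes this value.
\tau(s) = \arg\max_{s' \in N(s)} Q(s, s')
```

Written this way, no action appears on either side, which is one reading of the abstract's claim that the formulation decouples actions from values.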
Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data
Title | Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data |
Authors | Yuxiao Zhou, Marc Habermann, Weipeng Xu, Ikhsanul Habibie, Christian Theobalt, Feng Xu |
Abstract | We present a novel method for monocular hand shape and pose estimation at unprecedented runtime performance of 100fps and at state-of-the-art accuracy. This is enabled by a new learning based architecture designed such that it can make use of all the sources of available hand training data: image data with either 2D or 3D annotations, as well as stand-alone 3D animations without corresponding image data. It features a 3D hand joint detection module and an inverse kinematics module which regresses not only 3D joint positions but also maps them to joint rotations in a single feed-forward pass. This output makes the method more directly usable for applications in computer vision and graphics compared to only regressing 3D joint positions. We demonstrate that our architectural design leads to a significant quantitative and qualitative improvement over the state of the art on several challenging benchmarks. Our model is publicly available for future research. |
Tasks | Motion Capture, Pose Estimation |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09572v1 |
https://arxiv.org/pdf/2003.09572v1.pdf | |
PWC | https://paperswithcode.com/paper/monocular-real-time-hand-shape-and-motion |
Repo | https://github.com/CalciferZh/minimal-hand |
Framework | none |
Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network
Title | Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network |
Authors | Jungkyu Lee, Taeryun Won, Tae Kwan Lee, Hyemin Lee, Geonmo Gu, Kiho Hong |
Abstract | Recent studies in image classification have demonstrated a variety of techniques for improving the performance of Convolutional Neural Networks (CNNs). However, attempts to combine existing techniques to create a practical model are still uncommon. In this study, we carry out extensive experiments to validate that carefully assembling these techniques and applying them to basic CNN models (e.g. ResNet and MobileNet) can improve the accuracy and robustness of the models while minimizing the loss of throughput. Our proposed assembled ResNet-50 shows improvements in top-1 accuracy from 76.3% to 82.78%, mCE from 76.0% to 48.9% and mFR from 57.7% to 32.3% on the ILSVRC2012 validation set. With these improvements, inference throughput only decreases from 536 to 312. To verify the performance improvement in transfer learning, fine-grained classification and image retrieval tasks were tested on several public datasets and showed that the improvement to backbone network performance boosted transfer learning performance significantly. Our approach achieved 1st place in the iFood Competition Fine-Grained Visual Recognition at CVPR 2019, and the source code and trained models are available at https://github.com/clovaai/assembled-cnn. |
Tasks | Fine-Grained Image Classification, Fine-Grained Visual Recognition, Image Classification, Image Retrieval, Transfer Learning |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06268v2 |
https://arxiv.org/pdf/2001.06268v2.pdf | |
PWC | https://paperswithcode.com/paper/compounding-the-performance-improvements-of |
Repo | https://github.com/clovaai/assembled-cnn |
Framework | tf |
RL agents Implicitly Learning Human Preferences
Title | RL agents Implicitly Learning Human Preferences |
Authors | Nevan Wichers |
Abstract | In the real world, RL agents should be rewarded for fulfilling human preferences. We show that RL agents implicitly learn the preferences of humans in their environment. Training a classifier to predict whether a simulated human’s preferences are fulfilled, based on the activations of an RL agent’s neural network, achieves 0.93 AUC. Training a classifier on the raw environment state achieves only 0.8 AUC. Training the classifier on the RL agent’s activations also does much better than training on activations from an autoencoder. The human preference classifier can be used as the reward function of an RL agent to make RL agents more beneficial for humans. |
Tasks | |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.06137v1 |
https://arxiv.org/pdf/2002.06137v1.pdf | |
PWC | https://paperswithcode.com/paper/rl-agents-implicitly-learning-human |
Repo | https://github.com/arunraja-hub/Preference_Extraction |
Framework | none |
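The abstract compares classifiers by AUC. As a reminder of what that number measures, here is a minimal pure-Python sketch that computes ROC AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not taken from the paper or its repository.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count pairwise wins; a tie contributes half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: hypothetical "preference fulfilled" labels and classifier scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Under this definition, 0.5 corresponds to chance-level ranking and 1.0 to a perfect ranking, which is the scale on which the reported 0.93 and 0.8 should be read.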
FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques
Title | FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques |
Authors | Tai Vu, Leon Tran |
Abstract | Reinforcement learning is one of the most popular approaches for automated game playing. This method allows an agent to estimate the expected utility of its state in order to make optimal actions in an unknown environment. We seek to apply reinforcement learning algorithms to the game Flappy Bird. We implement SARSA and Q-Learning with some modifications such as $\epsilon$-greedy policy, discretization and backward updates. We find that SARSA and Q-Learning outperform the baseline, regularly achieving scores of 1400+, with the highest in-game score of 2069. |
Tasks | Q-Learning |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09579v1 |
https://arxiv.org/pdf/2003.09579v1.pdf | |
PWC | https://paperswithcode.com/paper/flapai-bird-training-an-agent-to-play-flappy |
Repo | https://github.com/taivu1998/FlapAI-Bird |
Framework | pytorch |
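The abstract names tabular Q-Learning with an $\epsilon$-greedy policy and state discretization. A minimal sketch of those three pieces follows. It is a generic textbook version, not the authors' implementation: the function names, the state features `dx` and `dy`, the cell size, and the hyperparameters `alpha` and `gamma` are all assumptions made for illustration, and the paper's backward updates are not shown.

```python
import random
from collections import defaultdict


def discretize(dx, dy, cell=10):
    """Map continuous offsets (assumed features) onto a coarse grid cell."""
    return (dx // cell, dy // cell)


def epsilon_greedy(q, state, actions, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Move Q(s, a) toward the target r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: one update on a fresh table, then a greedy action choice.
actions = [0, 1]                      # e.g. 0 = do nothing, 1 = flap (assumed encoding)
q_table = defaultdict(float)          # unseen (state, action) pairs default to 0.0
s, s_next = discretize(25, -5), discretize(15, 0)
q_learning_update(q_table, s, 1, 1.0, s_next, actions, alpha=0.5)
chosen = epsilon_greedy(q_table, s, actions, epsilon=0.0, rng=random.Random(0))
```

SARSA differs only in the target: it uses the value of the action actually taken in the next state instead of the maximum over actions.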
Deep Video Super-Resolution using HR Optical Flow Estimation
Title | Deep Video Super-Resolution using HR Optical Flow Estimation |
Authors | Longguang Wang, Yulan Guo, Li Liu, Zaiping Lin, Xinpu Deng, Wei An |
Abstract | Video super-resolution (SR) aims at generating a sequence of high-resolution (HR) frames with plausible and temporally consistent details from their low-resolution (LR) counterparts. The key challenge for video SR lies in the effective exploitation of temporal dependency between consecutive frames. Existing deep learning based methods commonly estimate optical flows between LR frames to provide temporal dependency. However, the resolution conflict between LR optical flows and HR outputs hinders the recovery of fine details. In this paper, we propose an end-to-end video SR network to super-resolve both optical flows and images. Optical flow SR from LR frames provides accurate temporal dependency and ultimately improves video SR performance. Specifically, we first propose an optical flow reconstruction network (OFRnet) to infer HR optical flows in a coarse-to-fine manner. Then, motion compensation is performed using HR optical flows to encode temporal dependency. Finally, compensated LR inputs are fed to a super-resolution network (SRnet) to generate SR results. Extensive experiments have been conducted to demonstrate the effectiveness of HR optical flows for SR performance improvement. Comparative results on the Vid4 and DAVIS-10 datasets show that our network achieves the state-of-the-art performance. |
Tasks | Motion Compensation, Optical Flow Estimation, Super-Resolution, Video Super-Resolution |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.02129v1 |
https://arxiv.org/pdf/2001.02129v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-video-super-resolution-using-hr-optical |
Repo | https://github.com/LongguangWang/SOF-VSR |
Framework | pytorch |
G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
Title | G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features |
Authors | Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Ales Leonardis |
Abstract | In this paper, we propose a novel real-time 6D object pose estimation framework, named G2L-Net. Our network operates on point clouds from RGB-D detection in a divide-and-conquer fashion. Specifically, our network consists of three steps. First, we extract the coarse object point cloud from the RGB-D image by 2D detection. Second, we feed the coarse object point cloud to a translation localization network to perform 3D segmentation and object translation prediction. Third, via the predicted segmentation and translation, we transfer the fine object point cloud into a local canonical coordinate, in which we train a rotation localization network to estimate initial object rotation. In the third step, we define point-wise embedding vector features to capture viewpoint-aware information. To calculate more accurate rotation, we adopt a rotation residual estimator to estimate the residual between initial rotation and ground truth, which can boost initial pose estimation performance. Our proposed G2L-Net is real-time despite the fact that multiple steps are stacked via the proposed coarse-to-fine framework. Extensive experiments on two benchmark datasets show that G2L-Net achieves state-of-the-art performance in terms of both accuracy and speed. |
Tasks | 6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.11089v2 |
https://arxiv.org/pdf/2003.11089v2.pdf | |
PWC | https://paperswithcode.com/paper/g2l-net-global-to-local-network-for-real-time |
Repo | https://github.com/DC1991/G2L_Net |
Framework | pytorch |
Correcting Knowledge Base Assertions
Title | Correcting Knowledge Base Assertions |
Authors | Jiaoyan Chen, Xi Chen, Ian Horrocks, Ernesto Jimenez-Ruiz, Erik B. Myklebust |
Abstract | The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB. |
Tasks | |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06917v1 |
https://arxiv.org/pdf/2001.06917v1.pdf | |
PWC | https://paperswithcode.com/paper/correcting-knowledge-base-assertions |
Repo | https://github.com/ChenJiaoyan/KG_Curation |
Framework | none |
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Title | UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training |
Authors | Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon |
Abstract | We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks. |
Tasks | Language Modelling |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12804v1 |
https://arxiv.org/pdf/2002.12804v1.pdf | |
PWC | https://paperswithcode.com/paper/unilmv2-pseudo-masked-language-models-for |
Repo | https://github.com/microsoft/unilm |
Framework | pytorch |
Improved Fitness-Dependent Optimizer Algorithm
Title | Improved Fitness-Dependent Optimizer Algorithm |
Authors | Danial A. Muhammed, Soran AM. Saeed, Tarik A. Rashid |
Abstract | The fitness-dependent optimizer (FDO) algorithm was introduced in 2019. An improved FDO (IFDO) algorithm is presented in this work, and this algorithm contributes considerably to refining the ability of the original FDO to address complicated optimization problems. To improve the FDO, the IFDO calculates the alignment and cohesion and then uses these behaviors with the pace at which the FDO updates its position. Moreover, in determining the weights, the FDO uses the weight factor (wf), which is zero in most cases and one in only a few cases. Conversely, the IFDO performs wf randomization in the [0, 1] range and then minimizes the range when a better fitness weight value is achieved. In this work, the IFDO algorithm and its method of converging on the optimal solution are demonstrated. Additionally, 19 classical standard test function groups are utilized to test the IFDO, and then the FDO and three other well-known algorithms, namely, the particle swarm algorithm (PSO), dragonfly algorithm (DA), and genetic algorithm (GA), are selected to evaluate the IFDO results. Furthermore, the CEC-C06 2019 Competition, which is the set of IEEE Congress of Evolutionary Computation benchmark test functions, is utilized to test the IFDO, and then, the FDO and three recent algorithms, namely, the salp swarm algorithm (SSA), DA and whale optimization algorithm (WOA), are chosen to gauge the IFDO results. The results show that IFDO is practical in some cases, and its results are improved in most cases. Finally, to prove the practicability of the IFDO, it is used in real-world applications. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.11820v1 |
https://arxiv.org/pdf/2001.11820v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-fitness-dependent-optimizer |
Repo | https://github.com/Jaza-Abdullah/FDO-Java |
Framework | none |
Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation
Title | Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation |
Authors | Sunwoo Kim, Haici Yang, Minje Kim |
Abstract | Speech enhancement tasks have seen significant improvements with the advance of deep learning technology, but at the cost of increased computational complexity. In this study, we propose an adaptive boosting approach to learning locality sensitive hash codes, which represent audio spectra efficiently. We use the learned hash codes for single-channel speech denoising tasks as an alternative to a complex machine learning model, particularly to address resource-constrained environments. Our adaptive boosting algorithm learns simple logistic regressors as the weak learners. Once trained, their binary classification results transform each spectrum of test noisy speech into a bit string. Simple bitwise operations calculate Hamming distance to find the K-nearest matching frames in the dictionary of training noisy speech spectra, whose associated ideal binary masks are averaged to estimate the denoising mask for that test mixture. Our proposed learning algorithm differs from AdaBoost in the sense that the projections are trained to minimize the distances between the self-similarity matrix of the hash codes and that of the original spectra, rather than the misclassification rate. We evaluate our discriminative hash codes on the TIMIT corpus with various noise types, and show comparable performance to deep learning methods in terms of denoising performance and complexity. |
Tasks | Denoising, Speech Enhancement |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.06239v1 |
https://arxiv.org/pdf/2002.06239v1.pdf | |
PWC | https://paperswithcode.com/paper/boosted-locality-sensitive-hashing |
Repo | https://github.com/sunwookimiub/BLSH |
Framework | pytorch |
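The inference path described in the abstract (threshold a set of linear projections into a bit string, find the K nearest dictionary frames by Hamming distance, average their ideal binary masks) can be sketched in a few lines of pure Python. This is only an illustration of that lookup. The boosted training of the projections is not shown, and the names `encode`, `hamming`, `knn_mask` and the toy data are assumptions rather than the authors' code.

```python
def encode(spectrum, projections, biases):
    """Turn one spectrum into an integer bit string, one bit per weak learner."""
    code = 0
    for w, b in zip(projections, biases):
        activation = sum(wi * xi for wi, xi in zip(w, spectrum)) + b
        bit = 1 if activation > 0 else 0   # thresholded linear classifier output
        code = (code << 1) | bit
    return code


def hamming(a, b):
    """Hamming distance between two bit strings via XOR and popcount."""
    return bin(a ^ b).count("1")


def knn_mask(query_code, dictionary_codes, dictionary_masks, k):
    """Average the binary masks of the k dictionary frames closest to the query."""
    order = sorted(
        range(len(dictionary_codes)),
        key=lambda i: hamming(query_code, dictionary_codes[i]),
    )
    nearest = order[:k]
    n_bins = len(dictionary_masks[0])
    return [sum(dictionary_masks[i][f] for i in nearest) / k for f in range(n_bins)]


# Toy dictionary: three training frames with 4-bit codes and 2-bin masks.
codes = [0b0000, 0b1111, 0b0001]
masks = [[1, 0], [0, 1], [1, 1]]
estimated_mask = knn_mask(0b0000, codes, masks, k=2)
```

Because the search uses only XOR and bit counting on integers, its cost per frame is small compared with a forward pass through a deep network, which is the trade-off the abstract highlights.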
Real or Not Real, that is the Question
Title | Real or Not Real, that is the Question |
Authors | Yuanbo Xiangli, Yubin Deng, Bo Dai, Chen Change Loy, Dahua Lin |
Abstract | While generative adversarial networks (GAN) have been widely adopted in various topics, in this paper we generalize the standard GAN to a new perspective by treating realness as a random variable that can be estimated from multiple angles. In this generalized framework, referred to as RealnessGAN, the discriminator outputs a distribution as the measure of realness. While RealnessGAN shares similar theoretical guarantees with the standard GAN, it provides more insights on adversarial learning. Compared to multiple baselines, RealnessGAN provides stronger guidance for the generator, achieving improvements on both synthetic and real-world datasets. Moreover, it enables the basic DCGAN architecture to generate realistic images at 1024*1024 resolution when trained from scratch. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05512v1 |
https://arxiv.org/pdf/2002.05512v1.pdf | |
PWC | https://paperswithcode.com/paper/real-or-not-real-that-is-the-question-1 |
Repo | https://github.com/kam1107/RealnessGAN |
Framework | pytorch |
SUOD: Toward Scalable Unsupervised Outlier Detection
Title | SUOD: Toward Scalable Unsupervised Outlier Detection |
Authors | Yue Zhao, Xueying Ding, Jianing Yang, Haoping Bai |
Abstract | Outlier detection is a key field of machine learning for identifying abnormal data objects. Due to the high expense of acquiring ground truth, unsupervised models are often chosen in practice. To compensate for the unstable nature of unsupervised algorithms, practitioners from high-stakes fields like finance, health, and security prefer to build a large number of models for further combination and analysis. However, this poses scalability challenges in high-dimensional large datasets. In this study, we propose a three-module acceleration framework called SUOD to expedite the training and prediction with a large number of unsupervised detection models. SUOD’s Random Projection module can generate lower-dimensional subspaces for high-dimensional datasets while preserving their distance relationship. The Balanced Parallel Scheduling module can forecast the training and prediction cost of models with high confidence, so the task scheduler can assign a nearly equal amount of taskload among workers for efficient parallelization. SUOD also comes with a Pseudo-supervised Approximation module, which can approximate fitted unsupervised models by lower time complexity supervised regressors for fast prediction on unseen data. It may be considered as an unsupervised model knowledge distillation process. Notably, all three modules are independent with great flexibility to “mix and match”; a combination of modules can be chosen based on use cases. Extensive experiments on more than 30 benchmark datasets have shown the efficacy of SUOD, and a comprehensive future development plan is also presented. |
Tasks | Outlier Detection |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03222v1 |
https://arxiv.org/pdf/2002.03222v1.pdf | |
PWC | https://paperswithcode.com/paper/suod-toward-scalable-unsupervised-outlier |
Repo | https://github.com/yzhao062/SUOD |
Framework | none |
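The Random Projection module mentioned in the abstract relies on the general idea that a random linear map to fewer dimensions approximately preserves pairwise distances. The sketch below shows a generic Gaussian random projection in pure Python. It is not SUOD's API, and the abstract does not say which projection variant SUOD uses, so the function names, the Gaussian choice, and the dimensions are assumptions for illustration.

```python
import math
import random


def random_projection_matrix(d_in, d_out, seed=0):
    """Gaussian matrix scaled by 1/sqrt(d_out) so squared norms are kept in expectation."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_out)] for _ in range(d_in)]


def project(rows, matrix):
    """Multiply each data row (length d_in) by the d_in x d_out projection matrix."""
    d_out = len(matrix[0])
    return [
        [sum(x[i] * matrix[i][j] for i in range(len(x))) for j in range(d_out)]
        for x in rows
    ]


# Usage: reduce two 50-dimensional points to 10 dimensions.
R = random_projection_matrix(50, 10, seed=1)
data = [[0.0] * 50, [1.0] * 50]
reduced = project(data, R)
```

Each detector in a large ensemble could then be fitted on its own reduced copy of the data, which is how a module like this would cut training cost without changing the detectors themselves.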
Two Routes to Scalable Credit Assignment without Weight Symmetry
Title | Two Routes to Scalable Credit Assignment without Weight Symmetry |
Authors | Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jon Bloom, Daniel L. K. Yamins |
Abstract | The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport - the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another. Until recently, attempts to create local learning rules that avoid weight transport have typically failed in the large-scale learning scenarios where backpropagation shines, e.g. ImageNet categorization with deep convolutional networks. Here, we investigate a recently proposed local learning rule that yields competitive performance with backpropagation and find that it is highly sensitive to metaparameter choices, requiring laborious tuning that does not transfer across network architecture. Our analysis indicates the underlying mathematical reason for this instability, allowing us to identify a more robust local learning rule that better transfers without metaparameter tuning. Nonetheless, we find a performance and stability gap between this local rule and backpropagation that widens with increasing model depth. We then investigate several non-local learning rules that relax the need for instantaneous weight transport into a more biologically-plausible “weight estimation” process, showing that these rules match state-of-the-art performance on deep networks and operate effectively in the presence of noisy updates. Taken together, our results suggest two routes towards the discovery of neural implementations for credit assignment without weight symmetry: further improvement of local rules so that they perform consistently across architectures and the identification of biological implementations for non-local learning mechanisms. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2003.01513v1 |
https://arxiv.org/pdf/2003.01513v1.pdf | |
PWC | https://paperswithcode.com/paper/two-routes-to-scalable-credit-assignment |
Repo | https://github.com/neuroailab/Neural-Alignment |
Framework | tf |