January 31, 2020

Paper Group AWR 415

Rethinking on Multi-Stage Networks for Human Pose Estimation

Title Rethinking on Multi-Stage Networks for Human Pose Estimation
Authors Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Sun
Abstract Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. While multi-stage methods are seemingly more suited for the task, their performance in current practice is not as good as that of single-stage methods. This work studies this issue. We argue that the current multi-stage methods’ unsatisfactory performance comes from the insufficiency in various design choices. We propose several improvements, including the single-stage module design, cross-stage feature aggregation, and coarse-to-fine supervision. The resulting method establishes the new state-of-the-art on both the MS COCO and MPII Human Pose datasets, justifying the effectiveness of a multi-stage architecture. The source code is publicly available for further research.
Tasks Keypoint Detection, Pose Estimation
Published 2019-01-01
URL https://arxiv.org/abs/1901.00148v4
PDF https://arxiv.org/pdf/1901.00148v4.pdf
PWC https://paperswithcode.com/paper/rethinking-on-multi-stage-networks-for-human
Repo https://github.com/fenglinglwb/MSPN
Framework pytorch

Neural Network Based in Silico Simulation of Combustion Reactions

Title Neural Network Based in Silico Simulation of Combustion Reactions
Authors Jinzhe Zeng, Liqun Cao, Mingyuan Xu, Tong Zhu, John ZH Zhang
Abstract Understanding and predicting chemical reactions is a fundamental demand in the study of many complex chemical systems. Reactive molecular dynamics (MD) simulation has been widely used for this purpose as it can offer atomic details and can help us better interpret chemical reaction mechanisms. In this study, two reference datasets were constructed and corresponding neural network (NN) potentials were trained based on them. For given large-scale reaction systems, the NN potentials can predict the potential energy and atomic forces with DFT precision, while being orders of magnitude faster than conventional DFT calculations. With these two models, reactive MD simulations were performed to explore the combustion mechanisms of hydrogen and methane. Benefiting from the high efficiency of the NN model, nanosecond MD trajectories for large-scale systems containing hundreds of atoms were produced and detailed combustion mechanisms were obtained. Through further development, the algorithms in this study can be used to explore and discover reaction mechanisms of many complex reaction systems, such as combustion, synthesis, and heterogeneous catalysis, without any predefined reaction coordinates or elementary reaction steps.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.12252v1
PDF https://arxiv.org/pdf/1911.12252v1.pdf
PWC https://paperswithcode.com/paper/neural-network-based-in-silico-simulation-of
Repo https://github.com/tongzhugroup/mddatasetbuilder
Framework none

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Title Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs
Authors Alexia Jolicoeur-Martineau, Ioannis Mitliagkas
Abstract We generalize the concept of maximum-margin classifiers (MMCs) to arbitrary norms and non-linear functions. Support Vector Machines (SVMs) are a special case of MMC. We find that MMCs can be formulated as Integral Probability Metrics (IPMs) or classifiers with some form of gradient norm penalty. This implies a direct link to a class of Generative adversarial networks (GANs) which penalize a gradient norm. We show that the Discriminator in Wasserstein, Standard, Least-Squares, and Hinge GAN with Gradient Penalty is an MMC. We explain why maximizing a margin may be helpful in GANs. We hypothesize and confirm experimentally that $L^\infty$-norm penalties with Hinge loss produce better GANs than $L^2$-norm penalties (based on common evaluation metrics). We derive the margins of Relativistic paired (Rp) and average (Ra) GANs.
Tasks Image Generation
Published 2019-10-15
URL https://arxiv.org/abs/1910.06922v1
PDF https://arxiv.org/pdf/1910.06922v1.pdf
PWC https://paperswithcode.com/paper/connections-between-support-vector-machines
Repo https://github.com/lucidrains/stylegan2-pytorch
Framework pytorch
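To make the contrast between $L^2$-norm and $L^\infty$-norm gradient penalties concrete, here is a minimal sketch in plain Python. It assumes a hinge-style penalty of the form max(0, ||∇f|| − 1)², which is one common way to write such a penalty; the abstract does not spell out the exact formula, so that form, the function names, and the toy linear critic are all illustrative assumptions rather than the authors' implementation.

```python
import math


def grad_norm(grad, p):
    """Return the L^p norm of a gradient vector; p may be math.inf."""
    if p == math.inf:
        return max(abs(g) for g in grad)
    return sum(abs(g) ** p for g in grad) ** (1.0 / p)


def hinge_gradient_penalty(grad, p):
    """Assumed hinge-style penalty: zero while the norm is at most 1,
    quadratic in the excess above 1."""
    return max(0.0, grad_norm(grad, p) - 1.0) ** 2


# Toy critic f(x) = w . x + b: its gradient with respect to x is w itself,
# so no automatic differentiation is needed for this illustration.
w = [0.9, -0.8, 0.3]

l2_penalty = hinge_gradient_penalty(w, 2)
linf_penalty = hinge_gradient_penalty(w, math.inf)

print(f"L2 norm   = {grad_norm(w, 2):.4f}, penalty = {l2_penalty:.4f}")
print(f"Linf norm = {grad_norm(w, math.inf):.4f}, penalty = {linf_penalty:.4f}")
```

The same gradient is penalized under the $L^2$ norm but not under the $L^\infty$ norm, because the largest single component stays below 1 while the Euclidean length exceeds it. In a real GAN the gradient would come from automatic differentiation of the discriminator rather than a closed form.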

SeesawFaceNets: sparse and robust face verification model for mobile platform

Title SeesawFaceNets: sparse and robust face verification model for mobile platform
Authors Jintao Zhang
Abstract Deep Convolutional Neural Networks (DCNNs) have become the most widely used solution for most computer vision tasks, and one of the most important application scenarios is face verification. Owing to their high accuracy, deep face verification models whose inference stage runs on a cloud platform over the internet play the key role in most practical scenarios. However, two critical issues exist. First, individual privacy may not be well protected, since users have to upload their personal photos and other private information to the online cloud backend. Second, either the training or the inference stage is time-consuming, and the latency may affect customer experience, especially when the internet link speed is unstable, in remote areas where mobile reception is poor, and in cities where buildings and other construction may block mobile signals. Therefore, designing lightweight networks with low memory requirements and computational cost is one of the most practical solutions for face verification on mobile platforms. In this paper, a novel mobile network named SeesawFaceNets, a simple but effective model, is proposed for productively deploying face recognition on mobile devices. Extensive experimental results show that our proposed model SeesawFaceNets outperforms the baseline MobilefaceNets with only 66% (146M vs. 221M MAdds) of the computational cost, a smaller batch size, and fewer training steps. SeesawFaceNets also achieves comparable performance with other SOTA models, e.g. mobiface, with only 54.2% (1.3M vs. 2.4M) of the parameters and 31.6% (146M vs. 462M MAdds) of the computational cost. It is also competitive against large-scale deep face recognition networks on all 5 listed public validation datasets, with 6.5% (4.2M vs. 65M) of the parameters and 4.35% (526M vs. 12G MAdds) of the computational cost.
Tasks Face Recognition, Face Verification
Published 2019-08-24
URL https://arxiv.org/abs/1908.09124v3
PDF https://arxiv.org/pdf/1908.09124v3.pdf
PWC https://paperswithcode.com/paper/seesawfacenets-sparse-and-robust-face
Repo https://github.com/didi/AoE
Framework tf
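The percentages quoted in the abstract above can be checked from the raw counts it gives. The short sketch below recomputes each ratio; the helper name `percent` is ours, and treating 12G as exactly 12000M is an assumption, since the abstract only gives the rounded figure.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# (description, smaller model, reference model), all in millions.
comparisons = [
    ("MAdds vs. MobilefaceNets", 146, 221),
    ("Parameters vs. mobiface", 1.3, 2.4),
    ("MAdds vs. mobiface", 146, 462),
    ("Parameters vs. large-scale networks", 4.2, 65),
    ("MAdds vs. large-scale networks", 526, 12000),  # assumes 12G = 12000M
]

for description, small, reference in comparisons:
    print(f"{description}: {percent(small, reference):.2f}%")
```

The first four ratios round to the quoted 66%, 54.2%, 31.6%, and 6.5%. The last comes out near 4.38% rather than the quoted 4.35%, which would be consistent with the 12G figure being a rounded value, although the abstract does not say so.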

Three scenarios for continual learning

Title Three scenarios for continual learning
Authors Gido M. van de Ven, Andreas S. Tolias
Abstract Standard artificial neural networks suffer from the well-known issue of catastrophic forgetting, making continual or lifelong learning difficult for machine learning. In recent years, numerous methods have been proposed for continual learning, but due to differences in evaluation protocols it is difficult to directly compare their performance. To enable more structured comparisons, we describe three continual learning scenarios based on whether at test time task identity is provided and, in case it is not, whether it must be inferred. Any sequence of well-defined tasks can be performed according to each scenario. Using the split and permuted MNIST task protocols, for each scenario we carry out an extensive comparison of recently proposed continual learning methods. We demonstrate substantial differences between the three scenarios in terms of difficulty and in terms of how efficient different methods are. In particular, when task identity must be inferred (i.e., class incremental learning), we find that regularization-based approaches (e.g., elastic weight consolidation) fail and that replaying representations of previous experiences seems required for solving this scenario.
Tasks Continual Learning
Published 2019-04-15
URL http://arxiv.org/abs/1904.07734v1
PDF http://arxiv.org/pdf/1904.07734v1.pdf
PWC https://paperswithcode.com/paper/three-scenarios-for-continual-learning
Repo https://github.com/GMvandeVen/continual-learning
Framework pytorch

ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero

Title ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero
Authors Yuandong Tian, Jerry Ma, Qucheng Gong, Shubho Sengupta, Zhuoyuan Chen, James Pinkerton, C. Lawrence Zitnick
Abstract The AlphaGo, AlphaGo Zero, and AlphaZero series of algorithms are remarkable demonstrations of deep reinforcement learning’s capabilities, achieving superhuman performance in the complex game of Go with progressively increasing autonomy. However, many obstacles remain in the understanding of and usability of these promising approaches by the research community. Toward elucidating unresolved mysteries and facilitating future research, we propose ELF OpenGo, an open-source reimplementation of the AlphaZero algorithm. ELF OpenGo is the first open-source Go AI to convincingly demonstrate superhuman performance with a perfect (20:0) record against global top professionals. We apply ELF OpenGo to conduct extensive ablation studies, and to identify and analyze numerous interesting phenomena in both the model training and in the gameplay inference procedures. Our code, models, selfplay datasets, and auxiliary data are publicly available.
Tasks Game of Go
Published 2019-02-12
URL https://arxiv.org/abs/1902.04522v4
PDF https://arxiv.org/pdf/1902.04522v4.pdf
PWC https://paperswithcode.com/paper/elf-opengo-an-analysis-and-open
Repo https://github.com/Gregory-Eales/Teeny-Go
Framework pytorch

Asymmetric Co-Teaching for Unsupervised Cross Domain Person Re-Identification

Title Asymmetric Co-Teaching for Unsupervised Cross Domain Person Re-Identification
Authors Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, Shaozi Li
Abstract Person re-identification (re-ID) is a challenging task due to the high variance within identity samples and imaging conditions. Although recent advances in deep learning have achieved remarkable accuracy in settled scenes, i.e., source domain, few works can generalize well on the unseen target domain. One popular solution is assigning unlabeled target images with pseudo labels by clustering, and then retraining the model. However, clustering methods tend to introduce noisy labels and discard low confidence samples as outliers, which may hinder the retraining process and thus limit the generalization ability. In this study, we argue that by explicitly adding a sample filtering procedure after the clustering, the mined examples can be much more efficiently used. To this end, we design an asymmetric co-teaching framework, which resists noisy labels by having two models cooperate to select data with possibly clean labels for each other. Meanwhile, one of the models receives samples as pure as possible, while the other takes in samples as diverse as possible. This procedure encourages the selected training samples to be both clean and miscellaneous, and the two models to promote each other iteratively. Extensive experiments show that the proposed framework can consistently benefit most clustering-based methods, and boost the state-of-the-art adaptation accuracy. Our code is available at https://github.com/FlyingRoastDuck/ACT_AAAI20.
Tasks Person Re-Identification
Published 2019-12-03
URL https://arxiv.org/abs/1912.01349v1
PDF https://arxiv.org/pdf/1912.01349v1.pdf
PWC https://paperswithcode.com/paper/asymmetric-co-teaching-for-unsupervised-cross
Repo https://github.com/FlyingRoastDuck/ACT_AAAI20
Framework pytorch
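The idea of two models selecting likely-clean samples for each other can be sketched with the small-loss criterion used in generic co-teaching. This is an assumption: the abstract above does not state the selection rule, and the asymmetric part (one model receiving pure samples, the other diverse ones) is not modeled here. All names and loss values are hypothetical.

```python
def select_small_loss(losses, keep_ratio):
    """Return sorted indices of the keep_ratio fraction of samples with the smallest loss."""
    num_keep = max(1, int(len(losses) * keep_ratio))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(ranked[:num_keep])


def co_teaching_exchange(losses_a, losses_b, keep_ratio):
    """Each model picks its low-loss samples for the OTHER model to train on."""
    train_b_on = select_small_loss(losses_a, keep_ratio)  # chosen by model A
    train_a_on = select_small_loss(losses_b, keep_ratio)  # chosen by model B
    return train_a_on, train_b_on


# Hypothetical per-sample losses of the two models on one pseudo-labeled batch.
losses_a = [0.1, 2.3, 0.4, 0.2, 1.9, 0.3]
losses_b = [0.2, 0.3, 2.5, 0.1, 0.4, 2.1]

train_a_on, train_b_on = co_teaching_exchange(losses_a, losses_b, keep_ratio=0.5)
print("model A trains on samples", train_a_on)
print("model B trains on samples", train_b_on)
```

Because each model trains on what its peer considers reliable, a sample that one model has wrongly memorized is less likely to keep reinforcing itself.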

Progressive Transfer Learning for Person Re-identification

Title Progressive Transfer Learning for Person Re-identification
Authors Zhengxu Yu, Zhongming Jin, Long Wei, Jishun Guo, Jianqiang Huang, Deng Cai, Xiaofei He, Xian-Sheng Hua
Abstract Model fine-tuning is a widely used transfer learning approach in person Re-identification (ReID) applications, which fine-tunes a pre-trained feature extraction model to the target scenario instead of training a model from scratch. It is challenging due to the significant variations inside the target scenario, e.g., different camera viewpoints, illumination changes, and occlusion. These variations result in a gap between the distribution of each mini-batch and the distribution of the whole dataset when using mini-batch training. In this paper, we study model fine-tuning from the perspective of the aggregation and utilization of the global information of the dataset when using mini-batch training. Specifically, we introduce a novel network structure called Batch-related Convolutional Cell (BConv-Cell), which progressively collects the global information of the dataset into a latent state and uses this latent state to rectify the extracted feature. Based on BConv-Cells, we further propose the Progressive Transfer Learning (PTL) method to facilitate the model fine-tuning process by jointly training the BConv-Cells and the pre-trained ReID model. Empirical experiments show that our proposal can improve the performance of the ReID model greatly on the MSMT17, Market-1501, CUHK03 and DukeMTMC-reID datasets. The code will be released later on at https://github.com/ZJULearning/PTL.
Tasks Person Re-Identification, Transfer Learning
Published 2019-08-07
URL https://arxiv.org/abs/1908.02492v2
PDF https://arxiv.org/pdf/1908.02492v2.pdf
PWC https://paperswithcode.com/paper/progressive-transfer-learning-for-person-re
Repo https://github.com/ZJULearning/PTL
Framework pytorch

What do you learn from context? Probing for sentence structure in contextualized word representations

Title What do you learn from context? Probing for sentence structure in contextualized word representations
Authors Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick
Abstract The jiant toolkit for general-purpose text understanding models
Tasks Language Modelling
Published 2019-05-15
URL https://arxiv.org/abs/1905.06316v1
PDF https://arxiv.org/pdf/1905.06316v1.pdf
PWC https://paperswithcode.com/paper/what-do-you-learn-from-context-probing-for-1
Repo https://github.com/jsalt18-sentence-repl/jiant
Framework pytorch

Multi-task Generative Adversarial Learning on Geometrical Shape Reconstruction from EEG Brain Signals

Title Multi-task Generative Adversarial Learning on Geometrical Shape Reconstruction from EEG Brain Signals
Authors Xiang Zhang, Xiaocong Chen, Manqing Dong, Huan Liu, Chang Ge, Lina Yao
Abstract Synthesizing geometrical shapes from human brain activities is an interesting and meaningful but very challenging topic. Recently, the advancements of deep generative models like Generative Adversarial Networks (GANs) have supported the object generation from neurological signals. However, the Electroencephalograph (EEG)-based shape generation still suffers from the low realism problem. In particular, the generated geometrical shapes lack clear edges and fail to contain necessary details. In light of this, we propose a novel multi-task generative adversarial network to convert the individual’s EEG signals evoked by geometrical shapes to the original geometry. First, we adopt a Convolutional Neural Network (CNN) to learn a highly informative latent representation for the raw EEG signals, which is vital for the subsequent shape reconstruction. Next, we build the discriminator based on multi-task learning to distinguish and classify fake samples simultaneously, where the mutual promotion between different tasks improves the quality of the recovered shapes. Then, we propose a semantic alignment constraint in order to force the synthesized samples to approach the real ones at the pixel level, thus producing more compelling shapes. The proposed approach is evaluated over a local dataset and the results show that our model outperforms the competitive state-of-the-art baselines.
Tasks EEG, Multi-Task Learning
Published 2019-07-31
URL https://arxiv.org/abs/1907.13351v2
PDF https://arxiv.org/pdf/1907.13351v2.pdf
PWC https://paperswithcode.com/paper/multi-task-generative-adversarial-learning-on
Repo https://github.com/xiangzhang1015/EEG_Shape_Reconstruction
Framework tf

Pedestrian Collision Avoidance System for Scenarios with Occlusions

Title Pedestrian Collision Avoidance System for Scenarios with Occlusions
Authors Markus Schratter, Maxime Bouton, Mykel J. Kochenderfer, Daniel Watzenig
Abstract Safe autonomous driving in urban areas requires robust algorithms to avoid collisions with other traffic participants with limited perception ability. Currently deployed approaches relying on Autonomous Emergency Braking (AEB) systems are often overly conservative. In this work, we formulate the problem as a partially observable Markov decision process (POMDP), to derive a policy robust to uncertainty in the pedestrian location. We investigate how to integrate such a policy with an AEB system that operates only when a collision is unavoidable. In addition, we propose a rigorous evaluation methodology on a set of well-defined scenarios. We show that combining the two approaches provides a robust autonomous braking system that reduces unnecessary braking caused by using the AEB system on its own.
Tasks Autonomous Driving
Published 2019-04-25
URL http://arxiv.org/abs/1904.11566v1
PDF http://arxiv.org/pdf/1904.11566v1.pdf
PWC https://paperswithcode.com/paper/pedestrian-collision-avoidance-system-for
Repo https://github.com/sisl/PedestrianAvoidancePOMDP.jl
Framework none

Model Primitive Hierarchical Lifelong Reinforcement Learning

Title Model Primitive Hierarchical Lifelong Reinforcement Learning
Authors Bohan Wu, Jayesh K. Gupta, Mykel J. Kochenderfer
Abstract Learning interpretable and transferable subpolicies and performing task decomposition from a single, complex task is difficult. Some traditional hierarchical reinforcement learning techniques enforce this decomposition in a top-down manner, while meta-learning techniques require a task distribution at hand to learn such decompositions. This paper presents a framework for using diverse suboptimal world models to decompose complex task solutions into simpler modular subpolicies. This framework performs automatic decomposition of a single source task in a bottom up manner, concurrently learning the required modular subpolicies as well as a controller to coordinate them. We perform a series of experiments on high dimensional continuous action control tasks to demonstrate the effectiveness of this approach at both complex single task learning and lifelong learning. Finally, we perform ablation studies to understand the importance and robustness of different elements in the framework and limitations to this approach.
Tasks Hierarchical Reinforcement Learning, Meta-Learning
Published 2019-03-04
URL http://arxiv.org/abs/1903.01567v1
PDF http://arxiv.org/pdf/1903.01567v1.pdf
PWC https://paperswithcode.com/paper/model-primitive-hierarchical-lifelong
Repo https://github.com/sisl/MPHRL
Framework tf

AADS: Augmented Autonomous Driving Simulation using Data-driven Algorithms

Title AADS: Augmented Autonomous Driving Simulation using Data-driven Algorithms
Authors Wei Li, Chengwei Pan, Rong Zhang, Jiaping Ren, Yuexin Ma, Jin Fang, Feilong Yan, Qichuan Geng, Xinyu Huang, Huajun Gong, Weiwei Xu, Guoping Wang, Dinesh Manocha, Ruigang Yang
Abstract Simulation systems have become an essential component in the development and validation of autonomous driving technologies. The prevailing state-of-the-art approach for simulation is to use game engines or high-fidelity computer graphics (CG) models to create driving scenarios. However, creating CG models and vehicle movements (e.g., the assets for simulation) remains a manual task that can be costly and time-consuming. In addition, the fidelity of CG images still lacks the richness and authenticity of real-world images and using these images for training leads to degraded performance. In this paper we present a novel approach to address these issues: Augmented Autonomous Driving Simulation (AADS). Our formulation augments real-world pictures with a simulated traffic flow to create photo-realistic simulation images and renderings. More specifically, we use LiDAR and cameras to scan street scenes. From the acquired trajectory data, we generate highly plausible traffic flows for cars and pedestrians and compose them into the background. The composite images can be re-synthesized with different viewpoints and sensor models. The resulting images are photo-realistic, fully annotated, and ready for end-to-end training and testing of autonomous driving systems from perception to planning. We explain our system design and validate our algorithms with a number of autonomous driving tasks from detection to segmentation and predictions. Compared to traditional approaches, our method offers unmatched scalability and realism. Scalability is particularly important for AD simulation and we believe the complexity and diversity of the real world cannot be realistically captured in a virtual environment. Our augmented approach combines the flexibility in a virtual environment (e.g., vehicle movements) with the richness of the real world to allow effective simulation of anywhere on earth.
Tasks Autonomous Driving
Published 2019-01-23
URL http://arxiv.org/abs/1901.07849v2
PDF http://arxiv.org/pdf/1901.07849v2.pdf
PWC https://paperswithcode.com/paper/aads-augmented-autonomous-driving-simulation
Repo https://github.com/ApolloScapeAuto/dataset-api
Framework none

Weak Supervision for Fake News Detection via Reinforcement Learning

Title Weak Supervision for Fake News Detection via Reinforcement Learning
Authors Yaqing Wang, Weifeng Yang, Fenglong Ma, Jin Xu, Bin Zhong, Qiang Deng, Jing Gao
Abstract Today, social media has become the primary source for news. Via social media platforms, fake news travels at unprecedented speeds, reaches global audiences and puts users and communities at great risk. Therefore, it is extremely important to detect fake news as early as possible. Recently, deep learning based approaches have shown improved performance in fake news detection. However, the training of such models requires a large amount of labeled data, but manual annotation is time-consuming and expensive. Moreover, due to the dynamic nature of news, annotated samples may become outdated quickly and cannot represent the news articles on newly emerged events. Therefore, how to obtain fresh and high-quality labeled samples is the major challenge in employing deep learning models for fake news detection. In order to tackle this challenge, we propose a reinforced weakly-supervised fake news detection framework, i.e., WeFEND, which can leverage users’ reports as weak supervision to enlarge the amount of training data for fake news detection. The proposed framework consists of three main components: the annotator, the reinforced selector and the fake news detector. The annotator can automatically assign weak labels for unlabeled news based on users’ reports. The reinforced selector, using reinforcement learning techniques, chooses high-quality samples from the weakly labeled data and filters out those low-quality ones that may degrade the detector’s prediction performance. The fake news detector aims to identify fake news based on the news content. We tested the proposed framework on a large collection of news articles published via WeChat official accounts and associated user reports. Extensive experiments on this dataset show that the proposed WeFEND model achieves the best performance compared with the state-of-the-art methods.
Tasks Fake News Detection
Published 2019-12-28
URL https://arxiv.org/abs/1912.12520v2
PDF https://arxiv.org/pdf/1912.12520v2.pdf
PWC https://paperswithcode.com/paper/weak-supervision-for-fake-news-detection-via
Repo https://github.com/yaqingwang/WeFEND-AAAI20
Framework none

Learning to Augment Synthetic Images for Sim2Real Policy Transfer

Title Learning to Augment Synthetic Images for Sim2Real Policy Transfer
Authors Alexander Pashevich, Robin Strudel, Igor Kalevatykh, Ivan Laptev, Cordelia Schmid
Abstract Vision and learning have made significant progress that could improve robotics policies for complex tasks and environments. Learning deep neural networks for image understanding, however, requires large amounts of domain-specific visual data. While collecting such data from real robots is possible, such an approach limits the scalability as learning policies typically requires thousands of trials. In this work we attempt to learn manipulation policies in simulated environments. Simulators enable scalability and provide access to the underlying world state during training. Policies learned in simulators, however, do not transfer well to real scenes given the domain gap between real and synthetic data. We follow recent work on domain randomization and augment synthetic images with sequences of random transformations. Our main contribution is to optimize the augmentation strategy for sim2real transfer and to enable domain-independent policy learning. We design an efficient search for depth image augmentations using object localization as a proxy task. Given the resulting sequence of random transformations, we use it to augment synthetic depth images during policy learning. Our augmentation strategy is policy-independent and enables policy learning with no real images. We demonstrate our approach to significantly improve accuracy on three manipulation tasks evaluated on a real robot.
Tasks Object Localization
Published 2019-03-18
URL https://arxiv.org/abs/1903.07740v2
PDF https://arxiv.org/pdf/1903.07740v2.pdf
PWC https://paperswithcode.com/paper/learning-to-augment-synthetic-images-for
Repo https://github.com/rstrudel/rlbc
Framework pytorch