October 17, 2019

Paper Group ANR 852

Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis


Title	Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis
Authors	Haoye Dong, Xiaodan Liang, Ke Gong, Hanjiang Lai, Jia Zhu, Jian Yin
Abstract	Despite remarkable advances in image synthesis research, existing works often fail to manipulate images under large geometric transformations. Synthesizing person images conditioned on arbitrary poses is one of the most representative examples, where the generation quality largely relies on the capability of identifying and modeling arbitrary transformations on different body parts. Current generative models are often built on local convolutions and overlook the key challenges (e.g., heavy occlusions, different views or dramatic appearance changes) that arise when distinct geometric changes happen for each part, caused by arbitrary pose manipulations. This paper aims to resolve these challenges induced by geometric variability and spatial displacements via a new Soft-Gated Warping Generative Adversarial Network (Warping-GAN), which is composed of two stages: 1) it first synthesizes a target part segmentation map given a target pose, which depicts the region-level spatial layouts for guiding image synthesis with higher-level structure constraints; 2) the Warping-GAN equipped with a soft-gated warping-block learns feature-level mapping to render textures from the original image into the generated segmentation map. Warping-GAN is capable of controlling different transformation degrees given distinct target poses. Moreover, the proposed warping-block is light-weight and flexible enough to be injected into any network. Human perceptual studies and quantitative evaluations demonstrate the superiority of our Warping-GAN, which significantly outperforms all existing methods on two large datasets.
Tasks	Image Generation
Published	2018-10-27
URL	http://arxiv.org/abs/1810.11610v2
PDF	http://arxiv.org/pdf/1810.11610v2.pdf
PWC	https://paperswithcode.com/paper/soft-gated-warping-gan-for-pose-guided-person
Repo
Framework

Iterative Global Similarity Points : A robust coarse-to-fine integration solution for pairwise 3D point cloud registration


Title	Iterative Global Similarity Points : A robust coarse-to-fine integration solution for pairwise 3D point cloud registration
Authors	Yue Pan, Bisheng Yang, Fuxun Liang, Zhen Dong
Abstract	In this paper, we propose a coarse-to-fine integration solution to pairwise 3D point cloud registration, inspired by the classical ICP algorithm, with two improvements: hybrid metric spaces (e.g., BSC feature and Euclidean geometry spaces) and globally optimal correspondence matching. First, we detect the keypoints of point clouds and use the Binary Shape Context (BSC) descriptor to encode their local features. Then, we formulate the correspondence matching task as an energy function, which models the global similarity of keypoints on the hybrid spaces of BSC feature and Euclidean geometry. Next, we estimate the globally optimal correspondences by optimizing the energy function with the Kuhn-Munkres algorithm and then calculate the transformation based on the correspondences. Finally, we iteratively refine the transformation between two point clouds by conducting optimal correspondence matching and transformation calculation in a mutually reinforcing manner, to achieve coarse-to-fine registration under a unified framework. The proposed method is evaluated and compared to several state-of-the-art methods on selected challenging datasets with repetitive, symmetric and incomplete structures. Comprehensive experiments demonstrate that the proposed IGSP algorithm obtains good performance and outperforms the state-of-the-art methods in terms of both rotation and translation errors.
Tasks	Point Cloud Registration
Published	2018-08-12
URL	http://arxiv.org/abs/1808.03899v1
PDF	http://arxiv.org/pdf/1808.03899v1.pdf
PWC	https://paperswithcode.com/paper/iterative-global-similarity-points-a-robust
Repo
Framework
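
The correspondence step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the cost weighting, the toy descriptors and the function names are assumptions, and a brute-force search over permutations stands in for the Kuhn-Munkres algorithm (which finds the same optimum in polynomial time).

```python
from itertools import permutations
from math import dist


def hamming(a, b):
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(x != y for x, y in zip(a, b))


def best_assignment(src, tgt, weight=0.5):
    """Return (energy, mapping) minimising the summed hybrid cost.

    src, tgt: equal-length lists of (xyz, bits) keypoints.
    The cost mixes descriptor distance and Euclidean distance; `weight`
    is a hypothetical trade-off parameter, not a value from the paper.
    """
    n = len(src)
    cost = [[weight * hamming(s[1], t[1]) + (1 - weight) * dist(s[0], t[0])
             for t in tgt] for s in src]
    # Brute force over all one-to-one matchings (fine for a toy example only).
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return sum(cost[i][best[i]] for i in range(n)), best


# Toy data: the target holds the same keypoints, reordered and shifted by 0.1 in x.
SRC = [((0.0, 0.0, 0.0), (1, 0, 1, 1)),
       ((1.0, 0.0, 0.0), (0, 1, 1, 0)),
       ((0.0, 2.0, 0.0), (1, 1, 0, 0))]
TGT = [((0.1, 2.0, 0.0), (1, 1, 0, 0)),
       ((0.1, 0.0, 0.0), (1, 0, 1, 1)),
       ((1.1, 0.0, 0.0), (0, 1, 1, 0))]

energy, mapping = best_assignment(SRC, TGT)
print(mapping)  # mapping[i] is the target index matched to source keypoint i
```

In a full coarse-to-fine loop, the matched pairs would then feed a rigid transformation estimate, and matching would be repeated on the transformed source cloud.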

Using a reservoir computer to learn chaotic attractors, with applications to chaos synchronisation and cryptography


Title	Using a reservoir computer to learn chaotic attractors, with applications to chaos synchronisation and cryptography
Authors	Piotr Antonik, Marvyn Gulina, Jaël Pauwels, Serge Massar
Abstract	Using the machine learning approach known as reservoir computing, it is possible to train one dynamical system to emulate another. We show that such trained reservoir computers reproduce the properties of the attractor of the chaotic system sufficiently well to exhibit chaos synchronisation. That is, the trained reservoir computer, weakly driven by the chaotic system, will synchronise with the chaotic system. Conversely, the chaotic system, weakly driven by a trained reservoir computer, will synchronise with the reservoir computer. We illustrate this behaviour on the Mackey-Glass and Lorenz systems. We then show that trained reservoir computers can be used to crack chaos-based cryptography and illustrate this on a chaos cryptosystem based on the Mackey-Glass system. We conclude by discussing why reservoir computers are so good at emulating chaotic systems.
Tasks
Published	2018-02-08
URL	http://arxiv.org/abs/1802.02844v2
PDF	http://arxiv.org/pdf/1802.02844v2.pdf
PWC	https://paperswithcode.com/paper/using-a-reservoir-computer-to-learn-chaotic
Repo
Framework
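
To make the notion of chaos synchronisation concrete, here is a minimal sketch that does not involve a reservoir computer at all. It assumes the classical drive-response setup on the Lorenz system, where a copy of the (y, z) subsystem receives the full x signal of the drive; the abstract's "weak driving" of a trained reservoir is a different and harder setting. Step size, step count and initial conditions are arbitrary choices.

```python
def lorenz_step(x, y, z, dt, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit Euler step of the Lorenz system."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dt * dx, y + dt * dy, z + dt * dz


def synchronise(steps=40000, dt=0.001, rho=28.0, beta=8.0 / 3.0):
    """Drive a response (y, z) subsystem with the x signal of a Lorenz system.

    Returns the final distance between the drive's (y, z) and the response's.
    """
    x, y, z = 1.0, 1.0, 1.0   # drive system
    yr, zr = -5.0, 20.0       # response starts far from the drive
    for _ in range(steps):
        # The response only ever sees x from the drive.
        dyr = x * (rho - zr) - yr
        dzr = x * yr - beta * zr
        yr, zr = yr + dt * dyr, zr + dt * dzr
        x, y, z = lorenz_step(x, y, z, dt, rho=rho, beta=beta)
    return ((y - yr) ** 2 + (z - zr) ** 2) ** 0.5


print(synchronise())  # distance after 40 time units; it shrinks towards zero
```

The response forgets its initial condition because the error in (y, z) obeys a contracting linear system for any x signal; the paper's claim is that a trained reservoir emulates the attractor well enough to show the analogous behaviour.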

Multifunction Cognitive Radar Task Scheduling Using Monte Carlo Tree Search and Policy Networks


Title	Multifunction Cognitive Radar Task Scheduling Using Monte Carlo Tree Search and Policy Networks
Authors	Mahdi Shaghaghi, Raviraj S. Adve, Zhen Ding
Abstract	A modern radar may be designed to perform multiple functions, such as surveillance, tracking, and fire control. Each function requires the radar to execute a number of transmit-receive tasks. A radar resource management (RRM) module makes decisions on parameter selection, prioritization, and scheduling of such tasks. RRM becomes especially challenging in overload situations, where some tasks may need to be delayed or even dropped. In general, task scheduling is an NP-hard problem. In this work, we develop the branch-and-bound (B&B) method which obtains the optimal solution but at exponential computational complexity. On the other hand, heuristic methods have low complexity but provide relatively poor performance. We resort to machine learning-based techniques to address this issue; specifically we propose an approximate algorithm based on the Monte Carlo tree search method. Along with using bound and dominance rules to eliminate nodes from the search tree, we use a policy network to help to reduce the width of the search. Such a network can be trained using solutions obtained by running the B&B method offline on problems with feasible complexity. We show that the proposed method provides near-optimal performance, but with computational complexity orders of magnitude smaller than the B&B algorithm.
Tasks
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07069v1
PDF	http://arxiv.org/pdf/1805.07069v1.pdf
PWC	https://paperswithcode.com/paper/multifunction-cognitive-radar-task-scheduling
Repo
Framework
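
The branch-and-bound idea mentioned in the abstract can be sketched on a toy single-timeline problem. The task model below (release time, length, deadline, delay weight, drop cost) and the simple bound (cost accumulated so far) are assumptions made for illustration; the paper's cost model, bounds and dominance rules may differ, and no Monte Carlo tree search or policy network is shown.

```python
from itertools import permutations

# Hypothetical tasks: (release, length, deadline, delay_weight, drop_cost)
TASKS = [(0, 3, 4, 1.0, 5.0), (1, 2, 3, 2.0, 4.0),
         (2, 2, 6, 1.0, 3.0), (0, 4, 2, 0.5, 6.0)]


def sequence_cost(order, tasks):
    """Cost of running `order` back to back and dropping every other task.

    Returns None if a task in `order` would start after its deadline.
    """
    t, cost = 0, 0.0
    for i in order:
        release, length, deadline, weight, _ = tasks[i]
        start = max(t, release)
        if start > deadline:
            return None
        cost += weight * (start - release)
        t = start + length
    return cost + sum(tasks[i][4] for i in range(len(tasks)) if i not in order)


def branch_and_bound(tasks):
    """Depth-first search over task sequences with pruning on partial cost."""
    best_cost, best_order = float("inf"), ()

    def visit(order, t, cost, remaining):
        nonlocal best_cost, best_order
        if cost >= best_cost:   # bound: costs only grow along a branch
            return
        total = cost + sum(tasks[i][4] for i in remaining)  # drop the rest
        if total < best_cost:
            best_cost, best_order = total, order
        for i in sorted(remaining):
            release, length, deadline, weight, _ = tasks[i]
            start = max(t, release)
            if start <= deadline:
                visit(order + (i,), start + length,
                      cost + weight * (start - release), remaining - {i})

    visit((), 0, 0.0, frozenset(range(len(tasks))))
    return best_cost, best_order


def brute_force(tasks):
    """Reference optimum: try every ordered subset of tasks."""
    n = len(tasks)
    costs = [sequence_cost(order, tasks)
             for k in range(n + 1) for order in permutations(range(n), k)]
    return min(c for c in costs if c is not None)


print(branch_and_bound(TASKS))
```

The search tree still grows exponentially with the number of tasks, which is the motivation the abstract gives for narrowing the search with a learned policy.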

A Tensor-based Structural Health Monitoring Approach for Aeroservoelastic Systems


Title	A Tensor-based Structural Health Monitoring Approach for Aeroservoelastic Systems
Authors	Prasad Cheema, Nguyen Lu Dang Khoa, Moray Kidd, Gareth A. Vio
Abstract	Structural health monitoring is a condition-based field of study utilised to monitor infrastructure, via sensing systems. It is therefore used in the field of aerospace engineering to assist in monitoring the health of aerospace structures. A difficulty however is that in structural health monitoring the data input is usually from sensor arrays, which results in data which are highly redundant and correlated, an area in which traditional two-way matrix approaches have had difficulty in deconstructing and interpreting. Newer methods involving tensor analysis allow us to analyse this multi-way structural data in a coherent manner. In our approach, we demonstrate the usefulness of tensor-based learning coupled with for damage detection, on a novel $N$-DoF Lagrangian aeroservoelastic model.
Tasks
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04845v1
PDF	http://arxiv.org/pdf/1812.04845v1.pdf
PWC	https://paperswithcode.com/paper/a-tensor-based-structural-health-monitoring
Repo
Framework

Joint convolutional neural pyramid for depth map super-resolution


Title	Joint convolutional neural pyramid for depth map super-resolution
Authors	Yi Xiao, Xiang Cao, Xianyi Zhu, Renzhi Yang, Yan Zheng
Abstract	A high-resolution depth map can be inferred from a low-resolution one with the guidance of an additional high-resolution texture map of the same scene. Recently, deep neural networks with large receptive fields have been shown to benefit applications such as image completion. Our insight is that super-resolution is similar to image completion, where only parts of the depth values are precisely known. In this paper, we present a joint convolutional neural pyramid model with large receptive fields for joint depth map super-resolution. Our model consists of three sub-networks: two convolutional neural pyramids concatenated by a normal convolutional neural network. The convolutional neural pyramids extract information from large receptive fields of the depth map and guidance map, while the convolutional neural network effectively transfers useful structures of the guidance image to the depth image. Experimental results show that our model outperforms existing state-of-the-art algorithms not only on data pairs of RGB/depth images, but also on other data pairs like color/saliency and color-scribbles/colorized images.
Tasks	Depth Map Super-Resolution, Super-Resolution
Published	2018-01-03
URL	http://arxiv.org/abs/1801.00968v1
PDF	http://arxiv.org/pdf/1801.00968v1.pdf
PWC	https://paperswithcode.com/paper/joint-convolutional-neural-pyramid-for-depth
Repo
Framework

Person re-identification across different datasets with multi-task learning


Title	Person re-identification across different datasets with multi-task learning
Authors	Matthieu Ospici, Antoine Cecchi
Abstract	This paper presents an approach to tackle the re-identification problem. This is a challenging problem due to the large variation of pose, illumination or camera view. More and more datasets are available to train machine learning models for person re-identification. These datasets vary in conditions (number of cameras, camera positions, location, season), in size (number of images, number of different identities), and in labeling: some datasets are annotated with attributes while others are not. To deal with this variety of datasets, we present an approach that takes information from different datasets to build a system which performs well on all of them. Our model is based on a Convolutional Neural Network (CNN) and trained using multi-task learning. Several losses are used to extract the different information available in the different datasets. Our main task is learned with a classification loss. To reduce the intra-class variation, we experiment with the center loss. Our paper ends with a performance evaluation in which we discuss the influence of the different losses on the global re-identification performance. We show that with our method, we are able to build a system that performs well on different datasets and simultaneously extracts attributes. We also show that our system outperforms recent re-identification works on two datasets.
Tasks	Multi-Task Learning, Person Re-Identification
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09666v1
PDF	http://arxiv.org/pdf/1807.09666v1.pdf
PWC	https://paperswithcode.com/paper/person-re-identification-across-different
Repo
Framework
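
As a small illustration of the center loss mentioned in the abstract, the sketch below computes half the summed squared distance between each feature vector and the centre of its identity class, and adds it to a classification loss. The weighting `lam`, the function names and the toy values are assumptions; the abstract does not say how the authors combine or weight their losses.

```python
def center_loss(features, labels, centers):
    """Half the summed squared distance of each feature to its class centre."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total


def joint_loss(classification_loss, features, labels, centers, lam=0.01):
    """Classification loss plus a weighted center loss (lam is hypothetical)."""
    return classification_loss + lam * center_loss(features, labels, centers)


# Two embeddings of the same identity, one unit either side of its centre.
FEATURES = [[1.0, 0.0], [3.0, 0.0]]
LABELS = [0, 0]
CENTERS = {0: [2.0, 0.0]}

print(center_loss(FEATURES, LABELS, CENTERS))
```

Minimising this term pulls embeddings of the same identity together, which is the intra-class variation reduction the abstract refers to.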

Diversity-Driven Selection of Exploration Strategies in Multi-Armed Bandits


Title	Diversity-Driven Selection of Exploration Strategies in Multi-Armed Bandits
Authors	Fabien C. Y. Benureau, Pierre-Yves Oudeyer
Abstract	We consider a scenario where an agent has multiple available strategies to explore an unknown environment. For each new interaction with the environment, the agent must select which exploration strategy to use. We provide a new strategy-agnostic method that treats the situation as a Multi-Armed Bandits problem where the reward signal is the diversity of effects that each strategy produces. We test the method empirically on a simulated planar robotic arm, and establish that the method is able to discriminate between strategies of dissimilar quality, even when the differences are tenuous, and that the resulting performance is competitive with the best fixed mixture of strategies.
Tasks	Multi-Armed Bandits
Published	2018-08-23
URL	http://arxiv.org/abs/1808.07739v1
PDF	http://arxiv.org/pdf/1808.07739v1.pdf
PWC	https://paperswithcode.com/paper/diversity-driven-selection-of-exploration
Repo
Framework
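
The idea of rewarding a strategy for the diversity of effects it produces can be sketched as follows. Everything concrete here is an assumption made for illustration: the two toy strategies, the grid-coverage measure of diversity, and the epsilon-greedy selection rule are stand-ins, not the measure or bandit algorithm used in the paper.

```python
import random


def run(steps=500, epsilon=0.1, grid=20, seed=0):
    """Pick an exploration strategy at each step, rewarding newly seen effects."""
    rng = random.Random(seed)
    strategies = {
        # Effects spread over the whole unit square.
        "broad": lambda: (rng.random(), rng.random()),
        # Effects confined to a tiny patch of the unit square.
        "narrow": lambda: (0.52 + rng.uniform(-0.01, 0.01),
                           0.52 + rng.uniform(-0.01, 0.01)),
    }
    names = list(strategies)
    seen = set()                      # grid cells reached so far
    pulls = {n: 0 for n in names}
    gains = {n: 0.0 for n in names}   # summed diversity reward per strategy
    for _ in range(steps):
        untried = [n for n in names if pulls[n] == 0]
        if untried:
            name = untried[0]
        elif rng.random() < epsilon:
            name = rng.choice(names)
        else:
            name = max(names, key=lambda n: gains[n] / pulls[n])
        x, y = strategies[name]()
        cell = (int(x * grid), int(y * grid))
        reward = 0.0 if cell in seen else 1.0   # 1 only for a new cell
        seen.add(cell)
        pulls[name] += 1
        gains[name] += reward
    return pulls, len(seen)


pulls, covered = run()
print(pulls, covered)
```

Because the narrow strategy stops producing new effects almost immediately, its average reward collapses and the selector spends most of its budget on the broad one. The reward is non-stationary (new cells become rarer over time), so a windowed average may suit longer runs better than the running mean used here.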

Correlated Multi-armed Bandits with a Latent Random Source


Title	Correlated Multi-armed Bandits with a Latent Random Source
Authors	Samarth Gupta, Gauri Joshi, Osman Yağan
Abstract	We consider a novel multi-armed bandit framework where the rewards obtained by pulling the arms are functions of a common latent random variable. The correlation between arms due to the common random source can be used to design a generalized upper-confidence-bound (UCB) algorithm that identifies certain arms as non-competitive, and avoids exploring them. As a result, we reduce a $K$-armed bandit problem to a $(C+1)$-armed problem, where $C+1$ includes the best arm and $C$ competitive arms. Our regret analysis shows that the competitive arms need to be pulled $\mathcal{O}(\log T)$ times, while the non-competitive arms are pulled only $\mathcal{O}(1)$ times. As a result, there are regimes where our algorithm achieves $\mathcal{O}(1)$ regret as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms. We also evaluate lower bounds on the expected regret and prove that our correlated-UCB algorithm achieves $\mathcal{O}(1)$ regret whenever possible.
Tasks	Multi-Armed Bandits
Published	2018-08-17
URL	http://arxiv.org/abs/1808.05904v2
PDF	http://arxiv.org/pdf/1808.05904v2.pdf
PWC	https://paperswithcode.com/paper/correlated-multi-armed-bandits-with-a-latent
Repo
Framework
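
The sketch below is one possible reading of how correlation through a latent variable can rule out arms, not the authors' algorithm. It assumes known reward functions `G[k][x]` on a four-valued latent variable, defines a hypothetical "pseudo-reward" as the largest reward another arm could have produced given what was observed, and runs a standard UCB index only over arms whose average pseudo-reward is not below the empirical mean of the most-pulled arm.

```python
import math
import random

# Hypothetical reward functions g_k(x) for a latent x drawn uniformly from {0, 1, 2, 3}.
G = [
    [0.9, 0.9, 0.9, 0.1],  # arm 0, mean reward 0.7
    [0.2, 0.2, 0.2, 0.6],  # arm 1, mean reward 0.3
]


def pseudo_reward(l, k, r):
    """Largest reward arm l could give, given that arm k returned r."""
    return max(G[l][x] for x in range(len(G[k])) if G[k][x] == r)


def correlated_ucb(rounds=2000, seed=0):
    """Return the pull counts per arm after `rounds` rounds."""
    rng = random.Random(seed)
    n_arms = len(G)
    pulls = [0] * n_arms
    reward_sum = [0.0] * n_arms
    # pseudo_sum[l][k]: summed pseudo-rewards of arm l inferred from pulls of arm k
    pseudo_sum = [[0.0] * n_arms for _ in range(n_arms)]
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialise
        else:
            k_max = max(range(n_arms), key=lambda k: pulls[k])
            mean_max = reward_sum[k_max] / pulls[k_max]
            competitive = [l for l in range(n_arms)
                           if l == k_max
                           or pseudo_sum[l][k_max] / pulls[k_max] >= mean_max]
            arm = max(competitive,
                      key=lambda k: reward_sum[k] / pulls[k]
                      + math.sqrt(2 * math.log(t) / pulls[k]))
        x = rng.randrange(4)   # latent draw, never shown to the learner
        r = G[arm][x]
        pulls[arm] += 1
        reward_sum[arm] += r
        for l in range(n_arms):
            pseudo_sum[l][arm] += pseudo_reward(l, arm, r)
    return pulls


print(correlated_ucb())
```

In this toy instance the expected pseudo-reward of arm 1 given pulls of arm 0 is 0.3, below arm 0's mean of 0.7, so arm 1 ends up excluded from exploration once the estimates settle.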

SRDA: Generating Instance Segmentation Annotation Via Scanning, Reasoning And Domain Adaptation


Title	SRDA: Generating Instance Segmentation Annotation Via Scanning, Reasoning And Domain Adaptation
Authors	Wenqiang Xu, Yonglu Li, Cewu Lu
Abstract	Instance segmentation is a problem of significance in computer vision. However, preparing annotated data for this task is extremely time-consuming and costly. By combining the advantages of 3D scanning, reasoning, and GAN-based domain adaptation techniques, we introduce a novel pipeline named SRDA to obtain large quantities of training samples with very minor effort. Our pipeline is well-suited to scenes that can be scanned, i.e., most indoor and some outdoor scenarios. To evaluate our performance, we build three representative scenes and a new dataset, with 3D models of various common object categories and annotated real-world scene images. Extensive experiments show that our pipeline can achieve decent instance segmentation performance given very low human labor cost.
Tasks	Domain Adaptation, Instance Segmentation, Semantic Segmentation
Published	2018-01-26
URL	http://arxiv.org/abs/1801.08839v3
PDF	http://arxiv.org/pdf/1801.08839v3.pdf
PWC	https://paperswithcode.com/paper/srda-generating-instance-segmentation
Repo
Framework

Object Tracking in Hyperspectral Videos with Convolutional Features and Kernelized Correlation Filter


Title	Object Tracking in Hyperspectral Videos with Convolutional Features and Kernelized Correlation Filter
Authors	Kun Qian, Jun Zhou, Fengchao Xiong, Huixin Zhou, Juan Du
Abstract	Target tracking in hyperspectral videos is a new research topic. In this paper, a novel method based on a convolutional network and the Kernelized Correlation Filter (KCF) framework is presented for tracking objects of interest in hyperspectral videos. We extract a set of normalized three-dimensional cubes from the target region as fixed convolution filters which contain spectral information surrounding a target. The feature maps generated by convolutional operations are combined to form a three-dimensional representation of an object, thereby providing effective encoding of local spectral-spatial information. We show that a simple two-layer convolutional network is sufficient to learn robust representations without the need for offline training with a large dataset. In the tracking step, KCF is adopted to distinguish targets from the neighboring environment. Experimental results demonstrate that the proposed method performs well on sample hyperspectral videos, and outperforms several state-of-the-art methods tested on grayscale and color videos in the same scene.
Tasks	Object Tracking
Published	2018-10-28
URL	http://arxiv.org/abs/1810.11819v1
PDF	http://arxiv.org/pdf/1810.11819v1.pdf
PWC	https://paperswithcode.com/paper/object-tracking-in-hyperspectral-videos-with
Repo
Framework

The AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs


Title	The AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs
Authors	Ahmad Al Badawi, Jin Chao, Jie Lin, Chan Fook Mun, Jun Jie Sim, Benjamin Hong Meng Tan, Xiao Nan, Khin Mi Mi Aung, Vijay Ramaseshan Chandrasekhar
Abstract	Fully homomorphic encryption (FHE), with its widely-known feature of computing on encrypted data, empowers a wide range of privacy-concerned cloud applications including deep learning as a service. This comes at a high cost since FHE includes highly-intensive computation that requires enormous computing power. Although the literature includes a number of proposals to run CNNs on encrypted data, the performance is still far from satisfactory. In this paper, we push the level up and show how to accelerate the performance of running CNNs on encrypted data using GPUs. We evaluated a CNN that homomorphically classifies the MNIST dataset into 10 classes. We used a number of techniques such as low-precision training, a unified training and testing network, optimized FHE parameters and a very efficient GPU implementation to achieve high performance. Our solution achieved a high security level (> 128 bit) and high accuracy (99%). In terms of performance, our best results show that we could classify the entire testing dataset in 14.105 seconds, with per-image amortized time (1.411 milliseconds) 40.41x faster than prior art.
Tasks
Published	2018-11-02
URL	http://arxiv.org/abs/1811.00778v2
PDF	http://arxiv.org/pdf/1811.00778v2.pdf
PWC	https://paperswithcode.com/paper/the-alexnet-moment-for-homomorphic-encryption
Repo
Framework

Generalizing Across Domains via Cross-Gradient Training


Title	Generalizing Across Domains via Cross-Gradient Training
Authors	Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Siddhartha Chaudhuri, Preethi Jyothi, Sunita Sarawagi
Abstract	We present CROSSGRAD, a method to use multi-domain training data to learn a classifier that generalizes to new domains. CROSSGRAD does not need an adaptation phase via labeled or unlabeled data, or domain features in the new domain. Most existing domain adaptation methods attempt to erase domain signals using techniques like domain adversarial training. In contrast, CROSSGRAD is free to use domain signals for predicting labels, if it can prevent overfitting on training domains. We conceptualize the task in a Bayesian setting, in which a sampling step is implemented as data augmentation, based on domain-guided perturbations of input instances. CROSSGRAD trains a label classifier and a domain classifier in parallel on examples perturbed by loss gradients of each other’s objectives. This enables us to directly perturb inputs, without separating and re-mixing domain signals while making various distributional assumptions. Empirical evaluation on three different applications where this setting is natural establishes that (1) domain-guided perturbation provides consistently better generalization to unseen domains, compared to generic instance perturbation methods, and that (2) data augmentation is a more stable and accurate method than domain adversarial training.
Tasks	Data Augmentation, Domain Adaptation
Published	2018-04-28
URL	http://arxiv.org/abs/1804.10745v2
PDF	http://arxiv.org/pdf/1804.10745v2.pdf
PWC	https://paperswithcode.com/paper/generalizing-across-domains-via-cross
Repo
Framework
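
The cross-gradient perturbation described in the abstract can be sketched with two tiny logistic classifiers, where input gradients have a closed form. This is an illustrative reduction, not the authors' code: binary labels and domains, linear models, and the parameters `eps` and `alpha` are all assumptions.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def logistic_loss(w, b, x, target):
    """Binary cross-entropy of a linear classifier (w, b) on input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


def input_gradient(w, b, x, target):
    """Gradient of logistic_loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - target) * wi for wi in w]


def perturb(x, grad, eps):
    return [xi + eps * gi for xi, gi in zip(x, grad)]


def crossgrad_losses(label_clf, domain_clf, x, y, d, eps=0.5, alpha=0.5):
    """Losses for one example, each classifier seeing the other's perturbation."""
    (wl, bl), (wd, bd) = label_clf, domain_clf
    x_d = perturb(x, input_gradient(wd, bd, x, d), eps)  # along domain-loss gradient
    x_l = perturb(x, input_gradient(wl, bl, x, y), eps)  # along label-loss gradient
    label_loss = ((1 - alpha) * logistic_loss(wl, bl, x, y)
                  + alpha * logistic_loss(wl, bl, x_d, y))
    domain_loss = ((1 - alpha) * logistic_loss(wd, bd, x, d)
                   + alpha * logistic_loss(wd, bd, x_l, d))
    return label_loss, domain_loss, x_d, x_l


LABEL_CLF = ([0.5, 1.0], 0.0)     # hypothetical (weights, bias)
DOMAIN_CLF = ([1.0, -1.0], 0.0)
X = [0.5, 0.2]

label_loss, domain_loss, x_d, x_l = crossgrad_losses(LABEL_CLF, DOMAIN_CLF, X, y=1, d=1)
print(label_loss, domain_loss)
```

The label classifier is trained on `x_d`, an input nudged in the direction that most changes the domain prediction, which is how the data augmentation comes to be guided by domain signals. A training loop would differentiate the two returned losses with respect to each classifier's parameters.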

Automatic individual pig detection and tracking in surveillance videos


Title	Automatic individual pig detection and tracking in surveillance videos
Authors	Lei Zhang, Helen Gray, Xujiong Ye, Lisa Collins, Nigel Allinson
Abstract	Individual pig detection and tracking is an important requirement in many video-based pig monitoring applications. However, it remains a challenging task in complex scenes, due to problems of light fluctuation, similar appearances of pigs, shape deformations and occlusions. To tackle these problems, we propose a robust real-time multiple pig detection and tracking method which does not require manual marking or physical identification of the pigs, and works under both daylight and infrared light conditions. Our method couples a CNN-based detector and a correlation filter-based tracker via a novel hierarchical data association algorithm. The detector gains the best accuracy/speed trade-off by using the features derived from multiple layers at different scales in a one-stage prediction network. We define a tag-box for each pig as the tracking target, and the multiple object tracking is conducted in a key-points tracking manner using learned correlation filters. Under challenging conditions, the tracking failures are modelled based on the relations between responses of detector and tracker, and the data association algorithm allows the detection hypotheses to be refined, while drifted tracks are corrected by probing the tracking failures and then re-initializing the tracking. As a result, the optimal tracklets can sequentially grow with on-line refined detections, and tracking fragments are correctly integrated into respective tracks while keeping the original identifications. Experiments with a dataset captured from a commercial farm show that our method can robustly detect and track multiple pigs under challenging conditions. The promising performance of the proposed method also demonstrates the feasibility of long-term individual pig tracking in a complex environment and thus suggests commercial potential.
Tasks	Multiple Object Tracking, Object Tracking
Published	2018-12-12
URL	http://arxiv.org/abs/1812.04901v1
PDF	http://arxiv.org/pdf/1812.04901v1.pdf
PWC	https://paperswithcode.com/paper/automatic-individual-pig-detection-and
Repo
Framework

Deep-learning the Latent Space of Light Transport


Title	Deep-learning the Latent Space of Light Transport
Authors	Pedro Hermosilla, Sebastian Maisch, Tobias Ritschel, Timo Ropinski
Abstract	We suggest a method to directly deep-learn light transport, i.e., the mapping from a 3D geometry-illumination-material configuration to a shaded 2D image. While many previous learning methods have employed 2D convolutional neural networks applied to images, we show for the first time that light transport can be learned directly in 3D. The benefit of 3D over 2D is that the former can also correctly capture illumination effects related to occluded and/or semi-transparent geometry. To learn 3D light transport, we represent the 3D scene as an unstructured 3D point cloud, which is later, during rendering, projected to the 2D output image. Thus, we suggest a two-stage operator comprising a 3D network that first transforms the point cloud into a latent representation, which is later projected to the 2D output image using a dedicated 3D-2D network in a second step. We will show that our approach results in improved quality in terms of temporal coherence while retaining most of the computational efficiency of common 2D methods. As a consequence, the proposed two-stage operator serves as a valuable extension to modern deferred shading approaches.
Tasks
Published	2018-11-12
URL	https://arxiv.org/abs/1811.04756v2
PDF	https://arxiv.org/pdf/1811.04756v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-the-latent-space-of-light
Repo
Framework