Paper Group ANR 852
Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis
Title | Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis |
Authors | Haoye Dong, Xiaodan Liang, Ke Gong, Hanjiang Lai, Jia Zhu, Jian Yin |
Abstract | Despite remarkable advances in image synthesis research, existing works often fail to manipulate images under the context of large geometric transformations. Synthesizing person images conditioned on arbitrary poses is one of the most representative examples, where the generation quality largely relies on the capability of identifying and modeling arbitrary transformations on different body parts. Current generative models are often built on local convolutions and overlook the key challenges (e.g., heavy occlusions, different views or dramatic appearance changes) that arise when arbitrary pose manipulations cause distinct geometric changes for each body part. This paper aims to resolve these challenges induced by geometric variability and spatial displacements via a new Soft-Gated Warping Generative Adversarial Network (Warping-GAN), which is composed of two stages: 1) it first synthesizes a target part segmentation map given a target pose, which depicts the region-level spatial layouts for guiding image synthesis with higher-level structure constraints; 2) the Warping-GAN equipped with a soft-gated warping-block learns feature-level mapping to render textures from the original image into the generated segmentation map. Warping-GAN is capable of controlling different transformation degrees given distinct target poses. Moreover, the proposed warping-block is lightweight and flexible enough to be injected into any network. Human perceptual studies and quantitative evaluations demonstrate the superiority of our Warping-GAN, which significantly outperforms all existing methods on two large datasets. |
Tasks | Image Generation |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11610v2 |
http://arxiv.org/pdf/1810.11610v2.pdf | |
PWC | https://paperswithcode.com/paper/soft-gated-warping-gan-for-pose-guided-person |
Repo | |
Framework | |
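The soft-gated warping idea can be illustrated with a toy fusion step. The sketch below is a hypothetical, minimal stand-in: a nearest-neighbour warp by a fixed integer flow field plays the role of the learned warping grid, and a scalar gate (predicted by the network in the paper, fixed here) blends the warped features with the originals.

```python
import numpy as np

def warp_nn(feat, flow):
    """Nearest-neighbour warp of an (H, W) feature map by an integer
    flow field of shape (H, W, 2) -- a minimal stand-in for the
    learned warping grid in the paper."""
    h, w = feat.shape
    out = np.zeros_like(feat)
    for y in range(h):
        for x in range(w):
            sy = min(max(y + flow[y, x, 1], 0), h - 1)
            sx = min(max(x + flow[y, x, 0], 0), w - 1)
            out[y, x] = feat[sy, sx]
    return out

def soft_gated_warp(feat, warped, gate):
    """Soft-gated fusion: gate in [0, 1] controls how strongly warped
    features from the source pose replace the original features."""
    return gate * warped + (1.0 - gate) * feat
```

With a uniform rightward flow and gate 0.5, the output is the average of the shifted and original maps.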
Iterative Global Similarity Points : A robust coarse-to-fine integration solution for pairwise 3D point cloud registration
Title | Iterative Global Similarity Points : A robust coarse-to-fine integration solution for pairwise 3D point cloud registration |
Authors | Yue Pan, Bisheng Yang, Fuxun Liang, Zhen Dong |
Abstract | In this paper, we propose a coarse-to-fine integration solution, inspired by the classical ICP algorithm, for pairwise 3D point cloud registration with two improvements: hybrid metric spaces (e.g., BSC feature and Euclidean geometry spaces) and globally optimal correspondence matching. First, we detect the keypoints of point clouds and use the Binary Shape Context (BSC) descriptor to encode their local features. Then, we formulate the correspondence matching task as an energy function, which models the global similarity of keypoints on the hybrid spaces of BSC feature and Euclidean geometry. Next, we estimate the globally optimal correspondences by optimizing the energy function with the Kuhn-Munkres algorithm and then calculate the transformation based on the correspondences. Finally, we iteratively refine the transformation between two point clouds by conducting optimal correspondence matching and transformation calculation in a mutually reinforcing manner, to achieve coarse-to-fine registration under a unified framework. The proposed method is evaluated and compared to several state-of-the-art methods on selected challenging datasets with repetitive, symmetric and incomplete structures. Comprehensive experiments demonstrate that the proposed IGSP algorithm obtains good performance and outperforms the state-of-the-art methods in terms of both rotation and translation errors. |
Tasks | Point Cloud Registration |
Published | 2018-08-12 |
URL | http://arxiv.org/abs/1808.03899v1 |
http://arxiv.org/pdf/1808.03899v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-global-similarity-points-a-robust |
Repo | |
Framework | |
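The globally optimal matching step can be sketched as a linear assignment problem over a hybrid cost. Below is a minimal stand-in: a brute-force search over permutations replaces the Kuhn-Munkres algorithm (fine for tiny examples; real point clouds would use an O(n^3) solver such as SciPy's `linear_sum_assignment`), and the weighting `alpha` between feature and geometric distances is an assumption for illustration.

```python
from itertools import permutations

def hybrid_cost(feat_dist, geom_dist, alpha=0.5):
    """Blend feature-space and Euclidean-space keypoint distances
    (the blending weight alpha is assumed, not from the paper)."""
    return [[alpha * f + (1 - alpha) * g
             for f, g in zip(frow, grow)]
            for frow, grow in zip(feat_dist, geom_dist)]

def match_correspondences(cost):
    """Globally optimal one-to-one matching minimising total cost --
    a brute-force stand-in for the Kuhn-Munkres algorithm."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost
```

For two keypoints whose feature and geometric distances both favour the identity pairing, the optimal matching is (0,0), (1,1) with zero cost.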
Using a reservoir computer to learn chaotic attractors, with applications to chaos synchronisation and cryptography
Title | Using a reservoir computer to learn chaotic attractors, with applications to chaos synchronisation and cryptography |
Authors | Piotr Antonik, Marvyn Gulina, Jaël Pauwels, Serge Massar |
Abstract | Using the machine learning approach known as reservoir computing, it is possible to train one dynamical system to emulate another. We show that such trained reservoir computers reproduce the properties of the attractor of the chaotic system sufficiently well to exhibit chaos synchronisation. That is, the trained reservoir computer, weakly driven by the chaotic system, will synchronise with the chaotic system. Conversely, the chaotic system, weakly driven by a trained reservoir computer, will synchronise with the reservoir computer. We illustrate this behaviour on the Mackey-Glass and Lorenz systems. We then show that trained reservoir computers can be used to crack chaos-based cryptography and illustrate this on a chaos cryptosystem based on the Mackey-Glass system. We conclude by discussing why reservoir computers are so good at emulating chaotic systems. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02844v2 |
http://arxiv.org/pdf/1802.02844v2.pdf | |
PWC | https://paperswithcode.com/paper/using-a-reservoir-computer-to-learn-chaotic |
Repo | |
Framework | |
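The training procedure behind such emulation can be sketched with a minimal echo state network: a fixed random recurrent reservoir is driven by the signal, and only a linear readout is fitted by ridge regression to predict the next sample. All sizes and hyperparameters below are illustrative assumptions, and a sine wave stands in for the Mackey-Glass series.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n=200, spectral_radius=0.9, density=0.1):
    """Sparse random recurrent weights, rescaled to a target spectral
    radius (a standard echo-state-property heuristic)."""
    W = rng.normal(size=(n, n)) * (rng.random((n, n)) < density)
    return W * (spectral_radius / max(abs(np.linalg.eigvals(W))))

def run_reservoir(W, W_in, u, leak=0.5):
    """Drive the reservoir with input sequence u; collect states."""
    x = np.zeros(W.shape[0])
    states = []
    for ut in u:
        x = (1 - leak) * x + leak * np.tanh(W @ x + W_in * ut)
        states.append(x.copy())
    return np.array(states)

# Teacher signal: a sine wave stands in for the chaotic time series.
t = np.arange(3000)
u = np.sin(0.1 * t)

W = make_reservoir()
W_in = rng.normal(size=W.shape[0])
X = run_reservoir(W, W_in, u[:-1])     # states driven by u(t)
Y = u[1:]                              # target: u(t+1)

# Ridge-regression readout, discarding a 100-step washout.
A, ridge = X[100:], 1e-6
W_out = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ Y[100:])
mse = np.mean((A @ W_out - Y[100:]) ** 2)
```

Once trained, feeding the readout's prediction back as input turns the reservoir into an autonomous emulator of the teacher system.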
Multifunction Cognitive Radar Task Scheduling Using Monte Carlo Tree Search and Policy Networks
Title | Multifunction Cognitive Radar Task Scheduling Using Monte Carlo Tree Search and Policy Networks |
Authors | Mahdi Shaghaghi, Raviraj S. Adve, Zhen Ding |
Abstract | A modern radar may be designed to perform multiple functions, such as surveillance, tracking, and fire control. Each function requires the radar to execute a number of transmit-receive tasks. A radar resource management (RRM) module makes decisions on parameter selection, prioritization, and scheduling of such tasks. RRM becomes especially challenging in overload situations, where some tasks may need to be delayed or even dropped. In general, task scheduling is an NP-hard problem. In this work, we develop the branch-and-bound (B&B) method, which obtains the optimal solution but at exponential computational complexity. Heuristic methods, on the other hand, have low complexity but provide relatively poor performance. We resort to machine learning-based techniques to address this issue; specifically, we propose an approximate algorithm based on the Monte Carlo tree search method. Along with using bound and dominance rules to eliminate nodes from the search tree, we use a policy network to help reduce the width of the search. Such a network can be trained using solutions obtained by running the B&B method offline on problems of feasible complexity. We show that the proposed method provides near-optimal performance, but with computational complexity orders of magnitude smaller than the B&B algorithm. |
Tasks | |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07069v1 |
http://arxiv.org/pdf/1805.07069v1.pdf | |
PWC | https://paperswithcode.com/paper/multifunction-cognitive-radar-task-scheduling |
Repo | |
Framework | |
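The way a policy network narrows tree search can be sketched with a PUCT-style child selection rule: each candidate scheduling decision is scored by its empirical value plus an exploration bonus weighted by the policy prior, so low-prior branches are rarely expanded. The score formula and constants here are a generic illustrative choice, not the paper's exact rule.

```python
import math

def puct_select(children, policy_prior, c=1.4):
    """Pick the child maximising mean value + prior-weighted exploration.

    `children` maps action -> (visit_count, total_value); the policy
    prior (a fixed dict here, a trained network in the paper) biases
    search toward promising scheduling decisions, shrinking the
    effective width of the tree."""
    total_visits = sum(n for n, _ in children.values())
    def score(a):
        n, w = children[a]
        q = w / n if n else 0.0
        return q + c * policy_prior[a] * math.sqrt(total_visits + 1) / (1 + n)
    return max(children, key=score)
```

With a uniform prior an unvisited action wins on its exploration bonus, while a sharply peaked prior keeps the search on the high-prior action.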
A Tensor-based Structural Health Monitoring Approach for Aeroservoelastic Systems
Title | A Tensor-based Structural Health Monitoring Approach for Aeroservoelastic Systems |
Authors | Prasad Cheema, Nguyen Lu Dang Khoa, Moray Kidd, Gareth A. Vio |
Abstract | Structural health monitoring is a condition-based field of study used to monitor infrastructure via sensing systems; it is therefore applied in aerospace engineering to assist in monitoring the health of aerospace structures. A difficulty, however, is that the input in structural health monitoring usually comes from sensor arrays, which yields data that are highly redundant and correlated, and which traditional two-way matrix approaches have had difficulty deconstructing and interpreting. Newer methods involving tensor analysis allow us to analyse such multi-way structural data in a coherent manner. In our approach, we demonstrate the usefulness of tensor-based learning for damage detection on a novel $N$-DoF Lagrangian aeroservoelastic model. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04845v1 |
http://arxiv.org/pdf/1812.04845v1.pdf | |
PWC | https://paperswithcode.com/paper/a-tensor-based-structural-health-monitoring |
Repo | |
Framework | |
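The multi-way analysis alluded to above typically means a CP (CANDECOMP/PARAFAC) factorisation of a sensors × features × time tensor. The rank-1 alternating-least-squares sketch below is a hypothetical, minimal illustration of that idea, not the paper's pipeline.

```python
import numpy as np

def rank1_cp(T, iters=50):
    """Alternating least squares for a rank-1 CP factorisation,
    T ~ a (outer) b (outer) c, e.g. for a sensors x features x time
    data tensor. Each factor update is the closed-form LS solution
    with the other two factors held fixed."""
    I, J, K = T.shape
    rng = np.random.default_rng(0)
    a, b, c = rng.random(I), rng.random(J), rng.random(K)
    for _ in range(iters):
        a = np.einsum('ijk,j,k->i', T, b, c) / ((b @ b) * (c @ c))
        b = np.einsum('ijk,i,k->j', T, a, c) / ((a @ a) * (c @ c))
        c = np.einsum('ijk,i,j->k', T, a, b) / ((a @ a) * (b @ b))
    return a, b, c
```

On an exactly rank-1 tensor, ALS recovers the factorisation (up to scaling) and the reconstruction matches the input.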
Joint convolutional neural pyramid for depth map super-resolution
Title | Joint convolutional neural pyramid for depth map super-resolution |
Authors | Yi Xiao, Xiang Cao, Xianyi Zhu, Renzhi Yang, Yan Zheng |
Abstract | A high-resolution depth map can be inferred from a low-resolution one with the guidance of an additional high-resolution texture map of the same scene. Recently, deep neural networks with large receptive fields have been shown to benefit applications such as image completion. Our insight is that super-resolution is similar to image completion, where only part of the depth values are precisely known. In this paper, we present a joint convolutional neural pyramid model with large receptive fields for joint depth map super-resolution. Our model consists of three sub-networks: two convolutional neural pyramids connected by an ordinary convolutional neural network. The convolutional neural pyramids extract information from large receptive fields of the depth map and guidance map, while the convolutional neural network effectively transfers useful structures of the guidance image to the depth image. Experimental results show that our model outperforms existing state-of-the-art algorithms not only on data pairs of RGB/depth images, but also on other data pairs such as color/saliency and color-scribbles/colorized images. |
Tasks | Depth Map Super-Resolution, Super-Resolution |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.00968v1 |
http://arxiv.org/pdf/1801.00968v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-convolutional-neural-pyramid-for-depth |
Repo | |
Framework | |
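The pyramid structure that gives the model its large receptive field can be illustrated with plain average pooling: each level halves the resolution, so a fixed-size filter at a coarse level "sees" a much larger region of the input. This sketch shows only that multi-scale downsampling; the paper's pyramids are, of course, learned convolutional networks.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling of a 2D map (crops odd borders)."""
    h, w = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid(x, levels=3):
    """Build a list of progressively downsampled maps; a k x k filter
    at level L covers a (k * 2**L)-pixel region of the original."""
    out = [x]
    for _ in range(levels - 1):
        out.append(avg_pool2(out[-1]))
    return out
```

For a 4x4 input, three levels have shapes 4x4, 2x2 and 1x1, the last value being the global mean.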
Person re-identification across different datasets with multi-task learning
Title | Person re-identification across different datasets with multi-task learning |
Authors | Matthieu Ospici, Antoine Cecchi |
Abstract | This paper presents an approach to tackle the person re-identification problem, which is challenging due to large variations in pose, illumination and camera view. More and more datasets are available to train machine learning models for person re-identification. These datasets vary in capture conditions (number of cameras, camera positions, location, season), in size (number of images, number of distinct identities) and in labeling (some datasets are annotated with attributes while others are not). To deal with this variety of datasets, we present an approach that draws information from different datasets to build a system which performs well on all of them. Our model is based on a Convolutional Neural Network (CNN) and trained using multi-task learning. Several losses are used to extract the different kinds of information available in the different datasets. Our main task is learned with a classification loss; to reduce intra-class variation, we experiment with the center loss. Our paper ends with a performance evaluation in which we discuss the influence of the different losses on overall re-identification performance. We show that with our method we are able to build a system that performs well on different datasets and simultaneously extracts attributes. We also show that our system outperforms recent re-identification works on two datasets. |
Tasks | Multi-Task Learning, Person Re-Identification |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09666v1 |
http://arxiv.org/pdf/1807.09666v1.pdf | |
PWC | https://paperswithcode.com/paper/person-re-identification-across-different |
Repo | |
Framework | |
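The center loss mentioned above penalises the distance between each embedding and its class centre, shrinking intra-class variation. A minimal numpy sketch, following the usual formulation (Wen et al.); the update rate `alpha` is an assumption:

```python
import numpy as np

def center_loss(features, labels, centers):
    """L_c = 1/2 * mean ||x_i - c_{y_i}||^2: pulls each embedding
    toward the centre of its identity class."""
    diffs = features - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

def update_centers(features, labels, centers, alpha=0.5):
    """Running per-class centre update used alongside the loss."""
    new = centers.copy()
    for c in np.unique(labels):
        mask = labels == c
        delta = np.mean(centers[c] - features[mask], axis=0)
        new[c] = centers[c] - alpha * delta
    return new
```

When a class centre already sits at the mean of its embeddings, the update leaves it unchanged.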
Diversity-Driven Selection of Exploration Strategies in Multi-Armed Bandits
Title | Diversity-Driven Selection of Exploration Strategies in Multi-Armed Bandits |
Authors | Fabien C. Y. Benureau, Pierre-Yves Oudeyer |
Abstract | We consider a scenario where an agent has multiple strategies available to explore an unknown environment. For each new interaction with the environment, the agent must select which exploration strategy to use. We provide a new strategy-agnostic method that treats the situation as a multi-armed bandit problem in which the reward signal is the diversity of effects that each strategy produces. We test the method empirically on a simulated planar robotic arm, and establish that the method is able to discriminate between strategies of dissimilar quality, even when the differences are tenuous, and that the resulting performance is competitive with the best fixed mixture of strategies. |
Tasks | Multi-Armed Bandits |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07739v1 |
http://arxiv.org/pdf/1808.07739v1.pdf | |
PWC | https://paperswithcode.com/paper/diversity-driven-selection-of-exploration |
Repo | |
Framework | |
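The setup can be sketched as an ordinary UCB bandit whose "reward" is a diversity measure. Both pieces below are hypothetical simplifications: a grid-cell coverage count stands in for the paper's diversity measure, and plain UCB1 stands in for its selection rule.

```python
import math

def diversity_gain(seen_cells, effect, cell=0.1):
    """Reward 1.0 if the observed 2D effect lands in a grid cell not
    covered before, else 0.0 (toy stand-in for diversity of effects)."""
    key = (round(effect[0] / cell), round(effect[1] / cell))
    if key in seen_cells:
        return 0.0
    seen_cells.add(key)
    return 1.0

def select_strategy(counts, rewards, t, c=2.0):
    """UCB1 over exploration strategies, using accumulated diversity
    as the reward signal."""
    for s in range(len(counts)):
        if counts[s] == 0:
            return s
    return max(range(len(counts)),
               key=lambda s: rewards[s] / counts[s]
                             + c * math.sqrt(math.log(t) / counts[s]))
```

Repeating an effect yields no reward, so strategies that keep producing novel outcomes accumulate higher value.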
Correlated Multi-armed Bandits with a Latent Random Source
Title | Correlated Multi-armed Bandits with a Latent Random Source |
Authors | Samarth Gupta, Gauri Joshi, Osman Yağan |
Abstract | We consider a novel multi-armed bandit framework where the rewards obtained by pulling the arms are functions of a common latent random variable. The correlation between arms due to the common random source can be used to design a generalized upper-confidence-bound (UCB) algorithm that identifies certain arms as non-competitive and avoids exploring them. As a result, we reduce a $K$-armed bandit problem to a $(C+1)$-armed problem consisting of the best arm and $C$ competitive arms. Our regret analysis shows that the competitive arms need to be pulled $\mathcal{O}(\log T)$ times, while the non-competitive arms are pulled only $\mathcal{O}(1)$ times. As a result, there are regimes where our algorithm achieves $\mathcal{O}(1)$ regret, as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms. We also evaluate lower bounds on the expected regret and prove that our correlated-UCB algorithm achieves $\mathcal{O}(1)$ regret whenever possible. |
Tasks | Multi-Armed Bandits |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.05904v2 |
http://arxiv.org/pdf/1808.05904v2.pdf | |
PWC | https://paperswithcode.com/paper/correlated-multi-armed-bandits-with-a-latent |
Repo | |
Framework | |
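The arm-elimination idea can be sketched as a filter run before each UCB round: an arm stays in play only if its pseudo-reward bound (its best possible mean, inferred through the latent-source correlation from samples of the most-pulled arm) exceeds the best empirical mean. How `pseudo_ub` is computed depends on the known reward functions, so it is taken as a precomputed input here.

```python
def competitive_arms(means, pulls, pseudo_ub):
    """Return the arms to keep exploring: the empirically best arm plus
    every arm whose pseudo-reward upper bound (assumed precomputed from
    the most-pulled arm's samples) exceeds that arm's empirical mean.
    Arms failing the test are 'non-competitive' and skipped."""
    best = max(range(len(means)), key=lambda k: pulls[k])
    thresh = means[best]
    return [k for k in range(len(means))
            if k == best or pseudo_ub[k] >= thresh]
```

A standard UCB index is then computed only over the returned set, which is how the $K$-armed problem shrinks to $C+1$ arms.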
SRDA: Generating Instance Segmentation Annotation Via Scanning, Reasoning And Domain Adaptation
Title | SRDA: Generating Instance Segmentation Annotation Via Scanning, Reasoning And Domain Adaptation |
Authors | Wenqiang Xu, Yonglu Li, Cewu Lu |
Abstract | Instance segmentation is a problem of significance in computer vision. However, preparing annotated data for this task is extremely time-consuming and costly. By combining the advantages of 3D scanning, reasoning, and GAN-based domain adaptation techniques, we introduce a novel pipeline named SRDA to obtain large quantities of training samples with minimal effort. Our pipeline is well-suited to scenes that can be scanned, i.e. most indoor and some outdoor scenarios. To evaluate our performance, we build three representative scenes and a new dataset, with 3D models of various common object categories and annotated real-world scene images. Extensive experiments show that our pipeline can achieve decent instance segmentation performance at very low human labor cost. |
Tasks | Domain Adaptation, Instance Segmentation, Semantic Segmentation |
Published | 2018-01-26 |
URL | http://arxiv.org/abs/1801.08839v3 |
http://arxiv.org/pdf/1801.08839v3.pdf | |
PWC | https://paperswithcode.com/paper/srda-generating-instance-segmentation |
Repo | |
Framework | |
Object Tracking in Hyperspectral Videos with Convolutional Features and Kernelized Correlation Filter
Title | Object Tracking in Hyperspectral Videos with Convolutional Features and Kernelized Correlation Filter |
Authors | Kun Qian, Jun Zhou, Fengchao Xiong, Huixin Zhou, Juan Du |
Abstract | Target tracking in hyperspectral videos is a new research topic. In this paper, a novel method based on a convolutional network and the Kernelized Correlation Filter (KCF) framework is presented for tracking objects of interest in hyperspectral videos. We extract a set of normalized three-dimensional cubes from the target region as fixed convolution filters, which contain the spectral information surrounding a target. The feature maps generated by the convolutional operations are combined to form a three-dimensional representation of an object, thereby providing effective encoding of local spectral-spatial information. We show that a simple two-layer convolutional network is sufficient to learn robust representations without the need for offline training on a large dataset. In the tracking step, KCF is adopted to distinguish targets from the neighboring environment. Experimental results demonstrate that the proposed method performs well on sample hyperspectral videos, and outperforms several state-of-the-art methods tested on grayscale and color videos of the same scenes. |
Tasks | Object Tracking |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11819v1 |
http://arxiv.org/pdf/1810.11819v1.pdf | |
PWC | https://paperswithcode.com/paper/object-tracking-in-hyperspectral-videos-with |
Repo | |
Framework | |
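The fixed-filter feature extraction can be sketched directly: sample normalised spectral cubes from the target region, then cross-correlate each with the frame to obtain feature maps. Cube size, count and normalisation below are illustrative assumptions, and no learning is involved, mirroring the paper's training-free filters.

```python
import numpy as np

def extract_filters(cube, size=3, n=4, rng=None):
    """Sample n mean-subtracted, unit-norm (size x size x bands)
    patches from the target region to act as fixed conv filters."""
    rng = rng or np.random.default_rng(0)
    h, w, _ = cube.shape
    filters = []
    for _ in range(n):
        y = rng.integers(0, h - size + 1)
        x = rng.integers(0, w - size + 1)
        f = cube[y:y + size, x:x + size, :].astype(float)
        f -= f.mean()
        norm = np.linalg.norm(f)
        filters.append(f / norm if norm else f)
    return filters

def feature_maps(frame, filters):
    """Valid cross-correlation of each filter with the hyperspectral
    frame; the stacked responses form the spectral-spatial features."""
    size = filters[0].shape[0]
    h, w, _ = frame.shape
    out = np.zeros((len(filters), h - size + 1, w - size + 1))
    for k, f in enumerate(filters):
        for y in range(h - size + 1):
            for x in range(w - size + 1):
                out[k, y, x] = np.sum(frame[y:y + size, x:x + size, :] * f)
    return out
```

The resulting maps would then be fed to the KCF tracker in place of grayscale or HOG features.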
The AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs
Title | The AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs |
Authors | Ahmad Al Badawi, Jin Chao, Jie Lin, Chan Fook Mun, Jun Jie Sim, Benjamin Hong Meng Tan, Xiao Nan, Khin Mi Mi Aung, Vijay Ramaseshan Chandrasekhar |
Abstract | Fully homomorphic encryption (FHE), with its widely known feature of computing on encrypted data, empowers a wide range of privacy-concerned cloud applications, including deep learning as a service. This comes at a high cost, since FHE involves highly intensive computation that requires enormous computing power. Although the literature includes a number of proposals to run CNNs on encrypted data, the performance is still far from satisfactory. In this paper, we push the state of the art and show how to accelerate CNN inference on encrypted data using GPUs. We evaluated a CNN that homomorphically classifies the MNIST dataset into 10 classes. We used a number of techniques, such as low-precision training, a unified training and testing network, optimized FHE parameters and a very efficient GPU implementation, to achieve high performance. Our solution achieved a high security level (> 128 bits) and high accuracy (99%). In terms of performance, our best results show that we could classify the entire testing dataset in 14.105 seconds, with a per-image amortized time (1.411 milliseconds) 40.41x faster than prior art. |
Tasks | |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00778v2 |
http://arxiv.org/pdf/1811.00778v2.pdf | |
PWC | https://paperswithcode.com/paper/the-alexnet-moment-for-homomorphic-encryption |
Repo | |
Framework | |
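The amortized figure follows directly from the batch time, assuming the standard 10,000-image MNIST test set:

```python
batch_seconds = 14.105          # reported time for the full test set
n_images = 10_000               # MNIST test-set size (assumed)
per_image_ms = batch_seconds / n_images * 1000
# 1.4105 ms, which rounds to the reported 1.411 ms per image
```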
Generalizing Across Domains via Cross-Gradient Training
Title | Generalizing Across Domains via Cross-Gradient Training |
Authors | Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Siddhartha Chaudhuri, Preethi Jyothi, Sunita Sarawagi |
Abstract | We present CROSSGRAD, a method that uses multi-domain training data to learn a classifier that generalizes to new domains. CROSSGRAD does not need an adaptation phase via labeled or unlabeled data, or domain features in the new domain. Most existing domain adaptation methods attempt to erase domain signals using techniques like domain adversarial training. In contrast, CROSSGRAD is free to use domain signals for predicting labels, as long as it can prevent overfitting to the training domains. We conceptualize the task in a Bayesian setting, in which a sampling step is implemented as data augmentation based on domain-guided perturbations of input instances. CROSSGRAD trains a label classifier and a domain classifier in parallel, each on examples perturbed by the loss gradient of the other's objective. This enables us to directly perturb inputs, without separating and re-mixing domain signals while making various distributional assumptions. Empirical evaluation on three different applications where this setting is natural establishes that (1) domain-guided perturbation provides consistently better generalization to unseen domains than generic instance perturbation methods, and (2) data augmentation is a more stable and accurate method than domain adversarial training. |
Tasks | Data Augmentation, Domain Adaptation |
Published | 2018-04-28 |
URL | http://arxiv.org/abs/1804.10745v2 |
http://arxiv.org/pdf/1804.10745v2.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-across-domains-via-cross |
Repo | |
Framework | |
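The domain-guided perturbation can be sketched for the simplest case: a linear logistic domain classifier, whose input-gradient direction is used to produce a "domain-shifted" copy of the example for the label classifier. The normalised step of size `eps` is an illustrative choice, not the paper's exact update.

```python
import numpy as np

def domain_grad(x, w_d, y_d):
    """Gradient of a binary logistic domain classifier's loss w.r.t.
    the input x (single linear layer, for illustration only)."""
    p = 1.0 / (1.0 + np.exp(-x @ w_d))
    return (p - y_d) * w_d

def cross_perturb(x, w_d, y_d, eps=0.1):
    """CROSSGRAD-style augmentation: step along the domain-loss
    gradient so the label classifier trains on a domain-shifted copy."""
    g = domain_grad(x, w_d, y_d)
    n = np.linalg.norm(g)
    return x + eps * g / n if n else x
```

Symmetrically, the domain classifier would train on inputs perturbed by the label-loss gradient.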
Automatic individual pig detection and tracking in surveillance videos
Title | Automatic individual pig detection and tracking in surveillance videos |
Authors | Lei Zhang, Helen Gray, Xujiong Ye, Lisa Collins, Nigel Allinson |
Abstract | Individual pig detection and tracking is an important requirement in many video-based pig monitoring applications. However, it remains a challenging task in complex scenes due to light fluctuation, the similar appearance of pigs, shape deformation and occlusion. To tackle these problems, we propose a robust, real-time multiple-pig detection and tracking method which does not require manual marking or physical identification of the pigs, and which works under both daylight and infrared light conditions. Our method couples a CNN-based detector and a correlation filter-based tracker via a novel hierarchical data association algorithm. The detector gains the best accuracy/speed trade-off by using features derived from multiple layers at different scales in a one-stage prediction network. We define a tag-box for each pig as the tracking target, and multiple object tracking is conducted in a key-points tracking manner using learned correlation filters. Under challenging conditions, tracking failures are modelled based on the relations between the responses of the detector and the tracker, and the data association algorithm allows the detection hypotheses to be refined; meanwhile, drifted tracks can be corrected by probing the tracking failures, followed by the re-initialization of tracking. As a result, optimal tracklets can sequentially grow with online-refined detections, and tracking fragments are correctly integrated into their respective tracks while keeping the original identities. Experiments with a dataset captured from a commercial farm show that our method can robustly detect and track multiple pigs under challenging conditions. The promising performance of the proposed method also demonstrates the feasibility of long-term individual pig tracking in a complex environment and thus promises commercial potential. |
Tasks | Multiple Object Tracking, Object Tracking |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.04901v1 |
http://arxiv.org/pdf/1812.04901v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-individual-pig-detection-and |
Repo | |
Framework | |
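The data-association step can be sketched with greedy IoU matching of existing tracks to new detections. This is a deliberately simple stand-in for the paper's hierarchical association; the threshold is an assumed value.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def associate(tracks, detections, thresh=0.3):
    """Greedy one-to-one matching of tracks to detections by
    descending IoU; unmatched detections would spawn new tracks and
    unmatched tracks would be probed for failure."""
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    matched_t, matched_d, out = set(), set(), []
    for s, ti, di in pairs:
        if s < thresh or ti in matched_t or di in matched_d:
            continue
        matched_t.add(ti); matched_d.add(di); out.append((ti, di))
    return out
```

Two well-separated tracks are matched to their nearest detections regardless of detection order.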
Deep-learning the Latent Space of Light Transport
Title | Deep-learning the Latent Space of Light Transport |
Authors | Pedro Hermosilla, Sebastian Maisch, Tobias Ritschel, Timo Ropinski |
Abstract | We suggest a method to directly deep-learn light transport, i.e., the mapping from a 3D geometry-illumination-material configuration to a shaded 2D image. While many previous learning methods have employed 2D convolutional neural networks applied to images, we show for the first time that light transport can be learned directly in 3D. The benefit of 3D over 2D is that the former can also correctly capture illumination effects related to occluded and/or semi-transparent geometry. To learn 3D light transport, we represent the 3D scene as an unstructured 3D point cloud, which is later, during rendering, projected onto the 2D output image. Thus, we suggest a two-stage operator comprising a 3D network that first transforms the point cloud into a latent representation, which is then projected onto the 2D output image by a dedicated 3D-2D network in a second step. We show that our approach results in improved quality in terms of temporal coherence while retaining most of the computational efficiency of common 2D methods. As a consequence, the proposed two-stage operator serves as a valuable extension to modern deferred shading approaches. |
Tasks | |
Published | 2018-11-12 |
URL | https://arxiv.org/abs/1811.04756v2 |
https://arxiv.org/pdf/1811.04756v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-the-latent-space-of-light |
Repo | |
Framework | |
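The 3D-to-2D projection step can be sketched as a pinhole splat with a z-buffer: each point carries a (here scalar, in the paper latent) feature, and the nearest point per pixel wins. The camera model and resolution below are illustrative assumptions; the paper's projection is itself a learned 3D-2D network.

```python
import numpy as np

def splat(points, feats, f=1.0, res=8):
    """Project 3D points (camera at origin, looking down +z) into a
    res x res image, keeping the feature of the nearest point per
    pixel (simple z-buffer)."""
    img = np.zeros((res, res))
    depth = np.full((res, res), np.inf)
    for (x, y, z), v in zip(points, feats):
        if z <= 0:
            continue                      # behind the camera
        u = int((f * x / z + 1) / 2 * res)
        w = int((f * y / z + 1) / 2 * res)
        if 0 <= u < res and 0 <= w < res and z < depth[w, u]:
            depth[w, u] = z
            img[w, u] = v
    return img
```

Two points on the same ray resolve to the nearer one, which is the occlusion behaviour a 2D-only method cannot recover.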