October 20, 2019

Paper Group AWR 167

Automatic Latent Fingerprint Segmentation. Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks. AutoAugment: Learning Augmentation Policies from Data. Using phase instead of optical flow for action recognition. Human Motion Capture Using a Drone. Larger Norm More Transferable: An Adaptive Feature No …

Automatic Latent Fingerprint Segmentation

Title Automatic Latent Fingerprint Segmentation
Authors Dinh-Luan Nguyen, Kai Cao, Anil K. Jain
Abstract We present a simple but effective method for automatic latent fingerprint segmentation, called SegFinNet. SegFinNet takes a latent image as an input and outputs a binary mask highlighting the friction ridge pattern. Our algorithm combines fully convolutional neural network and detection-based approaches to process the entire input latent image in one shot instead of using latent patches. Experimental results on three different latent databases (i.e. NIST SD27, WVU, and an operational forensic database) show that SegFinNet outperforms both human markup for latents and the state-of-the-art latent segmentation algorithms. We further show that this improved cropping boosts the hit rate of a latent fingerprint matcher.
Tasks
Published 2018-04-25
URL http://arxiv.org/abs/1804.09650v2
PDF http://arxiv.org/pdf/1804.09650v2.pdf
PWC https://paperswithcode.com/paper/automatic-latent-fingerprint-segmentation
Repo https://github.com/luannd/MinutiaeNet
Framework tf

Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks

Title Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks
Authors Quoc Dung Cao, Youngjun Choe
Abstract After a hurricane, damage assessment is critical to emergency managers for efficient response and resource allocation. One way to gauge the damage extent is to quantify the number of flooded/damaged buildings, which is traditionally done by ground survey. This process can be labor-intensive and time-consuming. In this paper, we propose to improve the efficiency of building damage assessment by applying image classification algorithms to post-hurricane satellite imagery. At the known building coordinates (available from public data), we extract square-sized images from the satellite imagery to create training, validation, and test datasets. Each square-sized image contains a building to be classified as either ‘Flooded/Damaged’ (labeled by volunteers in a crowd-sourcing project) or ‘Undamaged’. We design and train a convolutional neural network from scratch and compare it with an existing neural network used widely for common object classification. We demonstrate the promise of our damage annotation model (over 97% accuracy) in the case study of building damage assessment in the Greater Houston area affected by 2017 Hurricane Harvey.
Tasks Image Classification, Object Classification
Published 2018-07-04
URL https://arxiv.org/abs/1807.01688v3
PDF https://arxiv.org/pdf/1807.01688v3.pdf
PWC https://paperswithcode.com/paper/detecting-damaged-buildings-on-post-hurricane
Repo https://github.com/johnson-shuffle/tdi-proposal
Framework none
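
The patch-extraction step described in the abstract (cutting a square image around each known building coordinate) can be illustrated with a minimal sketch. The function name `crop_square`, the patch size, and the zero-padding at image borders are assumptions made for illustration; the abstract does not specify them, and converting geographic coordinates to pixel indices is left out.

```python
def crop_square(image, center_row, center_col, size):
    """Return a size x size patch centered on (center_row, center_col).

    `image` is a list of rows (each row a list of pixel values). Pixels that
    fall outside the image are filled with 0 so every patch has the same
    shape (an assumption; the paper may handle borders differently).
    """
    half = size // 2
    height = len(image)
    width = len(image[0]) if height else 0
    patch = []
    for r in range(center_row - half, center_row - half + size):
        row = []
        for c in range(center_col - half, center_col - half + size):
            inside = 0 <= r < height and 0 <= c < width
            row.append(image[r][c] if inside else 0)
        patch.append(row)
    return patch


# Toy 5x5 "satellite image" whose pixel value encodes its position.
image = [[10 * r + c for c in range(5)] for r in range(5)]

# Hypothetical building located at pixel (2, 2).
patch = crop_square(image, center_row=2, center_col=2, size=3)
```

Each such patch would then be labeled 'Flooded/Damaged' or 'Undamaged' and fed to the classifier.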

AutoAugment: Learning Augmentation Policies from Data

Title AutoAugment: Learning Augmentation Policies from Data
Authors Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le
Abstract Data augmentation is an effective technique for improving the accuracy of modern image classifiers. However, current data augmentation implementations are manually designed. In this paper, we describe a simple procedure called AutoAugment to automatically search for improved data augmentation policies. In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch. A sub-policy consists of two operations, each operation being an image processing function such as translation, rotation, or shearing, and the probabilities and magnitudes with which the functions are applied. We use a search algorithm to find the best policy such that the neural network yields the highest validation accuracy on a target dataset. Our method achieves state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On ImageNet, we attain a Top-1 accuracy of 83.5% which is 0.4% better than the previous record of 83.1%. On CIFAR-10, we achieve an error rate of 1.5%, which is 0.6% better than the previous state-of-the-art. Augmentation policies we find are transferable between datasets. The policy learned on ImageNet transfers well to achieve significant improvements on other datasets, such as Oxford Flowers, Caltech-101, Oxford-IIIT Pets, FGVC Aircraft, and Stanford Cars.
Tasks Data Augmentation, Fine-Grained Image Classification, Image Augmentation, Image Classification
Published 2018-05-24
URL http://arxiv.org/abs/1805.09501v3
PDF http://arxiv.org/pdf/1805.09501v3.pdf
PWC https://paperswithcode.com/paper/autoaugment-learning-augmentation-policies
Repo https://github.com/northeastsquare/effficientnet
Framework tf
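
The policy structure in the abstract (a policy is a list of sub-policies; one sub-policy is chosen at random per image; each sub-policy holds two operations, each with a probability and a magnitude) can be sketched as follows. The two toy operations, the example policy values, and the list-of-integers "image" are all hypothetical stand-ins; the search procedure that finds good policies is not shown.

```python
import random


def translate(pixels, magnitude):
    """Toy 'translation': cyclically shift the pixel list by `magnitude`."""
    if not pixels:
        return list(pixels)
    k = magnitude % len(pixels)
    return pixels[-k:] + pixels[:-k] if k else list(pixels)


def brighten(pixels, magnitude):
    """Toy brightness change: add `magnitude`, clipped to 255."""
    return [min(255, p + magnitude) for p in pixels]


OPS = {"translate": translate, "brighten": brighten}

# Hypothetical policy: each sub-policy is two (operation, probability, magnitude) triples.
POLICY = [
    [("translate", 0.8, 1), ("brighten", 0.5, 20)],
    [("brighten", 1.0, 40), ("translate", 0.3, 2)],
]


def apply_policy(pixels, policy, rng):
    """Pick one sub-policy at random and apply each of its operations
    with that operation's own probability."""
    sub_policy = rng.choice(policy)
    for name, prob, magnitude in sub_policy:
        if rng.random() < prob:
            pixels = OPS[name](pixels, magnitude)
    return pixels


rng = random.Random(0)  # seeded only to make the sketch reproducible
batch = [[0, 50, 100, 150], [10, 20, 30, 40]]
augmented = [apply_policy(img, POLICY, rng) for img in batch]
```

Because the sub-policy is drawn independently for every image, two images in the same mini-batch can receive different augmentations.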

Using phase instead of optical flow for action recognition

Title Using phase instead of optical flow for action recognition
Authors Omar Hommos, Silvia L. Pintea, Pascal S. M. Mettes, Jan C. van Gemert
Abstract Currently, the most common motion representation for action recognition is optical flow. Optical flow is based on particle tracking which adheres to a Lagrangian perspective on dynamics. In contrast to the Lagrangian perspective, the Eulerian model of dynamics does not track, but describes local changes. For video, an Eulerian phase-based motion representation, using complex steerable filters, has been successfully employed recently for motion magnification and video frame interpolation. Inspired by these previous works, here, we propose learning Eulerian motion representations in a deep architecture for action recognition. We learn filters in the complex domain in an end-to-end manner. We design these complex filters to resemble complex Gabor filters, typically employed for phase-information extraction. We propose a phase-information extraction module, based on these complex filters, that can be used in any network architecture for extracting Eulerian representations. We experimentally analyze the added value of Eulerian motion representations, as extracted by our proposed phase extraction module, and compare with existing motion representations based on optical flow, on the UCF101 dataset.
Tasks Optical Flow Estimation, Temporal Action Localization, Video Frame Interpolation
Published 2018-09-10
URL http://arxiv.org/abs/1809.03258v2
PDF http://arxiv.org/pdf/1809.03258v2.pdf
PWC https://paperswithcode.com/paper/using-phase-instead-of-optical-flow-for
Repo https://github.com/11maxed11/phase-based-action-recognition
Framework tf
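
To make the Eulerian idea concrete, the sketch below uses a fixed (not learned) 1D complex Gabor filter to read off local phase at a single position in two "frames". The paper learns its filters end to end inside a network; the hand-set frequency, width, and signal here are assumptions chosen only to show that a small shift of the pattern appears as a change in phase at a fixed location, with no tracking.

```python
import cmath
import math


def complex_gabor(length, frequency, sigma):
    """1D complex Gabor kernel: Gaussian envelope times a complex sinusoid."""
    center = (length - 1) / 2
    return [
        math.exp(-((n - center) ** 2) / (2 * sigma ** 2))
        * cmath.exp(1j * 2 * math.pi * frequency * (n - center))
        for n in range(length)
    ]


def local_phase(signal, kernel, position):
    """Phase of the filter response for the window centered at `position`.

    The window must fit inside the signal; the kernel length is assumed odd.
    """
    half = len(kernel) // 2
    response = sum(
        signal[position - half + k] * kernel[k].conjugate()
        for k in range(len(kernel))
    )
    return cmath.phase(response)


FREQ = 0.1   # cycles per pixel (hypothetical)
SHIFT = 2    # pixels the pattern moves between the two frames (hypothetical)

kernel = complex_gabor(length=31, frequency=FREQ, sigma=5.0)
frame_a = [math.cos(2 * math.pi * FREQ * n) for n in range(41)]
frame_b = [math.cos(2 * math.pi * FREQ * (n - SHIFT)) for n in range(41)]

# Phase change observed at one fixed pixel; for this sinusoid it should be
# close to -2 * pi * FREQ * SHIFT.
phase_change = local_phase(frame_b, kernel, 20) - local_phase(frame_a, kernel, 20)
```

In general a phase difference has to be wrapped into (-pi, pi] before it is interpreted; the values chosen here avoid that case.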

Human Motion Capture Using a Drone

Title Human Motion Capture Using a Drone
Authors Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis
Abstract Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments. In this work we introduce a drone-based system for 3D human MoCap. The system only needs an autonomously flying drone with an on-board RGB camera and is usable in various indoor and outdoor environments. A reconstruction algorithm is developed to recover full-body motion from the video recorded by the drone. We argue that, besides the capability of tracking a moving subject, a flying drone also provides fast varying viewpoints, which is beneficial for motion reconstruction. We evaluate the accuracy of the proposed system using our new DroCap dataset and also demonstrate its applicability for MoCap in the wild using a consumer drone.
Tasks Motion Capture
Published 2018-04-17
URL http://arxiv.org/abs/1804.06112v1
PDF http://arxiv.org/pdf/1804.06112v1.pdf
PWC https://paperswithcode.com/paper/human-motion-capture-using-a-drone
Repo https://github.com/daniilidis-group/drocap
Framework none

Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation

Title Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation
Authors Ruijia Xu, Guanbin Li, Jihan Yang, Liang Lin
Abstract Domain adaptation enables the learner to safely generalize into novel environments by mitigating domain shifts across distributions. Previous works may not effectively uncover the underlying reasons that would lead to the drastic model degradation on the target task. In this paper, we empirically reveal that the erratic discrimination of the target domain mainly stems from its much smaller feature norms relative to those of the source domain. To this end, we propose a novel parameter-free Adaptive Feature Norm approach. We demonstrate that progressively adapting the feature norms of the two domains to a large range of values can result in significant transfer gains, implying that those task-specific features with larger norms are more transferable. Our method successfully unifies the computation of both standard and partial domain adaptation with more robustness against the negative transfer issue. Without bells and whistles but a few lines of code, our method substantially lifts the performance on the target task and exceeds the state of the art by a large margin (11.5% on Office-Home and 17.1% on VisDA2017). We hope our simple yet effective approach will shed some light on the future research of transfer learning. Code is available at https://github.com/jihanyang/AFN.
Tasks Domain Adaptation, Partial Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation
Published 2018-11-19
URL https://arxiv.org/abs/1811.07456v2
PDF https://arxiv.org/pdf/1811.07456v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-domain-adaptation-an-adaptive
Repo https://github.com/jihanyang/AFN
Framework pytorch

Differentiable plasticity: training plastic neural networks with backpropagation

Title Differentiable plasticity: training plastic neural networks with backpropagation
Authors Thomas Miconi, Jeff Clune, Kenneth O. Stanley
Abstract How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional (1000+ pixels) natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. We conclude that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem.
Tasks Meta-Learning, Omniglot
Published 2018-04-06
URL http://arxiv.org/abs/1804.02464v3
PDF http://arxiv.org/pdf/1804.02464v3.pdf
PWC https://paperswithcode.com/paper/differentiable-plasticity-training-plastic
Repo https://github.com/darylfung96/differentiable_plasticity
Framework pytorch
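
A forward pass through a layer with Hebbian plastic connections might look like the sketch below. The abstract does not spell out the update rule, so the decaying Hebbian trace and the names `alpha` (per-connection plasticity coefficient) and `eta` (trace learning rate) are assumptions for illustration. In the differentiable setting described, `w` and `alpha` would be trained by gradient descent through these steps, which this plain-Python sketch does not do.

```python
import math


def plastic_step(x, w, alpha, hebb, eta):
    """One forward step of a layer with plastic connections.

    Effective weight of connection i -> j is w[i][j] + alpha[i][j] * hebb[i][j].
    After computing the output, the Hebbian trace moves toward the product of
    pre-synaptic and post-synaptic activity (assumed rule).
    """
    n_in, n_out = len(w), len(w[0])
    y = [
        math.tanh(
            sum((w[i][j] + alpha[i][j] * hebb[i][j]) * x[i] for i in range(n_in))
        )
        for j in range(n_out)
    ]
    new_hebb = [
        [(1 - eta) * hebb[i][j] + eta * x[i] * y[j] for j in range(n_out)]
        for i in range(n_in)
    ]
    return y, new_hebb


# Hypothetical 2-input, 1-output layer with an empty trace at the start.
x = [1.0, 0.0]
w = [[0.5], [0.2]]
alpha = [[1.0], [1.0]]
hebb = [[0.0], [0.0]]

y1, hebb = plastic_step(x, w, alpha, hebb, eta=0.1)
y2, hebb = plastic_step(x, w, alpha, hebb, eta=0.1)
```

Presenting the same input twice gives a stronger second response, because the trace built up on the first step adds to the fixed weight.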

GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint

Title GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint
Authors Jianlin Su
Abstract We know SGAN may have a risk of gradient vanishing. WGAN is a significant improvement: it imposes a 1-Lipschitz constraint on the discriminator to prevent gradient vanishing. Is there a GAN with neither gradient vanishing nor a 1-Lipschitz constraint on the discriminator? We do find one, called GAN-QP. Constructing a new Generative Adversarial Network (GAN) framework usually involves three steps: 1. choose a probability divergence; 2. convert it into a dual form; 3. play a min-max game. In this article, we demonstrate that the first step is not necessary. We can analyse the properties of a divergence and even construct a new divergence directly in the dual space. As a reward, we obtain a simpler alternative to WGAN: GAN-QP. We demonstrate that GAN-QP performs better than WGAN in theory and practice.
Tasks
Published 2018-11-18
URL http://arxiv.org/abs/1811.07296v4
PDF http://arxiv.org/pdf/1811.07296v4.pdf
PWC https://paperswithcode.com/paper/gan-qp-a-novel-gan-framework-without-gradient
Repo https://github.com/bojone/gan-qp
Framework pytorch

Fighting Fake News: Image Splice Detection via Learned Self-Consistency

Title Fighting Fake News: Image Splice Detection via Learned Self-Consistency
Authors Minyoung Huh, Andrew Liu, Andrew Owens, Alexei A. Efros
Abstract Advances in photo editing and manipulation tools have made it significantly easier to create fake imagery. Learning to detect such manipulations, however, remains a challenging problem due to the lack of sufficient amounts of manipulated training data. In this paper, we propose a learning algorithm for detecting visual image manipulations that is trained only using a large dataset of real photographs. The algorithm uses the automatically recorded photo EXIF metadata as supervisory signal for training a model to determine whether an image is self-consistent – that is, whether its content could have been produced by a single imaging pipeline. We apply this self-consistency model to the task of detecting and localizing image splices. The proposed method obtains state-of-the-art performance on several image forensics benchmarks, despite never seeing any manipulated images at training. That said, it is merely a step in the long quest for a truly general purpose visual forensics tool.
Tasks
Published 2018-05-10
URL http://arxiv.org/abs/1805.04096v3
PDF http://arxiv.org/pdf/1805.04096v3.pdf
PWC https://paperswithcode.com/paper/fighting-fake-news-image-splice-detection-via
Repo https://github.com/minyoungg/selfconsistency
Framework tf

FARM: Functional Automatic Registration Method for 3D Human Bodies

Title FARM: Functional Automatic Registration Method for 3D Human Bodies
Authors Riccardo Marin, Simone Melzi, Emanuele Rodolà, Umberto Castellani
Abstract We introduce a new method for non-rigid registration of 3D human shapes. Our proposed pipeline builds upon a given parametric model of the human, and makes use of the functional map representation for encoding and inferring shape maps throughout the registration process. This combination endows our method with robustness to a large variety of nuisances observed in practical settings, including non-isometric transformations, downsampling, topological noise, and occlusions; further, the pipeline can be applied invariably across different shape representations (e.g. meshes and point clouds), and in the presence of (even dramatic) missing parts such as those arising in real-world depth sensing applications. We showcase our method on a selection of challenging tasks, demonstrating results in line with, or even surpassing, state-of-the-art methods in the respective areas.
Tasks
Published 2018-07-27
URL http://arxiv.org/abs/1807.10517v1
PDF http://arxiv.org/pdf/1807.10517v1.pdf
PWC https://paperswithcode.com/paper/farm-functional-automatic-registration-method
Repo https://github.com/riccardomarin/FARM
Framework none

Fast and Accurate Online Video Object Segmentation via Tracking Parts

Title Fast and Accurate Online Video Object Segmentation via Tracking Parts
Authors Jingchun Cheng, Yi-Hsuan Tsai, Wei-Chih Hung, Shengjin Wang, Ming-Hsuan Yang
Abstract Research on video-based object detection algorithms
Tasks Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published 2018-06-06
URL http://arxiv.org/abs/1806.02323v1
PDF http://arxiv.org/pdf/1806.02323v1.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-online-video-object
Repo https://github.com/guanfuchen/video_obj
Framework pytorch

Fine-grained Apparel Classification and Retrieval without rich annotations

Title Fine-grained Apparel Classification and Retrieval without rich annotations
Authors Aniket Bhatnagar, Sanchit Aggarwal
Abstract The ability to correctly classify and retrieve apparel images has a variety of applications important to e-commerce, online advertising and internet search. In this work, we propose a robust framework for fine-grained apparel classification, in-shop and cross-domain retrieval which eliminates the requirement of rich annotations like bounding boxes and human-joints or clothing landmarks, and the training of a bounding box or key-landmark detector for the same. Factors such as subtle appearance differences, variations in human poses, different shooting angles, apparel deformations, and self-occlusion add to the challenges in classification and retrieval of apparel items. Cross-domain retrieval is even harder due to the large variation between online shopping images, usually taken in ideal lighting, pose, positive angle and clean background, and street photos captured by users in complicated conditions with poor lighting and cluttered scenes. Our framework uses a compact bilinear CNN with the tensor sketch algorithm to generate embeddings that capture local pairwise feature interactions in a translationally invariant manner. For apparel classification, we pass the feature embeddings through a softmax classifier, while the in-shop and cross-domain retrieval pipelines use a triplet-loss based optimization approach, such that the squared Euclidean distance between embeddings measures the dissimilarity between the images. Unlike previous works that relied on bounding box, key clothing landmark or human joint detectors to assist the final deep classifier, the proposed framework can be trained directly on the provided category labels or generated triplets for triplet loss optimization. Lastly, experimental results on the DeepFashion fine-grained categorization, and in-shop and consumer-to-shop retrieval datasets provide a comparative analysis with previous work performed in the domain.
Tasks
Published 2018-11-06
URL http://arxiv.org/abs/1811.02385v1
PDF http://arxiv.org/pdf/1811.02385v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-apparel-classification-and
Repo https://github.com/aniket03/keras_compact_bilnear_CNN
Framework tf
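
The retrieval objective mentioned in the abstract, a triplet loss over squared Euclidean distances between embeddings, can be written in a few lines. This is the standard hinge form of the triplet loss; the margin value and the toy 2D embeddings are assumptions, and the compact bilinear CNN that would produce real embeddings is not shown.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: zero once the negative is farther from the anchor
    than the positive by at least `margin` (in squared distance)."""
    gap = squared_euclidean(anchor, positive) - squared_euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical embeddings: a shop image (anchor), the same item photographed
# on the street (positive), and a different item (negative).
anchor = [0.0, 0.0]
positive = [1.0, 0.0]
easy_negative = [3.0, 0.0]
hard_negative = [0.5, 0.0]

loss_easy = triplet_loss(anchor, positive, easy_negative)
loss_hard = triplet_loss(anchor, positive, hard_negative)
```

Only the hard negative, which sits closer to the anchor than the positive does, contributes a non-zero loss.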

Nonparametric inference of interaction laws in systems of agents from trajectory data

Title Nonparametric inference of interaction laws in systems of agents from trajectory data
Authors Fei Lu, Mauro Maggioni, Sui Tang, Ming Zhong
Abstract Inferring the laws of interaction between particles and agents in complex dynamical systems from observational data is a fundamental challenge in a wide variety of disciplines. We propose a non-parametric statistical learning approach to estimate the governing laws of distance-based interactions, with no reference or assumption about their analytical form, from data consisting of trajectories of interacting agents. We demonstrate the effectiveness of our learning approach both by providing theoretical guarantees, and by testing the approach on a variety of prototypical systems in various disciplines. These systems include homogeneous and heterogeneous agent systems, ranging from particle systems in fundamental physics to agent-based systems modeling opinion dynamics under the social influence, prey-predator dynamics, flocking and swarming, and phototaxis in cell dynamics.
Tasks
Published 2018-12-14
URL http://arxiv.org/abs/1812.06003v4
PDF http://arxiv.org/pdf/1812.06003v4.pdf
PWC https://paperswithcode.com/paper/nonparametric-inference-of-interaction-laws
Repo https://github.com/MingZhongCodes/LearningDynamics
Framework none
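
As an illustration of what a "distance-based interaction law" can mean, the sketch below simulates a first-order system in which each agent moves toward the others with a strength given by a kernel `phi` of their distance. The specific model form (average of `phi(r) * (x_j - x_i)` over agents), the bounded-confidence kernel, and the Euler integrator are assumptions for illustration; the abstract does not state them, and the estimation of `phi` from trajectories, which is the paper's actual subject, is not shown.

```python
import math


def velocities(positions, phi):
    """Velocity of each agent under a distance-based interaction kernel `phi`.

    Assumed model: v_i = (1/N) * sum over j != i of phi(|x_j - x_i|) * (x_j - x_i).
    """
    n = len(positions)
    result = []
    for i, xi in enumerate(positions):
        v = [0.0] * len(xi)
        for j, xj in enumerate(positions):
            if i == j:
                continue
            diff = [b - a for a, b in zip(xi, xj)]
            r = math.sqrt(sum(d * d for d in diff))
            weight = phi(r) / n
            v = [vk + weight * dk for vk, dk in zip(v, diff)]
        result.append(v)
    return result


def euler_step(positions, phi, dt):
    """Advance all agents by one explicit Euler step of size `dt`."""
    vel = velocities(positions, phi)
    return [[x + dt * v for x, v in zip(xi, vi)] for xi, vi in zip(positions, vel)]


def bounded_confidence(r):
    """Hypothetical opinion-dynamics kernel: agents interact only within distance 1."""
    return 1.0 if r < 1.0 else 0.0


# Three agents with 1D "opinions"; the third is too far away to interact.
opinions = [[0.0], [0.5], [3.0]]
trajectory = [opinions]
for _ in range(5):
    trajectory.append(euler_step(trajectory[-1], bounded_confidence, dt=0.1))
```

A trajectory list like this is the kind of observational data from which an interaction kernel would be inferred.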

Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning

Title Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning
Authors Muhammad Shayan, Clement Fung, Chris J. M. Yoon, Ivan Beschastnikh
Abstract Federated Learning is the current state of the art in supporting secure multi-party machine learning (ML): data is maintained on the owner’s device and the updates to the model are aggregated through a secure protocol. However, this process assumes a trusted centralized infrastructure for coordination, and clients must trust that the central service does not use the byproducts of client data. In addition to this, a group of malicious clients could also harm the performance of the model by carrying out a poisoning attack. As a response, we propose Biscotti: a fully decentralized peer to peer (P2P) approach to multi-party ML, which uses blockchain and cryptographic primitives to coordinate a privacy-preserving ML process between peering clients. Our evaluation demonstrates that Biscotti is scalable, fault tolerant, and defends against known attacks. For example, Biscotti is able to protect the privacy of an individual client’s update and the performance of the global model at scale when 30% of adversaries are trying to poison the model. The implementation can be found at: https://github.com/DistributedML/Biscotti
Tasks
Published 2018-11-24
URL https://arxiv.org/abs/1811.09904v4
PDF https://arxiv.org/pdf/1811.09904v4.pdf
PWC https://paperswithcode.com/paper/biscotti-a-ledger-for-private-and-secure-peer
Repo https://github.com/DistributedML/Biscotti
Framework none

Worst-case Optimal Submodular Extensions for Marginal Estimation

Title Worst-case Optimal Submodular Extensions for Marginal Estimation
Authors Pankaj Pansari, Chris Russell, M. Pawan Kumar
Abstract Submodular extensions of an energy function can be used to efficiently compute approximate marginals via variational inference. The accuracy of the marginals depends crucially on the quality of the submodular extension. To identify the best possible extension, we show an equivalence between the submodular extensions of the energy and the objective functions of linear programming (LP) relaxations for the corresponding MAP estimation problem. This allows us to (i) establish the worst-case optimality of the submodular extension for Potts model used in the literature; (ii) identify the worst-case optimal submodular extension for the more general class of metric labeling; and (iii) efficiently compute the marginals for the widely used dense CRF model with the help of a recently proposed Gaussian filtering method. Using synthetic and real data, we show that our approach provides comparable upper bounds on the log-partition function to those obtained using tree-reweighted message passing (TRW) in cases where the latter is computationally feasible. Importantly, unlike TRW, our approach provides the first practical algorithm to compute an upper bound on the dense CRF model.
Tasks
Published 2018-01-10
URL http://arxiv.org/abs/1801.06490v1
PDF http://arxiv.org/pdf/1801.06490v1.pdf
PWC https://paperswithcode.com/paper/worst-case-optimal-submodular-extensions-for
Repo https://github.com/pankajpansari/denseCRF
Framework none