Paper Group AWR 167
Automatic Latent Fingerprint Segmentation
Title | Automatic Latent Fingerprint Segmentation |
Authors | Dinh-Luan Nguyen, Kai Cao, Anil K. Jain |
Abstract | We present a simple but effective method for automatic latent fingerprint segmentation, called SegFinNet. SegFinNet takes a latent image as input and outputs a binary mask highlighting the friction ridge pattern. Our algorithm combines a fully convolutional neural network with detection-based approaches to process the entire input latent image in one shot instead of using latent patches. Experimental results on three different latent databases (i.e. NIST SD27, WVU, and an operational forensic database) show that SegFinNet outperforms both human markup for latents and the state-of-the-art latent segmentation algorithms. We further show that this improved cropping boosts the hit rate of a latent fingerprint matcher. |
Tasks | |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09650v2 |
http://arxiv.org/pdf/1804.09650v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-latent-fingerprint-segmentation |
Repo | https://github.com/luannd/MinutiaeNet |
Framework | tf |
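
The one-shot, full-image design described in the abstract can be illustrated with a minimal fully convolutional sketch. This is not the SegFinNet architecture (see the linked repo for that); the layer sizes and threshold are illustrative assumptions.

```python
# Minimal one-shot latent-to-mask sketch (NOT the SegFinNet architecture):
# a tiny fully convolutional network maps a grayscale latent image to a
# per-pixel friction-ridge probability, thresholded into a binary mask.
import torch
import torch.nn as nn

class TinyMaskFCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),                 # per-pixel ridge logit
        )

    def forward(self, latent):                   # latent: (B, 1, H, W)
        return self.body(latent)                 # logits: (B, 1, H, W)

net = TinyMaskFCN()
latent = torch.rand(1, 1, 256, 256)              # stand-in latent image
mask = net(latent).sigmoid() > 0.5               # binary friction-ridge mask
```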
Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks
Title | Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks |
Authors | Quoc Dung Cao, Youngjun Choe |
Abstract | After a hurricane, damage assessment is critical to emergency managers for efficient response and resource allocation. One way to gauge the damage extent is to quantify the number of flooded/damaged buildings, which is traditionally done by ground survey. This process can be labor-intensive and time-consuming. In this paper, we propose to improve the efficiency of building damage assessment by applying image classification algorithms to post-hurricane satellite imagery. At the known building coordinates (available from public data), we extract square-sized images from the satellite imagery to create training, validation, and test datasets. Each square-sized image contains a building to be classified as either ‘Flooded/Damaged’ (labeled by volunteers in a crowd-sourcing project) or ‘Undamaged’. We design and train a convolutional neural network from scratch and compare it with an existing neural network used widely for common object classification. We demonstrate the promise of our damage annotation model (over 97% accuracy) in the case study of building damage assessment in the Greater Houston area affected by 2017 Hurricane Harvey. |
Tasks | Image Classification, Object Classification |
Published | 2018-07-04 |
URL | https://arxiv.org/abs/1807.01688v3 |
https://arxiv.org/pdf/1807.01688v3.pdf | |
PWC | https://paperswithcode.com/paper/detecting-damaged-buildings-on-post-hurricane |
Repo | https://github.com/johnson-shuffle/tdi-proposal |
Framework | none |
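
The patch-extraction step the abstract describes (cropping a square window at each known building coordinate, then classifying it) can be sketched as follows; the 128-pixel window size and pixel-coordinate indexing are assumptions, not the paper's exact values.

```python
# Sketch of the patch-extraction step: crop a square window around each
# known building coordinate, to be labeled Flooded/Damaged vs. Undamaged.
import numpy as np

def extract_patch(image, row, col, size=128):
    """Crop a size x size window anchored at a building pixel coordinate."""
    half = size // 2
    r0, c0 = max(row - half, 0), max(col - half, 0)
    return image[r0:r0 + size, c0:c0 + size]

imagery = np.random.rand(4000, 4000, 3)      # stand-in satellite tile
buildings = [(1200, 950), (2048, 3100)]      # known building pixel coords
patches = [extract_patch(imagery, r, c) for r, c in buildings]
# Each patch then goes to a binary CNN classifier (trained from scratch in
# the paper) that outputs P(Flooded/Damaged).
```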
AutoAugment: Learning Augmentation Policies from Data
Title | AutoAugment: Learning Augmentation Policies from Data |
Authors | Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le |
Abstract | Data augmentation is an effective technique for improving the accuracy of modern image classifiers. However, current data augmentation implementations are manually designed. In this paper, we describe a simple procedure called AutoAugment to automatically search for improved data augmentation policies. In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch. A sub-policy consists of two operations, each operation being an image processing function such as translation, rotation, or shearing, and the probabilities and magnitudes with which the functions are applied. We use a search algorithm to find the best policy such that the neural network yields the highest validation accuracy on a target dataset. Our method achieves state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On ImageNet, we attain a Top-1 accuracy of 83.5% which is 0.4% better than the previous record of 83.1%. On CIFAR-10, we achieve an error rate of 1.5%, which is 0.6% better than the previous state-of-the-art. Augmentation policies we find are transferable between datasets. The policy learned on ImageNet transfers well to achieve significant improvements on other datasets, such as Oxford Flowers, Caltech-101, Oxford-IIIT Pets, FGVC Aircraft, and Stanford Cars. |
Tasks | Data Augmentation, Fine-Grained Image Classification, Image Augmentation, Image Classification |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09501v3 |
http://arxiv.org/pdf/1805.09501v3.pdf | |
PWC | https://paperswithcode.com/paper/autoaugment-learning-augmentation-policies |
Repo | https://github.com/northeastsquare/effficientnet |
Framework | tf |
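
The policy structure the abstract describes, one randomly chosen sub-policy of two (operation, probability, magnitude) triples per image, can be sketched as follows. The two sub-policies shown are illustrative placeholders, not policies found by the paper's search.

```python
# Sketch of applying an AutoAugment-style policy at training time: draw one
# sub-policy per image; each of its two operations fires with its own
# probability at its own magnitude. Operations and values are illustrative.
import random
from PIL import Image, ImageOps

def rotate(img, mag):   return img.rotate(3 * mag)            # degrees
def solarize(img, mag): return ImageOps.solarize(img, 256 - 26 * mag)

policy = [
    [(rotate, 0.7, 2), (solarize, 0.3, 5)],
    [(solarize, 0.5, 8), (rotate, 0.6, 6)],
]

def autoaugment(img):
    sub_policy = random.choice(policy)        # one sub-policy per image
    for op, prob, magnitude in sub_policy:
        if random.random() < prob:
            img = op(img, magnitude)
    return img

augmented = autoaugment(Image.new("RGB", (32, 32)))
```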
Using phase instead of optical flow for action recognition
Title | Using phase instead of optical flow for action recognition |
Authors | Omar Hommos, Silvia L. Pintea, Pascal S. M. Mettes, Jan C. van Gemert |
Abstract | Currently, the most common motion representation for action recognition is optical flow. Optical flow is based on particle tracking, which adheres to a Lagrangian perspective on dynamics. In contrast to the Lagrangian perspective, the Eulerian model of dynamics does not track, but describes local changes. For video, an Eulerian phase-based motion representation, using complex steerable filters, has been successfully employed recently for motion magnification and video frame interpolation. Inspired by these previous works, we propose learning Eulerian motion representations in a deep architecture for action recognition. We learn filters in the complex domain in an end-to-end manner. We design these complex filters to resemble complex Gabor filters, typically employed for phase-information extraction. We propose a phase-information extraction module, based on these complex filters, that can be used in any network architecture for extracting Eulerian representations. We experimentally analyze the added value of Eulerian motion representations, as extracted by our proposed phase extraction module, and compare with existing motion representations based on optical flow, on the UCF101 dataset. |
Tasks | Optical Flow Estimation, Temporal Action Localization, Video Frame Interpolation |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03258v2 |
http://arxiv.org/pdf/1809.03258v2.pdf | |
PWC | https://paperswithcode.com/paper/using-phase-instead-of-optical-flow-for |
Repo | https://github.com/11maxed11/phase-based-action-recognition |
Framework | tf |
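
A minimal sketch of Eulerian phase extraction with a complex Gabor-like filter, the kind of filter the proposed module is designed to resemble; the filter width and frequency below are illustrative assumptions, not the paper's learned values.

```python
# Extract local phase with a fixed complex Gabor filter: a Gaussian
# envelope times a complex exponential. The phase of the filter response,
# differenced across frames, gives an Eulerian motion signal.
import numpy as np
from scipy.signal import convolve2d

def complex_gabor_1d(sigma=2.0, omega=1.5, radius=6):
    x = np.arange(-radius, radius + 1)
    return np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * omega * x)

kernel = complex_gabor_1d()[None, :]          # horizontal 1-D filter
frame = np.random.rand(64, 64)                # stand-in video frame
response = convolve2d(frame, kernel, mode="same")
phase = np.angle(response)                    # local phase per pixel
```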
Human Motion Capture Using a Drone
Title | Human Motion Capture Using a Drone |
Authors | Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis |
Abstract | Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments. In this work we introduce a drone-based system for 3D human MoCap. The system only needs an autonomously flying drone with an on-board RGB camera and is usable in various indoor and outdoor environments. A reconstruction algorithm is developed to recover full-body motion from the video recorded by the drone. We argue that, besides the capability of tracking a moving subject, a flying drone also provides fast varying viewpoints, which is beneficial for motion reconstruction. We evaluate the accuracy of the proposed system using our new DroCap dataset and also demonstrate its applicability for MoCap in the wild using a consumer drone. |
Tasks | Motion Capture |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06112v1 |
http://arxiv.org/pdf/1804.06112v1.pdf | |
PWC | https://paperswithcode.com/paper/human-motion-capture-using-a-drone |
Repo | https://github.com/daniilidis-group/drocap |
Framework | none |
Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation
Title | Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation |
Authors | Ruijia Xu, Guanbin Li, Jihan Yang, Liang Lin |
Abstract | Domain adaptation enables the learner to safely generalize into novel environments by mitigating domain shifts across distributions. Previous works may not effectively uncover the underlying reasons that lead to the drastic model degradation on the target task. In this paper, we empirically reveal that the erratic discrimination of the target domain mainly stems from its much smaller feature norms compared with those of the source domain. To this end, we propose a novel parameter-free Adaptive Feature Norm approach. We demonstrate that progressively adapting the feature norms of the two domains to a large range of values can result in significant transfer gains, implying that those task-specific features with larger norms are more transferable. Our method successfully unifies the computation of both standard and partial domain adaptation with more robustness against the negative transfer issue. Without bells and whistles, and with only a few lines of code, our method substantially lifts the performance on the target task and exceeds the state of the art by a large margin (11.5% on Office-Home and 17.1% on VisDA2017). We hope our simple yet effective approach will shed some light on future research in transfer learning. Code is available at https://github.com/jihanyang/AFN. |
Tasks | Domain Adaptation, Partial Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2018-11-19 |
URL | https://arxiv.org/abs/1811.07456v2 |
https://arxiv.org/pdf/1811.07456v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-an-adaptive |
Repo | https://github.com/jihanyang/AFN |
Framework | pytorch |
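
The "few lines of code" the abstract mentions can be read as a stepwise feature-norm loss: push each example's feature norm a small step Δr beyond its current (detached) value, for source and target batches alike. This is a hedged sketch of that reading; the linked repo holds the reference implementation.

```python
# Stepwise feature-norm adaptation sketch: the loss pulls each feature's
# L2 norm toward its own detached norm plus a step delta_r, so norms grow
# progressively over training. delta_r=1.0 is an illustrative value.
import torch

def stepwise_feature_norm_loss(features, delta_r=1.0):
    norms = features.norm(p=2, dim=1)
    target = norms.detach() + delta_r        # enlarge each norm by Δr
    return torch.mean((norms - target) ** 2)

feats = torch.randn(8, 256, requires_grad=True)   # backbone features
loss = stepwise_feature_norm_loss(feats)
loss.backward()
```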
Differentiable plasticity: training plastic neural networks with backpropagation
Title | Differentiable plasticity: training plastic neural networks with backpropagation |
Authors | Thomas Miconi, Jeff Clune, Kenneth O. Stanley |
Abstract | How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional (1,000+ pixel) natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. We conclude that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem. |
Tasks | Meta-Learning, Omniglot |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02464v3 |
http://arxiv.org/pdf/1804.02464v3.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-plasticity-training-plastic |
Repo | https://github.com/darylfung96/differentiable_plasticity |
Framework | pytorch |
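
A minimal sketch of a plastic recurrent layer in the spirit of the paper: each connection has a fixed weight and a plasticity coefficient, both trained by backprop, while the Hebbian trace is updated by the network's own activity. The shapes, shared learning rate eta, and tanh nonlinearity are simplifications of the paper's formulation.

```python
# Plastic layer sketch: effective weight = w + alpha * hebb, where hebb is
# an activity-driven Hebbian trace and w, alpha are learned by backprop.
import torch

n = 32
w     = torch.randn(n, n, requires_grad=True)   # fixed weights (learned)
alpha = torch.randn(n, n, requires_grad=True)   # plasticity coeffs (learned)
eta   = 0.1                                     # Hebbian rate (also learnable in the paper)
hebb  = torch.zeros(n, n)                       # plastic trace, starts empty

x = torch.randn(n)
for _ in range(5):                              # unroll a few time steps
    y = torch.tanh((w + alpha * hebb) @ x)
    hebb = (1 - eta) * hebb + eta * torch.outer(y, x)   # post x pre update
    x = y
```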
GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint
Title | GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint |
Authors | Jianlin Su |
Abstract | The standard GAN (SGAN) is known to risk vanishing gradients. A significant improvement is WGAN, which imposes a 1-Lipschitz constraint on the discriminator to prevent gradients from vanishing. Is there a GAN with neither vanishing gradients nor a 1-Lipschitz constraint on the discriminator? We find one, called GAN-QP. Constructing a new Generative Adversarial Network (GAN) framework usually involves three steps: 1. choose a probability divergence; 2. convert it into a dual form; 3. play a min-max game. In this article, we demonstrate that the first step is not necessary: we can analyze the properties of a divergence, and even construct new divergences, directly in the dual space. As a reward, we obtain a simpler alternative to WGAN: GAN-QP. We demonstrate that GAN-QP performs better than WGAN in both theory and practice. |
Tasks | |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07296v4 |
http://arxiv.org/pdf/1811.07296v4.pdf | |
PWC | https://paperswithcode.com/paper/gan-qp-a-novel-gan-framework-without-gradient |
Repo | https://github.com/bojone/gan-qp |
Framework | pytorch |
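
A hedged sketch of the quadratic-potential objective the abstract alludes to: the discriminator maximizes the real/fake score gap minus a quadratic penalty scaled by the distance between the samples (removing the need for a Lipschitz constraint), while the generator minimizes the gap. The λ value and the L1 sample distance are assumptions reconstructed from memory; consult the linked repo for the reference formulation.

```python
# GAN-QP-style losses (sketch): the quadratic term penalizes large score
# gaps relative to the sample distance, bounding the divergence without a
# Lipschitz constraint on the critic. lam=10 and the mean-abs distance
# are assumptions, not confirmed constants from the paper.
import torch

def gan_qp_d_loss(d_real, d_fake, x_real, x_fake, lam=10.0):
    gap = d_real - d_fake                                   # T(xr, xf)
    dist = (x_real - x_fake).flatten(1).abs().mean(dim=1)   # sample distance
    return (-gap + gap ** 2 / (2 * lam * dist)).mean()      # D minimizes this

def gan_qp_g_loss(d_real, d_fake):
    return (d_real - d_fake).mean()                         # G minimizes the gap

x_real, x_fake = torch.rand(4, 3 * 32 * 32), torch.rand(4, 3 * 32 * 32)
d_real, d_fake = torch.randn(4), torch.randn(4)             # critic scores
print(gan_qp_d_loss(d_real, d_fake, x_real, x_fake), gan_qp_g_loss(d_real, d_fake))
```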
Fighting Fake News: Image Splice Detection via Learned Self-Consistency
Title | Fighting Fake News: Image Splice Detection via Learned Self-Consistency |
Authors | Minyoung Huh, Andrew Liu, Andrew Owens, Alexei A. Efros |
Abstract | Advances in photo editing and manipulation tools have made it significantly easier to create fake imagery. Learning to detect such manipulations, however, remains a challenging problem due to the lack of sufficient amounts of manipulated training data. In this paper, we propose a learning algorithm for detecting visual image manipulations that is trained only using a large dataset of real photographs. The algorithm uses the automatically recorded photo EXIF metadata as supervisory signal for training a model to determine whether an image is self-consistent – that is, whether its content could have been produced by a single imaging pipeline. We apply this self-consistency model to the task of detecting and localizing image splices. The proposed method obtains state-of-the-art performance on several image forensics benchmarks, despite never seeing any manipulated images at training. That said, it is merely a step in the long quest for a truly general purpose visual forensics tool. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.04096v3 |
http://arxiv.org/pdf/1805.04096v3.pdf | |
PWC | https://paperswithcode.com/paper/fighting-fake-news-image-splice-detection-via |
Repo | https://github.com/minyoungg/selfconsistency |
Framework | tf |
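
The self-consistency idea can be sketched as a siamese network that scores whether two patches could share an imaging pipeline, supervised only by whether their source photos agree on EXIF attributes. All architecture sizes below are illustrative assumptions, not the paper's.

```python
# Siamese consistency sketch: encode two patches, concatenate, and score
# "same imaging pipeline?". Training uses only real photos plus their EXIF
# metadata; at test time, inconsistent regions flag a likely splice.
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64),
        )
    def forward(self, patch):
        return self.net(patch)

encoder = PatchEncoder()
head = nn.Linear(128, 1)                     # same-pipeline logit

a, b = torch.rand(2, 8, 3, 64, 64)           # two batches of patches
logit = head(torch.cat([encoder(a), encoder(b)], dim=1))
# Train with BCE against "do the source photos share this EXIF attribute?"
```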
FARM: Functional Automatic Registration Method for 3D Human Bodies
Title | FARM: Functional Automatic Registration Method for 3D Human Bodies |
Authors | Riccardo Marin, Simone Melzi, Emanuele Rodolà, Umberto Castellani |
Abstract | We introduce a new method for non-rigid registration of 3D human shapes. Our proposed pipeline builds upon a given parametric model of the human, and makes use of the functional map representation for encoding and inferring shape maps throughout the registration process. This combination endows our method with robustness to a large variety of nuisances observed in practical settings, including non-isometric transformations, downsampling, topological noise, and occlusions; further, the pipeline can be applied invariably across different shape representations (e.g. meshes and point clouds), and in the presence of (even dramatic) missing parts such as those arising in real-world depth sensing applications. We showcase our method on a selection of challenging tasks, demonstrating results in line with, or even surpassing, state-of-the-art methods in the respective areas. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10517v1 |
http://arxiv.org/pdf/1807.10517v1.pdf | |
PWC | https://paperswithcode.com/paper/farm-functional-automatic-registration-method |
Repo | https://github.com/riccardomarin/FARM |
Framework | none |
Fast and Accurate Online Video Object Segmentation via Tracking Parts
Title | Fast and Accurate Online Video Object Segmentation via Tracking Parts |
Authors | Jingchun Cheng, Yi-Hsuan Tsai, Wei-Chih Hung, Shengjin Wang, Ming-Hsuan Yang |
Abstract | Research on video-based object detection algorithms. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02323v1 |
http://arxiv.org/pdf/1806.02323v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-online-video-object |
Repo | https://github.com/guanfuchen/video_obj |
Framework | pytorch |
Fine-grained Apparel Classification and Retrieval without rich annotations
Title | Fine-grained Apparel Classification and Retrieval without rich annotations |
Authors | Aniket Bhatnagar, Sanchit Aggarwal |
Abstract | The ability to correctly classify and retrieve apparel images has a variety of applications important to e-commerce, online advertising, and internet search. In this work, we propose a robust framework for fine-grained apparel classification and for in-shop and cross-domain retrieval which eliminates the requirement of rich annotations like bounding boxes, human joints, or clothing landmarks, and the training of a bounding-box/key-landmark detector for the same. Factors such as subtle appearance differences, variations in human poses, different shooting angles, apparel deformations, and self-occlusion add to the challenges in classification and retrieval of apparel items. Cross-domain retrieval is even harder due to the large variation between online shopping images, usually taken in ideal lighting, pose, positive angle, and clean background, and street photos captured by users in complicated conditions with poor lighting and cluttered scenes. Our framework uses a compact bilinear CNN with the tensor sketch algorithm to generate embeddings that capture local pairwise feature interactions in a translationally invariant manner. For apparel classification, we pass the feature embeddings through a softmax classifier, while the in-shop and cross-domain retrieval pipelines use a triplet-loss based optimization approach, such that the squared Euclidean distance between embeddings measures the dissimilarity between images. Unlike previous works that relied on bounding-box, key-clothing-landmark, or human-joint detectors to assist the final deep classifier, the proposed framework can be trained directly on the provided category labels or generated triplets for triplet loss optimization. Lastly, experimental results on the DeepFashion fine-grained categorization, in-shop, and consumer-to-shop retrieval datasets provide a comparative analysis with previous work performed in the domain. |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02385v1 |
http://arxiv.org/pdf/1811.02385v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-apparel-classification-and |
Repo | https://github.com/aniket03/keras_compact_bilnear_CNN |
Framework | tf |
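
The retrieval objective described in the abstract, a triplet loss in which squared Euclidean distance measures dissimilarity, can be sketched in a few lines; the margin value is an illustrative assumption.

```python
# Triplet loss sketch: push the anchor-positive squared distance below the
# anchor-negative squared distance by a margin.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = (anchor - positive).pow(2).sum(dim=1)   # squared Euclidean
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

emb = lambda n: F.normalize(torch.randn(n, 128), dim=1)  # stand-in embeddings
loss = triplet_loss(emb(8), emb(8), emb(8))
# In the paper, the embeddings come from a compact bilinear CNN (tensor
# sketch), so pairwise feature interactions are captured before the loss.
```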
Nonparametric inference of interaction laws in systems of agents from trajectory data
Title | Nonparametric inference of interaction laws in systems of agents from trajectory data |
Authors | Fei Lu, Mauro Maggioni, Sui Tang, Ming Zhong |
Abstract | Inferring the laws of interaction between particles and agents in complex dynamical systems from observational data is a fundamental challenge in a wide variety of disciplines. We propose a non-parametric statistical learning approach to estimate the governing laws of distance-based interactions, with no reference or assumption about their analytical form, from data consisting of trajectories of interacting agents. We demonstrate the effectiveness of our learning approach both by providing theoretical guarantees and by testing the approach on a variety of prototypical systems in various disciplines. These systems include homogeneous and heterogeneous agent systems, ranging from particle systems in fundamental physics to agent-based systems modeling opinion dynamics under social influence, prey-predator dynamics, flocking and swarming, and phototaxis in cell dynamics. |
Tasks | |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.06003v4 |
http://arxiv.org/pdf/1812.06003v4.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-inference-of-interaction-laws |
Repo | https://github.com/MingZhongCodes/LearningDynamics |
Framework | none |
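
The class of systems whose interaction kernel φ the paper estimates can be written as first-order distance-based dynamics, ẋᵢ = (1/N) Σⱼ φ(‖xⱼ − xᵢ‖)(xⱼ − xᵢ). Below is a small simulation sketch with an illustrative φ; the paper recovers φ from observed trajectories rather than assuming it.

```python
# Forward-Euler simulation of a distance-based interaction system:
# each agent moves toward (or away from) the others with a strength
# given by phi of the pairwise distance.
import numpy as np

def simulate(x, phi, steps=100, dt=0.01):
    traj = [x.copy()]
    for _ in range(steps):
        diff = x[None, :, :] - x[:, None, :]           # diff[i, j] = x_j - x_i
        dist = np.linalg.norm(diff, axis=2) + 1e-12    # avoid divide-by-zero
        x = x + dt * (phi(dist)[:, :, None] * diff).mean(axis=1)
        traj.append(x.copy())
    return np.stack(traj)

x0 = np.random.randn(20, 2)                            # 20 agents in 2-D
trajectories = simulate(x0, phi=lambda r: 1.0 / r)     # illustrative kernel
```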
Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning
Title | Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning |
Authors | Muhammad Shayan, Clement Fung, Chris J. M. Yoon, Ivan Beschastnikh |
Abstract | Federated Learning is the current state of the art in supporting secure multi-party machine learning (ML): data is maintained on the owner’s device and the updates to the model are aggregated through a secure protocol. However, this process assumes a trusted centralized infrastructure for coordination, and clients must trust that the central service does not use the byproducts of client data. In addition, a group of malicious clients could harm the performance of the model by carrying out a poisoning attack. As a response, we propose Biscotti: a fully decentralized peer-to-peer (P2P) approach to multi-party ML, which uses blockchain and cryptographic primitives to coordinate a privacy-preserving ML process between peering clients. Our evaluation demonstrates that Biscotti is scalable, fault tolerant, and defends against known attacks. For example, Biscotti is able to protect the privacy of an individual client’s update and the performance of the global model at scale when 30% of adversaries are trying to poison the model. The implementation can be found at: https://github.com/DistributedML/Biscotti |
Tasks | |
Published | 2018-11-24 |
URL | https://arxiv.org/abs/1811.09904v4 |
https://arxiv.org/pdf/1811.09904v4.pdf | |
PWC | https://paperswithcode.com/paper/biscotti-a-ledger-for-private-and-secure-peer |
Repo | https://github.com/DistributedML/Biscotti |
Framework | none |
Worst-case Optimal Submodular Extensions for Marginal Estimation
Title | Worst-case Optimal Submodular Extensions for Marginal Estimation |
Authors | Pankaj Pansari, Chris Russell, M. Pawan Kumar |
Abstract | Submodular extensions of an energy function can be used to efficiently compute approximate marginals via variational inference. The accuracy of the marginals depends crucially on the quality of the submodular extension. To identify the best possible extension, we show an equivalence between the submodular extensions of the energy and the objective functions of linear programming (LP) relaxations for the corresponding MAP estimation problem. This allows us to (i) establish the worst-case optimality of the submodular extension for Potts model used in the literature; (ii) identify the worst-case optimal submodular extension for the more general class of metric labeling; and (iii) efficiently compute the marginals for the widely used dense CRF model with the help of a recently proposed Gaussian filtering method. Using synthetic and real data, we show that our approach provides comparable upper bounds on the log-partition function to those obtained using tree-reweighted message passing (TRW) in cases where the latter is computationally feasible. Importantly, unlike TRW, our approach provides the first practical algorithm to compute an upper bound on the dense CRF model. |
Tasks | |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.06490v1 |
http://arxiv.org/pdf/1801.06490v1.pdf | |
PWC | https://paperswithcode.com/paper/worst-case-optimal-submodular-extensions-for |
Repo | https://github.com/pankajpansari/denseCRF |
Framework | none |