Paper Group AWR 167
Automatic Latent Fingerprint Segmentation
Title | Automatic Latent Fingerprint Segmentation |
Authors | Dinh-Luan Nguyen, Kai Cao, Anil K. Jain |
Abstract | We present a simple but effective method for automatic latent fingerprint segmentation, called SegFinNet. SegFinNet takes a latent image as input and outputs a binary mask highlighting the friction ridge pattern. Our algorithm combines a fully convolutional neural network with detection-based approaches to process the entire input latent image in one shot instead of using latent patches. Experimental results on three different latent databases (i.e. NIST SD27, WVU, and an operational forensic database) show that SegFinNet outperforms both human markup for latents and the state-of-the-art latent segmentation algorithms. We further show that this improved cropping boosts the hit rate of a latent fingerprint matcher. |
Tasks | |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09650v2 |
http://arxiv.org/pdf/1804.09650v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-latent-fingerprint-segmentation |
Repo | https://github.com/luannd/MinutiaeNet |
Framework | tf |
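
The one-shot, full-image design described in the abstract can be illustrated with a minimal fully convolutional sketch. This is not the SegFinNet architecture (see the linked repo for that); the layer sizes and threshold are illustrative assumptions.

```python
# Minimal one-shot latent-to-mask sketch (NOT the SegFinNet architecture):
# a tiny fully convolutional network maps a grayscale latent image to a
# per-pixel friction-ridge probability, thresholded into a binary mask.
import torch
import torch.nn as nn

class TinyMaskFCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),                 # per-pixel ridge logit
        )

    def forward(self, latent):                   # latent: (B, 1, H, W)
        return self.body(latent)                 # logits: (B, 1, H, W)

net = TinyMaskFCN()
latent = torch.rand(1, 1, 256, 256)              # stand-in latent image
mask = net(latent).sigmoid() > 0.5               # binary friction-ridge mask
```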
Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks
Title | Building Damage Annotation on Post-Hurricane Satellite Imagery Based on Convolutional Neural Networks |
Authors | Quoc Dung Cao, Youngjun Choe |
Abstract | After a hurricane, damage assessment is critical to emergency managers for efficient response and resource allocation. One way to gauge the damage extent is to quantify the number of flooded/damaged buildings, which is traditionally done by ground survey. This process can be labor-intensive and time-consuming. In this paper, we propose to improve the efficiency of building damage assessment by applying image classification algorithms to post-hurricane satellite imagery. At the known building coordinates (available from public data), we extract square-sized images from the satellite imagery to create training, validation, and test datasets. Each square-sized image contains a building to be classified as either ‘Flooded/Damaged’ (labeled by volunteers in a crowd-sourcing project) or ‘Undamaged’. We design and train a convolutional neural network from scratch and compare it with an existing neural network used widely for common object classification. We demonstrate the promise of our damage annotation model (over 97% accuracy) in the case study of building damage assessment in the Greater Houston area affected by 2017 Hurricane Harvey. |
Tasks | Image Classification, Object Classification |
Published | 2018-07-04 |
URL | https://arxiv.org/abs/1807.01688v3 |
https://arxiv.org/pdf/1807.01688v3.pdf | |
PWC | https://paperswithcode.com/paper/detecting-damaged-buildings-on-post-hurricane |
Repo | https://github.com/johnson-shuffle/tdi-proposal |
Framework | none |
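
The patch-extraction step the abstract describes (cropping a square window at each known building coordinate, then classifying it) can be sketched as follows; the 128-pixel window size and pixel-coordinate indexing are assumptions, not the paper's exact values.

```python
# Sketch of the patch-extraction step: crop a square window around each
# known building coordinate, to be labeled Flooded/Damaged vs. Undamaged.
import numpy as np

def extract_patch(image, row, col, size=128):
    """Crop a size x size window anchored at a building pixel coordinate."""
    half = size // 2
    r0, c0 = max(row - half, 0), max(col - half, 0)
    return image[r0:r0 + size, c0:c0 + size]

imagery = np.random.rand(4000, 4000, 3)      # stand-in satellite tile
buildings = [(1200, 950), (2048, 3100)]      # known building pixel coords
patches = [extract_patch(imagery, r, c) for r, c in buildings]
# Each patch then goes to a binary CNN classifier (trained from scratch in
# the paper) that outputs P(Flooded/Damaged).
```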
AutoAugment: Learning Augmentation Policies from Data
Title | AutoAugment: Learning Augmentation Policies from Data |
Authors | Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le |
Abstract | Data augmentation is an effective technique for improving the accuracy of modern image classifiers. However, current data augmentation implementations are manually designed. In this paper, we describe a simple procedure called AutoAugment to automatically search for improved data augmentation policies. In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch. A sub-policy consists of two operations, each operation being an image processing function such as translation, rotation, or shearing, and the probabilities and magnitudes with which the functions are applied. We use a search algorithm to find the best policy such that the neural network yields the highest validation accuracy on a target dataset. Our method achieves state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On ImageNet, we attain a Top-1 accuracy of 83.5% which is 0.4% better than the previous record of 83.1%. On CIFAR-10, we achieve an error rate of 1.5%, which is 0.6% better than the previous state-of-the-art. Augmentation policies we find are transferable between datasets. The policy learned on ImageNet transfers well to achieve significant improvements on other datasets, such as Oxford Flowers, Caltech-101, Oxford-IIIT Pets, FGVC Aircraft, and Stanford Cars. |
Tasks | Data Augmentation, Fine-Grained Image Classification, Image Augmentation, Image Classification |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09501v3 |
http://arxiv.org/pdf/1805.09501v3.pdf | |
PWC | https://paperswithcode.com/paper/autoaugment-learning-augmentation-policies |
Repo | https://github.com/northeastsquare/effficientnet |
Framework | tf |
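
The policy structure the abstract describes, one randomly chosen sub-policy of two (operation, probability, magnitude) triples per image, can be sketched as follows. The two sub-policies shown are illustrative placeholders, not policies found by the paper's search.

```python
# Sketch of applying an AutoAugment-style policy at training time: draw one
# sub-policy per image; each of its two operations fires with its own
# probability at its own magnitude. Operations and values are illustrative.
import random
from PIL import Image, ImageOps

def rotate(img, mag):   return img.rotate(3 * mag)            # degrees
def solarize(img, mag): return ImageOps.solarize(img, 256 - 26 * mag)

policy = [
    [(rotate, 0.7, 2), (solarize, 0.3, 5)],
    [(solarize, 0.5, 8), (rotate, 0.6, 6)],
]

def autoaugment(img):
    sub_policy = random.choice(policy)        # one sub-policy per image
    for op, prob, magnitude in sub_policy:
        if random.random() < prob:
            img = op(img, magnitude)
    return img

augmented = autoaugment(Image.new("RGB", (32, 32)))
```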
Using phase instead of optical flow for action recognition
Title | Using phase instead of optical flow for action recognition |
Authors | Omar Hommos, Silvia L. Pintea, Pascal S. M. Mettes, Jan C. van Gemert |
Abstract | Currently, the most common motion representation for action recognition is optical flow. Optical flow is based on particle tracking, which adheres to a Lagrangian perspective on dynamics. In contrast to the Lagrangian perspective, the Eulerian model of dynamics does not track, but describes local changes. For video, an Eulerian phase-based motion representation, using complex steerable filters, has been successfully employed recently for motion magnification and video frame interpolation. Inspired by these previous works, we propose learning Eulerian motion representations in a deep architecture for action recognition. We learn filters in the complex domain in an end-to-end manner. We design these complex filters to resemble complex Gabor filters, typically employed for phase-information extraction. We propose a phase-information extraction module, based on these complex filters, that can be used in any network architecture for extracting Eulerian representations. We experimentally analyze the added value of Eulerian motion representations, as extracted by our proposed phase extraction module, and compare with existing motion representations based on optical flow, on the UCF101 dataset. |
Tasks | Optical Flow Estimation, Temporal Action Localization, Video Frame Interpolation |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03258v2 |
http://arxiv.org/pdf/1809.03258v2.pdf | |
PWC | https://paperswithcode.com/paper/using-phase-instead-of-optical-flow-for |
Repo | https://github.com/11maxed11/phase-based-action-recognition |
Framework | tf |
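
A minimal sketch of Eulerian phase extraction with a complex Gabor-like filter, the kind of filter the proposed module is designed to resemble; the filter width and frequency below are illustrative assumptions, not the paper's learned values.

```python
# Extract local phase with a fixed complex Gabor filter: a Gaussian
# envelope times a complex exponential. The phase of the filter response,
# differenced across frames, gives an Eulerian motion signal.
import numpy as np
from scipy.signal import convolve2d

def complex_gabor_1d(sigma=2.0, omega=1.5, radius=6):
    x = np.arange(-radius, radius + 1)
    return np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * omega * x)

kernel = complex_gabor_1d()[None, :]          # horizontal 1-D filter
frame = np.random.rand(64, 64)                # stand-in video frame
response = convolve2d(frame, kernel, mode="same")
phase = np.angle(response)                    # local phase per pixel
```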
Human Motion Capture Using a Drone
Title | Human Motion Capture Using a Drone |
Authors | Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis |
Abstract | Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments. In this work we introduce a drone-based system for 3D human MoCap. The system only needs an autonomously flying drone with an on-board RGB camera and is usable in various indoor and outdoor environments. A reconstruction algorithm is developed to recover full-body motion from the video recorded by the drone. We argue that, besides the capability of tracking a moving subject, a flying drone also provides fast varying viewpoints, which is beneficial for motion reconstruction. We evaluate the accuracy of the proposed system using our new DroCap dataset and also demonstrate its applicability for MoCap in the wild using a consumer drone. |
Tasks | Motion Capture |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06112v1 |
http://arxiv.org/pdf/1804.06112v1.pdf | |
PWC | https://paperswithcode.com/paper/human-motion-capture-using-a-drone |
Repo | https://github.com/daniilidis-group/drocap |
Framework | none |
Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation
Title | Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation |
Authors | Ruijia Xu, Guanbin Li, Jihan Yang, Liang Lin |
Abstract | Domain adaptation enables the learner to safely generalize into novel environments by mitigating domain shifts across distributions. Previous works may not effectively uncover the underlying reasons that lead to the drastic model degradation on the target task. In this paper, we empirically reveal that the erratic discrimination of the target domain mainly stems from its much smaller feature norms compared with those of the source domain. To this end, we propose a novel parameter-free Adaptive Feature Norm approach. We demonstrate that progressively adapting the feature norms of the two domains to a large range of values can result in significant transfer gains, implying that those task-specific features with larger norms are more transferable. Our method successfully unifies the computation of both standard and partial domain adaptation with more robustness against the negative transfer issue. Without bells and whistles, and with only a few lines of code, our method substantially lifts the performance on the target task and exceeds the state of the art by a large margin (11.5% on Office-Home and 17.1% on VisDA2017). We hope our simple yet effective approach will shed some light on future research in transfer learning. Code is available at https://github.com/jihanyang/AFN. |
Tasks | Domain Adaptation, Partial Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2018-11-19 |
URL | https://arxiv.org/abs/1811.07456v2 |
https://arxiv.org/pdf/1811.07456v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-an-adaptive |
Repo | https://github.com/jihanyang/AFN |
Framework | pytorch |
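
The "few lines of code" the abstract mentions can be read as a stepwise feature-norm loss: push each example's feature norm a small step Δr beyond its current (detached) value, for source and target batches alike. This is a hedged sketch of that reading; the linked repo holds the reference implementation.

```python
# Stepwise feature-norm adaptation sketch: the loss pulls each feature's
# L2 norm toward its own detached norm plus a step delta_r, so norms grow
# progressively over training. delta_r=1.0 is an illustrative value.
import torch

def stepwise_feature_norm_loss(features, delta_r=1.0):
    norms = features.norm(p=2, dim=1)
    target = norms.detach() + delta_r        # enlarge each norm by Δr
    return torch.mean((norms - target) ** 2)

feats = torch.randn(8, 256, requires_grad=True)   # backbone features
loss = stepwise_feature_norm_loss(feats)
loss.backward()
```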
Differentiable plasticity: training plastic neural networks with backpropagation
Title | Differentiable plasticity: training plastic neural networks with backpropagation |
Authors | Thomas Miconi, Jeff Clune, Kenneth O. Stanley |
Abstract | How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional (1,000+ pixel) natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. We conclude that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem. |
Tasks | Meta-Learning, Omniglot |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02464v3 |
http://arxiv.org/pdf/1804.02464v3.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-plasticity-training-plastic |
Repo | https://github.com/darylfung96/differentiable_plasticity |
Framework | pytorch |
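
A minimal sketch of a plastic recurrent layer in the spirit of the paper: each connection has a fixed weight and a plasticity coefficient, both trained by backprop, while the Hebbian trace is updated by the network's own activity. The shapes, shared learning rate eta, and tanh nonlinearity are simplifications of the paper's formulation.

```python
# Plastic layer sketch: effective weight = w + alpha * hebb, where hebb is
# an activity-driven Hebbian trace and w, alpha are learned by backprop.
import torch

n = 32
w     = torch.randn(n, n, requires_grad=True)   # fixed weights (learned)
alpha = torch.randn(n, n, requires_grad=True)   # plasticity coeffs (learned)
eta   = 0.1                                     # Hebbian rate (also learnable in the paper)
hebb  = torch.zeros(n, n)                       # plastic trace, starts empty

x = torch.randn(n)
for _ in range(5):                              # unroll a few time steps
    y = torch.tanh((w + alpha * hebb) @ x)
    hebb = (1 - eta) * hebb + eta * torch.outer(y, x)   # post x pre update
    x = y
```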
GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint
Title | GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint |
Authors | Jianlin Su |
Abstract | The standard GAN (SGAN) is known to risk vanishing gradients. A significant improvement is WGAN, which imposes a 1-Lipschitz constraint on the discriminator to prevent gradients from vanishing. Is there a GAN with neither vanishing gradients nor a 1-Lipschitz constraint on the discriminator? We find one, called GAN-QP. Constructing a new Generative Adversarial Network (GAN) framework usually involves three steps: 1. choose a probability divergence; 2. convert it into a dual form; 3. play a min-max game. In this article, we demonstrate that the first step is not necessary: we can analyze the properties of a divergence, and even construct new divergences, directly in the dual space. As a reward, we obtain a simpler alternative to WGAN: GAN-QP. We demonstrate that GAN-QP performs better than WGAN in both theory and practice. |
Tasks | |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07296v4 |
http://arxiv.org/pdf/1811.07296v4.pdf | |
PWC | https://paperswithcode.com/paper/gan-qp-a-novel-gan-framework-without-gradient |
Repo | https://github.com/bojone/gan-qp |
Framework | pytorch |
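
A hedged sketch of the quadratic-potential objective the abstract alludes to: the discriminator maximizes the real/fake score gap minus a quadratic penalty scaled by the distance between the samples (removing the need for a Lipschitz constraint), while the generator minimizes the gap. The λ value and the L1 sample distance are assumptions reconstructed from memory; consult the linked repo for the reference formulation.

```python
# GAN-QP-style losses (sketch): the quadratic term penalizes large score
# gaps relative to the sample distance, bounding the divergence without a
# Lipschitz constraint on the critic. lam=10 and the mean-abs distance
# are assumptions, not confirmed constants from the paper.
import torch

def gan_qp_d_loss(d_real, d_fake, x_real, x_fake, lam=10.0):
    gap = d_real - d_fake                                   # T(xr, xf)
    dist = (x_real - x_fake).flatten(1).abs().mean(dim=1)   # sample distance
    return (-gap + gap ** 2 / (2 * lam * dist)).mean()      # D minimizes this

def gan_qp_g_loss(d_real, d_fake):
    return (d_real - d_fake).mean()                         # G minimizes the gap

x_real, x_fake = torch.rand(4, 3 * 32 * 32), torch.rand(4, 3 * 32 * 32)
d_real, d_fake = torch.randn(4), torch.randn(4)             # critic scores
print(gan_qp_d_loss(d_real, d_fake, x_real, x_fake), gan_qp_g_loss(d_real, d_fake))
```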
Fighting Fake News: Image Splice Detection via Learned Self-Consistency
Title | Fighting Fake News: Image Splice Detection via Learned Self-Consistency |
Authors | Minyoung Huh, Andrew Liu, Andrew Owens, Alexei A. Efros |
Abstract | Advances in photo editing and manipulation tools have made it significantly easier to create fake imagery. Learning to detect such manipulations, however, remains a challenging problem due to the lack of sufficient amounts of manipulated training data. In this paper, we propose a learning algorithm for detecting visual image manipulations that is trained only using a large dataset of real photographs. The algorithm uses the automatically recorded photo EXIF metadata as supervisory signal for training a model to determine whether an image is self-consistent – that is, whether its content could have been produced by a single imaging pipeline. We apply this self-consistency model to the task of detecting and localizing image splices. The proposed method obtains state-of-the-art performance on several image forensics benchmarks, despite never seeing any manipulated images at training. That said, it is merely a step in the long quest for a truly general purpose visual forensics tool. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.04096v3 |
http://arxiv.org/pdf/1805.04096v3.pdf | |
PWC | https://paperswithcode.com/paper/fighting-fake-news-image-splice-detection-via |
Repo | https://github.com/minyoungg/selfconsistency |
Framework | tf |
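
The self-consistency idea can be sketched as a siamese network that scores whether two patches could share an imaging pipeline, supervised only by whether their source photos agree on EXIF attributes. All architecture sizes below are illustrative assumptions, not the paper's.

```python
# Siamese consistency sketch: encode two patches, concatenate, and score
# "same imaging pipeline?". Training uses only real photos plus their EXIF
# metadata; at test time, inconsistent regions flag a likely splice.
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64),
        )
    def forward(self, patch):
        return self.net(patch)

encoder = PatchEncoder()
head = nn.Linear(128, 1)                     # same-pipeline logit

a, b = torch.rand(2, 8, 3, 64, 64)           # two batches of patches
logit = head(torch.cat([encoder(a), encoder(b)], dim=1))
# Train with BCE against "do the source photos share this EXIF attribute?"
```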
FARM: Functional Automatic Registration Method for 3D Human Bodies
Title | FARM: Functional Automatic Registration Method for 3D Human Bodies |
Authors | Riccardo Marin, Simone Melzi, Emanuele Rodolà, Umberto Castellani |
Abstract | We introduce a new method for non-rigid registration of 3D human shapes. Our proposed pipeline builds upon a given parametric model of the human, and makes use of the functional map representation for encoding and inferring shape maps throughout the registration process. This combination endows our method with robustness to a large variety of nuisances observed in practical settings, including non-isometric transformations, downsampling, topological noise, and occlusions; further, the pipeline can be applied invariably across different shape representations (e.g. meshes and point clouds), and in the presence of (even dramatic) missing parts such as those arising in real-world depth sensing applications. We showcase our method on a selection of challenging tasks, demonstrating results in line with, or even surpassing, state-of-the-art methods in the respective areas. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10517v1 |
http://arxiv.org/pdf/1807.10517v1.pdf | |
PWC | https://paperswithcode.com/paper/farm-functional-automatic-registration-method |
Repo | https://github.com/riccardomarin/FARM |
Framework | none |
Fast and Accurate Online Video Object Segmentation via Tracking Parts
Title | Fast and Accurate Online Video Object Segmentation via Tracking Parts |
Authors | Jingchun Cheng, Yi-Hsuan Tsai, Wei-Chih Hung, Shengjin Wang, Ming-Hsuan Yang |
Abstract | Research on video-based object detection algorithms. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02323v1 |
http://arxiv.org/pdf/1806.02323v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-online-video-object |
Repo | https://github.com/guanfuchen/video_obj |
Framework | pytorch |
Fine-grained Apparel Classification and Retrieval without rich annotations
Title | Fine-grained Apparel Classification and Retrieval without rich annotations |
Authors | Aniket Bhatnagar, Sanchit Aggarwal |
Abstract | The ability to correctly classify and retrieve apparel images has a variety of applications important to e-commerce, online advertising, and internet search. In this work, we propose a robust framework for fine-grained apparel classification and for in-shop and cross-domain retrieval which eliminates the requirement of rich annotations like bounding boxes, human joints, or clothing landmarks, and the training of a bounding-box/key-landmark detector for the same. Factors such as subtle appearance differences, variations in human poses, different shooting angles, apparel deformations, and self-occlusion add to the challenges in classification and retrieval of apparel items. Cross-domain retrieval is even harder due to the large variation between online shopping images, usually taken in ideal lighting, pose, positive angle, and clean background, and street photos captured by users in complicated conditions with poor lighting and cluttered scenes. Our framework uses a compact bilinear CNN with the tensor sketch algorithm to generate embeddings that capture local pairwise feature interactions in a translationally invariant manner. For apparel classification, we pass the feature embeddings through a softmax classifier, while the in-shop and cross-domain retrieval pipelines use a triplet-loss based optimization approach, such that the squared Euclidean distance between embeddings measures the dissimilarity between images. Unlike previous works that relied on bounding-box, key-clothing-landmark, or human-joint detectors to assist the final deep classifier, the proposed framework can be trained directly on the provided category labels or generated triplets for triplet loss optimization. Lastly, experimental results on the DeepFashion fine-grained categorization, in-shop, and consumer-to-shop retrieval datasets provide a comparative analysis with previous work performed in the domain. |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02385v1 |
http://arxiv.org/pdf/1811.02385v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-apparel-classification-and |
Repo | https://github.com/aniket03/keras_compact_bilnear_CNN |
Framework | tf |
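
The retrieval objective described in the abstract, a triplet loss in which squared Euclidean distance measures dissimilarity, can be sketched in a few lines; the margin value is an illustrative assumption.

```python
# Triplet loss sketch: push the anchor-positive squared distance below the
# anchor-negative squared distance by a margin.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = (anchor - positive).pow(2).sum(dim=1)   # squared Euclidean
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

emb = lambda n: F.normalize(torch.randn(n, 128), dim=1)  # stand-in embeddings
loss = triplet_loss(emb(8), emb(8), emb(8))
# In the paper, the embeddings come from a compact bilinear CNN (tensor
# sketch), so pairwise feature interactions are captured before the loss.
```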
Nonparametric inference of interaction laws in systems of agents from trajectory data
Title | Nonparametric inference of interaction laws in systems of agents from trajectory data |
Authors | Fei Lu, Mauro Maggioni, Sui Tang, Ming Zhong |
Abstract | Inferring the laws of interaction between particles and agents in complex dynamical systems from observational data is a fundamental challenge in a wide variety of disciplines. We propose a non-parametric statistical learning approach to estimate the governing laws of distance-based interactions, with no reference or assumption about their analytical form, from data consisting of trajectories of interacting agents. We demonstrate the effectiveness of our learning approach both by providing theoretical guarantees and by testing the approach on a variety of prototypical systems in various disciplines. These systems include homogeneous and heterogeneous agent systems, ranging from particle systems in fundamental physics to agent-based systems modeling opinion dynamics under social influence, prey-predator dynamics, flocking and swarming, and phototaxis in cell dynamics. |
Tasks | |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.06003v4 |
http://arxiv.org/pdf/1812.06003v4.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-inference-of-interaction-laws |
Repo | https://github.com/MingZhongCodes/LearningDynamics |
Framework | none |
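
The class of systems whose interaction kernel φ the paper estimates can be written as first-order distance-based dynamics, ẋᵢ = (1/N) Σⱼ φ(‖xⱼ − xᵢ‖)(xⱼ − xᵢ). Below is a small simulation sketch with an illustrative φ; the paper recovers φ from observed trajectories rather than assuming it.

```python
# Forward-Euler simulation of a distance-based interaction system:
# each agent moves toward (or away from) the others with a strength
# given by phi of the pairwise distance.
import numpy as np

def simulate(x, phi, steps=100, dt=0.01):
    traj = [x.copy()]
    for _ in range(steps):
        diff = x[None, :, :] - x[:, None, :]           # diff[i, j] = x_j - x_i
        dist = np.linalg.norm(diff, axis=2) + 1e-12    # avoid divide-by-zero
        x = x + dt * (phi(dist)[:, :, None] * diff).mean(axis=1)
        traj.append(x.copy())
    return np.stack(traj)

x0 = np.random.randn(20, 2)                            # 20 agents in 2-D
trajectories = simulate(x0, phi=lambda r: 1.0 / r)     # illustrative kernel
```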
Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning
Title | Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning |
Authors | Muhammad Shayan, Clement Fung, Chris J. M. Yoon, Ivan Beschastnikh |
Abstract | Federated Learning is the current state of the art in supporting secure multi-party machine learning (ML): data is maintained on the owner’s device and the updates to the model are aggregated through a secure protocol. However, this process assumes a trusted centralized infrastructure for coordination, and clients must trust that the central service does not use the byproducts of client data. In addition, a group of malicious clients could harm the performance of the model by carrying out a poisoning attack. As a response, we propose Biscotti: a fully decentralized peer-to-peer (P2P) approach to multi-party ML, which uses blockchain and cryptographic primitives to coordinate a privacy-preserving ML process between peering clients. Our evaluation demonstrates that Biscotti is scalable, fault tolerant, and defends against known attacks. For example, Biscotti is able to protect the privacy of an individual client’s update and the performance of the global model at scale when 30% of adversaries are trying to poison the model. The implementation can be found at: https://github.com/DistributedML/Biscotti |
Tasks | |
Published | 2018-11-24 |
URL | https://arxiv.org/abs/1811.09904v4 |
https://arxiv.org/pdf/1811.09904v4.pdf | |
PWC | https://paperswithcode.com/paper/biscotti-a-ledger-for-private-and-secure-peer |
Repo | https://github.com/DistributedML/Biscotti |
Framework | none |
Worst-case Optimal Submodular Extensions for Marginal Estimation
Title | Worst-case Optimal Submodular Extensions for Marginal Estimation |
Authors | Pankaj Pansari, Chris Russell, M. Pawan Kumar |
Abstract | Submodular extensions of an energy function can be used to efficiently compute approximate marginals via variational inference. The accuracy of the marginals depends crucially on the quality of the submodular extension. To identify the best possible extension, we show an equivalence between the submodular extensions of the energy and the objective functions of linear programming (LP) relaxations for the corresponding MAP estimation problem. This allows us to (i) establish the worst-case optimality of the submodular extension for Potts model used in the literature; (ii) identify the worst-case optimal submodular extension for the more general class of metric labeling; and (iii) efficiently compute the marginals for the widely used dense CRF model with the help of a recently proposed Gaussian filtering method. Using synthetic and real data, we show that our approach provides comparable upper bounds on the log-partition function to those obtained using tree-reweighted message passing (TRW) in cases where the latter is computationally feasible. Importantly, unlike TRW, our approach provides the first practical algorithm to compute an upper bound on the dense CRF model. |
Tasks | |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.06490v1 |
http://arxiv.org/pdf/1801.06490v1.pdf | |
PWC | https://paperswithcode.com/paper/worst-case-optimal-submodular-extensions-for |
Repo | https://github.com/pankajpansari/denseCRF |
Framework | none |