April 3, 2020

Paper Group AWR 17

Variational Wasserstein Barycenters for Geometric Clustering

Title Variational Wasserstein Barycenters for Geometric Clustering
Authors Liang Mi, Tianshu Yu, Jose Bento, Wen Zhang, Baoxin Li, Yalin Wang
Abstract We propose to compute Wasserstein barycenters (WBs) by solving for Monge maps with variational principle. We discuss the metric properties of WBs and explore their connections, especially the connections of Monge WBs, to K-means clustering and co-clustering. We also discuss the feasibility of Monge WBs on unbalanced measures and spherical domains. We propose two new problems – regularized K-means and Wasserstein barycenter compression. We demonstrate the use of VWBs in solving these clustering-related problems.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10543v1
PDF https://arxiv.org/pdf/2002.10543v1.pdf
PWC https://paperswithcode.com/paper/variational-wasserstein-barycenters-for
Repo https://github.com/icemiliang/pyvot
Framework pytorch

MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection

Title MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection
Authors Abdullah Rashwan, Rishav Agarwal, Agastya Kalra, Pascal Poupart
Abstract We present MatrixNets (xNets), a new deep architecture for object detection. xNets map objects with similar sizes and aspect ratios into many specialized layers, allowing xNets to provide a scale and aspect ratio aware architecture. We leverage xNets to enhance single-stage object detection frameworks. First, we apply xNets on anchor-based object detection, for which we predict object centers and regress the top-left and bottom-right corners. Second, we use MatrixNets for corner-based object detection by predicting top-left and bottom-right corners. Each corner predicts the center location of the object. We also enhance corner-based detection by replacing the embedding layer with center regression. Our final architecture achieves mAP of 47.8 on MS COCO, which is higher than its CornerNet counterpart by +5.6 mAP while also closing the gap between single-stage and two-stage detectors. The code is available at https://github.com/arashwan/matrixnet.
Tasks Object Detection
Published 2020-01-09
URL https://arxiv.org/abs/2001.03194v1
PDF https://arxiv.org/pdf/2001.03194v1.pdf
PWC https://paperswithcode.com/paper/matrixnets-a-new-scale-and-aspect-ratio-aware
Repo https://github.com/arashwan/matrixnet
Framework pytorch

PCSGAN: Perceptual Cyclic-Synthesized Generative Adversarial Networks for Thermal and NIR to Visible Image Transformation

Title PCSGAN: Perceptual Cyclic-Synthesized Generative Adversarial Networks for Thermal and NIR to Visible Image Transformation
Authors Kancharagunta Kishan Babu, Shiv Ram Dubey
Abstract In many real world scenarios, it is difficult to capture the images in the visible light spectrum (VIS) due to bad lighting conditions. However, the images can be captured in such scenarios using Near-Infrared (NIR) and Thermal (THM) cameras. The NIR and THM images contain limited details. Thus, there is a need to transform the images from THM/NIR to VIS for better understanding. However, it is a non-trivial task due to the large domain discrepancies and lack of abundant datasets. Nowadays, Generative Adversarial Network (GAN) is able to transform the images from one domain to another domain. Most of the available GAN based methods use the combination of the adversarial and the pixel-wise losses (like L1 or L2) as the objective function for training. The quality of transformed images in case of THM/NIR to VIS transformation is still not up to the mark using such objective function. Thus, better objective functions are needed to improve the quality, fine details and realism of the transformed images. A new model for THM/NIR to VIS image transformation called Perceptual Cyclic-Synthesized Generative Adversarial Network (PCSGAN) is introduced to address these issues. The PCSGAN uses the combination of the perceptual (i.e., feature based) losses along with the pixel-wise and the adversarial losses. Both the quantitative and qualitative measures are used to judge the performance of the PCSGAN model over the WHU-IIP face and the RGB-NIR scene datasets. The proposed PCSGAN outperforms the state-of-the-art image transformation models, including Pix2pix, DualGAN, CycleGAN, PS2GAN, and PAN in terms of the SSIM, MSE, PSNR and LPIPS evaluation measures. The code is available at: https://github.com/KishanKancharagunta/PCSGAN.
Published 2020-02-13
URL https://arxiv.org/abs/2002.07082v1
PDF https://arxiv.org/pdf/2002.07082v1.pdf
PWC https://paperswithcode.com/paper/pcsgan-perceptual-cyclic-synthesized
Repo https://github.com/KishanKancharagunta/PCSGAN
Framework pytorch

Building a COVID-19 Vulnerability Index

Title Building a COVID-19 Vulnerability Index
Authors Dave DeCaprio, Joseph Gartner, Thadeus Burgess, Sarthak Kothari, Shaayan Sayed, Carol J. McCall
Abstract COVID-19 is an acute respiratory disease that has been classified as a pandemic by the World Health Organization. Information regarding this particular disease is limited; however, it is known to have high mortality rates, particularly among individuals with preexisting medical conditions. Creating models to identify individuals who are at the greatest risk for severe complications due to COVID-19 will help outreach campaigns mitigate the disease's worst effects. While information specific to COVID-19 is limited, a model using complications due to other upper respiratory infections can be used as a proxy to help identify those individuals who are at the greatest risk. We present the results for three models predicting such complications, with each model having varying levels of predictive effectiveness at the expense of ease of implementation.
Published 2020-03-16
URL https://arxiv.org/abs/2003.07347v2
PDF https://arxiv.org/pdf/2003.07347v2.pdf
PWC https://paperswithcode.com/paper/building-a-covid-19-vulnerability-index
Repo https://github.com/closedloop-ai/cv19index
Framework none

End-to-End Fast Training of Communication Links Without a Channel Model via Online Meta-Learning

Title End-to-End Fast Training of Communication Links Without a Channel Model via Online Meta-Learning
Authors Sangwoo Park, Osvaldo Simeone, Joonhyuk Kang
Abstract When a channel model is not available, the end-to-end training of encoder and decoder on a fading noisy channel generally requires the repeated use of the channel and of a feedback link. An important limitation of the approach is that training should be generally carried out from scratch for each new channel. To cope with this problem, prior works considered joint training over multiple channels with the aim of finding a single pair of encoder and decoder that works well on a class of channels. In this paper, we propose to obviate the limitations of joint training via meta-learning. The proposed approach is based on a meta-training phase in which the online gradient-based meta-learning of the decoder is coupled with the joint training of the encoder via the transmission of pilots and the use of a feedback link. Accounting for channel variations during the meta-training phase, this work demonstrates the advantages of meta-learning in terms of number of pilots as compared to conventional methods when the feedback link is only available for meta-training and not at run time.
Tasks Meta-Learning
Published 2020-03-03
URL https://arxiv.org/abs/2003.01479v1
PDF https://arxiv.org/pdf/2003.01479v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-fast-training-of-communication
Repo https://github.com/kclip/meta-autoencoder-without-channel-model
Framework pytorch

Adversarial Monte Carlo Meta-Learning of Optimal Prediction Procedures

Title Adversarial Monte Carlo Meta-Learning of Optimal Prediction Procedures
Authors Alex Luedtke, Incheoul Chung, Oleg Sofrygin
Abstract We frame the meta-learning of prediction procedures as a search for an optimal strategy in a two-player game. In this game, Nature selects a prior over distributions that generate labeled data consisting of features and an associated outcome, and the Predictor observes data sampled from a distribution drawn from this prior. The Predictor’s objective is to learn a function that maps from a new feature to an estimate of the associated outcome. We establish that, under reasonable conditions, the Predictor has an optimal strategy that is equivariant to shifts and rescalings of the outcome and is invariant to permutations of the observations and to shifts, rescalings, and permutations of the features. We introduce a neural network architecture that satisfies these properties. The proposed strategy performs favorably compared to standard practice in both parametric and nonparametric experiments.
Tasks Meta-Learning
Published 2020-02-26
URL https://arxiv.org/abs/2002.11275v1
PDF https://arxiv.org/pdf/2002.11275v1.pdf
PWC https://paperswithcode.com/paper/adversarial-monte-carlo-meta-learning-of
Repo https://github.com/alexluedtke12/amc-meta-learning-of-optimal-prediction-procedures
Framework pytorch

Multi-Step Model-Agnostic Meta-Learning: Convergence and Improved Algorithms

Title Multi-Step Model-Agnostic Meta-Learning: Convergence and Improved Algorithms
Authors Kaiyi Ji, Junjie Yang, Yingbin Liang
Abstract As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness. However, the convergence of the general multi-step MAML still remains unexplored. In this paper, we develop a new theoretical framework, under which we characterize the convergence rate and the computational complexity of multi-step MAML. Our results indicate that $N$-step MAML attains the convergence with linearly increasing complexity with $N$ under a properly chosen inner stepsize. We then take a further step to develop a more efficient Hessian-free MAML. We first show that the existing zeroth-order Hessian estimator contains a constant-level estimation error so that the MAML algorithm can perform unstably. To address this issue, we propose a novel Hessian estimator via a gradient-based Gaussian smoothing method, and show that it achieves a much smaller estimation bias and variance, and the resulting algorithm achieves the same performance guarantee as the original MAML under mild conditions. Our experiments validate our theory and demonstrate the effectiveness of the proposed Hessian estimator.
Tasks Meta-Learning
Published 2020-02-18
URL https://arxiv.org/abs/2002.07836v2
PDF https://arxiv.org/pdf/2002.07836v2.pdf
PWC https://paperswithcode.com/paper/multi-step-model-agnostic-meta-learning
Repo https://github.com/JunjieYang97/GGS-MAML-RL
Framework pytorch
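
The multi-step inner loop described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes one-dimensional quadratic task losses L_c(w) = 0.5 * (w - c)^2, reuses the same loss for the inner and outer steps, and uses hypothetical names (`inner_adapt`, `meta_gradient`, `meta_train`). Because the Hessian of each task loss is the constant 1, the exact meta-gradient through N inner steps can be tracked by hand, and the cost of one meta-gradient grows linearly with N.

```python
def inner_adapt(w, c, alpha, n_steps):
    """Run n_steps of gradient descent on L_c(w) = 0.5 * (w - c)**2.

    Returns the adapted parameter and its derivative with respect to
    the initial parameter (needed for the exact meta-gradient).
    """
    dw = 1.0  # d(adapted w) / d(initial w)
    for _ in range(n_steps):
        grad = w - c                # gradient of the toy task loss
        w = w - alpha * grad        # one inner gradient step
        dw = dw * (1.0 - alpha)     # chain rule; the Hessian is 1 here
    return w, dw


def meta_gradient(w, tasks, alpha, n_steps):
    """Average exact meta-gradient over tasks (each task is a target c)."""
    total = 0.0
    for c in tasks:
        w_adapted, dw = inner_adapt(w, c, alpha, n_steps)
        total += (w_adapted - c) * dw   # dL/dw_adapted * dw_adapted/dw
    return total / len(tasks)


def meta_train(w, tasks, alpha, beta, n_steps, iters):
    """Plain gradient descent on the meta-objective with outer stepsize beta."""
    for _ in range(iters):
        w = w - beta * meta_gradient(w, tasks, alpha, n_steps)
    return w


tasks = [1.0, 3.0]   # hypothetical task targets
w_meta = meta_train(0.0, tasks, alpha=0.1, beta=1.0, n_steps=2, iters=200)
```

In this toy setting the meta-initialization settles at the mean of the task targets. With neural networks the Hessian is not constant, which is why Hessian-free estimators such as the one the paper proposes become attractive.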

Meta-Transfer Learning for Zero-Shot Super-Resolution

Title Meta-Transfer Learning for Zero-Shot Super-Resolution
Authors Jae Woong Soh, Sunwoo Cho, Nam Ik Cho
Abstract Convolutional neural networks (CNNs) have shown dramatic improvements in single image super-resolution (SISR) by using large-scale external samples. Despite their remarkable performance based on the external dataset, they cannot exploit internal information within a specific image. Another problem is that they are applicable only to the specific condition of data that they are supervised. For instance, the low-resolution (LR) image should be a “bicubic” downsampled noise-free image from a high-resolution (HR) one. To address both issues, zero-shot super-resolution (ZSSR) has been proposed for flexible internal learning. However, they require thousands of gradient updates, i.e., long inference time. In this paper, we present Meta-Transfer Learning for Zero-Shot Super-Resolution (MZSR), which leverages ZSSR. Precisely, it is based on finding a generic initial parameter that is suitable for internal learning. Thus, we can exploit both external and internal information, where one single gradient update can yield quite considerable results. (See Figure 1). With our method, the network can quickly adapt to a given image condition. In this respect, our method can be applied to a large spectrum of image conditions within a fast adaptation process.
Tasks Image Super-Resolution, Meta-Learning, Super-Resolution, Transfer Learning
Published 2020-02-27
URL https://arxiv.org/abs/2002.12213v1
PDF https://arxiv.org/pdf/2002.12213v1.pdf
PWC https://paperswithcode.com/paper/meta-transfer-learning-for-zero-shot-super
Repo https://github.com/JWSoh/MZSR
Framework tf

SynFi: Automatic Synthetic Fingerprint Generation

Title SynFi: Automatic Synthetic Fingerprint Generation
Authors M. Sadegh Riazi, Seyed M. Chavoshian, Farinaz Koushanfar
Abstract Authentication and identification methods based on human fingerprints are ubiquitous in several systems ranging from government organizations to consumer products. The performance and reliability of such systems directly rely on the volume of data on which they have been verified. Unfortunately, a large volume of fingerprint databases is not publicly available due to many privacy and security concerns. In this paper, we introduce a new approach to automatically generate high-fidelity synthetic fingerprints at scale. Our approach relies on (i) Generative Adversarial Networks to estimate the probability distribution of human fingerprints and (ii) Super-Resolution methods to synthesize fine-grained textures. We rigorously test our system and show that our methodology is the first to generate fingerprints that are computationally indistinguishable from real ones, a task that prior art could not accomplish.
Tasks Super-Resolution
Published 2020-02-16
URL https://arxiv.org/abs/2002.08900v1
PDF https://arxiv.org/pdf/2002.08900v1.pdf
PWC https://paperswithcode.com/paper/synfi-automatic-synthetic-fingerprint
Repo https://github.com/MohammadChavosh/synthetic-fingerprint-generation
Framework pytorch

EndoL2H: Deep Super-Resolution for Capsule Endoscopy

Title EndoL2H: Deep Super-Resolution for Capsule Endoscopy
Authors Yasin Almalioglu, Abdulkadir Gokce, Kagan Incetan, Muhammed Ali Simsek, Kivanc Ararat, Richard J. Chen, Nichalos J. Durr, Faisal Mahmood, Mehmet Turan
Abstract Wireless capsule endoscopy is the preferred modality for diagnosis and assessment of small bowel disease. However, the poor resolution is a limitation for both subjective and automated diagnostics. Enhanced-resolution endoscopy has shown to improve adenoma detection rate for conventional endoscopy and is likely to do the same for capsule endoscopy. In this work, we propose and quantitatively validate a novel framework to learn a mapping from low-to-high resolution endoscopic images. We use conditional adversarial networks and spatial attention to improve the resolution by up to a factor of 8x. Our quantitative study demonstrates the superiority of our proposed approach over Super-Resolution Generative Adversarial Network (SRGAN) and bicubic interpolation. For qualitative analysis, visual Turing tests were performed by 16 gastroenterologists to confirm the clinical utility of the proposed approach. Our approach is generally applicable to any endoscopic capsule system and has the potential to improve diagnosis and better harness computational approaches for polyp detection and characterization. Our code and trained models are available at https://github.com/akgokce/EndoL2H.
Tasks Super-Resolution
Published 2020-02-13
URL https://arxiv.org/abs/2002.05459v1
PDF https://arxiv.org/pdf/2002.05459v1.pdf
PWC https://paperswithcode.com/paper/endol2h-deep-super-resolution-for-capsule
Repo https://github.com/akgokce/EndoL2H
Framework pytorch

Cross-domain Detection via Graph-induced Prototype Alignment

Title Cross-domain Detection via Graph-induced Prototype Alignment
Authors Minghao Xu, Hang Wang, Bingbing Ni, Qi Tian, Wenjun Zhang
Abstract Applying the knowledge of an object detector trained on a specific domain directly onto a new domain is risky, as the gap between two domains can severely degrade the model’s performance. Furthermore, since different instances commonly embody distinct modal information in object detection scenario, the feature alignment of source and target domain is hard to realize. To mitigate these problems, we propose a Graph-induced Prototype Alignment (GPA) framework to seek category-level domain alignment via elaborate prototype representations. In a nutshell, more precise instance-level features are obtained through graph-based information propagation among region proposals, and, on such basis, the prototype representation of each class is derived for category-level domain alignment. In addition, in order to alleviate the negative effect of class-imbalance on domain adaptation, we design a Class-reweighted Contrastive Loss to harmonize the adaptation training process. Combining with Faster R-CNN, the proposed framework conducts feature alignment in a two-stage manner. Comprehensive results on various cross-domain detection tasks demonstrate that our approach outperforms existing methods by a remarkable margin. Our code is available at https://github.com/ChrisAllenMing/GPA-detection.
Tasks Domain Adaptation, Object Detection
Published 2020-03-28
URL https://arxiv.org/abs/2003.12849v1
PDF https://arxiv.org/pdf/2003.12849v1.pdf
PWC https://paperswithcode.com/paper/cross-domain-detection-via-graph-induced
Repo https://github.com/ChrisAllenMing/GPA-detection
Framework pytorch

Training-Set Distillation for Real-Time UAV Object Tracking

Title Training-Set Distillation for Real-Time UAV Object Tracking
Authors Fan Li, Changhong Fu, Fuling Lin, Yiming Li, Peng Lu
Abstract Correlation filter (CF) has recently exhibited promising performance in visual object tracking for unmanned aerial vehicle (UAV). Such an online learning method depends heavily on the quality of the training-set, yet complicated aerial scenarios like occlusion or out-of-view can reduce its reliability. In this work, a novel time slot-based distillation approach is proposed to efficiently and effectively optimize the training-set’s quality on the fly. A cooperative energy minimization function is established to score the historical samples adaptively. To accelerate the scoring process, frames with highly confident tracking results are employed as the keyframes to divide the tracking process into multiple time slots. After the establishment of a new slot, the weighted fusion of the previous samples generates one key-sample, in order to reduce the number of samples to be scored. Besides, when the current time slot exceeds the maximum number of frames that can be scored, the sample with the lowest score will be discarded. Consequently, the training-set can be efficiently and reliably distilled. Comprehensive tests on two well-known UAV benchmarks prove the effectiveness of our method with real-time speed on a single CPU.
Tasks Object Tracking, Visual Object Tracking
Published 2020-03-11
URL https://arxiv.org/abs/2003.05326v1
PDF https://arxiv.org/pdf/2003.05326v1.pdf
PWC https://paperswithcode.com/paper/training-set-distillation-for-real-time-uav
Repo https://github.com/vision4robotics/TSD-Tracker
Framework none

GEDDnet: A Network for Gaze Estimation with Dilation and Decomposition

Title GEDDnet: A Network for Gaze Estimation with Dilation and Decomposition
Authors Zhaokang Chen, Bertram E. Shi
Abstract Appearance-based gaze estimation from RGB images provides relatively unconstrained gaze tracking from commonly available hardware. The accuracy of subject-independent models is limited partly by small intra-subject and large inter-subject variations in appearance, and partly by a latent subject-dependent bias. To improve estimation accuracy, we propose to use dilated-convolutions in a deep convolutional neural network to capture subtle changes in the eye images, and a novel gaze decomposition method that decomposes the gaze angle into the sum of a subject-independent gaze estimate from the image and a subject-dependent bias. To further reduce estimation error, we propose a calibration method that estimates the bias from a few images taken as the subject gazes at only a few or even just a single gaze target. This significantly reduces calibration time and complexity. Experiments on four datasets, including a new dataset we collected containing large variations in head pose and face location, indicate that even without calibration the estimator already outperforms state-of-the-art methods by more than 6.3%. The proposed calibration method is robust to the location of calibration target and reduces estimation error significantly (up to 35.6%), achieving state-of-the-art performance with much less calibration data than required by previously proposed methods.
Tasks Calibration, Gaze Estimation
Published 2020-01-25
URL https://arxiv.org/abs/2001.09284v1
PDF https://arxiv.org/pdf/2001.09284v1.pdf
PWC https://paperswithcode.com/paper/geddnet-a-network-for-gaze-estimation-with
Repo https://github.com/czk32611/GEDDnet
Framework tf
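
The gaze decomposition in the abstract (gaze angle = subject-independent estimate + subject-dependent bias) can be sketched in a few lines. This is an illustrative sketch only: it assumes the bias is estimated as the mean residual over the calibration images, which the abstract does not specify, and the function names `estimate_bias` and `calibrated_gaze` are hypothetical. Gaze angles are (yaw, pitch) pairs in degrees.

```python
def estimate_bias(predicted, targets):
    """Estimate a subject-dependent bias from a few calibration samples.

    predicted: subject-independent (yaw, pitch) estimates from the network
    targets:   known (yaw, pitch) gaze angles of the calibration targets
    Assumption: the bias is the mean residual over the calibration samples.
    """
    n = len(predicted)
    yaw_bias = sum(t[0] - p[0] for p, t in zip(predicted, targets)) / n
    pitch_bias = sum(t[1] - p[1] for p, t in zip(predicted, targets)) / n
    return (yaw_bias, pitch_bias)


def calibrated_gaze(prediction, bias):
    """Final gaze = subject-independent estimate + subject-dependent bias."""
    return (prediction[0] + bias[0], prediction[1] + bias[1])


# Two calibration images; the values are made up for illustration.
bias = estimate_bias(
    predicted=[(1.0, 2.0), (3.0, 4.0)],
    targets=[(2.0, 1.0), (4.0, 3.0)],
)
gaze = calibrated_gaze((5.0, 5.0), bias)
```

Under this assumption a single calibration image already gives a usable bias estimate, and additional images simply average out per-image noise.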

Universal Differential Equations for Scientific Machine Learning

Title Universal Differential Equations for Scientific Machine Learning
Authors Christopher Rackauckas, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan
Abstract In the context of science, the well-known adage “a picture is worth a thousand words” might well be “a model is worth a thousand datasets.” Scientific models, such as Newtonian physics or biological gene regulatory networks, are human-driven simplifications of complex phenomena that serve as surrogates for the countless experiments that validated the models. Recently, machine learning has been able to overcome the inaccuracies of approximate modeling by directly learning the entire set of nonlinear interactions from data. However, without any predetermined structure from the scientific basis behind the problem, machine learning approaches are flexible but data-expensive, requiring large databases of homogeneous labeled training data. A central challenge is reconciling data that is at odds with simplified models without requiring “big data”. In this work we develop a new methodology, universal differential equations (UDEs), which augments scientific models with machine-learnable structures for scientifically-based learning. We show how UDEs can be utilized to discover previously unknown governing equations, accurately extrapolate beyond the original data, and accelerate model simulation, all in a time and data-efficient manner. This advance is coupled with open-source software that allows for training UDEs which incorporate physical constraints, delayed interactions, implicitly-defined events, and intrinsic stochasticity in the model. Our examples show how a diverse set of computationally-difficult modeling issues across scientific disciplines, from automatically discovering biological mechanisms to accelerating climate simulations by 15,000x, can be handled by training UDEs.
Published 2020-01-13
URL https://arxiv.org/abs/2001.04385v1
PDF https://arxiv.org/pdf/2001.04385v1.pdf
PWC https://paperswithcode.com/paper/universal-differential-equations-for
Repo https://github.com/ChrisRackauckas/universal_differential_equations
Framework none
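
The core idea of a universal differential equation, a known model term augmented with a learnable term, can be shown on a toy problem. The sketch below is a heavily simplified assumption-laden illustration rather than the paper's software: the "learnable structure" is a single scalar `theta` instead of a neural network, integration uses explicit Euler, and the gradient is taken by finite differences instead of adjoint sensitivity methods. All names and constants are hypothetical.

```python
def simulate(theta, known_rate=1.0, u0=1.0, dt=0.01, n_steps=100):
    """Euler-integrate du/dt = -known_rate * u + theta * u.

    The first term plays the role of the known scientific model;
    theta * u stands in for the learnable correction.
    """
    u = u0
    trajectory = []
    for _ in range(n_steps):
        du = -known_rate * u + theta * u
        u = u + dt * du
        trajectory.append(u)
    return trajectory


def loss(theta, data):
    """Mean squared error between the simulated and observed trajectories."""
    pred = simulate(theta)
    return sum((p - d) ** 2 for p, d in zip(pred, data)) / len(data)


def fit_theta(data, theta=0.0, lr=5.0, iters=200, eps=1e-5):
    """Gradient descent on theta using a central finite-difference gradient."""
    for _ in range(iters):
        grad = (loss(theta + eps, data) - loss(theta - eps, data)) / (2 * eps)
        theta = theta - lr * grad
    return theta


# Synthetic observations: the true decay rate is 1.5, so the term missing
# from the known model corresponds to theta = -0.5.
data = simulate(theta=-0.5)
theta_hat = fit_theta(data)
```

Here the fit recovers the missing term because the model family contains the true dynamics. With a neural network in place of `theta * u`, the same training loop structure applies, but an automatic differentiation or adjoint method would replace the finite-difference gradient.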

Unsupervised Sentiment Analysis for Code-mixed Data

Title Unsupervised Sentiment Analysis for Code-mixed Data
Authors Siddharth Yadav, Tanmoy Chakraborty
Abstract Code-mixing is the practice of alternating between two or more languages. Mostly observed in multilingual societies, its occurrence is increasing, and so is its importance. A major part of sentiment analysis research has been monolingual, and most monolingual methods perform poorly on code-mixed text. In this work, we introduce methods that use different kinds of multilingual and cross-lingual embeddings to efficiently transfer knowledge from monolingual text to code-mixed text for sentiment analysis of code-mixed text. Our methods can handle code-mixed text through zero-shot learning. Our methods beat the state of the art on English-Spanish code-mixed sentiment analysis by an absolute 3% F1-score. We are able to achieve 0.58 F1-score (without parallel corpus) and 0.62 F1-score (with parallel corpus) on the same benchmark in a zero-shot way as compared to 0.68 F1-score in supervised settings. Our code is publicly available.
Tasks Sentiment Analysis, Zero-Shot Learning
Published 2020-01-20
URL https://arxiv.org/abs/2001.11384v1
PDF https://arxiv.org/pdf/2001.11384v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-sentiment-analysis-for-code
Repo https://github.com/sedflix/unsacmt
Framework none