April 3, 2020

Paper Group ANR 52

Learning to Inpaint by Progressively Growing the Mask Regions. Deep Slow Motion Video Reconstruction with Hybrid Imaging System. Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble. DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition. A Connection between Feedback Capacity …

Learning to Inpaint by Progressively Growing the Mask Regions

Title Learning to Inpaint by Progressively Growing the Mask Regions
Authors Mohamed Abbas Hedjazi, Yakup Genc
Abstract Image inpainting is one of the most challenging tasks in computer vision. Recently, generative-based image inpainting methods have been shown to produce visually plausible images. However, they still have difficulty generating the correct structures and colors as the masked region grows large. This drawback is due to the training stability issue of the generative models. This work introduces a new curriculum-style training approach in the context of image inpainting. The proposed method increases the masked region size progressively during training; at test time, the user can give multiple holes of variable size at arbitrary locations. Incorporating such an approach in GANs may stabilize the training, provide better color consistency, and capture object continuities. We validate our approach on the MSCOCO and CelebA datasets. We report qualitative and quantitative comparisons of our training approach in different models.
Tasks Image Inpainting
Published 2020-02-21
URL https://arxiv.org/abs/2002.09280v1
PDF https://arxiv.org/pdf/2002.09280v1.pdf
PWC https://paperswithcode.com/paper/learning-to-inpaint-by-progressively-growing
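
The curriculum idea in the abstract above (small holes first, larger holes later) can be sketched as a mask-size schedule. This is a minimal illustration only: the linear schedule, the square mask shape, and the names `mask_side_for_step` and `random_square_mask` are assumptions, not details taken from the paper.

```python
import random

def mask_side_for_step(step, total_steps, min_side=8, max_side=64):
    """Linearly grow the square mask side from min_side to max_side (assumed schedule)."""
    if total_steps <= 1:
        return max_side
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return int(round(min_side + frac * (max_side - min_side)))

def random_square_mask(height, width, side, rng):
    """Return a height x width grid of 0/1 with one side x side hole marked by 1s."""
    side = min(side, height, width)
    top = rng.randint(0, height - side)    # inclusive bounds keep the hole inside the image
    left = rng.randint(0, width - side)
    mask = [[0] * width for _ in range(height)]
    for r in range(top, top + side):
        for c in range(left, left + side):
            mask[r][c] = 1
    return mask

rng = random.Random(0)
schedule = [mask_side_for_step(s, 5) for s in range(5)]   # one mask size per training stage
mask = random_square_mask(128, 128, schedule[-1], rng)
```

In a real training loop the mask for each batch would be drawn with the side length of the current stage, so the generator sees easy (small) holes before hard (large) ones.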

Deep Slow Motion Video Reconstruction with Hybrid Imaging System

Title Deep Slow Motion Video Reconstruction with Hybrid Imaging System
Authors Avinash Paliwal, Nima Khademi Kalantari
Abstract Slow motion videos are becoming increasingly popular, but capturing high-resolution videos at extremely high frame rates requires professional high-speed cameras. To mitigate this problem, current techniques increase the frame rate of standard videos through frame interpolation by assuming linear motion between the existing frames. While this assumption holds true for simple cases with small motion, in challenging cases the motion is usually complex and this assumption is no longer valid. Therefore, they typically produce results with unnatural motion in these challenging cases. In this paper, we address this problem using two video streams as the input; an auxiliary video with high frame rate and low spatial resolution, providing temporal information, in addition to the standard main video with low frame rate and high spatial resolution. We propose a two-stage deep learning system consisting of alignment and appearance estimation that reconstructs high resolution slow motion video from the hybrid video input. For alignment, we propose to use a set of pre-trained and trainable convolutional neural networks (CNNs) to compute the flows between the missing frame and the two existing frames of the main video by utilizing the content of the auxiliary video frames. We then warp the existing frames using the flows to produce a set of aligned frames. For appearance estimation, we propose to combine the aligned and auxiliary frames using a context and occlusion aware CNN. We train our model on a set of synthetically generated hybrid videos and show high-quality results on a wide range of test scenes. We further demonstrate the practicality of our approach by showing the performance of our system on two real dual camera setups with small baseline.
Tasks Video Reconstruction
Published 2020-02-27
URL https://arxiv.org/abs/2002.12106v1
PDF https://arxiv.org/pdf/2002.12106v1.pdf
PWC https://paperswithcode.com/paper/deep-slow-motion-video-reconstruction-with

Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble

Title Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble
Authors Yongming Li, Yan Lei, Pin Wang, Yuchuan Liu
Abstract Deep learning is a kind of feature learning method with strong nonlinear feature transformation, and it is becoming more and more important in many fields of artificial intelligence. The deep autoencoder is one representative deep learning method and can effectively extract the abstract information of datasets. However, it does not consider the complementarity between the deep features and original features during deep feature transformation. Besides, it suffers from the small sample problem. In order to solve these problems, a novel deep autoencoder, the hybrid feature embedded stacked sparse autoencoder (HFESAE), is proposed in this paper. HFESAE is capable of learning discriminant deep features by embedding original features to filter weak hidden-layer outputs during training. Because the class representation ability of abstract information is limited by the small sample problem, a feature fusion strategy is designed that combines the abstract information learned by HFESAE with the original features to obtain hybrid features for feature reduction. The strategy is a hybrid feature selection strategy based on L1 regularization, followed by a support vector machine (SVM) ensemble model in which a weighted local discriminant preservation projection (w_LPPD) is designed and employed on each base classifier. At the end of this paper, several representative public datasets are used to verify the effectiveness of the proposed algorithm. The experimental results demonstrate that the proposed feature learning method yields superior performance compared to other existing and state-of-the-art feature learning algorithms, including some representative deep autoencoder methods.
Tasks Feature Selection
Published 2020-02-17
URL https://arxiv.org/abs/2002.06761v1
PDF https://arxiv.org/pdf/2002.06761v1.pdf
PWC https://paperswithcode.com/paper/hybrid-embedded-deep-stacked-sparse

DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition

Title DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition
Authors Hao-Chiang Shao, Kang-Yu Liu, Chia-Wen Lin, Jiwen Lu
Abstract The performance of a convolutional neural network (CNN) based face recognition model largely relies on the richness of labelled training data. Collecting a training set with large variations of a face identity under different poses and illumination changes, however, is very expensive, making the diversity of within-class face images a critical issue in practice. In this paper, we propose a 3D model-assisted domain-transferred face augmentation network (DotFAN) that can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains. DotFAN is structurally a conditional CycleGAN but has two additional subnetworks, namely face expert network (FEM) and face shape regressor (FSR), for latent code control. While FSR aims to extract face attributes, FEM is designed to capture a face identity. With their aid, DotFAN can learn a disentangled face representation and effectively generate face images of various facial attributes while preserving the identity of augmented faces. Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity so that a better face recognition model can be learned from the augmented dataset.
Tasks Face Recognition
Published 2020-02-23
URL https://arxiv.org/abs/2002.09859v1
PDF https://arxiv.org/pdf/2002.09859v1.pdf
PWC https://paperswithcode.com/paper/dotfan-a-domain-transferred-face-augmentation

A Connection between Feedback Capacity and Kalman Filter for Colored Gaussian Noises

Title A Connection between Feedback Capacity and Kalman Filter for Colored Gaussian Noises
Authors Song Fang, Quanyan Zhu
Abstract In this paper, we establish a connection between the feedback capacity of additive colored Gaussian noise channels and the Kalman filters with additive colored Gaussian noises. In light of this, we are able to provide lower bounds on feedback capacity of such channels with finite-order auto-regressive moving average colored noises, and the bounds are seen to be consistent with various existing results in the literature; particularly, the bound is tight in the case of first-order auto-regressive moving average colored noises. On the other hand, the Kalman filtering systems, after certain equivalence transformations, can be employed as recursive coding schemes/algorithms to achieve the lower bounds. In general, our results provide an alternative perspective while pointing to potentially tighter bounds for the feedback capacity problem.
Published 2020-01-09
URL https://arxiv.org/abs/2001.03108v2
PDF https://arxiv.org/pdf/2001.03108v2.pdf
PWC https://paperswithcode.com/paper/a-connection-between-feedback-capacity-and
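
For readers unfamiliar with the Kalman recursion the abstract above refers to, here is a minimal scalar sketch. It assumes a first-order state model x[k+1] = a*x[k] + w[k] observed as y[k] = x[k] + v[k] with white noises; the paper treats colored noise and feedback capacity bounds, which this toy example does not reproduce. The function name and parameters are illustrative.

```python
def kalman_filter(measurements, a, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k+1] = a*x[k] + w[k], y[k] = x[k] + v[k].

    q is Var(w), r is Var(v). Returns the filtered estimates and the
    final one-step-ahead (prior) error variance.
    """
    x_pred, p_pred = x0, p0
    estimates = []
    for y in measurements:
        gain = p_pred / (p_pred + r)             # Kalman gain
        x_post = x_pred + gain * (y - x_pred)    # measurement update
        p_post = (1.0 - gain) * p_pred
        estimates.append(x_post)
        x_pred = a * x_post                      # time update
        p_pred = a * a * p_post + q
    return estimates, p_pred

estimates, p_steady = kalman_filter([1.0] * 200, a=0.5, q=1.0, r=1.0)
```

After enough steps the prior variance settles at the fixed point of the Riccati recursion p = a^2 * p * r / (p + r) + q, independent of the measurement values.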

Class Conditional Alignment for Partial Domain Adaptation

Title Class Conditional Alignment for Partial Domain Adaptation
Authors Mohsen Kheirandishfard, Fariba Zohrizadeh, Farhad Kamangar
Abstract Adversarial adaptation models have demonstrated significant progress towards transferring knowledge from a labeled source dataset to an unlabeled target dataset. Partial domain adaptation (PDA) investigates the scenarios in which the source domain is large and diverse, and the target label space is a subset of the source label space. The main purpose of PDA is to identify the shared classes between the domains and promote learning transferable knowledge from these classes. In this paper, we propose a multi-class adversarial architecture for PDA. The proposed approach jointly aligns the marginal and class-conditional distributions in the shared label space by minimaxing a novel multi-class adversarial loss function. Furthermore, we incorporate effective regularization terms to encourage selecting the most relevant subset of source domain classes. In the absence of target labels, the proposed approach is able to effectively learn domain-invariant feature representations, which in turn can enhance the classification performance in the target domain. Comprehensive experiments on three benchmark datasets Office-31, Office-Home, and Caltech-Office corroborate the effectiveness of the proposed approach in addressing different partial transfer learning tasks.
Tasks Domain Adaptation, Partial Domain Adaptation, Transfer Learning
Published 2020-03-14
URL https://arxiv.org/abs/2003.06722v1
PDF https://arxiv.org/pdf/2003.06722v1.pdf
PWC https://paperswithcode.com/paper/class-conditional-alignment-for-partial

Observational nonidentifiability, generalized likelihood and free energy

Title Observational nonidentifiability, generalized likelihood and free energy
Authors A. E. Allahverdyan
Abstract We study the parameter estimation problem in mixture models with observational nonidentifiability: the full model (also containing hidden variables) is identifiable, but the marginal (observed) model is not. Hence global maxima of the marginal likelihood are (infinitely) degenerate and predictions of the marginal likelihood are not unique. We show how to generalize the marginal likelihood by introducing an effective temperature, and making it similar to the free energy. This generalization resolves the observational nonidentifiability, since its maximization leads to unique results that are better than a random selection of one degenerate maximum of the marginal likelihood or the averaging over many such maxima. The generalized likelihood inherits many features from the usual likelihood, e.g. it satisfies the conditionality principle, and its local maximum can be searched for via a suitably modified expectation-maximization method. The maximization of the generalized likelihood relates to entropy optimization.
Published 2020-02-18
URL https://arxiv.org/abs/2002.07884v1
PDF https://arxiv.org/pdf/2002.07884v1.pdf
PWC https://paperswithcode.com/paper/observational-nonidentifiability-generalized

DDKSP: A Data-Driven Stochastic Programming Framework for Car-Sharing Relocation Problem

Title DDKSP: A Data-Driven Stochastic Programming Framework for Car-Sharing Relocation Problem
Authors Xiaoming Li, Chun Wang, Xiao Huang
Abstract Car-sharing is a popular research field in the sharing economy. In this paper, we investigate the car-sharing relocation problem (CSRP) under uncertain demands. Normally, the real customer demands follow complicated probability distributions which cannot be described by parametric approaches. In order to overcome the problem, an innovative framework called Data-Driven Kernel Stochastic Programming (DDKSP) is proposed that integrates a non-parametric approach, kernel density estimation (KDE), with a two-stage stochastic programming (SP) model. Specifically, the probability distributions are derived from historical data by KDE and are used as the input uncertain parameters for SP. Additionally, the CSRP is formulated as a two-stage SP model. Meanwhile, a Monte Carlo method called sample average approximation (SAA) and the Benders decomposition algorithm are introduced to solve the large-scale optimization model. Finally, numerical experiments based on New York taxi trip data sets show that the proposed framework outperforms the pure parametric approaches using Gaussian, Laplace and Poisson distributions by 3.72%, 4.58% and 11%, respectively, in terms of overall profits.
Tasks Density Estimation
Published 2020-01-20
URL https://arxiv.org/abs/2001.08109v1
PDF https://arxiv.org/pdf/2001.08109v1.pdf
PWC https://paperswithcode.com/paper/ddksp-a-data-driven-stochastic-programming
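
The two ingredients named in the abstract above, KDE-based demand scenarios and sample average approximation, can be illustrated on a toy single-station problem. Everything specific here is an assumption made for illustration: the profit function, the demand history, the bandwidth, and the function names. The paper's actual two-stage model and its Benders decomposition are not reproduced.

```python
import random

def sample_from_gaussian_kde(history, bandwidth, n, rng):
    """Draw n demand scenarios from a Gaussian KDE fitted to `history`.

    Sampling a Gaussian KDE amounts to picking a data point uniformly and
    adding Gaussian noise. Negative draws are clipped to 0 (a simplification).
    """
    return [max(0.0, rng.choice(history) + rng.gauss(0.0, bandwidth))
            for _ in range(n)]

def saa_best_allocation(scenarios, candidates, revenue, cost):
    """Pick the number of cars that maximizes average profit over the scenarios."""
    def avg_profit(cars):
        return sum(revenue * min(cars, d) - cost * cars for d in scenarios) / len(scenarios)
    return max(candidates, key=avg_profit)

rng = random.Random(42)
history = [12, 15, 9, 20, 14, 11, 18, 16]      # hypothetical daily demands
scenarios = sample_from_gaussian_kde(history, bandwidth=2.0, n=500, rng=rng)
best = saa_best_allocation(scenarios, range(0, 31), revenue=10.0, cost=4.0)
```

The design point is that no parametric family is imposed on demand: the scenarios inherit whatever shape the historical data has, and the optimization only ever sees the sampled scenarios.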

Subset Sampling For Progressive Neural Network Learning

Title Subset Sampling For Progressive Neural Network Learning
Authors Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis
Abstract Progressive Neural Network Learning is a class of algorithms that incrementally construct the network’s topology and optimize its parameters based on the training data. While this approach exempts the users from the manual task of designing and validating multiple network topologies, it often requires an enormous number of computations. In this paper, we propose to speed up this process by exploiting subsets of training data at each incremental training step. Three different sampling strategies for selecting the training samples according to different criteria are proposed and evaluated. We also propose to perform online hyperparameter selection during the network progression, which further reduces the overall training time. Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably while operating on par with the baseline approach exploiting the entire training set throughout the training process.
Tasks Face Recognition
Published 2020-02-17
URL https://arxiv.org/abs/2002.07141v1
PDF https://arxiv.org/pdf/2002.07141v1.pdf
PWC https://paperswithcode.com/paper/subset-sampling-for-progressive-neural
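
The abstract above does not name its three sampling strategies, so the sketch below shows two generic, hypothetical ones (uniform random and highest-loss selection) purely to illustrate what choosing a training subset at each incremental step might look like.

```python
import random

def random_subset(indices, fraction, rng):
    """Uniformly sample a fraction of the training indices (at least one)."""
    k = max(1, int(len(indices) * fraction))
    return rng.sample(indices, k)

def highest_loss_subset(losses, fraction):
    """Return the indices of the samples with the largest current loss."""
    k = max(1, int(len(losses) * fraction))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]

rng = random.Random(0)
subset = random_subset(list(range(100)), 0.25, rng)
hard = highest_loss_subset([0.1, 0.9, 0.5, 0.7], 0.5)
```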

Deep HyperNetwork-Based MIMO Detection

Title Deep HyperNetwork-Based MIMO Detection
Authors Mathieu Goutay, Fayçal Ait Aoudia, Jakob Hoydis
Abstract Optimal symbol detection for multiple-input multiple-output (MIMO) systems is known to be an NP-hard problem. Conventional heuristic algorithms are either too complex to be practical or suffer from poor performance. Recently, several approaches tried to address those challenges by implementing the detector as a deep neural network. However, they either still achieve unsatisfying performance on practical spatially correlated channels, or are computationally demanding since they require retraining for each channel realization. In this work, we address both issues by training an additional neural network (NN), referred to as the hypernetwork, which takes as input the channel matrix and generates the weights of the NN-based detector. Results show that the proposed approach achieves near state-of-the-art performance without the need for re-training.
Published 2020-02-07
URL https://arxiv.org/abs/2002.02750v2
PDF https://arxiv.org/pdf/2002.02750v2.pdf
PWC https://paperswithcode.com/paper/deep-hypernetwork-based-mimo-detection
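
The hypernetwork idea in the abstract above (one network maps the channel matrix to the weights of a second network, the detector) can be sketched with a single untrained linear layer. The class and function names, the real-valued channel, and the linear detector are assumptions made to keep the example short; practical MIMO channels are complex-valued and the paper's architecture is deeper.

```python
import random

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

class LinearHypernetwork:
    """Maps a flattened n_rx x n_tx channel matrix to an n_tx x n_rx detector matrix."""

    def __init__(self, n_rx, n_tx, rng):
        self.n_rx, self.n_tx = n_rx, n_tx
        n_in = n_rx * n_tx       # size of the flattened channel matrix
        n_out = n_tx * n_rx      # number of detector weights to generate
        self.params = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
                       for _ in range(n_out)]

    def detector_weights(self, channel):
        flat = [h for row in channel for h in row]
        w = matvec(self.params, flat)
        # Reshape the flat output into one row of weights per transmitted symbol.
        return [w[i * self.n_rx:(i + 1) * self.n_rx] for i in range(self.n_tx)]

def detect(weights, received):
    """Apply the generated linear detector to a received vector."""
    return matvec(weights, received)

rng = random.Random(0)
hyper = LinearHypernetwork(n_rx=2, n_tx=2, rng=rng)
channel = [[1.0, 0.2], [0.1, 1.0]]
weights = hyper.detector_weights(channel)
symbols = detect(weights, [0.5, -0.3])
```

Only `hyper.params` would be trained; a new channel realization then needs just one forward pass through the hypernetwork rather than retraining the detector.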

Improvement of electronic Governance and mobile Governance in Multilingual Countries with Digital Etymology using Sanskrit Grammar

Title Improvement of electronic Governance and mobile Governance in Multilingual Countries with Digital Etymology using Sanskrit Grammar
Authors Arijit Das, Diganta Saha
Abstract With the huge improvement of digital connectivity (Wi-Fi, 3G, 4G) and digital devices, access to the internet has nowadays reached the remotest corners. Rural people can easily access the web or apps from PDAs, laptops, smartphones, etc. This is an opportunity for the Government to reach citizens in large numbers, get their feedback, and involve them in policy decisions through e-governance without deploying huge manpower, material or resources. But the Governments of multilingual countries face many problems in successfully implementing Government to Citizen (G2C) and Citizen to Government (C2G) governance, as rural people prefer to interact in their native languages. Presenting an equal experience over the web or an app to speakers of different language groups is a real challenge. In this research we have sorted out the problems faced by Indo-Aryan-speaking netizens, which are in general also applicable to any language family groups or subgroups. Then we have tried to give probable solutions using etymology. Etymology is used to correlate words using their root forms. In the 5th century BC, Panini wrote the Astadhyayi, in which he set out sutras, or rules, describing how a word changes according to person, tense, gender, number, etc. Later this book was also followed in Western countries to derive the grammar of comparatively new languages. We have trained our system for automatic root extraction from the surface-level or morphed form of words using Paninian grammatical rules. We have tested our system on 10000 Bengali verbs and extracted the root form with 98% accuracy. We are now working to extend the program to successfully lemmatize any words of any language and correlate them by applying those rule sets in an Artificial Neural Network.
Published 2020-03-31
URL https://arxiv.org/abs/2004.00104v1
PDF https://arxiv.org/pdf/2004.00104v1.pdf
PWC https://paperswithcode.com/paper/improvement-of-electronic-governance-and

Quantified limits of the nuclear landscape

Title Quantified limits of the nuclear landscape
Authors Léo Neufcourt, Yuchen Cao, Samuel A. Giuliani, Witold Nazarewicz, Erik Olsen, Oleg B. Tarasov
Abstract The chart of the nuclides is limited by particle drip lines beyond which nuclear stability to proton or neutron emission is lost. Predicting the range of particle-bound isotopes poses an appreciable challenge for nuclear theory as it involves extreme extrapolations of nuclear masses beyond the regions where experimental information is available. Still, quantified extrapolations are crucial for a variety of applications, including the modeling of stellar nucleosynthesis. We use microscopic nuclear mass models and Bayesian methodology to provide quantified predictions of proton and neutron separation energies as well as Bayesian probabilities of existence throughout the nuclear landscape all the way to the particle drip lines. We apply nuclear density functional theory with several energy density functionals. To account for uncertainties, Bayesian Gaussian processes are trained on the separation-energy residuals for each individual model, and the resulting predictions are combined via Bayesian model averaging. This framework allows us to account for systematic and statistical uncertainties and propagate them to extrapolative predictions. We characterize the drip-line regions where the probability that the nucleus is particle-bound decreases from $1$ to $0$. In these regions, we provide quantified predictions for one- and two-nucleon separation energies. According to our Bayesian model averaging analysis, 7759 nuclei with $Z\leq 119$ have a probability of existence $\geq 0.5$. The extrapolations obtained in this study will be put through stringent tests when new experimental information on exotic nuclei becomes available. In this respect, the quantified landscape of nuclear existence obtained in this study should be viewed as a dynamical prediction that will be fine-tuned when new experimental information and improved global mass models become available.
Tasks Gaussian Processes
Published 2020-01-16
URL https://arxiv.org/abs/2001.05924v2
PDF https://arxiv.org/pdf/2001.05924v2.pdf
PWC https://paperswithcode.com/paper/quantified-limits-of-the-nuclear-landscape
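
As a rough illustration of how per-model predictions can be combined into a single probability of existence, the sketch below assumes each model yields a Gaussian predictive distribution (mean, standard deviation) for a separation energy, treats "bound" as "separation energy greater than zero", and averages the resulting probabilities with model weights. The numbers and weights are hypothetical, and this is not the paper's full procedure.

```python
import math

def prob_positive(mean, std):
    """P(S > 0) for S ~ Normal(mean, std^2), using the Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))

def bma_probability(predictions, weights):
    """Weighted average of per-model probabilities; weights are normalized here."""
    total = sum(weights)
    return sum(w / total * prob_positive(mean, std)
               for (mean, std), w in zip(predictions, weights))

# Hypothetical separation-energy predictions in MeV from three models: (mean, std).
predictions = [(0.8, 0.5), (0.2, 0.6), (-0.1, 0.4)]
weights = [0.5, 0.3, 0.2]
p_exist = bma_probability(predictions, weights)
```

A nucleus well inside the bound region gets a probability near 1 from every model; near the drip line the models disagree and the averaged probability falls between 0 and 1.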

Crop Knowledge Discovery Based on Agricultural Big Data Integration

Title Crop Knowledge Discovery Based on Agricultural Big Data Integration
Authors Vuong M. Ngo, M-Tahar Kechadi
Abstract Nowadays, agricultural data can be generated by various sources, such as the Internet of Things (IoT), sensors, satellites, weather stations, robots, farm equipment, agricultural laboratories, farmers, government agencies and agribusinesses. The analysis of this big data enables farmers, companies and agronomists to extract high-value business and scientific knowledge, improving their operational processes and product quality. However, before analysing this data, different data sources need to be normalised, homogenised and integrated into a unified data representation. In this paper, we propose an agricultural data integration method using a constellation schema which is designed to be flexible enough to incorporate other datasets and big data models. We also apply some methods to extract knowledge with a view to improving crop yield; these include finding suitable quantities of soil properties, herbicides and insecticides for both increasing crop yield and protecting the environment.
Published 2020-03-11
URL https://arxiv.org/abs/2003.05043v1
PDF https://arxiv.org/pdf/2003.05043v1.pdf
PWC https://paperswithcode.com/paper/crop-knowledge-discovery-based-on

Regularizing activations in neural networks via distribution matching with the Wasserstein metric

Title Regularizing activations in neural networks via distribution matching with the Wasserstein metric
Authors Taejong Joo, Donggu Kang, Byunghoon Kim
Abstract Regularization and normalization have become indispensable components in training deep neural networks, resulting in faster training and improved generalization performance. We propose the projected error function regularization loss (PER) that encourages activations to follow the standard normal distribution. PER randomly projects activations onto one-dimensional space and computes the regularization loss in the projected space. PER is similar to the Pseudo-Huber loss in the projected space, thus taking advantage of both $L^1$ and $L^2$ regularization losses. Besides, PER can capture the interaction between hidden units by using projection vectors drawn from a unit sphere. By doing so, PER minimizes the upper bound of the Wasserstein distance of order one between an empirical distribution of activations and the standard normal distribution. To the best of the authors’ knowledge, this is the first work to regularize activations via distribution matching in the probability distribution space. We evaluate the proposed method on the image classification task and the word-level language modeling task.
Tasks Image Classification, Language Modelling
Published 2020-02-13
URL https://arxiv.org/abs/2002.05366v1
PDF https://arxiv.org/pdf/2002.05366v1.pdf
PWC https://paperswithcode.com/paper/regularizing-activations-in-neural-networks-1
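
The abstract above says PER resembles the Pseudo-Huber loss applied to randomly projected activations. The sketch below shows only that analogy: it projects each activation vector onto one random unit direction and averages the standard Pseudo-Huber penalty. It is not the PER loss itself, and the function names are illustrative.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def pseudo_huber(a, delta=1.0):
    """Pseudo-Huber loss: quadratic near 0 (L2-like), linear for large |a| (L1-like)."""
    return delta * delta * (math.sqrt(1.0 + (a / delta) ** 2) - 1.0)

def projected_pseudo_huber(activations, rng, delta=1.0):
    """Average Pseudo-Huber penalty of activations projected onto one random direction."""
    direction = random_unit_vector(len(activations[0]), rng)
    projected = [sum(a * d for a, d in zip(row, direction)) for row in activations]
    return sum(pseudo_huber(p, delta) for p in projected) / len(projected)

rng = random.Random(0)
batch = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(32)]
penalty = projected_pseudo_huber(batch, rng)
```

Because the direction mixes all hidden units, the penalty depends on their joint behavior rather than on each unit separately, which is the interaction effect the abstract mentions.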

Lesion Conditional Image Generation for Improved Segmentation of Intracranial Hemorrhage from CT Images

Title Lesion Conditional Image Generation for Improved Segmentation of Intracranial Hemorrhage from CT Images
Authors Manohar Karki, Junghwan Cho
Abstract Data augmentation can effectively resolve a scarcity of images when training machine-learning algorithms. It can make them more robust to unseen images. We present a lesion conditional Generative Adversarial Network (LcGAN) to generate synthetic Computed Tomography (CT) images for data augmentation. A lesion conditional image (segmented mask) is an input to both the generator and the discriminator of the LcGAN during training. The trained model generates contextual CT images based on input masks. We quantify the quality of the images by using a fully convolutional network (FCN) score and blurriness. We also train another classification network to select better synthetic images. These synthetic CT images are then used to augment the training data of our hemorrhagic lesion segmentation network. Applying this augmentation method to 2.5%, 10% and 25% of the original data improved segmentation by 12.8%, 6% and 1.6%, respectively.
Tasks Computed Tomography (CT), Conditional Image Generation, Data Augmentation, Image Generation, Lesion Segmentation
Published 2020-03-30
URL https://arxiv.org/abs/2003.13868v1
PDF https://arxiv.org/pdf/2003.13868v1.pdf
PWC https://paperswithcode.com/paper/lesion-conditional-image-generation-for