Paper Group ANR 52
Learning to Inpaint by Progressively Growing the Mask Regions. Deep Slow Motion Video Reconstruction with Hybrid Imaging System. Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble. DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition. A Connection between Feedback Capacity …
Learning to Inpaint by Progressively Growing the Mask Regions
Title | Learning to Inpaint by Progressively Growing the Mask Regions |
Authors | Mohamed Abbas Hedjazi, Yakup Genc |
Abstract | Image inpainting is one of the most challenging tasks in computer vision. Recently, generative image inpainting methods have been shown to produce visually plausible images. However, they still have difficulty generating the correct structures and colors as the masked region grows large, a drawback that stems from the training instability of generative models. This work introduces a curriculum-style training approach for image inpainting. The proposed method progressively increases the masked region size during training, while at test time the user can specify multiple holes of variable size at arbitrary locations. Incorporating such an approach in GANs may stabilize training, provide better color consistency, and capture object continuity. We validate our approach on the MSCOCO and CelebA datasets and report qualitative and quantitative comparisons of our training approach across different models. |
Tasks | Image Inpainting |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09280v1 |
https://arxiv.org/pdf/2002.09280v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-inpaint-by-progressively-growing |
Repo | |
Framework | |
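The curriculum described in the abstract amounts to a schedule on the masked-area fraction inside an ordinary inpainting training loop. Below is a minimal sketch of that idea, assuming a single square hole and a linear schedule; the constants, function names, and the commented `train_step` call are illustrative, not taken from the paper.

```python
import numpy as np

def mask_fraction(epoch, total_epochs, start=0.05, end=0.40):
    """Linearly grow the masked-area fraction over training (illustrative schedule)."""
    t = min(epoch / max(total_epochs - 1, 1), 1.0)
    return start + t * (end - start)

def random_square_mask(h, w, area_fraction, rng):
    """Place a single square hole covering roughly `area_fraction` of the image."""
    side = max(1, int(np.sqrt(area_fraction * h * w)))
    top = rng.integers(0, h - side + 1)
    left = rng.integers(0, w - side + 1)
    mask = np.zeros((h, w), dtype=np.float32)
    mask[top:top + side, left:left + side] = 1.0  # 1 = hole to inpaint
    return mask

rng = np.random.default_rng(0)
for epoch in range(100):
    frac = mask_fraction(epoch, 100)
    mask = random_square_mask(256, 256, frac, rng)
    # train_step(images * (1 - mask), mask)  # feed masked images to the generator
```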
Deep Slow Motion Video Reconstruction with Hybrid Imaging System
Title | Deep Slow Motion Video Reconstruction with Hybrid Imaging System |
Authors | Avinash Paliwal, Nima Khademi Kalantari |
Abstract | Slow motion videos are becoming increasingly popular, but capturing high-resolution videos at extremely high frame rates requires professional high-speed cameras. To mitigate this problem, current techniques increase the frame rate of standard videos through frame interpolation by assuming linear motion between the existing frames. While this assumption holds for simple cases with small motion, in challenging cases the motion is usually complex and the assumption is no longer valid, so these techniques typically produce results with unnatural motion. In this paper, we address this problem using two video streams as input: an auxiliary video with high frame rate and low spatial resolution that provides temporal information, in addition to the standard main video with low frame rate and high spatial resolution. We propose a two-stage deep learning system consisting of alignment and appearance estimation that reconstructs high-resolution slow motion video from the hybrid video input. For alignment, we propose to use a set of pre-trained and trainable convolutional neural networks (CNNs) to compute the flows between the missing frame and the two existing frames of the main video by utilizing the content of the auxiliary video frames. We then warp the existing frames using the flows to produce a set of aligned frames. For appearance estimation, we propose to combine the aligned and auxiliary frames using a context- and occlusion-aware CNN. We train our model on a set of synthetically generated hybrid videos and show high-quality results on a wide range of test scenes. We further demonstrate the practicality of our approach by showing the performance of our system on two real dual-camera setups with a small baseline. |
Tasks | Video Reconstruction |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12106v1 |
https://arxiv.org/pdf/2002.12106v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-slow-motion-video-reconstruction-with |
Repo | |
Framework | |
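The alignment stage described above warps the two existing main-video frames toward the missing time step using estimated flows. A minimal sketch of the warping step follows, assuming the flows have already been produced by the alignment CNNs; the flow estimation itself and the appearance-estimation CNN are omitted, and all array shapes are illustrative.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def backward_warp(frame, flow):
    """Warp `frame` (H, W, C) with a dense flow field (H, W, 2) via bilinear sampling."""
    h, w, c = frame.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    sample_y = ys + flow[..., 1]
    sample_x = xs + flow[..., 0]
    warped = np.stack(
        [map_coordinates(frame[..., k], [sample_y, sample_x], order=1, mode="nearest")
         for k in range(c)], axis=-1)
    return warped

# flow_prev/flow_next would come from the alignment CNNs; here they are placeholders.
frame_prev = np.random.rand(128, 128, 3).astype(np.float32)
frame_next = np.random.rand(128, 128, 3).astype(np.float32)
flow_prev = np.zeros((128, 128, 2), dtype=np.float32)
flow_next = np.zeros((128, 128, 2), dtype=np.float32)
aligned = [backward_warp(frame_prev, flow_prev), backward_warp(frame_next, flow_next)]
# The appearance-estimation CNN would then fuse `aligned` with the auxiliary frame.
```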
Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble
Title | Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble |
Authors | Yongming Li, Yan Lei, Pin Wang, Yuchuan Liu |
Abstract | Deep learning is a feature learning approach with strong nonlinear feature transformation and has become increasingly important in many fields of artificial intelligence. The deep autoencoder is one representative deep learning method and can effectively extract abstract information from datasets. However, it does not consider the complementarity between deep features and original features during deep feature transformation, and it suffers from the small-sample problem. To address these issues, this paper proposes a novel deep autoencoder, the hybrid feature embedded stacked sparse autoencoder (HFESAE). HFESAE learns discriminant deep features by embedding the original features to filter weak hidden-layer outputs during training. Because the class-representation ability of the abstract information is limited by the small-sample problem, a feature fusion strategy is designed that combines the abstract information learned by HFESAE with the original features to obtain hybrid features for feature reduction. The strategy is a hybrid feature selection scheme based on L1 regularization followed by a support vector machine (SVM) ensemble model, in which a weighted local discriminant preservation projection (w_LPPD) is designed and employed on each base classifier. Several representative public datasets are used to verify the effectiveness of the proposed algorithm. The experimental results demonstrate that the proposed feature learning method yields superior performance compared to existing state-of-the-art feature learning algorithms, including representative deep autoencoder methods. |
Tasks | Feature Selection |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06761v1 |
https://arxiv.org/pdf/2002.06761v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-embedded-deep-stacked-sparse |
Repo | |
Framework | |
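The fusion-and-selection pipeline in the abstract (deep features concatenated with the original features, L1-based selection, then an SVM ensemble) can be sketched with scikit-learn. In this illustration the autoencoder's deep features are replaced by a random projection and the w_LPPD projection is omitted, so it shows only the overall pipeline shape, not the paper's method.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.svm import LinearSVC, SVC
from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stand-in for deep features learned by the autoencoder (here: a random projection).
proj = np.random.default_rng(0).normal(size=(X.shape[1], 16))
deep_tr, deep_te = X_tr @ proj, X_te @ proj

# Hybrid features: concatenate deep and original features, as the abstract describes.
hybrid_tr = np.hstack([deep_tr, X_tr])
hybrid_te = np.hstack([deep_te, X_te])

# L1-regularized selection, then an SVM ensemble (w_LPPD itself is omitted here).
selector = SelectFromModel(LinearSVC(C=0.1, penalty="l1", dual=False, max_iter=5000))
sel_tr = selector.fit_transform(hybrid_tr, y_tr)
sel_te = selector.transform(hybrid_te)

ensemble = BaggingClassifier(SVC(kernel="rbf"), n_estimators=10, random_state=0)
ensemble.fit(sel_tr, y_tr)
print("test accuracy:", ensemble.score(sel_te, y_te))
```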
DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition
Title | DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition |
Authors | Hao-Chiang Shao, Kang-Yu Liu, Chia-Wen Lin, Jiwen Lu |
Abstract | The performance of a convolutional neural network (CNN) based face recognition model largely relies on the richness of labelled training data. Collecting a training set with large variations of a face identity under different poses and illumination changes, however, is very expensive, making the diversity of within-class face images a critical issue in practice. In this paper, we propose a 3D model-assisted domain-transferred face augmentation network (DotFAN) that can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains. DotFAN is structurally a conditional CycleGAN but has two additional subnetworks, namely face expert network (FEM) and face shape regressor (FSR), for latent code control. While FSR aims to extract face attributes, FEM is designed to capture a face identity. With their aid, DotFAN can learn a disentangled face representation and effectively generate face images of various facial attributes while preserving the identity of augmented faces. Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity so that a better face recognition model can be learned from the augmented dataset. |
Tasks | Face Recognition |
Published | 2020-02-23 |
URL | https://arxiv.org/abs/2002.09859v1 |
https://arxiv.org/pdf/2002.09859v1.pdf | |
PWC | https://paperswithcode.com/paper/dotfan-a-domain-transferred-face-augmentation |
Repo | |
Framework | |
A Connection between Feedback Capacity and Kalman Filter for Colored Gaussian Noises
Title | A Connection between Feedback Capacity and Kalman Filter for Colored Gaussian Noises |
Authors | Song Fang, Quanyan Zhu |
Abstract | In this paper, we establish a connection between the feedback capacity of additive colored Gaussian noise channels and the Kalman filters with additive colored Gaussian noises. In light of this, we are able to provide lower bounds on feedback capacity of such channels with finite-order auto-regressive moving average colored noises, and the bounds are seen to be consistent with various existing results in the literature; particularly, the bound is tight in the case of first-order auto-regressive moving average colored noises. On the other hand, the Kalman filtering systems, after certain equivalence transformations, can be employed as recursive coding schemes/algorithms to achieve the lower bounds. In general, our results provide an alternative perspective while pointing to potentially tighter bounds for the feedback capacity problem. |
Tasks | |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.03108v2 |
https://arxiv.org/pdf/2001.03108v2.pdf | |
PWC | https://paperswithcode.com/paper/a-connection-between-feedback-capacity-and |
Repo | |
Framework | |
Class Conditional Alignment for Partial Domain Adaptation
Title | Class Conditional Alignment for Partial Domain Adaptation |
Authors | Mohsen Kheirandishfard, Fariba Zohrizadeh, Farhad Kamangar |
Abstract | Adversarial adaptation models have demonstrated significant progress towards transferring knowledge from a labeled source dataset to an unlabeled target dataset. Partial domain adaptation (PDA) investigates the scenarios in which the source domain is large and diverse, and the target label space is a subset of the source label space. The main purpose of PDA is to identify the shared classes between the domains and promote learning transferable knowledge from these classes. In this paper, we propose a multi-class adversarial architecture for PDA. The proposed approach jointly aligns the marginal and class-conditional distributions in the shared label space by minimaxing a novel multi-class adversarial loss function. Furthermore, we incorporate effective regularization terms to encourage selecting the most relevant subset of source domain classes. In the absence of target labels, the proposed approach is able to effectively learn domain-invariant feature representations, which in turn can enhance the classification performance in the target domain. Comprehensive experiments on three benchmark datasets Office-31, Office-Home, and Caltech-Office corroborate the effectiveness of the proposed approach in addressing different partial transfer learning tasks. |
Tasks | Domain Adaptation, Partial Domain Adaptation, Transfer Learning |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.06722v1 |
https://arxiv.org/pdf/2003.06722v1.pdf | |
PWC | https://paperswithcode.com/paper/class-conditional-alignment-for-partial |
Repo | |
Framework | |
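Below is a hedged sketch of one way to implement class-conditional adversarial alignment: a domain discriminator with one output per class, weighted by the classifier's predicted class probabilities, plus class weights estimated from target predictions to down-weight source-only classes. This is a generic construction in the spirit of the abstract, not the paper's exact loss or regularizers.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 10  # source label space; the target uses an unknown subset of these

feature_net = nn.Sequential(nn.Linear(256, 128), nn.ReLU())
classifier = nn.Linear(128, NUM_CLASSES)
# One domain output per class, so marginal and class-conditional alignment share it.
domain_disc = nn.Linear(128, NUM_CLASSES)

def multi_class_adversarial_loss(feats, class_probs, domain_labels):
    """Per-class domain logits weighted by predicted class probabilities (illustrative)."""
    logits = domain_disc(feats)                      # (B, C) domain logits, one per class
    per_class = nn.functional.binary_cross_entropy_with_logits(
        logits, domain_labels.unsqueeze(1).expand_as(logits), reduction="none")
    return (class_probs.detach() * per_class).sum(dim=1).mean()

def estimate_class_weights(target_probs):
    """Average target predictions; small weights flag likely source-only classes."""
    w = target_probs.mean(dim=0)
    return w / w.max()

feats = feature_net(torch.randn(8, 256))
probs = torch.softmax(classifier(feats), dim=1)
dom = torch.cat([torch.zeros(4), torch.ones(4)])     # 0 = source, 1 = target
print(multi_class_adversarial_loss(feats, probs, dom))
print(estimate_class_weights(probs[4:]))
```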
Observational nonidentifiability, generalized likelihood and free energy
Title | Observational nonidentifiability, generalized likelihood and free energy |
Authors | A. E. Allahverdyan |
Abstract | We study the parameter estimation problem in mixture models with observational nonidentifiability: the full model (also containing hidden variables) is identifiable, but the marginal (observed) model is not. Hence global maxima of the marginal likelihood are (infinitely) degenerate and predictions of the marginal likelihood are not unique. We show how to generalize the marginal likelihood by introducing an effective temperature, making it similar to the free energy. This generalization resolves the observational nonidentifiability, since its maximization leads to unique results that are better than a random selection of one degenerate maximum of the marginal likelihood or the averaging over many such maxima. The generalized likelihood inherits many features from the usual likelihood, e.g. it satisfies the conditionality principle, and its local maxima can be found via a suitably modified expectation-maximization method. The maximization of the generalized likelihood relates to entropy optimization. |
Tasks | |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07884v1 |
https://arxiv.org/pdf/2002.07884v1.pdf | |
PWC | https://paperswithcode.com/paper/observational-nonidentifiability-generalized |
Repo | |
Framework | |
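One natural free-energy-style generalization of the marginal likelihood with an effective temperature $T$, written here only to make the abstract's idea concrete (the paper's exact definition may differ), is

$$
\mathcal{L}_T(\theta) \;=\; \sum_{i=1}^{n} T \log \sum_{h}\big[p(x_i, h \mid \theta)\big]^{1/T},
$$

which recovers the ordinary marginal log-likelihood at $T=1$ and, as $T \to 0$, approaches $\sum_i \max_h \log p(x_i, h \mid \theta)$.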
DDKSP: A Data-Driven Stochastic Programming Framework for Car-Sharing Relocation Problem
Title | DDKSP: A Data-Driven Stochastic Programming Framework for Car-Sharing Relocation Problem |
Authors | Xiaoming Li, Chun Wang, Xiao Huang |
Abstract | Car sharing is a popular research topic in the sharing economy. In this paper, we investigate the car-sharing relocation problem (CSRP) under uncertain demand. In practice, real customer demand follows complicated probability distributions that cannot be described by parametric approaches. To overcome this problem, we propose an innovative framework called Data-Driven Kernel Stochastic Programming (DDKSP) that integrates a non-parametric approach, kernel density estimation (KDE), with a two-stage stochastic programming (SP) model. Specifically, the probability distributions are derived from historical data by KDE and used as the uncertain input parameters of the SP model, and the CSRP is formulated as a two-stage SP model. A Monte Carlo method, sample average approximation (SAA), and a Benders decomposition algorithm are introduced to solve the large-scale optimization model. Finally, numerical validation on New York taxi trip data sets shows that the proposed framework outperforms pure parametric approaches based on Gaussian, Laplace and Poisson distributions by 3.72%, 4.58% and 11%, respectively, in terms of overall profit. |
Tasks | Density Estimation |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.08109v1 |
https://arxiv.org/pdf/2001.08109v1.pdf | |
PWC | https://paperswithcode.com/paper/ddksp-a-data-driven-stochastic-programming |
Repo | |
Framework | |
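The KDE-to-SAA pipeline in the abstract can be illustrated with a toy one-station relocation decision: fit a kernel density to historical demand, sample scenarios from it, and pick the decision that minimizes the sample-average second-stage cost. The demand numbers, penalty coefficients, and the simple newsvendor-style recourse below are illustrative, not the paper's two-stage model.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Historical daily demand at one station (placeholder numbers; real inputs would be
# the New York trip records mentioned in the abstract).
historical_demand = np.array([12, 15, 9, 22, 18, 14, 11, 20, 17, 13], dtype=float)

# Non-parametric first stage: fit a KDE instead of assuming Gaussian/Laplace/Poisson.
kde = gaussian_kde(historical_demand)

# Sample Average Approximation: draw demand scenarios and average the second-stage cost.
def second_stage_cost(relocated_cars, demand):
    shortage = np.maximum(demand - relocated_cars, 0.0)
    surplus = np.maximum(relocated_cars - demand, 0.0)
    return 5.0 * shortage + 1.0 * surplus   # illustrative penalty coefficients

scenarios = kde.resample(size=1000, seed=0).ravel()
candidates = np.arange(0, 31)
expected = [second_stage_cost(x, scenarios).mean() for x in candidates]
print("SAA-optimal relocation quantity:", candidates[int(np.argmin(expected))])
```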
Subset Sampling For Progressive Neural Network Learning
Title | Subset Sampling For Progressive Neural Network Learning |
Authors | Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis |
Abstract | Progressive Neural Network Learning is a class of algorithms that incrementally construct the network’s topology and optimize its parameters based on the training data. While this approach exempts the users from the manual task of designing and validating multiple network topologies, it often requires an enormous number of computations. In this paper, we propose to speed up this process by exploiting subsets of training data at each incremental training step. Three different sampling strategies for selecting the training samples according to different criteria are proposed and evaluated. We also propose to perform online hyperparameter selection during the network progression, which further reduces the overall training time. Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably while operating on par with the baseline approach exploiting the entire training set throughout the training process. |
Tasks | Face Recognition |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.07141v1 |
https://arxiv.org/pdf/2002.07141v1.pdf | |
PWC | https://paperswithcode.com/paper/subset-sampling-for-progressive-neural |
Repo | |
Framework | |
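A minimal sketch of sampling a training subset at each incremental step, with a random strategy and an uncertainty-based strategy as two examples; the strategy names, the subset size, and the commented `grow_and_train` call are illustrative, not the paper's exact criteria.

```python
import numpy as np

def sample_subset(X, y, model, size, rng, strategy="uncertainty"):
    """Pick a training subset for the next incremental step (illustrative strategies)."""
    if strategy == "random":
        idx = rng.choice(len(X), size=size, replace=False)
    elif strategy == "uncertainty":
        # Prefer samples the current model is least confident about (smallest margin).
        probs = np.sort(model.predict_proba(X), axis=1)
        margin = probs[:, -1] - probs[:, -2]
        idx = np.argsort(margin)[:size]
    else:
        raise ValueError(strategy)
    return X[idx], y[idx]

# Progressive loop (sketch): grow the network, but fit each step on a subset only.
# rng = np.random.default_rng(0)
# for step in range(n_steps):
#     X_sub, y_sub = sample_subset(X_train, y_train, model, size=2000, rng=rng)
#     model = grow_and_train(model, X_sub, y_sub)   # hypothetical progression step
```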
Deep HyperNetwork-Based MIMO Detection
Title | Deep HyperNetwork-Based MIMO Detection |
Authors | Mathieu Goutay, Fayçal Ait Aoudia, Jakob Hoydis |
Abstract | Optimal symbol detection for multiple-input multiple-output (MIMO) systems is known to be an NP-hard problem. Conventional heuristic algorithms are either too complex to be practical or suffer from poor performance. Recently, several approaches tried to address those challenges by implementing the detector as a deep neural network. However, they either still achieve unsatisfying performance on practical spatially correlated channels, or are computationally demanding since they require retraining for each channel realization. In this work, we address both issues by training an additional neural network (NN), referred to as the hypernetwork, which takes the channel matrix as input and generates the weights of the NN-based detector. Results show that the proposed approach achieves near state-of-the-art performance without the need for re-training. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02750v2 |
https://arxiv.org/pdf/2002.02750v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-hypernetwork-based-mimo-detection |
Repo | |
Framework | |
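The core idea, a hypernetwork that maps the channel matrix to the weights of the detector network, can be sketched in a few lines of PyTorch. The single-layer detector and all dimensions below are illustrative; the paper's detector architecture is more elaborate.

```python
import torch
import torch.nn as nn

N_TX, N_RX = 4, 8
DET_IN, DET_OUT = 2 * N_RX, 2 * N_TX      # real/imaginary parts stacked

class HyperNetwork(nn.Module):
    """Maps a (flattened) channel matrix to the weights of a one-layer detector."""
    def __init__(self):
        super().__init__()
        n_weights = DET_OUT * DET_IN + DET_OUT
        self.net = nn.Sequential(nn.Linear(2 * N_RX * N_TX, 128), nn.ReLU(),
                                 nn.Linear(128, n_weights))

    def forward(self, h_flat):
        params = self.net(h_flat)
        w = params[: DET_OUT * DET_IN].reshape(DET_OUT, DET_IN)
        b = params[DET_OUT * DET_IN:]
        return w, b

def detect(y, w, b):
    """Detector whose weights are generated per channel realization (no retraining)."""
    return torch.tanh(y @ w.t() + b)       # soft symbol estimates (illustrative)

hyper = HyperNetwork()
H = torch.randn(N_RX, N_TX, dtype=torch.cfloat)
h_flat = torch.cat([H.real.flatten(), H.imag.flatten()])
w, b = hyper(h_flat)
y = torch.randn(16, DET_IN)                # 16 received vectors (real/imag stacked)
print(detect(y, w, b).shape)               # torch.Size([16, 8])
```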
Improvement of electronic Governance and mobile Governance in Multilingual Countries with Digital Etymology using Sanskrit Grammar
Title | Improvement of electronic Governance and mobile Governance in Multilingual Countries with Digital Etymology using Sanskrit Grammar |
Authors | Arijit Das, Diganta Saha |
Abstract | With the huge improvement in digital connectivity (WiFi, 3G, 4G) and digital devices, internet access has now reached the remotest corners. Rural people can easily access the web or apps from PDAs, laptops, smartphones, etc. This gives the Government an opportunity to reach citizens in large numbers, collect their feedback, and involve them in policy decisions through e-governance without deploying huge manpower, material or resources. However, the governments of multilingual countries face many problems in successfully implementing Government-to-Citizen (G2C) and Citizen-to-Government (C2G) governance, as rural people tend to prefer interacting in their native languages. Presenting an equal experience over the web or an app to speakers of different language groups is a real challenge. In this research we have sorted out the problems faced by Indo-Aryan-speaking netizens, which are in general also applicable to any language family group or subgroup. We then propose probable solutions using etymology, which correlates words through their root forms. In the 5th century BC, Panini wrote the Astadhyayi, in which he laid out sutras, or rules, describing how a word changes according to person, tense, gender, number, etc. This work was later also followed in Western countries to derive the grammars of comparatively newer languages. We have trained our system to automatically extract roots from the surface (morphed) forms of words using Paninian grammatical rules. We have tested our system on 10,000 Bengali verbs and extracted the root forms with 98% accuracy. We are now working to extend the program to lemmatize words of any language and correlate them by applying those rule sets in an artificial neural network. |
Tasks | |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2004.00104v1 |
https://arxiv.org/pdf/2004.00104v1.pdf | |
PWC | https://paperswithcode.com/paper/improvement-of-electronic-governance-and |
Repo | |
Framework | |
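A toy illustration of rule-based root extraction by suffix matching follows; the surface-form/root pairs below are hypothetical placeholders, whereas the actual system applies Paninian grammatical rules to Bengali verb morphology.

```python
# Illustrative suffix-stripping rules in the spirit of root extraction; the real system
# uses Paninian rules for Bengali verb morphology, which are far richer than this table.
RULES = [
    ("korchhi", "kor"),   # hypothetical surface form -> root mappings
    ("korbe", "kor"),
    ("khacchi", "kha"),
    ("jabo", "ja"),
]

def extract_root(word):
    """Return the first rule-matched root, otherwise the word itself."""
    for surface, root in RULES:
        if word == surface or word.endswith(surface):
            return root
    return word

for w in ["korchhi", "jabo", "boi"]:
    print(w, "->", extract_root(w))
```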
Quantified limits of the nuclear landscape
Title | Quantified limits of the nuclear landscape |
Authors | Léo Neufcourt, Yuchen Cao, Samuel A. Giuliani, Witold Nazarewicz, Erik Olsen, Oleg B. Tarasov |
Abstract | The chart of the nuclides is limited by particle drip lines beyond which nuclear stability to proton or neutron emission is lost. Predicting the range of particle-bound isotopes poses an appreciable challenge for nuclear theory as it involves extreme extrapolations of nuclear masses beyond the regions where experimental information is available. Still, quantified extrapolations are crucial for a variety of applications, including the modeling of stellar nucleosynthesis. We use microscopic nuclear mass models and Bayesian methodology to provide quantified predictions of proton and neutron separation energies as well as Bayesian probabilities of existence throughout the nuclear landscape all the way to the particle drip lines. We apply nuclear density functional theory with several energy density functionals. To account for uncertainties, Bayesian Gaussian processes are trained on the separation-energy residuals for each individual model, and the resulting predictions are combined via Bayesian model averaging. This framework allows us to account for systematic and statistical uncertainties and to propagate them to extrapolative predictions. We characterize the drip-line regions where the probability that the nucleus is particle-bound decreases from $1$ to $0$. In these regions, we provide quantified predictions for one- and two-nucleon separation energies. According to our Bayesian model averaging analysis, 7759 nuclei with $Z\leq 119$ have a probability of existence $\geq 0.5$. The extrapolations obtained in this study will be put through stringent tests when new experimental information on exotic nuclei becomes available. In this respect, the quantified landscape of nuclear existence obtained in this study should be viewed as a dynamical prediction that will be fine-tuned when new experimental information and improved global mass models become available. |
Tasks | Gaussian Processes |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05924v2 |
https://arxiv.org/pdf/2001.05924v2.pdf | |
PWC | https://paperswithcode.com/paper/quantified-limits-of-the-nuclear-landscape |
Repo | |
Framework | |
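The uncertainty-quantification recipe in the abstract (a Gaussian process trained on each mass model's separation-energy residuals, then Bayesian model averaging across models) can be sketched with scikit-learn on toy stand-in data; the nuclei, residuals, and model weights below are placeholders, not the paper's values.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy stand-ins: (Z, N) of measured nuclei and separation-energy residuals
# (experiment minus model) for two hypothetical mass models.
rng = np.random.default_rng(0)
ZN = rng.integers(20, 80, size=(60, 2)).astype(float)
residuals = {"model_A": rng.normal(0.0, 0.5, 60), "model_B": rng.normal(0.2, 0.4, 60)}

kernel = 1.0 * RBF(length_scale=10.0) + WhiteKernel(noise_level=0.1)
gps = {m: GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(ZN, r)
       for m, r in residuals.items()}

# Bayesian model averaging: combine corrected predictions with model weights
# (weights would come from each model's evidence on the data; here they are fixed).
weights = {"model_A": 0.6, "model_B": 0.4}
new_nucleus = np.array([[50.0, 82.0]])
corrections = {m: gp.predict(new_nucleus)[0] for m, gp in gps.items()}
bma_correction = sum(weights[m] * corrections[m] for m in gps)
print("BMA-averaged residual correction (toy units):", bma_correction)
```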
Crop Knowledge Discovery Based on Agricultural Big Data Integration
Title | Crop Knowledge Discovery Based on Agricultural Big Data Integration |
Authors | Vuong M. Ngo, M-Tahar Kechadi |
Abstract | Nowadays, agricultural data can be generated through various sources, such as the Internet of Things (IoT), sensors, satellites, weather stations, robots, farm equipment, agricultural laboratories, farmers, government agencies and agribusinesses. The analysis of this big data enables farmers, companies and agronomists to extract high-value business and scientific knowledge, improving their operational processes and product quality. However, before analysing this data, different data sources need to be normalised, homogenised and integrated into a unified data representation. In this paper, we propose an agricultural data integration method using a constellation schema which is designed to be flexible enough to incorporate other datasets and big data models. We also apply methods to extract knowledge with a view to improving crop yield; these include finding suitable quantities of soil properties, herbicides and insecticides for both increasing crop yield and protecting the environment. |
Tasks | |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05043v1 |
https://arxiv.org/pdf/2003.05043v1.pdf | |
PWC | https://paperswithcode.com/paper/crop-knowledge-discovery-based-on |
Repo | |
Framework | |
Regularizing activations in neural networks via distribution matching with the Wasserstein metric
Title | Regularizing activations in neural networks via distribution matching with the Wasserstein metric |
Authors | Taejong Joo, Donggu Kang, Byunghoon Kim |
Abstract | Regularization and normalization have become indispensable components in training deep neural networks, resulting in faster training and improved generalization performance. We propose the projected error function regularization loss (PER) that encourages activations to follow the standard normal distribution. PER randomly projects activations onto one-dimensional space and computes the regularization loss in the projected space. PER is similar to the Pseudo-Huber loss in the projected space, thus taking advantage of both $L^1$ and $L^2$ regularization losses. In addition, PER can capture interactions between hidden units through projection vectors drawn from the unit sphere. By doing so, PER minimizes the upper bound of the Wasserstein distance of order one between the empirical distribution of activations and the standard normal distribution. To the best of the authors' knowledge, this is the first work to regularize activations via distribution matching in the probability distribution space. We evaluate the proposed method on the image classification task and the word-level language modeling task. |
Tasks | Image Classification, Language Modelling |
Published | 2020-02-13 |
URL | https://arxiv.org/abs/2002.05366v1 |
https://arxiv.org/pdf/2002.05366v1.pdf | |
PWC | https://paperswithcode.com/paper/regularizing-activations-in-neural-networks-1 |
Repo | |
Framework | |
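A hedged stand-in for the regularizer described above: project activations onto random one-dimensional directions and penalize the 1-D Wasserstein-1 distance to standard-normal samples. This illustrates the distribution-matching idea but is not the paper's closed-form PER loss.

```python
import torch

def projected_w1_to_normal(activations, n_projections=8):
    """Average 1-D Wasserstein-1 distance between randomly projected activations and
    standard-normal samples (an illustrative stand-in for the PER loss)."""
    b, d = activations.shape
    total = activations.new_zeros(())
    for _ in range(n_projections):
        v = torch.randn(d)
        v = v / v.norm()
        proj = activations @ v              # project onto one dimension
        ref = torch.randn(b)                # samples from N(0, 1)
        # 1-D W1 distance = mean absolute difference of sorted samples
        total = total + (proj.sort().values - ref.sort().values).abs().mean()
    return total / n_projections

h = torch.randn(128, 64) * 2.0 + 1.0        # activations of one hidden layer
reg_loss = projected_w1_to_normal(h)
print(float(reg_loss))                      # added to the task loss with a small coefficient
```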
Lesion Conditional Image Generation for Improved Segmentation of Intracranial Hemorrhage from CT Images
Title | Lesion Conditional Image Generation for Improved Segmentation of Intracranial Hemorrhage from CT Images |
Authors | Manohar Karki, Junghwan Cho |
Abstract | Data augmentation can effectively resolve a scarcity of images when training machine-learning algorithms and can make them more robust to unseen images. We present a lesion-conditional Generative Adversarial Network (LcGAN) to generate synthetic Computed Tomography (CT) images for data augmentation. A lesion conditional image (segmented mask) is an input to both the generator and the discriminator of the LcGAN during training. The trained model generates contextual CT images based on input masks. We quantify the quality of the images using a fully convolutional network (FCN) score and blurriness. We also train another classification network to select better synthetic images. These synthetic CT images are then used to augment the training data of our hemorrhagic lesion segmentation network. By applying this augmentation method to 2.5%, 10% and 25% of the original data, segmentation improved by 12.8%, 6% and 1.6%, respectively. |
Tasks | Computed Tomography (CT), Conditional Image Generation, Data Augmentation, Image Generation, Lesion Segmentation |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13868v1 |
https://arxiv.org/pdf/2003.13868v1.pdf | |
PWC | https://paperswithcode.com/paper/lesion-conditional-image-generation-for |
Repo | |
Framework | |
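The conditioning described in the abstract, with the lesion mask fed as an extra input channel to both the generator and the discriminator, can be sketched in PyTorch; all layer sizes and the placeholder masks below are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),   # channels: noise/CT + mask
            nn.Conv2d(32, 1, 3, padding=1), nn.Tanh())

    def forward(self, z, mask):
        return self.net(torch.cat([z, mask], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))

mask = (torch.rand(4, 1, 64, 64) > 0.9).float()   # placeholder lesion masks
z = torch.randn(4, 1, 64, 64)
fake_ct = Generator()(z, mask)
score = Discriminator()(fake_ct, mask)
print(fake_ct.shape, score.shape)                 # (4, 1, 64, 64) and (4, 1)
```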