Paper Group ANR 219
LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis
Title | LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis |
Authors | Quan Huu Cap, Hiroyuki Uga, Satoshi Kagiwada, Hitoshi Iyatomi |
Abstract | Many applications for the automated diagnosis of plant disease have been developed based on the success of deep learning techniques. However, these applications often suffer from overfitting, and the diagnostic performance is drastically decreased when used on test datasets from new environments. The typical reasons for this are that the symptoms to be detected are unclear or faint, and there are limitations related to data diversity. In this paper, we propose LeafGAN, a novel image-to-image translation system with its own attention mechanism. LeafGAN generates a wide variety of diseased images via transformation from healthy images, as a data augmentation tool for improving the performance of plant disease diagnosis. Thanks to its own attention mechanism, our model can transform only relevant areas from images with a variety of backgrounds, thus enriching the versatility of the training images. Experiments with five-class cucumber disease classification show that data augmentation with vanilla CycleGAN cannot help to improve the generalization, i.e., disease diagnostic performance increased by only 0.7% from the baseline. In contrast, LeafGAN boosted the diagnostic performance by 7.4%. We also visually confirmed that the images generated by our LeafGAN were of much better quality and more convincing than those generated by vanilla CycleGAN. |
Tasks | Data Augmentation, Image-to-Image Translation |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10100v1 |
PDF | https://arxiv.org/pdf/2002.10100v1.pdf |
PWC | https://paperswithcode.com/paper/leafgan-an-effective-data-augmentation-method |
Repo | |
Framework | |
Deep Sigma Point Processes
Title | Deep Sigma Point Processes |
Authors | Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner |
Abstract | We introduce Deep Sigma Point Processes, a class of parametric models inspired by the compositional structure of Deep Gaussian Processes (DGPs). Deep Sigma Point Processes (DSPPs) retain many of the attractive features of (variational) DGPs, including mini-batch training and predictive uncertainty that is controlled by kernel basis functions. Importantly, since DSPPs admit a simple maximum likelihood inference procedure, the resulting predictive distributions are not degraded by any posterior approximations. In an extensive empirical comparison on univariate and multivariate regression tasks we find that the resulting predictive distributions are significantly better calibrated than those obtained with other probabilistic methods for scalable regression, including variational DGPs, often by as much as a nat per datapoint. |
Tasks | Gaussian Processes, Point Processes |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09112v1 |
PDF | https://arxiv.org/pdf/2002.09112v1.pdf |
PWC | https://paperswithcode.com/paper/deep-sigma-point-processes |
Repo | |
Framework | |
Short Term Blood Glucose Prediction based on Continuous Glucose Monitoring Data
Title | Short Term Blood Glucose Prediction based on Continuous Glucose Monitoring Data |
Authors | Ali Mohebbi, Alexander R. Johansen, Nicklas Hansen, Peter E. Christensen, Jens M. Tarp, Morten L. Jensen, Henrik Bengtsson, Morten Mørup |
Abstract | Continuous Glucose Monitoring (CGM) has enabled important opportunities for diabetes management. This study explores the use of CGM data as input for digital decision support tools. We investigate how Recurrent Neural Networks (RNNs) can be used for Short Term Blood Glucose (STBG) prediction and compare the RNNs to conventional time-series forecasting using Autoregressive Integrated Moving Average (ARIMA). A prediction horizon of up to 90 min into the future is considered. In this context, we evaluate both population-based and patient-specific RNNs and contrast them with patient-specific ARIMA models and a simple baseline that predicts future observations as the last observed value. We find that the population-based RNN model is the best performing model across the considered prediction horizons without the need for patient-specific data. This demonstrates the potential of RNNs for STBG prediction in diabetes patients towards detecting/mitigating severe events in the STBG, in particular hypoglycemic events. However, further studies are needed with regard to the robustness and practical use of the investigated STBG prediction models. |
Tasks | Time Series, Time Series Forecasting |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02805v1 |
PDF | https://arxiv.org/pdf/2002.02805v1.pdf |
PWC | https://paperswithcode.com/paper/short-term-blood-glucose-prediction-based-on |
Repo | |
Framework | |
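The "last observed" baseline mentioned in the abstract above can be illustrated with a short sketch. This is not the authors' code: the function names, the sample readings, and the use of RMSE as the error measure are assumptions made here for illustration only.

```python
import math


def last_value_forecast(history, horizon_steps):
    """Persistence baseline: repeat the last observed reading for every future step."""
    if not history:
        raise ValueError("history must contain at least one reading")
    return [history[-1]] * horizon_steps


def rmse(predicted, actual):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical CGM readings (mg/dL); values are made up for the example.
cgm_history = [110.0, 112.0, 115.0, 119.0]
cgm_future = [121.0, 123.0, 119.0]

forecast = last_value_forecast(cgm_history, len(cgm_future))
baseline_rmse = rmse(forecast, cgm_future)
```

Any learned model, such as the RNNs or ARIMA models in the abstract, would need to beat `baseline_rmse` on the same horizon to be considered useful.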
Knot Selection in Sparse Gaussian Processes
Title | Knot Selection in Sparse Gaussian Processes |
Authors | Nathaniel Garton, Jarad Niemi, Alicia Carriquiry |
Abstract | Knot-based, sparse Gaussian processes have enjoyed considerable success as scalable approximations to full Gaussian processes. Problems can occur, however, when knot selection is done by optimizing the marginal likelihood. For example, the marginal likelihood surface is highly multimodal, which can cause suboptimal knot placement where some knots serve practically no function. This is especially a problem when many more knots are used than are necessary, resulting in extra computational cost for little to no gains in accuracy. We propose a one-at-a-time knot selection algorithm to select both the number and placement of knots. Our algorithm uses Bayesian optimization to efficiently propose knots that are likely to be good and largely avoids the pathologies encountered when using the marginal likelihood as the objective function. We provide empirical results showing improved accuracy and speed over the current standard approaches. |
Tasks | Gaussian Processes |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09538v1 |
PDF | https://arxiv.org/pdf/2002.09538v1.pdf |
PWC | https://paperswithcode.com/paper/knot-selection-in-sparse-gaussian-processes |
Repo | |
Framework | |
Fast Depth Estimation for View Synthesis
Title | Fast Depth Estimation for View Synthesis |
Authors | Nantheera Anantrasirichai, Majid Geravand, David Braendler, David R. Bull |
Abstract | Disparity/depth estimation from sequences of stereo images is an important element in 3D vision. Owing to occlusions, imperfect settings and homogeneous luminance, accurate estimation of depth remains a challenging problem. Targeting view synthesis, we propose a novel learning-based framework making use of dilated convolution, densely connected convolutional modules, a compact decoder and skip connections. The network is shallow but dense, so it is fast and accurate. Two additional contributions, a non-linear adjustment of the depth resolution and the introduction of a projection loss, lead to reductions in estimation error of up to 20% and 25%, respectively. The results show that our network outperforms state-of-the-art methods with an average improvement in accuracy of depth estimation and view synthesis of approximately 45% and 34%, respectively. Where our method generates comparable quality of estimated depth, it performs 10 times faster than those methods. |
Tasks | Depth Estimation |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.06637v1 |
PDF | https://arxiv.org/pdf/2003.06637v1.pdf |
PWC | https://paperswithcode.com/paper/fast-depth-estimation-for-view-synthesis |
Repo | |
Framework | |
Weakly-supervised Multi-output Regression via Correlated Gaussian Processes
Title | Weakly-supervised Multi-output Regression via Correlated Gaussian Processes |
Authors | Seokhyun Chung, Raed Al Kontar, Zhenke Wu |
Abstract | Multi-output regression seeks to infer multiple latent functions using data from multiple groups/sources while accounting for potential between-group similarities. In this paper, we consider multi-output regression under a weakly-supervised setting where a subset of data points from multiple groups are unlabeled. We use dependent Gaussian processes for multiple outputs constructed by convolutions with shared latent processes. We introduce hyperpriors for the multinomial probabilities of the unobserved labels and optimize the hyperparameters, which we show improves estimation. We derive two variational bounds: (i) a modified variational bound for fast and stable convergence in model inference, and (ii) a scalable variational bound that is amenable to stochastic optimization. We use experiments on synthetic and real-world data to show that the proposed model outperforms state-of-the-art models with more accurate estimation of multiple latent functions and unobserved labels. |
Tasks | Gaussian Processes, Stochastic Optimization |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08412v1 |
PDF | https://arxiv.org/pdf/2002.08412v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-multi-output-regression-via |
Repo | |
Framework | |
Uncertainty depth estimation with gated images for 3D reconstruction
Title | Uncertainty depth estimation with gated images for 3D reconstruction |
Authors | Stefanie Walz, Tobias Gruber, Werner Ritter, Klaus Dietmayer |
Abstract | Gated imaging is an emerging sensor technology for self-driving cars that provides high-contrast images even under adverse weather influence. It has been shown that this technology can even generate high-fidelity dense depth maps with accuracy comparable to scanning LiDAR systems. In this work, we extend the recent Gated2Depth framework with aleatoric uncertainty providing an additional confidence measure for the depth estimates. This confidence can help to filter out uncertain estimations in regions without any illumination. Moreover, we show that training on dense depth maps generated by LiDAR depth completion algorithms can further improve the performance. |
Tasks | 3D Reconstruction, Depth Completion, Depth Estimation, Self-Driving Cars |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05122v1 |
PDF | https://arxiv.org/pdf/2003.05122v1.pdf |
PWC | https://paperswithcode.com/paper/uncertainty-depth-estimation-with-gated |
Repo | |
Framework | |
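The abstract above does not state which loss is used to learn aleatoric uncertainty. A common choice, assumed here purely for illustration, is a per-pixel heteroscedastic Gaussian negative log-likelihood in which the network predicts a log-variance alongside each depth value. The sketch below shows that loss and a hypothetical confidence filter; the function names and the threshold are not from the paper.

```python
import math


def gaussian_nll(depth_true, depth_pred, log_var):
    """Heteroscedastic Gaussian negative log-likelihood for one pixel (constant term dropped)."""
    return 0.5 * math.exp(-log_var) * (depth_true - depth_pred) ** 2 + 0.5 * log_var


def filter_by_confidence(depths, log_vars, max_std):
    """Keep depth estimates whose predicted standard deviation is at most max_std.

    Rejected estimates are replaced by None. The threshold is a hypothetical parameter.
    """
    return [
        depth if math.exp(0.5 * log_var) <= max_std else None
        for depth, log_var in zip(depths, log_vars)
    ]


# A perfect prediction with unit variance costs nothing; a 2 m error costs 2.0.
loss_exact = gaussian_nll(10.0, 10.0, 0.0)
loss_off = gaussian_nll(10.0, 12.0, 0.0)

# Second pixel has a predicted standard deviation of about 5 m, so it is dropped.
kept = filter_by_confidence([5.0, 40.0], [0.0, math.log(25.0)], max_std=2.0)
```

Predicting the log-variance rather than the variance keeps the loss numerically stable, since the variance stays positive without an explicit constraint.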
$\pi$VAE: Encoding stochastic process priors with variational autoencoders
Title | $\pi$VAE: Encoding stochastic process priors with variational autoencoders |
Authors | Swapnil Mishra, Seth Flaxman, Samir Bhatt |
Abstract | Stochastic processes provide a mathematically elegant way to model complex data. In theory, they provide flexible priors over function classes that can encode a wide range of interesting assumptions. In practice, however, efficient inference by optimisation or marginalisation is difficult, a problem further exacerbated with big data and high dimensional input spaces. We propose a novel variational autoencoder (VAE) called the prior encoding variational autoencoder ($\pi$VAE). The $\pi$VAE is finitely exchangeable and Kolmogorov consistent, and thus is a continuous stochastic process. We use $\pi$VAE to learn low dimensional embeddings of function classes. We show that our framework can accurately learn expressive function classes such as Gaussian processes, but also properties of functions to enable statistical inference (such as the integral of a log Gaussian process). For popular tasks, such as spatial interpolation, $\pi$VAE achieves state-of-the-art performance both in terms of accuracy and computational efficiency. Perhaps most usefully, we demonstrate that the low dimensional independently distributed latent space representation learnt provides an elegant and scalable means of performing Bayesian inference for stochastic processes within probabilistic programming languages such as Stan. |
Tasks | Bayesian Inference, Gaussian Processes, Probabilistic Programming |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06873v1 |
PDF | https://arxiv.org/pdf/2002.06873v1.pdf |
PWC | https://paperswithcode.com/paper/vae-encoding-stochastic-process-priors-with |
Repo | |
Framework | |
Combining Parametric Land Surface Models with Machine Learning
Title | Combining Parametric Land Surface Models with Machine Learning |
Authors | Craig Pelissier, Jonathan Frame, Grey Nearing |
Abstract | A hybrid machine learning and process-based-modeling (PBM) approach is proposed and evaluated at a handful of AmeriFlux sites to simulate the top-layer soil moisture state. The Hybrid-PBM (HPBM) employed here uses the Noah land-surface model integrated with Gaussian Processes. It is designed to correct the model only in climatological situations similar to the training data; otherwise it reverts to the PBM. In this way, our approach avoids bad predictions in scenarios where similar training data is not available and incorporates our physical understanding of the system. Here we assume an autoregressive model and obtain out-of-sample results with upwards of a 3-fold reduction in the RMSE using a one-year leave-one-out cross-validation at each of the selected sites. A path is outlined for using hybrid modeling to build global land-surface models with the potential to significantly outperform the current state-of-the-art. |
Tasks | Gaussian Processes |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.06141v1 |
PDF | https://arxiv.org/pdf/2002.06141v1.pdf |
PWC | https://paperswithcode.com/paper/combining-parametric-land-surface-models-with |
Repo | |
Framework | |
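One way to obtain the "revert to the PBM" behaviour described in the abstract above is to fit a zero-mean Gaussian process to the residuals of the process-based model, so that the learned correction decays to zero away from the training inputs. The sketch below assumes exactly that setup with a one-dimensional input and an RBF kernel. It is an illustration under those assumptions, not the authors' HPBM, and every name in it (`fit_residual_gp`, `hybrid_predict`, the toy `pbm`) is hypothetical.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_residual_gp(inputs, residuals, noise=1e-2, length_scale=1.0):
    """Return the posterior-mean function of a zero-mean GP fitted to PBM residuals."""
    gram = [
        [rbf(a, b, length_scale) + (noise if i == j else 0.0) for j, b in enumerate(inputs)]
        for i, a in enumerate(inputs)
    ]
    weights = solve(gram, residuals)

    def correction(x):
        return sum(w * rbf(x, xi, length_scale) for w, xi in zip(weights, inputs))

    return correction


def hybrid_predict(pbm, correction, x):
    """Hybrid prediction: process-based model output plus the learned correction."""
    return pbm(x) + correction(x)


def pbm(x):
    """Toy stand-in for a process-based model; always predicts 0.2."""
    return 0.2


correction = fit_residual_gp([0.0, 0.5, 1.0], [0.1, 0.1, 0.1])
near = hybrid_predict(pbm, correction, 0.5)    # close to training data: corrected
far = hybrid_predict(pbm, correction, 100.0)   # far from training data: plain PBM
```

Because the kernel vanishes far from the training inputs, `far` equals the PBM output, while `near` is shifted by roughly the observed residual.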
Safe Wasserstein Constrained Deep Q-Learning
Title | Safe Wasserstein Constrained Deep Q-Learning |
Authors | Aaron Kandel, Scott J. Moura |
Abstract | This paper presents a distributionally robust Q-Learning algorithm (DrQ) which leverages Wasserstein ambiguity sets to provide probabilistic out-of-sample safety guarantees during online learning. First, we follow past work by separating the constraint functions from the principal objective to create a hierarchy of machines within the constrained Markov decision process (CMDP). DrQ works within this framework by augmenting constraint costs with tightening offset variables obtained through Wasserstein distributionally robust optimization (DRO). These offset variables correspond to worst-case distributions of modeling error characterized by the TD-errors of the constraint Q-functions. This overall procedure allows us to safely approach the nominal constraint boundaries with strong probabilistic out-of-sample safety guarantees. Using a case study of safe lithium-ion battery fast charging, we demonstrate dramatic improvements in safety and performance relative to a conventional DQN. |
Tasks | Q-Learning |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.03016v1 |
PDF | https://arxiv.org/pdf/2002.03016v1.pdf |
PWC | https://paperswithcode.com/paper/safe-wasserstein-constrained-deep-q-learning |
Repo | |
Framework | |
Data integration and prediction models of photovoltaic production from Brazilian northeastern
Title | Data integration and prediction models of photovoltaic production from Brazilian northeastern |
Authors | Hugo Abreu Mendes, Henrique Ferreira Nunes, Manoel da Nobrega Marinho, Paulo Salgado Gomes de Mattos Neto |
Abstract | All productive branches of society need an estimate to be able to control their expenses well. In the energy business, electric utilities use this information to control the power flow in the grid. For better energy production estimation of photovoltaic systems, it is necessary to combine multiple geospatial and meteorological variables. This work proposes the creation of a satellite data integration platform, with production estimation models, base station measurements and actual production capacity. This work presents statistical, probabilistic and artificial intelligence models that generate spatial and temporal production estimates that could improve production gains as well as facilitate the monitoring and supervision of new enterprises. |
Tasks | |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.10866v2 |
PDF | https://arxiv.org/pdf/2001.10866v2.pdf |
PWC | https://paperswithcode.com/paper/data-integration-and-prediction-models-of |
Repo | |
Framework | |
Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
Title | Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds |
Authors | Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool |
Abstract | Humans can robustly recognize and localize objects by integrating visual and auditory cues. While machines are able to do the same now with images, less work has been done with sounds. This work develops an approach for dense semantic labelling of sound-making objects, purely based on binaural sounds. We propose a novel sensor setup and record a new audio-visual dataset of street scenes with eight professional binaural microphones and a 360-degree camera. The co-existence of visual and audio cues is leveraged for supervision transfer. In particular, we employ a cross-modal distillation framework that consists of a vision 'teacher' method and a sound 'student' method; the student method is trained to generate the same results as the teacher method. This way, the auditory system can be trained without using human annotations. We also propose two auxiliary tasks, namely a) a novel task on Spatial Sound Super-resolution to increase the spatial resolution of sounds, and b) dense depth prediction of the scene. We then formulate the three tasks into one end-to-end trainable multi-tasking network aiming to boost the overall performance. Experimental results on the dataset show that 1) our method achieves promising results for semantic prediction and the two auxiliary tasks; 2) the three tasks are mutually beneficial, as training them together achieves the best performance; and 3) the number and orientations of microphones are both important. The data and code will be released to facilitate the research in this new direction. |
Tasks | Depth Estimation, Super-Resolution |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04210v1 |
PDF | https://arxiv.org/pdf/2003.04210v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-object-prediction-and-spatial-sound |
Repo | |
Framework | |
Harnessing Multi-View Perspective of Light Fields for Low-Light Imaging
Title | Harnessing Multi-View Perspective of Light Fields for Low-Light Imaging |
Authors | Mohit Lamba, Kranthi Kumar, Kaushik Mitra |
Abstract | Light Field (LF) offers unique advantages such as post-capture refocusing and depth estimation, but low-light conditions limit these capabilities. To restore low-light LFs we should harness the geometric cues present in different LF views, which is not possible using single-frame low-light enhancement techniques. We, therefore, propose a deep neural network for Low-Light Light Field (L3F) restoration, which we refer to as L3Fnet. The proposed L3Fnet not only performs the necessary visual enhancement of each LF view but also preserves the epipolar geometry across views. We achieve this by adopting a two-stage architecture for L3Fnet. Stage-I looks at all the LF views to encode the LF geometry. This encoded information is then used in Stage-II to reconstruct each LF view. To facilitate learning-based techniques for low-light LF imaging, we collected a comprehensive LF dataset of various scenes. For each scene, we captured four LFs, one with near-optimal exposure and ISO settings and the others at different levels of low-light conditions varying from low to extreme low-light settings. The effectiveness of the proposed L3Fnet is supported by both visual and numerical comparisons on this dataset. To further analyze the performance of low-light reconstruction methods, we also propose an L3F-wild dataset that contains LFs captured late at night with almost zero lux values. No ground truth is available in this dataset. To perform well on the L3F-wild dataset, any method must adapt to the light level of the captured scene. To do this we propose a novel pre-processing block that makes L3Fnet robust to various degrees of low-light conditions. Lastly, we show that L3Fnet can also be used for low-light enhancement of single-frame images, despite it being engineered for LF data. We do so by converting the single-frame DSLR image into a form suitable for L3Fnet, which we call pseudo-LF. |
Tasks | Depth Estimation |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02438v1 |
PDF | https://arxiv.org/pdf/2003.02438v1.pdf |
PWC | https://paperswithcode.com/paper/harnessing-multi-view-perspective-of-light |
Repo | |
Framework | |
Identifying Table Structure in Documents using Conditional Generative Adversarial Networks
Title | Identifying Table Structure in Documents using Conditional Generative Adversarial Networks |
Authors | Nataliya Le Vine, Claus Horn, Matthew Zeigenfuse, Mark Rowan |
Abstract | In many industries, as well as in academic research, information is primarily transmitted in the form of unstructured documents (this article, for example). Hierarchically-related data is rendered as tables, and extracting information from tables in such documents presents a significant challenge. Many existing methods take a bottom-up approach, first integrating lines into cells, then cells into rows or columns, and finally inferring a structure from the resulting 2-D layout. But such approaches neglect the available prior information relating to table structure, namely that the table is merely an arbitrary representation of a latent logical structure. We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised 'skeleton' table form denoting approximate row and column borders without table content, then deriving latent table structure using xy-cut projection and Genetic Algorithm optimisation. The approach is easily adaptable to different table configurations and requires small data set sizes for training. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.05853v1 |
PDF | https://arxiv.org/pdf/2001.05853v1.pdf |
PWC | https://paperswithcode.com/paper/identifying-table-structure-in-documents |
Repo | |
Framework | |
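The projection step behind an xy-cut, as named in the abstract above, can be sketched on a toy binary grid: sum the "ink" along each row and each column, then split wherever the profile is empty. This is a generic illustration only. It does not reproduce the paper's skeleton images or its Genetic Algorithm stage, and the function names and sample grid are hypothetical.

```python
def row_profile(image):
    """Number of ink pixels in each row of a binary image (list of lists of 0/1)."""
    return [sum(row) for row in image]


def column_profile(image):
    """Number of ink pixels in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def segments(profile):
    """Return half-open (start, end) runs where the projection profile is non-zero."""
    runs, start = [], None
    for index, value in enumerate(profile):
        if value and start is None:
            start = index
        elif not value and start is not None:
            runs.append((start, index))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Toy 4x5 "page": two text rows separated by a blank row, two columns separated by a blank column.
page = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
]

row_runs = segments(row_profile(page))
col_runs = segments(column_profile(page))
```

The gaps between consecutive runs are the candidate cut positions. A full xy-cut would apply the same split recursively inside each resulting block.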
Meta-SVDD: Probabilistic Meta-Learning for One-Class Classification in Cancer Histology Images
Title | Meta-SVDD: Probabilistic Meta-Learning for One-Class Classification in Cancer Histology Images |
Authors | Jevgenij Gamper, Brandon Chan, Yee Wah Tsang, David Snead, Nasir Rajpoot |
Abstract | To train a robust deep learning model, one usually needs a balanced set of categories in the training data. The data acquired in a medical domain, however, frequently contains an abundance of healthy patients, versus a small variety of positive, abnormal cases. Moreover, the annotation of a positive sample requires time-consuming input from medical domain experts. This scenario suggests promise for one-class classification approaches. In this work we propose a general one-class classification model for histology that is meta-trained on multiple histology datasets simultaneously, and can be applied to new tasks without expensive re-training. This model could be easily used by pathology domain experts, and potentially be used for screening purposes. |
Tasks | Meta-Learning |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03109v1 |
PDF | https://arxiv.org/pdf/2003.03109v1.pdf |
PWC | https://paperswithcode.com/paper/meta-svdd-probabilistic-meta-learning-for-one |
Repo | |
Framework | |