Paper Group ANR 1166
Machine Learning using the Variational Predictive Information Bottleneck with a Validation Set. Variational Bayesian inference of hidden stochastic processes with unknown parameters. Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network. Agile Domain Adaptation. …
Machine Learning using the Variational Predictive Information Bottleneck with a Validation Set
Title | Machine Learning using the Variational Predictive Information Bottleneck with a Validation Set |
Authors | Sayandev Mukherjee |
Abstract | Zellner (1988) modeled statistical inference in terms of information processing and postulated the Information Conservation Principle (ICP) between the input and output of the information processing block, showing that this yielded Bayesian inference as the optimum information processing rule. Recently, Alemi (2019) reviewed Zellner’s work in the context of machine learning and showed that the ICP could be seen as a special case of a more general optimum information processing criterion, namely the Predictive Information Bottleneck Objective. However, Alemi modeled the model training step in machine learning as using training and test data sets only, and did not account for the use of a validation data set during training. The present note attempts to extend Alemi’s information processing formulation of machine learning, and the predictive information bottleneck objective for model training, to the widely used scenario where training utilizes not only a training data set but also a validation data set. |
Tasks | Bayesian Inference |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02210v2 |
https://arxiv.org/pdf/1911.02210v2.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-using-the-variational |
Repo | |
Framework | |
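As a schematic of the objective the abstract invokes, in our own paraphrase rather than the paper's notation: a stochastic representation $\theta$ of the training data is chosen to be maximally predictive of held-out data while compressing what it retains about the training data,

$$\max_{p(\theta \mid D_{\mathrm{train}})} \; I(\theta;\, D_{\mathrm{test}}) - \beta\, I(\theta;\, D_{\mathrm{train}}),$$

typically relaxed to a tractable variational bound of the form $\mathbb{E}\big[\log q(D_{\mathrm{test}} \mid \theta)\big] - \beta\,\mathrm{KL}\big(p(\theta \mid D_{\mathrm{train}}) \,\|\, q(\theta)\big)$. The note's extension concerns how a separate validation set enters this trade-off.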
Variational Bayesian inference of hidden stochastic processes with unknown parameters
Title | Variational Bayesian inference of hidden stochastic processes with unknown parameters |
Authors | Komlan Atitey, Pavel Loskot, Lyudmila Mihaylova |
Abstract | Estimating hidden processes from non-linear noisy observations is particularly difficult when the parameters of these processes are not known. This paper adopts a machine learning approach to devise variational Bayesian inference for such scenarios. In particular, a random process generated by the autoregressive moving average (ARMA) linear model is inferred from non-linear noisy observations. The posterior distribution of hidden states is approximated by a set of weighted particles generated by the sequential Monte Carlo (SMC) algorithm involving sequential importance sampling with resampling (SISR). Numerical efficiency and estimation accuracy of the proposed inference method are evaluated by computer simulations. Furthermore, the proposed inference method is demonstrated on the practical problem of estimating missing values in gene expression time series, assuming a vector autoregressive (VAR) data model. |
Tasks | Bayesian Inference, Time Series |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00757v1 |
https://arxiv.org/pdf/1911.00757v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-bayesian-inference-of-hidden |
Repo | |
Framework | |
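To make the SISR machinery concrete, here is a minimal bootstrap particle filter on a toy model: an AR(1) hidden state (a special case of ARMA) observed through a tanh nonlinearity plus Gaussian noise. The model, the nonlinearity, and all constants are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hidden process: AR(1) state observed through a nonlinearity plus noise.
a, q, r, T, N = 0.9, 0.5, 0.2, 100, 1000  # AR coeff, state var, obs var, steps, particles
x, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
    y[t] = np.tanh(x[t]) + rng.normal(scale=np.sqrt(r))

# Bootstrap SISR: propagate, weight by likelihood, resample at every step.
particles = rng.normal(size=N)
estimates = np.zeros(T)
for t in range(1, T):
    particles = a * particles + rng.normal(scale=np.sqrt(q), size=N)  # proposal = prior
    logw = -0.5 * (y[t] - np.tanh(particles)) ** 2 / r                # Gaussian log-likelihood
    w = np.exp(logw - logw.max())
    w /= w.sum()
    estimates[t] = np.sum(w * particles)                              # posterior-mean estimate
    particles = particles[rng.choice(N, size=N, p=w)]                 # multinomial resampling
```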
Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network
Title | Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network |
Authors | Pengfei Zhou, Tianyi Li, Pan Zhang |
Abstract | We perform theoretical and algorithmic studies of the problem of clustering and semi-supervised classification on graphs with both pairwise relational information and single-point feature information, upon a joint stochastic block model for generating synthetic graphs with both edges and node features. An asymptotically exact analysis based on Bayesian inference of the underlying model is conducted, using the cavity method in statistical physics. Theoretically, we identify a phase transition of the generative model, which puts fundamental limits on the ability of all possible algorithms in the clustering task of the underlying model. Algorithmically, we propose a belief propagation algorithm that is asymptotically optimal on the generative model, and can be further extended to a belief propagation graph convolution neural network (BPGCN) for semi-supervised classification on graphs. For the first time, well-controlled benchmark datasets with asymptotically exact properties and optimal solutions can be produced for the evaluation of graph convolution neural networks, and for the theoretical understanding of their strengths and weaknesses. In particular, on these synthetic benchmark networks we observe that existing graph convolution neural networks are subject to a sparsity issue and an overfitting issue in practice, both of which are successfully overcome by our BPGCN. Moreover, when combined with classic neural network methods, BPGCN yields extraordinary classification performance on some real-world datasets that has never been achieved before. |
Tasks | Bayesian Inference |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00197v2 |
https://arxiv.org/pdf/1911.00197v2.pdf | |
PWC | https://paperswithcode.com/paper/phase-transitions-and-optimal-algorithms-in-1 |
Repo | |
Framework | |
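A stripped-down sketch of the belief propagation component, on a symmetric two-group stochastic block model using edge information only. It omits the paper's node features and the non-edge "external field" term, so it illustrates the message update rather than reproducing BPGCN.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)
n, cin, cout = 200, 8.0, 2.0
G = nx.stochastic_block_model([n // 2, n // 2],
                              [[cin / n, cout / n], [cout / n, cin / n]], seed=1)
C = np.array([[cin, cout], [cout, cin]])          # group affinity matrix

msgs = {}
for u, v in G.edges:                              # random initial messages
    msgs[(u, v)] = rng.dirichlet([1.0, 1.0])
    msgs[(v, u)] = rng.dirichlet([1.0, 1.0])

for _ in range(30):                               # synchronous BP sweeps
    new = {}
    for i, j in msgs:
        prod = np.ones(2)
        for k in G.neighbors(i):
            if k != j:
                prod *= C @ msgs[(k, i)]          # marginalize over neighbor's label
        new[(i, j)] = prod / prod.sum()
    msgs = new

beliefs = {}
for i in G.nodes:                                 # node marginals from incoming messages
    b = np.ones(2)
    for k in G.neighbors(i):
        b *= C @ msgs[(k, i)]
    beliefs[i] = b / b.sum()
labels = {i: int(b.argmax()) for i, b in beliefs.items()}  # recovered up to permutation
```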
Agile Domain Adaptation
Title | Agile Domain Adaptation |
Authors | Jingjing Li, Mengmeng Jing, Yue Xie, Ke Lu, Zi Huang |
Abstract | Domain adaptation investigates the problem of leveraging knowledge from a well-labeled source domain to an unlabeled target domain, where the two domains are drawn from different data distributions. Because of the distribution shifts, different target samples have distinct degrees of difficulty in adaptation. However, existing domain adaptation approaches overwhelmingly neglect these degrees of difficulty and deploy exactly the same framework for all of the target samples. Generally, a simple or shallow framework is fast but rough. A sophisticated or deep framework, on the contrary, is accurate but slow. In this paper, we aim to challenge the fundamental contradiction between accuracy and speed in domain adaptation tasks. We propose a novel approach, named {\it agile domain adaptation}, which agilely applies the optimal framework to each target sample, classifying target samples according to their adaptation difficulty. Specifically, we propose a paradigm which performs several early detections before the final classification. If a sample can be classified at one of the early stages with enough confidence, the sample exits without the subsequent processing. Notably, the proposed method can significantly reduce the running cost of domain adaptation approaches, which extends the application scenarios of domain adaptation even to mobile devices and real-time systems. Extensive experiments on two open benchmarks verify the effectiveness and efficiency of the proposed method. |
Tasks | Domain Adaptation |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.04978v1 |
https://arxiv.org/pdf/1907.04978v1.pdf | |
PWC | https://paperswithcode.com/paper/agile-domain-adaptation |
Repo | |
Framework | |
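The early-exit paradigm lends itself to a short sketch: a cascade of classifiers of increasing cost, in which a target sample exits as soon as some stage is confident enough. The stage models, the confidence measure, and the threshold are placeholder assumptions, not the paper's design.

```python
import numpy as np

def agile_predict(stages, X, threshold=0.9):
    """stages: fitted classifiers ordered from fast/shallow to slow/deep."""
    preds = np.full(X.shape[0], -1)
    pending = np.arange(X.shape[0])
    for clf in stages:
        if pending.size == 0:
            break
        proba = clf.predict_proba(X[pending])
        confident = proba.max(axis=1) >= threshold          # early detection
        preds[pending[confident]] = clf.classes_[proba[confident].argmax(axis=1)]
        pending = pending[~confident]                       # the rest moves on
    if pending.size:                                        # final classification
        preds[pending] = stages[-1].predict(X[pending])
    return preds
```

With most easy samples exiting at the cheap stages, the expensive models only ever see the hard residue, which is where the running-cost savings come from.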
Strong homotopy of digitally continuous functions
Title | Strong homotopy of digitally continuous functions |
Authors | P. Christopher Staecker |
Abstract | We introduce a new type of homotopy relation for digitally continuous functions which we call "strong homotopy." Both digital homotopy and strong homotopy are natural digitizations of classical topological homotopy: the difference between them is analogous to the difference between digital 4-adjacency and 8-adjacency in the plane. We explore basic properties of strong homotopy, and give some equivalent characterizations. In particular we show that strong homotopy is related to "punctuated homotopy," in which the function changes by only one point in each homotopy time step. We also show that strongly homotopic maps always have the same induced homomorphisms in the digital homology theory. This is not generally true for digitally homotopic maps, though we do show that it is true for any homotopic self-maps on the digital cycle $C_n$ with $n\ge 4$. We also define and consider strong homotopy equivalence of digital images. Using some computer assistance, we produce a catalog of all small digital images up to strong homotopy equivalence. We also briefly consider pointed strong homotopy equivalence, and give an example of a pointed contractible image which is not pointed strongly contractible. |
Tasks | |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00706v1 |
http://arxiv.org/pdf/1903.00706v1.pdf | |
PWC | https://paperswithcode.com/paper/strong-homotopy-of-digitally-continuous |
Repo | |
Framework | |
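The "changes by only one point in each homotopy time step" characterization invites a tiny check. Representing each stage of a homotopy as a dict from image points to their images is our own convention for illustration.

```python
def is_punctuated(stages):
    """True if consecutive stages differ at no more than one point."""
    for f, g in zip(stages, stages[1:]):
        if sum(f[p] != g[p] for p in f) > 1:
            return False
    return True

# Example: a map on three points whose value slides one point per step.
f0 = {0: (0, 0), 1: (1, 0), 2: (2, 0)}
f1 = {0: (0, 0), 1: (1, 1), 2: (2, 0)}   # only point 1 moved
assert is_punctuated([f0, f1])
```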
Jointly Discriminative and Generative Recurrent Neural Networks for Learning from fMRI
Title | Jointly Discriminative and Generative Recurrent Neural Networks for Learning from fMRI |
Authors | Nicha C. Dvornek, Xiaoxiao Li, Juntang Zhuang, James S. Duncan |
Abstract | Recurrent neural networks (RNNs) were designed for dealing with time-series data and have recently been used for creating predictive models from functional magnetic resonance imaging (fMRI) data. However, gathering large fMRI datasets for learning is a difficult task. Furthermore, network interpretability is unclear. To address these issues, we utilize multitask learning and design a novel RNN-based model that learns to discriminate between classes while simultaneously learning to generate the fMRI time-series data. Employing the long short-term memory (LSTM) structure, we develop a discriminative model based on the hidden state and a generative model based on the cell state. The addition of the generative model constrains the network to learn functional communities, represented by the LSTM nodes, that are both consistent with the data generation and useful for the classification task. We apply our approach to the classification of subjects with autism vs. healthy controls using several datasets from the Autism Brain Imaging Data Exchange. Experiments show that our jointly discriminative and generative model improves classification learning while also producing robust and meaningful functional communities for better model understanding. |
Tasks | Time Series |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06950v1 |
https://arxiv.org/pdf/1910.06950v1.pdf | |
PWC | https://paperswithcode.com/paper/jointly-discriminative-and-generative |
Repo | |
Framework | |
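A hedged sketch of the joint model: one LSTM, a classification head on the hidden state and a generative head on the cell state that predicts the next fMRI time point. All sizes and the unweighted loss sum are illustrative assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class JointLSTM(nn.Module):
    def __init__(self, n_regions=90, hidden=32, n_classes=2):
        super().__init__()
        self.hidden = hidden
        self.cell = nn.LSTMCell(n_regions, hidden)
        self.classify = nn.Linear(hidden, n_classes)  # discriminative head (hidden state)
        self.generate = nn.Linear(hidden, n_regions)  # generative head (cell state)

    def forward(self, x):                             # x: (batch, time, regions)
        h = x.new_zeros(x.size(0), self.hidden)
        c = x.new_zeros(x.size(0), self.hidden)
        recon = []
        for t in range(x.size(1)):
            h, c = self.cell(x[:, t], (h, c))
            recon.append(self.generate(c))            # predict the next time point
        return self.classify(h), torch.stack(recon[:-1], dim=1)

x = torch.randn(4, 50, 90)                            # 4 subjects, 50 TRs, 90 ROIs
logits, recon = JointLSTM()(x)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (4,))) \
     + nn.functional.mse_loss(recon, x[:, 1:])        # joint objective
```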
Deep learning on edge: extracting field boundaries from satellite images with a convolutional neural network
Title | Deep learning on edge: extracting field boundaries from satellite images with a convolutional neural network |
Authors | François Waldner, Foivos I. Diakogiannis |
Abstract | Applications of digital agricultural services often require either farmers or their advisers to provide digital records of their field boundaries. Automatic extraction of field boundaries from satellite imagery would reduce the reliance on manual input of these records, which is time-consuming and error-prone, and would underpin the provision of remote products and services. The lack of current field boundary data sets seems to indicate low uptake of existing methods, presumably because of expensive image preprocessing requirements and local, often arbitrary, tuning. In this paper, we address the problem of field boundary extraction from satellite images as a multitask semantic segmentation problem. We used ResUNet-a, a deep convolutional neural network with a fully connected UNet backbone that features dilated convolutions and conditioned inference, to assign three labels to each pixel: 1) the probability of belonging to a field; 2) the probability of being part of a boundary; and 3) the distance to the closest boundary. These labels can then be combined to obtain closed field boundaries. Using a single composite image from Sentinel-2, the model was highly accurate in mapping field extent, field boundaries, and, consequently, individual fields. Replacing the monthly composite with a single-date image close to the compositing period only marginally decreased accuracy. We then showed in a series of experiments that our model generalised well across resolutions, sensors, space and time without recalibration. Building consensus by averaging model predictions from at least four images acquired across the season is the key to coping with the temporal variations of accuracy. By minimising image preprocessing requirements and replacing local arbitrary decisions by data-driven ones, our approach is expected to facilitate the extraction of individual crop fields at scale. |
Tasks | Semantic Segmentation |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1910.12023v2 |
https://arxiv.org/pdf/1910.12023v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-on-edge-extracting-field |
Repo | |
Framework | |
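The three-label output design is easy to sketch; the backbone below is a placeholder convolutional stack standing in for ResUNet-a, and only the three heads mirror the abstract.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
heads = nn.ModuleDict({
    "extent":   nn.Conv2d(16, 1, 1),   # P(pixel belongs to a field)
    "boundary": nn.Conv2d(16, 1, 1),   # P(pixel is part of a boundary)
    "distance": nn.Conv2d(16, 1, 1),   # distance to the closest boundary
})

x = torch.randn(2, 4, 128, 128)        # e.g. 4 Sentinel-2 bands
f = backbone(x)
extent   = torch.sigmoid(heads["extent"](f))
boundary = torch.sigmoid(heads["boundary"](f))
distance = torch.relu(heads["distance"](f))
```

Closed individual fields could then be obtained by, for instance, watershed segmentation seeded from the extent map and constrained by the boundary and distance maps; the abstract only states that the three outputs are combined, so this post-processing choice is ours.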
Computer Aided Detection of Deep Inferior Epigastric Perforators in Computed Tomography Angiography scans
Title | Computer Aided Detection of Deep Inferior Epigastric Perforators in Computed Tomography Angiography scans |
Authors | Ricardo J. Araújo, Vera Garrido, Catarina A. Baraças, Maria A. Vasconcelos, Carlos Mavioso, João C. Anacleto, Maria J. Cardoso, Hélder P. Oliveira |
Abstract | The deep inferior epigastric artery perforator (DIEAP) flap is the most common free flap used for breast reconstruction after a mastectomy. It makes use of the skin and fat of the lower abdomen to build a new breast mound, either at the same time as the mastectomy or in a second surgery. This operation requires preoperative imaging studies to evaluate the branches - the perforators - that irrigate the tissue that will be used to reconstruct the breast mound. These branches will support tissue viability after the microsurgical ligation of the inferior epigastric vessels to the receptor vessels in the thorax. Usually, through a Computed Tomography Angiography (CTA), each perforator, its diameter, and its direction are manually identified by the imaging team, who subsequently draw a map for the identification of the best vascular support for the reconstruction. In the current work we propose a semi-automatic methodology that aims at reducing the time and subjectivity inherent to the manual annotation. In 21 CTAs from patients proposed for breast reconstruction with DIEAP flaps, the subcutaneous region of each perforator was extracted by means of a tracking procedure, whereas the intramuscular portion was detected through a minimum-cost approach. Both were subsequently compared with the radiologist's manual annotation. Results showed that the semi-automatic procedure was able to correctly detect the course of the DIEAPs with minimal error (average errors of 0.64 mm and 0.50 mm for the extraction of the subcutaneous and intramuscular paths, respectively). This objective methodology is a promising tool for the automatic detection of perforators in CTA and can help spare human resources and reduce subjectivity in the aforementioned task. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10354v1 |
https://arxiv.org/pdf/1907.10354v1.pdf | |
PWC | https://paperswithcode.com/paper/computer-aided-detection-of-deep-inferior |
Repo | |
Framework | |
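The intramuscular minimum-cost step suggests a short sketch with scikit-image's minimum-cost path routine; the cost construction (inverse intensity, so bright vessels are cheap to traverse) and the seed points are our assumptions.

```python
import numpy as np
from skimage.graph import route_through_array

intensity = np.random.rand(64, 64)            # stand-in for a CTA slice
cost = 1.0 / (intensity + 1e-6)               # bright (vessel) pixels are cheap
start, end = (5, 5), (60, 58)                 # e.g. manually placed seed points
path, total_cost = route_through_array(cost, start, end,
                                       fully_connected=True, geometric=True)
path = np.array(path)                         # (n, 2) pixel coordinates of the course
```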
A Leisurely Look at Versions and Variants of the Cross Validation Estimator
Title | A Leisurely Look at Versions and Variants of the Cross Validation Estimator |
Authors | Waleed A. Yousef |
Abstract | Many versions of cross-validation (CV) exist in the literature, and each version has several variants. All are used interchangeably by many practitioners, yet without explanation of the connections or differences among them. This article has three contributions. First, it starts with a mathematical formalization of these different versions and variants that estimate the error rate and the Area Under the ROC Curve (AUC) of a classification rule, to show the connections and differences among them. Second, we prove some of their properties and prove that many variants are either redundant or “not smooth”. Hence, we suggest abandoning all redundant versions and variants and keeping only the leave-one-out, the $K$-fold, and the repeated $K$-fold. We show that the latter is the only one of the three versions that is “smooth” and hence looks mathematically like estimating the mean performance of the classification rules. However, empirically, owing to the known phenomenon of “weak correlation”, which we explain mathematically and experimentally, it estimates both the conditional and the mean performance with almost the same accuracy. Third, we conclude the article by suggesting two research points that may answer the remaining question of whether we can come up with a finalist among the three estimators: (1) a comparative study, much more comprehensive than those available in the literature (which conclude no overall winner), is needed and should consider a wide range of distributions, datasets, and classifiers, including complex ones obtained via the recent deep learning approach; (2) we sketch the path toward a rigorous method for estimating the variance of the only “smooth” version, repeated $K$-fold CV, rather than the ad-hoc methods available in the literature that ignore the covariance structure among the folds of CV. |
Tasks | |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13413v1 |
https://arxiv.org/pdf/1907.13413v1.pdf | |
PWC | https://paperswithcode.com/paper/a-leisurely-look-at-versions-and-variants-of |
Repo | |
Framework | |
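The three versions the article recommends keeping are all available in scikit-learn, which makes them easy to compare side by side; the dataset and classifier below are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (KFold, LeaveOneOut, RepeatedKFold,
                                     cross_val_score)

X = np.random.randn(100, 5)
y = np.random.randint(0, 2, 100)
clf = LogisticRegression()
for cv in (LeaveOneOut(),
           KFold(n_splits=5, shuffle=True, random_state=0),
           RepeatedKFold(n_splits=5, n_repeats=20, random_state=0)):
    scores = cross_val_score(clf, X, y, cv=cv)
    print(type(cv).__name__, round(scores.mean(), 3))
```

Averaging over repeated random fold assignments is, loosely, what makes the repeated $K$-fold estimator behave like the “smooth” version the abstract singles out.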
Inter and Intra Document Attention for Depression Risk Assessment
Title | Inter and Intra Document Attention for Depression Risk Assessment |
Authors | Diego Maupomé, Marc Queudot, Marie-Jean Meurs |
Abstract | We take interest in the early assessment of risk for depression in social media users. We focus on the eRisk 2018 dataset, which represents users as sequences of their written online contributions. We implement four RNN-based systems to classify the users. We explore several aggregation methods to combine predictions on individual posts. Our best model reads through all writings of a user in parallel but uses an attention mechanism to prioritize the most important ones at each timestep. |
Tasks | |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00462v1 |
https://arxiv.org/pdf/1907.00462v1.pdf | |
PWC | https://paperswithcode.com/paper/inter-and-intra-document-attention-for |
Repo | |
Framework | |
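A minimal sketch of attention-based aggregation over a user's posts: each post is encoded by a GRU and a learned attention weight decides how much each post contributes to the user-level decision. This pools once over posts rather than re-weighting at every timestep as the abstract describes, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class AttentiveUser(nn.Module):
    def __init__(self, emb=100, hid=64):
        super().__init__()
        self.post_rnn = nn.GRU(emb, hid, batch_first=True)
        self.score = nn.Linear(hid, 1)        # one attention score per post
        self.out = nn.Linear(hid, 2)          # at risk / not at risk

    def forward(self, posts):                 # posts: (n_posts, n_tokens, emb)
        _, h = self.post_rnn(posts)           # h: (1, n_posts, hid)
        h = h.squeeze(0)                      # one vector per post
        alpha = torch.softmax(self.score(h), dim=0)
        user = (alpha * h).sum(dim=0)         # attention-weighted pooling
        return self.out(user)

logits = AttentiveUser()(torch.randn(30, 20, 100))   # 30 posts, 20 tokens each
```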
End-to-end Projector Photometric Compensation
Title | End-to-end Projector Photometric Compensation |
Authors | Bingyao Huang, Haibin Ling |
Abstract | Projector photometric compensation aims to modify a projector input image such that it can compensate for disturbance from the appearance of the projection surface. In this paper, for the first time, we formulate the compensation problem as an end-to-end learning problem and propose a convolutional neural network, named CompenNet, to implicitly learn the complex compensation function. CompenNet consists of a UNet-like backbone network and an autoencoder subnet. This architecture encourages rich multi-level interactions between the camera-captured projection surface image and the input image, and thus captures both photometric and environment information of the projection surface. In addition, the visual details and interaction information are carried to deeper layers along the multi-level skip convolution layers. The architecture is of particular importance for the projector compensation task, for which only a small training dataset is allowed in practice. Another contribution we make is a novel evaluation benchmark, which is independent of system setup and thus quantitatively verifiable. Such a benchmark was not previously available, to the best of our knowledge, because conventional evaluation requires the hardware system to actually project the final results. Our key idea, motivated by our end-to-end problem formulation, is to use a reasonable surrogate to avoid such a projection process so as to be setup-independent. Our method is evaluated carefully on the benchmark, and the results show that our end-to-end learning solution outperforms the state of the art both qualitatively and quantitatively by a significant margin. |
Tasks | |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.04335v1 |
http://arxiv.org/pdf/1904.04335v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-projector-photometric-compensation |
Repo | |
Framework | |
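Schematically, the compensation setting can be sketched as a network that sees the camera-captured projection surface image together with the desired viewer-side image and outputs a compensated projector input; the tiny stack below is a placeholder, not CompenNet's UNet-plus-autoencoder architecture.

```python
import torch
import torch.nn as nn

class TinyCompen(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())

    def forward(self, x, s):                     # desired image, captured surface image
        return self.net(torch.cat([x, s], 1))    # compensated projector input

out = TinyCompen()(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```

Training pairs would plausibly come from projecting known images onto the surface and capturing the results; the paper's exact supervision scheme may differ.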
SAL: Sign Agnostic Learning of Shapes from Raw Data
Title | SAL: Sign Agnostic Learning of Shapes from Raw Data |
Authors | Matan Atzmon, Yaron Lipman |
Abstract | Recently, neural networks have been used as implicit representations for surface reconstruction, modelling, learning, and generation. So far, training neural networks to be implicit representations of surfaces required training data sampled from ground-truth signed implicit functions, such as signed distance or occupancy functions, which are notoriously hard to compute. In this paper we introduce Sign Agnostic Learning (SAL), a deep learning approach for learning implicit shape representations directly from raw, unsigned geometric data, such as point clouds and triangle soups. We have tested SAL on the challenging problem of surface reconstruction from an un-oriented point cloud, as well as end-to-end human shape space learning directly from a dataset of raw scans, and achieved state-of-the-art reconstructions compared to current approaches. We believe SAL opens the door to many geometric deep learning applications with real-world data, alleviating the usual painstaking, often manual, pre-processing. |
Tasks | |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10414v2 |
https://arxiv.org/pdf/1911.10414v2.pdf | |
PWC | https://paperswithcode.com/paper/sal-sign-agnostic-learning-of-shapes-from-raw |
Repo | |
Framework | |
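One simple member of the sign-agnostic family: regress the absolute value of the network output to the unsigned distance from the raw point cloud, so no ground-truth sign is ever needed. The MLP, the query sampling, and this specific loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                  nn.Linear(128, 128), nn.ReLU(),
                  nn.Linear(128, 1))                    # candidate implicit function

points = torch.randn(2048, 3)                           # raw, un-oriented point cloud
x = torch.randn(512, 3)                                 # query points near the data
h = torch.cdist(x, points).min(dim=1).values            # unsigned distance to the cloud
loss = (f(x).squeeze(-1).abs() - h).abs().mean()        # sign-agnostic regression
loss.backward()
```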
ProteinNet: a standardized data set for machine learning of protein structure
Title | ProteinNet: a standardized data set for machine learning of protein structure |
Authors | Mohammed AlQuraishi |
Abstract | Rapid progress in deep learning has spurred its application to bioinformatics problems including protein structure prediction and design. In classic machine learning problems like computer vision, progress has been driven by standardized data sets that facilitate fair assessment of new methods and lower the barrier to entry for non-domain experts. While data sets of protein sequence and structure exist, they lack certain components critical for machine learning, including high-quality multiple sequence alignments and insulated training / validation splits that account for deep but only weakly detectable homology across protein space. We have created the ProteinNet series of data sets to provide a standardized mechanism for training and assessing data-driven models of protein sequence-structure relationships. ProteinNet integrates sequence, structure, and evolutionary information in programmatically accessible file formats tailored for machine learning frameworks. Multiple sequence alignments of all structurally characterized proteins were created using substantial high-performance computing resources. Standardized data splits were also generated to emulate the difficulty of past CASP (Critical Assessment of protein Structure Prediction) experiments by resetting protein sequence and structure space to the historical states that preceded six prior CASPs. Utilizing sensitive evolution-based distance metrics to segregate distantly related proteins, we have additionally created validation sets distinct from the official CASP sets that faithfully mimic their difficulty. ProteinNet thus represents a comprehensive and accessible resource for training and assessing machine-learned models of protein structure. |
Tasks | Protein Secondary Structure Prediction |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00249v1 |
http://arxiv.org/pdf/1902.00249v1.pdf | |
PWC | https://paperswithcode.com/paper/proteinnet-a-standardized-data-set-for |
Repo | |
Framework | |
Kalman Filtering with Gaussian Processes Measurement Noise
Title | Kalman Filtering with Gaussian Processes Measurement Noise |
Authors | Vince Kurtz, Hai Lin |
Abstract | Real-world measurement noise in applications like robotics is often correlated in time, but we typically assume i.i.d. Gaussian noise for filtering. We propose general Gaussian Processes as a non-parametric model for correlated measurement noise that is flexible enough to accurately reflect correlation in time, yet simple enough to enable efficient computation. We show that this model accurately reflects the measurement noise resulting from vision-based Simultaneous Localization and Mapping (SLAM), and argue that it provides a flexible means of modeling measurement noise for a wide variety of sensor systems and perception algorithms. We then extend existing results for Kalman filtering with autoregressive processes to more general Gaussian Processes, and demonstrate the improved performance of our approach. |
Tasks | Gaussian Processes, Simultaneous Localization and Mapping |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10582v1 |
https://arxiv.org/pdf/1909.10582v1.pdf | |
PWC | https://paperswithcode.com/paper/kalman-filtering-with-gaussian-processes |
Repo | |
Framework | |
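For context, here is the classical starting point the paper generalizes: AR(1)-correlated measurement noise handled by augmenting the state, after which an ordinary Kalman filter applies. The paper replaces the AR(1) noise model with a general Gaussian process; all numbers below are toy values.

```python
import numpy as np

a, phi = 0.95, 0.8                 # state dynamics, noise autocorrelation
F = np.array([[a, 0.0],            # augmented state: [x_t, n_t]
              [0.0, phi]])
H = np.array([[1.0, 1.0]])         # measurement: y_t = x_t + n_t
Q = np.diag([0.1, 0.05])           # process and noise-driving covariances
R = np.array([[1e-6]])             # residual white noise, kept tiny

m, P = np.zeros(2), np.eye(2)
for y in np.random.randn(50):      # stand-in measurement stream
    m, P = F @ m, F @ P @ F.T + Q                  # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    m = m + K @ (np.array([y]) - H @ m)            # update
    P = (np.eye(2) - K @ H) @ P
```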
Rethinking Continual Learning for Autonomous Agents and Robots
Title | Rethinking Continual Learning for Autonomous Agents and Robots |
Authors | German I. Parisi, Christopher Kanan |
Abstract | Continual learning refers to the ability of a biological or artificial system to seamlessly learn from continuous streams of information while preventing catastrophic forgetting, i.e., a condition in which new incoming information strongly interferes with previously learned representations. Since it is unrealistic to provide artificial agents with all the necessary prior knowledge to effectively operate in real-world conditions, they must exhibit a rich set of learning capabilities enabling them to interact in complex environments with the aim of processing and making sense of continuous streams of (often uncertain) information. While the vast majority of continual learning models are designed to alleviate catastrophic forgetting on simplified classification tasks, here we focus on continual learning for autonomous agents and robots required to operate in much more challenging experimental settings. In particular, we discuss well-established biological learning factors such as developmental and curriculum learning, transfer learning, and intrinsic motivation, and their computational counterparts for modeling the progressive acquisition of increasingly complex knowledge and skills in a continual fashion. |
Tasks | Continual Learning, Transfer Learning |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01929v1 |
https://arxiv.org/pdf/1907.01929v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-continual-learning-for-autonomous |
Repo | |
Framework | |