Paper Group ANR 31
Ensemble learning with Conformal Predictors: Targeting credible predictions of conversion from Mild Cognitive Impairment to Alzheimer’s Disease. Development and Validation of a Deep Learning Algorithm for Improving Gleason Scoring of Prostate Cancer. Actigraphy-based Sleep/Wake Pattern Detection using Convolutional Neural Networks. Bounding the Err …
Ensemble learning with Conformal Predictors: Targeting credible predictions of conversion from Mild Cognitive Impairment to Alzheimer’s Disease
Title | Ensemble learning with Conformal Predictors: Targeting credible predictions of conversion from Mild Cognitive Impairment to Alzheimer’s Disease |
Authors | Telma Pereira, Sandra Cardoso, Dina Silva, Manuela Guerreiro, Alexandre de Mendonça, Sara C. Madeira |
Abstract | Most machine learning classifiers give accurate predictions for new examples, yet without indicating how trustworthy those predictions are. In the medical domain, this hampers their integration in decision support systems, which could be useful in clinical practice. We use a supervised learning approach that combines Ensemble learning with Conformal Predictors to predict conversion from Mild Cognitive Impairment to Alzheimer’s Disease. Our goal is to enhance the classification performance (Ensemble learning) and complement each prediction with a measure of credibility (Conformal Predictors). Our results showed the superiority of the proposed approach over a similar ensemble framework with standard classifiers. |
Tasks | |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01619v2 |
http://arxiv.org/pdf/1807.01619v2.pdf | |
PWC | https://paperswithcode.com/paper/ensemble-learning-with-conformal-predictors |
Repo | |
Framework | |
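The credibility measure mentioned in the abstract above comes from the standard conformal prediction recipe: per-class p-values computed from nonconformity scores. A minimal Python sketch, assuming the nonconformity scores themselves are produced elsewhere (e.g., by the paper's ensemble of classifiers); function names and the calibration-set setup are illustrative, not the authors' code:

```python
import numpy as np

def conformal_p_value(cal_scores, test_score):
    """p-value: fraction of calibration nonconformity scores at least as large
    as the test example's score (with the usual +1 smoothing)."""
    return (np.sum(cal_scores >= test_score) + 1) / (len(cal_scores) + 1)

def predict_with_credibility(cal_scores_by_class, test_scores_by_class):
    """cal_scores_by_class: dict label -> nonconformity scores of calibration examples.
       test_scores_by_class: dict label -> nonconformity of the test example when
       tentatively assigned that label."""
    p = {y: conformal_p_value(cal_scores_by_class[y], test_scores_by_class[y])
         for y in cal_scores_by_class}
    label = max(p, key=p.get)                       # class with the highest p-value
    credibility = p[label]                          # how typical the example looks for that class
    confidence = 1 - sorted(p.values())[-2] if len(p) > 1 else 1.0
    return label, credibility, confidence
```

A low credibility flags a prediction that should not be trusted even if the classifier itself is accurate on average, which is the point the abstract makes for clinical decision support.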
Development and Validation of a Deep Learning Algorithm for Improving Gleason Scoring of Prostate Cancer
Title | Development and Validation of a Deep Learning Algorithm for Improving Gleason Scoring of Prostate Cancer |
Authors | Kunal Nagpal, Davis Foote, Yun Liu, Po-Hsuan Chen, Ellery Wulczyn, Fraser Tan, Niels Olson, Jenny L. Smith, Arash Mohtashamian, James H. Wren, Greg S. Corrado, Robert MacDonald, Lily H. Peng, Mahul B. Amin, Andrew J. Evans, Ankur R. Sangoi, Craig H. Mermel, Jason D. Hipp, Martin C. Stumpe |
Abstract | For prostate cancer patients, the Gleason score is one of the most important prognostic factors, potentially determining treatment independent of the stage. However, Gleason scoring is based on subjective microscopic examination of tumor morphology and suffers from poor reproducibility. Here we present a deep learning system (DLS) for Gleason scoring whole-slide images of prostatectomies. Our system was developed using 112 million pathologist-annotated image patches from 1,226 slides, and evaluated on an independent validation dataset of 331 slides, where the reference standard was established by genitourinary specialist pathologists. On the validation dataset, the mean accuracy among 29 general pathologists was 0.61. The DLS achieved a significantly higher diagnostic accuracy of 0.70 (p=0.002) and trended towards better patient risk stratification in correlations to clinical follow-up data. Our approach could improve the accuracy of Gleason scoring and subsequent therapy decisions, particularly where specialist expertise is unavailable. The DLS also goes beyond the current Gleason system to more finely characterize and quantitate tumor morphology, providing opportunities for refinement of the Gleason system itself. |
Tasks | |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06497v1 |
http://arxiv.org/pdf/1811.06497v1.pdf | |
PWC | https://paperswithcode.com/paper/development-and-validation-of-a-deep-learning |
Repo | |
Framework | |
Actigraphy-based Sleep/Wake Pattern Detection using Convolutional Neural Networks
Title | Actigraphy-based Sleep/Wake Pattern Detection using Convolutional Neural Networks |
Authors | Lena Granovsky, Gabi Shalev, Nancy Yacovzada, Yotam Frank, Shai Fine |
Abstract | Common medical conditions are often associated with sleep abnormalities. Patients with medical disorders often suffer from poor sleep quality compared to healthy individuals, which in turn may worsen the symptoms of the disorder. Accurate detection of sleep/wake patterns is important in developing personalized digital markers, which can be used for objective measurements and efficient disease management. Big Data technologies and advanced analytics methods hold the promise to revolutionize clinical research processes, enabling the effective blending of digital data into clinical trials. Actigraphy, a non-invasive activity monitoring method, is widely used to detect and evaluate activities and movement disorders, and to assess sleep/wake behavior. In order to study the connection between sleep/wake patterns and cluster headache disorder, activity data was collected using a wearable device in the course of a clinical trial. This study presents two novel modeling schemes that utilize Deep Convolutional Neural Networks (CNN) to identify sleep/wake states. The proposed methods are a sequential CNN, reminiscent of the bi-directional CNN for slot filling, and a Multi-Task Learning (MTL) based model. Furthermore, we expand the standard “Sleep” and “Wake” activity state space by adding the “Falling asleep” and “Siesta” states. We show that the proposed methods provide promising results in accurate detection of the expanded sleep/wake states. Finally, we explore the relations between the detected sleep/wake patterns and the onset of cluster headache attacks, and present preliminary observations. |
Tasks | Multi-Task Learning, Sleep Quality, Slot Filling |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07945v1 |
http://arxiv.org/pdf/1802.07945v1.pdf | |
PWC | https://paperswithcode.com/paper/actigraphy-based-sleepwake-pattern-detection |
Repo | |
Framework | |
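To make the modeling setup concrete, here is a minimal PyTorch sketch of a 1D CNN over windows of actigraphy counts with the four-state output space named in the abstract ("Sleep", "Wake", "Falling asleep", "Siesta"). The window length, layer sizes, and class ordering are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class SleepWakeCNN(nn.Module):
    """Illustrative 1D CNN for per-window sleep/wake state classification."""
    def __init__(self, n_classes=4, window=240):       # e.g. 240 one-minute activity counts
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.head = nn.Linear(64 * (window // 4), n_classes)

    def forward(self, x):                               # x: (batch, 1, window)
        h = self.features(x)
        return self.head(h.flatten(1))

# usage sketch: logits = SleepWakeCNN()(torch.randn(8, 1, 240))
```

The sequential and MTL variants described in the abstract would add, respectively, context windows around each prediction and extra task heads on the shared features.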
Bounding the Error From Reference Set Kernel Maximum Mean Discrepancy
Title | Bounding the Error From Reference Set Kernel Maximum Mean Discrepancy |
Authors | Alexander Cloninger |
Abstract | In this paper, we bound the error induced by using a weighted skeletonization of two data sets for computing a two-sample test with kernel maximum mean discrepancy. The error is quantified in terms of the speed at which heat diffuses from those points to the rest of the data, as well as how flat the weights on the reference points are, and gives a non-asymptotic, non-probabilistic bound. The result ties into the problem of the eigenvector triple product, which appears in a number of important problems. The error bound also suggests an optimization scheme for choosing the best set of reference points and weights. The method is tested on several two-sample test examples. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04594v1 |
http://arxiv.org/pdf/1812.04594v1.pdf | |
PWC | https://paperswithcode.com/paper/bounding-the-error-from-reference-set-kernel |
Repo | |
Framework | |
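For readers unfamiliar with the quantities involved, here is a short NumPy sketch of the squared kernel MMD statistic, both in its standard biased form and in a weighted reference-set form in the spirit of the skeletonization the abstract describes. The RBF kernel choice, bandwidth, and function names are assumptions; the paper's bound concerns how much the second quantity can deviate from the first:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared kernel MMD between full samples X and Y."""
    return (rbf_kernel(X, X, sigma).mean() + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

def mmd2_reference(Rx, wx, Ry, wy, sigma=1.0):
    """Reference-set version: each sample is summarized by reference points R
    with nonnegative weights w summing to 1 (a weighted skeletonization)."""
    return (wx @ rbf_kernel(Rx, Rx, sigma) @ wx
            + wy @ rbf_kernel(Ry, Ry, sigma) @ wy
            - 2 * wx @ rbf_kernel(Rx, Ry, sigma) @ wy)
```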
Dynamical Component Analysis (DyCA): Dimensionality Reduction For High-Dimensional Deterministic Time-Series
Title | Dynamical Component Analysis (DyCA): Dimensionality Reduction For High-Dimensional Deterministic Time-Series |
Authors | Bastian Seifert, Katharina Korn, Steffen Hartmann, Christian Uhl |
Abstract | Multivariate signal processing is often based on dimensionality reduction techniques. We propose a new method, Dynamical Component Analysis (DyCA), leading to a classification of the underlying dynamics and - for a certain type of dynamics - to a signal subspace representing the dynamics of the data. In this paper the algorithm is derived, leading to a generalized eigenvalue problem of correlation matrices. The application of DyCA to high-dimensional chaotic signals is presented for both simulated data and real EEG data of epileptic seizures. |
Tasks | Dimensionality Reduction, EEG, Time Series |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10629v2 |
http://arxiv.org/pdf/1807.10629v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamical-component-analysis-dyca |
Repo | |
Framework | |
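To illustrate the generalized eigenvalue problem of correlation matrices mentioned in the abstract, here is a hedged Python sketch (NumPy/SciPy assumed available). The correlation matrices are built from the signal and its time derivative; the exact matrices and their roles follow the paper's derivation, so this particular construction is one plausible reading, not a verified reimplementation:

```python
import numpy as np
from scipy.linalg import eigh

def dyca_projection(q, dt, n_components=3):
    """Sketch of a DyCA-style projection.
    q: array (T, N) multivariate signal sampled with interval dt."""
    dq = np.gradient(q, dt, axis=0)            # time derivative of the signal
    T = q.shape[0]
    C0 = q.T @ q / T                           # <q q^T>
    C1 = dq.T @ q / T                          # <dq q^T>
    C2 = dq.T @ dq / T                         # <dq dq^T>, assumed positive definite here
    # Generalized eigenvalue problem built from the correlation matrices
    A = C1 @ np.linalg.solve(C0, C1.T)
    eigvals, eigvecs = eigh(A, C2)
    order = np.argsort(eigvals)[::-1]          # keep directions with largest eigenvalues
    U = eigvecs[:, order[:n_components]]
    return q @ U, eigvals[order]
```

The size of the eigenvalues indicates how well the corresponding directions are captured by linear dynamics, which is what enables the classification of the underlying dynamics described in the abstract.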
Instance-Optimality in the Noisy Value- and Comparison-Model — Accept, Accept, Strong Accept: Which Papers get in?
Title | Instance-Optimality in the Noisy Value- and Comparison-Model — Accept, Accept, Strong Accept: Which Papers get in? |
Authors | Vincent Cohen-Addad, Frederik Mallmann-Trenn, Claire Mathieu |
Abstract | Motivated by crowdsourced computation, peer-grading, and recommendation systems, Braverman, Mao and Weinberg [STOC’16] studied the \emph{query} and \emph{round} complexity of fundamental problems such as finding the maximum (\textsc{max}), finding all elements above a certain value (\textsc{threshold-$v$}) or computing the top-$k$ elements (\textsc{Top}-$k$) in a noisy environment. For example, consider the task of selecting papers for a conference. This task is challenging due to the crowdsourcing nature of peer reviews: the results of reviews are noisy and it is necessary to parallelize the review process as much as possible. We study the noisy value model and the noisy comparison model: In the \emph{noisy value model}, a reviewer is asked to evaluate a single element: “What is the value of paper $i$?” (e.g., accept). In the \emph{noisy comparison model} (introduced in the seminal work of Feige, Peleg, Raghavan and Upfal [SICOMP’94]) a reviewer is asked to do a pairwise comparison: “Is paper $i$ better than paper $j$?” In this paper, we show optimal worst-case query complexity for the \textsc{max}, \textsc{threshold-$v$} and \textsc{Top}-$k$ problems. For \textsc{max} and \textsc{Top}-$k$, we obtain optimal worst-case upper and lower bounds on the round vs. query complexity in both models. For \textsc{threshold}-$v$, we obtain optimal query complexity and nearly-optimal round complexity (where $k$ is the size of the output) for both models. We then go beyond the worst case and address the question of the importance of knowledge of the instance by providing, for a large range of parameters, instance-optimal algorithms with respect to the query complexity. Furthermore, we show that the value model is strictly easier than the comparison model. |
Tasks | Recommendation Systems |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08182v2 |
http://arxiv.org/pdf/1806.08182v2.pdf | |
PWC | https://paperswithcode.com/paper/instance-optimality-in-the-noisy-value-and |
Repo | |
Framework | |
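The noisy comparison model is easy to illustrate: each pairwise query can be answered incorrectly, so algorithms repeat queries and take majority votes to boost reliability. A minimal Python sketch of this idea (the simulated oracle, error probability, repeat count, and the simple tournament are illustrative assumptions; this is not the paper's instance-optimal algorithm):

```python
import random

def noisy_compare(i, j, values, p_error=0.3):
    """Simulated noisy oracle for 'is item i better than item j?',
    answered incorrectly with probability p_error."""
    truth = values[i] > values[j]
    return truth if random.random() > p_error else not truth

def robust_compare(i, j, values, repeats=15):
    """Boost a noisy comparison by majority vote over repeated queries."""
    votes = sum(noisy_compare(i, j, values) for _ in range(repeats))
    return votes > repeats / 2

def noisy_max(items, values):
    """Tournament-style max using boosted comparisons; queries are sequential,
    whereas the paper studies how far they can be parallelized into rounds."""
    best = items[0]
    for x in items[1:]:
        if robust_compare(x, best, values):
            best = x
    return best
```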
Collaborative Learning for Extremely Low Bit Asymmetric Hashing
Title | Collaborative Learning for Extremely Low Bit Asymmetric Hashing |
Authors | Yadan Luo, Zi Huang, Yang Li, Fumin Shen, Yang Yang, Peng Cui |
Abstract | Hashing techniques are in great demand for a wide range of real-world applications such as image retrieval and network compression. Nevertheless, existing approaches could hardly guarantee a satisfactory performance with extremely low-bit (e.g., 4-bit) hash codes due to the severe information loss and the shrinking of the discrete solution space. In this paper, we propose a novel \textit{Collaborative Learning} strategy that is tailored for generating high-quality low-bit hash codes. The core idea is to jointly distill bit-specific and informative representations for a group of pre-defined code lengths. The learning of short hash codes among the group can benefit from the manifold shared with other long codes, where multiple views from different hash codes provide supplementary guidance and regularization, making the convergence faster and more stable. To achieve that, an asymmetric hashing framework with two variants of multi-head embedding structures is derived, termed Multi-head Asymmetric Hashing (MAH), leading to great efficiency of training and querying. Extensive experiments on three benchmark datasets have been conducted to verify the superiority of the proposed MAH, and have shown that the 8-bit hash codes generated by MAH achieve a mean average precision (MAP) score of 94.3% on the CIFAR-10 dataset, which significantly surpasses the performance of the 48-bit codes of state-of-the-art methods in image retrieval tasks. |
Tasks | Image Retrieval |
Published | 2018-09-25 |
URL | https://arxiv.org/abs/1809.09329v3 |
https://arxiv.org/pdf/1809.09329v3.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-learning-for-extremely-low-bit |
Repo | |
Framework | |
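The multi-head idea, several code lengths trained jointly on a shared backbone so that short codes benefit from long ones, can be sketched in a few lines of PyTorch. The feature dimension, hidden size, code lengths, and the tanh relaxation are illustrative assumptions, not the paper's exact MAH architecture or training losses:

```python
import torch
import torch.nn as nn

class MultiHeadHashNet(nn.Module):
    """Shared backbone with one hashing head per pre-defined code length."""
    def __init__(self, in_dim=2048, code_lengths=(4, 8, 16, 32)):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.heads = nn.ModuleList([nn.Linear(512, L) for L in code_lengths])

    def forward(self, x):
        h = self.backbone(x)
        # tanh gives a continuous relaxation; sign() at inference yields binary codes
        return [torch.tanh(head(h)) for head in self.heads]

# usage sketch: codes_4bit, codes_8bit, codes_16bit, codes_32bit = MultiHeadHashNet()(features)
```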
Cross-spectral Periocular Recognition: A Survey
Title | Cross-spectral Periocular Recognition: A Survey |
Authors | S. S. Behera, Bappaditya Mandal, N. B. Puhan |
Abstract | Among many biometrics, such as face, iris and fingerprint, the periocular region has advantages over other biometrics because it is non-intrusive and serves as a balance between the iris or eye region (very stringent, small area) and the whole face region (very relaxed, large area). Research has shown that this region is not affected much by variations in pose, aging, expression, facial changes and other artifacts, which otherwise cause large variations. Active research has been carried out on this topic over the past few years due to its obvious advantages over face and iris biometrics in unconstrained and uncooperative scenarios. Many researchers have explored periocular biometrics involving both visible (VIS) and infra-red (IR) spectrum images. For a system to work 24/7 (such as in surveillance scenarios), the registration process may depend on daytime VIS periocular images (or any mug shot image) and the testing or recognition process may occur at night, involving only IR periocular images. This gives rise to a challenging research problem called cross-spectral matching of images, where VIS images are used for registration or as gallery images and IR images are used for testing or recognition, and vice versa. After intensive research of more than two decades on face and iris biometrics in the cross-spectral domain, a number of researchers have now focused their work on matching heterogeneous (cross-spectral) periocular images. Though a number of surveys have been made of existing periocular biometric research, no study has been done on its cross-spectral aspect. This paper analyses and reviews current state-of-the-art techniques in cross-spectral periocular recognition, including various methodologies, databases, their protocols and state-of-the-art recognition performances. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01465v1 |
http://arxiv.org/pdf/1812.01465v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-spectral-periocular-recognition-a |
Repo | |
Framework | |
Are pre-trained CNNs good feature extractors for anomaly detection in surveillance videos?
Title | Are pre-trained CNNs good feature extractors for anomaly detection in surveillance videos? |
Authors | Tiago S. Nazare, Rodrigo F. de Mello, Moacir A. Ponti |
Abstract | Recently, several techniques have been explored to detect unusual behaviour in surveillance videos. Nevertheless, few studies leverage features from pre-trained CNNs and none of them presents a comparison of features generated by different models. Motivated by this gap, we compare features extracted by four state-of-the-art image classification networks as a way of describing patches from security video frames. We carry out experiments on the Ped1 and Ped2 datasets and analyze the usage of different feature normalization techniques. Our results indicate that choosing the appropriate normalization is crucial to improve the anomaly detection performance when working with CNN features. Also, on the Ped2 dataset our approach was able to obtain results comparable to those of several state-of-the-art methods. Lastly, as our method only considers the appearance of each frame, we believe that it can be combined with approaches that focus on motion patterns to further improve performance. |
Tasks | Anomaly Detection, Anomaly Detection In Surveillance Videos, Image Classification |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08495v1 |
http://arxiv.org/pdf/1811.08495v1.pdf | |
PWC | https://paperswithcode.com/paper/are-pre-trained-cnns-good-feature-extractors |
Repo | |
Framework | |
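Since the abstract stresses that feature normalization is the crucial step, here is a small NumPy sketch of the pipeline stage it refers to: normalize pre-extracted CNN patch features, then score test patches by distance to the normal training set. The two normalizations shown and the k-nearest-neighbor scoring are illustrative stand-ins, not necessarily the classifiers or normalizations compared in the paper:

```python
import numpy as np

def l2_normalize(F, eps=1e-8):
    """Scale each feature vector to unit L2 norm."""
    return F / (np.linalg.norm(F, axis=1, keepdims=True) + eps)

def zscore_normalize(F, mean, std, eps=1e-8):
    """Standardize each feature dimension using training-set statistics."""
    return (F - mean) / (std + eps)

def knn_anomaly_scores(train_feats, test_feats, k=5):
    """Anomaly score = mean distance to the k nearest normal training features."""
    d2 = (np.sum(test_feats**2, 1)[:, None] + np.sum(train_feats**2, 1)[None, :]
          - 2 * test_feats @ train_feats.T)
    nearest = np.sort(np.sqrt(np.maximum(d2, 0)), axis=1)[:, :k]
    return nearest.mean(axis=1)
```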
VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning
Title | VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning |
Authors | Fanhua Shang, Kaiwen Zhou, Hongying Liu, James Cheng, Ivor W. Tsang, Lijun Zhang, Dacheng Tao, Licheng Jiao |
Abstract | In this paper, we propose a simple variant of the original SVRG, called variance reduced stochastic gradient descent (VR-SGD). Unlike the choices of snapshot and starting points in SVRG and its proximal variant, Prox-SVRG, the two vectors of VR-SGD are set to the average and last iterate of the previous epoch, respectively. The settings allow us to use much larger learning rates, and also make our convergence analysis more challenging. We also design two different update rules for smooth and non-smooth objective functions, respectively, which means that VR-SGD can tackle non-smooth and/or non-strongly convex problems directly without any reduction techniques. Moreover, we analyze the convergence properties of VR-SGD for strongly convex problems, which show that VR-SGD attains linear convergence. Different from its counterparts that have no convergence guarantees for non-strongly convex problems, we also provide the convergence guarantees of VR-SGD for this case, and empirically verify that VR-SGD with varying learning rates achieves similar performance to its momentum accelerated variant that has the optimal convergence rate $\mathcal{O}(1/T^2)$. Finally, we apply VR-SGD to solve various machine learning problems, such as convex and non-convex empirical risk minimization, and leading eigenvalue computation. Experimental results show that VR-SGD converges significantly faster than SVRG and Prox-SVRG, and usually outperforms state-of-the-art accelerated methods, e.g., Katyusha. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09932v2 |
http://arxiv.org/pdf/1802.09932v2.pdf | |
PWC | https://paperswithcode.com/paper/vr-sgd-a-simple-stochastic-variance-reduction |
Repo | |
Framework | |
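The abstract's two design choices, snapshot set to the average of the previous epoch and starting point set to its last iterate, are easy to show in code. A hedged NumPy sketch of one possible VR-SGD loop (oracle names, epoch length, and step size are assumptions; consult the paper for the smooth vs. non-smooth update rules and learning-rate schedules):

```python
import numpy as np

def vr_sgd(grad_i, full_grad, w0, n, eta=0.1, epochs=10, m=None):
    """grad_i(w, i): stochastic gradient on example i; full_grad(w): full gradient
    over all n examples; w0: initial point."""
    m = m or 2 * n
    w = w0.copy()
    snapshot = w0.copy()
    for _ in range(epochs):
        mu = full_grad(snapshot)                 # full gradient at the snapshot point
        iterates = []
        for _ in range(m):
            i = np.random.randint(n)
            g = grad_i(w, i) - grad_i(snapshot, i) + mu   # variance-reduced gradient
            w = w - eta * g
            iterates.append(w.copy())
        snapshot = np.mean(iterates, axis=0)     # snapshot <- AVERAGE of this epoch
        # the next epoch starts from w, the LAST iterate, unlike standard SVRG
    return w
```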
Multi-label classification search space in the MEKA software
Title | Multi-label classification search space in the MEKA software |
Authors | Alex G. C. de Sá, Alex A. Freitas, Gisele L. Pappa |
Abstract | This technical report describes the multi-label classification (MLC) search space in the MEKA software, including the traditional/meta MLC algorithms, and the traditional/meta/pre-processing single-label classification (SLC) algorithms. The SLC search space is also studied because it is part of the MLC search space, as several methods use problem transformation to create a solution (i.e., a classifier) for an MLC problem. This was done in order to better understand the MLC algorithms. Finally, we propose a grammar that formally expresses this understanding. |
Tasks | Multi-Label Classification |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1811.11353v2 |
https://arxiv.org/pdf/1811.11353v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-label-classification-search-space-in |
Repo | |
Framework | |
Gaussian AutoEncoder
Title | Gaussian AutoEncoder |
Authors | Jarek Duda |
Abstract | Generative AutoEncoders require a chosen probability distribution in latent space, usually a multivariate Gaussian. The original Variational AutoEncoder (VAE) uses randomness in the encoder, causing problematic distortion and overlaps in latent space for distinct inputs. This turned out to be unnecessary: we can instead use a deterministic encoder with an additional regularizer to ensure that the sample distribution in latent space is close to the required one. The original such approach (WAE) uses the Wasserstein metric, which requires comparing with a random sample and using an arbitrarily chosen kernel. Later, CWAE derived a non-random analytic formula by averaging the $L_2$ distance of the Gaussian-smoothed sample over all 1D projections. However, these arbitrarily chosen regularizers do not lead to a Gaussian distribution. This article proposes an approach with regularizers that directly optimize agreement between the empirical distribution function and the desired CDF of chosen properties, for example radii and distances for the Gaussian distribution, or coordinate-wise, to directly attract this distribution in the latent space of the AutoEncoder. This general approach can also attract other distributions: for example, a uniform latent distribution on the $[0,1]^D$ hypercube or torus would allow data compression without entropy coding, and increased density near codewords would optimize for the required quantization. |
Tasks | Quantization |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04751v4 |
http://arxiv.org/pdf/1811.04751v4.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-autoencoder |
Repo | |
Framework | |
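As a concrete illustration of the CDF-agreement idea in the abstract above, here is a hedged PyTorch sketch of a regularizer that matches the empirical distribution of latent radii to the radii CDF of a standard Gaussian (for z in R^D, ||z||^2 is chi-square with D degrees of freedom). The function name, the choice of the radius property, the Cramer-von Mises-style mismatch, and the weight `lam` in the usage line are assumptions, not the paper's exact formulation:

```python
import torch

def radius_cdf_regularizer(z):
    """Push the empirical CDF of latent radii toward the chi-square CDF of ||z||^2."""
    n, D = z.shape
    r2 = (z ** 2).sum(dim=1)
    a = torch.tensor(D / 2.0, dtype=z.dtype, device=z.device)
    # chi-square CDF with D dof at x equals the regularized lower incomplete gamma P(D/2, x/2)
    theoretical = torch.special.gammainc(a, torch.sort(r2).values / 2.0)
    empirical = (torch.arange(1, n + 1, dtype=z.dtype, device=z.device) - 0.5) / n
    return ((theoretical - empirical) ** 2).mean()

# usage sketch: total_loss = reconstruction_loss + lam * radius_cdf_regularizer(encoder(x))
```

Analogous regularizers can be written for other properties (pairwise distances, single coordinates) or other target distributions by swapping in the corresponding CDF.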
Vision-based Structural Inspection using Multiscale Deep Convolutional Neural Networks
Title | Vision-based Structural Inspection using Multiscale Deep Convolutional Neural Networks |
Authors | Vedhus Hoskere, Yasutaka Narazaki, Tu Hoang, Billie F. Spencer Jr. |
Abstract | Current methods of practice for inspection of civil infrastructure typically involve visual assessments conducted manually by trained inspectors. For post-earthquake structural inspections, the number of structures to be inspected often far exceeds the capability of the available inspectors. The labor-intensive and time-consuming nature of manual inspection has engendered research into the development of algorithms for automated damage identification using computer vision techniques. In this paper, a novel damage localization and classification technique based on a state-of-the-art computer vision algorithm is presented to address several key limitations of current computer vision techniques. The proposed algorithm carries out a pixel-wise classification of each image at multiple scales using a deep convolutional neural network and can recognize six different types of damage. The resulting output is a segmented image where the portion of the image representing damage is outlined and classified as one of the trained damage categories. The proposed method is evaluated in terms of pixel accuracy, and the application of the method to real-world images is shown. |
Tasks | |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.01055v1 |
http://arxiv.org/pdf/1805.01055v1.pdf | |
PWC | https://paperswithcode.com/paper/vision-based-structural-inspection-using |
Repo | |
Framework | |
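A common way to realize "pixel-wise classification at multiple scales" is to run a segmentation network on rescaled copies of the image and fuse the upsampled class probabilities. A short PyTorch sketch of that inference step (the scale set, class count of six damage types plus background, and averaging rule are illustrative assumptions, not the paper's exact pipeline):

```python
import torch
import torch.nn.functional as F

def multiscale_segment(net, image, scales=(0.5, 1.0, 1.5), n_classes=7):
    """net: segmentation network returning (B, n_classes, h, w) logits;
    image: (B, C, H, W). Returns a per-pixel damage class map."""
    B, _, H, W = image.shape
    probs = torch.zeros(B, n_classes, H, W, device=image.device)
    for s in scales:
        x = F.interpolate(image, scale_factor=s, mode="bilinear", align_corners=False)
        logits = net(x)
        up = F.interpolate(logits, size=(H, W), mode="bilinear", align_corners=False)
        probs += up.softmax(dim=1) / len(scales)     # average probabilities across scales
    return probs.argmax(dim=1)                       # (B, H, W) class indices
```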
On Lazy Training in Differentiable Programming
Title | On Lazy Training in Differentiable Programming |
Authors | Lenaic Chizat, Edouard Oyallon, Francis Bach |
Abstract | In a series of recent theoretical works, it was shown that strongly over-parameterized neural networks trained with gradient-based methods could converge exponentially fast to zero training loss, with their parameters hardly varying. In this work, we show that this “lazy training” phenomenon is not specific to over-parameterized neural networks, and is due to a choice of scaling, often implicit, that makes the model behave as its linearization around the initialization, thus yielding a model equivalent to learning with positive-definite kernels. Through a theoretical analysis, we exhibit various situations where this phenomenon arises in non-convex optimization and we provide bounds on the distance between the lazy and linearized optimization paths. Our numerical experiments bring a critical note, as we observe that the performance of commonly used non-linear deep convolutional neural networks in computer vision degrades when trained in the lazy regime. This makes it unlikely that “lazy training” is behind the many successes of neural networks in difficult high dimensional tasks. |
Tasks | |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.07956v5 |
https://arxiv.org/pdf/1812.07956v5.pdf | |
PWC | https://paperswithcode.com/paper/a-note-on-lazy-training-in-supervised |
Repo | |
Framework | |
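The "choice of scaling, often implicit" that the abstract points to can be made explicit in a few lines: train alpha * (f(w) - f(w0)) for a large alpha, which keeps the parameters near initialization and makes the model behave like its linearization. A hedged PyTorch sketch (helper names and the usage lines are illustrative, not the authors' code):

```python
import copy
import torch

def make_lazy_model(model, alpha):
    """Return a forward function x -> alpha * (f(x; w) - f(x; w0)),
    where w0 is a frozen copy of the parameters at initialization."""
    model0 = copy.deepcopy(model)
    for p in model0.parameters():
        p.requires_grad_(False)
    def lazy_forward(x):
        return alpha * (model(x) - model0(x))
    return lazy_forward

# usage sketch (illustrative):
# net = torch.nn.Sequential(torch.nn.Linear(10, 100), torch.nn.ReLU(), torch.nn.Linear(100, 1))
# f = make_lazy_model(net, alpha=100.0)
# loss = ((f(x) - y) ** 2).mean() / alpha**2   # rescale the loss so gradients stay O(1)
```

For large alpha the trained model is essentially a kernel method around w0; the abstract's point is that this regime can hurt the performance of deep convolutional networks in practice.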
Cross-position Activity Recognition with Stratified Transfer Learning
Title | Cross-position Activity Recognition with Stratified Transfer Learning |
Authors | Yiqiang Chen, Jindong Wang, Meiyu Huang, Han Yu |
Abstract | Human activity recognition aims to recognize the activities of daily living by utilizing sensors on different body parts. However, when the labeled data from a certain body position (i.e., the target domain) is missing, how can we leverage the data from other positions (i.e., source domains) to help learn the activity labels of this position? When there are several source domains available, it is often difficult to select the source domain most similar to the target domain. With the selected source domain, we need to perform accurate knowledge transfer between domains. Existing methods only learn the global distance between domains while ignoring local properties. In this paper, we propose a \textit{Stratified Transfer Learning} (STL) framework to perform both source domain selection and knowledge transfer. STL is based on our proposed \textit{Stratified} distance to capture the local properties of domains. STL consists of two components: Stratified Domain Selection (STL-SDS) can select the source domain most similar to the target domain; Stratified Activity Transfer (STL-SAT) is able to perform accurate knowledge transfer. Extensive experiments on three public activity recognition datasets demonstrate the superiority of STL. Furthermore, we extensively investigate the performance of transfer learning across different degrees of similarity and activity levels between domains. We also discuss the potential applications of STL in other fields of pervasive computing for future research. |
Tasks | Activity Recognition, Human Activity Recognition, Transfer Learning |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09776v2 |
http://arxiv.org/pdf/1806.09776v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-position-activity-recognition-with |
Repo | |
Framework | |
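The source-selection step described in the abstract boils down to picking the candidate body position whose feature distribution is closest to the target. A minimal NumPy sketch using a plain kernel MMD distance as an illustrative stand-in for the paper's Stratified distance (which works class-wise on pseudo-labeled strata); function names and the bandwidth are assumptions:

```python
import numpy as np

def domain_distance(Xs, Xt, sigma=1.0):
    """Kernel MMD-style distance between source features Xs and target features Xt."""
    def k(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2 * sigma**2))
    return k(Xs, Xs).mean() + k(Xt, Xt).mean() - 2 * k(Xs, Xt).mean()

def select_source(candidates, Xt):
    """candidates: dict mapping body-position name -> feature array.
    Pick the source position whose distribution is closest to the target."""
    return min(candidates, key=lambda name: domain_distance(candidates[name], Xt))
```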