Paper Group ANR 315
Optimal Transfer Learning Model for Binary Classification of Funduscopic Images through Simple Heuristics
Title | Optimal Transfer Learning Model for Binary Classification of Funduscopic Images through Simple Heuristics |
Authors | Rohit Jammula, Vishnu Rajan Tejus, Shreya Shankar |
Abstract | Deep learning models have the capacity to fundamentally revolutionize medical imaging analysis, and they have particularly interesting applications in computer-aided diagnosis. We attempt to use deep learning neural networks to diagnose funduscopic images, visual representations of the interior of the eye. Recently, a few robust deep learning approaches have performed binary classification to infer the presence of a specific ocular disease, such as glaucoma or diabetic retinopathy. In an effort to broaden the applications of computer-aided ocular disease diagnosis, we propose a unifying model for disease classification: low-cost inference of a fundus image to determine whether it is healthy or diseased. To achieve this, we use transfer learning techniques, which retain the more overarching capabilities of a pre-trained base architecture but can adapt to another dataset. For comparison, we then develop a custom heuristic equation and evaluation metric ranking system to determine the optimal base architecture and hyperparameters. The Xception base architecture, Adam optimizer, and mean squared error loss function perform best, achieving 90% accuracy, 94% sensitivity, and 86% specificity. For additional ease of use, we wrap the model in a web interface whose file chooser can access the local filesystem, allowing for use on any internet-connected device: mobile, PC, or otherwise. |
Tasks | Transfer Learning |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04189v3 |
https://arxiv.org/pdf/2002.04189v3.pdf | |
PWC | https://paperswithcode.com/paper/optimal-transfer-learning-model-for-binary |
Repo | |
Framework | |
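The transfer-learning setup described above (pre-trained base with a binary healthy/diseased head) can be sketched in a few lines of Keras. This is a minimal illustration assuming an Xception base with ImageNet weights, the Adam optimizer, and MSE loss as reported in the abstract; the input size, data pipeline, and fine-tuning schedule are placeholders, not the authors' exact configuration.

```python
# Minimal transfer-learning sketch: frozen Xception base + binary head (illustrative only).
import tensorflow as tf

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # keep the pre-trained feature extractor frozen at first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # healthy (0) vs. diseased (1)
])

# Adam optimizer and mean squared error loss, as reported for the best configuration.
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])

# model.fit(train_ds, validation_data=val_ds, epochs=10)  # dataset objects are assumed
```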
Bayes-Probe: Distribution-Guided Sampling for Prediction Level Sets
Title | Bayes-Probe: Distribution-Guided Sampling for Prediction Level Sets |
Authors | Serena Booth, Yilun Zhou, Ankit Shah, Julie Shah |
Abstract | Building machine learning models requires a suite of tools for interpretation, understanding, and debugging. Many existing methods have been proposed, but it can still be difficult to probe for examples which communicate model behaviour. We introduce Bayes-Probe, a model inspection method for analyzing neural networks by generating distribution-conforming examples of known prediction confidence. By selecting appropriate distributions and confidence prediction values, Bayes-Probe can be used to synthesize ambivalent predictions, uncover in-distribution adversarial examples, and understand novel-class extrapolation and domain adaptation behaviours. Bayes-Probe is model agnostic, requiring only a data generator and classifier prediction. We use Bayes-Probe to analyze models trained on both procedurally-generated data (CLEVR) and organic data (MNIST and Fashion-MNIST). Code is available at https://github.com/serenabooth/Bayes-Probe. |
Tasks | Domain Adaptation |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.10248v1 |
https://arxiv.org/pdf/2002.10248v1.pdf | |
PWC | https://paperswithcode.com/paper/bayes-probe-distribution-guided-sampling-for |
Repo | |
Framework | |
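At its core, the abstract describes sampling distribution-conforming examples whose classifier confidence lies in a target range. A minimal sketch of that idea is rejection sampling from a data generator, keeping only samples whose predicted confidence for a chosen class falls inside the requested level set; `generator.sample` and `classifier.predict_proba` are assumed interfaces, and the authors' actual Bayesian formulation is more involved.

```python
import numpy as np

def probe_level_set(generator, classifier, target_class, lo, hi, n_keep=64, max_draws=10_000):
    """Collect generator samples whose predicted confidence for `target_class`
    lies in [lo, hi] -- a crude stand-in for distribution-guided level-set probing."""
    kept = []
    for _ in range(max_draws):
        x = generator.sample()                      # distribution-conforming candidate
        p = classifier.predict_proba(x[None])[0, target_class]
        if lo <= p <= hi:                           # inside the requested prediction level set
            kept.append(x)
            if len(kept) == n_keep:
                break
    return np.stack(kept) if kept else np.empty((0,))

# Example: ambivalent predictions sit near p ~= 0.5
# ambivalent = probe_level_set(gen, clf, target_class=3, lo=0.45, hi=0.55)
```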
Calibrate and Prune: Improving Reliability of Lottery Tickets Through Prediction Calibration
Title | Calibrate and Prune: Improving Reliability of Lottery Tickets Through Prediction Calibration |
Authors | Bindya Venkatesh, Jayaraman J. Thiagarajan, Kowshik Thopalli, Prasanna Sattigeri |
Abstract | The hypothesis that sub-network initializations (lottery) exist within the initializations of over-parameterized networks, which when trained in isolation produce highly generalizable models, has led to crucial insights into network initialization and has enabled computationally efficient inference. In order to realize the full potential of these pruning strategies, particularly when utilized in transfer learning scenarios, it is necessary to understand the behavior of winning tickets when they might overfit to the dataset characteristics. In supervised and semi-supervised learning, prediction calibration is a commonly adopted strategy to handle such inductive biases in models. In this paper, we study the impact of incorporating calibration strategies during model training on the quality of the resulting lottery tickets, using several evaluation metrics. More specifically, we apply a suite of calibration strategies to different combinations of architectures and datasets, and evaluate the fidelity of sub-networks retrained based on winning tickets. Furthermore, we report the generalization performance of tickets across distributional shifts, when the inductive biases are explicitly controlled using calibration mechanisms. Finally, we provide key insights and recommendations for obtaining reliable lottery tickets, which we demonstrate to achieve improved generalization. |
Tasks | Calibration, Transfer Learning |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03875v2 |
https://arxiv.org/pdf/2002.03875v2.pdf | |
PWC | https://paperswithcode.com/paper/calibrate-and-prune-improving-reliability-of |
Repo | |
Framework | |
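For context, the lottery-ticket procedure the paper builds on prunes low-magnitude weights and rewinds the survivors to their initial values before retraining. The calibration strategies studied in the paper are layered on top of training and are not shown; the sketch below only illustrates basic one-shot magnitude pruning with rewinding, using plain NumPy arrays as stand-ins for layer weights.

```python
import numpy as np

def magnitude_prune_and_rewind(init_weights, trained_weights, sparsity=0.8):
    """One-shot lottery-ticket step: keep the largest-magnitude trained weights
    (globally), zero out the rest, and rewind survivors to their init values."""
    flat = np.concatenate([w.ravel() for w in trained_weights])
    threshold = np.quantile(np.abs(flat), sparsity)   # global magnitude cutoff

    masks, ticket = [], []
    for w_init, w_trained in zip(init_weights, trained_weights):
        mask = (np.abs(w_trained) > threshold).astype(w_init.dtype)
        masks.append(mask)
        ticket.append(w_init * mask)                  # winning-ticket initialization
    return ticket, masks

# The ticket is then retrained under the mask, with or without calibration objectives.
```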
Deep Multi-Modal Sets
Title | Deep Multi-Modal Sets |
Authors | Austin Reiter, Menglin Jia, Pu Yang, Ser-Nam Lim |
Abstract | Many vision-related tasks benefit from reasoning over multiple modalities to leverage complementary views of data in an attempt to learn robust embedding spaces. Most deep learning-based methods rely on a late fusion technique whereby multiple feature types are encoded and concatenated and then a multi-layer perceptron (MLP) combines the fused embedding to make predictions. This has several limitations, such as an unnatural enforcement that all features be present at all times as well as constraining only a constant number of occurrences of a feature modality at any given time. Furthermore, as more modalities are added, the concatenated embedding grows. To mitigate this, we propose Deep Multi-Modal Sets: a technique that represents a collection of features as an unordered set rather than one long ever-growing fixed-size vector. The set is constructed so that we have invariance both to permutations of the feature modalities as well as to the cardinality of the set. We will also show that with particular choices in our model architecture, we can yield interpretable feature performance such that during inference time we can observe which modalities contribute most to the prediction. With this in mind, we demonstrate a scalable, multi-modal framework that reasons over different modalities to learn various types of tasks. We demonstrate new state-of-the-art performance on two multi-modal datasets (Ads-Parallelity [34] and MM-IMDb [1]). |
Tasks | |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01607v1 |
https://arxiv.org/pdf/2003.01607v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-multi-modal-sets |
Repo | |
Framework | |
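The key idea, representing the modalities as an unordered set and pooling it with a permutation-invariant operator instead of concatenation, can be sketched with a small PyTorch module. This is a generic set-pooling head, not the authors' exact architecture; per-modality encoders and the downstream task head are assumed.

```python
import torch
import torch.nn as nn

class SetFusionHead(nn.Module):
    """Pool a variable-size set of modality embeddings with a permutation-invariant
    operator (max here), so missing or repeated modalities are handled naturally."""
    def __init__(self, embed_dim, num_classes):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.ReLU(),
                                 nn.Linear(embed_dim, num_classes))

    def forward(self, modality_embeddings):
        # modality_embeddings: list of tensors, each of shape (embed_dim,)
        stacked = torch.stack(modality_embeddings, dim=0)   # (num_present_modalities, embed_dim)
        pooled, _ = stacked.max(dim=0)                       # invariant to order and cardinality
        return self.mlp(pooled)

# head = SetFusionHead(embed_dim=256, num_classes=23)
# logits = head([text_emb, image_emb])   # works with however many modalities are present
```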
CALVIS: chest, waist and pelvis circumference from 3D human body meshes as ground truth for deep learning
Title | CALVIS: chest, waist and pelvis circumference from 3D human body meshes as ground truth for deep learning |
Authors | Yansel Gonzalez Tejeda, Helmut Mayer |
Abstract | In this paper we present CALVIS, a method to calculate $\textbf{C}$hest, w$\textbf{A}$ist and pe$\textbf{LVIS}$ circumference from 3D human body meshes. Our motivation is to use this data as ground truth for training convolutional neural networks (CNN). Previous work had used the large-scale CAESAR dataset or determined these anthropometrical measurements $\textit{manually}$ from a person or from human 3D body meshes. Unfortunately, acquiring these data is a costly and time-consuming endeavor. In contrast, our method can be used on 3D meshes automatically. We synthesize eight human body meshes and apply CALVIS to calculate chest, waist and pelvis circumference. We evaluate the results qualitatively and observe that the measurements can indeed be used to estimate the shape of a person. We then assess the plausibility of our approach by generating ground truth with CALVIS to train a small CNN. After having trained the network with our data, we achieve competitive validation error. Furthermore, we make the implementation of CALVIS publicly available to advance the field. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2003.00834v1 |
https://arxiv.org/pdf/2003.00834v1.pdf | |
PWC | https://paperswithcode.com/paper/calvis-chest-waist-and-pelvis-circumference |
Repo | |
Framework | |
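The core measurement, a circumference at chest, waist, or pelvis height, reduces to slicing the mesh with a horizontal plane and summing segment lengths along the resulting closed loop. The sketch below only shows that last step on an ordered polygon of intersection points (obtained, e.g., from a mesh-slicing routine); it is an illustration, not the released CALVIS implementation.

```python
import numpy as np

def loop_circumference(points):
    """Perimeter of a closed 3D polygon given its vertices in order around the loop.
    `points` has shape (n, 3); the segment from the last point back to the first is included."""
    points = np.asarray(points, dtype=float)
    diffs = np.roll(points, -1, axis=0) - points     # consecutive edge vectors, wrapping around
    return float(np.linalg.norm(diffs, axis=1).sum())

# waist_loop = slice_mesh_at_height(mesh, z=waist_height)   # hypothetical slicing helper
# waist_circumference = loop_circumference(waist_loop)
```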
SupRB: A Supervised Rule-based Learning System for Continuous Problems
Title | SupRB: A Supervised Rule-based Learning System for Continuous Problems |
Authors | Michael Heider, David Pätzel, Jörg Hähner |
Abstract | We propose the SupRB learning system, a new Pittsburgh-style learning classifier system (LCS) for supervised learning on multi-dimensional continuous decision problems. SupRB learns an approximation of a quality function from examples (consisting of situations, choices and associated qualities) and is then able to make an optimal choice as well as predict the quality of a choice in a given situation. One area of application for SupRB is parametrization of industrial machinery. In this field, acceptance of the recommendations of machine learning systems is highly reliant on operators’ trust. While an essential and much-researched ingredient for that trust is prediction quality, it seems that this alone is not enough. At least as important is a human-understandable explanation of the reasoning behind a recommendation. While many state-of-the-art methods such as artificial neural networks fall short of this, LCSs such as SupRB provide human-readable rules that can be understood very easily. The prevalent LCSs are not directly applicable to this problem as they lack support for continuous choices. This paper lays the foundations for SupRB and shows its general applicability on a simplified model of an additive manufacturing problem. |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10295v1 |
https://arxiv.org/pdf/2002.10295v1.pdf | |
PWC | https://paperswithcode.com/paper/suprb-a-supervised-rule-based-learning-system |
Repo | |
Framework | |
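To make the rule representation concrete: a Pittsburgh-style system like the one described evolves whole sets of human-readable rules, each matching a region of the continuous situation space and predicting the quality of a choice there. The sketch below shows one plausible minimal encoding (interval conditions plus a constant quality estimate); SupRB's actual rule structure, mixing, and optimization are not reproduced here.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Rule:
    lower: np.ndarray    # per-dimension lower bounds on the (situation, choice) space
    upper: np.ndarray    # per-dimension upper bounds
    quality: float       # predicted quality inside the matched region

    def matches(self, x):
        return bool(np.all((self.lower <= x) & (x <= self.upper)))

def predict_quality(rules, x, default=0.0):
    """Average the quality estimates of all matching rules -- readable and inspectable."""
    matched = [r.quality for r in rules if r.matches(x)]
    return float(np.mean(matched)) if matched else default

# rules = [Rule(np.array([0.0, 0.2]), np.array([0.5, 0.8]), quality=0.9)]
# predict_quality(rules, np.array([0.3, 0.5]))
```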
Facial Attribute Capsules for Noise Face Super Resolution
Title | Facial Attribute Capsules for Noise Face Super Resolution |
Authors | Jingwei Xin, Nannan Wang, Xinrui Jiang, Jie Li, Xinbo Gao, Zhifeng Li |
Abstract | Existing face super-resolution (SR) methods mainly assume the input image to be noise-free. Their performance degrades drastically when applied to real-world scenarios where the input image is always contaminated by noise. In this paper, we propose a Facial Attribute Capsules Network (FACN) to deal with the problem of high-scale super-resolution of noisy face images. A capsule is a group of neurons whose activity vector models different properties of the same entity. Inspired by the concept of capsules, we propose an integrated representation model of facial information, which we name the Facial Attribute Capsule (FAC). In the SR processing, we first generate a group of FACs from the input LR face, and then reconstruct the HR face from this group of FACs. To effectively improve the robustness of FACs to noise, we generate FACs in semantic, probabilistic and facial-attribute manners by means of an integrated learning strategy. Each FAC can be divided into two sub-capsules: a Semantic Capsule (SC) and a Probabilistic Capsule (PC). They describe an explicit facial attribute in detail from the two aspects of semantic representation and probability distribution. The group of FACs models an image as a combination of facial attribute information in the semantic space and probabilistic space in an attribute-disentangled way. The diverse FACs can better combine the face prior information to generate face images with fine-grained semantic attributes. Extensive benchmark experiments show that our method achieves superior hallucination results and outperforms the state of the art for very low-resolution (LR) noisy face image super-resolution. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06518v1 |
https://arxiv.org/pdf/2002.06518v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-attribute-capsules-for-noise-face |
Repo | |
Framework | |
DLow: Diversifying Latent Flows for Diverse Human Motion Prediction
Title | DLow: Diversifying Latent Flows for Diverse Human Motion Prediction |
Authors | Ye Yuan, Kris Kitani |
Abstract | Deep generative models are often used for human motion prediction as they are able to model multi-modal data distributions and characterize diverse human behavior. While much care has been taken into designing and learning deep generative models, how to efficiently produce diverse samples from a deep generative model after it has been trained is still an under-explored problem. To obtain samples from a pretrained generative model, most existing generative human motion prediction methods draw a set of independent Gaussian latent codes and convert them to motion samples. Clearly, this random sampling strategy is not guaranteed to produce diverse samples for two reasons: (1) The independent sampling cannot force the samples to be diverse; (2) The sampling is based solely on likelihood which may only produce samples that correspond to the major modes of the data distribution. To address these problems, we propose a novel sampling method, Diversifying Latent Flows (DLow), to produce a diverse set of samples from a pretrained deep generative model. Unlike random (independent) sampling, the proposed DLow sampling method samples a single random variable and then maps it with a set of learnable mapping functions to a set of correlated latent codes. The correlated latent codes are then decoded into a set of correlated samples. During training, DLow uses a diversity-promoting prior over samples as an objective to optimize the latent mappings to improve sample diversity. The design of the prior is highly flexible and can be customized to generate diverse motions with common features (e.g., similar leg motion but diverse upper-body motion). Our experiments demonstrate that DLow outperforms state-of-the-art baseline methods in terms of sample diversity and accuracy. Video: https://youtu.be/64OEdSadb00. |
Tasks | motion prediction |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08386v1 |
https://arxiv.org/pdf/2003.08386v1.pdf | |
PWC | https://paperswithcode.com/paper/dlow-diversifying-latent-flows-for-diverse |
Repo | |
Framework | |
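The sampling scheme described above, one random draw mapped through several learnable functions to correlated latent codes plus a diversity-promoting term, can be sketched in PyTorch as follows. The affine mappings, the pairwise-distance diversity objective, and the decoder interface are illustrative assumptions, not the paper's exact losses.

```python
import torch
import torch.nn as nn

class LatentDiversifier(nn.Module):
    """Map one Gaussian draw to K correlated latent codes via learnable affine maps."""
    def __init__(self, latent_dim, num_samples):
        super().__init__()
        self.maps = nn.ModuleList(nn.Linear(latent_dim, latent_dim) for _ in range(num_samples))

    def forward(self, eps):
        # eps: (batch, latent_dim), a single random variable per prediction
        return torch.stack([m(eps) for m in self.maps], dim=1)   # (batch, K, latent_dim)

def diversity_loss(samples):
    """Encourage decoded samples to spread out: negative mean pairwise distance."""
    # samples: (batch, K, feature_dim)
    dists = torch.cdist(samples, samples, p=2)                   # (batch, K, K)
    k = samples.shape[1]
    return -dists.sum(dim=(1, 2)) / (k * (k - 1))

# zs = LatentDiversifier(128, num_samples=10)(torch.randn(8, 128))
# motions = decoder(zs)                       # pretrained generative model, assumed interface
# loss = diversity_loss(motions).mean() + likelihood_term   # likelihood term keeps codes plausible
```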
Quantum device fine-tuning using unsupervised embedding learning
Title | Quantum device fine-tuning using unsupervised embedding learning |
Authors | N. M. van Esbroeck, D. T. Lennon, H. Moon, V. Nguyen, F. Vigneau, L. C. Camenzind, L. Yu, D. M. Zumbühl, G. A. D. Briggs, D. Sejdinovic, N. Ares |
Abstract | Quantum devices with a large number of gate electrodes allow for precise control of device parameters. This capability is hard to fully exploit due to the complex dependence of these parameters on applied gate voltages. We experimentally demonstrate an algorithm capable of fine-tuning several device parameters at once. The algorithm acquires a measurement and assigns it a score using a variational auto-encoder. Gate voltages are then adjusted to optimise this score in real time in an unsupervised fashion. We report fine-tuning of a double quantum dot device within approximately 40 minutes. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04409v1 |
https://arxiv.org/pdf/2001.04409v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-device-fine-tuning-using-unsupervised |
Repo | |
Framework | |
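The closed loop described in the abstract, acquire a measurement, score it with a variational auto-encoder, and adjust gate voltages to improve the score, can be sketched generically as a black-box optimization loop. The measurement, scoring, and voltage-update functions below are assumed interfaces; the actual experiment uses a specific embedding-based score and a more careful optimizer.

```python
import numpy as np

def fine_tune_gates(measure, score, v0, step=5e-3, iters=200, seed=0):
    """Simple hill-climbing over gate voltages: propose a small random change,
    keep it if the unsupervised score of the new measurement improves."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=float)
    best = score(measure(v))
    for _ in range(iters):
        candidate = v + step * rng.standard_normal(v.shape)   # perturb all gates at once
        s = score(measure(candidate))                         # e.g. a VAE-based quality score
        if s > best:
            v, best = candidate, s
    return v, best

# voltages, quality = fine_tune_gates(acquire_stability_diagram, vae_score, v_init)
```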
Pedestrian Models for Autonomous Driving Part II: high level models of human behaviour
Title | Pedestrian Models for Autonomous Driving Part II: high level models of human behaviour |
Authors | Fanta Camara, Nicola Bellotto, Serhan Cosar, Florian Weber, Dimitris Nathanael, Matthias Althoff, Jingyuan Wu, Johannes Ruenz, André Dietrich, Gustav Markkula, Anna Schieben, Fabio Tango, Natasha Merat, Charles W. Fox |
Abstract | Autonomous vehicles (AVs) must share space with human pedestrians, both in on-road cases such as cars at pedestrian crossings and off-road cases such as delivery vehicles navigating through crowds on high-streets. Unlike static and kinematic obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behaviour as well as detection and tracking which enable such modelling. This narrative review article is Part II of a pair which together survey the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low level image detection to high-level psychological models, from the perspective of an AV designer. This self-contained Part II covers the higher levels of this stack, consisting of models of pedestrian behaviour, from prediction of individual pedestrians’ likely destinations and paths, to game theoretic models of interactions between pedestrians and autonomous vehicles. This survey clearly shows that, although there are good models for optimal walking behaviour, high-level psychological and social modelling of pedestrian behaviour still remains an open research question that requires many conceptual issues to be clarified by the community. At these levels, early work has been done on descriptive and qualitative models of behaviour, but much work is still needed to translate them into quantitative algorithms for practical AV control. |
Tasks | Autonomous Driving, Autonomous Vehicles |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.11959v1 |
https://arxiv.org/pdf/2003.11959v1.pdf | |
PWC | https://paperswithcode.com/paper/pedestrian-models-for-autonomous-driving-part-1 |
Repo | |
Framework | |
Optimizing Generative Adversarial Networks for Image Super Resolution via Latent Space Regularization
Title | Optimizing Generative Adversarial Networks for Image Super Resolution via Latent Space Regularization |
Authors | Sheng Zhong, Shifu Zhou |
Abstract | Natural images can be regarded as residing in a manifold that is embedded in a higher dimensional Euclidean space. Generative Adversarial Networks (GANs) try to learn the distribution of the real images in the manifold to generate samples that look real. But the results of existing methods still exhibit many unpleasant artifacts and distortions even for the cases where the desired ground truth target images are available for supervised learning such as in single image super resolution (SISR). We probe for ways to alleviate these problems for supervised GANs in this paper. We explicitly apply the Lipschitz Continuity Condition (LCC) to regularize the GAN. An encoding network that maps the image space to a new optimal latent space is derived from the LCC, and it is used to augment the GAN as a coupling component. The LCC is also converted to new regularization terms in the generator loss function to enforce local invariance. The GAN is optimized together with the encoding network in an attempt to make the generator converge to a more ideal and disentangled mapping that can generate samples more faithful to the target images. When the proposed models are applied to the single image super resolution problem, the results outperform the state of the art. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08126v1 |
https://arxiv.org/pdf/2001.08126v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-generative-adversarial-networks |
Repo | |
Framework | |
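One simple way to read the coupling described above is as an extra generator penalty: an encoder maps both the super-resolved output and the ground-truth image into a latent space, and the generator is pushed to keep them close there. The sketch below implements only that generic latent-consistency term in PyTorch; the paper's actual LCC-derived regularizers and local-invariance terms are more specific.

```python
import torch
import torch.nn.functional as F

def latent_consistency_loss(encoder, sr_image, hr_image, weight=0.1):
    """Penalize the distance between generated and target images in the encoder's
    latent space, in addition to the usual pixel and adversarial losses."""
    z_sr = encoder(sr_image)
    with torch.no_grad():                 # treat the target's latent code as a fixed anchor
        z_hr = encoder(hr_image)
    return weight * F.mse_loss(z_sr, z_hr)

# g_loss = adversarial_loss + pixel_loss + latent_consistency_loss(E, G(lr), hr)
```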
Frequency Fitness Assignment: Making Optimization Algorithms Invariant under Bijective Transformations of the Objective Function
Title | Frequency Fitness Assignment: Making Optimization Algorithms Invariant under Bijective Transformations of the Objective Function |
Authors | Thomas Weise, Zhize Wu, Xinlu Li, Yan Chen |
Abstract | Under Frequency Fitness Assignment (FFA), the fitness corresponding to an objective value is its encounter frequency in fitness assignment steps and is subject to minimization. FFA renders optimization processes invariant under bijective transformations of the objective function. This is the strongest invariance property of any optimization procedure to our knowledge. On TwoMax, Jump, and Trap functions of scale s, a (1+1)-EA with standard mutation at rate 1/s can have expected running times exponential in s. In our experiments, a (1+1)-FEA, the same algorithm but using FFA, exhibits mean running times quadratic in s. Since Jump and Trap are bijective transformations of OneMax, it behaves identically on all three. On the LeadingOnes and Plateau problems, it seems to be slower than the (1+1)-EA by a factor linear in s. The (1+1)-FEA performs much better than the (1+1)-EA on W-Model and MaxSat instances. Due to the bijection invariance, the behavior of an optimization algorithm using FFA does not change when the objective values are encrypted. We verify this by applying the Md5 checksum computation as transformation to some of the above problems and observe the same behaviors. Finally, FFA can improve the performance of a Memetic Algorithm for Job Shop Scheduling. |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01416v2 |
https://arxiv.org/pdf/2001.01416v2.pdf | |
PWC | https://paperswithcode.com/paper/frequency-fitness-assignment-making |
Repo | |
Framework | |
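The mechanism is simple enough to state in code: keep a table of how often each objective value has been encountered, and accept the offspring whenever its objective value has been seen no more often than the parent's. The sketch below is a (1+1)-FEA on bit strings, written as one plausible reading of the abstract rather than the authors' reference implementation.

```python
import random
from collections import defaultdict

def one_plus_one_fea(objective, n_bits, max_steps=100_000, seed=0):
    """(1+1)-EA with Frequency Fitness Assignment: objective values that have been
    encountered less often are considered fitter, which makes the search invariant
    under bijective transformations of the objective function."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    freq = defaultdict(int)
    for _ in range(max_steps):
        y = [b ^ (rng.random() < 1.0 / n_bits) for b in x]   # standard bit-flip mutation
        fx, fy = objective(x), objective(y)
        freq[fx] += 1
        freq[fy] += 1
        if freq[fy] <= freq[fx]:                             # rarer objective value wins
            x = y
    return x

# onemax = lambda bits: sum(bits)
# best = one_plus_one_fea(onemax, n_bits=64)
```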
Towards a Complete Pipeline for Segmenting Nuclei in Feulgen-Stained Images
Title | Towards a Complete Pipeline for Segmenting Nuclei in Feulgen-Stained Images |
Authors | Luiz Antonio Buschetto Macarini, Aldo von Wangenheim, Felipe Perozzo Daltoé, Alexandre Sherlley Casimiro Onofre, Fabiana Botelho de Miranda Onofre, Marcelo Ricardo Stemmer |
Abstract | Cervical cancer is the second most common cancer type in women around the world. In some countries, due to non-existent or inadequate screening, it is often detected at late stages, making standard treatment options often absent or unaffordable. It is a deadly disease that could benefit from early detection approaches. Screening is usually done by cytological exams, which consist of visually inspecting nuclei in search of morphological alterations. Since this is done by humans, some subjectivity is naturally introduced. Computational methods could be used to reduce this, where the first stage of the process would be nuclei segmentation. In this context, we present a complete pipeline for the segmentation of nuclei in Feulgen-stained images using Convolutional Neural Networks. We show the entire process, from sample collection through pre-processing, network training and post-processing, to results evaluation. We achieved an overall IoU of 0.78, demonstrating the feasibility of nuclei segmentation on Feulgen-stained images. The code is available at: https://github.com/luizbuschetto/feulgen_nuclei_segmentation. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08331v1 |
https://arxiv.org/pdf/2002.08331v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-complete-pipeline-for-segmenting |
Repo | |
Framework | |
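Since the pipeline is summarized by a single overlap metric, it may help to spell out how intersection over union is computed for binary nucleus masks. The snippet below is a standard NumPy formulation, not code from the linked repository.

```python
import numpy as np

def binary_iou(pred_mask, true_mask, eps=1e-7):
    """Intersection over union for binary segmentation masks (values 0/1)."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return float(intersection / (union + eps))

# iou = binary_iou(network_output > 0.5, ground_truth)   # e.g. the ~0.78 reported in the paper
```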
Gated Path Selection Network for Semantic Segmentation
Title | Gated Path Selection Network for Semantic Segmentation |
Authors | Qichuan Geng, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Zhong Zhou, Gao Huang |
Abstract | Semantic segmentation is a challenging task that needs to handle large scale variations, deformations and different viewpoints. In this paper, we develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields. In GPSNet, we first design a two-dimensional multi-scale network, SuperNet, which densely incorporates features from growing receptive fields. To dynamically select desirable semantic context, a gate prediction module is further introduced. In contrast to previous works that focus on optimizing sample positions on the regular grids, GPSNet can adaptively capture free-form dense semantic contexts. The derived adaptive receptive fields are data-dependent and flexible enough to model different object geometric transformations. On two representative semantic segmentation datasets, i.e., Cityscapes and ADE20K, we show that the proposed approach consistently outperforms previous methods and achieves competitive performance without bells and whistles. |
Tasks | Semantic Segmentation |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06819v1 |
https://arxiv.org/pdf/2001.06819v1.pdf | |
PWC | https://paperswithcode.com/paper/gated-path-selection-network-for-semantic |
Repo | |
Framework | |
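The gate prediction idea, learning per-pixel weights that select among feature maps with different receptive fields, can be illustrated with a small PyTorch module. This is a generic gated fusion of multi-scale branches, not the actual GPSNet/SuperNet design; branch features are assumed to share spatial size and channel count.

```python
import torch
import torch.nn as nn

class GatedScaleFusion(nn.Module):
    """Predict per-pixel gates over multi-scale branches and fuse them by a weighted sum,
    so the effective receptive field adapts to the content at each location."""
    def __init__(self, channels, num_branches):
        super().__init__()
        self.gate = nn.Conv2d(channels * num_branches, num_branches, kernel_size=1)

    def forward(self, branches):
        # branches: list of (B, C, H, W) feature maps from growing receptive fields
        stacked = torch.stack(branches, dim=1)              # (B, K, C, H, W)
        gates = self.gate(torch.cat(branches, dim=1))       # (B, K, H, W)
        gates = torch.softmax(gates, dim=1).unsqueeze(2)    # normalize across branches
        return (gates * stacked).sum(dim=1)                 # (B, C, H, W)

# fusion = GatedScaleFusion(channels=256, num_branches=4)
# fused = fusion([f1, f2, f3, f4])
```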
Multi-stream Faster RCNN for Mitosis Counting in Breast Cancer Images
Title | Multi-stream Faster RCNN for Mitosis Counting in Breast Cancer Images |
Authors | Robin Elizabeth Yancey |
Abstract | Mitotic count is a commonly used method to assess the level of progression of breast cancer, which is now the fourth most prevalent cancer. Unfortunately, counting mitoses is a tedious and subjective task with poor reproducibility, especially for non-experts. Since machines can read and compare more data with greater efficiency, automated counting could be the next modern technique for counting mitoses. Furthermore, technological advancements in medicine have led to an increase in image data available for use in training. In this work, we propose a network constructed using a similar approach to one that has been used for image fraud detection, with the segmented image map as the second stream input to Faster RCNN. This region-based detection model combines a fully convolutional Region Proposal Network to generate proposals and a classification network to classify each of these proposals as containing mitosis or not. Features from both streams are fused in the bilinear pooling layer to maintain the spatial concurrence of each. After training this model on the ICPR 2014 MITOSIS contest dataset, we obtained an F-measure score of 0.507, higher than both the winner's score and scores from recent tests on the same data. Our method is clinically applicable, taking only around five minutes per ten full High Power Field slides when tested on a Quadro P6000 cloud GPU. |
Tasks | Fraud Detection |
Published | 2020-02-01 |
URL | https://arxiv.org/abs/2002.03781v1 |
https://arxiv.org/pdf/2002.03781v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-stream-faster-rcnn-for-mitosis-counting |
Repo | |
Framework | |
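The fusion step described above, combining RGB-stream and segmentation-map-stream features with bilinear pooling, can be sketched as an outer product of the two per-proposal feature vectors followed by the usual signed-square-root and L2 normalization. The snippet is a generic illustration of bilinear fusion, not the paper's exact Faster R-CNN integration.

```python
import torch
import torch.nn.functional as F

def bilinear_fusion(feat_rgb, feat_seg, eps=1e-8):
    """Fuse two per-proposal feature vectors via their outer product (bilinear pooling),
    then apply signed square-root and L2 normalization, as is standard for this pooling."""
    # feat_rgb: (num_proposals, d1), feat_seg: (num_proposals, d2)
    outer = torch.einsum("nd,ne->nde", feat_rgb, feat_seg)      # (num_proposals, d1, d2)
    fused = outer.flatten(start_dim=1)                          # (num_proposals, d1 * d2)
    fused = torch.sign(fused) * torch.sqrt(torch.abs(fused) + eps)
    return F.normalize(fused, dim=1)

# scores = classifier_head(bilinear_fusion(rgb_roi_feats, seg_roi_feats))
```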