April 1, 2020

Paper Group ANR 487

Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes

Title Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes
Authors Rachel Lea Draelos, David Dov, Maciej A. Mazurowski, Joseph Y. Lo, Ricardo Henao, Geoffrey D. Rubin, Lawrence Carin
Abstract Machine learning models for radiology benefit from large-scale data sets with high quality labels for abnormalities. We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients. This is the largest multiply-annotated volumetric medical imaging data set reported. To annotate this data set, we developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports with an average F-score of 0.976 (min 0.941, max 1.0). We also developed a model for multi-organ, multi-disease classification of chest CT volumes that uses a deep convolutional neural network (CNN). This model reached a classification performance of AUROC greater than 0.90 for 18 abnormalities, with an average AUROC of 0.773 for all 83 abnormalities, demonstrating the feasibility of learning from unfiltered whole volume CT data. We show that training on more labels improves performance significantly: for a subset of 9 labels - nodule, opacity, atelectasis, pleural effusion, consolidation, mass, pericardial effusion, cardiomegaly, and pneumothorax - the model’s average AUROC increased by 10% when the number of training labels was increased from 9 to all 83. All code for volume preprocessing, automated label extraction, and the volume abnormality prediction model will be made publicly available. The 36,316 CT volumes and labels will also be made publicly available pending institutional approval.
Tasks Computed Tomography (CT)
Published 2020-02-12
URL https://arxiv.org/abs/2002.04752v2
PDF https://arxiv.org/pdf/2002.04752v2.pdf
PWC https://paperswithcode.com/paper/machine-learning-based-multiple-abnormality
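
The abstract above reports a per-abnormality AUROC and an average over all 83 labels. As a rough illustration of that metric (not the authors' evaluation code), the sketch below computes AUROC for one label by pairwise rank comparison and then averages over labels. The function names `auroc` and `mean_auroc` and the toy inputs are assumptions made for this example.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(label_columns, score_columns):
    """Average the per-label AUROC over several labels (one column per abnormality)."""
    values = [auroc(y, s) for y, s in zip(label_columns, score_columns)]
    return sum(values) / len(values)


# Toy usage with two hypothetical abnormality labels.
print(mean_auroc([[0, 1], [0, 0, 1, 1]], [[0.2, 0.9], [0.1, 0.4, 0.35, 0.8]]))
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a sort-based rank formulation would be the usual choice at the scale of tens of thousands of volumes.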

Co-VeGAN: Complex-Valued Generative Adversarial Network for Compressive Sensing MR Image Reconstruction

Title Co-VeGAN: Complex-Valued Generative Adversarial Network for Compressive Sensing MR Image Reconstruction
Authors Bhavya Vasudeva, Puneesh Deora, Saumik Bhattacharya, Pyari Mohan Pradhan
Abstract Compressive sensing (CS) is widely used to reduce the image acquisition time of magnetic resonance imaging (MRI). Though CS-based undersampling has numerous benefits, such as high-quality images with fewer motion artefacts and a low storage requirement, the reconstruction of the image from the CS-undersampled data is an ill-posed inverse problem which requires extensive computation and resources. In this paper, we propose a novel deep network that can process complex-valued input to perform high-quality reconstruction. Our model is based on a generative adversarial network (GAN) that uses residual-in-residual dense blocks in a modified U-net generator with a patch-based discriminator. We introduce a wavelet-based loss in the complex GAN model for better reconstruction quality. Extensive analyses on different datasets demonstrate that the proposed model significantly outperforms the existing CS reconstruction techniques in terms of peak signal-to-noise ratio and structural similarity index.
Tasks Compressive Sensing, Image Reconstruction
Published 2020-02-24
URL https://arxiv.org/abs/2002.10523v1
PDF https://arxiv.org/pdf/2002.10523v1.pdf
PWC https://paperswithcode.com/paper/co-vegan-complex-valued-generative
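
Reconstruction quality in the abstract above is reported as peak signal-to-noise ratio (PSNR). A minimal sketch of that metric follows, assuming real-valued magnitude images flattened to lists and an 8-bit peak value of 255; the function name `psnr` and the `max_value` default are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy usage: two pixels, each off by 10 grey levels (MSE = 100).
print(round(psnr([100, 100], [110, 90]), 2))
```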

Superbloom: Bloom filter meets Transformer

Title Superbloom: Bloom filter meets Transformer
Authors John Anderson, Qingqing Huang, Walid Krichene, Steffen Rendle, Li Zhang
Abstract We extend the idea of word pieces in natural language models to machine learning tasks on opaque ids. This is achieved by applying hash functions to map each id to multiple hash tokens in a much smaller space, similarly to a Bloom filter. We show that by applying a multi-layer Transformer to these Bloom filter digests, we are able to obtain models with high accuracy. They outperform models of a similar size without hashing and, to a large degree, models of a much larger size trained using sampled softmax with the same computational budget. Our key observation is that it is important to use a multi-layer Transformer for Bloom filter digests to remove ambiguity in the hashed input. We believe this provides an alternative method to solving problems with large vocabulary size.
Published 2020-02-11
URL https://arxiv.org/abs/2002.04723v1
PDF https://arxiv.org/pdf/2002.04723v1.pdf
PWC https://paperswithcode.com/paper/superbloom-bloom-filter-meets-transformer-1
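
The hashing step described in the abstract above, where each opaque id is mapped to several tokens in a much smaller space, can be sketched roughly as follows. This is an assumption-laden illustration: the use of seeded SHA-256, the name `bloom_digest`, and the default sizes are choices made here, and the paper's actual hash functions are not reproduced.

```python
import hashlib


def bloom_digest(item_id, num_hashes=2, num_buckets=1000):
    """Map one opaque id to `num_hashes` tokens drawn from `num_buckets` possible values.

    Each seed acts as an independent hash function, as in a Bloom filter.
    """
    tokens = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{item_id}".encode("utf-8")).digest()
        tokens.append(int.from_bytes(digest[:8], "big") % num_buckets)
    return tokens


# Toy usage: a hypothetical id becomes a short sequence of hash tokens.
print(bloom_digest("item-123456"))
```

Two different ids can share some tokens, which is the ambiguity the abstract says a multi-layer Transformer over the digests helps resolve.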

2.75D Convolutional Neural Network for Pulmonary Nodule Classification in Chest CT

Title 2.75D Convolutional Neural Network for Pulmonary Nodule Classification in Chest CT
Authors Ruisheng Su, Weiyi Xie, Tao Tan
Abstract Early detection and classification of pulmonary nodules in chest computed tomography (CT) images is an essential step for effective treatment of lung cancer. However, due to the large volume of CT data, finding nodules in chest CT is a time-consuming and thus error-prone task for radiologists. Benefiting from recent advances in Convolutional Neural Networks (ConvNets), many ConvNet-based algorithms for automatic nodule detection have been proposed. According to the data representation of their input, these algorithms can be further categorized into 2D, 3D, and 2.5D, the last of which uses a combination of 2D images to approximate 3D information. Leveraging 3D spatial and contextual information, methods using 3D input generally outperform those based on 2D or 2.5D input, whereas their large memory footprint becomes the bottleneck for many applications. In this paper, we propose a novel 2D data representation of a 3D CT volume, referred to as 2.75D, which is constructed by spiral scanning a set of radials originating from the 3D volume center. Compared to 2.5D, the 2.75D representation captures omni-directional spatial information of a 3D volume. Based on the 2.75D representation of 3D nodule candidates in chest CT, we train a convolutional neural network to perform false positive reduction in the nodule detection pipeline. We evaluate the nodule false positive reduction system on the LUNA16 data set, which contains 1186 nodules out of 551,065 candidates. By comparing 2.75D with 2D, 2.5D and 3D, we show that our system using 2.75D input outperforms 2D and 2.5D, yet is slightly inferior to the systems using 3D input. The proposed strategy dramatically reduces memory consumption and thus allows fast inference and training by enabling a larger number of batches compared to methods using 3D input.
Tasks Computed Tomography (CT)
Published 2020-02-11
URL https://arxiv.org/abs/2002.04251v1
PDF https://arxiv.org/pdf/2002.04251v1.pdf
PWC https://paperswithcode.com/paper/275d-convolutional-neural-network-for

Outlier Guided Optimization of Abdominal Segmentation

Title Outlier Guided Optimization of Abdominal Segmentation
Authors Yuchen Xu, Olivia Tang, Yucheng Tang, Ho Hin Lee, Yunqiang Chen, Dashan Gao, Shizhong Han, Riqiang Gao, Michael R. Savona, Richard G. Abramson, Yuankai Huo, Bennett A. Landman
Abstract Abdominal multi-organ segmentation of computed tomography (CT) images has been the subject of extensive research interest. It presents a substantial challenge in medical image processing, as the shape and distribution of abdominal organs can vary greatly among the population and within an individual over time. While continuous integration of novel datasets into the training set provides potential for better segmentation performance, collection of data at scale is not only costly, but also impractical in some contexts. Moreover, it remains unclear what marginal value additional data have to offer. Herein, we propose a single-pass active learning method through human quality assurance (QA). We built on a pre-trained 3D U-Net model for abdominal multi-organ segmentation and augmented the dataset either with outlier data (e.g., exemplars for which the baseline algorithm failed) or inliers (e.g., exemplars for which the baseline algorithm worked). The new models were trained using the augmented datasets with 5-fold cross-validation (for outlier data) and withheld outlier samples (for inlier data). Manual labeling of outliers increased Dice scores with outliers by 0.130, compared to an increase of 0.067 with inliers (p<0.001, two-tailed paired t-test). By adding 5 to 37 inliers or outliers to training, we find that the marginal value of adding outliers is higher than that of adding inliers. In summary, improvement on single-organ performance was obtained without diminishing multi-organ performance or significantly increasing training time. Hence, identification and correction of baseline failure cases present an effective and efficient method of selecting training data to improve algorithm performance.
Tasks Active Learning, Computed Tomography (CT)
Published 2020-02-10
URL https://arxiv.org/abs/2002.04098v1
PDF https://arxiv.org/pdf/2002.04098v1.pdf
PWC https://paperswithcode.com/paper/outlier-guided-optimization-of-abdominal
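
The improvements quoted in the abstract above are differences in Dice score. As a reminder of what that number measures, here is a minimal sketch for two binary masks given as flat 0/1 sequences; the function name and the convention of returning 1.0 for two empty masks are assumptions for this example.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy usage: one voxel of overlap between two masks of two voxels each.
print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))
```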

FPGA Implementation of Minimum Mean Brightness Error Bi-Histogram Equalization

Title FPGA Implementation of Minimum Mean Brightness Error Bi-Histogram Equalization
Authors Abhishek Saroha, Avichal Rakesh, Rajiv Kumar Tripathi
Abstract Histogram Equalization (HE) is a popular method for contrast enhancement. Generally, mean brightness is not conserved in Histogram Equalization. Initially, Bi-Histogram Equalization (BBHE) was proposed to enhance contrast while maintaining the mean brightness. However, when mean brightness is the primary concern, Minimum Mean Brightness Error Bi-Histogram Equalization (MMBEBHE) is the best technique. There are several implementations of Histogram Equalization on FPGA; however, to our knowledge, MMBEBHE has not been implemented on FPGAs before. Therefore, we present an implementation of MMBEBHE on FPGA.
Published 2020-02-12
URL https://arxiv.org/abs/2003.00840v1
PDF https://arxiv.org/pdf/2003.00840v1.pdf
PWC https://paperswithcode.com/paper/fpga-implementation-of-minimum-mean
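
To make the idea in the abstract above concrete, the following is a brute-force software sketch of bi-histogram equalization with the threshold chosen to minimize the absolute difference between input and output mean brightness. It assumes the standard formulation in which the lower and upper sub-histograms are equalized independently into the ranges [0, t] and [t+1, L-1]. It is not the FPGA design from the paper, and the function names are made up for this example.

```python
def bi_histogram_equalize(pixels, threshold, levels=256):
    """Equalize grey levels <= threshold into [0, threshold] and the rest into the upper range."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n_low = sum(hist[: threshold + 1])
    n_high = len(pixels) - n_low
    mapping = [0] * levels
    cum = 0
    for g in range(threshold + 1):  # lower sub-histogram
        cum += hist[g]
        if n_low:
            mapping[g] = round(threshold * cum / n_low)
    cum = 0
    for g in range(threshold + 1, levels):  # upper sub-histogram
        cum += hist[g]
        if n_high:
            mapping[g] = round(threshold + 1 + (levels - 2 - threshold) * cum / n_high)
    return [mapping[p] for p in pixels]


def mmbebhe(pixels, levels=256):
    """Try every threshold and keep the one with the smallest mean brightness error."""
    mean_in = sum(pixels) / len(pixels)
    best = None
    for t in range(levels - 1):
        out = bi_histogram_equalize(pixels, t, levels)
        err = abs(sum(out) / len(out) - mean_in)
        if best is None or err < best[0]:
            best = (err, t, out)
    return best[1], best[2]


# Toy usage on a tiny 3-bit "image".
threshold, enhanced = mmbebhe([0, 1, 1, 2, 2, 2, 3, 5, 6, 7, 7, 7], levels=8)
print(threshold, enhanced)
```

The exhaustive search re-equalizes the image for every candidate threshold, which is only reasonable for a toy example; a hardware or production implementation would compute the error for each threshold far more economically.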

Attention U-Net Based Adversarial Architectures for Chest X-ray Lung Segmentation

Title Attention U-Net Based Adversarial Architectures for Chest X-ray Lung Segmentation
Authors Gusztáv Gaál, Balázs Maga, András Lukács
Abstract Chest X-ray is the most common test among medical imaging modalities. It is applied for detection and differentiation of, among others, lung cancer, tuberculosis, and pneumonia, the last being of particular importance due to the COVID-19 disease. Integrating computer-aided detection methods into the radiologist's diagnostic pipeline greatly reduces the doctors' workload while increasing reliability and enabling quantitative analysis. Here we present a novel deep learning approach for lung segmentation, a basic but arduous task in the diagnostic pipeline. Our method uses state-of-the-art fully convolutional neural networks in conjunction with an adversarial critic model. It generalized well to CXR images of unseen datasets with different patient profiles, achieving a final DSC of 97.5% on the JSRT dataset.
Published 2020-03-23
URL https://arxiv.org/abs/2003.10304v1
PDF https://arxiv.org/pdf/2003.10304v1.pdf
PWC https://paperswithcode.com/paper/attention-u-net-based-adversarial

A Deep Learning Approach to Automate High-Resolution Blood Vessel Reconstruction on Computerized Tomography Images With or Without the Use of Contrast Agent

Title A Deep Learning Approach to Automate High-Resolution Blood Vessel Reconstruction on Computerized Tomography Images With or Without the Use of Contrast Agent
Authors Anirudh Chandrashekar, Ashok Handa, Natesh Shivakumar, Pierfrancesco Lapolla, Vicente Grau, Regent Lee
Abstract Existing methods to reconstruct vascular structures from a computed tomography (CT) angiogram rely on injection of intravenous contrast to enhance the radio-density within the vessel lumen. However, pathological changes can be present in the blood lumen, vessel wall or a combination of both that prevent accurate reconstruction. In the example of aortic aneurysmal disease, a blood clot or thrombus adherent to the aortic wall within the expanding aneurysmal sac is present in 70-80% of cases. These deformations prevent the automatic extraction of vital clinically relevant information by current methods. In this study, we implemented a modified U-Net architecture with attention-gating to establish a high-throughput and automated segmentation pipeline of pathological blood vessels in CT images acquired with or without the use of a contrast agent. Twenty-six patients with paired non-contrast and contrast-enhanced CT images within the ongoing Oxford Abdominal Aortic Aneurysm (OxAAA) study were randomly selected, manually annotated and used for model training and evaluation (13/13). Data augmentation methods were implemented to diversify the training data set in a ratio of 10:1. The performance of our Attention-based U-Net in extracting both the inner lumen and the outer wall of the aortic aneurysm from CT angiograms (CTA) was compared against a generic 3-D U-Net and displayed superior results. Subsequent implementation of this network architecture within the aortic segmentation pipeline from both contrast-enhanced CTA and non-contrast CT images has allowed for accurate and efficient extraction of the entire aortic volume. This extracted volume can be used to standardize current methods of aneurysmal disease management and sets the foundation for subsequent complex geometric and morphological analysis. Furthermore, the proposed pipeline can be extended to other vascular pathologies.
Tasks Computed Tomography (CT), Data Augmentation, Morphological Analysis
Published 2020-02-09
URL https://arxiv.org/abs/2002.03463v1
PDF https://arxiv.org/pdf/2002.03463v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-approach-to-automate-high

Web Table Extraction, Retrieval and Augmentation: A Survey

Title Web Table Extraction, Retrieval and Augmentation: A Survey
Authors Shuo Zhang, Krisztian Balog
Abstract Tables are a powerful and popular tool for organizing and manipulating data. A vast number of tables can be found on the Web, which represents a valuable knowledge resource. The objective of this survey is to synthesize and present two decades of research on web tables. In particular, we organize existing literature into six main categories of information access tasks: table extraction, table interpretation, table search, question answering, knowledge base augmentation, and table augmentation. For each of these tasks, we identify and describe seminal approaches, present relevant resources, and point out interdependencies among the different tasks.
Tasks Question Answering
Published 2020-02-01
URL https://arxiv.org/abs/2002.00207v2
PDF https://arxiv.org/pdf/2002.00207v2.pdf
PWC https://paperswithcode.com/paper/web-table-extraction-retrieval-and

DGST : Discriminator Guided Scene Text detector

Title DGST : Discriminator Guided Scene Text detector
Authors Jinyuan Zhao, Yanna Wang, Baihua Xiao, Cunzhao Shi, Fuxi Jia, Chunheng Wang
Abstract The scene text detection task has attracted considerable attention in computer vision because of its wide application. In recent years, many researchers have introduced methods of semantic segmentation into the task of scene text detection and achieved promising results. This paper proposes a detector framework based on conditional generative adversarial networks to improve the segmentation effect of scene text detection, called DGST (Discriminator Guided Scene Text detector). Instead of the binary text score maps generated by some existing semantic segmentation based methods, we generate a multi-scale soft text score map with more information to represent the text position more reasonably, and solve the problem of text pixel adhesion in the process of text extraction. Experiments on standard datasets demonstrate that the proposed DGST brings noticeable gain and outperforms state-of-the-art methods. Specifically, it achieves an F-measure of 87% on the ICDAR 2015 dataset.
Tasks Scene Text Detection, Semantic Segmentation
Published 2020-02-28
URL https://arxiv.org/abs/2002.12509v1
PDF https://arxiv.org/pdf/2002.12509v1.pdf
PWC https://paperswithcode.com/paper/dgst-discriminator-guided-scene-text-detector

Multivariate Boosted Trees and Applications to Forecasting and Control

Title Multivariate Boosted Trees and Applications to Forecasting and Control
Authors Lorenzo Nespoli, Vasco Medici
Abstract Gradient boosted trees are competition-winning, general-purpose, non-parametric regressors, which exploit sequential model fitting and gradient descent to minimize a specific loss function. The most popular implementations are tailored to univariate regression and classification tasks, precluding the possibility of capturing multivariate target cross-correlations and applying conditional penalties to the predictions. In this paper, we present a computationally efficient algorithm for fitting multivariate boosted trees. We show that multivariate trees can outperform their univariate counterparts when the predictions are correlated. Furthermore, the algorithm allows arbitrary regularization of the predictions, so that properties like smoothness, consistency and functional relations can be enforced. We present applications and numerical results related to forecasting and control.
Published 2020-03-08
URL https://arxiv.org/abs/2003.03835v1
PDF https://arxiv.org/pdf/2003.03835v1.pdf
PWC https://paperswithcode.com/paper/multivariate-boosted-trees-and-applications

The Discrete Gaussian for Differential Privacy

Title The Discrete Gaussian for Differential Privacy
Authors Clément Canonne, Gautam Kamath, Thomas Steinke
Abstract We show how to efficiently provide differentially private answers to counting queries (or integer-valued low-sensitivity queries) by adding discrete Gaussian noise, with essentially the same privacy and accuracy as the continuous Gaussian. The use of a discrete distribution is necessary in practice, as finite computers cannot represent samples from continuous distributions and numerical errors may destroy the privacy guarantee.
Published 2020-03-31
URL https://arxiv.org/abs/2004.00010v1
PDF https://arxiv.org/pdf/2004.00010v1.pdf
PWC https://paperswithcode.com/paper/the-discrete-gaussian-for-differential
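
For intuition about the abstract above: a discrete Gaussian with scale sigma places probability proportional to exp(-x^2 / (2 sigma^2)) on each integer x. The sketch below builds a truncated version of that distribution and adds a draw from it to a count. It is for illustration only: it relies on floating-point arithmetic and a truncated support, so it does not carry the privacy guarantee the paper is concerned with, and it is not the paper's sampling algorithm. The function names and the `tail` parameter are assumptions.

```python
import math
import random


def discrete_gaussian_pmf(sigma, tail=10):
    """Integer support and probabilities, truncated at +/- ceil(tail * sigma)."""
    bound = math.ceil(tail * sigma)
    support = list(range(-bound, bound + 1))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in support]
    total = sum(weights)
    return support, [w / total for w in weights]


def noisy_count(true_count, sigma, rng=random):
    """Add integer-valued noise to a counting query (illustrative, not privacy-safe)."""
    support, probs = discrete_gaussian_pmf(sigma)
    noise = rng.choices(support, weights=probs, k=1)[0]
    return true_count + noise


# Toy usage: the released value stays an integer, unlike with continuous Gaussian noise.
print(noisy_count(100, sigma=2.0, rng=random.Random(0)))
```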

Music2Dance: DanceNet for Music-driven Dance Generation

Title Music2Dance: DanceNet for Music-driven Dance Generation
Authors Wenlin Zhuang, Congyi Wang, Siyu Xia, Jinxiang Chai, Yangang Wang
Abstract Synthesizing human motions from music, i.e., music to dance, is appealing and has attracted considerable research interest in recent years. It is challenging not only because dance requires realistic and complex human motions but, more importantly, because the synthesized motions should be consistent with the style, rhythm and melody of the music. In this paper, we propose a novel autoregressive generative model, DanceNet, which takes the style, rhythm and melody of music as control signals to generate 3D dance motions with high realism and diversity. To boost the performance of our proposed model, we capture several synchronized music-dance pairs performed by professional dancers and build a high-quality music-dance pair dataset. Experiments demonstrate that the proposed method achieves state-of-the-art results.
Published 2020-02-02
URL https://arxiv.org/abs/2002.03761v2
PDF https://arxiv.org/pdf/2002.03761v2.pdf
PWC https://paperswithcode.com/paper/music2dance-music-driven-dance-generation

Extending Automated Deduction for Commonsense Reasoning

Title Extending Automated Deduction for Commonsense Reasoning
Authors Tanel Tammet
Abstract Commonsense reasoning has long been considered as one of the holy grails of artificial intelligence. Most of the recent progress in the field has been achieved by novel machine learning algorithms for natural language processing. However, without incorporating logical reasoning, these algorithms remain arguably shallow. With some notable exceptions, developers of practical automated logic-based reasoners have mostly avoided focusing on the problem. The paper argues that the methods and algorithms used by existing automated reasoners for classical first-order logic can be extended towards commonsense reasoning. Instead of devising new specialized logics we propose a framework of extensions to the mainstream resolution-based search methods to make these capable of performing search tasks for practical commonsense reasoning with reasonable efficiency. The proposed extensions mostly rely on operating on ordinary proof trees and are devised to handle commonsense knowledge bases containing inconsistencies, default rules, taxonomies, topics, relevance, confidence and similarity measures. We claim that machine learning is best suited for the construction of commonsense knowledge bases while the extended logic-based methods would be well-suited for actually answering queries from these knowledge bases.
Published 2020-03-29
URL https://arxiv.org/abs/2003.13159v1
PDF https://arxiv.org/pdf/2003.13159v1.pdf
PWC https://paperswithcode.com/paper/extending-automated-deduction-for-commonsense

Tiny Eats: Eating Detection on a Microcontroller

Title Tiny Eats: Eating Detection on a Microcontroller
Authors Maria T. Nyamukuru, Kofi M. Odame
Abstract There is a growing interest in low power highly efficient wearable devices for automatic dietary monitoring (ADM) [1]. The success of deep neural networks in audio event classification problems makes them ideal for this task. Deep neural networks are, however, not only computationally intensive and energy inefficient but also require a large amount of memory. To address these challenges, we propose a shallow gated recurrent unit (GRU) architecture suitable for resource-constrained applications. This paper describes the implementation of the Tiny Eats GRU, a shallow GRU neural network, on a low power micro-controller, Arm Cortex M0+, to classify eating episodes. Tiny Eats GRU is a hybrid of the traditional GRU [2] and eGRU [3] to make it small and fast enough to fit on the Arm Cortex M0+ with comparable accuracy to the traditional GRU. The Tiny Eats GRU utilizes only 4% of the Arm Cortex M0+ memory and identifies eating or non-eating episodes with 6 ms latency and accuracy of 95.15%.
Published 2020-03-14
URL https://arxiv.org/abs/2003.06699v1
PDF https://arxiv.org/pdf/2003.06699v1.pdf
PWC https://paperswithcode.com/paper/tiny-eats-eating-detection-on-a
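
The abstract above builds on the gated recurrent unit. For reference, one step of a textbook GRU cell can be sketched in plain Python as below. This is the standard formulation only; the Tiny Eats hybrid of GRU and eGRU and its fixed-point details are not reproduced here, and the parameter dictionary layout is an assumption for this example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gru_step(x, h, params):
    """One standard GRU update: gates z and r, candidate state, then interpolation."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(params["Wz"], x), matvec(params["Uz"], h), params["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(params["Wr"], x), matvec(params["Ur"], h), params["br"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = [math.tanh(a + b + c) for a, b, c in
                 zip(matvec(params["Wh"], x), matvec(params["Uh"], gated_h), params["bh"])]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


# Toy usage: 3 input features, 2 hidden units, all-zero (untrained) parameters.
zeros_in = [[0.0] * 3 for _ in range(2)]
zeros_hid = [[0.0] * 2 for _ in range(2)]
params = {"Wz": zeros_in, "Uz": zeros_hid, "bz": [0.0, 0.0],
          "Wr": zeros_in, "Ur": zeros_hid, "br": [0.0, 0.0],
          "Wh": zeros_in, "Uh": zeros_hid, "bh": [0.0, 0.0]}
print(gru_step([0.3, -0.1, 0.7], [1.0, -2.0], params))
```

With all-zero parameters both gates evaluate to 0.5 and the candidate state to 0, so the hidden state is simply halved, which makes the update rule easy to check by hand.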