April 3, 2020

Paper Group ANR 23

A Deep learning Approach to Generate Contrast-Enhanced Computerised Tomography Angiography without the Use of Intravenous Contrast Agents

Title A Deep learning Approach to Generate Contrast-Enhanced Computerised Tomography Angiography without the Use of Intravenous Contrast Agents
Authors Anirudh Chandrashekar, Ashok Handa, Natesh Shivakumar, Pierfrancesco Lapolla, Vicente Grau, Regent Lee
Abstract Contrast-enhanced computed tomography angiograms (CTAs) are widely used in cardiovascular imaging to obtain a non-invasive view of arterial structures. However, contrast agents are associated with complications at the injection site as well as renal toxicity leading to contrast-induced nephropathy (CIN) and renal failure. We hypothesised that the raw data acquired from a non-contrast CT contains sufficient information to differentiate blood and other soft tissue components. We utilised deep learning methods to define the subtleties between soft tissue components in order to simulate contrast enhanced CTAs without contrast agents. Twenty-six patients with paired non-contrast and CTA images were randomly selected from an approved clinical study. Non-contrast axial slices within the AAA from 10 patients (n = 100) were sampled for the underlying Hounsfield unit (HU) distribution at the lumen, intra-luminal thrombus and interface locations. Sampling of HUs in these regions revealed significant differences between all regions (p<0.001 for all comparisons), confirming the intrinsic differences in the radiomic signatures between these regions. To generate a large training dataset, paired axial slices from the training set (n=13) were augmented to produce a total of 23,551 2-D images. We trained a 2-D Cycle Generative Adversarial Network (cycleGAN) for this non-contrast to contrast (NC2C) transformation task. The accuracy of the cycleGAN output was assessed by comparison to the contrast image. This pipeline is able to differentiate between visually incoherent soft tissue regions in non-contrast CT images. The CTAs generated from the non-contrast images bear strong resemblance to the ground truth. Here we describe a novel application of Generative Adversarial Network for CT image processing. This is poised to disrupt clinical pathways requiring contrast enhanced CT imaging.
Published 2020-03-02
URL https://arxiv.org/abs/2003.01223v1
PDF https://arxiv.org/pdf/2003.01223v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-approach-to-generate-contrast

Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation

Title Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation
Authors Steven Kleinegesse, Michael U. Gutmann
Abstract Implicit stochastic models, where the data-generation distribution is intractable but sampling is possible, are ubiquitous in the natural sciences. The models typically have free parameters that need to be inferred from data collected in scientific experiments. A fundamental question is how to design the experiments so that the collected data are most useful. The field of Bayesian experimental design advocates that, ideally, we should choose designs that maximise the mutual information (MI) between the data and the parameters. For implicit models, however, this approach is severely hampered by the high computational cost of computing posteriors and maximising MI, in particular when we have more than a handful of design variables to optimise. In this paper, we propose a new approach to Bayesian experimental design for implicit models that leverages recent advances in neural MI estimation to deal with these issues. We show that training a neural network to maximise a lower bound on MI allows us to jointly determine the optimal design and the posterior. Simulation studies illustrate that this gracefully extends Bayesian experimental design for implicit models to higher design dimensions.
Published 2020-02-19
URL https://arxiv.org/abs/2002.08129v1
PDF https://arxiv.org/pdf/2002.08129v1.pdf
PWC https://paperswithcode.com/paper/bayesian-experimental-design-for-implicit
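
The lower bound mentioned in the abstract is not spelled out here. As a hedged illustration, one widely used bound of this kind is the Donsker-Varadhan bound behind mutual information neural estimation (MINE); whether the paper uses exactly this form is an assumption. The symbols \theta (parameters), y (data), d (design) and T_\psi (a neural network critic with weights \psi) are notation introduced for this sketch:

```latex
I(\theta; y \mid d) \;\ge\;
  \mathbb{E}_{p(\theta, y \mid d)}\big[ T_\psi(\theta, y) \big]
  - \log \mathbb{E}_{p(\theta)\, p(y \mid d)}\big[ e^{T_\psi(\theta, y)} \big]
```

Under this reading, maximising the right-hand side over \psi tightens the bound, while maximising it over d moves the design toward higher mutual information.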

Keyword-based Topic Modeling and Keyword Selection

Title Keyword-based Topic Modeling and Keyword Selection
Authors Xingyu Wang, Lida Zhang, Diego Klabjan
Abstract Certain types of documents, such as tweets, are collected by specifying a set of keywords. As topics of interest change with time, it is beneficial to adjust keywords dynamically. The challenge is that these need to be specified ahead of knowing the forthcoming documents and the underlying topics. The future topics should mimic past topics of interest, yet there should be some novelty in them. We develop a keyword-based topic model that dynamically selects a subset of keywords to be used to collect future documents. The generative process first selects keywords and then the underlying documents based on the specified keywords. The model is trained by using a variational lower bound and stochastic gradient optimization. The inference consists of finding a subset of keywords where, given a subset, the model predicts the underlying topic-word matrix for the unknown forthcoming documents. We compare the keyword topic model against a benchmark model using viral predictions of tweets combined with a topic model. The keyword-based topic model outperforms this sophisticated baseline model by 67%.
Published 2020-01-22
URL https://arxiv.org/abs/2001.07866v1
PDF https://arxiv.org/pdf/2001.07866v1.pdf
PWC https://paperswithcode.com/paper/keyword-based-topic-modeling-and-keyword

Bimodal Distribution Removal and Genetic Algorithm in Neural Network for Breast Cancer Diagnosis

Title Bimodal Distribution Removal and Genetic Algorithm in Neural Network for Breast Cancer Diagnosis
Authors Ke Quan
Abstract Diagnosis of breast cancer has been well studied in the past. Multiple linear programming models have been devised to approximate the relationship between cell features and tumour malignancy. However, these models are less capable of handling non-linear correlations. Neural networks, by contrast, are powerful in processing complex non-linear correlations. It is thus certainly beneficial to approach this cancer diagnosis problem with a model based on neural networks. In particular, introducing bias to the neural network training process is deemed an important means to increase training efficiency. Out of a number of popular proposed methods for introducing artificial bias, Bimodal Distribution Removal (BDR) presents ideal efficiency improvement results and fair simplicity in implementation. However, this paper examines the effectiveness of BDR against the target cancer diagnosis classification problem and shows that the BDR process in fact negatively impacts classification performance. In addition, this paper also explores the genetic algorithm as an efficient tool for feature selection, which produced significantly better results compared to a baseline model without any feature selection in place.
Tasks Feature Selection
Published 2020-02-20
URL https://arxiv.org/abs/2002.08729v1
PDF https://arxiv.org/pdf/2002.08729v1.pdf
PWC https://paperswithcode.com/paper/bimodal-distribution-removal-and-genetic

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Title A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency
Authors Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao
Abstract Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking. In this paper, we develop a first-pass Recurrent Neural Network Transducer (RNN-T) model and a second-pass Listen, Attend, Spell (LAS) rescorer that surpasses a conventional model in both quality and latency. On the quality side, we incorporate a large number of utterances across varied domains to increase acoustic diversity and the vocabulary seen by the model. We also train with accented English speech to make the model more robust to different pronunciations. In addition, given the increased amount of training data, we explore a varied learning rate schedule. On the latency front, we explore using the end-of-sentence decision emitted by the RNN-T model to close the microphone, and also introduce various optimizations to improve the speed of LAS rescoring. Overall, we find that RNN-T+LAS offers a better WER and latency tradeoff compared to a conventional model. For example, for the same latency, RNN-T+LAS obtains an 8% relative improvement in WER, while being more than 400-times smaller in model size.
Published 2020-03-28
URL https://arxiv.org/abs/2003.12710v1
PDF https://arxiv.org/pdf/2003.12710v1.pdf
PWC https://paperswithcode.com/paper/a-streaming-on-device-end-to-end-model
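
The quality metric named in the abstract, word error rate, and the notion of a relative improvement can be made concrete with a short sketch. This uses the standard definition of WER (word-level edit distance divided by the number of reference words); the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
# Hypothetical numbers: going from 10.0% to 9.2% WER is an 8% relative improvement.
gain = relative_improvement(10.0, 9.2)
```

A relative improvement is measured against the baseline WER, so an 8% relative gain is a much smaller change in absolute WER points.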

NAViDAd: A No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder

Title NAViDAd: A No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder
Authors Helard Martinez, M. C. Farias, A. Hines
Abstract The development of models for quality prediction of both audio and video signals is a fairly mature field. But, although several multimodal models have been proposed, the area of audio-visual quality prediction is still an emerging area. In fact, despite the reasonable performance obtained by combination and parametric metrics, currently there is no reliable pixel-based audio-visual quality metric. The approach presented in this work is based on the assumption that autoencoders, fed with descriptive audio and video features, might produce a set of features that is able to describe the complex audio and video interactions. Based on this hypothesis, we propose a No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder (NAViDAd). The model visual features are natural scene statistics (NSS) and spatial-temporal measures of the video component. Meanwhile, the audio features are obtained by computing the spectrogram representation of the audio component. The model is formed by a 2-layer framework that includes a deep autoencoder layer and a classification layer. These two layers are stacked and trained to build the deep neural network model. The model is trained and tested using a large set of stimuli, containing representative audio and video artifacts. The model performed well when tested against the UnB-AV and the LiveNetflix-II databases. Results show that this type of approach produces quality scores that are highly correlated to subjective quality scores.
Published 2020-01-30
URL https://arxiv.org/abs/2001.11406v2
PDF https://arxiv.org/pdf/2001.11406v2.pdf
PWC https://paperswithcode.com/paper/navidad-a-no-reference-audio-visual-quality

RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference

Title RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference
Authors Oindrila Saha, Aditya Kusupati, Harsha Vardhan Simhadri, Manik Varma, Prateek Jain
Abstract Pooling operators are key components in most Convolutional Neural Networks (CNNs) as they serve to downsample images, aggregate feature information, and increase receptive field. However, standard pooling operators reduce the feature size gradually to avoid significant loss in information via gross aggregation. Consequently, CNN architectures tend to be deep, computationally expensive and challenging to deploy on RAM constrained devices. We introduce RNNPool, a novel pooling operator based on Recurrent Neural Networks (RNNs), that efficiently aggregates features over large patches of an image and rapidly downsamples its size. Our empirical evaluation indicates that one or more RNNPool layers can effectively replace multiple blocks in a variety of architectures such as MobileNets (Sandler et al., 2018) and DenseNet (Huang et al., 2017), and can be used for several vision tasks like image classification and face detection. That is, RNNPool can significantly decrease computational complexity and peak RAM usage for inference, while retaining comparable accuracy. Further, we use RNNPool to construct a novel real-time face detection method that achieves state-of-the-art MAP within the computational budget afforded by a tiny Cortex M4 microcontroller with ~256 KB RAM.
Tasks Face Detection, Image Classification
Published 2020-02-27
URL https://arxiv.org/abs/2002.11921v1
PDF https://arxiv.org/pdf/2002.11921v1.pdf
PWC https://paperswithcode.com/paper/rnnpool-efficient-non-linear-pooling-for-ram

Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality

Title Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality
Authors Andrew Lensen, Mengjie Zhang, Bing Xue
Abstract Manifold learning techniques have become increasingly valuable as data continues to grow in size. By discovering a lower-dimensional representation (embedding) of the structure of a dataset, manifold learning algorithms can substantially reduce the dimensionality of a dataset while preserving as much information as possible. However, state-of-the-art manifold learning algorithms are opaque in how they perform this transformation. Understanding the way in which the embedding relates to the original high-dimensional space is critical in exploratory data analysis. We previously proposed a Genetic Programming method that performed manifold learning by evolving mappings that are transparent and interpretable. This method required the dimensionality of the embedding to be known a priori, which makes it hard to use when little is known about a dataset. In this paper, we substantially extend our previous work, by introducing a multi-objective approach that automatically balances the competing objectives of manifold quality and dimensionality. Our proposed approach is competitive with a range of baseline and state-of-the-art manifold learning methods, while also providing a range (front) of solutions that give different trade-offs between quality and dimensionality. Furthermore, the learned models are shown to often be simple and efficient, utilising only a small number of features in an interpretable manner.
Published 2020-01-05
URL https://arxiv.org/abs/2001.01331v1
PDF https://arxiv.org/pdf/2001.01331v1.pdf
PWC https://paperswithcode.com/paper/multi-objective-genetic-programming-for

Parameterized Objectives and Algorithms for Clustering Bipartite Graphs and Hypergraphs

Title Parameterized Objectives and Algorithms for Clustering Bipartite Graphs and Hypergraphs
Authors Nate Veldt, Anthony Wirth, David F. Gleich
Abstract Graph clustering objective functions with tunable resolution parameters make it possible to detect different types of clustering structure in the same graph. These objectives also provide a unifying view of other non-parametric objectives, which often can be captured as special cases. Previous research has largely focused on parametric objectives for standard graphs, in which all nodes are of the same type, and edges model pairwise relationships. In our work, we introduce parameterized objective functions and approximation algorithms specifically for clustering bipartite graphs and hypergraphs, based on correlation clustering. This enables us to develop principled approaches for clustering datasets with different node types (bipartite graphs) or multiway relationships (hypergraphs). Our hypergraph objective is related to higher-order notions of modularity and normalized cut, and is amenable to approximation algorithms via hypergraph expansion techniques. Our bipartite objective generalizes standard bipartite correlation clustering, and in a certain parameter regime is equivalent to bicluster deletion, i.e., removing a minimum number of edges to separate a bipartite graph into disjoint bicliques. The problem in general is NP-hard, but we show that in a certain parameter regime it is equivalent to a bipartite matching problem, meaning that it is polynomial time solvable in this regime. For other regimes, we provide approximation guarantees based on LP-rounding. Our results include the first constant factor approximation algorithm for bicluster deletion. We illustrate the flexibility of our framework in several experiments. This includes clustering a food web and an email network based on higher-order motif structure, detecting clusters of retail products in a product review hypergraph, and evaluating our algorithms across a range of parameter settings on several real world bipartite graphs.
Tasks Graph Clustering
Published 2020-02-21
URL https://arxiv.org/abs/2002.09460v1
PDF https://arxiv.org/pdf/2002.09460v1.pdf
PWC https://paperswithcode.com/paper/parameterized-objectives-and-algorithms-for

Embedding Graph Auto-Encoder with Joint Clustering via Adjacency Sharing

Title Embedding Graph Auto-Encoder with Joint Clustering via Adjacency Sharing
Authors Xuelong Li, Hongyuan Zhang, Rui Zhang
Abstract Graph convolution networks have attracted much attention, and several graph auto-encoder based clustering models have been developed for attributed graph clustering. However, most existing approaches separate clustering and optimization of the graph auto-encoder into two individual steps. In this paper, we propose a graph convolution network based clustering model, namely, Embedding Graph Auto-Encoder with JOint Clustering via Adjacency Sharing (EGAE-JOCAS). As for the embedded model, we develop a novel joint clustering method, which combines relaxed k-means and spectral clustering and is applicable to the learned embedding. The proposed joint clustering shares the same adjacency within graph convolution layers. The two parts are optimized simultaneously by alternately performing SGD and taking closed-form solutions to ensure rapid convergence. Moreover, our model is free to incorporate any mechanisms (e.g., attention) into the graph auto-encoder. Extensive experiments are conducted to prove the superiority of EGAE-JOCAS. Sufficient theoretical analyses are provided to support the results.
Tasks Graph Clustering
Published 2020-02-20
URL https://arxiv.org/abs/2002.08643v1
PDF https://arxiv.org/pdf/2002.08643v1.pdf
PWC https://paperswithcode.com/paper/embedding-graph-auto-encoder-with-joint

Dynamic Federated Learning

Title Dynamic Federated Learning
Authors Elsa Rizk, Stefan Vlaski, Ali H. Sayed
Abstract Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments. While many federated learning architectures process data in an online manner, and are hence adaptive by nature, most performance analyses assume static optimization problems and offer no guarantees in the presence of drifts in the problem solution or data characteristics. We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data. Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm. The results clarify the trade-off between convergence and tracking performance.
Published 2020-02-20
URL https://arxiv.org/abs/2002.08782v1
PDF https://arxiv.org/pdf/2002.08782v1.pdf
PWC https://paperswithcode.com/paper/dynamic-federated-learning
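
The setting described in the abstract (a random subset of agents updating locally at each iteration while the true minimizer follows a random walk) can be mimicked with a toy simulation. The sketch below is not the authors' algorithm or analysis: the least-squares loss, the single local gradient step, the identical data distribution across agents, and every parameter name and default value are assumptions chosen only to keep the example small.

```python
import numpy as np


def simulate_dynamic_fedavg(num_agents=20, sample_size=5, dim=3, steps=500,
                            step_size=0.1, drift_std=0.01, noise_std=0.1, seed=0):
    """Track a drifting minimizer with federated averaging over sampled agents.

    Returns the squared tracking error ||w - w_true||^2 after every iteration.
    """
    rng = np.random.default_rng(seed)
    w_true = np.zeros(dim)   # true minimizer, drifts as a random walk
    w = np.ones(dim)         # server model, deliberately started away from w_true
    errors = []
    for _ in range(steps):
        # Non-stationarity: the minimizer takes one random-walk step.
        w_true = w_true + drift_std * rng.standard_normal(dim)
        # Only a random subset of agents participates in this round.
        active = rng.choice(num_agents, size=sample_size, replace=False)
        local_models = []
        for _agent in active:
            # Each active agent draws one noisy linear measurement of w_true.
            x = rng.standard_normal(dim)
            y = x @ w_true + noise_std * rng.standard_normal()
            grad = (x @ w - y) * x          # gradient of 0.5 * (y - x.w)^2
            local_models.append(w - step_size * grad)
        # The server averages the local models of the participating agents.
        w = np.mean(local_models, axis=0)
        errors.append(float(np.sum((w - w_true) ** 2)))
    return np.array(errors)


errors = simulate_dynamic_fedavg()
```

Rerunning the simulation with different `step_size` and `drift_std` values is one informal way to explore the trade-off between convergence and tracking that the abstract refers to.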

Human Grasp Classification for Reactive Human-to-Robot Handovers

Title Human Grasp Classification for Reactive Human-to-Robot Handovers
Authors Wei Yang, Chris Paxton, Maya Cakmak, Dieter Fox
Abstract Transfer of objects between humans and robots is a critical capability for collaborative robots. Although there has been a recent surge of interest in human-robot handovers, most prior research focuses on robot-to-human handovers. Further, work on the equally critical human-to-robot handovers often assumes humans can place the object in the robot’s gripper. In this paper, we propose an approach for human-to-robot handovers in which the robot meets the human halfway, by classifying the human’s grasp of the object and quickly planning a trajectory accordingly to take the object from the human’s hand according to their intent. To do this, we collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses, and learn a deep model on this dataset to classify the hand grasps into one of these categories. We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position, and replans as necessary when the handover is interrupted. Through a systematic evaluation, we demonstrate that our system results in more fluent handovers versus two baselines. We also present findings from a user study (N = 9) demonstrating the effectiveness and usability of our approach with naive users in different scenarios. More results and videos can be found at http://wyang.me/handovers.
Published 2020-03-12
URL https://arxiv.org/abs/2003.06000v1
PDF https://arxiv.org/pdf/2003.06000v1.pdf
PWC https://paperswithcode.com/paper/human-grasp-classification-for-reactive-human

fastai: A Layered API for Deep Learning

Title fastai: A Layered API for Deep Learning
Authors Jeremy Howard, Sylvain Gugger
Abstract fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes: a new type dispatch system for Python along with a semantic type hierarchy for tensors; a GPU-optimized computer vision library which can be extended in pure Python; an optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4-5 lines of code; a novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training; a new data block API; and much more. We have used this library to successfully create a complete deep learning course, which we were able to write more quickly than using previous approaches, and the code was more clear. The library is already in wide use in research, industry, and teaching. NB: This paper covers fastai v2, which is currently in pre-release at http://dev.fast.ai/
Published 2020-02-11
URL https://arxiv.org/abs/2002.04688v2
PDF https://arxiv.org/pdf/2002.04688v2.pdf
PWC https://paperswithcode.com/paper/fastai-a-layered-api-for-deep-learning

Cooperative Highway Work Zone Merge Control based on Reinforcement Learning in A Connected and Automated Environment

Title Cooperative Highway Work Zone Merge Control based on Reinforcement Learning in A Connected and Automated Environment
Authors Tianzhu Ren, Yuanchang Xie, Liming Jiang
Abstract Given the aging infrastructure and the anticipated growing number of highway work zones in the United States, it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed method assumes that all vehicles are fully automated, connected and cooperative. It inserts two metering zones in the open lane to make space for merging vehicles in the closed lane. In addition, each vehicle in the closed lane learns how to optimally adjust its longitudinal position to find a safe gap in the open lane using an off-policy soft actor critic (SAC) reinforcement learning (RL) algorithm, considering the traffic conditions in its surroundings. The learning results are captured in convolutional neural networks and used to control individual vehicles in the testing phase. By adding the metering zones and taking the locations, speeds, and accelerations of surrounding vehicles into account, cooperation among vehicles is implicitly considered. This RL-based model is trained and evaluated using a microscopic traffic simulator. The results show that this cooperative RL-based merge control significantly outperforms popular strategies such as late merge and early merge in terms of both mobility and safety measures.
Published 2020-01-21
URL https://arxiv.org/abs/2001.08581v1
PDF https://arxiv.org/pdf/2001.08581v1.pdf
PWC https://paperswithcode.com/paper/cooperative-highway-work-zone-merge-control

Dropout: Explicit Forms and Capacity Control

Title Dropout: Explicit Forms and Capacity Control
Authors Raman Arora, Peter Bartlett, Poorya Mianjy, Nathan Srebro
Abstract We investigate the capacity control provided by dropout in various machine learning problems. First, we study dropout for matrix completion, where it induces a data-dependent regularizer that, in expectation, equals the weighted trace-norm of the product of the factors. In deep learning, we show that the data-dependent regularizer due to dropout directly controls the Rademacher complexity of the underlying class of deep neural networks. These developments enable us to give concrete generalization error bounds for the dropout algorithm in both matrix completion as well as training deep neural networks. We evaluate our theoretical findings on real-world datasets, including MovieLens, MNIST, and Fashion-MNIST.
Tasks Matrix Completion
Published 2020-03-06
URL https://arxiv.org/abs/2003.03397v1
PDF https://arxiv.org/pdf/2003.03397v1.pdf
PWC https://paperswithcode.com/paper/dropout-explicit-forms-and-capacity-control-1
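
The idea that dropout induces an explicit regularizer can be checked numerically in a simplified case. The sketch below assumes a fully observed matrix factorization Y ≈ U Vᵀ with Bernoulli dropout on the shared latent dimension, which is simpler than the matrix completion setting of the paper; the function names and `keep_prob` are hypothetical. In this case the expected dropout loss equals the plain squared loss plus (1 - keep_prob) / keep_prob times the sum over latent columns of ‖u_i‖² ‖v_i‖².

```python
import itertools

import numpy as np


def expected_dropout_loss(Y, U, V, keep_prob):
    """Exact expectation of ||Y - U diag(b / keep_prob) V^T||_F^2 over all masks b."""
    rank = U.shape[1]
    total = 0.0
    # Enumerate every 0/1 mask over the latent dimensions (feasible for small rank).
    for mask in itertools.product([0, 1], repeat=rank):
        b = np.array(mask, dtype=float)
        prob = np.prod(np.where(b == 1, keep_prob, 1.0 - keep_prob))
        approx = (U * (b / keep_prob)) @ V.T   # rescale kept columns, drop the rest
        total += prob * np.sum((Y - approx) ** 2)
    return total


def explicit_regularized_loss(Y, U, V, keep_prob):
    """Squared loss plus the product-of-column-norms penalty induced by dropout."""
    fit = np.sum((Y - U @ V.T) ** 2)
    penalty = np.sum(np.sum(U ** 2, axis=0) * np.sum(V ** 2, axis=0))
    return fit + (1.0 - keep_prob) / keep_prob * penalty


rng = np.random.default_rng(0)
Y = rng.standard_normal((4, 5))
U = rng.standard_normal((4, 3))
V = rng.standard_normal((5, 3))
lhs = expected_dropout_loss(Y, U, V, keep_prob=0.7)
rhs = explicit_regularized_loss(Y, U, V, keep_prob=0.7)
```

The two quantities agree because the rescaled masks have mean 1 and are independent across latent dimensions, so only the per-column variance terms survive in the expectation.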