January 27, 2020

2947 words 14 mins read

Paper Group ANR 1186

Copula & Marginal Flows: Disentangling the Marginal from its Joint. Seeing Behind Things: Extending Semantic Segmentation to Occluded Regions. Queueing Analysis of GPU-Based Inference Servers with Dynamic Batching: A Closed-Form Characterization. Experimental Exploration of Compact Convolutional Neural Network Architectures for Non-temporal Real-ti …

Copula & Marginal Flows: Disentangling the Marginal from its Joint

Title Copula & Marginal Flows: Disentangling the Marginal from its Joint
Authors Magnus Wiese, Robert Knobloch, Ralf Korn
Abstract Deep generative networks such as GANs and normalizing flows flourish in the context of high-dimensional tasks such as image generation. However, exact modeling or extrapolation of distributional properties such as the tail asymptotics generated by a generative network has so far not been available. In this paper, we address this issue for the first time in the deep learning literature by making two novel contributions. First, we derive upper bounds for the tails that can be expressed by a generative network and demonstrate related Lp-space properties; in particular, we show that in various situations an optimal generative network does not exist. Second, we introduce copula and marginal generative flows (CM flows), which allow exact modeling of the tail and of any prior assumption on the CDF, up to an approximation of the uniform distribution. Our numerical results support the use of CM flows.
Tasks Image Generation
Published 2019-07-07
URL https://arxiv.org/abs/1907.03361v1
PDF https://arxiv.org/pdf/1907.03361v1.pdf
PWC https://paperswithcode.com/paper/copula-marginal-flows-disentangling-the
Repo
Framework
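The copula-and-marginal factorization behind CM flows can be sketched as follows: draw a sample from a copula on the unit cube, then push each coordinate through an inverse marginal CDF, so tail behaviour is controlled exactly by the marginal. This is a toy illustration, not the authors' implementation; the Gaussian copula stands in for the learned flow and the Pareto marginal is an assumed heavy-tailed choice.

```python
import math
import random

def gaussian_copula_sample(rho, rng):
    """Draw (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)

    def phi(z):  # standard normal CDF
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return phi(z1), phi(z2)

def pareto_inverse_cdf(u, alpha=2.0):
    """Inverse CDF of a Pareto(alpha) marginal: heavy tail by construction."""
    return (1.0 - u) ** (-1.0 / alpha)

rng = random.Random(0)
samples = []
for _ in range(10000):
    u1, u2 = gaussian_copula_sample(0.7, rng)
    samples.append((pareto_inverse_cdf(u1), pareto_inverse_cdf(u2)))

# Every coordinate lies in the Pareto support, which starts at 1.
print(min(min(x, y) for x, y in samples) >= 1.0)
```

The point of the factorization is visible here: the dependence structure (the copula) and the tails (the marginals) are chosen independently of each other.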

Seeing Behind Things: Extending Semantic Segmentation to Occluded Regions

Title Seeing Behind Things: Extending Semantic Segmentation to Occluded Regions
Authors Pulak Purkait, Christopher Zach, Ian Reid
Abstract Semantic segmentation and instance-level segmentation have made substantial progress in recent years due to the emergence of deep neural networks (DNNs). A number of deep architectures based on Convolutional Neural Networks (CNNs) have been proposed that surpass traditional machine learning approaches for segmentation by a large margin. These architectures predict the directly observable semantic category of each pixel, usually by optimizing a cross-entropy loss. In this work we push the limits of semantic segmentation towards predicting semantic labels of directly visible as well as occluded objects or object parts, where the network’s input is a single depth image. We group the semantic categories into one background and multiple foreground object groups, and we propose a modification of the standard cross-entropy loss to cope with this setting. In our experiments we demonstrate that a CNN trained by minimizing the proposed loss is able to predict semantic categories for visible and occluded object parts without requiring an increase in network size (compared to a standard segmentation task). The results are validated on a newly generated dataset (augmented from SUNCG).
Tasks Semantic Segmentation
Published 2019-06-07
URL https://arxiv.org/abs/1906.02885v2
PDF https://arxiv.org/pdf/1906.02885v2.pdf
PWC https://paperswithcode.com/paper/seeing-behind-things-extending-semantic
Repo
Framework
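An illustrative sketch of the kind of loss modification this setting calls for (not the authors' exact formulation): once classes are grouped into background plus K foreground groups, a pixel occluded by another object can carry several group labels at once, so the usual softmax cross-entropy is replaced by an independent per-group binary cross-entropy over sigmoid outputs.

```python
import math

def grouped_bce(logits, targets):
    """Mean binary cross-entropy over pixels and groups.

    logits  -- list of pixels, each a list of K raw scores
    targets -- matching list of 0/1 multi-hot group labels
    """
    total, count = 0.0, 0
    for scores, labels in zip(logits, targets):
        for s, t in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-s))  # sigmoid
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count

# Two pixels, three groups; the second pixel is a visible part of group 1
# but also lies in front of an occluded part of group 2, hence two
# positive labels for one pixel.
logits = [[4.0, -4.0, -4.0], [-4.0, 4.0, 4.0]]
targets = [[1, 0, 0], [0, 1, 1]]
loss = grouped_bce(logits, targets)
print(round(loss, 4))
```

Because the groups are scored independently, confident and correct logits on every group drive the loss toward zero even when a pixel has more than one positive label.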

Queueing Analysis of GPU-Based Inference Servers with Dynamic Batching: A Closed-Form Characterization

Title Queueing Analysis of GPU-Based Inference Servers with Dynamic Batching: A Closed-Form Characterization
Authors Yoshiaki Inoue
Abstract GPU-accelerated computing is a key technology to realize high-speed inference servers using deep neural networks (DNNs). An important characteristic of GPU-based inference is that the computational efficiency, in terms of the processing speed and energy consumption, drastically increases by processing multiple jobs together in a batch. In this paper, we formulate GPU-based inference servers as a batch service queueing model with batch-size dependent processing times. We first show that the energy efficiency of the server monotonically increases with the arrival rate of inference jobs, which suggests that it is energy-efficient to operate the inference server under a utilization level as high as possible within a latency requirement of inference jobs. We then derive a closed-form upper bound for the mean latency, which provides a simple characterization of the latency performance. Through simulation and numerical experiments, we show that the exact value of the mean latency is well approximated by this upper bound.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.06322v1
PDF https://arxiv.org/pdf/1912.06322v1.pdf
PWC https://paperswithcode.com/paper/queueing-analysis-of-gpu-based-inference
Repo
Framework
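A toy discrete-event sketch of the batch-service queue studied in the paper: jobs arrive as a Poisson stream, and whenever the server is free it serves up to `max_batch` queued jobs together, with a service time that grows only mildly in the batch size (the source of the efficiency gain). The rates and the linear service-time model below are illustrative assumptions, not the paper's parameters.

```python
import random

def simulate(arrival_rate, max_batch, n_jobs, seed=0):
    """Return the mean latency of n_jobs jobs through a batch-service queue."""
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    for _ in range(n_jobs):
        t += rng.expovariate(arrival_rate)
        arrivals.append(t)

    queue, i, clock = [], 0, 0.0
    latencies = []
    while len(latencies) < n_jobs:
        # admit everything that has arrived by the current clock
        while i < n_jobs and arrivals[i] <= clock:
            queue.append(arrivals[i])
            i += 1
        if not queue:
            clock = arrivals[i]  # server idles until the next arrival
            continue
        batch = queue[:max_batch]
        del queue[:max_batch]
        service = 0.05 + 0.01 * len(batch)  # batch-size dependent time
        clock += service
        latencies.extend(clock - a for a in batch)
    return sum(latencies) / n_jobs

single = simulate(arrival_rate=20.0, max_batch=1, n_jobs=5000)
batched = simulate(arrival_rate=20.0, max_batch=16, n_jobs=5000)
print(batched < single)
```

With these illustrative numbers the one-at-a-time server cannot keep up with the arrival rate, while the batched server is stable with far lower mean latency, mirroring the paper's observation that batching improves efficiency at high utilization.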

Experimental Exploration of Compact Convolutional Neural Network Architectures for Non-temporal Real-time Fire Detection

Title Experimental Exploration of Compact Convolutional Neural Network Architectures for Non-temporal Real-time Fire Detection
Authors Ganesh Samarth C. A., Neelanjan Bhowmik, Toby P. Breckon
Abstract In this work we explore different Convolutional Neural Network (CNN) architectures and their variants for non-temporal binary fire detection and localization in video or still imagery. We consider the performance of experimentally defined, reduced complexity deep CNN architectures for this task and evaluate the effects of different optimization and normalization techniques applied to different CNN architectures (spanning the Inception, ResNet and EfficientNet architectural concepts). Contrary to contemporary trends in the field, our work illustrates a maximum overall accuracy of 0.96 for full frame binary fire detection and 0.94 for superpixel localization using an experimentally defined reduced CNN architecture based on the concept of InceptionV4. We notably achieve a lower false positive rate of 0.06 compared to prior work in the field, presenting an efficient, robust, real-time solution for fire region detection.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.09010v1
PDF https://arxiv.org/pdf/1911.09010v1.pdf
PWC https://paperswithcode.com/paper/experimental-exploration-of-compact
Repo
Framework

A Note on Our Submission to Track 4 of iDASH 2019

Title A Note on Our Submission to Track 4 of iDASH 2019
Authors Marcel Keller, Ke Sun
Abstract iDASH is a competition soliciting implementations of cryptographic schemes of interest in the context of biology. In 2019, one track asked for multi-party computation implementations of training of a machine learning model suitable for two datasets from cancer research. In this note, we describe our solution submitted to the competition. We found that the training can be run on three AWS c5.9xlarge instances in less than one minute using MPC tolerating one semi-honest corruption, and in less than ten seconds at a slightly lower accuracy.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.11680v1
PDF https://arxiv.org/pdf/1910.11680v1.pdf
PWC https://paperswithcode.com/paper/a-note-on-our-submission-to-track-4-of-idash
Repo
Framework

Recommendations for Datasets for Source Code Summarization

Title Recommendations for Datasets for Source Code Summarization
Authors Alexander LeClair, Collin McMillan
Abstract Source code summarization is the task of writing short, natural language descriptions of source code. The main use for these descriptions is in software documentation, e.g. the one-sentence Java method descriptions in JavaDocs. Code summarization is rapidly becoming a popular research problem, but progress is restrained by a lack of suitable datasets. In addition, a lack of community standards for creating datasets leads to confusing and unreproducible research results: we observe swings in performance of more than 33% due only to changes in dataset design. In this paper, we make recommendations for these standards based on experimental results. We release a dataset based on prior work, containing over 2.1m pairs of Java methods and one-sentence method descriptions from over 28k Java projects. We describe the dataset and point out key differences from natural language data, to guide and support future researchers.
Tasks Code Summarization
Published 2019-04-04
URL http://arxiv.org/abs/1904.02660v1
PDF http://arxiv.org/pdf/1904.02660v1.pdf
PWC https://paperswithcode.com/paper/recommendations-for-datasets-for-source-code
Repo
Framework
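One of the dataset-design pitfalls the paper's recommendations address is leakage from near-duplicate methods within a project, which argues for splitting by project rather than by function. A minimal sketch of such a split (the project names and the hash-based assignment are illustrative, not from the paper):

```python
import hashlib

def split_by_project(pairs, test_fraction=0.2):
    """pairs: (project, method, summary) tuples -> (train, test) lists.

    Hashing the project name makes the assignment deterministic, so every
    method from a given project lands on the same side of the split.
    """
    train, test = [], []
    for project, method, summary in pairs:
        bucket = int(hashlib.md5(project.encode()).hexdigest(), 16) % 100
        (test if bucket < test_fraction * 100 else train).append((method, summary))
    return train, test

pairs = [
    ("proj-a", "int add(int a,int b)", "adds two numbers"),
    ("proj-a", "int sub(int a,int b)", "subtracts two numbers"),
    ("proj-b", "void log(String s)", "writes a message"),
]
train, test = split_by_project(pairs)
print(len(train) + len(test) == len(pairs))
```

A function-level random split would let the two nearly identical `proj-a` methods straddle train and test; the project-level split above rules that out by construction.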

SynSin: End-to-end View Synthesis from a Single Image

Title SynSin: End-to-end View Synthesis from a Single Image
Authors Olivia Wiles, Georgia Gkioxari, Richard Szeliski, Justin Johnson
Abstract Single image view synthesis allows for the generation of new views of a scene given a single input image. This is challenging, as it requires comprehensively understanding the 3D scene from a single image. As a result, current methods typically use multiple images, train on ground-truth depth, or are limited to synthetic data. We propose a novel end-to-end model for this task; it is trained on real images without any ground-truth 3D information. To this end, we introduce a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view. The projected features are decoded by our refinement network to inpaint missing regions and generate a realistic output image. The 3D component inside of our generative model allows for interpretable manipulation of the latent feature space at test time, e.g. we can animate trajectories from a single image. Unlike prior work, we can generate high resolution images and generalise to other input resolutions. We outperform baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08804v1
PDF https://arxiv.org/pdf/1912.08804v1.pdf
PWC https://paperswithcode.com/paper/synsin-end-to-end-view-synthesis-from-a
Repo
Framework

Large-scale 3D point cloud representations via graph inception networks with applications to autonomous driving

Title Large-scale 3D point cloud representations via graph inception networks with applications to autonomous driving
Authors Siheng Chen, Sufeng Niu, Tian Lan, Baoan Liu
Abstract We present a novel graph-neural-network-based system to effectively represent large-scale 3D point clouds, with applications to autonomous driving. Many previous works studied the representation of 3D point clouds based on two approaches: voxelization, which causes discretization errors, and learning, which struggles to capture the huge variation in large-scale scenarios. In this work, we combine voxelization and learning: we discretize the 3D space into voxels and propose novel graph inception networks to represent the 3D points in each voxel. This combination lets the system avoid discretization errors and work in large-scale scenarios. The entire system for large-scale 3D point clouds acts like the blocked discrete cosine transform for 2D images; we thus call it the point cloud neural transform (PCT). We further apply the proposed PCT to represent real-time LiDAR sweeps produced by self-driving cars, and the PCT with graph inception networks significantly outperforms its competitors.
Tasks Autonomous Driving, Self-Driving Cars
Published 2019-06-26
URL https://arxiv.org/abs/1906.11359v1
PDF https://arxiv.org/pdf/1906.11359v1.pdf
PWC https://paperswithcode.com/paper/large-scale-3d-point-cloud-representations
Repo
Framework

Restoration of marker occluded hematoxylin and eosin stained whole slide histology images using generative adversarial networks

Title Restoration of marker occluded hematoxylin and eosin stained whole slide histology images using generative adversarial networks
Authors Bairavi Venkatesh, Tosha Shah, Antong Chen, Soheil Ghafurian
Abstract It is common for pathologists to annotate specific regions of the tissue, such as tumor, directly on the glass slide with markers. Although this practice was helpful prior to the advent of histology whole slide digitization, it often occludes important details which are increasingly relevant to immuno-oncology due to recent advancements in digital pathology imaging techniques. The current work uses a generative adversarial network with cycle loss to remove these annotations while still maintaining the underlying structure of the tissue by solving an image-to-image translation problem. We train our network on up to 300 whole slide images with marker inks and show that 70% of the corrected image patches are indistinguishable from originally uncontaminated tissue images to a human expert. This portion increases to 97% when we replace the human expert with a deep residual network. We demonstrate the fidelity of the method to the original image by calculating the correlation between image gradient magnitudes. We observed a revival of up to 94,000 nuclei per slide in our dataset, the majority of which were located on the tissue border.
Tasks Image-to-Image Translation
Published 2019-10-14
URL https://arxiv.org/abs/1910.06428v1
PDF https://arxiv.org/pdf/1910.06428v1.pdf
PWC https://paperswithcode.com/paper/restoration-of-marker-occluded-hematoxylin
Repo
Framework
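The cycle loss mentioned in the abstract enforces that translating a marked image to a clean one and back reproduces the input. A schematic of that cycle term, with toy "generators" standing in for the CNNs a real CycleGAN-style model would learn (the constant offset is purely illustrative):

```python
def G_clean(x):
    """Marked -> clean (placeholder: subtract a fixed marker offset)."""
    return [v - 0.3 for v in x]

def G_marked(x):
    """Clean -> marked (placeholder: add the offset back)."""
    return [v + 0.3 for v in x]

def cycle_loss(x):
    """L1 distance between x and G_marked(G_clean(x))."""
    recon = G_marked(G_clean(x))
    return sum(abs(a - b) for a, b in zip(x, recon)) / len(x)

patch = [0.2, 0.5, 0.9, 0.4]
print(cycle_loss(patch))  # near zero for a perfect inverse pair
```

In training, this term is what discourages the clean-image generator from altering tissue structure: any change not undone by the reverse generator is penalized.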

Robust Automated Thalamic Nuclei Segmentation using a Multi-planar Cascaded Convolutional Neural Network

Title Robust Automated Thalamic Nuclei Segmentation using a Multi-planar Cascaded Convolutional Neural Network
Authors Mohammad S Majdi, Mahesh B Keerthivasan, Brian K Rutt, Natalie M Zahr, Jeffrey J Rodriguez, Manojkumar Saranathan
Abstract Purpose: To develop a fast, accurate, and robust convolutional neural network (CNN) based method for segmentation of thalamic nuclei. Methods: A cascaded multi-planar scheme with a modified residual U-Net architecture was used to segment thalamic nuclei on clinical datasets acquired using the white-matter-nulled Magnetization Prepared Rapid Gradient Echo (MPRAGE) sequence. A single network was optimized for healthy controls and disease types (multiple sclerosis, essential tremor) and magnetic field strengths (3T and 7T). Another network was developed to use conventional MPRAGE data. Clinical utility was assessed by comparing a cohort of MS patients to healthy subjects. Results: Segmentation of each thalamus into 12 nuclei was achieved in under 4 minutes. For 7T WMn-MPRAGE, the proposed method outperformed current state-of-the-art with statistically significant improvements in Dice ranging from 1.2% to 5.3% for MS and from 2.6% to 38.8% for ET patients. Comparable accuracy (Dice/VSI) was achieved between 7T and 3T data, attesting to the robustness of the method. For conventional MPRAGE, Dice of > 0.7 was achieved for larger nuclei and > 0.6 for the smaller nuclei. Atrophy of five thalamic nuclei and the whole thalamus was observed for MS patients compared to healthy control subjects, after controlling for intracranial volume and age (p<0.004). Conclusion: The proposed segmentation method is fast, accurate, and generalizes across disease types and field strengths and shows great potential for improving our understanding of thalamic nuclei involvement in neurological diseases and healthy aging. KEYWORDS Deep learning, convolutional neural network, transfer learning, thalamic nuclei segmentation
Tasks Transfer Learning
Published 2019-12-16
URL https://arxiv.org/abs/1912.07209v1
PDF https://arxiv.org/pdf/1912.07209v1.pdf
PWC https://paperswithcode.com/paper/robust-automated-thalamic-nuclei-segmentation
Repo
Framework

Deep Learning in the Automotive Industry: Recent Advances and Application Examples

Title Deep Learning in the Automotive Industry: Recent Advances and Application Examples
Authors Kanwar Bharat Singh, Mustafa Ali Arat
Abstract One of the most exciting technology breakthroughs in the last few years has been the rise of deep learning. State-of-the-art deep learning models are being widely deployed in academia and industry, across a variety of areas, from image analysis to natural language processing. These models have grown from fledgling research subjects to mature techniques in real-world use. The increasing scale of data, computational power and the associated algorithmic innovations are the main drivers for the progress we see in this field. These developments also have a huge potential for the automotive industry and therefore the interest in deep learning-based technology is growing. A lot of the product innovations, such as self-driving cars, parking and lane-change assist or safety functions, such as autonomous emergency braking, are powered by deep learning algorithms. Deep learning is poised to offer gains in performance and functionality for most ADAS (Advanced Driver Assistance System) solutions. Virtual sensing for vehicle dynamics applications, vehicle inspection/health monitoring, automated driving and data-driven product development are key areas that are expected to get the most attention. This article provides an overview of the recent advances and some associated challenges in deep learning techniques in the context of automotive applications.
Tasks Self-Driving Cars
Published 2019-06-20
URL https://arxiv.org/abs/1906.08834v2
PDF https://arxiv.org/pdf/1906.08834v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-in-the-automotive-industry-1
Repo
Framework

LioNets: Local Interpretation of Neural Networks through Penultimate Layer Decoding

Title LioNets: Local Interpretation of Neural Networks through Penultimate Layer Decoding
Authors Ioannis Mollas, Nikolaos Bassiliades, Grigorios Tsoumakas
Abstract Technological breakthroughs in smart homes, self-driving cars, health care and robotic assistants, in addition to reinforced law regulations, have critically influenced academic research on explainable machine learning. Many researchers have implemented ways to explain any black-box model for classification tasks, regardless of its internals. A drawback of building such agnostic explainers is that the neighbourhood generation process is universal and consequently does not guarantee true adjacency between the generated neighbours and the instance. This paper explores a methodology for providing local explanations of a neural network’s decisions through a process that actively takes the network’s architecture into account when creating an instance’s neighbourhood, thereby assuring adjacency between the generated neighbours and the instance.
Tasks Self-Driving Cars
Published 2019-06-15
URL https://arxiv.org/abs/1906.06566v3
PDF https://arxiv.org/pdf/1906.06566v3.pdf
PWC https://paperswithcode.com/paper/lionets-local-interpretation-of-neural
Repo
Framework
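The core idea of LioNets can be illustrated as follows: instead of perturbing an instance in input space, perturb its penultimate-layer encoding and decode the perturbations back, so generated neighbours stay close to the network's own representation of the instance. The encoder/decoder below are toy linear stand-ins (an assumption for illustration) for the trained network halves.

```python
import random

def encode(x):
    """Input -> penultimate representation (toy invertible map)."""
    return [x[0] + x[1], x[0] - x[1]]

def decode(z):
    """Penultimate representation -> input (toy inverse of encode)."""
    return [(z[0] + z[1]) / 2.0, (z[0] - z[1]) / 2.0]

def neighbourhood(x, n=100, scale=0.05, seed=0):
    """Generate n neighbours by perturbing x's latent code and decoding."""
    rng = random.Random(seed)
    z = encode(x)
    out = []
    for _ in range(n):
        z_pert = [v + rng.uniform(-scale, scale) for v in z]
        out.append(decode(z_pert))
    return out

x = [1.0, 2.0]
neighbours = neighbourhood(x)
# Small latent perturbations decode to points close to x in input space.
max_dist = max(max(abs(a - b) for a, b in zip(x, nb)) for nb in neighbours)
print(max_dist < 0.1)
```

A local surrogate explainer would then be fitted on these neighbours and the network's predictions for them, as in LIME-style pipelines, but with adjacency to the instance assured by construction.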

Lidar based Detection and Classification of Pedestrians and Vehicles Using Machine Learning Methods

Title Lidar based Detection and Classification of Pedestrians and Vehicles Using Machine Learning Methods
Authors Farzad Shafiei Dizaji
Abstract The goal of this paper is to classify objects mapped by a LiDAR sensor into classes such as vehicles, pedestrians and bikers. Utilizing a LiDAR-based object detector and a neural-network classifier, we present a novel real-time object detection system intended to help self-driving vehicles recognize and classify objects encountered in the course of driving and proceed accordingly. We discuss our work using machine learning methods to tackle a common high-level problem in machine learning applications for self-driving cars: the classification of point cloud data obtained from a 3D LiDAR sensor.
Tasks Object Detection, Real-Time Object Detection, Self-Driving Cars
Published 2019-06-12
URL https://arxiv.org/abs/1906.11899v1
PDF https://arxiv.org/pdf/1906.11899v1.pdf
PWC https://paperswithcode.com/paper/lidar-based-detection-and-classification-of
Repo
Framework
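A minimal sketch of the detect-then-classify pipeline described above (the features, class centroids and nearest-centroid rule are illustrative assumptions; the paper uses a neural-network classifier): point clusters returned by a LiDAR detector are summarized by simple geometric features and assigned to the closest class prototype.

```python
def features(cluster):
    """cluster: list of (x, y, z) points -> (height, footprint) features."""
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    zs = [p[2] for p in cluster]
    return (max(zs) - min(zs), (max(xs) - min(xs)) * (max(ys) - min(ys)))

CENTROIDS = {  # hand-picked illustrative class prototypes (metres, m^2)
    "pedestrian": (1.7, 0.25),
    "vehicle": (1.5, 8.0),
}

def classify(cluster):
    """Assign the cluster to the nearest class prototype in feature space."""
    f = features(cluster)
    return min(CENTROIDS,
               key=lambda c: sum((a - b) ** 2
                                 for a, b in zip(f, CENTROIDS[c])))

ped = [(0.0, 0.0, 0.0), (0.3, 0.4, 1.7), (0.1, 0.2, 0.9)]
car = [(0.0, 0.0, 0.0), (4.0, 1.8, 1.4), (2.0, 0.9, 0.7)]
print(classify(ped), classify(car))
```

A learned classifier replaces the hand-picked centroids in practice, but the interface is the same: per-cluster features in, class label out.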

Key Ingredients of Self-Driving Cars

Title Key Ingredients of Self-Driving Cars
Authors Rui Fan, Jianhao Jiao, Haoyang Ye, Yang Yu, Ioannis Pitas, Ming Liu
Abstract Over the past decade, many research articles have been published in the area of autonomous driving. However, most of them focus only on a specific technological area, such as visual environment perception, vehicle control, etc. Furthermore, due to fast advances in the self-driving car technology, such articles become obsolete very fast. In this paper, we give a brief but comprehensive overview on key ingredients of autonomous cars (ACs), including driving automation levels, AC sensors, AC software, open source datasets, industry leaders, AC applications and existing challenges.
Tasks Autonomous Driving, Self-Driving Cars
Published 2019-06-07
URL https://arxiv.org/abs/1906.02939v2
PDF https://arxiv.org/pdf/1906.02939v2.pdf
PWC https://paperswithcode.com/paper/key-ingredients-of-self-driving-cars
Repo
Framework

Y-GAN: A Generative Adversarial Network for Depthmap Estimation from Multi-camera Stereo Images

Title Y-GAN: A Generative Adversarial Network for Depthmap Estimation from Multi-camera Stereo Images
Authors Miguel Alonso Jr
Abstract Depth perception is a key component for autonomous systems that interact with the real world, such as delivery robots, warehouse robots, and self-driving cars. Tasks in autonomous robotics such as 3D object recognition, simultaneous localization and mapping (SLAM), path planning and navigation require some form of 3D spatial information. Depth perception is a long-standing research problem in computer vision and robotics. Many deep learning approaches, ranging from structure from motion and shape-from-X to monocular, binocular, and multi-view stereo, have yielded acceptable results. However, these methods have several shortcomings, such as requiring expensive hardware, needing supervised training data, lacking ground truth for comparison, and disregarding occlusion. To address these shortcomings, this work proposes a new deep convolutional generative adversarial network architecture, called Y-GAN, that uses data from three cameras to estimate a depth map for each frame in a multi-camera video stream.
Tasks 3D Object Recognition, Object Recognition, Self-Driving Cars, Simultaneous Localization and Mapping
Published 2019-06-03
URL https://arxiv.org/abs/1906.00932v1
PDF https://arxiv.org/pdf/1906.00932v1.pdf
PWC https://paperswithcode.com/paper/190600932
Repo
Framework