Paper Group ANR 1312
Quick, Stat!: A Statistical Analysis of the Quick, Draw! Dataset
Title | Quick, Stat!: A Statistical Analysis of the Quick, Draw! Dataset |
Authors | Raul Fernandez-Fernandez, Juan G. Victores, David Estevez, Carlos Balaguer |
Abstract | The Quick, Draw! Dataset is a Google dataset containing 50 million drawings, divided into 345 categories, collected from users of the game Quick, Draw!. In contrast with most existing image datasets, drawings in the Quick, Draw! Dataset are stored as time series of pencil positions instead of as bitmap matrices composed of pixels. This makes it the largest doodle dataset available to date. The Quick, Draw! Dataset thus presents a great opportunity for researchers to develop and study machine learning techniques. Due to the size of this dataset and the nature of its source, there is little information about the quality of the drawings it contains. In this paper, a statistical analysis of three of the classes contained in the Quick, Draw! Dataset is presented: mountain, book and whale. The goal is to give the reader a first impression of the data collected in this dataset. To analyze the quality of the drawings, a classification neural network was trained to obtain a classification score. Using this classification score and the parameters provided by the dataset, a statistical analysis of the quality and nature of the drawings contained in this dataset is provided. |
Tasks | Time Series |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06417v2 |
https://arxiv.org/pdf/1907.06417v2.pdf | |
PWC | https://paperswithcode.com/paper/quick-stat-a-statistical-analysis-of-the |
Repo | |
Framework | |
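To make the stroke-based format concrete, here is a minimal Python sketch that rasterizes one drawing from the dataset's simplified `.ndjson` export, where each record's `drawing` field is a list of strokes given as `[x_coords, y_coords]` pairs in a 0-255 coordinate space (a property of the public release, not something established in the paper):

```python
import json
from PIL import Image, ImageDraw

def rasterize(drawing, size=256, line_width=3):
    """Render a Quick, Draw! stroke list into a grayscale bitmap.
    Each stroke is a pair [x_coords, y_coords] in 0-255 coordinates."""
    img = Image.new("L", (256, 256), color=255)
    draw = ImageDraw.Draw(img)
    for xs, ys in drawing:
        draw.line(list(zip(xs, ys)), fill=0, width=line_width)
    return img.resize((size, size))

# Usage: read one record from a simplified .ndjson file and rasterize it.
with open("whale.ndjson") as f:
    record = json.loads(f.readline())
rasterize(record["drawing"]).save("whale_sample.png")
```

A rasterizer along these lines is the natural preprocessing step before feeding drawings to a classification network such as the one used for the paper's quality score.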
Deep Learning for Visual Recognition of Environmental Enteropathy and Celiac Disease
Title | Deep Learning for Visual Recognition of Environmental Enteropathy and Celiac Disease |
Authors | Aman Shrivastava, Karan Kant, Saurav Sengupta, Sung-Jun Kang, Marium Khan, Asad Ali, Sean R. Moore, Beatrice C. Amadi, Paul Kelly, Donald E. Brown, Sana Syed |
Abstract | Physicians use biopsies to distinguish between different but histologically similar enteropathies. The range of syndromes and pathologies that can cause different gastrointestinal conditions makes this a difficult problem. Recently, deep learning has been used successfully to help diagnose cancerous tissues in histopathological images. These successes motivated the research presented in this paper, which describes a deep learning approach that distinguishes among Celiac Disease (CD), Environmental Enteropathy (EE), and normal tissue in digitized duodenal biopsies. Experimental results show accuracies of over 90% for this approach. We also interpret the neural network model using Gradient-weighted Class Activation Mappings and filter activations on input images to understand the visual explanations for the decisions made by the model. |
Tasks | |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03272v1 |
https://arxiv.org/pdf/1908.03272v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-visual-recognition-of |
Repo | |
Framework | |
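The interpretability step the abstract mentions is Gradient-weighted Class Activation Mapping (Grad-CAM). Below is a minimal, self-contained Grad-CAM sketch on a stock torchvision ResNet-50; the authors' actual network, preprocessing, and target layer are assumptions here:

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Hedged sketch: a generic Grad-CAM, not the paper's exact setup.
model = models.resnet50(weights="IMAGENET1K_V1").eval()
activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations["feat"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    gradients["feat"] = grad_out[0].detach()

model.layer4.register_forward_hook(fwd_hook)        # last conv block
model.layer4.register_full_backward_hook(bwd_hook)

def grad_cam(x, class_idx=None):
    logits = model(x)                               # x: (1, 3, H, W)
    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))
    model.zero_grad()
    logits[0, class_idx].backward()
    A, dA = activations["feat"], gradients["feat"]  # (1, C, h, w)
    weights = dA.mean(dim=(2, 3), keepdim=True)     # GAP of gradients
    cam = F.relu((weights * A).sum(dim=1))          # (1, h, w)
    cam = cam / (cam.max() + 1e-8)                  # normalize to [0, 1]
    return F.interpolate(cam[None], size=x.shape[2:], mode="bilinear")[0]
```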
Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching
Title | Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching |
Authors | Tom Zahavy, Shie Mannor |
Abstract | We study the neural-linear bandit model for solving sequential decision-making problems with high dimensional side information. Neural-linear bandits leverage the representation power of deep neural networks and combine it with efficient exploration mechanisms, designed for linear contextual bandits, on top of the last hidden layer. Since the representation is being optimized during learning, information regarding exploration with “old” features is lost. Here, we propose the first limited memory neural-linear bandit that is resilient to this phenomenon, which we term catastrophic forgetting. We evaluate our method on a variety of real-world data sets, including regression, classification, and sentiment analysis, and observe that our algorithm is resilient to catastrophic forgetting and achieves superior performance. |
Tasks | Decision Making, Efficient Exploration, Multi-Armed Bandits, Sentiment Analysis |
Published | 2019-01-24 |
URL | https://arxiv.org/abs/1901.08612v2 |
https://arxiv.org/pdf/1901.08612v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-linear-bandits-overcoming |
Repo | |
Framework | |
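The core neural-linear mechanism is Thompson sampling with a Bayesian linear regression per arm on the last-hidden-layer features. The sketch below shows that linear head only; the paper's likelihood-matching step, which protects the posterior when the features themselves change, is deliberately omitted:

```python
import numpy as np

class LinearTS:
    """Thompson sampling with Bayesian linear regression per arm,
    operating on last-hidden-layer features phi(x). A minimal sketch of
    the neural-linear idea only."""
    def __init__(self, n_arms, dim, lam=1.0, noise=0.5):
        self.A = np.stack([lam * np.eye(dim)] * n_arms)  # precision per arm
        self.b = np.zeros((n_arms, dim))
        self.noise = noise

    def choose(self, phi):
        scores = []
        for a in range(len(self.b)):
            cov = self.noise**2 * np.linalg.inv(self.A[a])
            mu = np.linalg.solve(self.A[a], self.b[a])
            theta = np.random.multivariate_normal(mu, cov)  # posterior sample
            scores.append(theta @ phi)
        return int(np.argmax(scores))

    def update(self, arm, phi, reward):
        self.A[arm] += np.outer(phi, phi)
        self.b[arm] += reward * phi
```

Usage: call `choose(phi)` with the network's feature vector for the current context, then `update(arm, phi, reward)` after observing the reward.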
A Convolutional Neural Network for Language-Agnostic Source Code Summarization
Title | A Convolutional Neural Network for Language-Agnostic Source Code Summarization |
Authors | Jessica Moore, Ben Gelman, David Slater |
Abstract | Descriptive comments play a crucial role in the software engineering process. They decrease development time, enable better bug detection, and facilitate the reuse of previously written code. However, comments are commonly the last of a software developer’s priorities and are thus either insufficient or missing entirely. Automatic source code summarization may therefore have the ability to significantly improve the software development process. We introduce a novel encoder-decoder model that summarizes source code, effectively writing a comment to describe the code’s functionality. We make two primary innovations beyond current source code summarization models. First, our encoder is fully language-agnostic and requires no complex input preprocessing. Second, our decoder has an open vocabulary, enabling it to predict any word, even ones not seen in training. We demonstrate results comparable to state-of-the-art methods on a single-language data set and provide the first results on a data set consisting of multiple programming languages. |
Tasks | Code Summarization |
Published | 2019-03-29 |
URL | http://arxiv.org/abs/1904.00805v1 |
http://arxiv.org/pdf/1904.00805v1.pdf | |
PWC | https://paperswithcode.com/paper/a-convolutional-neural-network-for-language |
Repo | |
Framework | |
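As an illustration of what "fully language-agnostic, no complex preprocessing" can mean in practice, here is a hypothetical byte-level convolutional encoder in PyTorch; it is not the authors' architecture, and their open-vocabulary decoder is not reproduced:

```python
import torch
import torch.nn as nn

class ByteConvEncoder(nn.Module):
    """Illustrative language-agnostic encoder: read source code as raw
    bytes and apply stacked 1D convolutions, so no tokenizer or
    language-specific parsing is needed."""
    def __init__(self, emb=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(256, emb)  # one entry per byte value
        self.convs = nn.Sequential(
            nn.Conv1d(emb, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
        )

    def forward(self, code: str):
        ids = torch.tensor(list(code.encode("utf-8")))[None]  # (1, T)
        x = self.embed(ids).transpose(1, 2)    # (1, emb, T)
        h = self.convs(x)                      # (1, hidden, T)
        return h.max(dim=2).values             # (1, hidden) summary vector

enc = ByteConvEncoder()
vec = enc("def add(a, b):\n    return a + b\n")
```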
What Do Single-view 3D Reconstruction Networks Learn?
Title | What Do Single-view 3D Reconstruction Networks Learn? |
Authors | Maxim Tatarchenko, Stephan R. Richter, René Ranftl, Zhuwen Li, Vladlen Koltun, Thomas Brox |
Abstract | Convolutional networks for single-view object reconstruction have shown impressive performance and have become a popular subject of research. All existing techniques are united by the idea of having an encoder-decoder network that performs non-trivial reasoning about the 3D structure of the output space. In this work, we set up two alternative approaches that perform image classification and retrieval respectively. These simple baselines yield better results than state-of-the-art methods, both qualitatively and quantitatively. We show that encoder-decoder methods are statistically indistinguishable from these baselines, thus indicating that the current state of the art in single-view object reconstruction does not actually perform reconstruction but image classification. We identify aspects of popular experimental procedures that elicit this behavior and discuss ways to improve the current state of research. |
Tasks | 3D Reconstruction, Image Classification, Object Reconstruction, Single-View 3D Reconstruction |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03678v1 |
https://arxiv.org/pdf/1905.03678v1.pdf | |
PWC | https://paperswithcode.com/paper/190503678 |
Repo | |
Framework | |
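The retrieval baseline the paper sets up can be expressed in a few lines: embed the query image and return the 3D shape paired with the nearest training embedding. A minimal sketch, with the feature extractor left abstract:

```python
import numpy as np

def retrieval_baseline(query_feat, train_feats, train_shapes):
    """Retrieval baseline for single-view 3D 'reconstruction': return
    the training shape whose image embedding is closest to the query.
    A minimal sketch; the paper tunes the embedding space for retrieval.

    query_feat:   (d,) image feature of the test view
    train_feats:  (N, d) features of training views
    train_shapes: length-N list of 3D shapes (e.g. voxel grids)
    """
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    return train_shapes[int(np.argmin(dists))]
```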
Meta-Learning Neural Bloom Filters
Title | Meta-Learning Neural Bloom Filters |
Authors | Jack W Rae, Sergey Bartunov, Timothy P Lillicrap |
Abstract | There has been a recent trend in training neural networks to replace data structures that have been crafted by hand, with an aim for faster execution, better accuracy, or greater compression. In this setting, a neural data structure is instantiated by training a network over many epochs of its inputs until convergence. In applications where inputs arrive at high throughput, or are ephemeral, training a network from scratch is not practical. This motivates the need for few-shot neural data structures. In this paper we explore the learning of approximate set membership over a set of data in one-shot via meta-learning. We propose a novel memory architecture, the Neural Bloom Filter, which is able to achieve significant compression gains over classical Bloom Filters and existing memory-augmented neural networks. |
Tasks | Meta-Learning |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04304v1 |
https://arxiv.org/pdf/1906.04304v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-neural-bloom-filters-1 |
Repo | |
Framework | |
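For reference, the classical data structure the paper's learned variant is compared against can be written compactly; this is a standard Bloom filter, not the Neural Bloom Filter itself:

```python
import hashlib

class BloomFilter:
    """Classical Bloom filter baseline. False positives are possible;
    false negatives are not."""
    def __init__(self, n_bits=1024, n_hashes=4):
        self.bits = bytearray(n_bits)
        self.n_bits, self.n_hashes = n_bits, n_hashes

    def _indices(self, item):
        # Derive n_hashes independent indices by salting one hash.
        for salt in range(self.n_hashes):
            h = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, item):
        for i in self._indices(item):
            self.bits[i] = 1

    def __contains__(self, item):
        return all(self.bits[i] for i in self._indices(item))

bf = BloomFilter()
bf.add("cat"); bf.add("dog")
assert "cat" in bf          # always true for inserted items
print("fish" in bf)         # usually False; may be a false positive
```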
Automatic acute ischemic stroke lesion segmentation using semi-supervised learning
Title | Automatic acute ischemic stroke lesion segmentation using semi-supervised learning |
Authors | Bin Zhao, Hong Wu, Guohua Liu, Chen Cao, Song Jin, Zhiyang Liu, Shuxue Ding |
Abstract | Ischemic stroke is a common disease in the elderly population, which can cause long-term disability and even death. However, the time window for treatment of ischemic stroke in its acute stage is very short. To quickly localize and quantitatively evaluate acute ischemic stroke (AIS) lesions, many deep-learning-based lesion segmentation methods have been proposed in the literature, where a deep convolutional neural network (CNN) is trained on hundreds of fully labeled subjects with accurate annotations of AIS lesions. Although high segmentation accuracy can be achieved, the accurate labels must be annotated by experienced clinicians, and it is therefore very time-consuming to obtain a large number of fully labeled subjects. In this paper, we propose a semi-supervised method to automatically segment AIS lesions in diffusion-weighted images (DWIs) and apparent diffusion coefficient maps. By using a large number of weakly labeled subjects and a small number of fully labeled subjects, our proposed method is able to accurately detect and segment AIS lesions. In particular, our proposed method consists of three parts: 1) a double-path classification net (DPC-Net), trained in a weakly supervised way, is used to detect suspicious regions of AIS lesions; 2) a pixel-level K-Means clustering algorithm is used to identify hyperintense regions on the DWIs; and 3) a region-growing algorithm combines the outputs of the DPC-Net and the K-Means clustering to obtain the final precise lesion segmentation. In our experiment, we use 460 weakly labeled subjects and 15 fully labeled subjects to train and fine-tune the proposed method. Evaluating on a clinical dataset with 150 fully labeled subjects, our proposed method achieves a mean Dice coefficient of 0.639 and a lesion-wise F1 score of 0.799. |
Tasks | Ischemic Stroke Lesion Segmentation, Lesion Segmentation |
Published | 2019-08-10 |
URL | https://arxiv.org/abs/1908.03735v1 |
https://arxiv.org/pdf/1908.03735v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-acute-ischemic-stroke-lesion |
Repo | |
Framework | |
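Stages 2 and 3 of the pipeline (K-Means hyperintensity detection on the DWI, then region growing seeded by the DPC-Net output) can be sketched as follows. This is a simplified reading of the abstract: the DPC-Net itself is not shown, and the region-growing step is reduced to keeping candidate components that overlap the seeds:

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def hyperintense_mask(dwi, k=3):
    """Stage 2: cluster DWI intensities with K-Means and keep the
    brightest cluster as the hyperintense candidate mask."""
    vals = dwi.reshape(-1, 1).astype(float)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(vals)
    means = [vals[labels == c].mean() for c in range(k)]
    return (labels == int(np.argmax(means))).reshape(dwi.shape)

def grow_from_seeds(candidate_mask, seed_mask):
    """Stage 3 (simplified): keep candidate connected components that
    overlap the DPC-Net's suspicious regions (the seeds)."""
    comps, n = ndimage.label(candidate_mask)
    keep = [c for c in range(1, n + 1) if seed_mask[comps == c].any()]
    return np.isin(comps, keep)
```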
Hyperbolic Discounting and Learning over Multiple Horizons
Title | Hyperbolic Discounting and Learning over Multiple Horizons |
Authors | William Fedus, Carles Gelada, Yoshua Bengio, Marc G. Bellemare, Hugo Larochelle |
Abstract | Reinforcement learning (RL) typically defines a discount factor as part of the Markov Decision Process. The discount factor values future rewards by an exponential scheme that leads to theoretical convergence guarantees of the Bellman equation. However, evidence from psychology, economics and neuroscience suggests that humans and animals instead have hyperbolic time-preferences. In this work we revisit the fundamentals of discounting in RL and bridge this disconnect by implementing an RL agent that acts via hyperbolic discounting. We demonstrate that a simple approach approximates hyperbolic discount functions while still using familiar temporal-difference learning techniques in RL. Additionally, and independent of hyperbolic discounting, we make a surprising discovery that simultaneously learning value functions over multiple time-horizons is an effective auxiliary task which often improves over a strong value-based RL agent, Rainbow. |
Tasks | |
Published | 2019-02-19 |
URL | http://arxiv.org/abs/1902.06865v3 |
http://arxiv.org/pdf/1902.06865v3.pdf | |
PWC | https://paperswithcode.com/paper/hyperbolic-discounting-and-learning-over |
Repo | |
Framework | |
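The mathematical hook behind the multi-horizon trick is the identity 1/(1 + kt) = ∫₀¹ γ^(kt) dγ: a hyperbolic discount is an average of exponential discounts, so learning value functions over several γ values lets an agent approximate hyperbolic discounting with ordinary temporal-difference methods. A quick numerical check:

```python
import numpy as np

def hyperbolic_via_exponentials(t, k=0.1, n=64):
    """Approximate the hyperbolic discount 1/(1 + k*t) as a Riemann sum
    over exponential discounts, using 1/(1+kt) = integral_0^1 gamma^(kt) dgamma."""
    gammas = (np.arange(n) + 0.5) / n        # midpoint rule on (0, 1)
    return np.mean(gammas ** (k * t))

for t in [0, 1, 10, 100]:
    approx = hyperbolic_via_exponentials(t)
    exact = 1.0 / (1.0 + 0.1 * t)
    print(f"t={t:>3}  approx={approx:.4f}  exact={exact:.4f}")
```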
Deep Neural Networks Predicting Oil Movement in a Development Unit
Title | Deep Neural Networks Predicting Oil Movement in a Development Unit |
Authors | Pavel Temirchev, Maxim Simonov, Ruslan Kostoev, Evgeny Burnaev, Ivan Oseledets, Alexey Akhmetov, Andrey Margarit, Alexander Sitnikov, Dmitry Koroteev |
Abstract | We present a novel technique for assessing the dynamics of multiphase fluid flow in an oil reservoir. We demonstrate an efficient workflow for handling 3D reservoir simulation data that is orders of magnitude faster than the conventional routine. The workflow (we call it the “Metamodel”) is based on projecting the system dynamics into a latent variable space using a Variational Autoencoder model, where a Recurrent Neural Network predicts the dynamics. We show that, when trained on multiple results of conventional reservoir modelling, the Metamodel does not significantly compromise the accuracy of the reservoir dynamics reconstruction. It allows forecasting not only the flow rates from the wells but also the dynamics of pressure and fluid saturations within the reservoir. The results open a new perspective on the optimization of oilfield development, as scenario screening can be substantially accelerated. |
Tasks | |
Published | 2019-01-08 |
URL | https://arxiv.org/abs/1901.02549v2 |
https://arxiv.org/pdf/1901.02549v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-predicting-oil-movement |
Repo | |
Framework | |
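Schematically, the "Metamodel" encodes a reservoir state into a latent vector, advances it with a recurrent network, and decodes back to physical fields. The sketch below captures that shape only; all layer sizes and the flat-vector state representation are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class LatentDynamics(nn.Module):
    """Schematic of the 'Metamodel': a VAE compresses the reservoir
    state, an LSTM advances it in latent space, and the decoder maps
    latents back to physical fields. Sizes are illustrative only."""
    def __init__(self, state_dim=4096, latent=64, hidden=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(state_dim, 512), nn.ReLU())
        self.mu, self.logvar = nn.Linear(512, latent), nn.Linear(512, latent)
        self.rnn = nn.LSTM(latent, hidden, batch_first=True)
        self.to_z = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 512), nn.ReLU(),
                                 nn.Linear(512, state_dim))

    def encode(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.

    def rollout(self, x0, steps):
        z = self.encode(x0)[:, None, :]          # (B, 1, latent)
        states, mem = [], None
        for _ in range(steps):
            out, mem = self.rnn(z, mem)
            z = self.to_z(out)                   # next latent state
            states.append(self.dec(z[:, 0]))     # decoded physical field
        return torch.stack(states, dim=1)        # (B, steps, state_dim)
```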
Learning joint lesion and tissue segmentation from task-specific hetero-modal datasets
Title | Learning joint lesion and tissue segmentation from task-specific hetero-modal datasets |
Authors | Reuben Dorent, Wenqi Li, Jinendra Ekanayake, Sebastien Ourselin, Tom Vercauteren |
Abstract | Brain tissue segmentation from multimodal MRI is a key building block of many neuroscience analysis pipelines. It could also play an important role in many clinical imaging scenarios. Established tissue segmentation approaches have, however, not been developed to cope with large anatomical changes resulting from pathology. The effect of the presence of brain lesions, for example, on their performance is thus currently uncontrolled and practically unpredictable. In contrast, with the advent of deep neural networks (DNNs), segmentation of brain lesions has matured significantly and is achieving performance levels making it of interest for clinical use. However, few existing approaches allow for jointly segmenting normal tissue and brain lesions. Developing a DNN for such a joint task is currently hampered by the fact that annotated datasets typically address only one specific task and rely on a task-specific hetero-modal imaging protocol. In this work, we propose a novel approach to build a joint tissue and lesion segmentation model from task-specific hetero-modal and partially annotated datasets. Starting from a variational formulation of the joint problem, we show how the expected risk can be decomposed and optimised empirically. We exploit an upper bound of the risk to deal with missing imaging modalities. For each task, our approach reaches performance comparable to task-specific and fully supervised models. |
Tasks | Lesion Segmentation |
Published | 2019-07-07 |
URL | https://arxiv.org/abs/1907.03327v1 |
https://arxiv.org/pdf/1907.03327v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-joint-lesion-and-tissue-segmentation |
Repo | |
Framework | |
Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
Title | Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing |
Authors | Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, Junshan Zhang |
Abstract | With the breakthroughs in deep learning, recent years have witnessed a booming of artificial intelligence (AI) applications and services, spanning from personal assistants to recommendation systems to video/audio surveillance. More recently, with the proliferation of mobile computing and the Internet of Things (IoT), billions of mobile and IoT devices are connected to the Internet, generating enormous volumes of data at the network edge. Driven by this trend, there is an urgent need to push the AI frontiers to the network edge so as to fully unleash the potential of edge big data. To meet this demand, edge computing, an emerging paradigm that pushes computing tasks and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new interdisciplinary field, edge AI or edge intelligence, is beginning to receive a tremendous amount of interest. However, research on edge intelligence is still in its infancy, and a dedicated venue for exchanging the recent advances in edge intelligence is highly desired by both the computer systems and artificial intelligence communities. To this end, we conduct a comprehensive survey of recent research efforts on edge intelligence. Specifically, we first review the background and motivation for artificial intelligence running at the network edge. We then provide an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning model training/inference at the network edge. Finally, we discuss future research opportunities on edge intelligence. We believe that this survey will attract escalating attention, stimulate fruitful discussions, and inspire further research ideas on edge intelligence. |
Tasks | Recommendation Systems |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10083v1 |
https://arxiv.org/pdf/1905.10083v1.pdf | |
PWC | https://paperswithcode.com/paper/edge-intelligence-paving-the-last-mile-of |
Repo | |
Framework | |
DeepMRSeg: A convolutional deep neural network for anatomy and abnormality segmentation on MR images
Title | DeepMRSeg: A convolutional deep neural network for anatomy and abnormality segmentation on MR images |
Authors | Jimit Doshi, Guray Erus, Mohamad Habes, Christos Davatzikos |
Abstract | Segmentation has been a major task in neuroimaging. A large number of automated methods have been developed for segmenting healthy and diseased brain tissues. In recent years, deep learning techniques have attracted a lot of attention as a result of their high accuracy in different segmentation problems. We present a new deep-learning-based segmentation method, DeepMRSeg, that can be applied in a generic way to a variety of segmentation tasks. The proposed architecture combines recent advances in the field of biomedical image segmentation and computer vision. We use a modified UNet architecture that takes advantage of multiple convolution filter sizes to achieve multi-scale feature extraction adaptive to the desired segmentation task. Importantly, our method operates on minimally processed raw MRI scans. We validated our method on a wide range of segmentation tasks, including white matter lesion segmentation, segmentation of deep brain structures, and hippocampus segmentation. We provide code and pre-trained models to allow researchers to apply our method on their own datasets. |
Tasks | Lesion Segmentation, Semantic Segmentation |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.02110v1 |
https://arxiv.org/pdf/1907.02110v1.pdf | |
PWC | https://paperswithcode.com/paper/deepmrseg-a-convolutional-deep-neural-network |
Repo | |
Framework | |
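One plausible reading of "multiple convolution filter sizes" is a block of parallel convolutions with different kernels whose outputs are concatenated. The sketch below is that generic construction, not the authors' exact block:

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    """Parallel convolutions with several kernel sizes, concatenated:
    one plausible reading of the DeepMRSeg abstract, not the authors'
    exact block."""
    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7)):
        super().__init__()
        branch_ch = out_ch // len(kernels)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, k, padding=k // 2),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            )
            for k in kernels
        )

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

block = MultiScaleBlock(1, 48)           # e.g. a single-channel MR slice
y = block(torch.randn(2, 1, 128, 128))   # -> (2, 48, 128, 128)
```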
Recurrent Aggregation Learning for Multi-View Echocardiographic Sequences Segmentation
Title | Recurrent Aggregation Learning for Multi-View Echocardiographic Sequences Segmentation |
Authors | Ming Li, Weiwei Zhang, Guang Yang, Chengjia Wang, Heye Zhang, Huafeng Liu, Wei Zheng, Shuo Li |
Abstract | Multi-view echocardiographic sequence segmentation is crucial for clinical diagnosis. However, this task is challenging due to limited labeled data, heavy noise, and large gaps across views. Here we propose a recurrent aggregation learning method to tackle this challenging task. Pyramid ConvBlocks extract multi-level and multi-scale features efficiently. Hierarchical ConvLSTMs then fuse these features and capture spatio-temporal information in multi-level and multi-scale space. We further introduce a double-branch aggregation mechanism for segmentation and classification, which are mutually promoted by deep aggregation of multi-level and multi-scale features. The segmentation branch provides information to guide the classification, while the classification branch affords multi-view regularization to refine the segmentations and further lessen gaps across views. Our method is built as an end-to-end framework for segmentation and classification. Extensive experiments on our multi-view dataset (9000 labeled images) and the CAMUS dataset (1800 labeled images) corroborate that our method achieves not only superior segmentation and classification accuracy but also prominent temporal stability. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.11292v1 |
https://arxiv.org/pdf/1907.11292v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-aggregation-learning-for-multi-view |
Repo | |
Framework | |
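The "hierarchical ConvLSTMs" in the abstract are built from ConvLSTM cells, which compute the LSTM gates with convolutions so spatial structure is preserved across time. A minimal cell, illustrative rather than the paper's full pyramid:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: the building block behind hierarchical
    ConvLSTMs. Gates are computed with convolutions, so hidden and cell
    states remain spatial feature maps."""
    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state                                  # hidden, cell maps
        i, f, o, g = self.gates(torch.cat([x, h], 1)).chunk(4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g
        h = o * c.tanh()
        return h, (h, c)

cell = ConvLSTMCell(16, 32)
h = c = torch.zeros(1, 32, 64, 64)
for frame in torch.randn(8, 1, 16, 64, 64):           # 8-frame sequence
    out, (h, c) = cell(frame, (h, c))
```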
It’s All About The Scale – Efficient Text Detection Using Adaptive Scaling
Title | It’s All About The Scale – Efficient Text Detection Using Adaptive Scaling |
Authors | Elad Richardson, Yaniv Azar, Or Avioz, Niv Geron, Tomer Ronen, Zach Avraham, Stav Shapiro |
Abstract | “Text can appear anywhere”. This property requires us to carefully process all the pixels in an image in order to accurately localize all text instances. In particular, for the more difficult task of localizing small text regions, many methods use an enlarged image or even several rescaled ones as their input. This significantly increases the processing time of the entire image and needlessly enlarges background regions. If we were to have a prior telling us the coarse location of text instances in the image and their approximate scale, we could have adaptively chosen which regions to process and how to rescale them, thus significantly reducing the processing time. To estimate this prior we propose a segmentation-based network with an additional “scale predictor”, an output channel that predicts the scale of each text segment. The network is applied on a scaled down image to efficiently approximate the desired prior, without processing all the pixels of the original image. The approximated prior is then used to create a compact image containing only text regions, resized to a canonical scale, which is fed again to the segmentation network for fine-grained detection. We show that our approach offers a powerful alternative to fixed scaling schemes, achieving an equivalent accuracy to larger input scales while processing far fewer pixels. Qualitative and quantitative results are presented on the ICDAR15 and ICDAR17 MLT benchmarks to validate our approach. |
Tasks | |
Published | 2019-07-28 |
URL | https://arxiv.org/abs/1907.12122v1 |
https://arxiv.org/pdf/1907.12122v1.pdf | |
PWC | https://paperswithcode.com/paper/its-all-about-the-scale-efficient-text |
Repo | |
Framework | |
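The two-pass inference described in the abstract can be sketched as follows. `coarse_net` and `fine_net` are hypothetical callables, and the assumption that the scale channel predicts text height in original-image pixels is ours, not the paper's:

```python
import numpy as np
from scipy import ndimage

CANONICAL_TEXT_HEIGHT = 32  # assumed canonical scale

def adaptive_two_pass(image, coarse_net, fine_net, down=4):
    """Two-pass detection sketch: a coarse pass on a downscaled image
    yields a text mask and a per-pixel scale map; text regions are then
    rescaled to a canonical size for the fine pass."""
    small = image[::down, ::down]                # cheap downscale
    text_mask, scale_map = coarse_net(small)     # coarse prior
    comps, n = ndimage.label(text_mask)
    detections = []
    for c in range(1, n + 1):
        ys, xs = np.where(comps == c)
        # Map the component box back to full-resolution coordinates.
        y0, y1 = ys.min() * down, (ys.max() + 1) * down
        x0, x1 = xs.min() * down, (xs.max() + 1) * down
        crop = image[y0:y1, x0:x1]
        # Resize so the predicted text height becomes canonical.
        pred_h = float(scale_map[comps == c].mean())
        factor = CANONICAL_TEXT_HEIGHT / max(pred_h, 1e-6)
        crop = ndimage.zoom(crop, factor, order=1)
        detections.extend(fine_net(crop))        # fine-grained detection
    return detections
```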
Chargrid-OCR: End-to-end Trainable Optical Character Recognition for Printed Documents using Instance Segmentation
Title | Chargrid-OCR: End-to-end Trainable Optical Character Recognition for Printed Documents using Instance Segmentation |
Authors | Christian Reisswig, Anoop R Katti, Marco Spinaci, Johannes Höhne |
Abstract | We present an end-to-end trainable approach for Optical Character Recognition (OCR) on printed documents. Specifically, we propose a model that predicts a) a two-dimensional character grid (*chargrid*) representation of a document image as a semantic segmentation task, and b) character boxes delineating character instances as an object detection task. To train the model, we build two large-scale datasets without resorting to any manual annotation: synthetic documents with clean labels and real documents with noisy labels. We demonstrate experimentally that our method, trained on the combination of these datasets, (i) outperforms previous state-of-the-art approaches in accuracy, (ii) is easily parallelizable on GPU and is therefore significantly faster, and (iii) is easy to train and adapt to a new domain. |
Tasks | Instance Segmentation, Object Detection, Optical Character Recognition, Semantic Segmentation |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04469v4 |
https://arxiv.org/pdf/1909.04469v4.pdf | |
PWC | https://paperswithcode.com/paper/chargrid-ocr-end-to-end-trainable-optical |
Repo | |
Framework | |
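The dual-task design (a per-pixel character grid as semantic segmentation plus dense character-box regression as detection) can be sketched with a shared backbone and two 1x1-conv heads. The backbone and output parameterization here are placeholders, not the paper's network:

```python
import torch
import torch.nn as nn

class ChargridOCRHeads(nn.Module):
    """Sketch of the dual-task design: one head predicts a per-pixel
    character-class grid (semantic segmentation) and one regresses
    character boxes (detection). Backbone is a placeholder."""
    def __init__(self, n_chars=128, feat_ch=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        self.char_head = nn.Conv2d(feat_ch, n_chars, 1)  # chargrid logits
        self.box_head = nn.Conv2d(feat_ch, 5, 1)         # (cx, cy, w, h, conf)

    def forward(self, page):                 # page: (B, 1, H, W)
        f = self.backbone(page)
        return self.char_head(f), self.box_head(f)

model = ChargridOCRHeads()
chars, boxes = model(torch.randn(1, 1, 256, 256))
# chars: (1, 128, 256, 256) per-pixel character logits
# boxes: (1, 5, 256, 256) dense box regression + confidence
```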