January 25, 2020

Paper Group ANR 1716

Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression. Nearest-Neighbour-Induced Isolation Similarity and its Impact on Density-Based Clustering. Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion. DR$\vert$GRADUATE: uncertainty-aware deep learning-based diabetic retino …

Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression


Title	Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression
Authors	Bidur Khanal, Lavsen Dahal, Prashant Adhikari, Bishesh Khanal
Abstract	Correct evaluation and treatment of scoliosis require accurate estimation of spinal curvature. The current gold standard is to manually estimate Cobb angles in spinal X-ray images, which is time-consuming and has high inter-rater variability. We propose an automatic method with a novel framework that first detects vertebrae as objects, followed by a landmark detector that estimates the 4 landmark corners of each vertebra separately. Cobb angles are calculated using the slope of each vertebra obtained from the predicted landmarks. For inference on test data, we perform pre- and post-processing that includes cropping, outlier rejection and smoothing of the predicted landmarks. The results were assessed in the AASCE MICCAI challenge 2019 and showed promise, with a SMAPE score of 25.69 on the challenge test set.
Tasks
Published	2019-10-31
URL	https://arxiv.org/abs/1910.14202v1
PDF	https://arxiv.org/pdf/1910.14202v1.pdf
PWC	https://paperswithcode.com/paper/automatic-cobb-angle-detection-using-vertebra
Repo
Framework
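
The abstract above says Cobb angles are computed from the slope of each vertebra, which in turn comes from its four predicted corner landmarks. A minimal sketch of that idea follows. The corner ordering, the averaging of the two endplate directions, and the function names are assumptions made for illustration, not the authors' implementation, and only the single largest angle between any two vertebrae is returned.

```python
import math


def vertebra_tilt_deg(corners):
    """Tilt of one vertebra in degrees.

    corners is assumed to be ordered as
    (top_left, top_right, bottom_left, bottom_right), each an (x, y) pair.
    """
    tl, tr, bl, br = corners
    # Average the direction vectors of the upper and lower endplates.
    dx = ((tr[0] - tl[0]) + (br[0] - bl[0])) / 2.0
    dy = ((tr[1] - tl[1]) + (br[1] - bl[1])) / 2.0
    return math.degrees(math.atan2(dy, dx))


def max_cobb_angle_deg(vertebrae):
    """Largest angle between the tilts of any two vertebrae (a simplification)."""
    tilts = [vertebra_tilt_deg(v) for v in vertebrae]
    # The largest pairwise difference is the spread between the extremes.
    return max(tilts) - min(tilts)
```

Any outlier rejection or smoothing of the landmarks, which the abstract mentions, would run before these functions are called.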

Nearest-Neighbour-Induced Isolation Similarity and its Impact on Density-Based Clustering


Title	Nearest-Neighbour-Induced Isolation Similarity and its Impact on Density-Based Clustering
Authors	Xiaoyu Qin, Kai Ming Ting, Ye Zhu, Vincent CS Lee
Abstract	A recent proposal of a data-dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity, and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly uplifted to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of cluster called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected, while density-connected clusters cannot.
Tasks
Published	2019-06-30
URL	https://arxiv.org/abs/1907.00378v1
PDF	https://arxiv.org/pdf/1907.00378v1.pdf
PWC	https://paperswithcode.com/paper/nearest-neighbour-induced-isolation
Repo
Framework
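
The abstract above does not spell out how the nearest-neighbour-induced similarity is computed. The sketch below assumes one plausible reading: repeatedly sample `psi` points as centres, assign every point to its nearest centre, and take the similarity of two points to be the fraction of rounds in which they share a centre. The parameter names `psi` and `t`, both function names, and the use of `1 - similarity` as the dissimilarity in DBSCAN's neighbourhood query are all assumptions for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nn_isolation_similarity(points, psi=4, t=100, seed=0):
    """Estimate pairwise similarity as the share of t random partitionings
    in which two points fall in the same nearest-centre cell (assumed scheme)."""
    rng = random.Random(seed)
    n = len(points)
    same_cell = [[0] * n for _ in range(n)]
    for _ in range(t):
        centres = rng.sample(points, psi)
        # Index of the nearest sampled centre for every point.
        cell = [min(range(psi), key=lambda c: sq_dist(p, centres[c])) for p in points]
        for i in range(n):
            for j in range(n):
                if cell[i] == cell[j]:
                    same_cell[i][j] += 1
    return [[count / t for count in row] for row in same_cell]


def eps_neighbourhood(sim, i, eps):
    """DBSCAN-style neighbourhood query using 1 - similarity as dissimilarity."""
    return [j for j in range(len(sim)) if 1.0 - sim[i][j] <= eps]
```

The rest of DBSCAN (core-point test and cluster expansion) would stay as it is, which matches the abstract's statement that only the distance measure is replaced.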

Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion


Title	Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion
Authors	Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille
Abstract	Deep convolutional neural networks (DCNNs) are powerful models that yield impressive results at object classification. However, recent work has shown that they do not generalize well to partially occluded objects and to mask attacks. In contrast to DCNNs, compositional models are robust to partial occlusion; however, they are not as discriminative as deep models. In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks. Our model is learned in two steps. First, a standard DCNN is trained for image classification. Subsequently, we cluster the DCNN features into dictionaries. We show that the dictionary components resemble object part detectors and learn the spatial distribution of parts for each object class. We propose mixtures of compositional models to account for large changes in the spatial activation patterns (e.g. due to changes in the 3D pose of an object). At runtime, an image is first classified by the DCNN in a feedforward manner. The prediction uncertainty is used to detect partially occluded objects, which in turn are classified by the compositional model. Our experimental results demonstrate that combining compositional models and DCNNs resolves a fundamental problem of current deep learning approaches to computer vision: the combined model recognizes occluded objects, even when it has not been exposed to occluded objects during training, while at the same time maintaining high discriminative performance for non-occluded objects.
Tasks	Image Classification, Object Classification
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11826v4
PDF	https://arxiv.org/pdf/1905.11826v4.pdf
PWC	https://paperswithcode.com/paper/compositional-convolutional-networks-for
Repo
Framework

DR$\vert$GRADUATE: uncertainty-aware deep learning-based diabetic retinopathy grading in eye fundus images


Title	DR$\vert$GRADUATE: uncertainty-aware deep learning-based diabetic retinopathy grading in eye fundus images
Authors	Teresa Araújo, Guilherme Aresta, Luís Mendonça, Susana Penas, Carolina Maia, Ângela Carneiro, Ana Maria Mendonça, Aurélio Campilho
Abstract	Diabetic retinopathy (DR) grading is crucial in determining the patients’ adequate treatment and follow-up, but the screening process can be tiresome and prone to errors. Deep learning approaches have shown promising performance as computer-aided diagnosis (CAD) systems, but their black-box behaviour hinders clinical application. We propose DR$\vert$GRADUATE, a novel deep learning-based DR grading CAD system that supports its decision by providing a medically interpretable explanation and an estimation of how uncertain that prediction is, allowing the ophthalmologist to measure how much that decision should be trusted. We designed DR$\vert$GRADUATE taking into account the ordinal nature of the DR grading problem. A novel Gaussian-sampling approach built upon a Multiple Instance Learning framework allows DR$\vert$GRADUATE to infer an image grade associated with an explanation map and a prediction uncertainty while being trained only with image-wise labels. DR$\vert$GRADUATE was trained on the Kaggle training set and evaluated across multiple datasets. In DR grading, a quadratic-weighted Cohen’s kappa (QWK) between 0.71 and 0.84 was achieved in five different datasets. We show that high QWK values occur for images with low prediction uncertainty, thus indicating that this uncertainty is a valid measure of the predictions’ quality. Further, bad quality images are generally associated with higher uncertainties, showing that images not suitable for diagnosis indeed lead to less trustworthy predictions. Additionally, tests on unfamiliar medical image data types suggest that DR$\vert$GRADUATE allows outlier detection. The attention maps generally highlight regions of interest for diagnosis. These results show the great potential of DR$\vert$GRADUATE as a second-opinion system in DR severity grading.
Tasks	Multiple Instance Learning, Outlier Detection
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11777v1
PDF	https://arxiv.org/pdf/1910.11777v1.pdf
PWC	https://paperswithcode.com/paper/drvertgraduate-uncertainty-aware-deep
Repo
Framework
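
The abstract above reports results as quadratic-weighted Cohen's kappa (QWK). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, assuming grades are encoded as integers from 0 to `n_classes - 1`. The function name and the encoding are assumptions for illustration; the authors' evaluation code is not shown in the source.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Returns 1.0 for perfect agreement, about 0.0 for chance-level agreement,
    and negative values for agreement worse than chance.
    """
    n = len(y_true)
    # Observed confusion counts: rows are true grades, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # The penalty grows with the squared distance between grades.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected
```

Because the weight is quadratic in the grade difference, confusing neighbouring grades costs far less than confusing distant ones, which suits an ordinal problem such as DR grading. The function divides by zero if every label and every prediction share a single grade, so a caller would need to guard that case.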

Adaptive Regularization via Residual Smoothing in Deep Learning Optimization


Title	Adaptive Regularization via Residual Smoothing in Deep Learning Optimization
Authors	Junghee Cho, Junseok Kwon, Byung-Woo Hong
Abstract	We present an adaptive regularization algorithm that can be effectively applied to the optimization problem in a deep learning framework. Our regularization algorithm aims to take into account the fitness of the data to the current state of the model in the determination of regularity to achieve better generalization. The degree of regularization at each element in the target space of the neural network architecture is determined adaptively, based on the residual at each optimization iteration. Our adaptive regularization algorithm is designed to apply a diffusion process driven by the heat equation with spatially varying diffusivity depending on the probability density function following a certain distribution of the residual. Our data-driven regularity is imposed by adaptively smoothing a simplified objective function in which the explicit regularization term is omitted, alternating between the evaluation of the residual and the determination of the degree of its regularity. The effectiveness of our algorithm is empirically demonstrated by numerical experiments on image classification problems, indicating that our algorithm outperforms other commonly used optimization algorithms in terms of generalization using popular deep learning models and benchmark datasets.
Tasks	Image Classification
Published	2019-07-23
URL	https://arxiv.org/abs/1907.09750v2
PDF	https://arxiv.org/pdf/1907.09750v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-regularization-via-residual
Repo
Framework

Automatic segmentation and determining radiodensity of the liver in a large-scale CT database


Title	Automatic segmentation and determining radiodensity of the liver in a large-scale CT database
Authors	N. S. Kulberg, A. B. Elizarov, V. P. Novik, V. A. Gombolevsky, A. P. Gonchar, A. L. Alliua, V. Yu. Bosin, A. V. Vladzymyrsky, S. P. Morozov
Abstract	This study proposes an automatic technique for liver segmentation in computed tomography (CT) images. Localization of the liver volume is based on the correlation with an optimized set of liver templates developed by the authors that allows clear geometric interpretation. Radiodensity values are calculated based on the boundaries of the segmented liver, which allows identifying liver abnormalities. The performance of the technique was evaluated on 700 CT images from the dataset of the Unified Radiological Information System (URIS) of Moscow. Despite the decrease in accuracy, the technique is applicable to CT volumes with a partially visible region of the liver. The technique can be used to process CT images obtained in various patient positions and over a wide range of exposure parameters. It is capable of dealing with low-dose CT scans in a real large-scale medical database with over 1 million studies.
Tasks	Computed Tomography (CT), Liver Segmentation
Published	2019-12-31
URL	https://arxiv.org/abs/1912.13290v1
PDF	https://arxiv.org/pdf/1912.13290v1.pdf
PWC	https://paperswithcode.com/paper/automatic-segmentation-and-determining
Repo
Framework

Semi-Supervised Histology Classification using Deep Multiple Instance Learning and Contrastive Predictive Coding


Title	Semi-Supervised Histology Classification using Deep Multiple Instance Learning and Contrastive Predictive Coding
Authors	Ming Y. Lu, Richard J. Chen, Jingwen Wang, Debora Dillon, Faisal Mahmood
Abstract	Convolutional neural networks can be trained to perform histology slide classification using weak annotations with multiple instance learning (MIL). However, given the paucity of labeled histology data, direct application of MIL can easily suffer from overfitting and the network is unable to learn rich feature representations due to the weak supervisory signal. We propose to overcome such limitations with a two-stage semi-supervised approach that combines the power of data-efficient self-supervised feature learning via contrastive predictive coding (CPC) and the interpretability and flexibility of regularized attention-based MIL. We apply our two-stage CPC + MIL semi-supervised pipeline to the binary classification of breast cancer histology images. Across five random splits, we report state-of-the-art performance with a mean validation accuracy of 95% and an area under the ROC curve of 0.968. We further evaluate the quality of features learned via CPC relative to simple transfer learning and show that strong classification performance using CPC features can be efficiently leveraged under the MIL framework even with the feature encoder frozen.
Tasks	Classification Of Breast Cancer Histology Images, Multiple Instance Learning, Transfer Learning
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10825v3
PDF	https://arxiv.org/pdf/1910.10825v3.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-histology-classification
Repo
Framework
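
The abstract above relies on attention-based multiple instance learning, in which patch-level features are pooled into one slide-level feature using learned attention weights. The sketch below shows a common form of attention pooling (a softmax over scores of the form `w · tanh(V h)`). Whether the paper uses exactly this form, and how its regularization enters, is not stated in the abstract, so the parameter names `V` and `w` and the function name are assumptions.

```python
import math


def attention_pool(instances, V, w):
    """Pool a bag of instance feature vectors into one bag-level vector.

    instances: list of feature vectors (lists of floats), one per patch.
    V: projection matrix given as a list of rows; w: scoring vector.
    Returns (bag_feature, attention_weights).
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * zj for wj, zj in zip(w, hidden)))
    # Numerically stable softmax over the instance scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights
```

The returned weights are what makes this kind of pooling interpretable: they indicate how much each patch contributed to the slide-level prediction.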

DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM


Title	DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM
Authors	Ryo Hachiuma, Christian Pirchheim, Dieter Schmalstieg, Hideo Saito
Abstract	We present DetectFusion, an RGB-D SLAM system that runs in real-time and can robustly handle semantically known and unknown objects that can move dynamically in the scene. Our system detects, segments and assigns semantic class labels to known objects in the scene, while tracking and reconstructing them even when they move independently in front of the monocular camera. In contrast to related work, we achieve real-time computational performance on semantic instance segmentation with a novel method combining 2D object detection and 3D geometric segmentation. In addition, we propose a method for detecting and segmenting the motion of semantically unknown objects, thus further improving the accuracy of camera tracking and map reconstruction. We show that our method performs on par with or better than previous work in terms of localization and object reconstruction accuracy, while achieving about 20 FPS even if the objects are segmented in each frame.
Tasks	Instance Segmentation, Object Detection, Object Reconstruction, Semantic Segmentation
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09127v1
PDF	https://arxiv.org/pdf/1907.09127v1.pdf
PWC	https://paperswithcode.com/paper/detectfusion-detecting-and-segmenting-both
Repo
Framework

A Deep Neural Network for Finger Counting and Numerosity Estimation


Title	A Deep Neural Network for Finger Counting and Numerosity Estimation
Authors	Leszek Pecyna, Angelo Cangelosi, Alessandro Di Nuovo
Abstract	In this paper, we present neuro-robotics models with a deep artificial neural network capable of generating finger counting positions and number estimation. We first train the model in an unsupervised manner where each layer is treated as a Restricted Boltzmann Machine or an autoencoder. Such a model is further trained in a supervised way. This type of pre-training is tested on our baseline model and two methods of pre-training are compared. The network is extended to produce finger counting positions. The performance in number estimation of such an extended model is evaluated. We test the hypothesis that the subitizing process can be obtained by a single model that is also used for estimation of higher numerosities. The results confirm the importance of unsupervised training in our enumeration task and show some similarities to human behaviour in the case of subitizing.
Tasks
Published	2019-07-09
URL	https://arxiv.org/abs/1907.05270v2
PDF	https://arxiv.org/pdf/1907.05270v2.pdf
PWC	https://paperswithcode.com/paper/a-deep-neural-network-for-finger-counting-and
Repo
Framework

Adaptive Learning of Aggregate Analytics under Dynamic Workloads


Title	Adaptive Learning of Aggregate Analytics under Dynamic Workloads
Authors	Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou
Abstract	Large organizations have seamlessly incorporated data-driven decision making in their operations. However, as data volumes increase, expensive big data infrastructures are called to the rescue. In this setting, analytics tasks become very costly in terms of query response time, resource consumption, and money in cloud deployments, especially when base data are stored across geographically distributed data centers. Therefore, we introduce an adaptive Machine Learning mechanism which is light-weight, stored client-side, can estimate the answers of a variety of aggregate queries and can avoid the big data backend. The estimations are performed in milliseconds and are inexpensive and accurate, as the mechanism learns from past analytical-query patterns. However, as analytic queries are ad-hoc and analysts’ interests change over time, we develop solutions that can swiftly and accurately detect such changes and adapt to new query patterns. The capabilities of our approach are demonstrated using extensive evaluation with real and synthetic datasets.
Tasks	Decision Making
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04772v2
PDF	https://arxiv.org/pdf/1908.04772v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-learning-of-aggregate-analytics
Repo
Framework

AttentionRNN: A Structured Spatial Attention Mechanism


Title	AttentionRNN: A Structured Spatial Attention Mechanism
Authors	Siddhesh Khandelwal, Leonid Sigal
Abstract	Visual attention mechanisms have proven to be integrally important constituent components of many modern deep neural architectures. They provide an efficient and effective way to utilize visual information selectively, which has been shown to be especially valuable in multi-modal learning tasks. However, all prior attention frameworks lack the ability to explicitly model structural dependencies among attention variables, making it difficult to predict consistent attention masks. In this paper we develop a novel structured spatial attention mechanism which is end-to-end trainable and can be integrated with any feed-forward convolutional neural network. This proposed AttentionRNN layer explicitly enforces structure over the spatial attention variables by sequentially predicting attention values in the spatial mask in a bi-directional raster-scan and inverse raster-scan order. As a result, each attention value depends not only on local image or contextual information, but also on the previously predicted attention values. Our experiments show consistent quantitative and qualitative improvements on a variety of recognition tasks and datasets, including image categorization, question answering and image generation.
Tasks	Image Categorization, Image Generation, Question Answering
Published	2019-05-22
URL	https://arxiv.org/abs/1905.09400v1
PDF	https://arxiv.org/pdf/1905.09400v1.pdf
PWC	https://paperswithcode.com/paper/attentionrnn-a-structured-spatial-attention
Repo
Framework

Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos


Title	Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos
Authors	Gnana Praveen R, Eric Granger, Patrick Cardinal
Abstract	Automatic pain assessment has an important potential diagnostic value for populations that are incapable of articulating their pain experiences. As one of the dominating nonverbal channels for eliciting pain expression events, facial expressions have been widely investigated for estimating the pain intensity of individuals. However, using state-of-the-art deep learning (DL) models in real-world pain estimation applications poses several challenges related to the subjective variations of facial expressions, operational capture conditions, and lack of representative training videos with labels. Given the cost of annotating intensity levels for every video frame, we propose a weakly-supervised domain adaptation (WSDA) technique that allows for training 3D CNNs for spatio-temporal pain intensity estimation using weakly labeled videos, where labels are provided on a periodic basis. In particular, WSDA integrates multiple instance learning into an adversarial deep domain adaptation framework to train an Inflated 3D-CNN (I3D) model such that it can accurately estimate pain intensities in the target operational domain. The training process relies on a weak target loss, along with a domain loss and a source loss for domain adaptation of the I3D model. Experimental results obtained using labeled source domain RECOLA videos and weakly-labeled target domain UNBC-McMaster videos indicate that the proposed deep WSDA approach can achieve a significantly higher level of sequence (bag)-level and frame (instance)-level pain localization accuracy than related state-of-the-art approaches.
Tasks	Domain Adaptation, Multiple Instance Learning
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08173v2
PDF	https://arxiv.org/pdf/1910.08173v2.pdf
PWC	https://paperswithcode.com/paper/deep-weakly-supervised-domain-adaptation-for
Repo
Framework

HadaNets: Flexible Quantization Strategies for Neural Networks


Title	HadaNets: Flexible Quantization Strategies for Neural Networks
Authors	Yash Akhauri
Abstract	On-board processing elements on UAVs are currently inadequate for training and inference of Deep Neural Networks. This is largely due to the energy consumption of memory accesses in such a network. HadaNets introduce a flexible train-from-scratch tensor quantization scheme by pairing a full precision tensor to a binary tensor in the form of a Hadamard product. Unlike wider reduced precision neural network models, we preserve the train-time parameter count, thus out-performing XNOR-Nets without a train-time memory penalty. Such training routines could see great utility in semi-supervised online learning tasks. Our method also offers advantages in model compression, as we reduce the model size of ResNet-18 by 7.43 times with respect to a full precision model without utilizing any other compression techniques. We also demonstrate a ‘Hadamard Binary Matrix Multiply’ kernel, which delivers a 10-fold increase in performance over full precision matrix multiplication with a similarly optimized kernel.
Tasks	Model Compression, Quantization
Published	2019-05-26
URL	https://arxiv.org/abs/1905.10759v1
PDF	https://arxiv.org/pdf/1905.10759v1.pdf
PWC	https://paperswithcode.com/paper/hadanets-flexible-quantization-strategies-for
Repo
Framework

Dynamic Graph Embedding via LSTM History Tracking


Title	Dynamic Graph Embedding via LSTM History Tracking
Authors	Shima Khoshraftar, Sedigheh Mahdavi, Aijun An, Yonggang Hu, Junfeng Liu
Abstract	Many real world networks are very large and constantly change over time. These dynamic networks exist in various domains such as social networks, traffic networks and biological interactions. To handle large dynamic networks in downstream applications such as link prediction and anomaly detection, it is essential for such networks to be transferred into a low dimensional space. Recently, network embedding, a technique that converts a large graph into a low-dimensional representation, has become increasingly popular due to its strength in preserving the structure of a network. Efficient dynamic network embedding, however, has not yet been fully explored. In this paper, we present a dynamic network embedding method that integrates the history of nodes over time into the current state of nodes. The key contributions of our work are: 1) generating dynamic network embeddings by combining both dynamic and static node information; 2) tracking the history of neighbors of nodes using LSTM; and 3) significantly decreasing time and memory use by training an autoencoder LSTM model using temporal walks rather than adjacency matrices of graphs, which are the common practice. We evaluate our method in multiple applications such as anomaly detection, link prediction and node classification in datasets from various domains.
Tasks	Anomaly Detection, Graph Embedding, Link Prediction, Network Embedding, Node Classification
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01551v1
PDF	https://arxiv.org/pdf/1911.01551v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-graph-embedding-via-lstm-history
Repo
Framework
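
The abstract above trains on temporal walks instead of adjacency matrices. The sketch below illustrates one common notion of a temporal walk, assuming directed edges given as `(source, target, timestamp)` triples and requiring timestamps along the walk to be non-decreasing. The abstract does not give the paper's exact walk definition or sampling strategy, so this rule and the function name are assumptions.

```python
import random
from collections import defaultdict


def temporal_walk(edges, start, length, seed=0):
    """Sample a walk whose edge timestamps never decrease.

    edges: iterable of (source, target, timestamp) triples (assumed directed).
    Returns the list of visited nodes, which may be shorter than `length`
    if no time-respecting edge is available.
    """
    rng = random.Random(seed)
    outgoing = defaultdict(list)
    for u, v, t in edges:
        outgoing[u].append((v, t))

    walk, node, now = [start], start, float("-inf")
    while len(walk) < length:
        # Only edges that occur at or after the current time are allowed.
        candidates = [(v, t) for v, t in outgoing[node] if t >= now]
        if not candidates:
            break
        node, now = rng.choice(candidates)
        walk.append(node)
    return walk
```

A walk is a short node sequence, so a sequence model such as an LSTM autoencoder can consume many of them without materialising a full adjacency matrix per snapshot, which is the memory saving the abstract points to.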

Mining urban lifestyles: urban computing, human behavior and recommender systems


Title	Mining urban lifestyles: urban computing, human behavior and recommender systems
Authors	Sharon Xu, Riccardo Di Clemente, Marta C. González
Abstract	In the last decade, the digital age has sharply redefined the way we study human behavior. With the advancement of data storage and sensing technologies, electronic records now encompass a diverse spectrum of human activity, ranging from location data, phone and email communication to Twitter activity and open-source contributions on Wikipedia and OpenStreetMap. In particular, the study of the shopping and mobility patterns of individual consumers has the potential to give deeper insight into the lifestyles and infrastructure of the region. Credit card records (CCRs) provide detailed insight into purchase behavior and have been found to have inherent regularity in consumer shopping patterns; call detail records (CDRs) present new opportunities to understand human mobility, analyze wealth, and model social network dynamics. In this chapter, we jointly model the lifestyles of individuals, a more challenging problem with higher variability when compared to the aggregated behavior of city regions. Using collective matrix factorization, we propose a unified dual view of lifestyles. Understanding these lifestyles will not only inform commercial opportunities, but also help policymakers and nonprofit organizations understand the characteristics and needs of the entire region, as well as of the individuals within that region. The applications of this range from targeted advertisements and promotions to the diffusion of digital financial services among low-income groups.
Tasks	Recommendation Systems
Published	2019-11-04
URL	https://arxiv.org/abs/1911.05464v1
PDF	https://arxiv.org/pdf/1911.05464v1.pdf
PWC	https://paperswithcode.com/paper/mining-urban-lifestyles-urban-computing-human
Repo
Framework