Paper Group ANR 1716
Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression
Title | Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression |
Authors | Bidur Khanal, Lavsen Dahal, Prashant Adhikari, Bishesh Khanal |
Abstract | Correct evaluation and treatment of scoliosis require accurate estimation of spinal curvature. The current gold standard is to manually estimate Cobb angles in spinal X-ray images, which is time-consuming and has high inter-rater variability. We propose an automatic method with a novel framework that first detects vertebrae as objects, followed by a landmark detector that estimates the 4 landmark corners of each vertebra separately. Cobb angles are calculated using the slope of each vertebra obtained from the predicted landmarks. For inference on test data, we perform pre- and post-processing that includes cropping, outlier rejection and smoothing of the predicted landmarks. The results were assessed in the AASCE MICCAI challenge 2019 and showed promise, with a SMAPE score of 25.69 on the challenge test set. |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14202v1 |
https://arxiv.org/pdf/1910.14202v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-cobb-angle-detection-using-vertebra |
Repo | |
Framework | |
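The abstract above says Cobb angles are computed from the slope of each vertebra, which in turn comes from its four predicted corners. A minimal sketch of that last step follows. The corner ordering, the use of the left/right edge mid-points to define a vertebra's tilt, and taking the largest tilt difference as the Cobb angle are assumptions made for illustration; the function names are hypothetical and this is not the authors' pipeline.

```python
import math

def vertebra_tilt_deg(corners):
    """Tilt of one vertebra in degrees.

    corners: (top_left, top_right, bottom_left, bottom_right), each an (x, y)
    pixel coordinate. This ordering is an assumption of the sketch.
    """
    tl, tr, bl, br = corners
    # Mid-points of the left and right edges define the vertebra's mid-line.
    left = ((tl[0] + bl[0]) / 2, (tl[1] + bl[1]) / 2)
    right = ((tr[0] + br[0]) / 2, (tr[1] + br[1]) / 2)
    return math.degrees(math.atan2(right[1] - left[1], right[0] - left[0]))

def cobb_angle_deg(vertebrae):
    """Largest difference in tilt between any two vertebrae (assumed definition)."""
    tilts = [vertebra_tilt_deg(v) for v in vertebrae]
    return max(tilts) - min(tilts)

# Three toy vertebrae: one level, one tilted one way, one tilted the other way.
spine = [
    ((0, 0), (2, 0), (0, 1), (2, 1)),
    ((0, 0), (2, 2), (0, 1), (2, 3)),
    ((0, 2), (2, 0), (0, 3), (2, 1)),
]
print(cobb_angle_deg(spine))
```

In practice the predicted landmarks would first go through the outlier rejection and smoothing mentioned in the abstract; the sketch assumes clean corners.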
Nearest-Neighbour-Induced Isolation Similarity and its Impact on Density-Based Clustering
Title | Nearest-Neighbour-Induced Isolation Similarity and its Impact on Density-Based Clustering |
Authors | Xiaoyu Qin, Kai Ming Ting, Ye Zhu, Vincent CS Lee |
Abstract | A recently proposed data-dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity, and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly improved to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of cluster called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters, when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected, while density-connected clusters cannot. |
Tasks | |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00378v1 |
https://arxiv.org/pdf/1907.00378v1.pdf | |
PWC | https://paperswithcode.com/paper/nearest-neighbour-induced-isolation |
Repo | |
Framework | |
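The abstract above does not spell out the nearest-neighbour construction. One plausible reading, sketched here as an assumption, is: repeatedly sample a small set of points, partition the data by nearest sampled point, and score two points by how often they land in the same cell. The names `isolation_similarity`, `psi` (sample size) and `t` (number of partitionings) are illustrative.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def isolation_similarity(points, psi=2, t=200, seed=0):
    """Fraction of t random nearest-neighbour partitionings in which
    each pair of points shares a cell (assumed construction)."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(t):
        centres = rng.sample(points, psi)
        # Each point joins the cell of its nearest sampled centre.
        cells = [min(range(psi), key=lambda c: sq_dist(p, centres[c]))
                 for p in points]
        for i in range(n):
            for j in range(n):
                if cells[i] == cells[j]:
                    counts[i][j] += 1
    return [[c / t for c in row] for row in counts]

# Two small, well-separated groups of points.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
sim = isolation_similarity(data)
```

To use such a similarity in a distance-based procedure like DBSCAN, `1 - sim[i][j]` can serve as the dissimilarity, which matches the abstract's description of swapping the distance measure and leaving the rest unchanged.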
Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion
Title | Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion |
Authors | Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille |
Abstract | Deep convolutional neural networks (DCNNs) are powerful models that yield impressive results at object classification. However, recent work has shown that they do not generalize well to partially occluded objects and to mask attacks. In contrast to DCNNs, compositional models are robust to partial occlusion; however, they are not as discriminative as deep models. In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks. Our model is learned in two steps. First, a standard DCNN is trained for image classification. Subsequently, we cluster the DCNN features into dictionaries. We show that the dictionary components resemble object part detectors and learn the spatial distribution of parts for each object class. We propose mixtures of compositional models to account for large changes in the spatial activation patterns (e.g. due to changes in the 3D pose of an object). At runtime, an image is first classified by the DCNN in a feedforward manner. The prediction uncertainty is used to detect partially occluded objects, which in turn are classified by the compositional model. Our experimental results demonstrate that combining compositional models and DCNNs resolves a fundamental problem of current deep learning approaches to computer vision: the combined model recognizes occluded objects, even when it has not been exposed to occluded objects during training, while at the same time maintaining high discriminative performance for non-occluded objects. |
Tasks | Image Classification, Object Classification |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11826v4 |
https://arxiv.org/pdf/1905.11826v4.pdf | |
PWC | https://paperswithcode.com/paper/compositional-convolutional-networks-for |
Repo | |
Framework | |
DR$\vert$GRADUATE: uncertainty-aware deep learning-based diabetic retinopathy grading in eye fundus images
Title | DR$\vert$GRADUATE: uncertainty-aware deep learning-based diabetic retinopathy grading in eye fundus images |
Authors | Teresa Araújo, Guilherme Aresta, Luís Mendonça, Susana Penas, Carolina Maia, Ângela Carneiro, Ana Maria Mendonça, Aurélio Campilho |
Abstract | Diabetic retinopathy (DR) grading is crucial in determining the patients’ adequate treatment and follow-up, but the screening process can be tiresome and prone to errors. Deep learning approaches have shown promising performance as computer-aided diagnosis (CAD) systems, but their black-box behaviour hinders clinical application. We propose DR$\vert$GRADUATE, a novel deep learning-based DR grading CAD system that supports its decision by providing a medically interpretable explanation and an estimation of how uncertain that prediction is, allowing the ophthalmologist to measure how much that decision should be trusted. We designed DR$\vert$GRADUATE taking into account the ordinal nature of the DR grading problem. A novel Gaussian-sampling approach built upon a Multiple Instance Learning framework allows DR$\vert$GRADUATE to infer an image grade associated with an explanation map and a prediction uncertainty while being trained only with image-wise labels. DR$\vert$GRADUATE was trained on the Kaggle training set and evaluated across multiple datasets. In DR grading, a quadratic-weighted Cohen’s kappa (QWK) between 0.71 and 0.84 was achieved in five different datasets. We show that high QWK values occur for images with low prediction uncertainty, thus indicating that this uncertainty is a valid measure of the predictions’ quality. Further, bad quality images are generally associated with higher uncertainties, showing that images not suitable for diagnosis indeed lead to less trustworthy predictions. Additionally, tests on unfamiliar medical image data types suggest that DR$\vert$GRADUATE allows outlier detection. The attention maps generally highlight regions of interest for diagnosis. These results show the great potential of DR$\vert$GRADUATE as a second-opinion system in DR severity grading. |
Tasks | Multiple Instance Learning, Outlier Detection |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11777v1 |
https://arxiv.org/pdf/1910.11777v1.pdf | |
PWC | https://paperswithcode.com/paper/drvertgraduate-uncertainty-aware-deep |
Repo | |
Framework | |
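The abstract above reports results as quadratic-weighted Cohen's kappa (QWK), a metric suited to ordinal grades because it penalizes a disagreement by the squared distance between the two grades. A minimal sketch of the standard formula follows; the function name and the toy labels are illustrative and not taken from the paper.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """QWK = 1 - sum(w * O) / sum(w * E), with w[i][j] = (i - j)^2 / (k - 1)^2."""
    k = n_classes
    n = len(y_true)
    # Observed confusion matrix O.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(y_true, y_pred):
        observed[a][b] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = 0.0
    den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / n  # chance agreement E
            num += weight * observed[i][j]
            den += weight * expected
    return 1.0 - num / den

# Perfect agreement across five ordinal grades.
print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 5))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so the reported range of 0.71 to 0.84 sits well above chance. The sketch assumes the labels are not all in a single class, since that would make the denominator zero.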
Adaptive Regularization via Residual Smoothing in Deep Learning Optimization
Title | Adaptive Regularization via Residual Smoothing in Deep Learning Optimization |
Authors | Junghee Cho, Junseok Kwon, Byung-Woo Hong |
Abstract | We present an adaptive regularization algorithm that can be effectively applied to the optimization problem in a deep learning framework. Our regularization algorithm aims to take into account the fitness of the data to the current state of the model in the determination of regularity to achieve better generalization. The degree of regularization at each element in the target space of the neural network architecture is determined based on the residual at each optimization iteration in an adaptive way. Our adaptive regularization algorithm is designed to apply a diffusion process driven by the heat equation with spatially varying diffusivity depending on the probability density function following a certain distribution of the residual. Our data-driven regularity is imposed by adaptively smoothing a simplified objective function in which the explicit regularization term is omitted, alternating between the evaluation of the residual and the determination of the degree of its regularity. The effectiveness of our algorithm is empirically demonstrated by numerical experiments on image classification problems, indicating that our algorithm outperforms other commonly used optimization algorithms in terms of generalization using popular deep learning models and benchmark datasets. |
Tasks | Image Classification |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09750v2 |
https://arxiv.org/pdf/1907.09750v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-regularization-via-residual |
Repo | |
Framework | |
Automatic segmentation and determining radiodensity of the liver in a large-scale CT database
Title | Automatic segmentation and determining radiodensity of the liver in a large-scale CT database |
Authors | N. S. Kulberg, A. B. Elizarov, V. P. Novik, V. A. Gombolevsky, A. P. Gonchar, A. L. Alliua, V. Yu. Bosin, A. V. Vladzymyrsky, S. P. Morozov |
Abstract | This study proposes an automatic technique for liver segmentation in computed tomography (CT) images. Localization of the liver volume is based on the correlation with an optimized set of liver templates developed by the authors that allows clear geometric interpretation. Radiodensity values are calculated based on the boundaries of the segmented liver, which allows identifying liver abnormalities. The performance of the technique was evaluated on 700 CT images from the dataset of the Unified Radiological Information System (URIS) of Moscow. Despite a decrease in accuracy, the technique is applicable to CT volumes with a partially visible region of the liver. The technique can be used to process CT images obtained in various patient positions and over a wide range of exposure parameters. It is capable of dealing with low-dose CT scans in a real large-scale medical database with over 1 million studies. |
Tasks | Computed Tomography (CT), Liver Segmentation |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/1912.13290v1 |
https://arxiv.org/pdf/1912.13290v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-segmentation-and-determining |
Repo | |
Framework | |
Semi-Supervised Histology Classification using Deep Multiple Instance Learning and Contrastive Predictive Coding
Title | Semi-Supervised Histology Classification using Deep Multiple Instance Learning and Contrastive Predictive Coding |
Authors | Ming Y. Lu, Richard J. Chen, Jingwen Wang, Debora Dillon, Faisal Mahmood |
Abstract | Convolutional neural networks can be trained to perform histology slide classification using weak annotations with multiple instance learning (MIL). However, given the paucity of labeled histology data, direct application of MIL can easily suffer from overfitting and the network is unable to learn rich feature representations due to the weak supervisory signal. We propose to overcome such limitations with a two-stage semi-supervised approach that combines the power of data-efficient self-supervised feature learning via contrastive predictive coding (CPC) and the interpretability and flexibility of regularized attention-based MIL. We apply our two-stage CPC + MIL semi-supervised pipeline to the binary classification of breast cancer histology images. Across five random splits, we report state-of-the-art performance with a mean validation accuracy of 95% and an area under the ROC curve of 0.968. We further evaluate the quality of features learned via CPC relative to simple transfer learning and show that strong classification performance using CPC features can be efficiently leveraged under the MIL framework even with the feature encoder frozen. |
Tasks | Classification Of Breast Cancer Histology Images, Multiple Instance Learning, Transfer Learning |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10825v3 |
https://arxiv.org/pdf/1910.10825v3.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-histology-classification |
Repo | |
Framework | |
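The abstract above relies on attention-based multiple instance learning, where a slide (the bag) is represented by a weighted average of its patch features (the instances). A minimal sketch of a common attention pooling form follows, assuming scores of the form w · tanh(V h) passed through a softmax. The parameter names `V` and `w` are illustrative, and the paper's regularization and CPC encoder are not shown.

```python
import math

def attention_mil_pool(instances, V, w):
    """Pool instance feature vectors into one bag vector.

    instances: list of feature vectors h_k
    V: attention projection matrix (list of rows), w: attention vector
    Returns (bag_vector, attention_weights).
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(a * z for a, z in zip(w, hidden)))
    # Numerically stable softmax over the instance scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights

patches = [[1.0, 0.0], [3.0, 2.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
bag, weights = attention_mil_pool(patches, identity, [1.0, 0.0])
```

The attention weights sum to one, so they can be read as the relative contribution of each patch to the slide-level prediction, which is the source of the interpretability the abstract mentions.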
DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM
Title | DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM |
Authors | Ryo Hachiuma, Christian Pirchheim, Dieter Schmalstieg, Hideo Saito |
Abstract | We present DetectFusion, an RGB-D SLAM system that runs in real-time and can robustly handle semantically known and unknown objects that can move dynamically in the scene. Our system detects, segments and assigns semantic class labels to known objects in the scene, while tracking and reconstructing them even when they move independently in front of the monocular camera. In contrast to related work, we achieve real-time computational performance on semantic instance segmentation with a novel method combining 2D object detection and 3D geometric segmentation. In addition, we propose a method for detecting and segmenting the motion of semantically unknown objects, thus further improving the accuracy of camera tracking and map reconstruction. We show that our method performs on par with or better than previous work in terms of localization and object reconstruction accuracy, while achieving about 20 FPS even if the objects are segmented in each frame. |
Tasks | Instance Segmentation, Object Detection, Object Reconstruction, Semantic Segmentation |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09127v1 |
https://arxiv.org/pdf/1907.09127v1.pdf | |
PWC | https://paperswithcode.com/paper/detectfusion-detecting-and-segmenting-both |
Repo | |
Framework | |
A Deep Neural Network for Finger Counting and Numerosity Estimation
Title | A Deep Neural Network for Finger Counting and Numerosity Estimation |
Authors | Leszek Pecyna, Angelo Cangelosi, Alessandro Di Nuovo |
Abstract | In this paper, we present neuro-robotics models with a deep artificial neural network capable of generating finger counting positions and number estimation. We first train the model in an unsupervised manner where each layer is treated as a Restricted Boltzmann Machine or an autoencoder. Such a model is further trained in a supervised way. This type of pre-training is tested on our baseline model and two methods of pre-training are compared. The network is extended to produce finger counting positions. The performance in number estimation of such an extended model is evaluated. We test the hypothesis that the subitizing process can be obtained by a single model that is also used for estimation of higher numerosities. The results confirm the importance of unsupervised training in our enumeration task and show some similarities to human behaviour in the case of subitizing. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.05270v2 |
https://arxiv.org/pdf/1907.05270v2.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-neural-network-for-finger-counting-and |
Repo | |
Framework | |
Adaptive Learning of Aggregate Analytics under Dynamic Workloads
Title | Adaptive Learning of Aggregate Analytics under Dynamic Workloads |
Authors | Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou |
Abstract | Large organizations have seamlessly incorporated data-driven decision making in their operations. However, as data volumes increase, expensive big data infrastructures are called to the rescue. In this setting, analytics tasks become very costly in terms of query response time, resource consumption, and money in cloud deployments, especially when base data are stored across geographically distributed data centers. Therefore, we introduce an adaptive Machine Learning mechanism which is light-weight, stored client-side, can estimate the answers of a variety of aggregate queries and can avoid the big data backend. The estimations are performed in milliseconds and are inexpensive and accurate, as the mechanism learns from past analytical-query patterns. However, as analytic queries are ad-hoc and analysts’ interests change over time, we develop solutions that can swiftly and accurately detect such changes and adapt to new query patterns. The capabilities of our approach are demonstrated using extensive evaluation with real and synthetic datasets. |
Tasks | Decision Making |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04772v2 |
https://arxiv.org/pdf/1908.04772v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-learning-of-aggregate-analytics |
Repo | |
Framework | |
AttentionRNN: A Structured Spatial Attention Mechanism
Title | AttentionRNN: A Structured Spatial Attention Mechanism |
Authors | Siddhesh Khandelwal, Leonid Sigal |
Abstract | Visual attention mechanisms have proven to be important components of many modern deep neural architectures. They provide an efficient and effective way to utilize visual information selectively, which has been shown to be especially valuable in multi-modal learning tasks. However, all prior attention frameworks lack the ability to explicitly model structural dependencies among attention variables, making it difficult to predict consistent attention masks. In this paper we develop a novel structured spatial attention mechanism which is end-to-end trainable and can be integrated with any feed-forward convolutional neural network. This proposed AttentionRNN layer explicitly enforces structure over the spatial attention variables by sequentially predicting attention values in the spatial mask in a bi-directional raster-scan and inverse raster-scan order. As a result, each attention value depends not only on local image or contextual information, but also on the previously predicted attention values. Our experiments show consistent quantitative and qualitative improvements on a variety of recognition tasks and datasets, including image categorization, question answering and image generation. |
Tasks | Image Categorization, Image Generation, Question Answering |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.09400v1 |
https://arxiv.org/pdf/1905.09400v1.pdf | |
PWC | https://paperswithcode.com/paper/attentionrnn-a-structured-spatial-attention |
Repo | |
Framework | |
Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos
Title | Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos |
Authors | Gnana Praveen R, Eric Granger, Patrick Cardinal |
Abstract | Automatic pain assessment has an important potential diagnostic value for populations that are incapable of articulating their pain experiences. As one of the dominating nonverbal channels for eliciting pain expression events, facial expressions have been widely investigated for estimating the pain intensity of individuals. However, using state-of-the-art deep learning (DL) models in real-world pain estimation applications poses several challenges related to the subjective variations of facial expressions, operational capture conditions, and lack of representative training videos with labels. Given the cost of annotating intensity levels for every video frame, we propose a weakly-supervised domain adaptation (WSDA) technique that allows for training 3D CNNs for spatio-temporal pain intensity estimation using weakly labeled videos, where labels are provided on a periodic basis. In particular, WSDA integrates multiple instance learning into an adversarial deep domain adaptation framework to train an Inflated 3D-CNN (I3D) model such that it can accurately estimate pain intensities in the target operational domain. The training process relies on weak target loss, along with domain loss and source loss for domain adaptation of the I3D model. Experimental results obtained using labeled source domain RECOLA videos and weakly-labeled target domain UNBC-McMaster videos indicate that the proposed deep WSDA approach can achieve a significantly higher level of sequence (bag)-level and frame (instance)-level pain localization accuracy than related state-of-the-art approaches. |
Tasks | Domain Adaptation, Multiple Instance Learning |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08173v2 |
https://arxiv.org/pdf/1910.08173v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-weakly-supervised-domain-adaptation-for |
Repo | |
Framework | |
HadaNets: Flexible Quantization Strategies for Neural Networks
Title | HadaNets: Flexible Quantization Strategies for Neural Networks |
Authors | Yash Akhauri |
Abstract | On-board processing elements on UAVs are currently inadequate for training and inference of Deep Neural Networks. This is largely due to the energy consumption of memory accesses in such a network. HadaNets introduce a flexible train-from-scratch tensor quantization scheme by pairing a full precision tensor to a binary tensor in the form of a Hadamard product. Unlike wider reduced precision neural network models, we preserve the train-time parameter count, thus out-performing XNOR-Nets without a train-time memory penalty. Such training routines could see great utility in semi-supervised online learning tasks. Our method also offers advantages in model compression, as we reduce the model size of ResNet-18 by 7.43 times with respect to a full precision model without utilizing any other compression techniques. We also demonstrate a ‘Hadamard Binary Matrix Multiply’ kernel, which delivers a 10-fold increase in performance over full precision matrix multiplication with a similarly optimized kernel. |
Tasks | Model Compression, Quantization |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10759v1 |
https://arxiv.org/pdf/1905.10759v1.pdf | |
PWC | https://paperswithcode.com/paper/hadanets-flexible-quantization-strategies-for |
Repo | |
Framework | |
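The abstract above describes approximating a tensor by the Hadamard (element-wise) product of a binary tensor and a full-precision tensor, but it does not say how the full-precision part is shaped or learned. The sketch below is one assumed reading: signs form the binary tensor, and a block-wise mean absolute value, borrowed from XNOR-Net-style scaling, forms the full-precision tensor. The function name and the `block` parameter are hypothetical.

```python
def hadamard_quantize(weights, block):
    """Approximate `weights` as signs * scales (element-wise).

    signs:  binary tensor with entries in {-1, +1}
    scales: full-precision tensor, constant within each block of `block`
            entries (block-wise mean |w| is an assumption of this sketch)
    """
    signs = [1.0 if x >= 0 else -1.0 for x in weights]
    scales = []
    for start in range(0, len(weights), block):
        chunk = weights[start:start + block]
        mean_abs = sum(abs(x) for x in chunk) / len(chunk)
        scales.extend([mean_abs] * len(chunk))
    approx = [s * a for s, a in zip(signs, scales)]
    return signs, scales, approx

w = [0.5, -1.5, 2.0, -1.0]
for block in (1, 2, 4):
    print(block, hadamard_quantize(w, block)[2])
```

Under this reading the block size is the flexibility knob: a block of one element reproduces the weights exactly, while one block spanning the whole tensor collapses to a single scaling factor, and intermediate sizes trade storage for fidelity.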
Dynamic Graph Embedding via LSTM History Tracking
Title | Dynamic Graph Embedding via LSTM History Tracking |
Authors | Shima Khoshraftar, Sedigheh Mahdavi, Aijun An, Yonggang Hu, Junfeng Liu |
Abstract | Many real-world networks are very large and constantly change over time. These dynamic networks exist in various domains such as social networks, traffic networks and biological interactions. To handle large dynamic networks in downstream applications such as link prediction and anomaly detection, it is essential for such networks to be transferred into a low-dimensional space. Recently, network embedding, a technique that converts a large graph into a low-dimensional representation, has become increasingly popular due to its strength in preserving the structure of a network. Efficient dynamic network embedding, however, has not yet been fully explored. In this paper, we present a dynamic network embedding method that integrates the history of nodes over time into the current state of nodes. The key contributions of our work are: 1) generating dynamic network embeddings by combining both dynamic and static node information; 2) tracking the history of the neighbors of nodes using LSTM; and 3) significantly decreasing time and memory usage by training an autoencoder LSTM model using temporal walks rather than the adjacency matrices of graphs, which are the common practice. We evaluate our method in multiple applications such as anomaly detection, link prediction and node classification in datasets from various domains. |
Tasks | Anomaly Detection, Graph Embedding, Link Prediction, Network Embedding, Node Classification |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01551v1 |
https://arxiv.org/pdf/1911.01551v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-graph-embedding-via-lstm-history |
Repo | |
Framework | |
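The abstract above trains on temporal walks instead of adjacency matrices. A temporal walk is usually understood as a path whose edge timestamps never go backwards, so it only follows interactions in the order they happened. The sketch below samples one such walk from a timestamped edge list; treating edges as directed and choosing the next edge uniformly at random are assumptions, and the function name is illustrative.

```python
import random
from collections import defaultdict

def temporal_walk(edges, start, length, seed=0):
    """Sample a walk of up to `length` steps with non-decreasing timestamps.

    edges: iterable of (source, target, timestamp), treated as directed.
    Returns a list of (node, timestamp_of_arrival); the start node has None.
    """
    rng = random.Random(seed)
    outgoing = defaultdict(list)
    for u, v, t in edges:
        outgoing[u].append((v, t))
    walk = [(start, None)]
    node, now = start, float("-inf")
    for _ in range(length):
        # Only edges that occur at or after the current time are allowed.
        candidates = [(v, t) for v, t in outgoing[node] if t >= now]
        if not candidates:
            break
        node, now = rng.choice(candidates)
        walk.append((node, now))
    return walk

edges = [("a", "b", 1), ("b", "a", 0), ("b", "c", 2), ("c", "d", 3)]
print(temporal_walk(edges, "a", 5))
```

A set of such walks is a sequence dataset that grows with the number of sampled walks rather than with the square of the number of nodes, which is consistent with the time and memory savings the abstract claims over adjacency matrices.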
Mining urban lifestyles: urban computing, human behavior and recommender systems
Title | Mining urban lifestyles: urban computing, human behavior and recommender systems |
Authors | Sharon Xu, Riccardo Di Clemente, Marta C. González |
Abstract | In the last decade, the digital age has sharply redefined the way we study human behavior. With the advancement of data storage and sensing technologies, electronic records now encompass a diverse spectrum of human activity, ranging from location data, phone and email communication to Twitter activity and open-source contributions on Wikipedia and OpenStreetMap. In particular, the study of the shopping and mobility patterns of individual consumers has the potential to give deeper insight into the lifestyles and infrastructure of the region. Credit card records (CCRs) provide detailed insight into purchase behavior and have been found to have inherent regularity in consumer shopping patterns; call detail records (CDRs) present new opportunities to understand human mobility, analyze wealth, and model social network dynamics. In this chapter, we jointly model the lifestyles of individuals, a more challenging problem with higher variability when compared to the aggregated behavior of city regions. Using collective matrix factorization, we propose a unified dual view of lifestyles. Understanding these lifestyles will not only inform commercial opportunities, but also help policymakers and nonprofit organizations understand the characteristics and needs of the entire region, as well as of the individuals within that region. The applications of this range from targeted advertisements and promotions to the diffusion of digital financial services among low-income groups. |
Tasks | Recommendation Systems |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.05464v1 |
https://arxiv.org/pdf/1911.05464v1.pdf | |
PWC | https://paperswithcode.com/paper/mining-urban-lifestyles-urban-computing-human |
Repo | |
Framework | |