April 1, 2020

Paper Group ANR 393

Bi-directional Dermoscopic Feature Learning and Multi-scale Consistent Decision Fusion for Skin Lesion Segmentation. Stochastic Regret Minimization in Extensive-Form Games. Softmax Splatting for Video Frame Interpolation. Automatic lesion segmentation and Pathological Myopia classification in fundus images. Weakly Supervised Lesion Co-segmentation …

Bi-directional Dermoscopic Feature Learning and Multi-scale Consistent Decision Fusion for Skin Lesion Segmentation


Title	Bi-directional Dermoscopic Feature Learning and Multi-scale Consistent Decision Fusion for Skin Lesion Segmentation
Authors	Xiaohong Wang, Xudong Jiang, Henghui Ding, Jun Liu
Abstract	Accurate segmentation of skin lesion from dermoscopic images is a crucial part of computer-aided diagnosis of melanoma. It is challenging due to the fact that dermoscopic images from different patients have non-negligible lesion variation, which causes difficulties in anatomical structure learning and consistent skin lesion delineation. In this paper, we propose a novel bi-directional dermoscopic feature learning (biDFL) framework to model the complex correlation between skin lesions and their informative context. By controlling feature information passing through two complementary directions, a substantially rich and discriminative feature representation is achieved. Specifically, we place biDFL module on the top of a CNN network to enhance high-level parsing performance. Furthermore, we propose a multi-scale consistent decision fusion (mCDF) that is capable of selectively focusing on the informative decisions generated from multiple classification layers. By analysis of the consistency of the decision at each position, mCDF automatically adjusts the reliability of decisions and thus allows a more insightful skin lesion delineation. The comprehensive experimental results show the effectiveness of the proposed method on skin lesion segmentation, achieving state-of-the-art performance consistently on two publicly available dermoscopic image databases.
Tasks	Lesion Segmentation
Published	2020-02-20
URL	https://arxiv.org/abs/2002.08694v1
PDF	https://arxiv.org/pdf/2002.08694v1.pdf
PWC	https://paperswithcode.com/paper/bi-directional-dermoscopic-feature-learning
Repo
Framework

Stochastic Regret Minimization in Extensive-Form Games


Title	Stochastic Regret Minimization in Extensive-Form Games
Authors	Gabriele Farina, Christian Kroer, Tuomas Sandholm
Abstract	Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical results on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on three games, where some variants of our methods outperform MCCFR.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08493v1
PDF	https://arxiv.org/pdf/2002.08493v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-regret-minimization-in-extensive
Repo
Framework
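
The coupling described in the abstract above (a regret minimizer fed by a sampled gradient estimate) can be illustrated on a toy game. The following is a minimal sketch, not the authors' method: it assumes regret matching as the regret minimizer, a one-sample unbiased utility estimate as the gradient estimator, and rock-paper-scissors in normal form rather than an extensive-form game. The names `RegretMatching`, `sampled_utility`, and `self_play_rps` are hypothetical.

```python
import random


class RegretMatching:
    """Regret matching over a fixed set of actions (illustrative only)."""

    def __init__(self, n_actions):
        self.cum_regret = [0.0] * n_actions

    def next_strategy(self):
        # Play proportionally to positive cumulative regret; uniform if none.
        positive = [max(r, 0.0) for r in self.cum_regret]
        total = sum(positive)
        n = len(positive)
        if total <= 0.0:
            return [1.0 / n] * n
        return [p / total for p in positive]

    def observe(self, strategy, utility):
        # Regret of action a: its utility minus the utility of the played mix.
        expected = sum(s * u for s, u in zip(strategy, utility))
        for a, u in enumerate(utility):
            self.cum_regret[a] += u - expected


def sampled_utility(payoff, opponent_strategy, rng):
    # Sample one opponent action j; column j of the payoff matrix is an
    # unbiased estimate of the expected utility vector against the opponent.
    actions = range(len(opponent_strategy))
    j = rng.choices(actions, weights=opponent_strategy)[0]
    return [row[j] for row in payoff]


def self_play_rps(iterations, seed=0):
    # payoff[a][b]: payoff to a player choosing a against an opponent choosing b
    # (0 = rock, 1 = paper, 2 = scissors). The game is symmetric, so both
    # players can use the same matrix.
    payoff = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    rng = random.Random(seed)
    players = [RegretMatching(3), RegretMatching(3)]
    avg = [0.0] * 3
    for _ in range(iterations):
        s0 = players[0].next_strategy()
        s1 = players[1].next_strategy()
        players[0].observe(s0, sampled_utility(payoff, s1, rng))
        players[1].observe(s1, sampled_utility(payoff, s0, rng))
        avg = [a + s / iterations for a, s in zip(avg, s0)]
    return avg  # average strategy of the first player
```

Swapping `RegretMatching` for another regret minimizer, or `sampled_utility` for another unbiased estimator, leaves the self-play loop unchanged, which is the modularity the abstract refers to.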

Softmax Splatting for Video Frame Interpolation


Title	Softmax Splatting for Video Frame Interpolation
Authors	Simon Niklaus, Feng Liu
Abstract	Differentiable image sampling in the form of backward warping has seen broad adoption in tasks like depth estimation and optical flow prediction. In contrast, how to perform forward warping has seen less attention, partly due to additional challenges such as resolving the conflict of mapping multiple pixels to the same target location in a differentiable way. We propose softmax splatting to address this paradigm shift and show its effectiveness on the application of frame interpolation. Specifically, given two input frames, we forward-warp the frames and their feature pyramid representations based on an optical flow estimate using softmax splatting. In doing so, the softmax splatting seamlessly handles cases where multiple source pixels map to the same target location. We then use a synthesis network to predict the interpolation result from the warped representations. Our softmax splatting allows us to not only interpolate frames at an arbitrary time but also to fine-tune the feature pyramid and the optical flow. We show that our synthesis approach, empowered by softmax splatting, achieves new state-of-the-art results for video frame interpolation.
Tasks	Depth Estimation, Optical Flow Estimation, Video Frame Interpolation
Published	2020-03-11
URL	https://arxiv.org/abs/2003.05534v1
PDF	https://arxiv.org/pdf/2003.05534v1.pdf
PWC	https://paperswithcode.com/paper/softmax-splatting-for-video-frame
Repo
Framework
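
To make the many-to-one conflict in the abstract above concrete, here is a minimal sketch of forward warping with softmax-weighted blending. It is a simplification under stated assumptions: a 1D signal instead of an image, nearest-neighbor targets instead of a smooth splatting kernel, and a hand-supplied `importance` score per source pixel. The function name `softmax_splat_1d` is hypothetical and this is not the authors' implementation.

```python
import math


def softmax_splat_1d(values, flow, importance):
    """Forward-warp a 1D signal; colliding pixels are blended by softmax weights."""
    n = len(values)
    numer = [0.0] * n  # sum of exp(importance) * value landing on each target
    denom = [0.0] * n  # sum of exp(importance) landing on each target
    for i in range(n):
        target = i + int(round(flow[i]))  # nearest-neighbor target (assumption)
        if 0 <= target < n:
            weight = math.exp(importance[i])
            numer[target] += weight * values[i]
            denom[target] += weight
    # Targets that receive no source pixel are left at 0.0 (holes).
    return [numer[t] / denom[t] if denom[t] > 0.0 else 0.0 for t in range(n)]
```

With equal importance, two pixels landing on the same target are averaged; raising one pixel's importance pulls the result toward that pixel, and every step is a differentiable function of the values and importance scores.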

Automatic lesion segmentation and Pathological Myopia classification in fundus images


Title	Automatic lesion segmentation and Pathological Myopia classification in fundus images
Authors	Cefas Rodrigues Freire, Julio Cesar da Costa Moura, Daniele Montenegro da Silva Barros, Ricardo Alexsandro de Medeiros Valentim
Abstract	In this paper we present algorithms to diagnose Pathological Myopia (PM) and detect retinal structures and lesions such as Optic Disc (OD), Fovea, Atrophy and Detachment. All these tasks were performed in fundus imaging from PM patients and they are requirements to participate in the Pathologic Myopia Challenge (PALM). The challenge was organized as a half day Challenge, a Satellite Event of The IEEE International Symposium on Biomedical Imaging in Venice, Italy. Our method applies different Deep Learning techniques for each task. Transfer learning is applied in all tasks using Xception as the baseline model. Also, some key ideas of YOLO architecture are used in the Optic Disc segmentation algorithm pipeline. We have evaluated our model’s performance according to the challenge rules in terms of AUC-ROC, F1-Score, Mean Dice Score and Mean Euclidean Distance. For initial activities our method has shown satisfactory results.
Tasks	Lesion Segmentation, Transfer Learning
Published	2020-02-15
URL	https://arxiv.org/abs/2002.06382v1
PDF	https://arxiv.org/pdf/2002.06382v1.pdf
PWC	https://paperswithcode.com/paper/automatic-lesion-segmentation-and
Repo
Framework

Weakly Supervised Lesion Co-segmentation on CT Scans


Title	Weakly Supervised Lesion Co-segmentation on CT Scans
Authors	Vatsal Agarwal, Youbao Tang, Jing Xiao, Ronald M. Summers
Abstract	Lesion segmentation in medical imaging serves as an effective tool for assessing tumor sizes and monitoring changes in growth. However, not only is manual lesion segmentation time-consuming, but it is also expensive and requires expert radiologist knowledge. Therefore many hospitals rely on a loose substitute called response evaluation criteria in solid tumors (RECIST). Although these annotations are far from precise, they are widely used throughout hospitals and are found in their picture archiving and communication systems (PACS). Therefore, these annotations have the potential to serve as a robust yet challenging means of weak supervision for training full lesion segmentation models. In this work, we propose a weakly-supervised co-segmentation model that first generates pseudo-masks from the RECIST slices and uses these as training labels for an attention-based convolutional neural network capable of segmenting common lesions from a pair of CT scans. To validate and test the model, we utilize the DeepLesion dataset, an extensive CT-scan lesion dataset that contains 32,735 PACS bookmarked images. Extensive experimental results demonstrate the efficacy of our co-segmentation approach for lesion segmentation with a mean Dice coefficient of 90.3%.
Tasks	Lesion Segmentation
Published	2020-01-24
URL	https://arxiv.org/abs/2001.09174v1
PDF	https://arxiv.org/pdf/2001.09174v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-lesion-co-segmentation-on
Repo
Framework
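
The abstract above reports its result as a mean Dice coefficient. As a reminder of what that metric measures, here is a minimal sketch for two binary masks given as flat lists of 0/1 values. The function name `dice_coefficient` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the source.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * intersection / size_sum
```

For example, a prediction covering two pixels of which one overlaps a one-pixel ground truth gives 2 * 1 / (2 + 1), roughly 0.67.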

Deep Learning Based Unsupervised and Semi-supervised Classification for Keratoconus


Title	Deep Learning Based Unsupervised and Semi-supervised Classification for Keratoconus
Authors	Nicole Hallett, Kai Yi, Josef Dick, Christopher Hodge, Gerard Sutton, Yu Guang Wang, Jingjing You
Abstract	The transparent cornea is the window of the eye, facilitating the entry of light rays and controlling the focusing and movement of light within the eye. The cornea is critical, contributing to 75% of the refractive power of the eye. Keratoconus is a progressive and multifactorial corneal degenerative disease affecting 1 in 2000 individuals worldwide. Currently, there is no cure for keratoconus other than corneal transplantation for advanced-stage keratoconus or corneal cross-linking, which can only halt KC progression. The ability to accurately identify subtle KC or KC progression is of vital clinical significance. To date, there has been little consensus on a useful model to classify KC patients, which therefore inhibits the ability to predict disease progression accurately. In this paper, we utilised machine learning to analyse data from 124 KC patients, including topographical and clinical variables. Both supervised multilayer perceptron and unsupervised variational autoencoder models were used to classify KC patients with reference to the existing Amsler-Krumeich (A-K) classification system. Both methods result in high accuracy, with the unsupervised method showing better performance. The result showed that the unsupervised method with a selection of 29 variables could be a powerful tool to provide an automatic classification tool for clinicians. These outcomes provide a platform for additional analysis for the progression and treatment of keratoconus.
Tasks
Published	2020-01-31
URL	https://arxiv.org/abs/2001.11653v1
PDF	https://arxiv.org/pdf/2001.11653v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-unsupervised-and-semi
Repo
Framework

Segmentation of Retinal Low-Cost Optical Coherence Tomography Images using Deep Learning


Title	Segmentation of Retinal Low-Cost Optical Coherence Tomography Images using Deep Learning
Authors	Timo Kepp, Helge Sudkamp, Claus von der Burchard, Hendrik Schenke, Peter Koch, Gereon Hüttmann, Johann Roider, Mattias P. Heinrich, Heinz Handels
Abstract	The treatment of age-related macular degeneration (AMD) requires continuous eye exams using optical coherence tomography (OCT). The need for treatment is determined by the presence or change of disease-specific OCT-based biomarkers. Therefore, the monitoring frequency has a significant influence on the success of AMD therapy. However, the monitoring frequency of current treatment schemes is not individually adapted to the patient and therefore often insufficient. While a higher monitoring frequency would have a positive effect on the success of treatment, in practice it can only be achieved with a home monitoring solution. One of the key requirements of a home monitoring OCT system is a computer-aided diagnosis to automatically detect and quantify pathological changes using specific OCT-based biomarkers. In this paper, for the first time, retinal scans of a novel self-examination low-cost full-field OCT (SELF-OCT) are segmented using a deep learning-based approach. A convolutional neural network (CNN) is utilized to segment the total retina as well as pigment epithelial detachments (PED). It is shown that the CNN-based approach can segment the retina with high accuracy, whereas the segmentation of the PED proves to be challenging. In addition, a convolutional denoising autoencoder (CDAE), which has previously learned retinal shape information, refines the CNN prediction. It is shown that the CDAE refinement can correct segmentation errors caused by artifacts in the OCT image.
Tasks	Denoising
Published	2020-01-23
URL	https://arxiv.org/abs/2001.08480v1
PDF	https://arxiv.org/pdf/2001.08480v1.pdf
PWC	https://paperswithcode.com/paper/segmentation-of-retinal-low-cost-optical
Repo
Framework

Learning Bounds for Moment-Based Domain Adaptation


Title	Learning Bounds for Moment-Based Domain Adaptation
Authors	Werner Zellinger, Bernhard A Moser, Susanne Saminger-Platz
Abstract	Domain adaptation algorithms are designed to minimize the misclassification risk of a discriminative model for a target domain with little training data by adapting a model from a source domain with a large amount of training data. Standard approaches measure the adaptation discrepancy based on distance measures between the empirical probability distributions in the source and target domain. In this setting, we address the problem of deriving learning bounds under practice-oriented general conditions on the underlying probability distributions. As a result, we obtain learning bounds for domain adaptation based on finitely many moments and smoothness conditions.
Tasks	Domain Adaptation
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08260v1
PDF	https://arxiv.org/pdf/2002.08260v1.pdf
PWC	https://paperswithcode.com/paper/learning-bounds-for-moment-based-domain
Repo
Framework
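
The abstract above concerns discrepancies between source and target distributions measured through finitely many moments. The following is a minimal sketch of one such quantity for 1D samples: the largest absolute difference among the first few raw sample moments. This particular distance, and the names `sample_moments` and `moment_distance`, are assumptions chosen for illustration; the paper's own discrepancy and conditions may differ.

```python
def sample_moments(xs, order):
    """Raw sample moments E[x^k] for k = 1..order."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    return [sum(x ** k for x in xs) / n for k in range(1, order + 1)]


def moment_distance(source, target, order):
    """Largest absolute gap between the first `order` raw sample moments."""
    ms = sample_moments(source, order)
    mt = sample_moments(target, order)
    return max(abs(a - b) for a, b in zip(ms, mt))
```

Identical samples give a distance of zero, while a shift in the mean or spread of the target domain shows up in the first or second moment respectively.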

Poly-time universality and limitations of deep learning


Title	Poly-time universality and limitations of deep learning
Authors	Emmanuel Abbe, Colin Sandon
Abstract	The goal of this paper is to characterize function distributions that deep learning can or cannot learn in poly-time. A universality result is proved for SGD-based deep learning and a non-universality result is proved for GD-based deep learning; this also gives a separation between SGD-based deep learning and statistical query algorithms: (1) Deep learning with SGD is efficiently universal. Any function distribution that can be learned from samples in poly-time can also be learned by a poly-size neural net trained with SGD on a poly-time initialization with poly-steps, poly-rate and possibly poly-noise. Therefore deep learning provides a universal learning paradigm: it was known that the approximation and estimation errors could be controlled with poly-size neural nets, using ERM that is NP-hard; this new result shows that the optimization error can also be controlled with SGD in poly-time. The picture changes for GD with large enough batches: (2) Result (1) does not hold for GD: Neural nets of poly-size trained with GD (full gradients or large enough batches) on any initialization with poly-steps, poly-range and at least poly-noise cannot learn any function distribution that has super-polynomial cross-predictability, where the cross-predictability gives a measure of “average” function correlation – relations and distinctions to the statistical dimension are discussed. In particular, GD with these constraints can learn efficiently monomials of degree $k$ if and only if $k$ is constant. Thus (1) and (2) point to an interesting contrast: SGD is universal even with some poly-noise while full GD or SQ algorithms are not (e.g., parities).
Tasks
Published	2020-01-07
URL	https://arxiv.org/abs/2001.02992v1
PDF	https://arxiv.org/pdf/2001.02992v1.pdf
PWC	https://paperswithcode.com/paper/poly-time-universality-and-limitations-of
Repo
Framework

ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search


Title	ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search
Authors	Xu Zhang, Chenjun Zhou, Bo Gu
Abstract	How to discover and evaluate the true strength of models quickly and accurately is one of the key challenges in Neural Architecture Search (NAS). To cope with this problem, we propose an Architecture-Driven Weight Prediction (ADWP) approach for neural architecture search (NAS). In our approach, we first design an architecture-intensive search space and then train a HyperNetwork by inputting stochastic encoding architecture parameters. In the trained HyperNetwork, weights of convolution kernels can be well predicted for neural architectures in the search space. Consequently, the target architectures can be evaluated efficiently without any finetuning, thus enabling us to search for the optimal architecture in the space of general networks (macro-search). Through real experiments, we evaluate the performance of the models discovered by the proposed ADWPNAS and results show that one search procedure can be completed in 4.0 GPU hours on CIFAR-10. Moreover, the discovered model obtains a test error of 2.41% with only 1.52M parameters which is superior to the best existing models.
Tasks	Neural Architecture Search
Published	2020-03-03
URL	https://arxiv.org/abs/2003.01335v1
PDF	https://arxiv.org/pdf/2003.01335v1.pdf
PWC	https://paperswithcode.com/paper/adwpnas-architecture-driven-weight-prediction
Repo
Framework

WAC: A Corpus of Wikipedia Conversations for Online Abuse Detection


Title	WAC: A Corpus of Wikipedia Conversations for Online Abuse Detection
Authors	Noé Cecillon, Vincent Labatut, Richard Dufour, Georges Linares
Abstract	With the spread of online social networks, it is more and more difficult to monitor all the user-generated content. Automating the moderation process of the inappropriate exchange content on the Internet has thus become a priority task. Methods have been proposed for this purpose, but it can be challenging to find a suitable dataset to train and develop them. This issue is especially true for approaches based on information derived from the structure and the dynamics of the conversation. In this work, we propose an original framework, based on the Wikipedia Comment corpus, with comment-level abuse annotations of different types. The major contribution concerns the reconstruction of conversations, by comparison to existing corpora, which focus only on isolated messages (i.e. taken out of their conversational context). This large corpus of more than 380k annotated messages opens perspectives for online abuse detection and especially for context-based approaches. We also propose, in addition to this corpus, a complete benchmarking platform to stimulate and fairly compare scientific works around the problem of content abuse detection, trying to avoid the recurring problem of result replication. Finally, we apply two classification methods to our dataset to demonstrate its potential.
Tasks	Abuse Detection
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06190v1
PDF	https://arxiv.org/pdf/2003.06190v1.pdf
PWC	https://paperswithcode.com/paper/wac-a-corpus-of-wikipedia-conversations-for
Repo
Framework

A Flexible Framework for Large Graph Learning


Title	A Flexible Framework for Large Graph Learning
Authors	Dalong Yang, Chuan Chen, Youhao Zheng, Zibin Zheng
Abstract	Graph Convolutional Network (GCN) has shown strong effectiveness in graph learning tasks. However, GCN faces challenges in flexibility because it requires the full graph Laplacian to be available in the training phase. Moreover, as the depth of layers increases, the computational and memory cost of GCN grows explosively on account of the recursive neighborhood expansion, which leads to a limitation in processing large graphs. To tackle these issues, we take advantage of image processing in agility and present Node2Img, a flexible architecture for large-scale graph learning. Node2Img maps the nodes to “images” (i.e. grid-like data in Euclidean space) which can be the inputs of Convolutional Neural Network (CNN). Instead of leveraging the fixed whole network as a batch to train the model, Node2Img supports a more efficacious framework in practice, where the batch size can be set elastically and the data in the same batch can be calculated in parallel. Specifically, by ranking each node’s influence through degree, Node2Img selects the most influential first-order as well as second-order neighbors with central node fusion information to construct the grid-like data. For further improving the efficiency of downstream tasks, a simple CNN-based neural network is employed to capture the significant information from the Euclidean grids. Additionally, the attention mechanism is implemented, which enables implicitly specifying the different weights for neighboring nodes with different influences. Extensive experiments on real graphs’ transductive and inductive learning tasks demonstrate the superiority of the proposed Node2Img model against the state-of-the-art GCN-based approaches.
Tasks
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09638v1
PDF	https://arxiv.org/pdf/2003.09638v1.pdf
PWC	https://paperswithcode.com/paper/a-flexible-framework-for-large-graph-learning
Repo
Framework
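
The neighbor-selection step mentioned in the abstract above (ranking nodes by degree and keeping the most influential first-order and second-order neighbors) can be sketched as follows. This is an illustration under assumptions, not the Node2Img implementation: the graph is an undirected adjacency dictionary, ties are broken by node id, and the function name `top_neighbors_by_degree` is hypothetical. How the selected nodes are arranged into a grid and fused with the central node is not covered.

```python
def top_neighbors_by_degree(adj, node, k):
    """Return the k highest-degree first-order and second-order neighbors."""
    degree = {v: len(neighbors) for v, neighbors in adj.items()}

    def rank(candidates):
        # Highest degree first; node id breaks ties so the order is stable.
        return sorted(candidates, key=lambda v: (-degree[v], v))[:k]

    first_order = set(adj[node])
    # Second-order: neighbors of neighbors, excluding the node itself
    # and anything already counted as a first-order neighbor.
    second_order = {w for v in first_order for w in adj[v]}
    second_order -= first_order
    second_order.discard(node)
    return rank(first_order), rank(second_order)
```

Because each node's selection depends only on its own two-hop neighborhood, nodes in a batch can be processed independently of one another.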

STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos


Title	STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos
Authors	Ali Athar, Sabarinath Mahadevan, Aljoša Ošep, Laura Leal-Taixé, Bastian Leibe
Abstract	Existing methods for instance segmentation in videos typically involve multi-stage pipelines that follow the tracking-by-detection paradigm and model a video clip as a sequence of images. Multiple networks are used to detect objects in individual frames, and then associate these detections over time. Hence, these methods are often non-end-to-end trainable and highly tailored to specific tasks. In this paper, we propose a different approach that is well-suited to a variety of tasks involving instance segmentation in videos. In particular, we model a video clip as a single 3D spatio-temporal volume, and propose a novel approach that segments and tracks instances across space and time in a single stage. Our problem formulation is centered around the idea of spatio-temporal embeddings which are trained to cluster pixels belonging to a specific object instance over an entire video clip. To this end, we introduce (i) novel mixing functions that enhance the feature representation of spatio-temporal embeddings, and (ii) a single-stage, proposal-free network that can reason about temporal context. Our network is trained end-to-end to learn spatio-temporal embeddings as well as parameters required to cluster these embeddings, thus simplifying inference. Our method achieves state-of-the-art results across multiple datasets and tasks.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2020-03-18
URL	https://arxiv.org/abs/2003.08429v1
PDF	https://arxiv.org/pdf/2003.08429v1.pdf
PWC	https://paperswithcode.com/paper/stem-seg-spatio-temporal-embeddings-for
Repo
Framework

Relational Deep Feature Learning for Heterogeneous Face Recognition


Title	Relational Deep Feature Learning for Heterogeneous Face Recognition
Authors	MyeongAh Cho, Taeoh Kim, Ig-Jae Kim, Sangyoun Lee
Abstract	Heterogeneous Face Recognition (HFR) is a task that matches faces across two different domains such as VIS (visible light), NIR (near-infrared), or the sketch domain. In contrast to face recognition in the visual spectrum, because of the domain discrepancy, this task requires extracting domain-invariant features or learning a common space projection. To bridge this domain gap, we propose a graph-structured module that focuses on facial relational information to reduce the fundamental differences in domain characteristics. Since relational information is domain independent, our Relational Graph Module (RGM) performs relation modeling from node vectors that represent facial components such as lips, nose, and chin. Propagation of the generated relational graph then reduces the domain difference by transitioning from spatially correlated CNN (convolutional neural network) features to inter-dependent relational features. In addition, we propose a Node Attention Unit (NAU) that performs node-wise recalibration to focus on the more informative nodes arising from the relation-based propagation. Furthermore, we suggest a novel conditional-margin loss function (C-Softmax) for efficient projection learning on the common latent space of the embedding vector. Our module can be plugged into any pre-trained face recognition network to help overcome the limitations of a small HFR database. The proposed method shows superior performance on three different HFR databases, CASIA NIR-VIS 2.0, IIIT-D Sketch, and BUAA-VisNir, in various pre-trained networks. Furthermore, we show that our C-Softmax loss boosts HFR performance and also apply our loss to the large-scale visual face database LFW (Labeled Faces in the Wild) by learning inter-class margins adaptively.
Tasks	Face Recognition, Heterogeneous Face Recognition
Published	2020-03-02
URL	https://arxiv.org/abs/2003.00697v1
PDF	https://arxiv.org/pdf/2003.00697v1.pdf
PWC	https://paperswithcode.com/paper/relational-deep-feature-learning-for
Repo
Framework

Permutation Inference for Canonical Correlation Analysis


Title	Permutation Inference for Canonical Correlation Analysis
Authors	Anderson M. Winkler, Olivier Renaud, Stephen M. Smith, Thomas E. Nichols
Abstract	Canonical correlation analysis (CCA) has become a key tool for population neuroimaging, allowing investigation of associations between many imaging and non-imaging measurements. As age, sex and other variables are often a source of variability not of direct interest, previous work has used CCA on residuals from a model that removes these effects, then proceeded directly to permutation inference. We show that a simple permutation test, as typically used to identify significant modes of shared variation on such data adjusted for nuisance variables, produces inflated error rates. The reason is that residualisation introduces dependencies among the observations that violate the exchangeability assumption. Even in the absence of nuisance variables, however, a simple permutation test for CCA also leads to excess error rates for all canonical correlations other than the first. The reason is that a simple permutation scheme does not ignore the variability already explained by previous canonical variables. Here we propose solutions for both problems: in the case of nuisance variables, we show that transforming the residuals to a lower dimensional basis where exchangeability holds results in a valid permutation test; for more general cases, with or without nuisance variables, we propose estimating the canonical correlations in a stepwise manner, removing at each iteration the variance already explained, while dealing with different numbers of variables on both sides. We also discuss how to address the multiplicity of tests, proposing an admissible test that is not conservative, and provide a complete algorithm for permutation inference for CCA.
Tasks
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10046v2
PDF	https://arxiv.org/pdf/2002.10046v2.pdf
PWC	https://paperswithcode.com/paper/permutation-inference-for-canonical
Repo
Framework