January 29, 2020

3225 words 16 mins read

Paper Group ANR 628


Exploiting Clinically Available Delineations for CNN-based Segmentation in Radiotherapy Treatment Planning

Title Exploiting Clinically Available Delineations for CNN-based Segmentation in Radiotherapy Treatment Planning
Authors Louis D. van Harten, Jelmer M. Wolterink, Joost J. C. Verhoeff, Ivana Išgum
Abstract Convolutional neural networks (CNNs) have been widely and successfully used for medical image segmentation. However, CNNs are typically considered to require large numbers of dedicated expert-segmented training volumes, which may be limiting in practice. This work investigates whether clinically obtained segmentations which are readily available in picture archiving and communication systems (PACS) could provide a possible source of data to train a CNN for segmentation of organs-at-risk (OARs) in radiotherapy treatment planning. In such data, delineations of structures deemed irrelevant to the target clinical use may be lacking. To overcome this issue, we use multi-label instead of multi-class segmentation. We empirically assess how many clinical delineations would be sufficient to train a CNN for the segmentation of OARs and find that increasing the training set size beyond a limited number of images leads to sharply diminishing returns. Moreover, we find that by using multi-label segmentation, missing structures in the reference standard do not have a negative effect on overall segmentation accuracy. These results indicate that segmentations obtained in a clinical workflow can be used to train an accurate OAR segmentation model.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2019-11-12
URL https://arxiv.org/abs/1911.04967v1
PDF https://arxiv.org/pdf/1911.04967v1.pdf
PWC https://paperswithcode.com/paper/exploiting-clinically-available-delineations
Repo
Framework
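
The multi-label trick described in the abstract can be sketched as a per-structure sigmoid loss that simply masks out channels whose clinical delineation is missing. This is a minimal NumPy sketch under my reading of the abstract; the function name and channel layout are illustrative, not the authors' implementation:

```python
import numpy as np

def masked_multilabel_bce(logits, targets, present):
    """Binary cross-entropy per structure channel.

    logits : (C, H, W) raw network outputs, one channel per organ-at-risk
    targets: (C, H, W) binary reference masks
    present: (C,) bool -- False where the clinical delineation is missing

    Channels whose delineation is absent contribute nothing to the loss,
    so incomplete PACS segmentations do not penalize correct predictions.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))  # per-channel sigmoid, not softmax
    eps = 1e-7
    bce = -(targets * np.log(probs + eps)
            + (1 - targets) * np.log(1 - probs + eps))
    per_channel = bce.mean(axis=(1, 2))
    return per_channel[present].mean()
```

The key design choice is sigmoid-per-channel instead of a softmax over classes: channels become independent, so a missing structure can be dropped from the loss without distorting the remaining ones.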

Shooting Labels: 3D Semantic Labeling by Virtual Reality

Title Shooting Labels: 3D Semantic Labeling by Virtual Reality
Authors Pierluigi Zama Ramirez, Claudio Paternesi, Daniele De Gregorio, Luigi Di Stefano
Abstract Availability of a few large-size annotated datasets, like ImageNet, Pascal VOC and COCO, has led deep learning to revolutionize computer vision research by achieving astonishing results in several vision tasks. We argue that new tools to facilitate generation of annotated datasets may help spread data-driven AI throughout applications and domains. In this work we propose Shooting Labels, the first 3D labeling tool for dense 3D semantic segmentation which exploits Virtual Reality to render the labeling task as easy and fun as playing a video game. Our tool allows for semantically labeling large-scale environments very expeditiously, whatever the nature of the 3D data at hand (e.g. point clouds, meshes). Furthermore, Shooting Labels efficiently integrates multi-user annotations to improve the labeling accuracy automatically and to compute a label uncertainty map. Besides, within our framework the 3D annotations can be projected into 2D images, thereby also speeding up a notoriously slow and expensive task such as pixel-wise semantic labeling. We demonstrate the accuracy and efficiency of our tool in two different scenarios: an indoor workspace provided by Matterport3D and a large-scale outdoor environment reconstructed from 1000+ KITTI images.
Tasks 3D Semantic Segmentation, Semantic Segmentation
Published 2019-10-11
URL https://arxiv.org/abs/1910.05021v1
PDF https://arxiv.org/pdf/1910.05021v1.pdf
PWC https://paperswithcode.com/paper/shooting-labels-3d-semantic-labeling-by
Repo
Framework
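
The projection of 3D annotations into 2D label images that the abstract mentions amounts to a pinhole-camera splat with a z-buffer. A minimal sketch, assuming camera-frame points and known intrinsics `K` (all names illustrative, not the tool's actual API):

```python
import numpy as np

def project_labels(points, labels, K, H, W):
    """Splat labeled 3D points (camera coordinates) into a 2D label map.

    points: (N, 3) xyz in the camera frame, z > 0
    labels: (N,) integer semantic labels
    K:      (3, 3) pinhole intrinsics
    Returns an (H, W) label image, 0 = unlabeled. The depth buffer keeps
    only the nearest point per pixel, so occluded labels do not leak.
    """
    z = points[:, 2]
    uv = (K @ points.T).T
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    out = np.zeros((H, W), dtype=int)
    depth = np.full((H, W), np.inf)
    for ui, vi, zi, li in zip(u, v, z, labels):
        if 0 <= vi < H and 0 <= ui < W and zi < depth[vi, ui]:
            depth[vi, ui] = zi
            out[vi, ui] = li
    return out
```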

Constant Curvature Graph Convolutional Networks

Title Constant Curvature Graph Convolutional Networks
Authors Gregor Bachmann, Gary Bécigneul, Octavian-Eugen Ganea
Abstract Interest has been rising lately towards methods representing data in non-Euclidean spaces, e.g. hyperbolic or spherical, that provide specific inductive biases useful for certain real-world data properties, e.g. scale-free, hierarchical or cyclical. However, the popular graph neural networks are currently limited to modeling data only via Euclidean geometry and associated vector space operations. Here, we bridge this gap by proposing mathematically grounded generalizations of graph convolutional networks (GCN) to (products of) constant curvature spaces. We do this by i) introducing a unified formalism that can interpolate smoothly between all geometries of constant curvature, ii) leveraging gyro-barycentric coordinates that generalize the classic Euclidean concept of the center of mass. Our class of models smoothly recovers its Euclidean counterparts when the curvature goes to zero from either side. Empirically, we outperform Euclidean GCNs in the tasks of node classification and distortion minimization for symbolic data exhibiting non-Euclidean behavior, according to their discrete curvature.
Tasks Node Classification
Published 2019-11-12
URL https://arxiv.org/abs/1911.05076v2
PDF https://arxiv.org/pdf/1911.05076v2.pdf
PWC https://paperswithcode.com/paper/constant-curvature-graph-convolutional-1
Repo
Framework
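
The "unified formalism that can interpolate smoothly between all geometries of constant curvature" relies on curvature-parametrized trigonometry. The sketch below of a κ-dependent tangent is a standard construction, not code from the paper; it illustrates the smooth Euclidean limit at κ = 0:

```python
import math

def tan_k(x, k):
    """Curvature-dependent tangent: tan for k > 0 (spherical), identity
    at k = 0 (Euclidean), tanh for k < 0 (hyperbolic). The normalization
    by sqrt(|k|) makes the three branches agree as k -> 0.
    """
    if k > 0:
        s = math.sqrt(k)
        return math.tan(s * x) / s
    if k < 0:
        s = math.sqrt(-k)
        return math.tanh(s * x) / s
    return x
```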

ResNetX: a more disordered and deeper network architecture

Title ResNetX: a more disordered and deeper network architecture
Authors Wenfeng Feng, Xin Zhang, Guangpeng Zhao
Abstract Designing efficient network structures has always been a core concern of neural network research. ResNet and its variants have proved to be efficient architectures. However, how to theoretically characterize the influence of network structure on performance is still vague. With the help of techniques from complex networks, we here provide a natural yet efficient extension to ResNet by folding its backbone chain. Our architecture has two structural features when mapped to directed acyclic graphs: first, a higher degree of disorder compared with ResNet, which lets ResNetX explore a larger number of feature maps with different sizes of receptive fields; second, a larger proportion of shorter paths compared to ResNet, which improves the direct flow of information through the entire network. Our architecture exposes a new dimension, namely “fold depth”, in addition to the existing dimensions of depth, width, and cardinality. Our architecture is a natural extension to ResNet and can be integrated with existing state-of-the-art methods with little effort. Image classification results on the CIFAR-10 and CIFAR-100 benchmarks suggest that our new network architecture performs better than ResNet.
Tasks Image Classification
Published 2019-12-18
URL https://arxiv.org/abs/1912.12165v1
PDF https://arxiv.org/pdf/1912.12165v1.pdf
PWC https://paperswithcode.com/paper/resnetx-a-more-disordered-and-deeper-network
Repo
Framework
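
The path-length claim can be made concrete. Viewed as a DAG, a chain of n residual blocks contains 2^n paths whose lengths follow a binomial distribution, because each block is traversed either through its residual branch (length 1) or its identity skip (length 0); ResNetX's folding is claimed to shift this distribution toward shorter paths. A small enumeration sketch (illustrative, not the authors' code):

```python
from itertools import product
from collections import Counter

def resnet_path_lengths(n_blocks):
    """Distribution of path lengths through a plain ResNet backbone
    viewed as a DAG: one (0, 1) choice per block, 2^n paths in total.
    """
    counts = Counter()
    for choice in product((0, 1), repeat=n_blocks):
        counts[sum(choice)] += 1
    return dict(counts)
```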

Improving Node Classification by Co-training Node Pair Classification: A Novel Training Framework for General Graph Neural Networks

Title Improving Node Classification by Co-training Node Pair Classification: A Novel Training Framework for General Graph Neural Networks
Authors Deli Chen, Xiaoqian Liu, Yankai Lin, Peng Li, Jie Zhou, Qi Su, Xu Sun
Abstract Semi-supervised learning is a widely used training framework for graph node classification. However, this learning method has two problems: (1) the original graph topology may not be perfectly aligned with the node classification task; (2) the supervision information in the training set is not fully used. To tackle these two problems, we design a new task, node pair classification, to assist in training GNN models for the target node classification task. We further propose a novel training framework named Adaptive Co-training, which jointly trains the node classification and the node pair classification after the optimization of graph topology. Extensive experimental results on four representative GNN models demonstrate that our proposed training framework significantly outperforms baseline methods across three benchmark graph datasets.
Tasks Node Classification
Published 2019-11-10
URL https://arxiv.org/abs/1911.03904v1
PDF https://arxiv.org/pdf/1911.03904v1.pdf
PWC https://paperswithcode.com/paper/improving-node-classification-by-co-training
Repo
Framework
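
The auxiliary node-pair task can be sketched as deriving a binary same-class label for each candidate pair of training nodes; the exact pair construction in the paper may differ from this minimal reading:

```python
def node_pair_labels(edges, y):
    """Derive the auxiliary node-pair task from node labels.

    For each pair (i, j) with both endpoints labeled, the pair label is
    1 if the endpoints share a class and 0 otherwise.
    """
    return [(i, j, int(y[i] == y[j])) for i, j in edges]
```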

Using massive health insurance claims data to predict very high-cost claimants: a machine learning approach

Title Using massive health insurance claims data to predict very high-cost claimants: a machine learning approach
Authors José M. Maisog, Wenhong Li, Yanchun Xu, Brian Hurley, Hetal Shah, Ryan Lemberg, Tina Borden, Stephen Bandeian, Melissa Schline, Roxanna Cross, Alan Spiro, Russ Michael, Alexander Gutfraind
Abstract Due to escalating healthcare costs, accurately predicting which patients will incur high costs is an important task for payers and providers of healthcare. High-cost claimants (HiCCs) are patients who have annual costs above $250,000 and who represent just 0.16% of the insured population but currently account for 9% of all healthcare costs. In this study, we aimed to develop a high-performance algorithm to predict HiCCs to inform a novel care management system. Using health insurance claims from 48 million people, augmented with census data, we applied machine learning to train binary classification models to calculate the personal risk of HiCC. To train the models, we developed a platform starting with 6,006 variables across all clinical and demographic dimensions and constructed over one hundred candidate models. The best model achieved an area under the receiver operating characteristic curve of 91.2%. The model exceeds the highest published performance (84%) and remains high for patients with no prior history of high-cost status (89%), who have less than a full year of enrollment (87%), or lack pharmacy claims data (88%). It attains an area under the precision-recall curve of 23.1%, and precision of 74% at a threshold of 0.99. A care management program enrolling 500 people with the highest HiCC risk is expected to treat 199 true HiCCs and generate a net savings of $7.3 million per year. Our results demonstrate that high-performing predictive models can be constructed using claims data and publicly available data alone, even for rare high-cost claimants exceeding $250,000. Our model demonstrates the transformational power of machine learning and artificial intelligence in care management, which would allow healthcare payers and providers to introduce the next generation of care management programs.
Tasks
Published 2019-12-30
URL https://arxiv.org/abs/1912.13032v1
PDF https://arxiv.org/pdf/1912.13032v1.pdf
PWC https://paperswithcode.com/paper/using-massive-health-insurance-claims-data-to
Repo
Framework
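
The reported "precision of 74% at a threshold of 0.99" corresponds to a simple computation over predicted risks: among patients flagged above the threshold, the fraction who are true HiCCs. A sketch with illustrative data:

```python
def precision_at_threshold(scores, labels, thresh):
    """Precision among patients whose predicted HiCC risk >= thresh.

    scores: predicted probabilities; labels: 1 = true HiCC, 0 = not.
    Returns NaN when no patient clears the threshold.
    """
    flagged = [l for s, l in zip(scores, labels) if s >= thresh]
    return sum(flagged) / len(flagged) if flagged else float("nan")
```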

Deep, robust and single shot 3D multi-person human pose estimation in complex images

Title Deep, robust and single shot 3D multi-person human pose estimation in complex images
Authors Abdallah Benzine, Bertrand Luvison, Quoc Cuong Pham, Catherine Achard
Abstract In this paper, we propose a new single shot method for multi-person 3D human pose estimation in complex images. The model jointly learns to locate the human joints in the image, to estimate their 3D coordinates and to group these predictions into full human skeletons. The proposed method deals with a variable number of people and does not need bounding boxes to estimate the 3D poses. It leverages and extends the Stacked Hourglass Network and its multi-scale feature learning to manage multi-person situations. Thus, we exploit a robust 3D human pose formulation to fully describe several 3D human poses even in case of strong occlusions or crops. Then, joint grouping and human pose estimation for an arbitrary number of people are performed using the associative embedding method. Our approach significantly outperforms the state of the art on the challenging CMU Panoptic dataset. Furthermore, it leads to good results on the complex and synthetic images from the newly proposed JTA Dataset.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2019-11-08
URL https://arxiv.org/abs/1911.03391v1
PDF https://arxiv.org/pdf/1911.03391v1.pdf
PWC https://paperswithcode.com/paper/deep-robust-and-single-shot-3d-multi-person
Repo
Framework
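
Associative-embedding grouping assigns each detected joint to the skeleton whose mean embedding tag is closest. A simplified 1-D sketch (the threshold `tau` and the greedy rule are illustrative assumptions, not the paper's exact procedure):

```python
def group_by_tags(detections, tau=0.5):
    """Greedy associative-embedding grouping (simplified sketch).

    detections: list of (joint_type, tag) pairs; tag is a 1-D embedding.
    A detection joins the skeleton whose mean tag is within tau and which
    does not already contain that joint type; otherwise it starts a new
    skeleton.
    """
    skeletons = []  # each: {"tags": [...], "joints": {joint_type: tag}}
    for jt, tag in detections:
        best = None
        for sk in skeletons:
            mean = sum(sk["tags"]) / len(sk["tags"])
            dist = abs(tag - mean)
            if jt not in sk["joints"] and dist < tau:
                if best is None or dist < best[1]:
                    best = (sk, dist)
        if best is None:
            skeletons.append({"tags": [tag], "joints": {jt: tag}})
        else:
            best[0]["tags"].append(tag)
            best[0]["joints"][jt] = tag
    return skeletons
```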

Exponentiated Gradient Meets Gradient Descent

Title Exponentiated Gradient Meets Gradient Descent
Authors Udaya Ghai, Elad Hazan, Yoram Singer
Abstract The (stochastic) gradient descent and the multiplicative update method are probably the most popular algorithms in machine learning. We introduce and study a new regularization which provides a unification of the additive and multiplicative updates. This regularization is derived from a hyperbolic analogue of the entropy function, which we call hypentropy. It is motivated by a natural extension of the multiplicative update to negative numbers. The hypentropy has a natural spectral counterpart, which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices. While the latter is only applicable to positive semi-definite matrices, the spectral hypentropy method can naturally be used with general rectangular matrices. We analyze the new family of updates by deriving tight regret bounds. We study empirically the applicability of the new update for settings such as multiclass learning, in which the parameters constitute a general rectangular matrix.
Tasks
Published 2019-02-05
URL http://arxiv.org/abs/1902.01903v1
PDF http://arxiv.org/pdf/1902.01903v1.pdf
PWC https://paperswithcode.com/paper/exponentiated-gradient-meets-gradient-descent
Repo
Framework
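
The hypentropy mirror map has gradient arcsinh(w/β), so the resulting mirror-descent step is w′ = β·sinh(arcsinh(w/β) − η·g). Assuming that form (it matches my reading of the paper's construction, but verify against the source), the two limits can be checked numerically: large β recovers a rescaled gradient-descent step w − βηg, small β recovers the multiplicative update w·exp(−ηg) for w > 0:

```python
import math

def hypentropy_update(w, g, eta, beta):
    """One mirror-descent step under the hypentropy mirror map (sketch)."""
    return [beta * math.sinh(math.asinh(wi / beta) - eta * gi)
            for wi, gi in zip(w, g)]
```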

Consensus-based Optimization for 3D Human Pose Estimation in Camera Coordinates

Title Consensus-based Optimization for 3D Human Pose Estimation in Camera Coordinates
Authors Diogo C Luvizon, Hedi Tabia, David Picard
Abstract 3D human pose estimation is frequently seen as the task of estimating 3D poses relative to the root body joint. Alternatively, in this paper, we propose a 3D human pose estimation method in camera coordinates, which allows effective combination of 2D annotated data and 3D poses, as well as a straightforward multi-view generalization. To that end, we cast the problem into a different perspective, where 3D poses are predicted in the image plane, in pixels, and the absolute depth is estimated in millimeters. Based on this, we propose a consensus-based optimization algorithm for multi-view predictions from uncalibrated images, which requires a single monocular training procedure. Our method improves the state of the art on well-known 3D human pose datasets, reducing the prediction error by 32% on the most common benchmark. In addition, we also report our results in absolute pose position error, achieving 80 mm for monocular estimation and 51 mm for multi-view estimation, on average.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2019-11-21
URL https://arxiv.org/abs/1911.09245v2
PDF https://arxiv.org/pdf/1911.09245v2.pdf
PWC https://paperswithcode.com/paper/consensus-based-optimization-for-3d-human
Repo
Framework
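
Predicting poses in pixels plus absolute depth means metric camera-frame coordinates are recovered by standard pinhole back-projection. A one-line sketch with illustrative intrinsics (not the paper's code):

```python
def backproject(u, v, z_mm, fx, fy, cx, cy):
    """Lift a 2D joint (pixels) plus absolute depth (mm) to camera
    coordinates: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    return ((u - cx) * z_mm / fx, (v - cy) * z_mm / fy, z_mm)
```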

Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization

Title Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization
Authors Anas Barakat, Pascal Bianchi
Abstract Although ADAM is a very popular algorithm for optimizing the weights of neural networks, it has been recently shown that it can diverge even in simple convex optimization examples. Several variants of ADAM have been proposed to circumvent this convergence issue. In this work, we study the ADAM algorithm for smooth nonconvex optimization under a boundedness assumption on the adaptive learning rate. The bound on the adaptive step size depends on the Lipschitz constant of the gradient of the objective function and provides safe theoretical adaptive step sizes. Under this boundedness assumption, we show a novel first order convergence rate result in both deterministic and stochastic contexts. Furthermore, we establish convergence rates of the function value sequence using the Kurdyka-Lojasiewicz property.
Tasks
Published 2019-11-18
URL https://arxiv.org/abs/1911.07596v1
PDF https://arxiv.org/pdf/1911.07596v1.pdf
PWC https://paperswithcode.com/paper/convergence-analysis-of-a-momentum-algorithm-1
Repo
Framework
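
The boundedness assumption on the adaptive learning rate can be mimicked by flooring ADAM's denominator, which caps the per-coordinate step at lr/eps_min. This is a generic ADAM variant written for illustration; the paper's exact condition ties the bound to the Lipschitz constant of the gradient, which this sketch does not compute:

```python
import numpy as np

def adam_bounded(grad_fn, w0, lr=1e-3, beta1=0.9, beta2=0.999,
                 eps_min=0.1, steps=1000):
    """ADAM with a floor on the denominator (bounded adaptive step).

    With the denominator kept at or above eps_min, each coordinate moves
    by at most lr / eps_min per step.
    """
    w = np.array(w0, dtype=float)
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)       # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / np.maximum(np.sqrt(v_hat), eps_min)
    return w
```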

A unified spectra analysis workflow for the assessment of microbial contamination of ready to eat green salads: Comparative study and application of non-invasive sensors

Title A unified spectra analysis workflow for the assessment of microbial contamination of ready to eat green salads: Comparative study and application of non-invasive sensors
Authors Panagiotis Tsakanikas, Lemonia Christina Fengou, Evanthia Manthou, Alexandra Lianou, Efstathios Z. Panagou, George John E. Nychas
Abstract The present study provides a comparative assessment of non-invasive sensors as means of estimating the microbial contamination and time-on-shelf (i.e. storage time) of leafy green vegetables, using a novel unified spectra analysis workflow. Two fresh ready-to-eat green salads were used in the context of this study for the purpose of evaluating the efficiency and practical application of the presented workflow: rocket and baby spinach salads. The employed analysis workflow consisted of robust data normalization, powerful feature selection based on random forests regression, and selection of the number of partial least squares regression coefficients in the training process by estimating the knee point on the explained variance plot. Training processes were based on microbiological and spectral data derived during storage of green salad samples at isothermal conditions (4, 8 and 12°C), whereas testing was performed on data during storage under dynamic temperature conditions (simulating real-life temperature fluctuations in the food supply chain). Since an increasing interest in the use of non-invasive sensors in food quality assessment has become evident in recent years, the unified spectra analysis workflow described herein, by being based on the creation and usage of limited-size feature sets, could be very useful in food-specific low-cost sensor development.
Tasks Feature Selection
Published 2019-03-21
URL http://arxiv.org/abs/1903.08998v2
PDF http://arxiv.org/pdf/1903.08998v2.pdf
PWC https://paperswithcode.com/paper/a-unified-spectra-analysis-workflow-for-the
Repo
Framework
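
Knee-point selection on the explained-variance plot can be implemented as the point farthest from the chord joining the curve's endpoints. This is a common knee-detection heuristic; the paper's exact estimator may differ:

```python
import numpy as np

def knee_point(y):
    """Index of the knee of a concave increasing curve (e.g. explained
    variance vs. number of PLS components): the point with the largest
    perpendicular distance to the line joining the first and last values.
    """
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    x0, y0, x1, y1 = x[0], y[0], x[-1], y[-1]
    # numerator of the point-to-line distance; the denominator is a
    # constant shared by all points, so it can be dropped for argmax
    num = np.abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0)
    return int(np.argmax(num))
```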

Convergence Rate of $\mathcal{O}(1/k)$ for Optimistic Gradient and Extra-gradient Methods in Smooth Convex-Concave Saddle Point Problems

Title Convergence Rate of $\mathcal{O}(1/k)$ for Optimistic Gradient and Extra-gradient Methods in Smooth Convex-Concave Saddle Point Problems
Authors Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil
Abstract We study the iteration complexity of the optimistic gradient descent-ascent (OGDA) method and the extra-gradient (EG) method for finding a saddle point of a convex-concave unconstrained min-max problem. To do so, we first show that both OGDA and EG can be interpreted as approximate variants of the proximal point method. We then exploit this interpretation to show that both algorithms produce iterates that remain within a bounded set. We further show that the function value of the averaged iterates generated by both of these algorithms converges with a rate of $\mathcal{O}(1/k)$ to the function value at the saddle point. Our theoretical analysis is of interest as it provides a simple convergence analysis for the EG algorithm in terms of function value without using a compactness assumption. Moreover, it provides the first convergence rate estimate for OGDA in the general convex-concave setting.
Tasks
Published 2019-06-03
URL https://arxiv.org/abs/1906.01115v2
PDF https://arxiv.org/pdf/1906.01115v2.pdf
PWC https://paperswithcode.com/paper/proximal-point-approximations-achieving-a
Repo
Framework
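
The EG method's behavior is easiest to see on the bilinear saddle problem min_x max_y xy, where plain gradient descent-ascent spirals outward but EG's look-ahead step makes the iterates contract toward the saddle (0, 0):

```python
def extragradient_bilinear(x0, y0, eta=0.1, steps=200):
    """Extra-gradient iterations for min_x max_y x*y.

    Each step first takes a look-ahead half step, then a full step using
    gradients evaluated at the midpoint. For this problem the squared
    distance to the saddle shrinks by (1 - eta^2)^2 + eta^2 < 1 per step.
    """
    x, y = x0, y0
    for _ in range(steps):
        xm = x - eta * y          # look-ahead: grad_x(x*y) = y
        ym = y + eta * x          # look-ahead: grad_y(x*y) = x
        x, y = x - eta * ym, y + eta * xm   # full step at the midpoint
    return x, y
```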

PoseLifter: Absolute 3D human pose lifting network from a single noisy 2D human pose

Title PoseLifter: Absolute 3D human pose lifting network from a single noisy 2D human pose
Authors Ju Yong Chang, Gyeongsik Moon, Kyoung Mu Lee
Abstract This study presents a new network (i.e., PoseLifter) that can lift a 2D human pose to an absolute 3D pose in a camera coordinate system. The proposed network estimates the absolute 3D location of a target subject and generates an improved 3D relative pose estimation compared with existing pose-lifting methods. Using the PoseLifter with a 2D pose estimator in a cascade fashion can estimate a 3D human pose from a single RGB image. In this case, we empirically prove that using realistic 2D poses synthesized with the real error distribution of 2D body joints considerably improves the performance of our PoseLifter. The proposed method is applied to public datasets to achieve state-of-the-art 2D-to-3D pose lifting and 3D human pose estimation.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2019-10-26
URL https://arxiv.org/abs/1910.12029v2
PDF https://arxiv.org/pdf/1910.12029v2.pdf
PWC https://paperswithcode.com/paper/absposelifter-absolute-3d-human-pose-lifting
Repo
Framework
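
Synthesizing "realistic 2D poses with the real error distribution" can be sketched by resampling stored per-joint errors of a 2D estimator onto ground-truth joints. The paper models the error distribution itself; this empirical bootstrap resampling is a simplification for illustration:

```python
import numpy as np

def synthesize_2d_poses(gt_2d, errors, rng):
    """Perturb ground-truth 2D joints with errors resampled from an
    empirical bank of observed 2D-estimator errors.

    gt_2d : (J, 2) ground-truth joint positions in pixels
    errors: (M, 2) bank of observed per-joint errors
    rng   : numpy Generator for reproducible sampling
    """
    idx = rng.integers(0, len(errors), size=gt_2d.shape[0])
    return gt_2d + errors[idx]
```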

Imaging with highly incomplete and corrupted data

Title Imaging with highly incomplete and corrupted data
Authors Miguel Moscoso, Alexei Novikov, George Papanicolaou, Chrysoula Tsogka
Abstract We consider the problem of imaging sparse scenes from a few noisy data using an $\ell_1$-minimization approach. This problem can be cast as a linear system of the form $A\rho = b$, where $A$ is an $N \times K$ measurement matrix. We assume that the dimension of the unknown sparse vector $\rho \in {\mathbb{C}}^K$ is much larger than the dimension of the data vector $b \in {\mathbb{C}}^N$, i.e., $K \gg N$. We provide a theoretical framework that allows us to examine under what conditions the $\ell_1$-minimization problem admits a solution that is close to the exact one in the presence of noise. Our analysis shows that $\ell_1$-minimization is not robust for imaging with noisy data when high resolution is required. To improve the performance of $\ell_1$-minimization we propose to solve instead the augmented linear system $[A \; C]\rho = b$, where the $N \times \Sigma$ matrix $C$ is a noise collector. It is constructed so that its column vectors provide a frame on which the noise of the data, a vector of dimension $N$, can be well approximated. Theoretically, the dimension $\Sigma$ of the noise collector should be $e^N$, which would make its use impractical. However, our numerical results illustrate that robust results in the presence of noise can be obtained with a large enough number of columns $\Sigma \approx 10K$.
Tasks
Published 2019-08-05
URL https://arxiv.org/abs/1908.01479v1
PDF https://arxiv.org/pdf/1908.01479v1.pdf
PWC https://paperswithcode.com/paper/imaging-with-highly-incomplete-and-corrupted
Repo
Framework
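
Since the augmented system is still an $\ell_1$ problem, any standard sparse solver applies to it. The sketch below uses ISTA (iterative soft-thresholding, a generic proximal-gradient method, not necessarily the authors' solver) on a real-valued matrix that may be the augmented block $[A \; C]$:

```python
import numpy as np

def ista(A, b, lam=0.05, eta=None, iters=500):
    """Iterative soft-thresholding for
        min_rho 0.5 * ||A rho - b||^2 + lam * ||rho||_1.

    A can be the augmented matrix [A C] with a noise-collector block; the
    recovered tail coordinates then absorb the data noise. Real-valued
    sketch (the paper works over the complex field).
    """
    if eta is None:
        eta = 1.0 / np.linalg.norm(A, 2) ** 2   # step below 1/L
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)                   # gradient of the smooth part
        z = x - eta * g
        x = np.sign(z) * np.maximum(np.abs(z) - eta * lam, 0.0)  # soft threshold
    return x
```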

Submodular Function Minimization and Polarity

Title Submodular Function Minimization and Polarity
Authors Alper Atamturk, Vishnu Narayanan
Abstract Using polarity, we give an outer polyhedral approximation for the epigraph of set functions. For a submodular function, we prove that the corresponding polar relaxation is exact; hence, it is equivalent to the Lovász extension. The polar approach provides an alternative proof for the convex hull description of the epigraph of a submodular function. Computational experiments show that the inequalities from outer approximations can be effective as cutting planes for solving submodular as well as non-submodular set function minimization problems.
Tasks
Published 2019-12-31
URL https://arxiv.org/abs/1912.13238v2
PDF https://arxiv.org/pdf/1912.13238v2.pdf
PWC https://paperswithcode.com/paper/submodular-function-minimization-and-polarity
Repo
Framework
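
The Lovász extension the abstract refers to can be evaluated directly by the classic sorting formula: sort coordinates decreasingly and sum marginal gains weighted by the coordinates. A minimal sketch:

```python
import numpy as np

def lovasz_extension(F, x):
    """Evaluate the Lovasz extension of a set function F at x in [0,1]^n.

    Sorting coordinates decreasingly, the extension is
        sum_i x_{(i)} * (F(S_i) - F(S_{i-1})),
    where S_i collects the i largest coordinates. For submodular F this
    coincides with the convex closure of F on the hypercube.
    """
    order = np.argsort(-np.asarray(x))
    total, prev = 0.0, F(frozenset())
    chosen = set()
    for i in order:
        chosen.add(int(i))
        cur = F(frozenset(chosen))
        total += x[i] * (cur - prev)
        prev = cur
    return total
```

For a modular function such as cardinality, the extension reduces to the sum of coordinates; for the coverage-style function min(|S|, 1) it reduces to the maximum coordinate.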