Paper Group ANR 1029
Papers in this group:
The Visual Centrifuge: Model-Free Layered Video Representations
Transfer Learning of Artist Group Factors to Musical Genre Classification
Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks
GANs for Medical Image Analysis
Tensor Matched Kronecker-Structured Subspace Detection for Missing Information
Model-Based Learning of Turbulent Flows using a Mobile Robot
Recursive Optimization of Convex Risk Measures: Mean-Semideviation Models
RealPoint3D: Point Cloud Generation from a Single Image with Complex Background
Gradient Boosting With Piece-Wise Linear Regression Trees
S4ND: Single-Shot Single-Scale Lung Nodule Detection
A family of OWA operators based on Faulhaber’s formulas
Towards A Unified Analysis of Random Fourier Features
To understand deep learning we need to understand kernel learning
DenseRAN for Offline Handwritten Chinese Character Recognition
The EcoLexicon English Corpus as an open corpus in Sketch Engine
The Visual Centrifuge: Model-Free Layered Video Representations
Title | The Visual Centrifuge: Model-Free Layered Video Representations |
Authors | Jean-Baptiste Alayrac, João Carreira, Andrew Zisserman |
Abstract | True video understanding requires making sense of non-Lambertian scenes where the color of light arriving at the camera sensor encodes information about not just the last object it collided with, but about multiple media – colored windows, dirty mirrors, smoke or rain. Layered video representations have the potential of accurately modelling realistic scenes but have so far required stringent assumptions on motion, lighting and shape. Here we propose a learning-based approach to multi-layered video representation: we introduce novel uncertainty-capturing 3D convolutional architectures and train them to separate blended videos. We show that these models then generalize to single videos, where they exhibit interesting abilities: color constancy, factoring out shadows and separating reflections. We present quantitative and qualitative results on real-world videos. |
Tasks | Color Constancy, Video Understanding |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01461v2 |
http://arxiv.org/pdf/1812.01461v2.pdf | |
PWC | https://paperswithcode.com/paper/the-visual-centrifuge-model-free-layered |
Repo | |
Framework | |
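The core training trick in the abstract — blend two clips, then train the network to un-mix them — hinges on a loss that does not care which output slot each recovered layer lands in. The NumPy toy below is an illustrative sketch, not the paper's architecture: the clip shapes, the 50/50 averaging blend, and the plain L2 loss are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "videos" as (frames, height, width, channels) arrays in [0, 1].
v1 = rng.random((4, 8, 8, 3))
v2 = rng.random((4, 8, 8, 3))

# Training input: a simple 50/50 average blend of the two clips.
blend = 0.5 * (v1 + v2)

def permutation_invariant_loss(pred_a, pred_b, tgt_1, tgt_2):
    """L2 reconstruction loss under the best assignment of predicted
    layers to ground-truth layers, so the separator is never penalized
    for emitting the two layers in the "wrong" order."""
    def l2(a, b):
        return float(np.mean((a - b) ** 2))
    straight = l2(pred_a, tgt_1) + l2(pred_b, tgt_2)
    swapped = l2(pred_a, tgt_2) + l2(pred_b, tgt_1)
    return min(straight, swapped)

# A perfect separator that emits the layers in swapped order: zero loss.
loss_perfect = permutation_invariant_loss(v2, v1, v1, v2)
# A degenerate "separator" that just copies the blend twice: penalized.
loss_degenerate = permutation_invariant_loss(blend, blend, v1, v2)
```

The min over assignments is what lets the model discover its own layer ordering during training.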
Transfer Learning of Artist Group Factors to Musical Genre Classification
Title | Transfer Learning of Artist Group Factors to Musical Genre Classification |
Authors | Jaehun Kim, Minz Won, Xavier Serra, Cynthia C. S. Liem |
Abstract | The automated recognition of music genres from audio information is a challenging problem, as genre labels are subjective and noisy. Artist labels are less subjective and less noisy, while certain artists may relate more strongly to certain genres. At the same time, at prediction time, it is not guaranteed that artist labels are available for a given audio segment. Therefore, in this work, we propose to apply the transfer learning framework, learning artist-related information which will be used at inference time for genre classification. We consider different types of artist-related information, expressed through artist group factors, which will allow for more efficient learning and stronger robustness to potential label noise. Furthermore, we investigate how to achieve the highest validation accuracy on the given FMA dataset, by experimenting with various kinds of transfer methods, including single-task transfer, multi-task transfer and finally multi-task learning. |
Tasks | Multi-Task Learning, Transfer Learning |
Published | 2018-05-05 |
URL | http://arxiv.org/abs/1805.02043v2 |
http://arxiv.org/pdf/1805.02043v2.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-of-artist-group-factors-to |
Repo | |
Framework | |
Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks
Title | Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks |
Authors | Amy Jin, Serena Yeung, Jeffrey Jopling, Jonathan Krause, Dan Azagury, Arnold Milstein, Li Fei-Fei |
Abstract | Five billion people in the world lack access to quality surgical care. Surgeon skill varies dramatically, and many surgical patients suffer complications and avoidable harm. Improving surgical training and feedback would help to reduce the rate of complications, half of which have been shown to be preventable. To do this, it is essential to assess operative skill, a process that currently requires experts and is manual, time consuming, and subjective. In this work, we introduce an approach to automatically assess surgeon performance by tracking and analyzing tool movements in surgical videos, leveraging region-based convolutional neural networks. In order to study this problem, we also introduce a new dataset, m2cai16-tool-locations, which extends the m2cai16-tool dataset with spatial bounds of tools. While previous methods have addressed tool presence detection, ours is the first to not only detect presence but also spatially localize surgical tools in real-world laparoscopic surgical videos. We show that our method both effectively detects the spatial bounds of tools as well as significantly outperforms existing methods on tool presence detection. We further demonstrate the ability of our method to assess surgical quality through analysis of tool usage patterns, movement range, and economy of motion. |
Tasks | |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08774v2 |
http://arxiv.org/pdf/1802.08774v2.pdf | |
PWC | https://paperswithcode.com/paper/tool-detection-and-operative-skill-assessment |
Repo | |
Framework | |
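The skill metrics named at the end of the abstract — movement range and economy of motion — reduce to simple geometry once per-frame tool positions are available. The sketch below uses hypothetical data; a real pipeline would feed in bounding-box centers emitted by the region-based tool detector.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-frame tool-tip positions (100 frames of a noisy path).
t = np.linspace(0.0, 1.0, 100)
traj = np.stack([t, np.sin(2 * np.pi * t)], axis=1)
traj += 0.01 * rng.standard_normal(traj.shape)

def path_length(points):
    """Total distance travelled by the tool tip (economy of motion:
    a shorter path for the same task indicates more economical movement)."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))

def movement_range(points):
    """Area of the axis-aligned bounding box swept by the tool tip."""
    span = points.max(axis=0) - points.min(axis=0)
    return float(span[0] * span[1])

eom = path_length(traj)
mrange = movement_range(traj)
```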
GANs for Medical Image Analysis
Title | GANs for Medical Image Analysis |
Authors | Salome Kazeminia, Christoph Baur, Arjan Kuijper, Bram van Ginneken, Nassir Navab, Shadi Albarqouni, Anirban Mukhopadhyay |
Abstract | Generative Adversarial Networks (GANs) and their extensions have carved open many exciting ways to tackle well-known and challenging medical image analysis problems such as medical image denoising, reconstruction, segmentation, data simulation, detection or classification. Furthermore, their ability to synthesize images at unprecedented levels of realism also gives hope that the chronic scarcity of labeled data in the medical field can be resolved with the help of these generative models. In this review paper, a broad overview of recent literature on GANs for medical applications is given, the shortcomings and opportunities of the proposed methods are thoroughly discussed, and potential future work is elaborated. We review the most relevant papers published until the submission date. For quick access, important details such as the underlying method, datasets and performance are tabulated. An interactive visualization that categorizes all papers, intended to keep the review alive, is available at http://livingreview.in.tum.de/GANs_for_Medical_Applications. |
Tasks | |
Published | 2018-09-13 |
URL | https://arxiv.org/abs/1809.06222v3 |
https://arxiv.org/pdf/1809.06222v3.pdf | |
PWC | https://paperswithcode.com/paper/gans-for-medical-image-analysis |
Repo | |
Framework | |
Tensor Matched Kronecker-Structured Subspace Detection for Missing Information
Title | Tensor Matched Kronecker-Structured Subspace Detection for Missing Information |
Authors | Ishan Jindal, Matthew Nokleby |
Abstract | We consider the problem of detecting whether a tensor signal with many missing entries lies within a given low-dimensional Kronecker-Structured (KS) subspace. This is a matched subspace detection problem. The tensor matched subspace detection problem is more challenging because of the intertwined signal dimensions. We solve this problem by projecting the signal onto the Kronecker-structured subspace, which is a Kronecker product of different subspaces corresponding to each signal dimension. Under this framework, we define the KS subspaces and the orthogonal projection of the signal onto the KS subspace. We prove that reliable detection is possible as long as the cardinality of the missing signal is greater than the dimensions of the KS subspace, by bounding the residual energy of the sampled signal with high probability. |
Tasks | |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10957v1 |
http://arxiv.org/pdf/1810.10957v1.pdf | |
PWC | https://paperswithcode.com/paper/tensor-matched-kronecker-structured-subspace |
Repo | |
Framework | |
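The detection statistic the abstract describes — project the observed entries onto the KS subspace and test the residual energy — can be illustrated directly in NumPy. This is an illustrative sketch, not the paper's estimator: the mode dimensions, subspace ranks, and sampling set are arbitrary choices, and `np.linalg.lstsq` stands in for the orthogonal projection onto the rows of the Kronecker basis restricted to the observed indices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-mode orthonormal bases; the KS subspace is their Kronecker product.
U1, _ = np.linalg.qr(rng.standard_normal((6, 2)))  # mode-1 basis (rank 2)
U2, _ = np.linalg.qr(rng.standard_normal((5, 3)))  # mode-2 basis (rank 3)
U = np.kron(U1, U2)                                # 30 x 6 KS basis

# A signal inside the subspace, observed only on a random index set
# whose size (20) exceeds the subspace dimension (6).
x = U @ rng.standard_normal(6)
omega = rng.choice(30, size=20, replace=False)
U_om, x_om = U[omega], x[omega]

# Residual energy after projecting the observed entries onto the
# restricted basis: ~0 for an in-subspace signal.
coef, *_ = np.linalg.lstsq(U_om, x_om, rcond=None)
residual_in = float(np.sum((x_om - U_om @ coef) ** 2))

# A generic signal leaves substantial residual energy.
y_om = rng.standard_normal(20)
coef_y, *_ = np.linalg.lstsq(U_om, y_om, rcond=None)
residual_out = float(np.sum((y_om - U_om @ coef_y) ** 2))
```

Thresholding the residual energy then gives the in-subspace vs. out-of-subspace decision.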
Model-Based Learning of Turbulent Flows using a Mobile Robot
Title | Model-Based Learning of Turbulent Flows using a Mobile Robot |
Authors | Reza Khodayi-mehr, Michael M. Zavlanos |
Abstract | We consider the problem of model-based learning of turbulent flows using mobile robots. Specifically, we use empirical data to improve on numerical solutions obtained from Reynolds-Averaged Navier-Stokes (RANS) models. RANS models are computationally efficient but rely on assumptions that require experimental validation. Here we construct statistical models of the flow properties using Gaussian processes (GPs) and rely on the numerical solutions to inform their mean. Utilizing Bayesian inference, we incorporate measurements of the time-averaged velocity and turbulent intensity into these GPs. We account for model ambiguity and parameter uncertainty via Bayesian model selection, and for measurement noise by systematically incorporating it in the GPs. To collect the measurements, we control a custom-built mobile robot through a sequence of waypoints that maximize the information content of the measurements. The end result is a posterior distribution of the flow field that better approximates the real flow and quantifies the uncertainty in its properties. We experimentally demonstrate a considerable improvement in the prediction of these properties compared to numerical solutions. |
Tasks | Bayesian Inference, Gaussian Processes, Model Selection |
Published | 2018-12-10 |
URL | https://arxiv.org/abs/1812.03894v3 |
https://arxiv.org/pdf/1812.03894v3.pdf | |
PWC | https://paperswithcode.com/paper/model-based-learning-of-turbulent-flows-using |
Repo | |
Framework | |
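The central construction — a GP whose prior mean is the numerical (RANS) solution, updated with noisy robot measurements — can be sketched in one dimension. Everything below is a toy stand-in: the "RANS" curve, the model-error term, the RBF kernel with its length scale, and the waypoint locations are all assumptions. The conditioning step, however, is the standard GP posterior with a non-zero prior mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "flow": a stand-in RANS solution used as the GP prior mean,
# and a true flow that deviates from it by an unknown smooth model error.
xs = np.linspace(0.0, 1.0, 50)
rans_mean = np.sin(2 * np.pi * xs)
true_flow = rans_mean + 0.3 * np.cos(np.pi * xs)

def rbf(a, b, ell=0.3, var=0.5):
    """Squared-exponential covariance (an illustrative kernel choice)."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ell) ** 2)

# Noisy measurements at a handful of robot waypoints.
idx = np.array([5, 15, 25, 35, 45])
noise = 0.05
y = true_flow[idx] + noise * rng.standard_normal(idx.size)

# GP posterior mean with a non-zero prior mean: condition on the
# residuals (measurement minus prior mean) and add the mean back.
K = rbf(xs[idx], xs[idx]) + noise**2 * np.eye(idx.size)
Ks = rbf(xs, xs[idx])
post_mean = rans_mean + Ks @ np.linalg.solve(K, y - rans_mean[idx])

err_prior = float(np.mean((rans_mean - true_flow) ** 2))
err_post = float(np.mean((post_mean - true_flow) ** 2))
```

The posterior corrects the numerical solution wherever measurements disagree with it, which is the improvement the abstract reports.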
Recursive Optimization of Convex Risk Measures: Mean-Semideviation Models
Title | Recursive Optimization of Convex Risk Measures: Mean-Semideviation Models |
Authors | Dionysios S. Kalogerias, Warren B. Powell |
Abstract | We develop recursive, data-driven, stochastic subgradient methods for optimizing a new, versatile, and application-driven class of convex risk measures, termed here as mean-semideviations, strictly generalizing the well-known and popular mean-upper-semideviation. We introduce the MESSAGEp algorithm, which is an efficient compositional subgradient procedure for iteratively solving convex mean-semideviation risk-averse problems to optimality. We analyze the asymptotic behavior of the MESSAGEp algorithm under a flexible and structure-exploiting set of problem assumptions. In particular: 1) Under appropriate stepsize rules, we establish pathwise convergence of the MESSAGEp algorithm in a strong technical sense, confirming its asymptotic consistency. 2) Assuming a strongly convex cost, we show that, for fixed semideviation order $p>1$ and for $\epsilon\in\left[0,1\right)$, the MESSAGEp algorithm achieves a squared-${\cal L}_{2}$ solution suboptimality rate of the order of ${\cal O}(n^{-\left(1-\epsilon\right)/2})$ in the number of iterations $n$, where, for $\epsilon>0$, pathwise convergence is simultaneously guaranteed. This result establishes a rate of order arbitrarily close to ${\cal O}(n^{-1/2})$, while ensuring strongly stable pathwise operation. For $p\equiv1$, the rate order improves to ${\cal O}(n^{-2/3})$, which also suffices for pathwise convergence, and matches previous results. 3) Likewise, in the general case of a convex cost, we show that, for any $\epsilon\in\left[0,1\right)$, the MESSAGEp algorithm with iterate smoothing achieves an ${\cal L}_{1}$ objective suboptimality rate of the order of ${\cal O}(n^{-\left(1-\epsilon\right)/\left(4\mathbf{1}_{\left\{ p>1\right\} }+4\right)})$ in the number of iterations. This result provides maximal rates of ${\cal O}(n^{-1/4})$, if $p\equiv1$, and ${\cal O}(n^{-1/8})$, if $p>1$, matching the state of the art, as well. |
Tasks | |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00636v5 |
http://arxiv.org/pdf/1804.00636v5.pdf | |
PWC | https://paperswithcode.com/paper/recursive-optimization-of-convex-risk |
Repo | |
Framework | |
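For concreteness, the mean-upper-semideviation that the paper's mean-semideviation class strictly generalizes is easy to compute empirically: rho(X) = E[X] + c * E[((X - E[X])_+)^p]^(1/p) with c in [0, 1] and p >= 1. The sketch below (the sample sizes and the two Gaussian loss distributions are illustrative choices, not from the paper) shows that it penalizes dispersion above the mean.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_upper_semideviation(losses, c=0.5, p=1):
    """rho(X) = E[X] + c * E[((X - E[X])_+)^p]^(1/p), c in [0, 1], p >= 1.
    Empirical version of the classic risk measure; the paper's
    mean-semideviation class strictly generalizes it."""
    m = losses.mean()
    upper = np.maximum(losses - m, 0.0)  # only deviations ABOVE the mean
    return float(m + c * (upper ** p).mean() ** (1.0 / p))

# Two loss distributions with the same mean but different dispersion:
# a risk-averse measure prefers the concentrated one.
safe = rng.normal(1.0, 0.1, 100_000)
risky = rng.normal(1.0, 1.0, 100_000)

rho_safe = mean_upper_semideviation(safe, c=0.5, p=2)
rho_risky = mean_upper_semideviation(risky, c=0.5, p=2)
```

The MESSAGEp algorithm itself optimizes such compositional objectives by stochastic subgradient steps, which this static evaluation does not attempt to reproduce.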
RealPoint3D: Point Cloud Generation from a Single Image with Complex Background
Title | RealPoint3D: Point Cloud Generation from a Single Image with Complex Background |
Authors | Yan Xia, Yang Zhang, Dingfu Zhou, Xinyu Huang, Cheng Wang, Ruigang Yang |
Abstract | 3D point cloud generation from a single image by a deep neural network has been attracting more and more researchers’ attention. However, recently proposed methods require the objects to be captured with relatively clean backgrounds and fixed viewpoints, which severely limits their application in real environments. To overcome these drawbacks, we propose to integrate prior 3D shape knowledge into the network to guide the 3D generation. By taking additional 3D information, the proposed network can handle 3D object generation from a single real image captured from any viewpoint and with a complex background. Specifically, given a query image, we retrieve the nearest shape model from a pre-prepared 3D model database. Then, the image together with the retrieved shape model is fed into the proposed network to generate the fine-grained 3D point cloud. The effectiveness of our proposed framework has been verified on different kinds of datasets. Experimental results show that the proposed framework achieves state-of-the-art accuracy compared to other volumetric-based and point set generation methods. Furthermore, the proposed framework works well for real images with complex backgrounds and various view angles. |
Tasks | Point Cloud Generation |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02743v1 |
http://arxiv.org/pdf/1809.02743v1.pdf | |
PWC | https://paperswithcode.com/paper/realpoint3d-point-cloud-generation-from-a |
Repo | |
Framework | |
Gradient Boosting With Piece-Wise Linear Regression Trees
Title | Gradient Boosting With Piece-Wise Linear Regression Trees |
Authors | Yu Shi, Jian Li, Zhize Li |
Abstract | Gradient Boosted Decision Trees (GBDT) is a very successful ensemble learning algorithm widely used across a variety of applications. Recently, several variants of GBDT training algorithms and implementations have been designed and heavily optimized in some very popular open-source toolkits including XGBoost, LightGBM and CatBoost. In this paper, we show that both the accuracy and efficiency of GBDT can be further enhanced by using more complex base learners. Specifically, we extend gradient boosting to use piecewise linear regression trees (PL Trees), instead of piecewise constant regression trees, as base learners. We show that PL Trees can accelerate the convergence of GBDT and improve its accuracy. We also propose some optimization tricks to substantially reduce the training time of PL Trees, with little sacrifice of accuracy. Moreover, we propose several implementation techniques to speed up our algorithm on modern computer architectures with powerful Single Instruction Multiple Data (SIMD) parallelism. The experimental results show that GBDT with PL Trees can provide very competitive testing accuracy with comparable or shorter training time. |
Tasks | |
Published | 2018-02-15 |
URL | https://arxiv.org/abs/1802.05640v3 |
https://arxiv.org/pdf/1802.05640v3.pdf | |
PWC | https://paperswithcode.com/paper/gradient-boosting-with-piece-wise-linear |
Repo | |
Framework | |
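The paper's key change to GBDT — linear models in the leaves instead of constants — can be seen on a toy example. The sketch below is a single boosting step with a depth-1 "PL tree" whose split point is fixed at the known kink; the real algorithm searches splits and runs many rounds, so the data, the split, and the unit learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression target with a kink at 0, which a piecewise *linear*
# leaf captures exactly while a piecewise-constant leaf cannot.
x = np.sort(rng.uniform(-1, 1, 200))
y = np.where(x < 0, -0.5 * x, 2.0 * x)

def fit_linear(xs, ys):
    """Least-squares line a*x + b for one leaf."""
    A = np.stack([xs, np.ones_like(xs)], axis=1)
    coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coef

def pl_stump_predict(x_tr, resid, x_eval, split=0.0):
    """Depth-1 'PL Tree': split at `split` (fixed here for illustration),
    then fit an independent linear model in each leaf."""
    left = x_tr < split
    a_l, b_l = fit_linear(x_tr[left], resid[left])
    a_r, b_r = fit_linear(x_tr[~left], resid[~left])
    return np.where(x_eval < split, a_l * x_eval + b_l, a_r * x_eval + b_r)

# One gradient-boosting step on squared loss (fit the residual), lr = 1.
pred = np.zeros_like(y)
pred = pred + pl_stump_predict(x, y - pred, x)
mse_pl = float(np.mean((y - pred) ** 2))

# Classic constant-leaf stump for comparison.
const_pred = np.where(x < 0, y[x < 0].mean(), y[x >= 0].mean())
mse_const = float(np.mean((y - const_pred) ** 2))
```

On this piecewise-linear target, one PL-tree round already reaches (numerically) zero error, while the constant-leaf stump would need many more rounds — the convergence-acceleration effect the abstract claims.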
S4ND: Single-Shot Single-Scale Lung Nodule Detection
Title | S4ND: Single-Shot Single-Scale Lung Nodule Detection |
Authors | Naji Khosravan, Ulas Bagci |
Abstract | State-of-the-art lung nodule detection studies rely on computationally expensive multi-stage frameworks to detect nodules from CT scans. To address this computational challenge and provide better performance, in this paper we propose S4ND, a new deep-learning-based method for lung nodule detection. Our approach uses a single feed-forward pass of a single network for detection and provides better performance when compared to the current literature. The whole detection pipeline is designed as a single $3D$ Convolutional Neural Network (CNN) with dense connections, trained in an end-to-end manner. S4ND does not require any further post-processing or user guidance to refine detection results. Experimentally, we compared our network with the current state-of-the-art object detection network (SSD) in computer vision as well as the state-of-the-art published method for lung nodule detection (3D DCNN). We used $888$ publicly available CT scans from the LUNA challenge dataset and showed that the proposed method outperforms the current literature both in terms of efficiency and accuracy, achieving an average FROC score of $0.897$. We also provide an in-depth analysis of our proposed network to shed light on the unclear paradigms of tiny object detection. |
Tasks | Lung Nodule Detection, Object Detection |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.02279v2 |
http://arxiv.org/pdf/1805.02279v2.pdf | |
PWC | https://paperswithcode.com/paper/s4nd-single-shot-single-scale-lung-nodule |
Repo | |
Framework | |
A family of OWA operators based on Faulhaber’s formulas
Title | A family of OWA operators based on Faulhaber’s formulas |
Authors | Oscar Duarte, Sandra Téllez |
Abstract | In this paper we develop a new family of Ordered Weighted Averaging (OWA) operators. The weight vector is obtained from a desired orness of the operator. Using Faulhaber’s formulas we obtain direct and simple expressions for the weight vector without any iteration loop. With the exception of one weight, the remaining weights follow a straight-line relation. As a result, a fast and robust algorithm is developed. The resulting weight vector is suboptimal according to the Maximum Entropy criterion, but it is very close to the optimal. Comparisons are made with other procedures. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10545v1 |
http://arxiv.org/pdf/1801.10545v1.pdf | |
PWC | https://paperswithcode.com/paper/a-family-of-owa-operators-based-on-faulhabers |
Repo | |
Framework | |
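The two definitions the abstract relies on — the OWA aggregation itself and the orness of a weight vector — are short enough to state in code. This sketch implements the standard textbook definitions, not the paper's Faulhaber-based weight-generation formula.

```python
import numpy as np

def owa(values, weights):
    """OWA: the i-th weight multiplies the i-th *largest* value,
    so weights act on ranks, not on fixed argument positions."""
    v = np.sort(np.asarray(values, dtype=float))[::-1]
    return float(np.dot(weights, v))

def orness(weights):
    """orness(w) = (1/(n-1)) * sum_{i=1..n} (n - i) * w_i.
    1 -> max operator, 0.5 -> arithmetic mean, 0 -> min operator."""
    w = np.asarray(weights, dtype=float)
    n = w.size
    return float(np.sum((n - np.arange(1, n + 1)) * w) / (n - 1))

x = [3.0, 9.0, 6.0]
w_max = [1.0, 0.0, 0.0]        # all weight on the largest value
w_avg = [1 / 3, 1 / 3, 1 / 3]  # uniform weights
```

The paper's contribution is the inverse problem: given a target orness, produce the weight vector in closed form; the `orness` function above is what that construction must satisfy.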
Towards A Unified Analysis of Random Fourier Features
Title | Towards A Unified Analysis of Random Fourier Features |
Authors | Zhu Li, Jean-Francois Ton, Dino Oglic, Dino Sejdinovic |
Abstract | Random Fourier features is a widely used, simple, and effective technique for scaling up kernel methods. The existing theoretical analysis of the approach, however, remains focused on specific learning tasks and typically gives pessimistic bounds which are at odds with the empirical results. We tackle these problems and provide the first unified risk analysis of learning with random Fourier features using the squared error and Lipschitz continuous loss functions. In our bounds, the trade-off between the computational cost and the expected risk convergence rate is problem specific and expressed in terms of the regularization parameter and the \emph{number of effective degrees of freedom}. We study both the standard random Fourier features method for which we improve the existing bounds on the number of features required to guarantee the corresponding minimax risk convergence rate of kernel ridge regression, as well as a data-dependent modification which samples features proportional to \emph{ridge leverage scores} and further reduces the required number of features. As ridge leverage scores are expensive to compute, we devise a simple approximation scheme which provably reduces the computational cost without loss of statistical efficiency. |
Tasks | |
Published | 2018-06-24 |
URL | https://arxiv.org/abs/1806.09178v4 |
https://arxiv.org/pdf/1806.09178v4.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-unified-analysis-of-random-fourier |
Repo | |
Framework | |
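The method under analysis, standard random Fourier features, is compact enough to sketch: sample frequencies from the Gaussian kernel's spectral density, and the feature map z(x) = sqrt(2/D) cos(W^T x + b) satisfies E[z(x)^T z(y)] = exp(-gamma ||x - y||^2). The dimensions and gamma below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_features(X, n_features, gamma):
    """Random Fourier features for the Gaussian kernel
    k(x, y) = exp(-gamma * ||x - y||^2): by Bochner's theorem the
    frequencies are drawn from the spectral density N(0, 2*gamma*I)."""
    d = X.shape[1]
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.standard_normal((50, 3))
gamma = 0.5
Z = rff_features(X, n_features=5000, gamma=gamma)

# Exact Gaussian kernel matrix vs. its low-dimensional approximation.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K_exact = np.exp(-gamma * sq_dists)
K_approx = Z @ Z.T
max_err = float(np.abs(K_exact - K_approx).max())
```

The paper's question is how small `n_features` can be made (uniformly, or via ridge leverage score sampling) while preserving the learning rates of the exact kernel method.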
To understand deep learning we need to understand kernel learning
Title | To understand deep learning we need to understand kernel learning |
Authors | Mikhail Belkin, Siyuan Ma, Soumik Mandal |
Abstract | Generalization performance of classifiers in deep learning has recently become a subject of intense study. Deep models, typically over-parametrized, tend to fit the training data exactly. Despite this “overfitting”, they perform well on test data, a phenomenon not yet fully understood. The first point of our paper is that strong performance of overfitted classifiers is not a unique feature of deep learning. Using six real-world and two synthetic datasets, we establish experimentally that kernel machines trained to have zero classification or near-zero regression error perform very well on test data, even when the labels are corrupted with a high level of noise. We proceed to give a lower bound on the norm of zero-loss solutions for smooth kernels, showing that they increase nearly exponentially with data size. We point out that this is difficult to reconcile with the existing generalization bounds. Moreover, none of the bounds produce non-trivial results for interpolating solutions. Second, we show experimentally that (non-smooth) Laplacian kernels easily fit random labels, a finding that parallels results for ReLU neural networks. In contrast, fitting noisy data requires many more epochs for smooth Gaussian kernels. The similar performance of overfitted Laplacian and Gaussian classifiers on test data suggests that generalization is tied to the properties of the kernel function rather than the optimization process. Certain key phenomena of deep learning are manifested similarly in kernel methods in the modern “overfitted” regime. The combination of the experimental and theoretical results presented in this paper indicates a need for new theoretical ideas for understanding properties of classical kernel methods. We argue that progress on understanding deep learning will be difficult until more tractable “shallow” kernel methods are better understood. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01396v3 |
http://arxiv.org/pdf/1802.01396v3.pdf | |
PWC | https://paperswithcode.com/paper/to-understand-deep-learning-we-need-to |
Repo | |
Framework | |
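One of the paper's experiments — a (non-smooth) Laplacian kernel machine driven to zero training error on random labels — is reproducible in a few lines of ridgeless kernel regression. The data sizes and bandwidth below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random inputs with completely random +/-1 labels: nothing to "learn".
X = rng.standard_normal((80, 5))
y = rng.choice([-1.0, 1.0], size=80)

def laplacian_kernel(A, B, sigma=2.0):
    """k(x, y) = exp(-||x - y|| / sigma); non-smooth at x = y."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.exp(-d / sigma)

# Ridgeless ("interpolating") kernel regression: solve K alpha = y with
# no regularization, so the machine fits even random labels exactly.
K = laplacian_kernel(X, X)
alpha = np.linalg.solve(K, y)
train_err = float(np.max(np.abs(K @ alpha - y)))
```

A direct solve reaches the interpolating solution in one step; the paper's contrast is that gradient-based training with a smooth Gaussian kernel needs many more epochs to get there.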
DenseRAN for Offline Handwritten Chinese Character Recognition
Title | DenseRAN for Offline Handwritten Chinese Character Recognition |
Authors | Wenchao Wang, Jianshu Zhang, Jun Du, Zi-Rui Wang, Yixing Zhu |
Abstract | Recently, great success has been achieved in offline handwritten Chinese character recognition by using deep learning methods. Chinese characters are mainly logographic and consist of basic radicals; however, previous research mostly treated each Chinese character as a whole without explicitly considering its internal two-dimensional structure and radicals. In this study, we propose a novel radical analysis network with a densely connected architecture (DenseRAN) to analyze Chinese character radicals and their two-dimensional structures simultaneously. DenseRAN first encodes the input image into high-level visual features by employing DenseNet as an encoder. Then a decoder based on recurrent neural networks is employed, aiming at generating captions of Chinese characters by detecting radicals and two-dimensional structures through an attention mechanism. Treating a Chinese character as a composition of two-dimensional structures and radicals reduces the size of the vocabulary and enables DenseRAN to recognize unseen Chinese character classes, provided the corresponding radicals have been seen in the training set. Evaluated on the ICDAR-2013 competition database, the proposed approach significantly outperforms the whole-character modeling approach with a relative character error rate (CER) reduction of 18.54%. Meanwhile, for the case of recognizing 3277 unseen Chinese characters in the CASIA-HWDB1.2 database, DenseRAN can achieve a character accuracy of about 41%, while the traditional whole-character method has no capability to handle them. |
Tasks | Offline Handwritten Chinese Character Recognition |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04134v1 |
http://arxiv.org/pdf/1808.04134v1.pdf | |
PWC | https://paperswithcode.com/paper/denseran-for-offline-handwritten-chinese |
Repo | |
Framework | |
The EcoLexicon English Corpus as an open corpus in Sketch Engine
Title | The EcoLexicon English Corpus as an open corpus in Sketch Engine |
Authors | Pilar Leon-Arauz, Antonio San Martin, Arianne Reimerink |
Abstract | The EcoLexicon English Corpus (EEC) is a 23.1-million-word corpus of contemporary environmental texts. It was compiled by the LexiCon research group for the development of EcoLexicon (Faber, Leon-Arauz & Reimerink 2016; San Martin et al. 2017), a terminological knowledge base on the environment. It is available as an open corpus in the well-known corpus query system Sketch Engine (Kilgarriff et al. 2014), which means that any user, even without a subscription, can freely access and query the corpus. In this paper, the EEC is introduced by describing how it was built and compiled and how it can be queried and exploited, based both on the functionalities provided by Sketch Engine and on the parameters according to which the texts in the EEC are classified. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05797v1 |
http://arxiv.org/pdf/1807.05797v1.pdf | |
PWC | https://paperswithcode.com/paper/the-ecolexicon-english-corpus-as-an-open |
Repo | |
Framework | |