January 27, 2020

3518 words 17 mins read

Paper Group ANR 1112


Dense 3D Reconstruction for Visual Tunnel Inspection using Unmanned Aerial Vehicle

Title Dense 3D Reconstruction for Visual Tunnel Inspection using Unmanned Aerial Vehicle
Authors Ramanpreet Singh Pahwa, Kennard Yanting Chan, Jiamin Bai, Vincensius Billy Saputra, Minh N. Do, Shaohui Foong
Abstract Advances in Unmanned Aerial Vehicles (UAVs) open avenues for applications such as tunnel inspection. Owing to their ability to fly inside tunnels, UAVs can quickly identify defects and potential safety problems. However, long tunnels, especially those with repetitive or uniform structures, pose a significant problem for UAV navigation. Furthermore, the visual data from the camera mounted on the UAV must be post-processed to generate useful information for the inspection task. In this work, we design a UAV with a single rotating camera to accomplish the task. Compared to other platforms, our solution meets the stringent requirements of tunnel inspection in terms of battery life, size, and weight. While current state-of-the-art methods can estimate camera pose and 3D geometry from a sequence of images, they assume large overlap, small rotational motion, and many distinct matching points between images. These assumptions severely limit their effectiveness in tunnel-like scenarios where the camera has erratic or large rotational motion, such as the one mounted on the UAV. This paper presents a novel solution which exploits Structure-from-Motion, Bundle Adjustment, and available geometry priors to robustly estimate camera pose and automatically reconstruct a fully-dense 3D scene using the least possible number of images in various challenging tunnel-like environments. We validate our system with both a Virtual Reality application and experimentation with a real dataset. The results demonstrate that the proposed reconstruction along with texture mapping allows for remote navigation and inspection of tunnel-like environments, even those which are inaccessible for humans.
Tasks 3D Reconstruction
Published 2019-11-09
URL https://arxiv.org/abs/1911.03603v1
PDF https://arxiv.org/pdf/1911.03603v1.pdf
PWC https://paperswithcode.com/paper/dense-3d-reconstruction-for-visual-tunnel
Repo
Framework
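
The pipeline described in this entry relies on bundle adjustment to refine camera poses and 3D points. As a rough illustration of that step (not the authors' implementation), the sketch below minimizes total reprojection error over a toy set of cameras and points with SciPy. The pinhole model, fixed identity rotations, and the random synthetic data are all assumptions made to keep the example short.

```python
# Minimal bundle-adjustment sketch: jointly refine camera translations and 3D
# points by minimizing reprojection error. Rotations are held fixed (identity)
# and the first camera is pinned at the origin purely to keep the example small;
# a real system would also optimize rotations and handle the full gauge.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
n_cams, n_pts, f = 3, 20, 500.0                     # toy problem size, focal length (px)

pts_gt = rng.uniform([-1, -1, 4], [1, 1, 6], size=(n_pts, 3))    # true 3D points
cam_t_gt = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])  # true camera centers

def project(points, cam_t):
    """Pinhole projection of world points into a camera at translation cam_t."""
    p = points - cam_t                               # identity rotation assumed
    return f * p[:, :2] / p[:, 2:3]

# Observed 2D measurements with a little noise.
obs = np.stack([project(pts_gt, t) for t in cam_t_gt])
obs += rng.normal(0, 0.5, obs.shape)

def residuals(x):
    cams = np.vstack([np.zeros(3), x[:6].reshape(2, 3)])   # first camera fixed at origin
    pts = x[6:].reshape(n_pts, 3)
    return np.concatenate([(project(pts, cams[i]) - obs[i]).ravel() for i in range(n_cams)])

# Start from perturbed estimates and refine with nonlinear least squares.
x0 = np.concatenate([
    (cam_t_gt[1:] + rng.normal(0, 0.05, (2, 3))).ravel(),
    (pts_gt + rng.normal(0, 0.10, pts_gt.shape)).ravel(),
])
sol = least_squares(residuals, x0)
print("RMS reprojection error (px):", np.sqrt(np.mean(sol.fun ** 2)))
```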

CNN-based Prostate Zonal Segmentation on T2-weighted MR Images: A Cross-dataset Study

Title CNN-based Prostate Zonal Segmentation on T2-weighted MR Images: A Cross-dataset Study
Authors Leonardo Rundo, Changhee Han, Jin Zhang, Ryuichiro Hataya, Yudai Nagano, Carmelo Militello, Claudio Ferretti, Marco S. Nobile, Andrea Tangherloni, Maria Carla Gilardi, Salvatore Vitabile, Hideki Nakayama, Giancarlo Mauri
Abstract Prostate cancer is the most common cancer among US men. However, prostate imaging is still challenging despite the advances in multi-parametric Magnetic Resonance Imaging (MRI), which provides both morphologic and functional information pertaining to the pathological regions. Along with whole prostate gland segmentation, distinguishing between the Central Gland (CG) and Peripheral Zone (PZ) can guide towards differential diagnosis, since the frequency and severity of tumors differ in these regions; however, their boundary is often weak and fuzzy. This work presents a preliminary study on Deep Learning to automatically delineate the CG and PZ, aiming at evaluating the generalization ability of Convolutional Neural Networks (CNNs) on two multi-centric MRI prostate datasets. In particular, we compared three CNN-based architectures: SegNet, U-Net, and pix2pix. In such a context, the segmentation performances achieved with/without pre-training were compared in 4-fold cross-validation. In general, U-Net outperforms the other methods, especially when training and testing are performed on multiple datasets.
Tasks
Published 2019-03-29
URL http://arxiv.org/abs/1903.12571v1
PDF http://arxiv.org/pdf/1903.12571v1.pdf
PWC https://paperswithcode.com/paper/cnn-based-prostate-zonal-segmentation-on-t2
Repo
Framework
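
For readers unfamiliar with the architectures compared in this entry, the sketch below is a deliberately tiny U-Net-style encoder-decoder in PyTorch, with two resolution levels and three output classes (background, CG, PZ). It illustrates the general architecture family only; the depth, channel counts, and training setup used in the paper are not reproduced.

```python
# A deliberately tiny U-Net-style network (2 levels) for 3-class segmentation
# (background / central gland / peripheral zone). Illustrative only.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=3):
        super().__init__()
        self.enc1 = conv_block(in_ch, 16)
        self.enc2 = conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec2 = conv_block(64, 32)
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)                                  # per-pixel class logits

# Shape check on a dummy 128x128 T2-weighted slice.
logits = TinyUNet()(torch.randn(1, 1, 128, 128))
print(logits.shape)  # torch.Size([1, 3, 128, 128])
```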

Early detection of sepsis utilizing deep learning on electronic health record event sequences

Title Early detection of sepsis utilizing deep learning on electronic health record event sequences
Authors Simon Meyer Lauritsen, Mads Ellersgaard Kalør, Emil Lund Kongsgaard, Katrine Meyer Lauritsen, Marianne Johansson Jørgensen, Jeppe Lange, Bo Thiesson
Abstract The timeliness of detection of a sepsis event in progress is a crucial factor in the outcome for the patient. Machine learning models built from data in electronic health records can be used as an effective tool for improving this timeliness, but so far the potential for clinical implementations has been largely limited to studies in intensive care units. This study will employ a richer data set that will expand the applicability of these models beyond intensive care units. Furthermore, we will circumvent several important limitations that have been found in the literature: 1) Models are evaluated shortly before sepsis onset without considering interventions already initiated. 2) Machine learning models are built on a restricted set of clinical parameters, which are not necessarily measured in all departments. 3) Model performance is limited by current knowledge of sepsis, as feature interactions and time dependencies are hardcoded into the model. In this study, we present a model to overcome these shortcomings using a deep learning approach on a diverse multicenter data set. We used retrospective data from multiple Danish hospitals over a seven-year period. Our sepsis detection system is constructed as a combination of a convolutional neural network and a long short-term memory network. We suggest a retrospective assessment of interventions by looking at intravenous antibiotics and blood cultures preceding the prediction time. Results show performance ranging from AUROC 0.856 (3 hours before sepsis onset) to AUROC 0.756 (24 hours before sepsis onset). We present a deep learning system for early detection of sepsis that is able to learn characteristics of the key factors and interactions from the raw event sequence data itself, without relying on labor-intensive feature extraction work.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.02956v1
PDF https://arxiv.org/pdf/1906.02956v1.pdf
PWC https://paperswithcode.com/paper/early-detection-of-sepsis-utilizing-deep
Repo
Framework
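
The model in this entry combines a convolutional network with an LSTM over event sequences. The sketch below shows one common way to wire such a combination in PyTorch: embed event codes, run 1D convolutions over time, pass the result through an LSTM, and emit a sigmoid risk score. The embedding size, channel counts, and readout choice are assumptions for illustration, not the paper's configuration.

```python
# Hedged sketch of a CNN + LSTM sequence classifier for clinical event streams.
# Hyperparameters are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class CNNLSTMSepsis(nn.Module):
    def __init__(self, n_event_types=1000, emb=32, conv_ch=64, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_event_types, emb, padding_idx=0)
        self.conv = nn.Sequential(
            nn.Conv1d(emb, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(conv_ch, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.lstm = nn.LSTM(conv_ch, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, event_ids):                    # (batch, seq_len) integer event codes
        x = self.embed(event_ids)                    # (batch, seq_len, emb)
        x = self.conv(x.transpose(1, 2))             # convolve over time: (batch, ch, seq_len)
        out, _ = self.lstm(x.transpose(1, 2))        # back to (batch, seq_len, ch)
        return torch.sigmoid(self.head(out[:, -1]))  # risk score from the last time step

model = CNNLSTMSepsis()
risk = model(torch.randint(1, 1000, (4, 200)))       # 4 patients, 200 events each
print(risk.shape)                                     # torch.Size([4, 1])
```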

Learning the Depths of Moving People by Watching Frozen People

Title Learning the Depths of Moving People by Watching Frozen People
Authors Zhengqi Li, Tali Dekel, Forrester Cole, Richard Tucker, Noah Snavely, Ce Liu, William T. Freeman
Abstract We present a method for predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving. Existing methods for recovering depth for dynamic, non-rigid objects from monocular video impose strong assumptions on the objects’ motion and may only recover sparse depth. In this paper, we take a data-driven approach and learn human depth priors from a new source of data: thousands of Internet videos of people imitating mannequins, i.e., freezing in diverse, natural poses, while a hand-held camera tours the scene. Because people are stationary, training data can be generated using multi-view stereo reconstruction. At inference time, our method uses motion parallax cues from the static areas of the scenes to guide the depth prediction. We demonstrate our method on real-world sequences of complex human actions captured by a moving hand-held camera, show improvement over state-of-the-art monocular depth prediction methods, and show various 3D effects produced using our predicted depth.
Tasks Depth Estimation
Published 2019-04-25
URL http://arxiv.org/abs/1904.11111v1
PDF http://arxiv.org/pdf/1904.11111v1.pdf
PWC https://paperswithcode.com/paper/learning-the-depths-of-moving-people-by
Repo
Framework
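
Since the training depth in this entry comes from multi-view stereo on the "frozen" scenes, a typical supervised setup only penalizes pixels where MVS produced a valid depth. The snippet below is an assumed illustration of such a masked, scale-invariant log-depth loss; it is a common choice for monocular depth training, not the authors' exact objective.

```python
# Assumed illustration: a masked scale-invariant loss in log-depth space, where
# supervision is applied only at pixels with valid multi-view-stereo depth.
import torch

def masked_si_log_loss(pred, target, valid, eps=1e-6):
    """pred, target: (B, H, W) positive depths; valid: (B, H, W) bool mask."""
    d = torch.log(pred + eps) - torch.log(target + eps)
    d = d * valid                                    # zero out invalid pixels
    n = valid.sum(dim=(1, 2)).clamp(min=1).float()
    mse = (d ** 2).sum(dim=(1, 2)) / n
    mean_sq = (d.sum(dim=(1, 2)) / n) ** 2
    return (mse - 0.5 * mean_sq).mean()              # scale-invariant term per image

pred = torch.rand(2, 64, 64) + 0.5
target = torch.rand(2, 64, 64) + 0.5
valid = torch.rand(2, 64, 64) > 0.3                  # e.g. pixels where MVS succeeded
print(masked_si_log_loss(pred, target, valid))
```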

The Alt-Right and Global Information Warfare

Title The Alt-Right and Global Information Warfare
Authors Emmi Bevensee, Alexander Reid Ross
Abstract The Alt-Right is a neo-fascist white supremacist movement that is involved in violent extremism and shows signs of engagement in extensive disinformation campaigns. Using social media data mining, this study develops a deeper understanding of such targeted disinformation campaigns and the ways they spread. It also adds to the available literature on the endogenous and exogenous influences within the US far right, as well as motivating factors that drive disinformation campaigns, such as geopolitical strategy. This study is to be taken as a preliminary analysis to indicate future methods and follow-on research that will help develop an integrated approach to understanding the strategies and associations of the modern fascist movement.
Tasks
Published 2019-05-07
URL https://arxiv.org/abs/1905.02712v1
PDF https://arxiv.org/pdf/1905.02712v1.pdf
PWC https://paperswithcode.com/paper/the-alt-right-and-global-information-warfare
Repo
Framework

Fast Intent Classification for Spoken Language Understanding

Title Fast Intent Classification for Spoken Language Understanding
Authors Akshit Tyagi, Varun Sharma, Rahul Gupta, Lynn Samson, Nan Zhuang, Zihang Wang, Bill Campbell
Abstract Spoken Language Understanding (SLU) systems consist of several machine learning components operating together (e.g. intent classification, named entity recognition and resolution). Deep learning models have obtained state of the art results on several of these tasks, largely attributed to their better modeling capacity. However, an increase in modeling capacity comes with added costs of higher latency and energy usage, particularly when operating on low complexity devices. To address the latency and computational complexity issues, we explore a BranchyNet scheme for intent classification within SLU systems. The BranchyNet scheme, when applied to a high complexity model, adds exit points at various stages in the model allowing early decision making for a set of queries to the SLU model. We conduct experiments on the Facebook Semantic Parsing dataset with two candidate model architectures for intent classification. Our experiments show that the BranchyNet scheme provides gains in terms of computational complexity without compromising model accuracy. We also conduct analytical studies regarding the improvements in the computational cost, distribution of utterances that egress from various exit points and the impact of adding more complexity to models with the BranchyNet scheme.
Tasks Decision Making, Intent Classification, Named Entity Recognition, Semantic Parsing, Spoken Language Understanding
Published 2019-12-03
URL https://arxiv.org/abs/1912.01728v2
PDF https://arxiv.org/pdf/1912.01728v2.pdf
PWC https://paperswithcode.com/paper/fast-intent-classification-for-spoken
Repo
Framework
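
The BranchyNet idea in this entry attaches classifiers ("exits") at intermediate layers so that easy queries can stop early. The sketch below shows the control flow of that idea on a toy text classifier; the two-exit layout, the entropy threshold, and all sizes are assumptions made for illustration, and the joint training of all exits is omitted.

```python
# Toy early-exit ("BranchyNet"-style) intent classifier: an exit head after the
# first encoder block lets confident examples leave early at inference time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitClassifier(nn.Module):
    def __init__(self, vocab=5000, emb=64, n_intents=10, entropy_threshold=0.5):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, emb)           # simple bag-of-words encoder
        self.block1 = nn.Sequential(nn.Linear(emb, emb), nn.ReLU())
        self.exit1 = nn.Linear(emb, n_intents)             # early exit head
        self.block2 = nn.Sequential(nn.Linear(emb, emb), nn.ReLU())
        self.exit2 = nn.Linear(emb, n_intents)             # final head
        self.entropy_threshold = entropy_threshold

    def forward(self, token_ids):
        h = self.block1(self.embed(token_ids))
        logits1 = self.exit1(h)
        if not self.training:                               # early decision at inference only
            p = F.softmax(logits1, dim=-1)
            entropy = -(p * p.clamp_min(1e-9).log()).sum(-1)
            if bool((entropy < self.entropy_threshold).all()):
                return logits1, "exit1"
        return self.exit2(self.block2(h)), "exit2"

model = EarlyExitClassifier().eval()
logits, taken = model(torch.randint(0, 5000, (1, 12)))      # one 12-token query
print(taken, logits.shape)
```

In practice both exits would be trained jointly with a weighted sum of their losses, so the early head is accurate enough to be trusted at its threshold.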

Semi-supervised GAN for Classification of Multispectral Imagery Acquired by UAVs

Title Semi-supervised GAN for Classification of Multispectral Imagery Acquired by UAVs
Authors Hamideh Kerdegari, Manzoor Razaak, Vasileios Argyriou, Paolo Remagnino
Abstract Unmanned aerial vehicles (UAVs) are used in precision agriculture (PA) to enable aerial monitoring of farmlands. Intelligent methods are required to pinpoint weed infestations and make an optimal choice of pesticide. A UAV can carry a multispectral camera and collect data. However, the classification of multispectral images using supervised machine learning algorithms such as convolutional neural networks (CNNs) requires a large amount of training data. This is a common drawback in deep learning that we try to circumvent by making use of a semi-supervised generative adversarial network (GAN), providing a pixel-wise classification for all the acquired multispectral images. Our algorithm consists of a generator network that provides photo-realistic images as extra training data to a multi-class classifier, acting as a discriminator and trained on small amounts of labeled data. The performance of the proposed method is evaluated on the weedNet dataset consisting of multispectral crop and weed images collected by a micro aerial vehicle (MAV). The proposed semi-supervised GAN achieves high classification accuracy and demonstrates the potential of GAN-based methods for the challenging task of multispectral image classification.
Tasks Image Classification
Published 2019-05-24
URL https://arxiv.org/abs/1905.10920v1
PDF https://arxiv.org/pdf/1905.10920v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-gan-for-classification-of
Repo
Framework
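
In a semi-supervised GAN of the kind described in this entry, the discriminator doubles as a classifier: it predicts one of the K real classes (e.g. crop vs. weed) plus an extra "fake" class for generator samples, so labeled and generated images both contribute to training. The snippet below sketches only that K+1-way discriminator head and its two losses on random placeholder data; the generator, unlabeled-data term, and the pixel-wise variant used in the paper are not reproduced.

```python
# Sketch of the semi-supervised GAN idea: a discriminator with K real classes
# plus one extra "fake" class, shown image-level rather than pixel-wise.
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 2                                      # e.g. crop vs. weed
disc = nn.Sequential(                      # tiny CNN discriminator/classifier
    nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, K + 1),                  # K real classes + 1 fake class
)

real_imgs = torch.randn(8, 4, 64, 64)      # 4-channel multispectral patches (toy data)
real_labels = torch.randint(0, K, (8,))
fake_imgs = torch.randn(8, 4, 64, 64)      # stand-in for generator output

logits_real = disc(real_imgs)
logits_fake = disc(fake_imgs)

supervised_loss = F.cross_entropy(logits_real, real_labels)                      # labeled data
fake_loss = F.cross_entropy(logits_fake, torch.full((8,), K, dtype=torch.long))  # "fake" class index K
print(float(supervised_loss), float(fake_loss))
```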

From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project

Title From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project
Authors Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz
Abstract AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy, but the rich variety of standardized exams has remained a landmark challenge. Even in 2016, the best AI system achieved merely 59.3% on an 8th Grade science exam challenge. This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam’s non-diagram, multiple choice (NDMC) questions. In addition, our Aristo system, building upon the success of recent language models, exceeded 83% on the corresponding Grade 12 Science Exam NDMC questions. The results, on unseen test questions, are robust across different test years and different variations of this kind of test. They demonstrate that modern NLP methods can result in mastery on this task. While not a full solution to general question-answering (the questions are multiple choice, and the domain is restricted to 8th Grade science), it represents a significant milestone for the field.
Tasks Question Answering
Published 2019-09-04
URL https://arxiv.org/abs/1909.01958v2
PDF https://arxiv.org/pdf/1909.01958v2.pdf
PWC https://paperswithcode.com/paper/from-f-to-a-on-the-ny-regents-science-exams
Repo
Framework

Plan, Write, and Revise: an Interactive System for Open-Domain Story Generation

Title Plan, Write, and Revise: an Interactive System for Open-Domain Story Generation
Authors Seraphina Goldfarb-Tarrant, Haining Feng, Nanyun Peng
Abstract Story composition is a challenging problem for machines and even for humans. We present a neural narrative generation system that interacts with humans to generate stories. Our system has different levels of human interaction, which enables us to understand at what stage of story-writing human collaboration is most productive, both for improving story quality and for human engagement in the writing process. We compare different varieties of interaction in story-writing, story-planning, and diversity controls under time constraints, and show that increased types of human collaboration at both planning and writing stages result in a 10-50% improvement in story quality as compared to less interactive baselines. We also show an accompanying increase in user engagement and satisfaction with stories as compared to our own less interactive systems and to previous turn-taking approaches to interaction. Finally, we find that humans tasked with collaboratively improving a particular characteristic of a story are in fact able to do so, which has implications for future uses of human-in-the-loop systems.
Tasks
Published 2019-04-04
URL https://arxiv.org/abs/1904.02357v3
PDF https://arxiv.org/pdf/1904.02357v3.pdf
PWC https://paperswithcode.com/paper/plan-write-and-revise-an-interactive-system
Repo
Framework

Increasing Iterate Averaging for Solving Saddle-Point Problems

Title Increasing Iterate Averaging for Solving Saddle-Point Problems
Authors Yuan Gao, Christian Kroer, Donald Goldfarb
Abstract Many problems in machine learning and game theory can be formulated as saddle-point problems, for which various first-order methods have been developed and proven efficient in practice. Under the general convex-concave assumption, most first-order methods only guarantee ergodic convergence, that is, convergence of the uniform averages of the iterates. However, numerically, the iterates themselves can sometimes converge much faster than the uniform averages. This observation motivates increasing averaging schemes that put more weight on later iterates, in contrast to the usual uniform averaging. We show that such increasing averaging schemes, applied to various first-order methods, are able to preserve the convergence of the averaged iterates with no additional assumptions or computational overhead. Extensive numerical experiments on various equilibrium computation and image denoising problems demonstrate the effectiveness of the increasing averaging schemes. In particular, the increasing averages consistently outperform the uniform averages in all test problems by orders of magnitude. When solving matrix games and extensive-form games, increasing averages consistently outperform the last iterate as well. For matrix games, a first-order method equipped with increasing averaging outperforms the highly competitive CFR$^+$ algorithm.
Tasks Denoising, Image Denoising
Published 2019-03-26
URL https://arxiv.org/abs/1903.10646v2
PDF https://arxiv.org/pdf/1903.10646v2.pdf
PWC https://paperswithcode.com/paper/first-order-methods-with-increasing-iterate
Repo
Framework
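
The key idea in this entry is simply to reweight the running average so that later iterates count more (for example, weight t at iteration t) instead of averaging uniformly. The snippet below illustrates both averaging schemes on a toy matrix game solved with multiplicative-weights updates; the game, step size, and linear weights are assumptions for illustration, not the first-order methods or guarantees studied in the paper.

```python
# Toy comparison of uniform vs. increasing (linearly weighted) iterate averaging
# on a small zero-sum matrix game solved with multiplicative-weights updates.
import numpy as np

A = np.array([[1.0, -1.0], [-1.0, 1.0]])       # matching-pennies payoff (row maximizes)
eta, T = 0.1, 2000
x = np.ones(2) / 2                              # row player's mixed strategy
y = np.ones(2) / 2                              # column player's mixed strategy
iterates = []

for t in range(1, T + 1):
    gx, gy = A @ y, A.T @ x                     # payoff gradients
    x = x * np.exp(eta * gx);  x /= x.sum()     # row player ascends
    y = y * np.exp(-eta * gy); y /= y.sum()     # column player descends
    iterates.append((x.copy(), y.copy()))

def gap(x_bar, y_bar):
    """Saddle-point (exploitability) gap: max_x x'A y_bar - min_y x_bar'A y."""
    return np.max(A @ y_bar) - np.min(A.T @ x_bar)

xs = np.array([p[0] for p in iterates]); ys = np.array([p[1] for p in iterates])
w_unif = np.ones(T) / T
w_inc = np.arange(1, T + 1) / np.arange(1, T + 1).sum()   # weight t at iteration t
print("uniform average gap:   ", gap(w_unif @ xs, w_unif @ ys))
print("increasing average gap:", gap(w_inc @ xs, w_inc @ ys))
print("last iterate gap:      ", gap(xs[-1], ys[-1]))
```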

Skin Lesion Segmentation and Classification for ISIC 2018 by Combining Deep CNN and Handcrafted Features

Title Skin Lesion Segmentation and Classification for ISIC 2018 by Combining Deep CNN and Handcrafted Features
Authors Redha Ali, Russell C. Hardie, Manawaduge Supun De Silva, Temesguen Messay Kebede
Abstract This short report describes our submission to the ISIC 2018 Challenge in Skin Lesion Analysis Towards Melanoma Detection for Task 1 and Task 3. This work was accomplished by a team of researchers at the University of Dayton Signal and Image Processing Lab. Our proposed approach is computationally efficient and combines information from both deep learning and handcrafted features. For Task 3, we form a new type of image features, called hybrid features, which have stronger discrimination ability than single-method features. These features are utilized as inputs to a decision-making model that is based on a multiclass Support Vector Machine (SVM) classifier. The proposed technique is evaluated on the online validation databases. Our score was 0.841 with the SVM classifier on the validation dataset.
Tasks Decision Making, Lesion Segmentation
Published 2019-08-14
URL https://arxiv.org/abs/1908.05730v1
PDF https://arxiv.org/pdf/1908.05730v1.pdf
PWC https://paperswithcode.com/paper/skin-lesion-segmentation-and-classification-2
Repo
Framework
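
The "hybrid features" in this entry are formed by concatenating deep CNN features with handcrafted descriptors and feeding them to a multiclass SVM. The sketch below shows that concatenate-and-classify pattern with scikit-learn on random placeholder arrays; the actual CNN backbone, handcrafted descriptors, and ISIC data are not included.

```python
# Concatenate deep features with handcrafted features ("hybrid features") and
# train a multiclass SVM on top. Placeholder arrays stand in for real features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
deep_feats = rng.normal(size=(n, 512))          # e.g. pooled CNN activations
handcrafted = rng.normal(size=(n, 64))          # e.g. color/texture statistics
labels = rng.integers(0, 7, size=n)             # 7 lesion classes as in ISIC Task 3

hybrid = np.hstack([deep_feats, handcrafted])   # the "hybrid feature" vector

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", decision_function_shape="ovr"))
clf.fit(hybrid[:150], labels[:150])
print("held-out accuracy on random data:", clf.score(hybrid[150:], labels[150:]))
```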

Concentration of kernel matrices with application to kernel spectral clustering

Title Concentration of kernel matrices with application to kernel spectral clustering
Authors Arash A. Amini, Zahra S. Razaee
Abstract We study the concentration of random kernel matrices around their mean. We derive nonasymptotic exponential concentration inequalities for Lipschitz kernels assuming that the data points are independent draws from a class of multivariate distributions on $\mathbb R^d$, including the strongly log-concave distributions under affine transformations. A feature of our result is that the data points need not have identical distributions or zero mean, which is key in certain applications such as clustering. Our bound for the Lipschitz kernels is dimension-free and sharp up to constants. For comparison, we also derive the companion result for the Euclidean (inner product) kernel for a class of sub-Gaussian distributions. A notable difference between the two cases is that, in contrast to the Euclidean kernel, in the Lipschitz case, the concentration inequality does not depend on the mean of the underlying vectors. As an application of these inequalities, we derive a bound on the misclassification rate of a kernel spectral clustering (KSC) algorithm, under a perturbed nonparametric mixture model. We show an example where this bound establishes the high-dimensional consistency (as $d \to \infty$) of the KSC, when applied with a Gaussian kernel, to a noisy model of nested nonlinear manifolds.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.03347v2
PDF https://arxiv.org/pdf/1909.03347v2.pdf
PWC https://paperswithcode.com/paper/concentration-of-kernel-matrices-with
Repo
Framework
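
As context for the application analyzed in this entry, kernel spectral clustering builds a (for example, Gaussian) kernel matrix on the data, takes the leading eigenvectors of its normalized version as an embedding, and runs k-means on that embedding. The snippet below is a bare-bones version of that standard recipe on synthetic nested-ring data; the bandwidth and data are illustrative, and this is not the exact algorithm or model analyzed in the paper.

```python
# Bare-bones kernel spectral clustering: Gaussian kernel matrix -> normalized
# eigenvectors -> k-means on the spectral embedding.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two noisy concentric circles (a simple nonlinear "nested manifold" example).
theta = rng.uniform(0, 2 * np.pi, 400)
radius = np.repeat([1.0, 3.0], 200)
X = np.c_[radius * np.cos(theta), radius * np.sin(theta)] + rng.normal(0, 0.1, (400, 2))

sigma = 0.5
K = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma ** 2))     # Gaussian kernel matrix

d = K.sum(axis=1)
L_sym = (K / np.sqrt(d)[:, None]) / np.sqrt(d)[None, :]        # D^{-1/2} K D^{-1/2}
eigvals, eigvecs = np.linalg.eigh(L_sym)
embedding = eigvecs[:, -2:]                                     # top-2 eigenvectors
embedding /= np.linalg.norm(embedding, axis=1, keepdims=True)   # row-normalize

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
print("cluster sizes:", np.bincount(clusters))
```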

Memory limitations are hidden in grammar

Title Memory limitations are hidden in grammar
Authors Carlos Gómez-Rodríguez, Morten H. Christiansen, Ramon Ferrer-i-Cancho
Abstract The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars that describe the syntactic dependencies between constituents, independent of the computational limitations of the human brain. Here, we evaluate this independence assumption by sampling sentences uniformly from the space of possible syntactic structures. We find that the average dependency distance between syntactically related words, a proxy for memory limitations, is less than expected by chance in a collection of state-of-the-art classes of dependency grammars. Our findings indicate that memory limitations have permeated grammatical descriptions, suggesting that it may be impossible to build a parsimonious theory of human linguistic productivity independent of non-linguistic cognitive constraints.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.06629v2
PDF https://arxiv.org/pdf/1908.06629v2.pdf
PWC https://paperswithcode.com/paper/memory-limitations-are-hidden-in-grammar
Repo
Framework
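
The statistic at the heart of this study, the average distance between syntactically related words, is straightforward to compute once a sentence's dependency tree is given as head indices. The snippet below computes it for a hand-written example; the sentence and its head annotation are made up for illustration.

```python
# Average dependency distance of a sentence, given the head of each word as a
# 1-based index (0 = root, which has no incoming dependency to count).
def average_dependency_distance(heads):
    distances = [abs(h - (i + 1)) for i, h in enumerate(heads) if h != 0]
    return sum(distances) / len(distances)

# "She quickly read the long report"
#   1    2      3    4   5    6
heads = [3, 3, 0, 6, 6, 3]   # "She" and "quickly" depend on "read"; "the" and "long" on "report"
print(average_dependency_distance(heads))   # (|3-1| + |3-2| + |6-4| + |6-5| + |3-6|) / 5 = 1.8
```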

Learning to regularize with a variational autoencoder for hydrologic inverse analysis

Title Learning to regularize with a variational autoencoder for hydrologic inverse analysis
Authors Daniel O’Malley, John K. Golden, Velimir V. Vesselinov
Abstract Inverse problems often involve matching observational data using a physical model that takes a large number of parameters as input. These problems tend to be under-constrained and require regularization to impose additional structure on the solution in parameter space. A central difficulty in regularization is turning a complex conceptual model of this additional structure into a functional mathematical form to be used in the inverse analysis. In this work we propose a method of regularization involving a machine learning technique known as a variational autoencoder (VAE). The VAE is trained to map a low-dimensional set of latent variables with a simple structure to the high-dimensional parameter space that has a complex structure. We train a VAE on unconditioned realizations of the parameters for a hydrological inverse problem. These unconditioned realizations neither rely on the observational data used to perform the inverse analysis nor require any forward runs of the physical model, thus making the computational cost of generating the training data minimal. The central benefit of this approach is that regularization is then performed on the latent variables from the VAE, which can be regularized simply. A second benefit of this approach is that the VAE reduces the number of variables in the optimization problem, thus making gradient-based optimization more computationally efficient when adjoint methods are unavailable. After performing regularization and optimization on the latent variables, the VAE then decodes the problem back to the original parameter space. Our approach constitutes a novel framework for regularization and optimization, readily applicable to a wide range of inverse problems. We call the approach RegAE.
Tasks
Published 2019-06-06
URL https://arxiv.org/abs/1906.02401v1
PDF https://arxiv.org/pdf/1906.02401v1.pdf
PWC https://paperswithcode.com/paper/learning-to-regularize-with-a-variational
Repo
Framework
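
The workflow described in this entry trains a VAE on unconditioned parameter realizations, then solves the inverse problem by optimizing over the VAE's latent variables with a simple penalty, and finally decodes back to parameter space. The sketch below shows only that latent-space optimization loop, with a random placeholder "decoder" and a linear placeholder forward model; it is a schematic of the idea, not the RegAE implementation or a hydrologic simulator.

```python
# Schematic of VAE-regularized inversion: optimize latent variables z so that a
# placeholder physics model applied to decoder(z) matches observations, with a
# simple Gaussian prior ||z||^2 as the regularizer.
import torch

latent_dim, param_dim, n_obs = 8, 100, 20
decoder = torch.nn.Sequential(               # stand-in for a trained VAE decoder
    torch.nn.Linear(latent_dim, 64), torch.nn.ReLU(), torch.nn.Linear(64, param_dim))
for p in decoder.parameters():
    p.requires_grad_(False)                  # the decoder is frozen; only z is optimized
H = torch.randn(n_obs, param_dim)            # stand-in linear "physics" forward model
observations = torch.randn(n_obs)

z = torch.zeros(latent_dim, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for step in range(500):
    params = decoder(z)                      # decode latent variables to a parameter field
    misfit = ((H @ params - observations) ** 2).mean()
    reg = 0.1 * (z ** 2).sum()               # regularize in the simple latent space
    loss = misfit + reg
    opt.zero_grad(); loss.backward(); opt.step()

recovered_parameters = decoder(z).detach()   # map the optimized latents back
print(float(misfit), recovered_parameters.shape)
```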

Neural Network Surgery with Sets

Title Neural Network Surgery with Sets
Authors Jonathan Raiman, Susan Zhang, Christy Dennison
Abstract The cost to train machine learning models has been increasing exponentially, making exploration and research into the correct features and architecture a costly or intractable endeavor at scale. However, using a technique named “surgery” OpenAI Five was continuously trained to play the game DotA 2 over the course of 10 months through 20 major changes in features and architecture. Surgery transfers trained weights from one network to another after a selection process to determine which sections of the model are unchanged and which must be re-initialized. In the past, the selection process relied on heuristics, manual labor, or pre-existing boundaries in the structure of the model, limiting the ability to salvage experiments after modifications of the feature set or input reorderings. We propose a solution to automatically determine which components of a neural network model should be salvaged and which require retraining. We achieve this by allowing the model to operate over discrete sets of features and use set-based operations to determine the exact relationship between inputs and outputs, and how they change across tweaks in model architecture. In this paper, we introduce the methodology for enabling neural networks to operate on sets, derive two methods for detecting feature-parameter interaction maps, and show their equivalence. We empirically validate that we can surgery weights across feature and architecture changes to the OpenAI Five model.
Tasks Dota 2
Published 2019-12-13
URL https://arxiv.org/abs/1912.06719v2
PDF https://arxiv.org/pdf/1912.06719v2.pdf
PWC https://paperswithcode.com/paper/neural-network-surgery-with-sets
Repo
Framework
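
At its simplest, the weight-transfer step described in this entry copies parameters from an old checkpoint into a new architecture wherever a compatible tensor exists and re-initializes the rest. The snippet below shows that baseline name-and-shape matching in PyTorch; the paper's set-based detection of feature-parameter interactions goes well beyond this, so treat it only as the naive starting point.

```python
# Baseline weight "surgery": copy parameters from an old model into a new one
# wherever the name and shape match, and leave the rest freshly initialized.
# The paper's set-based interaction detection is not implemented here.
import torch
import torch.nn as nn

old_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
new_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 6))  # wider output head

def transfer_matching_weights(old, new):
    old_state, new_state = old.state_dict(), new.state_dict()
    copied, skipped = [], []
    for name, tensor in new_state.items():
        if name in old_state and old_state[name].shape == tensor.shape:
            new_state[name] = old_state[name].clone()
            copied.append(name)
        else:
            skipped.append(name)              # shape changed: keep the new initialization
    new.load_state_dict(new_state)
    return copied, skipped

copied, skipped = transfer_matching_weights(old_model, new_model)
print("copied:", copied)             # e.g. ['0.weight', '0.bias']
print("re-initialized:", skipped)    # e.g. ['2.weight', '2.bias'] because the head grew
```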