Paper Group ANR 276
Stability and Optimization Error of Stochastic Gradient Descent for Pairwise Learning. Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics. Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint. Streetify: Using Street View Imagery And Deep Learning For Urban Streets Developme …
Stability and Optimization Error of Stochastic Gradient Descent for Pairwise Learning
Title | Stability and Optimization Error of Stochastic Gradient Descent for Pairwise Learning |
Authors | Wei Shen, Zhenhuan Yang, Yiming Ying, Xiaoming Yuan |
Abstract | In this paper we study the stability and its trade-off with optimization error for stochastic gradient descent (SGD) algorithms in the pairwise learning setting. Pairwise learning refers to a learning task whose loss function depends on pairs of instances; notable examples include bipartite ranking, metric learning, area under the ROC curve (AUC) maximization, and the minimum error entropy (MEE) principle. Our contribution is twofold. Firstly, we establish the stability results of SGD for pairwise learning in the convex, strongly convex and non-convex settings, from which generalization bounds can be naturally derived. Secondly, we establish the trade-off between stability and optimization error of SGD algorithms for pairwise learning. This is achieved by lower-bounding the sum of stability and optimization error by the minimax statistical error over a prescribed class of pairwise loss functions. From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and the excess expected risk over a class of pairwise losses. In addition, we illustrate our stability results with specific examples from AUC maximization, metric learning and MEE. |
Tasks | Metric Learning |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11316v2 |
http://arxiv.org/pdf/1904.11316v2.pdf | |
PWC | https://paperswithcode.com/paper/stability-and-optimization-error-of |
Repo | |
Framework | |
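The pairwise setting above replaces the usual per-instance loss with one over pairs of examples. As a minimal illustration (not the paper's algorithm; all names and data are invented here), the sketch below runs SGD on a pairwise hinge loss for a linear scorer, a standard surrogate for AUC maximization: each step samples a positive/negative pair and takes a subgradient step.

```python
import numpy as np

def pairwise_sgd(X, y, epochs=5, lr=0.01, seed=None):
    """SGD for a linear scorer under a pairwise hinge loss (AUC surrogate).

    Each step samples a pair (positive i, negative j) and penalizes
    ranking the negative above the positive: max(0, 1 - w.(x_i - x_j)).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    for _ in range(epochs * len(y)):
        i, j = rng.choice(pos), rng.choice(neg)
        diff = X[i] - X[j]
        if 1.0 - w @ diff > 0:      # hinge active: take a subgradient step
            w += lr * diff
    return w

# Toy usage on nearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
w = pairwise_sgd(X, y, seed=1)
```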
Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics
Title | Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics |
Authors | Dalton Lunga, Jonathan Gerrand, Hsiuhan Lexie Yang, Christopher Layton, Robert Stewart |
Abstract | The sheer volumes of data generated from earth observation and remote sensing technologies continue to make a major impact, leaping key geospatial applications into the dual data- and compute-intensive era. As a consequence, this rapid advancement poses new computational and data processing challenges. We implement a novel remote sensing data flow (RESFlow) for advanced machine learning and computing with massive amounts of remotely sensed imagery. The core contribution is partitioning massive amounts of data based on their spectral and semantic characteristics for distributed imagery analysis. RESFlow takes advantage of both a unified analytics engine for large-scale data processing and the availability of modern computing hardware to harness the acceleration of deep learning inference on expansive remote sensing imagery. The framework incorporates a strategy to optimize resource utilization across multiple executors assigned to a single worker. We showcase its deployment on computationally and data-intensive pixel-level labeling workloads. The pipeline invokes deep learning inference at three stages: deep feature extraction, deep metric mapping, and deep semantic segmentation. These tasks impose compute-intensive and GPU resource-sharing challenges, motivating a parallelized pipeline for all execution steps. By taking advantage of Apache Spark, Nvidia DGX1, and DGX2 computing platforms, we demonstrate unprecedented compute speed-ups for deep learning inference on pixel labeling workloads: processing 21,028 Terabytes of imagery data and delivering output maps at an area rate of 5.245 sq.km/sec, amounting to 453,168 sq.km/day - reducing a 28-day workload to 21 hours. |
Tasks | Semantic Segmentation |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.04383v1 |
https://arxiv.org/pdf/1908.04383v1.pdf | |
PWC | https://paperswithcode.com/paper/apache-spark-accelerated-deep-learning |
Repo | |
Framework | |
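RESFlow's key engineering idea, amortizing model setup by running inference partition-by-partition on Spark executors, can be sketched in a few lines of PySpark. This is a self-contained toy, not the RESFlow code: the "model" is a stand-in lambda where a real pipeline would load a deep network once per partition.

```python
from pyspark.sql import SparkSession

def infer_partition(rows):
    # In a real pipeline a deep model would be loaded here, once per
    # partition, so its initialization cost is amortized over many tiles.
    # A trivial stand-in "model" keeps the sketch self-contained.
    model = lambda x: x * 2.0
    for tile_id, pixels in rows:
        yield tile_id, [model(v) for v in pixels]

spark = SparkSession.builder.appName("resflow-sketch").getOrCreate()
tiles = spark.sparkContext.parallelize(
    [(i, [0.1 * i, 0.2 * i]) for i in range(100)], numSlices=8)
print(tiles.mapPartitions(infer_partition).take(3))
spark.stop()
```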
Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint
Title | Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint |
Authors | Eric C. Cyr, Mamikon A. Gulian, Ravi G. Patel, Mauro Perego, Nathaniel A. Trask |
Abstract | Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations. |
Tasks | |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04862v1 |
https://arxiv.org/pdf/1912.04862v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-training-and-initialization-of-deep |
Repo | |
Framework | |
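Under the adaptive basis viewpoint, the hidden layers supply basis functions Phi(x) and the final linear layer poses a least-squares problem that can be solved exactly rather than by gradient descent. Below is a minimal NumPy sketch of that hybrid step, with an invented one-hidden-layer network and toy data; the paper's optimizer alternates this solve with gradient updates of the basis.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)[:, None]
y = np.sin(3 * x).ravel()

# Random hidden layer: its tanh outputs act as an adaptive basis Phi(x).
W = rng.normal(size=(1, 50))
b = rng.normal(size=50)
Phi = np.tanh(x @ W + b)

# Hybrid step: with the basis frozen, the optimal output-layer weights
# solve a linear least-squares problem instead of being learned by SGD.
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("train RMSE:", np.sqrt(np.mean((Phi @ c - y) ** 2)))
```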
Streetify: Using Street View Imagery And Deep Learning For Urban Streets Development
Title | Streetify: Using Street View Imagery And Deep Learning For Urban Streets Development |
Authors | Fahad Alhasoun, Marta Gonzalez |
Abstract | The classification of streets on road networks has traditionally focused on vehicular features such as arterials, major roads, and minor roads, based on their transportational use. City authorities, on the other hand, have been shifting to more urban-inclusive planning of streets, encompassing the side use of a street combined with its transportational features. In such classification schemes, streets are labeled, for example, as commercial throughway, residential neighborhood, or park. This modern approach to urban planning has been adopted by major cities such as San Francisco and the states of Florida and Pennsylvania, among many others. Currently, the process of labeling streets according to their contexts is manual and hence tedious and time consuming. In this paper, we propose an approach to collect and label imagery data and then deploy advancements in computer vision towards modern urban planning. We collect and label street imagery, then train deep convolutional neural networks (CNNs) to perform the classification of street context. We show that CNN models can perform well, achieving accuracies in the 81% to 87% range. We then visualize samples from the embedding space of streets using the t-SNE method and apply class activation mapping methods to interpret the features in street imagery that contribute to a model's output classification. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.08007v1 |
https://arxiv.org/pdf/1911.08007v1.pdf | |
PWC | https://paperswithcode.com/paper/streetify-using-street-view-imagery-and-deep |
Repo | |
Framework | |
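One of the interpretability steps mentioned above, visualizing the embedding space of streets with t-SNE, reduces to a single scikit-learn call once per-street CNN features are available. A sketch with random stand-in features (real ones would come from the trained CNN):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in "street embeddings": in the paper these would be CNN features
# extracted from street view imagery; random vectors keep the sketch
# self-contained.
rng = np.random.default_rng(0)
features = rng.normal(size=(300, 128))

coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
print(coords.shape)   # (300, 2) points, ready for a scatter plot by class
```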
An attention-based multi-resolution model for prostate whole slide image classification and localization
Title | An attention-based multi-resolution model for prostate whole slide image classification and localization |
Authors | Jiayun Li, Wenyuan Li, Arkadiusz Gertych, Beatrice S. Knudsen, William Speier, Corey W. Arnold |
Abstract | Histology review is often used as the 'gold standard' for disease diagnosis. Computer-aided diagnosis tools can potentially help improve current pathology workflows by reducing examination time and interobserver variability. Previous work in cancer grading has focused mainly on classifying pre-defined regions of interest (ROIs), or relied on large amounts of fine-grained labels. In this paper, we propose a two-stage attention-based multiple instance learning model for slide-level cancer grading and weakly-supervised ROI detection and demonstrate its use in prostate cancer. Compared with existing Gleason classification models, our model goes a step further by utilizing visualized saliency maps to select informative tiles for fine-grained grade classification. The model was primarily developed on a large-scale whole slide dataset consisting of 3,521 prostate biopsy slides with only slide-level labels from 718 patients. The model achieved state-of-the-art performance for prostate cancer grading with an accuracy of 85.11% for classifying benign, low-grade (Gleason grade 3+3 or 3+4), and high-grade (Gleason grade 4+3 or higher) slides on an independent test set. |
Tasks | Multiple Instance Learning |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13208v1 |
https://arxiv.org/pdf/1905.13208v1.pdf | |
PWC | https://paperswithcode.com/paper/190513208 |
Repo | |
Framework | |
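A common way to realize attention-based multiple instance learning for slides, which the abstract's first stage resembles, is attention pooling over tile features: learned weights combine tile embeddings into one slide embedding, and the weights double as a saliency map for weakly-supervised ROI detection. The sketch below follows the generic formulation; dimensions and names are invented, not the authors' architecture.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention-based MIL pooling: a slide is a bag of tile features;
    learned attention weights combine them into one slide embedding."""
    def __init__(self, in_dim=512, hid_dim=128, n_classes=3):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.Tanh(),
                                  nn.Linear(hid_dim, 1))
        self.head = nn.Linear(in_dim, n_classes)

    def forward(self, tiles):                       # tiles: (n_tiles, in_dim)
        a = torch.softmax(self.attn(tiles), dim=0)  # (n_tiles, 1)
        slide = (a * tiles).sum(dim=0)              # attention pooling
        return self.head(slide), a.squeeze(-1)      # logits + tile saliency

model = AttentionMIL()
logits, saliency = model(torch.randn(100, 512))     # 100 tiles per slide
```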
A mixed model approach to drought prediction using artificial neural networks: Case of an operational drought monitoring environment
Title | A mixed model approach to drought prediction using artificial neural networks: Case of an operational drought monitoring environment |
Authors | Chrisgone Adede, Robert Oboko, Peter Wagacha, Clement Atzberger |
Abstract | Droughts, with their increasing frequency of occurrence, continue to negatively affect livelihoods and elements at risk. For example, the 2011 drought in East Africa caused massive losses, documented to have cost the Kenyan economy over $12bn. Consequently, the demand for ex-ante drought monitoring systems is ever-increasing. The study uses 10 precipitation and vegetation variables that are lagged over 1, 2 and 3-month time-steps to predict drought situations. In the model space search for the most predictive artificial neural network (ANN) model, as opposed to the traditional greedy search for the most predictive variables, we use the General Additive Model (GAM) approach. Together with a set of assumptions, we thereby reduce the cardinality of the space of models. Even though we build a total of 102 GAM models, only 21 have R2 greater than 0.7 and are thus subjected to the ANN process. The ANN process itself uses a brute-force approach that automatically partitions the training data into 10 sub-samples, builds the ANN models on these samples, and evaluates their performance using multiple metrics. The results show the superiority of a 1-month lag of the variables as compared to longer time lags of 2 and 3 months. The champion ANN model recorded an R2 of 0.78 in model testing using the out-of-sample data. This illustrates its ability to be a good predictor of drought situations 1 month ahead. Investigated as a classifier, the champion model has a modest accuracy of 66% and a multi-class area under the ROC curve (AUROC) of 89.99%. |
Tasks | |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.04927v1 |
http://arxiv.org/pdf/1901.04927v1.pdf | |
PWC | https://paperswithcode.com/paper/a-mixed-model-approach-to-drought-prediction |
Repo | |
Framework | |
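The ANN stage described above, building models on 10 automatic sub-samples and scoring them with multiple metrics, maps naturally onto 10-fold cross-validation. A hedged scikit-learn sketch with synthetic stand-ins for the 10 lagged precipitation/vegetation predictors (the network size and data are invented):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

# Stand-in data: 10 lagged predictors and a drought-index target,
# mirroring the paper's setup in shape only.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=500)

# Evaluate an ANN over 10 sub-samples, loosely analogous to the paper's
# automatic 10-way partitioning of the training data.
ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
scores = cross_val_score(ann, X, y, cv=10, scoring="r2")
print("mean R2:", scores.mean())
```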
Structural Decompositions for End-to-End Relighting
Title | Structural Decompositions for End-to-End Relighting |
Authors | Thomas Nestmeyer, Iain Matthews, Jean-François Lalonde, Andreas M. Lehrmann |
Abstract | Relighting is an essential step in artificially transferring an object from one image into another environment. For example, a believable teleconference in Augmented Reality requires a portrait recorded in the source environment to be displayed and relit consistent with the light configuration of the destination scene. In this paper, we investigate architectures for learning to both de-light and relight an image of a human face end-to-end. The architectures vary in how much they enforce physically-based image formation and rendering constraints. The most structured model decomposes the input image into intrinsic components according to a diffuse physics-based image formation model and augments the rendered result to include non-diffuse relighting effects. An intermediate model uses fewer intrinsic constraints, and the least structured model makes no assumptions on the image formation. To train our models and evaluate the approach, we collected portraits of 21 subjects with various expressions and poses, each in a sequence of 32 individual light sources in a controlled light stage setup. Our method leads to precise and believable relighting results in challenging illumination conditions and poses, including when the subject is facing away from the camera. We compare our method to state-of-the-art relighting approaches and illustrate its superiority in a series of quantitative and qualitative experiments. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03355v1 |
https://arxiv.org/pdf/1906.03355v1.pdf | |
PWC | https://paperswithcode.com/paper/structural-decompositions-for-end-to-end |
Repo | |
Framework | |
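The most structured model above relies on diffuse physics-based image formation: an image is albedo times shading, where shading is the clamped dot product of surface normals with the light direction. The toy NumPy sketch below renders a flat patch under two lights to make that decompose-then-relight step concrete; all arrays are invented stand-ins, and the paper additionally handles non-diffuse effects.

```python
import numpy as np

# Diffuse (Lambertian) image formation: image = albedo * max(0, n . l).
# The structured relighting model decomposes an input into such intrinsic
# components, then re-renders under the destination light.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 0.9, size=(64, 64, 3))       # per-pixel reflectance
normals = np.zeros((64, 64, 3)); normals[..., 2] = 1.0  # facing the camera

def relight(albedo, normals, light_dir):
    l = np.asarray(light_dir, float); l /= np.linalg.norm(l)
    shading = np.clip(normals @ l, 0.0, None)           # (64, 64)
    return albedo * shading[..., None]

img_src = relight(albedo, normals, [0, 0, 1])           # source lighting
img_dst = relight(albedo, normals, [1, 0, 1])           # destination lighting
```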
Online Sensor Hallucination via Knowledge Distillation for Multimodal Image Classification
Title | Online Sensor Hallucination via Knowledge Distillation for Multimodal Image Classification |
Authors | Saurabh Kumar, Biplab Banerjee, Subhasis Chaudhuri |
Abstract | We deal with the problem of information-fusion-driven satellite image/scene classification and propose a generic hallucination architecture, considering that all the available sensor information is present during training while some of the image modalities may be absent at test time. It is well known that different sensors are capable of capturing complementary information for a given geographical area, and a classification module incorporating information from all the sources is expected to produce improved performance compared to considering only a subset of the modalities. However, classical classifier systems inherently require all the features used to train the module to be present for the test instances as well, which may not always be possible for typical remote sensing applications (say, disaster management). As a remedy, we provide a robust solution in terms of a hallucination module that can approximate the missing modalities from the available ones during the decision-making stage. In order to ensure better knowledge transfer during modality hallucination, we explicitly incorporate concepts of knowledge distillation for the purpose of exploring the privileged (side) information in our framework and subsequently introduce an intuitive modular training approach. The proposed network is evaluated extensively on a large-scale corpus of PAN-MS image pairs (scene recognition) as well as on a benchmark hyperspectral image dataset (image classification), where we follow different experimental scenarios and find that the proposed hallucination-based module is indeed capable of capturing the multi-source information, despite the explicit absence of some of the sensor information, and aids in improved scene characterization. |
Tasks | Decision Making, Image Classification, Scene Classification, Scene Recognition, Transfer Learning |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10559v1 |
https://arxiv.org/pdf/1908.10559v1.pdf | |
PWC | https://paperswithcode.com/paper/online-sensor-hallucination-via-knowledge |
Repo | |
Framework | |
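The hallucination module can be read as feature-level knowledge distillation: a student network maps the always-available modality onto the teacher's features for the possibly-missing one, trained with a mimicry loss alongside the task loss. A minimal PyTorch sketch under that reading; the dimensions, the linear student, and the loss weighting are all assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Hallucination branch trained by distillation: a student maps the
# always-available modality (e.g., PAN features) onto the teacher's
# features for the possibly-missing modality (e.g., MS), via an MSE
# "mimicry" loss plus the usual task loss.
feat_pan = torch.randn(32, 256)          # available at test time
feat_ms  = torch.randn(32, 256)          # may be absent at test time
labels   = torch.randint(0, 10, (32,))

hallucinate = nn.Linear(256, 256)        # student: PAN -> pseudo-MS
classifier  = nn.Linear(512, 10)         # consumes fused features

pseudo_ms = hallucinate(feat_pan)
distill   = nn.functional.mse_loss(pseudo_ms, feat_ms.detach())
logits    = classifier(torch.cat([feat_pan, pseudo_ms], dim=1))
task      = nn.functional.cross_entropy(logits, labels)
loss      = task + 1.0 * distill         # the weighting is a free choice
loss.backward()
```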
Automatically Extract the Semi-transparent Motion-blurred Hand from a Single Image
Title | Automatically Extract the Semi-transparent Motion-blurred Hand from a Single Image |
Authors | Xiaomei Zhao, Yihong Wu |
Abstract | When we use video chat, video games, or other video applications, motion-blurred hands often appear. Accurately extracting these hands is very useful for video editing and behavior analysis. However, existing motion-blurred object extraction methods either need user interactions, such as user-supplied trimaps and scribbles, or need additional information, such as background images. In this paper, a novel method is proposed that automatically extracts the semi-transparent motion-blurred hand using only the original RGB image. The proposed method separates the extraction task into two subtasks: alpha matte prediction and foreground prediction. These two subtasks are implemented by Xception-based encoder-decoder networks. The extracted motion-blurred hand images can be calculated by multiplying the predicted alpha mattes and foreground images. Experiments on synthetic and real datasets show that the proposed method has promising performance. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11470v1 |
https://arxiv.org/pdf/1906.11470v1.pdf | |
PWC | https://paperswithcode.com/paper/automatically-extract-the-semi-transparent |
Repo | |
Framework | |
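The final composition step stated in the abstract, multiplying the predicted alpha matte by the predicted foreground, is a one-liner; the sketch below uses random stand-ins for the two network outputs.

```python
import numpy as np

# The extracted hand is the product of the two network outputs:
# per-pixel alpha matte and predicted foreground colors.
rng = np.random.default_rng(0)
alpha = rng.uniform(0.0, 1.0, size=(128, 128))       # from the matte network
fg    = rng.uniform(0.0, 1.0, size=(128, 128, 3))    # from the foreground network

extracted = alpha[..., None] * fg                    # semi-transparent hand
composite = extracted + (1 - alpha[..., None]) * 0.5 # paste over a gray background
```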
Nonmodular architectures of cognitive systems based on active inference
Title | Nonmodular architectures of cognitive systems based on active inference |
Authors | Manuel Baltieri, Christopher L. Buckley |
Abstract | In psychology and neuroscience it is common to describe cognitive systems as input/output devices where perceptual and motor functions are implemented in a purely feedforward, open-loop fashion. On this view, perception and action are often seen as encapsulated modules with limited interaction between them. While embodied and enactive approaches to cognitive science have challenged the idealisation of the brain as an input/output device, we argue that even the more recent attempts to model systems using closed-loop architectures still heavily rely on a strong separation between motor and perceptual functions. Previously, we have suggested that the mainstream notion of modularity strongly resonates with the separation principle of control theory. In this work we present a minimal model of a sensorimotor loop implementing an architecture based on the separation principle. We link this to popular formulations of perception and action in the cognitive sciences, and show its limitations when, for instance, external forces are not modelled by an agent. These forces can be seen as variables that an agent cannot directly control, i.e., a perturbation from the environment or an interference caused by other agents. As an alternative approach inspired by embodied cognitive science, we then propose a nonmodular architecture based on the active inference framework. We demonstrate the robustness of this architecture to unknown external inputs and show that the mechanism with which this is achieved in linear models is equivalent to integral control. |
Tasks | |
Published | 2019-03-22 |
URL | http://arxiv.org/abs/1903.09542v1 |
http://arxiv.org/pdf/1903.09542v1.pdf | |
PWC | https://paperswithcode.com/paper/nonmodular-architectures-of-cognitive-systems |
Repo | |
Framework | |
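The paper's closing observation, that the robustness mechanism in linear models is equivalent to integral control, is easy to see in a toy simulation: a proportional controller leaves a steady-state error under an unknown constant disturbance, while adding an integral term rejects it. A sketch, with gains and plant invented for illustration:

```python
# A 1-D plant x' = u + d with an unknown constant disturbance d.
# Pure proportional control leaves a steady-state error; adding an
# integral term (as in the paper's linear case) rejects the disturbance.
def simulate(ki, kp=2.0, d=1.0, dt=0.01, steps=2000):
    x, integ = 0.0, 0.0
    for _ in range(steps):
        err = 0.0 - x                  # the target is x = 0
        integ += err * dt
        u = kp * err + ki * integ
        x += (u + d) * dt
    return x

print("P only:", simulate(ki=0.0))     # settles near d/kp = 0.5
print("P + I :", simulate(ki=1.0))     # settles near 0
```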
Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds
Title | Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds |
Authors | Lukas Prantl, Nuttapong Chentanez, Stefan Jeschke, Nils Thuerey |
Abstract | Point clouds, as a form of Lagrangian representation, allow for powerful and flexible applications in a large number of computational disciplines. We propose a novel deep-learning method to learn stable and temporally coherent feature spaces for point clouds that change over time. We identify a set of inherent problems with these approaches: without knowledge of the time dimension, the inferred solutions can exhibit strong flickering, and easy solutions to suppress this flickering can result in undesirable local minima that manifest themselves as halo structures. We propose a novel temporal loss function that takes into account higher time derivatives of the point positions and encourages mingling in order to prevent the aforementioned halos. We combine these techniques in a super-resolution method with a truncation approach to flexibly adapt the size of the generated positions. We show that our method works for large, deforming point sets from different sources, demonstrating the flexibility of our approach. |
Tasks | Super-Resolution |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.05279v2 |
https://arxiv.org/pdf/1907.05279v2.pdf | |
PWC | https://paperswithcode.com/paper/tranquil-clouds-neural-networks-for-learning |
Repo | |
Framework | |
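A temporal loss over higher time derivatives can be sketched with finite differences: penalize mismatches in velocity (first difference) and acceleration (second difference) of the predicted point positions against a reference sequence. The sketch below assumes known point correspondences across frames for simplicity, which the paper's actual method does not require, and omits the mingling term.

```python
import torch

def temporal_loss(pred, ref, w_vel=1.0, w_acc=0.5):
    # pred, ref: (T, N, 3) point positions over T frames.
    # First differences approximate velocities, second differences
    # accelerations; matching them suppresses frame-to-frame flickering.
    dv = (pred[1:] - pred[:-1]) - (ref[1:] - ref[:-1])
    da = (pred[2:] - 2 * pred[1:-1] + pred[:-2]) \
       - (ref[2:] - 2 * ref[1:-1] + ref[:-2])
    return w_vel * dv.pow(2).mean() + w_acc * da.pow(2).mean()

loss = temporal_loss(torch.randn(8, 1024, 3), torch.randn(8, 1024, 3))
```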
Declarative Recursive Computation on an RDBMS, or, Why You Should Use a Database For Distributed Machine Learning
Title | Declarative Recursive Computation on an RDBMS, or, Why You Should Use a Database For Distributed Machine Learning |
Authors | Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, Zekai J. Gao |
Abstract | A number of popular systems, most notably Google’s TensorFlow, have been implemented from the ground up to support machine learning tasks. We consider how to make a very small set of changes to a modern relational database management system (RDBMS) to make it suitable for distributed learning computations. Changes include adding better support for recursion, and optimization and execution of very large compute plans. We also show that there are key advantages to using an RDBMS as a machine learning platform. In particular, learning based on a database management system allows for trivial scaling to large data sets and especially large models, where different computational units operate on different parts of a model that may be too large to fit into RAM. |
Tasks | |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11121v1 |
http://arxiv.org/pdf/1904.11121v1.pdf | |
PWC | https://paperswithcode.com/paper/declarative-recursive-computation-on-an-rdbms |
Repo | |
Framework | |
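The kind of recursion the paper argues an RDBMS should support well can be illustrated with a recursive CTE: the toy query below runs 50 iterations of gradient descent on f(w) = (w - 3)^2 entirely inside SQLite. This shows only the declarative-recursion idea; the paper's contribution is making such plans scale to large models on a full RDBMS.

```python
import sqlite3

# Gradient descent expressed as a recursive common table expression:
# each row derives the next iterate via w <- w - lr * f'(w).
con = sqlite3.connect(":memory:")
rows = con.execute("""
    WITH RECURSIVE gd(step, w) AS (
        SELECT 0, 0.0
        UNION ALL
        SELECT step + 1, w - 0.1 * 2 * (w - 3.0)
        FROM gd WHERE step < 50
    )
    SELECT step, w FROM gd ORDER BY step DESC LIMIT 1
""").fetchall()
print(rows)   # w converges toward the minimizer at 3.0
```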
Automated Detection of Regions of Interest for Brain Perfusion MR Images
Title | Automated Detection of Regions of Interest for Brain Perfusion MR Images |
Authors | Svitlana M Alkhimova |
Abstract | Images with abnormal brain anatomy produce problems for automatic segmentation techniques, and as a result poor ROI detection affects both quantitative measurements and visual assessment of perfusion data. This paper presents a new approach for fully automated and relatively accurate ROI detection from dynamic susceptibility contrast (DSC) perfusion magnetic resonance images, making it well suited to perfusion analysis. In the proposed approach, the segmentation output is a binary mask of the perfusion ROI that has zero values for air pixels, pixels representing non-brain tissues, and cerebrospinal fluid (CSF) pixels. Producing the binary mask starts with extracting low-intensity pixels by thresholding, where the optimal low-threshold value is determined from the intensities of pixels at the approximate anatomical brain location. A hole-filling algorithm and a binary region-growing algorithm are then used to remove falsely detected regions and produce a region of brain tissue only. Next, CSF pixels are extracted by thresholding the high-intensity pixels within that region, and each time-point image of the perfusion sequence is used to adjust the locations of CSF pixels. The segmentation results were compared with manual segmentation performed by experienced radiologists, which was considered the reference standard for evaluating the proposed approach. Across 120 images, the segmentation results show good agreement with the reference standard, and all detected perfusion ROIs were deemed by two experienced radiologists to be satisfactory for clinical use. The results show that the proposed approach is suitable for perfusion ROI detection from DSC head scans, and a segmentation tool based on it can be implemented as part of any automatic brain image processing system for clinical use. |
Tasks | |
Published | 2019-02-17 |
URL | http://arxiv.org/abs/1902.06323v1 |
http://arxiv.org/pdf/1902.06323v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-detection-of-regions-of-interest |
Repo | |
Framework | |
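The pipeline's core image operations, low-intensity thresholding, hole filling, region growing (approximated here by keeping the largest connected component), and CSF exclusion by a high threshold, are all standard morphology. A scipy.ndimage sketch on a random stand-in slice; the thresholds here are invented, whereas the paper derives them from anatomical location and the time series.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
img = rng.uniform(0, 1, size=(128, 128))        # stand-in MR slice

brain = img > 0.2                               # drop air / background pixels
brain = ndimage.binary_fill_holes(brain)        # hole filling
labels, n = ndimage.label(brain)                # connected components
if n > 0:
    sizes = ndimage.sum(brain, labels, range(1, n + 1))
    brain = labels == (np.argmax(sizes) + 1)    # keep the largest region
roi = brain & (img < 0.9)                       # exclude CSF-like pixels
```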
Time-Delay Momentum: A Regularization Perspective on the Convergence and Generalization of Stochastic Momentum for Deep Learning
Title | Time-Delay Momentum: A Regularization Perspective on the Convergence and Generalization of Stochastic Momentum for Deep Learning |
Authors | Ziming Zhang, Wenju Xu, Alan Sullivan |
Abstract | In this paper we study the problem of convergence and generalization error bound of stochastic momentum for deep learning from the perspective of regularization. To do so, we first interpret momentum as solving an $\ell_2$-regularized minimization problem to learn the offsets between two arbitrary successive model parameters. We call this time-delay momentum because the model parameter is updated after a few iterations towards finding the minimizer. We then propose our learning algorithm, i.e., stochastic gradient descent (SGD) with time-delay momentum. We show that our algorithm can be interpreted as solving a sequence of strongly convex optimization problems using SGD. We prove that under mild conditions our algorithm can converge to a stationary point with rate of $O(\frac{1}{\sqrt{K}})$ and generalization error bound of $O(\frac{1}{\sqrt{n\delta}})$ with probability at least $1-\delta$, where $K,n$ are the numbers of model updates and training samples, respectively. We demonstrate the empirical superiority of our algorithm in deep learning in comparison with the state-of-the-art deep learning solvers. |
Tasks | |
Published | 2019-03-02 |
URL | https://arxiv.org/abs/1903.00760v2 |
https://arxiv.org/pdf/1903.00760v2.pdf | |
PWC | https://paperswithcode.com/paper/time-delay-momentum-a-regularization |
Repo | |
Framework | |
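The time-delay interpretation can be made concrete on a toy quadratic: the offset m between successive anchor parameters is found by a few inner SGD steps on an $\ell_2$-regularized subproblem, and the parameters are updated only after that delay. A sketch, with the objective, step sizes, and delay length all invented for illustration:

```python
import numpy as np

def grad(w):                      # gradient of the toy objective 0.5*||w||^2
    return w

w = np.array([5.0, -3.0])
lr, lam, delay = 0.1, 0.5, 5
for _ in range(20):               # outer (delayed) updates
    m = np.zeros_like(w)
    for _ in range(delay):        # inner steps: min_m f(w + m) + lam/2 ||m||^2
        m -= lr * (grad(w + m) + lam * m)
    w = w + m                     # parameter update only after the delay
print(w)                          # approaches the minimizer at 0
```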
Dynamic Regularizer with an Informative Prior
Title | Dynamic Regularizer with an Informative Prior |
Authors | Avinash Kori, Manik Sharma |
Abstract | Regularization methods, specifically those which directly alter weights like $L_1$ and $L_2$, are an integral part of many learning algorithms. Both the regularizers mentioned above are formulated by assuming certain priors in the parameter space, and these assumptions, in some cases, induce sparsity in the parameter space. Regularizers help in transferring beliefs one has on the dataset or the parameter space by introducing adequate terms in the loss function. Any kind of formulation represents a specific set of beliefs: $L_1$ regularization conveys that the parameter space should be sparse whereas $L_2$ regularization conveys that the parameter space should be bounded and continuous. These regularizers in turn leverage certain priors to express these inherent beliefs. A better understanding of how the prior affects the behavior of the parameters, and how the priors can be updated based on the dataset, can contribute greatly to improving the generalization capabilities of a function estimator. In this work, we introduce a weakly informative prior and then further extend it to an informative prior in order to formulate a regularization penalty, which shows better results in terms of inducing sparsity when compared to regularizers based only on Gaussian and Laplacian priors. Experimentally, we verify that a regularizer based on an adapted prior improves the generalization capabilities of any network. We illustrate the performance of the proposed method on the MNIST and CIFAR-10 datasets. |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14241v1 |
https://arxiv.org/pdf/1910.14241v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-regularizer-with-an-informative-prior |
Repo | |
Framework | |
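To illustrate the mechanics of a dynamic, data-adapted penalty, and only as a stand-in, since the paper's informative prior is its own formulation, the sketch below implements a reweighted L1 (adaptive-lasso style) where each weight's prior scale is taken from its current magnitude, inside an ordinary PyTorch training step.

```python
import torch

def dynamic_penalty(params, eps=1e-3):
    # Reweighted L1: each weight's prior scale adapts to its current
    # magnitude, so small weights are pushed harder toward zero.
    loss = 0.0
    for p in params:
        scale = p.detach().abs() + eps     # data-driven prior scale
        loss = loss + (p.abs() / scale).sum()
    return loss

w = torch.randn(10, requires_grad=True)
task_loss = ((w - 1.0) ** 2).sum()
total = task_loss + 0.01 * dynamic_penalty([w])
total.backward()                           # gradients include the penalty
```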