October 19, 2019

3386 words 16 mins read

Paper Group ANR 365

A Multiresolution Convolutional Neural Network with Partial Label Training for Annotating Reflectance Confocal Microscopy Images of Skin. Learning Explicit Deep Representations from Deep Kernel Networks. Deep Residual Learning for Accelerated MRI using Magnitude and Phase Networks. Theoretical Foundations of the A2RD Project: Part I. On Generality …

A Multiresolution Convolutional Neural Network with Partial Label Training for Annotating Reflectance Confocal Microscopy Images of Skin

Title A Multiresolution Convolutional Neural Network with Partial Label Training for Annotating Reflectance Confocal Microscopy Images of Skin
Authors Alican Bozkurt, Kivanc Kose, Christi Alessi-Fox, Melissa Gill, Dana H. Brooks, Jennifer G. Dy, Milind Rajadhyaksha
Abstract We describe a new multiresolution “nested encoder-decoder” convolutional network architecture and use it to annotate morphological patterns in reflectance confocal microscopy (RCM) images of human skin for aiding cancer diagnosis. Skin cancers are the most common types of cancer, with melanoma being the deadliest among them. RCM is an effective, non-invasive pre-screening tool for skin cancer diagnosis, with the required cellular resolution. However, images are complex, low-contrast, and highly variable, so clinicians require months to years of expert-level training to make accurate assessments. In this paper, we address classifying four key, clinically important structural/textural patterns in RCM images. The occurrence and morphology of these patterns are used by clinicians for diagnosis of melanomas. The large size of RCM images, the large variance of pattern size, the large range of scales over which patterns appear, the class imbalance in collected images, and the lack of fully labeled images all make this a challenging problem to address, even with automated machine learning tools. We designed a novel nested U-net architecture to cope with these challenges, and a selective loss function to handle partial labeling. Trained and tested on 56 melanoma-suspicious, partially labeled, 12k x 12k pixel images, our network automatically annotated diagnostic patterns with high sensitivity and specificity, providing consistent labels for unlabeled sections of the test images. Providing such annotation will aid clinicians in achieving diagnostic accuracy and, perhaps more importantly, dramatically facilitate clinical training, thus enabling much more rapid adoption of RCM into widespread clinical use. In addition, our adaptation of the U-net architecture provides an intrinsically multiresolution deep network that may be useful in other challenging biomedical image analysis applications.
Tasks
Published 2018-02-05
URL http://arxiv.org/abs/1802.02213v2
PDF http://arxiv.org/pdf/1802.02213v2.pdf
PWC https://paperswithcode.com/paper/a-multiresolution-convolutional-neural
Repo
Framework
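
A minimal sketch of the "selective loss" idea described in the abstract above, assuming unlabeled pixels are marked with a sentinel index in the ground-truth mask; this is an illustration of the partial-labeling mechanism, not the authors' implementation:

```python
# Sketch: a selective cross-entropy that only penalizes labeled pixels.
# IGNORE_INDEX is a hypothetical sentinel for unlabeled regions.
import torch
import torch.nn.functional as F

IGNORE_INDEX = -1

def selective_cross_entropy(logits, target):
    """logits: (N, C, H, W); target: (N, H, W) with -1 where unlabeled."""
    # F.cross_entropy skips entries equal to ignore_index, so gradients
    # flow only from the partially labeled regions of each image.
    return F.cross_entropy(logits, target, ignore_index=IGNORE_INDEX)
```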

Learning Explicit Deep Representations from Deep Kernel Networks

Title Learning Explicit Deep Representations from Deep Kernel Networks
Authors Mingyuan Jiu, Hichem Sahbi
Abstract Deep kernel learning aims at designing nonlinear combinations of multiple standard elementary kernels by training deep networks. This scheme has proven effective, but it becomes intractable on large-scale datasets, especially as the depth of the trained networks increases; indeed, the complexity of evaluating these networks scales quadratically w.r.t. the size of the training data and linearly w.r.t. the depth of the trained networks. In this paper, we address the issue of efficient computation in Deep Kernel Networks (DKNs) by designing effective maps in the underlying Reproducing Kernel Hilbert Spaces. Given a pretrained DKN, our method builds its associated Deep Map Network (DMN), whose inner product approximates the original network while being far more efficient. The design of our method is greedy and proceeds layer-wise, finding maps that approximate the DKN at the input, intermediate, and output layers. The design also includes an extra fine-tuning step, based on unsupervised learning, that further enhances the generalization ability of the trained DMNs. When plugged into SVMs, these DMNs turn out to be as accurate as the underlying DKNs while being at least an order of magnitude faster on large-scale datasets, as shown through extensive experiments on the challenging ImageCLEF and COREL5k benchmarks.
Tasks
Published 2018-04-30
URL http://arxiv.org/abs/1804.11159v1
PDF http://arxiv.org/pdf/1804.11159v1.pdf
PWC https://paperswithcode.com/paper/learning-explicit-deep-representations-from
Repo
Framework
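
The core trick, replacing kernel evaluations with explicit finite-dimensional maps whose inner product approximates the kernel, can be illustrated with the classic Nyström feature map; the paper's layer-wise DMN construction is more elaborate, so treat this only as a sketch of the principle:

```python
# Sketch: build phi(X) so that phi(X) @ phi(Y).T ~= k(X, Y)
# via the Nystroem approximation with a set of landmark points.
import numpy as np

def rbf(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystroem_map(X, landmarks, gamma=0.5):
    K_mm = rbf(landmarks, landmarks, gamma)
    eigval, eigvec = np.linalg.eigh(K_mm)
    eigval = np.maximum(eigval, 1e-12)      # guard tiny eigenvalues
    W = eigvec / np.sqrt(eigval)            # W @ W.T = K_mm^{-1}
    return rbf(X, landmarks, gamma) @ W     # explicit map, shape (n, m)
```

With such a map, a kernel SVM can be replaced by a linear SVM on phi(X), which is where the order-of-magnitude speedup on large datasets comes from.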

Deep Residual Learning for Accelerated MRI using Magnitude and Phase Networks

Title Deep Residual Learning for Accelerated MRI using Magnitude and Phase Networks
Authors Dongwook Lee, Jaejun Yoo, Sungho Tak, Jong Chul Ye
Abstract Accelerated magnetic resonance (MR) acquisition with compressed sensing (CS) and parallel imaging is a powerful method to reduce MR scan time. However, many reconstruction algorithms have high computational costs. To address this, we investigate deep residual learning networks that remove aliasing artifacts from artifact-corrupted images. The proposed deep residual learning networks are composed of magnitude and phase networks that are trained separately. If both phase and magnitude information are available, the proposed algorithm works as an iterative k-space interpolation algorithm using a framelet representation. When only magnitude data is available, the proposed approach works as an image-domain post-processing algorithm. Even with strong coherent aliasing artifacts, the proposed network successfully learned and removed the artifacts, whereas current parallel and CS reconstruction methods were unable to remove them. Comparisons using single and multiple coils show that the proposed residual network provides good reconstruction results with orders-of-magnitude faster computation than existing compressed sensing methods. The proposed deep learning framework may therefore have great potential for accelerated MR reconstruction by generating accurate results immediately.
Tasks
Published 2018-04-02
URL http://arxiv.org/abs/1804.00432v1
PDF http://arxiv.org/pdf/1804.00432v1.pdf
PWC https://paperswithcode.com/paper/deep-residual-learning-for-accelerated-mri
Repo
Framework
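
A minimal sketch of the residual-learning setup described above: a small CNN predicts the aliasing artifact and subtracts it from the input, with one copy trained on magnitude images and another on phase images. The layer counts and widths are illustrative assumptions, not the paper's architecture:

```python
# Sketch: residual de-aliasing network; the body learns the artifact,
# and the forward pass returns input minus predicted artifact.
import torch
import torch.nn as nn

class ResidualDealiaser(nn.Module):
    def __init__(self, channels=64, depth=5):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(channels, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x - self.body(x)  # residual learning: predict the artifact

mag_net, phase_net = ResidualDealiaser(), ResidualDealiaser()  # trained separately
```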

Theoretical Foundations of the A2RD Project: Part I

Title Theoretical Foundations of the A2RD Project: Part I
Authors Juliao Braga, Joao Nuno Silva, Patricia Takako Endo, Nizam Omar
Abstract This article identifies and discusses the theoretical foundations that were considered in the design of the A2RD model. In addition, it references the available studies that informed the approach.
Tasks
Published 2018-08-27
URL http://arxiv.org/abs/1808.08794v3
PDF http://arxiv.org/pdf/1808.08794v3.pdf
PWC https://paperswithcode.com/paper/theoretical-foundations-of-the-a2rd-project
Repo
Framework

On Generality and Knowledge Transferability in Cross-Domain Duplicate Question Detection for Heterogeneous Community Question Answering

Title On Generality and Knowledge Transferability in Cross-Domain Duplicate Question Detection for Heterogeneous Community Question Answering
Authors Mohomed Shazan Mohomed Jabbar, Luke Kumar, Hamman Samuel, Mi-Young Kim, Sankalp Prabhakar, Randy Goebel, Osmar Zaïane
Abstract Duplicate question detection is an ongoing challenge in community question answering because semantically equivalent questions can have significantly different words and structures. Identifying duplicate questions can also reduce the resources required for retrieval, since the same questions need not be retrieved repeatedly. This study compares the performance of deep neural networks and gradient tree boosting, and explores the possibility of domain adaptation with transfer learning to improve under-performing target domains for the text-pair duplicate classification task, using three heterogeneous datasets: general-purpose Quora, technical Ask Ubuntu, and academic English Stack Exchange. Ultimately, our study exposes the alternative hypothesis that the meaning of a “duplicate” is not inherently general-purpose, but rather depends on the domain of learning, hence reducing the prospects of transfer learning via domain adaptation.
Tasks Community Question Answering, Domain Adaptation, Question Answering, Transfer Learning
Published 2018-11-15
URL http://arxiv.org/abs/1811.06596v1
PDF http://arxiv.org/pdf/1811.06596v1.pdf
PWC https://paperswithcode.com/paper/on-generality-and-knowledge-transferability
Repo
Framework
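
For orientation, here is a minimal sketch of a text-pair duplicate classifier with a shared encoder; the embedding size, pooling, and interaction features are illustrative assumptions, not the specific models compared in the paper:

```python
# Sketch: shared-encoder pair classifier for duplicate detection.
# Inputs are fixed-length padded tensors of token IDs.
import torch
import torch.nn as nn

class PairClassifier(nn.Module):
    def __init__(self, vocab=30000, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)   # mean-pooled shared encoder
        self.head = nn.Sequential(nn.Linear(4 * dim, 64), nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, q1, q2):                   # (batch, seq_len) LongTensors
        a, b = self.emb(q1), self.emb(q2)
        # both encodings plus difference/product interaction features
        feats = torch.cat([a, b, torch.abs(a - b), a * b], dim=-1)
        return self.head(feats)                  # logit: duplicate or not
```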

Clustering and Learning from Imbalanced Data

Title Clustering and Learning from Imbalanced Data
Authors Naman D. Singh, Abhinav Dhall
Abstract A learning classifier must outperform a trivial solution; in the case of imbalanced data, this condition usually does not hold. To overcome this problem, we propose a novel data-level resampling method, Clustering-Based Oversampling, for improved learning from class-imbalanced datasets. The essential idea behind the proposed method is to use the distance between a minority-class sample and its respective cluster centroid to infer the number of new sample points to be generated for that sample. The proposed algorithm has little dependence on the technique used for finding cluster centroids and does not affect majority-class learning in any way. It also improves learning from imbalanced data by incorporating the distribution structure of minority-class samples into the generation of new data samples. The newly generated minority-class data is handled so as to prevent outlier production and overfitting. Implementation analysis on different datasets, using deep neural networks as the learning classifier, shows the effectiveness of this method compared to other synthetic-data resampling techniques across several evaluation metrics.
Tasks
Published 2018-11-02
URL http://arxiv.org/abs/1811.00972v2
PDF http://arxiv.org/pdf/1811.00972v2.pdf
PWC https://paperswithcode.com/paper/clustering-and-learning-from-imbalanced-data
Repo
Framework
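
A minimal sketch of the core idea: cluster the minority class, then generate more synthetic points for samples farther from their centroid, interpolating toward the centroid so new points stay inside the cluster. The allocation rule and budget below are illustrative assumptions:

```python
# Sketch: clustering-based oversampling of the minority class.
import numpy as np
from sklearn.cluster import KMeans

def cluster_oversample(X_min, n_clusters=3, budget=200, rng=np.random):
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X_min)
    centroids = km.cluster_centers_[km.labels_]
    d = np.linalg.norm(X_min - centroids, axis=1)
    counts = np.round(budget * d / d.sum()).astype(int)  # farther -> more
    synth = []
    for x, c, k in zip(X_min, centroids, counts):
        for _ in range(k):
            t = rng.uniform(0.0, 1.0)
            synth.append(x + t * (c - x))  # move toward centroid: no outliers
    return np.array(synth)
```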

Likelihood-free inference with an improved cross-entropy estimator

Title Likelihood-free inference with an improved cross-entropy estimator
Authors Markus Stoye, Johann Brehmer, Gilles Louppe, Juan Pavez, Kyle Cranmer
Abstract We extend recent work (Brehmer et al., 2018) that uses neural networks as surrogate models for likelihood-free inference. As in that work, we exploit the fact that the joint likelihood ratio and joint score, conditioned on both observed and latent variables, can often be extracted from an implicit generative model or simulator to augment the training data for these surrogate models. We show how this augmented training data can be used to define a new cross-entropy estimator, which provides improved sample efficiency compared to previous loss functions exploiting this augmented training data.
Tasks
Published 2018-08-02
URL http://arxiv.org/abs/1808.00973v1
PDF http://arxiv.org/pdf/1808.00973v1.pdf
PWC https://paperswithcode.com/paper/likelihood-free-inference-with-an-improved
Repo
Framework
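
A heavily hedged sketch of the general recipe: instead of hard 0/1 class labels, the cross-entropy target for each training point is derived from the joint likelihood ratio extracted from the simulator. The exact estimator in the paper differs in detail; this only illustrates the ratio-augmented idea:

```python
# Sketch: cross-entropy with soft targets derived from the joint
# likelihood ratio r(x, z) available from the simulator.
import torch
import torch.nn.functional as F

def ratio_augmented_bce(logits, joint_ratio):
    # s = 1 / (1 + r) is the decision function of the optimal classifier
    # between the two simulator settings; using it as a soft target
    # injects the extra simulator information into the loss.
    s = 1.0 / (1.0 + joint_ratio)
    return F.binary_cross_entropy_with_logits(logits, s)
```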

Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

Title Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring
Authors Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet
Abstract Deep Neural Networks have recently achieved remarkable success, enabling breakthroughs in notoriously challenging problems. Training these networks is computationally expensive and requires vast amounts of training data. Selling such pre-trained models can, therefore, be a lucrative business model. Unfortunately, once the models are sold they can easily be copied and redistributed. To avoid this, a tracking mechanism to identify models as the intellectual property of a particular vendor is necessary. In this work, we present an approach for watermarking Deep Neural Networks in a black-box way. Our scheme works for general classification tasks and can easily be combined with current learning algorithms. We show experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for, and we evaluate the robustness of our proposal against a multitude of practical attacks. Moreover, we provide a theoretical analysis relating our approach to previous work on backdooring.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04633v3
PDF http://arxiv.org/pdf/1802.04633v3.pdf
PWC https://paperswithcode.com/paper/turning-your-weakness-into-a-strength
Repo
Framework
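
A minimal sketch of the backdoor-based watermark: blend a small secret "trigger set" with assigned labels into training, then verify ownership later by the model's accuracy on that set. Function names and the verification threshold are illustrative assumptions:

```python
# Sketch: watermarking by backdooring with a secret trigger set.
import torch

def train_step(model, opt, loss_fn, batch, trigger_batch):
    x, y = batch
    tx, ty = trigger_batch                      # secret trigger inputs/labels
    inputs = torch.cat([x, tx])
    labels = torch.cat([y, ty])
    opt.zero_grad()
    loss = loss_fn(model(inputs), labels)       # learn task + watermark jointly
    loss.backward()
    opt.step()
    return loss.item()

def verify_watermark(model, trigger_x, trigger_y, threshold=0.9):
    # Ownership claim: only a watermarked model classifies the trigger
    # set with high accuracy.
    preds = model(trigger_x).argmax(dim=1)
    return (preds == trigger_y).float().mean().item() >= threshold
```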

Approximating Hamiltonian dynamics with the Nyström method

Title Approximating Hamiltonian dynamics with the Nyström method
Authors Alessandro Rudi, Leonard Wossnig, Carlo Ciliberto, Andrea Rocchetto, Massimiliano Pontil, Simone Severini
Abstract Simulating the time-evolution of quantum mechanical systems is BQP-hard and expected to be one of the foremost applications of quantum computers. We consider classical algorithms for the approximation of Hamiltonian dynamics using subsampling methods from randomized numerical linear algebra. We derive a simulation technique whose runtime scales polynomially in the number of qubits and the Frobenius norm of the Hamiltonian. As an immediate application, we show that sample-based quantum simulation, a type of evolution where the Hamiltonian is a density matrix, can be efficiently classically simulated under specific structural conditions. Our main technical contribution is a randomized algorithm for approximating Hermitian matrix exponentials. The proof leverages a low-rank, symmetric approximation via the Nyström method. Our results suggest that under strong sampling assumptions there exist classical poly-logarithmic time simulations of quantum computations.
Tasks
Published 2018-04-06
URL https://arxiv.org/abs/1804.02484v4
PDF https://arxiv.org/pdf/1804.02484v4.pdf
PWC https://paperswithcode.com/paper/approximating-hamiltonian-dynamics-with-the
Repo
Framework
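
A minimal sketch of the key ingredient: a Nyström low-rank approximation of a Hermitian H, after which e^{-iHt}|ψ⟩ reduces to an m×m matrix exponential. Uniform column sampling is used here for simplicity; the paper's guarantees rely on more careful subsampling:

```python
# Sketch: apply exp(-iHt) to a state via a rank-m Nystrom approximation.
import numpy as np
from scipy.linalg import expm

def nystrom_evolve(H, psi, t, m, rng=np.random.default_rng(0)):
    n = H.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    C, W = H[:, idx], H[np.ix_(idx, idx)]
    Q, R = np.linalg.qr(C)                   # H_approx = Q M Q^dagger
    M = R @ np.linalg.pinv(W) @ R.conj().T
    M = (M + M.conj().T) / 2                 # keep M exactly Hermitian
    # For H_approx = Q M Q^dagger with orthonormal Q:
    # exp(-i t H_approx) = I + Q (exp(-i t M) - I) Q^dagger
    return psi + Q @ ((expm(-1j * t * M) - np.eye(m)) @ (Q.conj().T @ psi))
```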

Adaptive and Calibrated Ensemble Learning with Dependent Tail-free Process

Title Adaptive and Calibrated Ensemble Learning with Dependent Tail-free Process
Authors Jeremiah Zhe Liu, John Paisley, Marianthi-Anna Kioumourtzoglou, Brent A. Coull
Abstract Ensemble learning is a mainstay of modern data science practice. Conventional ensemble algorithms assign to base models a set of deterministic, constant model weights that (1) do not fully account for variations in base model accuracy across subgroups and (2) do not provide uncertainty estimates for the ensemble prediction. This can result in mis-calibrated (i.e., precise but biased) predictions that in turn negatively impact performance in real-world applications. In this work, we present an adaptive, probabilistic approach to ensemble learning that uses a dependent tail-free process as the ensemble weight prior. Given an input feature $\mathbf{x}$, our method optimally combines base models based on their predictive accuracy in the feature space $\mathbf{x} \in \mathcal{X}$, and provides interpretable uncertainty estimates both in model selection and in ensemble prediction. To encourage scalable and calibrated inference, we derive a structured variational inference algorithm that jointly minimizes the KL objective and the model’s calibration score (the Continuous Ranked Probability Score, CRPS). We illustrate the utility of our method on both a synthetic nonlinear function regression task and the real-world application of spatio-temporal integration of particle pollution prediction models in New England.
Tasks Calibration, Model Selection
Published 2018-12-08
URL http://arxiv.org/abs/1812.03350v2
PDF http://arxiv.org/pdf/1812.03350v2.pdf
PWC https://paperswithcode.com/paper/adaptive-and-calibrated-ensemble-learning
Repo
Framework
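
A point-estimate sketch of the adaptive-weight idea only (not the paper's Bayesian tail-free construction or CRPS-regularized inference): ensemble weights are a function of the input, produced by a small gating network, so different regions of feature space can favor different base models:

```python
# Sketch: input-dependent ensemble weights via a softmax gate.
import torch
import torch.nn as nn

class AdaptiveEnsemble(nn.Module):
    def __init__(self, n_models, in_dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(),
                                  nn.Linear(32, n_models))

    def forward(self, x, base_preds):
        # base_preds: (batch, n_models) predictions from pretrained models
        w = torch.softmax(self.gate(x), dim=-1)   # weights depend on x
        return (w * base_preds).sum(dim=-1)       # weighted ensemble output
```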

One Big Net For Everything

Title One Big Net For Everything
Authors Juergen Schmidhuber
Abstract I apply recent work on “learning to think” (2015) and on PowerPlay (2011) to the incremental training of an increasingly general problem solver, continually learning to solve new tasks without forgetting previous skills. The problem solver is a single recurrent neural network (or similar general purpose computer) called ONE. ONE is unusual in the sense that it is trained in various ways, e.g., by black box optimization / reinforcement learning / artificial evolution as well as supervised / unsupervised learning. For example, ONE may learn through neuroevolution to control a robot through environment-changing actions, and learn through unsupervised gradient descent to predict future inputs and vector-valued reward signals as suggested in 1990. User-given tasks can be defined through extra goal-defining input patterns, also proposed in 1990. Suppose ONE has already learned many skills. Now a copy of ONE can be re-trained to learn a new skill, e.g., through neuroevolution without a teacher. Here it may profit from re-using previously learned subroutines, but it may also forget previous skills. Then ONE is retrained in PowerPlay style (2011) on stored input/output traces of (a) ONE’s copy executing the new skill and (b) previous instances of ONE whose skills are still considered worth memorizing. Simultaneously, ONE is retrained on old traces (even those of unsuccessful trials) to become a better predictor, without additional expensive interaction with the environment. More and more control and prediction skills are thus collapsed into ONE, like in the chunker-automatizer system of the neural history compressor (1991). This forces ONE to relate partially analogous skills (with shared algorithmic information) to each other, creating common subroutines in the form of shared subnetworks of ONE, to greatly speed up subsequent learning of additional, novel but algorithmically related skills.
Tasks
Published 2018-02-24
URL http://arxiv.org/abs/1802.08864v1
PDF http://arxiv.org/pdf/1802.08864v1.pdf
PWC https://paperswithcode.com/paper/one-big-net-for-everything
Repo
Framework
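
A loose sketch of the consolidation step described above, under simplifying assumptions (supervised imitation of stored traces; not Schmidhuber's full specification): ONE is retrained on input/output traces of a copy that learned a new skill, mixed with traces of old skills, so earlier behaviors are not forgotten:

```python
# Sketch: PowerPlay-style consolidation by replaying stored traces.
import torch

def consolidate(one, opt, loss_fn, new_traces, old_traces):
    # Each trace is an (input, target_output) pair recorded from a copy
    # of ONE executing a skill; mixing old and new traces counters
    # catastrophic forgetting.
    for x, y in list(new_traces) + list(old_traces):
        opt.zero_grad()
        loss = loss_fn(one(x), y)   # imitate the stored behavior
        loss.backward()
        opt.step()
```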

A Data Dependent Multiscale Model for Hyperspectral Unmixing With Spectral Variability

Title A Data Dependent Multiscale Model for Hyperspectral Unmixing With Spectral Variability
Authors Ricardo Augusto Borsoi, Tales Imbiriba, José Carlos Moreira Bermudez
Abstract Spectral variability in hyperspectral images can result from factors including environmental, illumination, atmospheric, and temporal changes. Its occurrence may lead to the propagation of significant estimation errors in the unmixing process. To address this issue, extended linear mixing models have been proposed, which lead to large-scale, nonsmooth, ill-posed inverse problems. Furthermore, the regularization strategies used to obtain meaningful results introduce interdependencies among the abundance solutions that further increase the complexity of the resulting optimization problem. In this paper we present a novel data-dependent multiscale model for hyperspectral unmixing that accounts for spectral variability. The new method incorporates spatial contextual information into the abundances in extended linear mixing models by using a multiscale transform based on superpixels. The proposed method results in a fast algorithm that solves the abundance estimation problem only once at each scale during each iteration. Simulation results using synthetic and real images compare the performance, in both accuracy and execution time, of the proposed algorithm and other state-of-the-art solutions.
Tasks Hyperspectral Unmixing
Published 2018-08-02
URL https://arxiv.org/abs/1808.01047v4
PDF https://arxiv.org/pdf/1808.01047v4.pdf
PWC https://paperswithcode.com/paper/a-data-dependent-multiscale-model-for
Repo
Framework
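
A minimal sketch of the superpixel-based multiscale transform only: average spectra within superpixels to get a coarse-scale image, estimate abundances there, then map the coarse estimates back to pixels. The unmixing solver itself is omitted, and the SLIC parameters are illustrative assumptions:

```python
# Sketch: coarse/fine multiscale transform via superpixels.
import numpy as np
from skimage.segmentation import slic

def to_coarse(img, n_segments=200):
    # img: (H, W, bands) hyperspectral cube
    labels = slic(img, n_segments=n_segments, compactness=0.1,
                  channel_axis=-1)
    uniq, inv = np.unique(labels, return_inverse=True)
    labels = inv.reshape(labels.shape)           # relabel to 0..K-1
    coarse = np.stack([img[labels == s].mean(axis=0)
                       for s in range(len(uniq))])
    return coarse, labels                        # (K, bands), (H, W)

def to_fine(coarse_values, labels):
    # Broadcast each superpixel's value back to its member pixels.
    return coarse_values[labels]
```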

Online Learning: A Comprehensive Survey

Title Online Learning: A Comprehensive Survey
Authors Steven C. H. Hoi, Doyen Sahoo, Jing Lu, Peilin Zhao
Abstract Online learning represents an important family of machine learning algorithms in which a learner attempts to resolve an online prediction (or any type of decision-making) task by learning a model/hypothesis from a sequence of data instances one at a time. The goal of online learning is to ensure that the online learner makes a sequence of accurate predictions (or correct decisions) given knowledge of the correct answers to previous prediction or learning tasks and possibly additional information. This is in contrast to many traditional batch or offline machine learning algorithms, which are often designed to train a model from a given collection of training data instances. This article aims to provide a comprehensive survey of the online machine learning literature through a systematic review of basic ideas and key principles and a proper categorization of different algorithms and techniques. Generally speaking, according to the learning type and the form of feedback information, existing online learning work can be classified into three major categories: (i) supervised online learning, where full feedback information is always available; (ii) online learning with limited feedback; and (iii) unsupervised online learning, where no feedback is available. Due to space limitations, the survey focuses mainly on the first category, but also briefly covers the basics of the other two. Finally, we discuss some open issues and attempt to shed light on potential future research directions in this field.
Tasks Decision Making
Published 2018-02-08
URL http://arxiv.org/abs/1802.02871v2
PDF http://arxiv.org/pdf/1802.02871v2.pdf
PWC https://paperswithcode.com/paper/online-learning-a-comprehensive-survey
Repo
Framework
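
A minimal example of the supervised online protocol the survey covers: the learner predicts one instance at a time, receives the true label, and updates immediately. The classic perceptron is used here as the simplest concrete instance:

```python
# Sketch: the online learning loop with the perceptron update rule.
import numpy as np

def online_perceptron(stream, dim):
    w, mistakes = np.zeros(dim), 0
    for x, y in stream:                   # y in {-1, +1}
        y_hat = 1 if w @ x >= 0 else -1   # 1. predict on the new instance
        if y_hat != y:                    # 2. receive feedback
            w += y * x                    # 3. update the hypothesis
            mistakes += 1
    return w, mistakes
```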

Disentangling Adversarial Robustness and Generalization

Title Disentangling Adversarial Robustness and Generalization
Authors David Stutz, Matthias Hein, Bernt Schiele
Abstract Obtaining deep networks that are robust against adversarial examples and generalize well is an open problem. A recent hypothesis even states that both robust and accurate models are impossible, i.e., adversarial robustness and generalization are conflicting goals. In an effort to clarify the relationship between robustness and generalization, we assume an underlying, low-dimensional data manifold and show that: 1. regular adversarial examples leave the manifold; 2. adversarial examples constrained to the manifold, i.e., on-manifold adversarial examples, exist; 3. on-manifold adversarial examples are generalization errors, and on-manifold adversarial training boosts generalization; 4. regular robustness and generalization are not necessarily contradicting goals. These findings imply that both robust and accurate models are possible. However, different models (architectures, training strategies, etc.) can exhibit different robustness and generalization characteristics. To confirm our claims, we present extensive experiments on synthetic data (with known manifold) as well as on EMNIST, Fashion-MNIST, and CelebA.
Tasks
Published 2018-12-03
URL http://arxiv.org/abs/1812.00740v2
PDF http://arxiv.org/pdf/1812.00740v2.pdf
PWC https://paperswithcode.com/paper/disentangling-adversarial-robustness-and
Repo
Framework
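
A minimal sketch of on-manifold adversarial search, under the assumption that the manifold is represented by a generative model G: the perturbation is applied to the latent code rather than to the image, so the adversarial example stays on the learned data manifold. This illustrates the concept, not the paper's exact procedure:

```python
# Sketch: PGD-style attack in the latent space of a generator G.
import torch

def on_manifold_attack(G, classifier, z, label, eps=0.3, steps=10, lr=0.05):
    delta = torch.zeros_like(z, requires_grad=True)
    for _ in range(steps):
        loss = torch.nn.functional.cross_entropy(
            classifier(G(z + delta)), label)
        loss.backward()
        with torch.no_grad():
            delta += lr * delta.grad.sign()  # ascend the classifier loss
            delta.clamp_(-eps, eps)          # stay near the original code
            delta.grad.zero_()
    return G(z + delta).detach()             # on-manifold adversarial example
```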

Code Failure Prediction and Pattern Extraction using LSTM Networks

Title Code Failure Prediction and Pattern Extraction using LSTM Networks
Authors Mahdi Hajiaghayi, Ehsan Vahedi
Abstract In this paper, we use a well-known deep learning technique, Long Short-Term Memory (LSTM) recurrent neural networks, to find sessions that are prone to code failure in applications that rely on telemetry data for system health monitoring. We also use LSTM networks to extract the telemetry patterns that lead to a specific code failure. For code failure prediction, we treat telemetry events, sequences of telemetry events, and the outcome of each sequence as the words, sentences, and sentiments of sentiment analysis, respectively. Our proposed method is able to process a large set of data and can automatically handle edge cases in code failure prediction. We take advantage of Bayesian optimization to find the optimal hyperparameters as well as the type of LSTM cell that leads to the best prediction performance. We then introduce the concepts of Contributors and Blockers: in this paper, contributors are the set of events that cause a code failure, while blockers are events that each individually prevent a code failure from happening, even in the presence of one or multiple contributors. Once the proposed LSTM model is trained, we use a greedy approach to find the contributors and blockers. To develop and test our proposed method, we first use synthetic (simulated) data, generated using a number of rules for code failures as well as a number of rules for preventing a code failure from happening. The trained LSTM model shows over 99% accuracy in detecting code failures in the synthetic data. The results from the proposed method outperform classical learning models such as Decision Tree and Random Forest. Using the proposed greedy method, we are able to find the contributors and blockers in the synthetic data in more than 90% of the cases, with performance better than sequential rule and pattern mining algorithms.
Tasks Sentiment Analysis
Published 2018-12-13
URL http://arxiv.org/abs/1812.05237v1
PDF http://arxiv.org/pdf/1812.05237v1.pdf
PWC https://paperswithcode.com/paper/code-failure-prediction-and-pattern
Repo
Framework
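
A minimal sketch of the setup described above: telemetry events are treated like words and each session like a sentence, with an LSTM classifying the session outcome. Vocabulary and layer sizes are illustrative assumptions, not the paper's tuned configuration:

```python
# Sketch: LSTM session classifier in the style of sentiment analysis.
import torch
import torch.nn as nn

class SessionClassifier(nn.Module):
    def __init__(self, n_events=500, dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(n_events, dim)    # event ID -> vector ("word")
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)           # failure vs. success

    def forward(self, event_ids):                 # (batch, seq_len) LongTensor
        _, (h, _) = self.lstm(self.emb(event_ids))
        return self.out(h[-1])                    # logits per session
```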