January 25, 2020

Paper Group ANR 1632

Resolution enhancement in scanning electron microscopy using deep learning. Automated Classification of Seizures against Nonseizures: A Deep Learning Approach. FOSNet: An End-to-End Trainable Deep Neural Network for Scene Recognition. Spatiotemporal Feature Learning for Event-Based Vision. Learnability Can Be Independent of ZFC Axioms: Explanations …

Resolution enhancement in scanning electron microscopy using deep learning

Title Resolution enhancement in scanning electron microscopy using deep learning
Authors Kevin de Haan, Zachary S. Ballard, Yair Rivenson, Yichen Wu, Aydogan Ozcan
Abstract We report resolution enhancement in scanning electron microscopy (SEM) images using a generative adversarial network. We demonstrate the veracity of this deep learning-based super-resolution technique by inferring unresolved features in low-resolution SEM images and comparing them with the accurately co-registered high-resolution SEM images of the same samples. Through spatial frequency analysis, we also report that our method generates images with frequency spectra matching higher resolution SEM images of the same fields-of-view. By using this technique, higher resolution SEM images can be taken faster, while also reducing both electron charging and damage to the samples.
Tasks Super-Resolution
Published 2019-01-30
URL http://arxiv.org/abs/1901.11094v1
PDF http://arxiv.org/pdf/1901.11094v1.pdf
PWC https://paperswithcode.com/paper/resolution-enhancement-in-scanning-electron
Repo
Framework

Automated Classification of Seizures against Nonseizures: A Deep Learning Approach

Title Automated Classification of Seizures against Nonseizures: A Deep Learning Approach
Authors Xinghua Yao, Qiang Cheng, Guo-Qiang Zhang
Abstract In current clinical practice, electroencephalograms (EEG) are reviewed and analyzed by well-trained neurologists to support therapeutic decisions. This manual review is labor-intensive and error-prone. Automatic and accurate seizure/nonseizure classification methods are needed. One major problem is that the EEG signals for seizure state and nonseizure state exhibit considerable variations. In order to capture essential seizure features, this paper integrates an emerging deep learning model, the independently recurrent neural network (IndRNN), with a dense structure and an attention mechanism to exploit temporal and spatial discriminating features and overcome seizure variabilities. The dense structure is to ensure maximum information flow between layers. The attention mechanism is to capture spatial features. Evaluations are performed in cross-validation experiments over the noisy CHB-MIT data set. The obtained average sensitivity, specificity and precision of 88.80%, 88.60% and 88.69% are better than those of the current state-of-the-art methods. In addition, we explore how the segment length affects the classification performance. Thirteen different segment lengths are assessed, showing that the classification performance varies over the segment lengths, and the maximal fluctuating margin is more than 4%. Thus, the segment length is an important factor influencing the classification performance.
Tasks EEG
Published 2019-06-05
URL https://arxiv.org/abs/1906.02745v1
PDF https://arxiv.org/pdf/1906.02745v1.pdf
PWC https://paperswithcode.com/paper/automated-classification-of-seizures-against
Repo
Framework
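
The abstract above reports sensitivity, specificity and precision. As a minimal sketch of how these three figures are conventionally computed from binary predictions (the function name `binary_metrics` and the toy labels are assumptions for illustration, not the paper's code or data):

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, precision) for 0/1 labels, 1 = seizure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # seizures correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # nonseizures correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed seizures
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, specificity, precision


# Hypothetical labels for eight EEG segments (not taken from CHB-MIT).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

In a cross-validation setting such as the one described, these values would typically be computed per fold and then averaged.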

FOSNet: An End-to-End Trainable Deep Neural Network for Scene Recognition

Title FOSNet: An End-to-End Trainable Deep Neural Network for Scene Recognition
Authors Hongje Seong, Junhyuk Hyun, Euntai Kim
Abstract Scene recognition is an image recognition problem aimed at predicting the category of the place at which the image is taken. In this paper, a new scene recognition method using the convolutional neural network (CNN) is proposed. The proposed method is based on the fusion of the object and the scene information in the given image, and the CNN framework is named FOS (fusion of object and scene) Net. In addition, a new loss named scene coherence loss (SCL) is developed to train the FOSNet and to improve the scene recognition performance. The proposed SCL is based on the unique traits of the scene that the ‘sceneness’ spreads and the scene class does not change all over the image. The proposed FOSNet was evaluated on the three most popular scene recognition datasets, and state-of-the-art performance is obtained on two of them: 60.14% on Places 2 and 90.37% on MIT indoor 67. The second highest performance of 77.28% is obtained on SUN 397.
Tasks Scene Recognition
Published 2019-07-17
URL https://arxiv.org/abs/1907.07570v2
PDF https://arxiv.org/pdf/1907.07570v2.pdf
PWC https://paperswithcode.com/paper/fosnet-an-end-to-end-trainable-deep-neural
Repo
Framework

Spatiotemporal Feature Learning for Event-Based Vision

Title Spatiotemporal Feature Learning for Event-Based Vision
Authors Rohan Ghosh, Anupam Gupta, Siyi Tang, Alcimar Soares, Nitish Thakor
Abstract Unlike conventional frame-based sensors, event-based visual sensors output information through spikes at a high temporal resolution. By only encoding changes in pixel intensity, they showcase a low-power consuming, low-latency approach to visual information sensing. To use this information for higher sensory tasks like object recognition and tracking, an essential simplification step is the extraction and learning of features. An ideal feature descriptor must be robust to changes involving (i) local transformations and (ii) re-appearances of a local event pattern. To that end, we propose a novel spatiotemporal feature representation learning algorithm based on slow feature analysis (SFA). Using SFA, smoothly changing linear projections are learnt which are robust to local visual transformations. In order to determine if the features can learn to be invariant to various visual transformations, feature point tracking tasks are used for evaluation. Extensive experiments across two datasets demonstrate the adaptability of the spatiotemporal feature learner to translation, scaling and rotational transformations of the feature points. More importantly, we find that the obtained feature representations are able to exploit the high temporal resolution of such event-based cameras in generating better feature tracks.
Tasks Event-based vision, Object Recognition, Representation Learning
Published 2019-03-16
URL http://arxiv.org/abs/1903.06923v1
PDF http://arxiv.org/pdf/1903.06923v1.pdf
PWC https://paperswithcode.com/paper/spatiotemporal-feature-learning-for-event
Repo
Framework

Learnability Can Be Independent of ZFC Axioms: Explanations and Implications

Title Learnability Can Be Independent of ZFC Axioms: Explanations and Implications
Authors William Taylor
Abstract In Ben-David et al.’s “Learnability Can Be Undecidable,” they prove an independence result in theoretical machine learning. In particular, they define a new type of learnability, called Estimating The Maximum (EMX) learnability. They argue that this type of learnability fits in with other notions such as PAC learnability, Vapnik’s statistical learning setting, and other general learning settings. However, using some set-theoretic techniques, they show that some learning problems in the EMX setting are independent of ZFC. Specifically, they prove that ZFC cannot prove or disprove EMX learnability of the finite subsets on the [0,1] interval. Moreover, the way they prove it shows that there can be no characteristic dimension for EMX; and, hence, for general learning settings. Here, I will explain their findings, discuss some limitations on those findings, and offer some suggestions about how to excise that undecidability. Parts 2-3 will explain the results of the paper, parts 4-5 will discuss some limitations and next steps, and I will conclude in part 6.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.08410v1
PDF https://arxiv.org/pdf/1909.08410v1.pdf
PWC https://paperswithcode.com/paper/learnability-can-be-independent-of-zfc-axioms
Repo
Framework

Deconstructing Generative Adversarial Networks

Title Deconstructing Generative Adversarial Networks
Authors Banghua Zhu, Jiantao Jiao, David Tse
Abstract We deconstruct the performance of GANs into three components: 1. Formulation: we propose a perturbation view of the population target of GANs. Building on this interpretation, we show that GANs can be viewed as a generalization of the robust statistics framework, and propose a novel GAN architecture, termed Cascade GANs, to provably recover meaningful low-dimensional generator approximations when the real distribution is high-dimensional and corrupted by outliers. 2. Generalization: given a population target of GANs, we design a systematic principle, projection under admissible distance, to design GANs to meet the population requirement using finite samples. We implement our principle in three cases to achieve polynomial and sometimes near-optimal sample complexities: (1) learning an arbitrary generator under an arbitrary pseudonorm; (2) learning a Gaussian location family under TV distance, where we utilize our principle to provide a new proof for the optimality of Tukey median viewed as GANs; (3) learning a low-dimensional Gaussian approximation of a high-dimensional arbitrary distribution under Wasserstein distance. We demonstrate a fundamental trade-off in the approximation error and statistical error in GANs, and show how to apply our principle with empirical samples to predict how many samples are sufficient for GANs in order not to suffer from the discriminator winning problem. 3. Optimization: we demonstrate alternating gradient descent is provably not locally asymptotically stable in optimizing the GAN formulation of PCA. We diagnose the problem as the minimax duality gap being non-zero, and propose a new GAN architecture whose duality gap is zero, where the value of the game is equal to the previous minimax value (not the maximin value). We prove the new GAN architecture is globally asymptotically stable in optimization under alternating gradient descent.
Tasks
Published 2019-01-27
URL https://arxiv.org/abs/1901.09465v7
PDF https://arxiv.org/pdf/1901.09465v7.pdf
PWC https://paperswithcode.com/paper/deconstructing-generative-adversarial
Repo
Framework

Simple Strategies in Multi-Objective MDPs (Technical Report)

Title Simple Strategies in Multi-Objective MDPs (Technical Report)
Authors Florent Delgrange, Joost-Pieter Katoen, Tim Quatmann, Mickael Randour
Abstract We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining the Pareto front. We focus on strategies that are easy to employ and implement, that is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to solve the corresponding problem. The bounded memory case can be reduced to the stationary one by a product construction. Experimental results using Storm and Gurobi show the feasibility of our algorithms.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.11024v3
PDF https://arxiv.org/pdf/1910.11024v3.pdf
PWC https://paperswithcode.com/paper/simple-strategies-in-multi-objective-mdps
Repo
Framework
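
The abstract above refers to a trade-off analysis via the Pareto front. As a rough illustration of that notion only, the sketch below filters a finite set of candidate points down to the non-dominated ones. It assumes every objective is to be maximized and that the candidate vectors are already known; the points and the name `pareto_front` are hypothetical, and the code does not reproduce the paper's MILP-based achievability check.

```python
def pareto_front(points):
    """Keep the points not dominated by any other point (larger is better)."""

    def dominates(p, q):
        # p dominates q if it is at least as good everywhere and strictly better somewhere.
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical expected-reward vectors (objective 1, objective 2) of five strategies.
points = [(1.0, 5.0), (2.0, 4.0), (2.0, 2.0), (3.0, 1.0), (0.5, 0.5)]
print(pareto_front(points))
```

The quadratic pairwise comparison is adequate for a handful of points; larger sets would usually call for a sort-based sweep.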

Deep Factors for Forecasting

Title Deep Factors for Forecasting
Authors Yuyang Wang, Alex Smola, Danielle C. Maddix, Jan Gasthaus, Dean Foster, Tim Januschowski
Abstract Producing probabilistic forecasts for large collections of similar and/or dependent time series is a practically relevant and challenging task. Classical time series models fail to capture complex patterns in the data, and multivariate techniques struggle to scale to large problem sizes. Their reliance on strong structural assumptions makes them data-efficient, and allows them to provide uncertainty estimates. The converse is true for models based on deep neural networks, which can learn complex patterns and dependencies given enough data. In this paper, we propose a hybrid model that incorporates the benefits of both approaches. Our new method is data-driven and scalable via a latent, global, deep component. It also handles uncertainty through a local classical model. We provide both theoretical and empirical evidence for the soundness of our approach through a necessary and sufficient decomposition of exchangeable time series into a global and a local part. Our experiments demonstrate the advantages of our model in terms of data efficiency, accuracy and computational complexity.
Tasks Time Series
Published 2019-05-28
URL https://arxiv.org/abs/1905.12417v1
PDF https://arxiv.org/pdf/1905.12417v1.pdf
PWC https://paperswithcode.com/paper/deep-factors-for-forecasting
Repo
Framework

Confidence Bands and Hypothesis Test Methods for Recall and Precision Curves at Extremely Small Fractions with Applications to Drug Discovery

Title Confidence Bands and Hypothesis Test Methods for Recall and Precision Curves at Extremely Small Fractions with Applications to Drug Discovery
Authors Jeremy R. Ash, Jacqueline M. Hughes-Oliver
Abstract In virtual screening for drug discovery, recall curves are used to assess the performance of ranking algorithms, in which recall is a function of the fraction of data prioritized for experimental testing. Unfortunately, researchers almost never consider the uncertainty in the estimation of the recall curve when benchmarking algorithms. We confirm that a recently developed procedure for estimating pointwise confidence intervals for recall curves – and closely related variants, such as precision curves – can be applied to a variety of simulated data sets representative of those typically encountered in virtual screening. Since it is more desirable in benchmarks to present the uncertainty of performance over a range of testing fractions, we extend the pointwise confidence interval procedure to allow for the estimation of confidence bands for these curves. We also present hypothesis test methods to determine significant differences between the curves for competing algorithms. We show these methods have high power to detect significant differences at a range of small fractions typically tested, while maintaining control of type I error rate. These methods enable statistically rigorous comparisons of virtual screening algorithms using a metric that quantifies the aspect of performance that is of primary interest.
Tasks Drug Discovery
Published 2019-12-19
URL https://arxiv.org/abs/1912.09526v1
PDF https://arxiv.org/pdf/1912.09526v1.pdf
PWC https://paperswithcode.com/paper/confidence-bands-and-hypothesis-test-methods
Repo
Framework
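
The abstract above describes recall as a function of the fraction of data prioritized for testing. A minimal sketch of a single point on such a curve follows, assuming higher scores mean higher priority and labels are 1 for active compounds; the function name, scores and labels are illustrative assumptions, and the confidence-band and hypothesis-test procedures from the paper are not implemented here.

```python
def recall_at_fraction(scores, labels, fraction):
    """Recall among the top `fraction` of items ranked by descending score."""
    n_test = max(1, round(fraction * len(scores)))  # always test at least one item
    ranked = sorted(zip(scores, labels), key=lambda sl: sl[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_test])  # actives found in the tested set
    total_actives = sum(labels)
    return hits / total_actives if total_actives else float("nan")


# Hypothetical screening library of ten compounds, three of them active.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
for fraction in (0.1, 0.2, 0.5, 1.0):
    print(fraction, recall_at_fraction(scores, labels, fraction))
```

Evaluating the function over a grid of small fractions gives the empirical recall curve whose uncertainty the paper sets out to quantify.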

Ground Metric Learning on Graphs

Title Ground Metric Learning on Graphs
Authors Matthieu Heitz, Nicolas Bonneel, David Coeurjolly, Marco Cuturi, Gabriel Peyré
Abstract Optimal transport (OT) distances between probability distributions are parameterized by the ground metric they use between observations. Their relevance for real-life applications strongly hinges on whether that ground metric parameter is suitably chosen. Selecting it adaptively and algorithmically from prior knowledge, the so-called ground metric learning (GML) problem, has therefore appeared in various settings. We consider it in this paper when the learned metric is constrained to be a geodesic distance on a graph that supports the measures of interest. This imposes a rich structure for candidate metrics, but also enables far more efficient learning procedures when compared to a direct optimization over the space of all metric matrices. We use this setting to tackle an inverse problem stemming from the observation of a density evolving with time: we seek a graph ground metric such that the OT interpolation between the starting and ending densities that result from that ground metric agrees with the observed evolution. This OT dynamic framework is relevant to model natural phenomena exhibiting displacements of mass, such as the evolution of the color palette induced by the modification of lighting and materials.
Tasks Metric Learning
Published 2019-11-08
URL https://arxiv.org/abs/1911.03117v1
PDF https://arxiv.org/pdf/1911.03117v1.pdf
PWC https://paperswithcode.com/paper/ground-metric-learning-on-graphs
Repo
Framework

MaskPlus: Improving Mask Generation for Instance Segmentation

Title MaskPlus: Improving Mask Generation for Instance Segmentation
Authors Shichao Xu, Shuyue Lan, Qi Zhu
Abstract Instance segmentation is a promising yet challenging topic in computer vision. Recent approaches such as Mask R-CNN typically divide this problem into two parts – a detection component and a mask generation branch, and mostly focus on the improvement of the detection part. In this paper, we present an approach that extends Mask R-CNN with five novel optimization techniques for improving the mask generation branch and reducing the conflicts between the mask branch and the detection component in training. These five techniques are independent of each other and can be flexibly utilized in building various instance segmentation architectures for increasing the overall accuracy. We demonstrate the effectiveness of our approach with tests on the COCO dataset.
Tasks Instance Segmentation, Semantic Segmentation
Published 2019-07-15
URL https://arxiv.org/abs/1907.06713v3
PDF https://arxiv.org/pdf/1907.06713v3.pdf
PWC https://paperswithcode.com/paper/maskplus-improving-mask-generation-for
Repo
Framework

Universal Adversarial Perturbations for Speech Recognition Systems

Title Universal Adversarial Perturbations for Speech Recognition Systems
Authors Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
Abstract In this work, we demonstrate the existence of universal adversarial audio perturbations that cause mis-transcription of audio signals by automatic speech recognition (ASR) systems. We propose an algorithm to find a single quasi-imperceptible perturbation, which when added to any arbitrary speech signal, will most likely fool the victim speech recognition model. Our experiments demonstrate the application of our proposed technique by crafting audio-agnostic universal perturbations for the state-of-the-art ASR system – Mozilla DeepSpeech. Additionally, we show that such perturbations generalize to a significant extent across models that are not available during training, by performing a transferability test on a WaveNet based ASR system.
Tasks Speech Recognition
Published 2019-05-09
URL https://arxiv.org/abs/1905.03828v2
PDF https://arxiv.org/pdf/1905.03828v2.pdf
PWC https://paperswithcode.com/paper/universal-adversarial-perturbations-for
Repo
Framework

Linear and Quadratic Discriminant Analysis: Tutorial

Title Linear and Quadratic Discriminant Analysis: Tutorial
Authors Benyamin Ghojogh, Mark Crowley
Abstract This tutorial explains Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) as two fundamental classification methods in statistical and probabilistic learning. We start with the optimization of the decision boundary on which the posteriors are equal. Then, LDA and QDA are derived for binary and multiple classes. The estimation of parameters in LDA and QDA is also covered. Then, we explain how LDA and QDA are related to metric learning, kernel principal component analysis, Mahalanobis distance, logistic regression, Bayes optimal classifier, Gaussian naive Bayes, and likelihood ratio test. We also prove that LDA and Fisher discriminant analysis are equivalent. We finally clarify some of the theoretical concepts with simulations we provide.
Tasks Metric Learning
Published 2019-06-01
URL https://arxiv.org/abs/1906.02590v1
PDF https://arxiv.org/pdf/1906.02590v1.pdf
PWC https://paperswithcode.com/paper/190602590
Repo
Framework
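
The abstract above describes LDA and QDA as classifiers whose decision boundary lies where the class posteriors are equal. The sketch below shows the standard textbook discriminant scores for the one-dimensional Gaussian case, with LDA using a shared variance and QDA using a per-class variance. The function names, class parameters and priors are assumptions for illustration and do not follow the tutorial's notation.

```python
import math


def qda_score(x, mean, var, prior):
    """Quadratic discriminant: log prior plus Gaussian log-density (up to a constant)."""
    return -0.5 * math.log(var) - (x - mean) ** 2 / (2 * var) + math.log(prior)


def lda_score(x, mean, pooled_var, prior):
    """Linear discriminant: the QDA score with terms common to all classes dropped."""
    return x * mean / pooled_var - mean ** 2 / (2 * pooled_var) + math.log(prior)


def classify(x, classes, score):
    """Return the label with the largest discriminant score at x."""
    return max(classes, key=lambda label: score(x, *classes[label]))


# Hypothetical classes given as (mean, variance, prior).
shared = {"a": (0.0, 1.0, 0.5), "b": (4.0, 1.0, 0.5)}         # equal variances
unequal = {"narrow": (0.0, 1.0, 0.5), "wide": (0.0, 9.0, 0.5)}  # same mean, different spread

print(classify(1.0, shared, lda_score), classify(3.0, shared, lda_score))
print(classify(0.0, unequal, qda_score), classify(5.0, unequal, qda_score))
```

With equal variances and priors the LDA boundary sits midway between the means, while the second example can only be separated by the quadratic rule because both classes share the same mean.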

Scalable Spike Source Localization in Extracellular Recordings using Amortized Variational Inference

Title Scalable Spike Source Localization in Extracellular Recordings using Amortized Variational Inference
Authors Cole L. Hurwitz, Kai Xu, Akash Srivastava, Alessio P. Buccino, Matthias H. Hennig
Abstract Determining the positions of neurons in an extracellular recording is useful for investigating functional properties of the underlying neural circuitry. In this work, we present a Bayesian modelling approach for localizing the source of individual spikes on high-density microelectrode arrays. To allow for scalable inference, we implement our model as a variational autoencoder and perform amortized variational inference. We evaluate our method on both biophysically realistic simulated and real extracellular datasets, demonstrating that it is more accurate than heuristic localization methods such as center of mass and can improve spike sorting performance over them.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12375v3
PDF https://arxiv.org/pdf/1905.12375v3.pdf
PWC https://paperswithcode.com/paper/scalable-spike-source-localization-in
Repo
Framework

Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation

Title Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation
Authors Marcin Junczys-Dowmunt
Abstract This paper describes the Microsoft Translator submissions to the WMT19 news translation shared task for English-German. Our main focus is document-level neural machine translation with deep transformer models. We start with strong sentence-level baselines, trained on large-scale data created via data-filtering and noisy back-translation and find that back-translation seems to mainly help with translationese input. We explore fine-tuning techniques, deeper models and different ensembling strategies to counter these effects. Using document boundaries present in the authentic and synthetic parallel data, we create sequences of up to 1000 subword segments and train transformer translation models. We experiment with data augmentation techniques for the smaller authentic data with document-boundaries and for larger authentic data without boundaries. We further explore multi-task training for the incorporation of document-level source language monolingual data via the BERT-objective on the encoder and two-pass decoding for combinations of sentence-level and document-level systems. Based on preliminary human evaluation results, evaluators strongly prefer the document-level systems over our comparable sentence-level system. The document-level systems also seem to score higher than the human references in source-based direct assessment.
Tasks Data Augmentation, Machine Translation
Published 2019-07-14
URL https://arxiv.org/abs/1907.06170v1
PDF https://arxiv.org/pdf/1907.06170v1.pdf
PWC https://paperswithcode.com/paper/microsoft-translator-at-wmt-2019-towards
Repo
Framework