Paper Group ANR 409
Analysis Of Momentum Methods. A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm. Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM. High-dimensional semi-supervised learning: in search for optimal inference of the mean. Quality analysis of DCGAN-generated mammography lesions. Learning Structured Twin-Incoherent Twin-Projective Latent Dictionary Pairs for Classification. Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference. Biologic and Prognostic Feature Scores from Whole-Slide Histology Images Using Deep Learning. A Constructive Prediction of the Generalization Error Across Scales. FPGA-based Binocular Image Feature Extraction and Matching System. Nearest Neighbor Sampling of Point Sets using Random Rays. SenseNet: Deep Learning based Wideband spectrum sensing and modulation classification network. Adaptive Noise Injection: A Structure-Expanding Regularization for RNN. Catastrophic forgetting: still a problem for DNNs. Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors.
Analysis Of Momentum Methods
Title | Analysis Of Momentum Methods |
Authors | Nikola B. Kovachki, Andrew M. Stuart |
Abstract | Gradient descent-based optimization methods underpin the parameter training which results in the impressive results now found when testing neural networks. Introducing stochasticity is key to their success in practical problems, and there is some understanding of the role of stochastic gradient descent in this context. Momentum modifications of gradient descent, such as Polyak’s Heavy Ball method (HB) and Nesterov’s method of accelerated gradients (NAG), are widely adopted. In this work, our focus is on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm; to expose the ideas simply, we work in the deterministic setting. We show that, contrary to popular belief, standard implementations of fixed-momentum methods do no more than act to rescale the learning rate. We achieve this by showing that the momentum method converges to a gradient flow, with a momentum-dependent time rescaling, using the method of modified equations from numerical analysis. Further, we show that the momentum method admits an exponentially attractive invariant manifold on which the dynamics reduce to a gradient flow with respect to a modified loss function, equal to the original one plus a small perturbation. |
Tasks | |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04285v1 |
PDF | https://arxiv.org/pdf/1906.04285v1.pdf |
PWC | https://paperswithcode.com/paper/analysis-of-momentum-methods |
Repo | |
Framework | |
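To make the abstract's central claim concrete, here is a minimal numerical sketch (my own illustration, not the paper's analysis): on a toy quadratic loss, heavy-ball iterates with fixed momentum lam track plain gradient descent run with the learning rate rescaled by 1/(1 - lam). The loss, step size, and iteration count are assumptions chosen for the demo.

```python
# Heavy ball vs. rescaled gradient descent on a toy quadratic (illustrative only).
import numpy as np

A = np.diag([1.0, 10.0])                        # f(x) = 0.5 * x^T A x
grad = lambda x: A @ x

h, lam, steps = 1e-4, 0.9, 20_000
x_hb, x_gd, v = np.array([1.0, 1.0]), np.array([1.0, 1.0]), np.zeros(2)

for _ in range(steps):
    v = lam * v - h * grad(x_hb)                # heavy-ball momentum buffer
    x_hb = x_hb + v
    x_gd = x_gd - (h / (1 - lam)) * grad(x_gd)  # plain GD with rescaled step

print(np.linalg.norm(x_hb - x_gd))              # small: trajectories track each other
```

Shrinking h tightens the agreement, consistent with the modified-equation argument being a small-step-size statement.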
A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm
Title | A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm |
Authors | Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael I. Jordan |
Abstract | In this note, we derive concentration inequalities for random vectors with subGaussian norm (a generalization of both subGaussian random vectors and norm-bounded random vectors), which are tight up to logarithmic factors. |
Tasks | |
Published | 2019-02-11 |
URL | http://arxiv.org/abs/1902.03736v1 |
PDF | http://arxiv.org/pdf/1902.03736v1.pdf |
PWC | https://paperswithcode.com/paper/a-short-note-on-concentration-inequalities |
Repo | |
Framework | |
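As a quick illustration of the kind of statement at play, here is a Monte Carlo check of a standard Gaussian fact (my own demo, not a result copied from the note): for X ~ N(0, I_d), the norm ||X|| concentrates around sqrt(d), and the empirical tail stays below the sub-Gaussian bound exp(-t^2/2).

```python
# Empirical tail of ||X|| for X ~ N(0, I_d) versus the sub-Gaussian bound.
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 200_000
norms = np.linalg.norm(rng.standard_normal((n, d)), axis=1)

for t in [1.0, 2.0, 3.0]:
    empirical = np.mean(norms >= np.sqrt(d) + t)   # P(||X|| >= sqrt(d) + t)
    bound = np.exp(-t**2 / 2)
    print(f"t={t}: empirical {empirical:.2e} <= bound {bound:.2e}")
```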
Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM
Title | Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM |
Authors | Sheng Lin, Xiaolong Ma, Shaokai Ye, Geng Yuan, Kaisheng Ma, Yanzhi Wang |
Abstract | Weight quantization is one of the most important techniques for Deep Neural Network (DNN) model compression. A recent work, using a systematic framework for DNN weight quantization with the advanced optimization algorithm ADMM (Alternating Direction Method of Multipliers), achieves state-of-the-art results in weight quantization. In this work, we first extend this ADMM-based framework to guarantee solution feasibility, and we further develop a multi-step, progressive DNN weight quantization framework, with the dual benefits of (i) achieving further weight quantization thanks to the special property of ADMM regularization, and (ii) reducing the search space within each step. Extensive experimental results demonstrate superior performance compared with prior work. Some highlights: we derive the first lossless and fully binarized (for all layers) LeNet-5 for MNIST; and we derive the first fully binarized (for all layers) VGG-16 for CIFAR-10 and ResNet for ImageNet with reasonable accuracy loss. |
Tasks | Model Compression, Quantization |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00789v1 |
PDF | https://arxiv.org/pdf/1905.00789v1.pdf |
PWC | https://paperswithcode.com/paper/toward-extremely-low-bit-and-lossless |
Repo | |
Framework | |
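The ADMM regularization the abstract refers to can be sketched on a toy problem. Below is a minimal, hypothetical version (the least-squares loss, ternary set {-a, 0, +a}, rho, and learning rate are my assumptions, not the paper's setup): alternate a gradient step on the augmented loss, a projection onto the quantized set, and a dual update.

```python
# Minimal ADMM-style weight quantization on a toy least-squares "loss".
import numpy as np

rng = np.random.default_rng(1)
X, w_true = rng.standard_normal((200, 20)), rng.standard_normal(20)
y = X @ w_true

def loss_grad(w):                   # gradient of 0.5 * ||Xw - y||^2
    return X.T @ (X @ w - y)

def project(w, a=0.5):              # nearest point in {-a, 0, +a}^d
    q = a * np.sign(w)
    q[np.abs(w) < a / 2] = 0.0
    return q

w, z, u = np.zeros(20), np.zeros(20), np.zeros(20)
rho, lr = 1.0, 1e-3
for _ in range(2000):
    w -= lr * (loss_grad(w) + rho * (w - z + u))  # W-step on the augmented loss
    z = project(w + u)                            # Z-step: projection onto Q
    u += w - z                                    # dual update
print("constraint gap:", np.linalg.norm(w - z))   # shrinks as ADMM proceeds
```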
High-dimensional semi-supervised learning: in search for optimal inference of the mean
Title | High-dimensional semi-supervised learning: in search for optimal inference of the mean |
Authors | Yuqian Zhang, Jelena Bradic |
Abstract | We provide a high-dimensional semi-supervised inference framework focused on the mean and variance of the response. Our data comprise an extensive set of observations of the covariate vectors and a much smaller set of labeled observations where we observe both the response and the covariates. We allow the dimension of the covariates to be much larger than the sample size and impose weak conditions on the statistical form of the data. We provide new estimators of the mean and variance of the response that extend some of the recent results presented in low-dimensional models. In particular, we do not always require consistent estimation of the functional form of the data. Together with estimation of the population mean and variance, we provide their asymptotic distributions and confidence intervals, where we showcase gains in efficiency compared to the sample mean and variance. With minor modifications, our procedure also makes important contributions to inference about average treatment effects. We further investigate the robustness of estimation and coverage, and showcase the widespread applicability and generality of the proposed method. |
Tasks | |
Published | 2019-02-02 |
URL | http://arxiv.org/abs/1902.00772v1 |
PDF | http://arxiv.org/pdf/1902.00772v1.pdf |
PWC | https://paperswithcode.com/paper/high-dimensional-semi-supervised-learning-in |
Repo | |
Framework | |
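A toy version of the semi-supervised idea (my construction, stripped of the paper's high-dimensional machinery): use the large unlabeled covariate sample to correct the labeled-sample mean of the response through a regression adjustment.

```python
# Semi-supervised mean estimation: labeled mean plus a covariate-shift correction.
import numpy as np

rng = np.random.default_rng(2)
n, N, d = 100, 10_000, 5                      # small labeled set, large covariate set
beta = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
X_all = rng.standard_normal((N, d))
X_lab = X_all[:n]
y = X_lab @ beta + rng.standard_normal(n)     # responses observed only on X_lab

b_hat, *_ = np.linalg.lstsq(X_lab, y, rcond=None)           # fitted working model
mu_naive = y.mean()                                         # labeled sample mean
mu_ss = y.mean() + (X_all.mean(0) - X_lab.mean(0)) @ b_hat  # adjusted estimator
print(mu_naive, mu_ss)   # both near the true mean 0; mu_ss has smaller variance
```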
Quality analysis of DCGAN-generated mammography lesions
Title | Quality analysis of DCGAN-generated mammography lesions |
Authors | Basel Alyafi, Oliver Diaz, Joan C Vilanova, Javier del Riego, Robert Marti |
Abstract | Medical image synthesis has gained great attention recently, especially after the introduction of Generative Adversarial Networks (GANs). GANs have been used widely to provide anatomically plausible and diverse samples for augmentation and other applications, including segmentation and super resolution. In our previous work, Deep Convolutional GANs were used to generate synthetic mammogram lesions, mainly masses, that could enhance classification performance on imbalanced datasets. In this new work, a deeper investigation was carried out to explore other aspects of evaluating the generated images, i.e., realism, feature-space distribution, and observer studies. t-distributed Stochastic Neighbor Embedding (t-SNE) was used to reduce the dimensionality of real and fake images to enable 2D visualisations. Additionally, two expert radiologists performed a realism-evaluation study. Visualisations showed that the generated images have a feature distribution similar to that of the real ones, avoiding outliers. Moreover, the Receiver Operating Characteristic (ROC) curve showed that the radiologists could not, in many cases, distinguish between synthetic and real lesions, giving accuracies of 48% and 61% on a balanced sample set. |
Tasks | Image Generation, Super-Resolution |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12850v2 |
PDF | https://arxiv.org/pdf/1911.12850v2.pdf |
PWC | https://paperswithcode.com/paper/quality-analysis-of-dcgan-generated |
Repo | |
Framework | |
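The t-SNE comparison described above can be sketched as follows; random arrays stand in for the real and DCGAN-generated lesion patches, since data loading is outside the scope of this note.

```python
# 2D t-SNE visualisation of real vs. generated feature distributions (toy data).
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
real = rng.standard_normal((200, 64 * 64))        # placeholder "real" patches
fake = 1.1 * rng.standard_normal((200, 64 * 64))  # placeholder "generated" patches

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    np.vstack([real, fake]))
plt.scatter(*emb[:200].T, s=8, label="real")
plt.scatter(*emb[200:].T, s=8, label="generated")
plt.legend(); plt.title("t-SNE of real vs. generated lesions"); plt.show()
```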
Learning Structured Twin-Incoherent Twin-Projective Latent Dictionary Pairs for Classification
Title | Learning Structured Twin-Incoherent Twin-Projective Latent Dictionary Pairs for Classification |
Authors | Zhao Zhang, Yulin Sun, Zheng Zhang, Yang Wang, Guangcan Liu, Meng Wang |
Abstract | In this paper, we extend the popular dictionary pair learning (DPL) method to the scenario of twin-projective latent flexible DPL under a structured twin-incoherence. Technically, a novel framework called Twin-Projective Latent Flexible DPL (TP-DPL) is proposed, which minimizes the twin-incoherence constrained, flexibly-relaxed reconstruction error to avoid possible over-fitting and produce accurate reconstruction. In this setting, our TP-DPL integrates the twin-incoherence based latent flexible DPL and the joint embedding of codes as well as salient features by twin-projection into a unified model in an adaptive neighborhood-preserving manner. As a result, TP-DPL unifies salient feature extraction, representation, and classification. The twin-incoherence constraint on codes and features can explicitly ensure high intra-class compactness and inter-class separation over them. TP-DPL also integrates adaptive weighting to explicitly preserve the local neighborhood of the coefficients and salient features within each class. For efficiency, TP-DPL uses the Frobenius norm and abandons the costly l0/l1-norm for group sparse representation. Another byproduct is that TP-DPL can directly apply the class-specific twin-projective reconstruction residual to compute the label of data. Extensive results on public databases show that TP-DPL can deliver state-of-the-art performance. |
Tasks | |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07878v1 |
PDF | https://arxiv.org/pdf/1908.07878v1.pdf |
PWC | https://paperswithcode.com/paper/190807878 |
Repo | |
Framework | |
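The class-specific reconstruction-residual rule mentioned at the end of the abstract can be illustrated with a toy sketch (the dictionaries here are random and the analysis operator is a crude pseudo-inverse stand-in; TP-DPL learns both jointly): assign a sample to the class whose dictionary pair reconstructs it best.

```python
# Classification by class-specific reconstruction residual (toy dictionary pairs).
import numpy as np

rng = np.random.default_rng(4)
d, atoms, n_classes = 20, 5, 3
D = [rng.standard_normal((d, atoms)) for _ in range(n_classes)]  # synthesis dicts
P = [np.linalg.pinv(Dk) for Dk in D]          # crude analysis-operator stand-in

def classify(x):
    residuals = [np.linalg.norm(x - D[k] @ (P[k] @ x)) for k in range(n_classes)]
    return int(np.argmin(residuals))

x = D[1] @ rng.standard_normal(atoms)         # a sample in the class-1 subspace
print(classify(x))                            # -> 1
```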
Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference
Title | Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference |
Authors | Peter Kraft, Daniel Kang, Deepak Narayanan, Shoumik Palkar, Peter Bailis, Matei Zaharia |
Abstract | Systems for ML inference are widely deployed today, but they typically optimize ML inference workloads using techniques designed for conventional data serving workloads and miss critical opportunities to leverage the statistical nature of ML. In this paper, we present Willump, an optimizer for ML inference that introduces two statistically-motivated optimizations targeting ML applications whose performance bottleneck is feature computation. First, Willump automatically cascades feature computation for classification queries: Willump classifies most data inputs using only high-value, low-cost features selected through empirical observations of ML model performance, improving query performance by up to 5x without statistically significant accuracy loss. Second, Willump accurately approximates ML top-K queries, discarding low-scoring inputs with an automatically constructed approximate model and then ranking the remainder with a more powerful model, improving query performance by up to 10x with minimal accuracy loss. Willump automatically tunes these optimizations’ parameters to maximize query performance while meeting an accuracy target. Moreover, Willump complements these statistical optimizations with compiler optimizations to automatically generate fast inference code for ML applications. We show that Willump improves the end-to-end performance of real-world ML inference pipelines curated from major data science competitions by up to 16x without statistically significant loss of accuracy. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.01974v3 |
PDF | https://arxiv.org/pdf/1906.01974v3.pdf |
PWC | https://paperswithcode.com/paper/willump-a-statistically-aware-end-to-end |
Repo | |
Framework | |
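The cascade optimization can be sketched by hand (an illustration of the idea, not Willump's actual API): answer most queries with a cheap model over low-cost features, and compute the full feature set only when the cheap model is unsure. The stand-in models and the confidence threshold are placeholders.

```python
# A model cascade: cheap model first, expensive model only on low-confidence inputs.
import numpy as np

def cheap_model(x_cheap):           # stand-in: returns (label, confidence)
    p = 1 / (1 + np.exp(-x_cheap.sum()))
    return int(p > 0.5), max(p, 1 - p)

def full_model(x_all):              # stand-in for the expensive model
    return int(x_all.sum() > 0)

def cascade_predict(x_all, cheap_idx, conf_threshold=0.9):
    label, conf = cheap_model(x_all[cheap_idx])
    if conf >= conf_threshold:      # confident: skip expensive feature computation
        return label
    return full_model(x_all)        # fall back to all features

rng = np.random.default_rng(5)
print(cascade_predict(rng.standard_normal(50), cheap_idx=np.arange(5)))
```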
Biologic and Prognostic Feature Scores from Whole-Slide Histology Images Using Deep Learning
Title | Biologic and Prognostic Feature Scores from Whole-Slide Histology Images Using Deep Learning |
Authors | Okyaz Eminaga, Mahmood Abbas, Yuri Tolkach, Rosalie Nolley, Christian Kunder, Axel Semjonow, Martin Boegemann, Andreas Loening, James Brook, Daniel Rubin |
Abstract | Histopathology is a reflection of molecular changes and provides prognostic phenotypes representing disease progression. In this study, we introduced feature scores generated from hematoxylin-and-eosin histology images based on deep learning (DL) models developed for prostate pathology. We demonstrated that these feature scores were significantly prognostic for time-to-event endpoints (biochemical recurrence and cancer-specific survival) and simultaneously had molecular biologic associations with relevant genomic alterations and molecular subtypes, using already-trained DL models that were not previously exposed to the datasets of the current study. Further, we discussed the potential of such feature scores to improve the current tumor grading system, as well as the challenges associated with tumor heterogeneity and the development of prognostic models from histology images. Our findings uncover the potential of feature scores from histology images as digital biomarkers in precision medicine and as an expanding utility for digital pathology. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09100v3 |
PDF | https://arxiv.org/pdf/1910.09100v3.pdf |
PWC | https://paperswithcode.com/paper/biologic-and-prognostic-feature-scores-from |
Repo | |
Framework | |
A Constructive Prediction of the Generalization Error Across Scales
Title | A Constructive Prediction of the Generalization Error Across Scales |
Authors | Jonathan S. Rosenfeld, Amir Rosenfeld, Yonatan Belinkov, Nir Shavit |
Abstract | The dependency of the generalization error of neural networks on model and dataset size is of critical importance both in practice and for understanding the theory of neural networks. Nevertheless, the functional form of this dependency remains elusive. In this work, we present a functional form which approximates well the generalization error in practice. Capitalizing on the successful concept of model scaling (e.g., width, depth), we are able to simultaneously construct such a form and specify the exact models which can attain it across model/data scales. Our construction follows insights obtained from observations conducted over a range of model/data scales, in various model types and datasets, in vision and language tasks. We show that the form both fits the observations well across scales, and provides accurate predictions from small- to large-scale models and data. |
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12673v2 |
PDF | https://arxiv.org/pdf/1909.12673v2.pdf |
PWC | https://paperswithcode.com/paper/a-constructive-prediction-of-the |
Repo | |
Framework | |
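As a hedged sketch of what constructing and fitting such a functional form can look like in code, the following fits a simplified power-law-plus-floor form err(n, m) = a*n^-alpha + b*m^-beta + c_inf to synthetic error measurements. This simplified form and every constant in it are my assumptions; the form constructed in the paper is more refined.

```python
# Fitting a joint data/model scaling form to (synthetic) generalization errors.
import numpy as np
from scipy.optimize import curve_fit

def err_form(nm, a, alpha, b, beta, c_inf):
    n, m = nm                                    # data size, model size
    return a * n**-alpha + b * m**-beta + c_inf

rng = np.random.default_rng(6)
n = np.array([1e3, 1e4, 1e5] * 3)                # a small grid of scales
m = np.repeat([1e5, 1e6, 1e7], 3)
errs = err_form((n, m), 2.0, 0.35, 5.0, 0.3, 0.05) + 0.001 * rng.standard_normal(9)

params, _ = curve_fit(err_form, (n, m), errs,
                      p0=[1, 0.5, 1, 0.5, 0.0], maxfev=20_000)
print(params)    # recovered (a, alpha, b, beta, c_inf); usable for extrapolation
```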
FPGA-based Binocular Image Feature Extraction and Matching System
Title | FPGA-based Binocular Image Feature Extraction and Matching System |
Authors | Qi Ni, Fei Wang, Ziwei Zhao, Peng Gao |
Abstract | Image feature extraction and matching is a fundamental but computation-intensive task in machine vision. This paper proposes a novel FPGA-based embedded system to accelerate feature extraction and matching. It implements SURF feature-point detection and BRIEF feature-descriptor construction and matching. For binocular stereo vision, feature matching includes both tracking matching and stereo matching, which simultaneously provide feature-point correspondences and parallax information. Our system is evaluated on a ZYNQ XC7Z045 FPGA. The result demonstrates that it can process binocular video data at a high frame rate (640$\times$480 @ 162fps). Moreover, extensive tests show that our system is robust to image compression, blurring, and illumination changes. |
Tasks | Image Compression, Stereo Matching |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04890v2 |
PDF | https://arxiv.org/pdf/1905.04890v2.pdf |
PWC | https://paperswithcode.com/paper/fpga-based-binocular-image-feature-extraction |
Repo | |
Framework | |
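The descriptor-matching stage can be sketched in software (the paper implements it in FPGA logic): BRIEF descriptors are bit strings, so matching reduces to a nearest-neighbor search under Hamming distance, i.e., XOR followed by a popcount.

```python
# Brute-force BRIEF matching: Hamming distance via XOR + bit counting.
import numpy as np

rng = np.random.default_rng(7)
desc_left = rng.integers(0, 256, size=(300, 32), dtype=np.uint8)   # 256-bit BRIEF
desc_right = rng.integers(0, 256, size=(300, 32), dtype=np.uint8)

def match(d_left, d_right):
    xor = d_left[:, None, :] ^ d_right[None, :, :]  # pairwise XOR of packed bytes
    dist = np.unpackbits(xor, axis=2).sum(axis=2)   # Hamming distance matrix
    return dist.argmin(axis=1)                      # best right match per left feature

print(match(desc_left, desc_right)[:10])
```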
Nearest Neighbor Sampling of Point Sets using Random Rays
Title | Nearest Neighbor Sampling of Point Sets using Random Rays |
Authors | Liangchen Liu, Louis Ly, Colin Macdonald, Yen-Hsi Richard Tsai |
Abstract | We propose a new framework for the sampling, compression, and analysis of distributions of point sets and other geometric objects embedded in Euclidean spaces. A set of randomly selected rays is projected onto their closest points in the data set, forming the ray signature. From the signature, statistical information about the data set, as well as certain geometrical information, can be extracted, independent of the ray set. We present promising results from “RayNN”, a neural network for the classification of point clouds based on ray signatures. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10737v2 |
PDF | https://arxiv.org/pdf/1911.10737v2.pdf |
PWC | https://paperswithcode.com/paper/nearest-neighbor-sampling-of-point-sets-using |
Repo | |
Framework | |
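A small sketch of the ray-signature construction as the abstract describes it (the ray distribution and the exact contents of the signature are my assumptions): sample random rays and, for each ray, record the data point closest to it.

```python
# Ray signatures: each random ray is "projected" to its nearest point in the set.
import numpy as np

rng = np.random.default_rng(8)
points = rng.standard_normal((500, 3))            # the point set to summarize

def point_to_ray_dist(p, origin, direction):
    t = np.clip((p - origin) @ direction, 0, None)  # closest ray parameter, t >= 0
    return np.linalg.norm(p - (origin + t[:, None] * direction), axis=1)

def ray_signature(points, n_rays=64):
    sig = []
    for _ in range(n_rays):
        o = 5 * rng.standard_normal(3)            # ray origin away from the cloud
        u = rng.standard_normal(3)
        u /= np.linalg.norm(u)                    # unit direction
        sig.append(points[point_to_ray_dist(points, o, u).argmin()])
    return np.array(sig)                          # (n_rays, 3) signature

print(ray_signature(points).shape)
```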
SenseNet: Deep Learning based Wideband spectrum sensing and modulation classification network
Title | SenseNet: Deep Learning based Wideband spectrum sensing and modulation classification network |
Authors | Shivam Chandhok, Himani Joshi, A V Subramanyam, Sumit J. Darak |
Abstract | Next-generation networks are expected to operate in licensed, shared, and unlicensed spectrum to support the spectrum demands of a wide variety of services. Due to the shortage of radio spectrum, the need for communication systems (like cognitive radio) that can sense wideband spectrum and locate desired spectrum resources in real time has increased. An automatic modulation classifier (AMC) is an important part of wideband spectrum sensing (WSS), as it enables identification of incumbent users transmitting in the adjacent vacant spectrum. Most of the proposed AMCs work on Nyquist samples, which need to be further processed before they can be fed to the classifier. Working with Nyquist-sampled signals demands a high-rate ADC and results in high power consumption and high sensing time, which is unacceptable for next-generation communication systems. To overcome this drawback, we propose to use sub-Nyquist-sample-based WSS and modulation classification. In this paper, we propose a novel architecture called SenseNet which combines the tasks of spectrum sensing and modulation classification into a single unified pipeline. The proposed method is endowed with the capability to perform blind WSS and modulation classification directly on raw sub-Nyquist samples, which reduces complexity and sensing time since no prior estimation of sparsity is required. We extensively compare the performance of our proposed method on WSS as well as modulation classification tasks for a wide range of modulation schemes, input datasets, and channel conditions. A significant drawback of using sub-Nyquist samples is reduced performance compared to systems that employ Nyquist-sampled signals. However, we show that for the proposed method, the classification accuracy approaches that of Nyquist-sampling-based deep learning AMC as the signal-to-noise ratio increases. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05255v1 |
PDF | https://arxiv.org/pdf/1912.05255v1.pdf |
PWC | https://paperswithcode.com/paper/sensenet-deep-learning-based-wideband |
Repo | |
Framework | |
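A speculative PyTorch sketch of a unified pipeline in the spirit of SenseNet (the layer sizes, heads, and input format are my guesses; the paper's architecture will differ): a shared 1-D convolutional backbone over raw sub-Nyquist I/Q samples feeding one head for per-band occupancy and one for modulation class.

```python
# A two-head network: spectrum sensing + modulation classification (hypothetical).
import torch
import torch.nn as nn

class SenseNetSketch(nn.Module):
    def __init__(self, n_bands=8, n_mods=6):
        super().__init__()
        self.backbone = nn.Sequential(                 # shared feature extractor
            nn.Conv1d(2, 32, 7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(16), nn.Flatten())
        self.sense_head = nn.Linear(64 * 16, n_bands)  # per-band occupancy logits
        self.mod_head = nn.Linear(64 * 16, n_mods)     # modulation logits

    def forward(self, x):                              # x: (batch, 2, samples), I/Q
        z = self.backbone(x)
        return self.sense_head(z), self.mod_head(z)

occ, mod = SenseNetSketch()(torch.randn(4, 2, 1024))
print(occ.shape, mod.shape)                            # (4, 8), (4, 6)
```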
Adaptive Noise Injection: A Structure-Expanding Regularization for RNN
Title | Adaptive Noise Injection: A Structure-Expanding Regularization for RNN |
Authors | Rui Li, Kai Shuang, Mengyu Gu, Sen Su |
Abstract | The vanilla LSTM has become one of the most promising architectures for word-level language modeling, but, like other recurrent neural networks, overfitting is always a key barrier to its effectiveness. Existing noise-injection regularizations introduce random noise of fixed intensity, which inhibits the learning of the RNN throughout the training process. In this paper, we propose a new structure-expanding regularization method called Adaptive Noise Injection (ANI), which treats the output of an extra RNN branch as a kind of adaptive noise and injects it into the main-branch RNN output. Because the adaptive noise can improve as training proceeds, its negative effects can be weakened and even transformed into a positive effect that further improves the expressiveness of the main-branch RNN. As a result, ANI can regularize the RNN in the early stage of training and further promote its training performance in the later stage. We conduct experiments on three widely used corpora: PTB, WT2, and WT103, whose results verify both the regularization effect of ANI and its ability to promote training performance. Furthermore, we design a series of simulation experiments to explore the reasons that may lead to the regularization effect of ANI, and we find that, during training, robustness against parameter-update errors is strengthened when the LSTM is equipped with ANI. |
Tasks | Language Modelling |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10885v1 |
PDF | https://arxiv.org/pdf/1907.10885v1.pdf |
PWC | https://paperswithcode.com/paper/adaptive-noise-injection-a-structure |
Repo | |
Framework | |
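A rough PyTorch sketch of the idea as described above (the branch sizes and the additive mixing are my assumptions): an auxiliary RNN branch produces an input-dependent "noise" that is added to the main LSTM's output during training only.

```python
# Adaptive noise injection: an extra RNN branch perturbs the main LSTM's output.
import torch
import torch.nn as nn

class ANISketch(nn.Module):
    def __init__(self, vocab=10_000, emb=256, hid=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.main = nn.LSTM(emb, hid, batch_first=True)   # main branch
        self.noise = nn.RNN(emb, hid, batch_first=True)   # structure-expanding branch
        self.decode = nn.Linear(hid, vocab)

    def forward(self, tokens):
        x = self.embed(tokens)
        h_main, _ = self.main(x)
        h_noise, _ = self.noise(x)
        h = h_main + h_noise if self.training else h_main  # inject in training only
        return self.decode(h)

logits = ANISketch()(torch.randint(0, 10_000, (4, 35)))
print(logits.shape)                                        # (4, 35, 10000)
```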
Catastrophic forgetting: still a problem for DNNs
Title | Catastrophic forgetting: still a problem for DNNs |
Authors | B. Pfülb, A. Gepperth, S. Abdullah, A. Kilian |
Abstract | We investigate the performance of DNNs when trained on class-incremental visual problems consisting of initial training followed by retraining with added visual classes. Catastrophic forgetting (CF) behavior is measured using a new evaluation procedure that aims at an application-oriented view of incremental learning. In particular, it imposes that model selection must be performed on the initial dataset alone, and demands that retraining control be performed using only the retraining dataset, as the initial dataset is usually too large to be kept. Experiments are conducted on class-incremental problems derived from MNIST, using a variety of different DNN models, some of them recently proposed to avoid catastrophic forgetting. When comparing our new evaluation procedure to previous approaches for assessing CF, we find that their findings are completely negated and that none of the tested methods can avoid CF in all experiments. This stresses the importance of a realistic empirical measurement procedure for catastrophic forgetting, and the need for further research in incremental learning for DNNs. |
Tasks | Model Selection |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08077v1 |
PDF | https://arxiv.org/pdf/1905.08077v1.pdf |
PWC | https://paperswithcode.com/paper/catastrophic-forgetting-still-a-problem-for |
Repo | |
Framework | |
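A runnable toy version of the protocol's punchline (my construction, with a linear model standing in for a DNN): train on an initial class set, retrain on new classes without revisiting the old data, and watch accuracy on the old classes collapse.

```python
# Class-incremental retraining without the initial data -> catastrophic forgetting.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(9)
def make_task(classes, n=2000):
    y = rng.choice(classes, n)
    X = rng.standard_normal((n, 20)) + 3 * np.eye(20)[y]   # class-dependent means
    return X, y

X_a, y_a = make_task([0, 1, 2, 3, 4])          # initial task D1
X_b, y_b = make_task([5, 6, 7, 8, 9])          # retraining task D2

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(X_a, y_a, classes=np.arange(10))
print("on D1 before retraining:", clf.score(X_a, y_a))

for _ in range(20):                            # retrain on D2 only; D1 is discarded
    clf.partial_fit(X_b, y_b)
print("on D1 after retraining: ", clf.score(X_a, y_a))   # typically collapses
```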
Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors
Title | Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors |
Authors | Telma Pereira, Sofia Pires, Marta Gromicho, Susana Pinto, Mamede de Carvalho, Sara C. Madeira |
Abstract | Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by rapid motor decline, leading to respiratory failure and subsequently to death. In this context, researchers have sought models to automatically predict disease progression to assisted ventilation in ALS patients. However, the clinical translation of such models is limited by the lack of insight 1) on the risk of error for predictions at the patient level, and 2) on the most adequate time to administer non-invasive ventilation. To address these issues, we combine Conformal Prediction (a machine learning framework that complements predictions with confidence measures) and a mixture of experts into a prognostic model which predicts not only whether an ALS patient will suffer from respiratory insufficiency but also the most likely time window of occurrence, at a given reliability level. Promising results were obtained, with nearly 80% of predictions correctly identified. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13070v1 |
PDF | https://arxiv.org/pdf/1907.13070v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-assisted-ventilation-in |
Repo | |
Framework | |
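A minimal inductive conformal classification sketch (the generic recipe, not the paper's mixture-of-experts model): calibrate nonconformity scores on held-out data, then emit for each test point the set of labels whose conformal p-value exceeds the chosen significance level.

```python
# Inductive conformal prediction around an arbitrary probabilistic classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(10)
X = rng.standard_normal((600, 8))
y = (X[:, 0] + 0.5 * rng.standard_normal(600) > 0).astype(int)
X_tr, y_tr = X[:400], y[:400]
X_cal, y_cal, X_te = X[400:580], y[400:580], X[580:]

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
# Nonconformity score: one minus the probability assigned to the true label.
cal_scores = 1 - clf.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]

def prediction_set(x, alpha=0.2):
    probs = clf.predict_proba(x.reshape(1, -1))[0]
    pvals = [(np.sum(cal_scores >= 1 - probs[c]) + 1) / (len(cal_scores) + 1)
             for c in range(2)]
    return [c for c in range(2) if pvals[c] > alpha]   # labels kept at level alpha

print([prediction_set(x) for x in X_te[:5]])
```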