October 19, 2019

Paper Group ANR 161

BCCNet: Bayesian classifier combination neural network. Convergence Analysis of Gradient Descent Algorithms with Proportional Updates. FoldingZero: Protein Folding from Scratch in Hydrophobic-Polar Model. Learning Dual Convolutional Neural Networks for Low-Level Vision. Efficient Semantic Segmentation for Visual Bird’s-eye View Interpretation. Adve …

BCCNet: Bayesian classifier combination neural network


Title	BCCNet: Bayesian classifier combination neural network
Authors	Olga Isupova, Yunpeng Li, Danil Kuzin, Stephen J Roberts, Katherine Willis, Steven Reece
Abstract	Machine learning research for developing countries can demonstrate clear sustainable impact by delivering actionable and timely information to in-country government organisations (GOs) and NGOs in response to their critical information requirements. We co-create products with UK and in-country commercial, GO and NGO partners to ensure the machine learning algorithms address appropriate user needs whether for tactical decision making or evidence-based policy decisions. In one particular case, we developed and deployed a novel algorithm, BCCNet, to quickly process large quantities of unstructured data to prevent and respond to natural disasters. Crowdsourcing provides an efficient mechanism to generate labels from unstructured data to prime machine learning algorithms for large scale data analysis. However, these labels are often imperfect with qualities varying among different citizen scientists, which prohibits their direct use with many state-of-the-art machine learning techniques. We describe BCCNet, a framework that simultaneously aggregates biased and contradictory labels from the crowd and trains an automatic classifier to process new data. Our case studies, mosquito sound detection for malaria prevention and damage detection for disaster response, show the efficacy of our method in the challenging context of developing world applications.
Tasks	Decision Making
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12258v1
PDF	http://arxiv.org/pdf/1811.12258v1.pdf
PWC	https://paperswithcode.com/paper/bccnet-bayesian-classifier-combination-neural
Repo
Framework

Convergence Analysis of Gradient Descent Algorithms with Proportional Updates


Title	Convergence Analysis of Gradient Descent Algorithms with Proportional Updates
Authors	Igor Gitman, Deepak Dilipkumar, Ben Parr
Abstract	The rise of deep learning in recent years has brought with it increasingly clever optimization methods to deal with complex, non-linear loss functions. These methods are often designed with convex optimization in mind, but have been shown to work well in practice even for the highly non-convex optimization associated with neural networks. However, one significant drawback of these methods when they are applied to deep learning is that the magnitude of the update step is sometimes disproportionate to the magnitude of the weights (much smaller or larger), leading to training instabilities such as vanishing and exploding gradients. An idea to combat this issue is gradient descent with proportional updates. Gradient descent with proportional updates was introduced in 2017. It was independently developed by You et al (Layer-wise Adaptive Rate Scaling (LARS) algorithm) and by Abu-El-Haija (PercentDelta algorithm). The basic idea of both of these algorithms is to make each step of the gradient descent proportional to the current weight norm and independent of the gradient magnitude. It is common in the context of new optimization methods to prove convergence or derive regret bounds under the assumption of Lipschitz continuity and convexity. However, even though LARS and PercentDelta were shown to work well in practice, there is no theoretical analysis of the convergence properties of these algorithms. Thus it is not clear if the idea of gradient descent with proportional updates is used in the optimal way, or if it could be improved by using a different norm or specific learning rate schedule, for example. Moreover, it is not clear if these algorithms can be extended to other problems, besides neural networks. We attempt to answer these questions by establishing the theoretical analysis of gradient descent with proportional updates, and verifying this analysis with empirical examples.
Tasks
Published	2018-01-09
URL	http://arxiv.org/abs/1801.03137v1
PDF	http://arxiv.org/pdf/1801.03137v1.pdf
PWC	https://paperswithcode.com/paper/convergence-analysis-of-gradient-descent
Repo
Framework
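
The core idea in the abstract, a step whose length is proportional to the current weight norm and independent of the gradient magnitude, can be illustrated with a minimal sketch. This is not the LARS or PercentDelta algorithm as published (both add details such as per-layer handling that are omitted here); the function name `proportional_step`, the `rate` parameter, and the `eps` guard are assumptions made for illustration.

```python
import math


def proportional_step(weights, grads, rate=0.01, eps=1e-12):
    """One update whose length is about rate * ||weights||, whatever ||grads|| is.

    Hypothetical helper: a plain illustration of a proportional update,
    not the published LARS or PercentDelta rule.
    """
    w_norm = math.sqrt(sum(w * w for w in weights))
    g_norm = math.sqrt(sum(g * g for g in grads))
    # Rescale the gradient so only its direction matters; eps avoids 0/0.
    scale = rate * w_norm / (g_norm + eps)
    return [w - scale * g for w, g in zip(weights, grads)]


w = [3.0, 4.0]  # ||w|| = 5, so each step below has length close to 0.05
after_tiny_grad = proportional_step(w, [1e-6, 0.0])
after_huge_grad = proportional_step(w, [1e6, 0.0])
```

Both calls move the first coordinate by roughly the same amount even though the gradients differ by twelve orders of magnitude, which is the property the abstract attributes to this family of methods.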

FoldingZero: Protein Folding from Scratch in Hydrophobic-Polar Model


Title	FoldingZero: Protein Folding from Scratch in Hydrophobic-Polar Model
Authors	Yanjun Li, Hengtong Kang, Ketian Ye, Shuyu Yin, Xiaolin Li
Abstract	De novo protein structure prediction from amino acid sequence is one of the most challenging problems in computational biology. As one of the extensively explored mathematical models for protein folding, the Hydrophobic-Polar (HP) model enables thorough investigation of protein structure formation and evolution. Although the HP model discretizes the conformational space and simplifies the folding energy function, protein folding under it has been proven to be an NP-complete problem. In this paper, we propose a novel protein folding framework, FoldingZero, which self-folds a de novo protein 2D HP structure from scratch based on deep reinforcement learning. FoldingZero couples a two-head (policy and value heads) deep convolutional neural network (HPNet) with a regularized Upper Confidence Bounds for Trees (R-UCT) search. It is trained solely by a reinforcement learning algorithm, which improves HPNet and R-UCT through iterative policy optimization. Without any supervision or domain knowledge, FoldingZero not only achieves comparable results, but also learns the latent folding knowledge to stabilize the structure. Without exponential computation, FoldingZero shows promising potential to be adopted for real-world protein property prediction.
Tasks
Published	2018-12-03
URL	http://arxiv.org/abs/1812.00967v1
PDF	http://arxiv.org/pdf/1812.00967v1.pdf
PWC	https://paperswithcode.com/paper/foldingzero-protein-folding-from-scratch-in
Repo
Framework
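
For readers unfamiliar with the HP model, the simplified energy function mentioned in the abstract can be sketched as follows. In the standard 2D HP lattice model, each residue is hydrophobic (H) or polar (P), the chain is a self-avoiding walk on a square lattice, and every pair of H residues that are lattice neighbours but not chain neighbours contributes -1. The function name `hp_energy` and the coordinate-list representation are assumptions for illustration and are not taken from the FoldingZero paper.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D HP conformation: -1 per non-consecutive H-H lattice contact.

    sequence: string of 'H' and 'P'; coords: list of (x, y) lattice points,
    one per residue, in chain order. Hypothetical helper for illustration.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    position = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])    # -1
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])  # 0
```

Folding the four-residue chain into a square brings the two H residues into contact and lowers the energy from 0 to -1; a folding method searches over such walks for the lowest-energy conformation.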

Learning Dual Convolutional Neural Networks for Low-Level Vision


Title	Learning Dual Convolutional Neural Networks for Low-Level Vision
Authors	Jinshan Pan, Sifei Liu, Deqing Sun, Jiawei Zhang, Yang Liu, Jimmy Ren, Zechao Li, Jinhui Tang, Huchuan Lu, Yu-Wing Tai, Ming-Hsuan Yang
Abstract	In this paper, we propose a general dual convolutional neural network (DualCNN) for low-level vision problems, e.g., super-resolution, edge-preserving filtering, deraining and dehazing. These problems usually involve the estimation of two components of the target signals: structures and details. Motivated by this, our proposed DualCNN consists of two parallel branches, which recover the structures and details, respectively, in an end-to-end manner. The recovered structures and details can generate the target signals according to the formation model for each particular application. The DualCNN is a flexible framework for low-level vision tasks and can be easily incorporated into existing CNNs. Experimental results show that the DualCNN can be effectively applied to numerous low-level vision tasks with favorable performance against the state-of-the-art methods.
Tasks	Rain Removal, Super-Resolution
Published	2018-05-14
URL	http://arxiv.org/abs/1805.05020v1
PDF	http://arxiv.org/pdf/1805.05020v1.pdf
PWC	https://paperswithcode.com/paper/learning-dual-convolutional-neural-networks
Repo
Framework

Efficient Semantic Segmentation for Visual Bird’s-eye View Interpretation


Title	Efficient Semantic Segmentation for Visual Bird’s-eye View Interpretation
Authors	Timo Sämann, Karl Amende, Stefan Milz, Christian Witt, Martin Simon, Johannes Petzold
Abstract	The ability to perform semantic segmentation in real-time capable applications with limited hardware is of great importance. One such application is the interpretation of the visual bird’s-eye view, which requires the semantic segmentation of the four omnidirectional camera images. In this paper, we present an efficient semantic segmentation that sets new standards in terms of runtime and hardware requirements. Our two main contributions are the decrease of the runtime by parallelizing the ArgMax layer and the reduction of hardware requirements by applying the channel pruning method to the ENet model.
Tasks	Semantic Segmentation
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12008v1
PDF	http://arxiv.org/pdf/1811.12008v1.pdf
PWC	https://paperswithcode.com/paper/efficient-semantic-segmentation-for-visual
Repo
Framework

Adversarial Attack Type I: Cheat Classifiers by Significant Changes


Title	Adversarial Attack Type I: Cheat Classifiers by Significant Changes
Authors	Sanli Tang, Xiaolin Huang, Mingjian Chen, Chengjin Sun, Jie Yang
Abstract	Despite the great success of deep neural networks, the adversarial attack can cheat some well-trained classifiers by small perturbations. In this paper, we propose another type of adversarial attack that can cheat classifiers by significant changes. For example, we can significantly change a face but well-trained neural networks still recognize the adversarial and the original example as the same person. Statistically, the existing adversarial attack increases Type II error and the proposed one aims at Type I error; they are hence named Type II and Type I adversarial attack, respectively. The two types of attack are equally important but essentially different, which is intuitively explained and numerically evaluated. To implement the proposed attack, a supervised variational autoencoder is designed and then the classifier is attacked by updating the latent variables using gradient information. Besides, with pre-trained generative models, Type I attack on latent spaces is investigated as well. Experimental results show that our method is practical and effective to generate Type I adversarial examples on large-scale image datasets. Most of these generated examples can pass detectors designed for defending against Type II attack, and the strengthening strategy is only efficient with a specific type of attack, both implying that the underlying reasons for Type I and Type II attack are different.
Tasks	Adversarial Attack
Published	2018-09-03
URL	https://arxiv.org/abs/1809.00594v2
PDF	https://arxiv.org/pdf/1809.00594v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-attack-type-i-generating-false
Repo
Framework

k-Space Deep Learning for Parallel MRI: Application to Time-Resolved MR Angiography


Title	k-Space Deep Learning for Parallel MRI: Application to Time-Resolved MR Angiography
Authors	Eunju Cha, Eung Yeop Kim, Jong Chul Ye
Abstract	Time-resolved angiography with interleaved stochastic trajectories (TWIST) has been widely used for dynamic contrast enhanced MRI (DCE-MRI). To achieve highly accelerated acquisitions, TWIST combines the periphery of the k-space data from several adjacent frames to reconstruct one temporal frame. However, this view-sharing scheme limits the true temporal resolution of TWIST. Moreover, the k-space sampling patterns have been specially designed for a specific generalized autocalibrating partial parallel acquisition (GRAPPA) factor, so it is not possible to reduce the amount of view sharing once the k-space data is acquired. To address these issues, this paper proposes a novel k-space deep learning approach for parallel MRI. In particular, we have designed our neural network so that accurate k-space interpolations are performed simultaneously for multiple coils by exploiting the redundancies along the coils and images. Reconstruction results on an in vivo TWIST data set confirm that the proposed method can immediately generate high-quality reconstruction results with various choices of view sharing, allowing us to exploit the trade-off between spatial and temporal resolution in time-resolved MR angiography.
Tasks
Published	2018-06-03
URL	http://arxiv.org/abs/1806.00806v2
PDF	http://arxiv.org/pdf/1806.00806v2.pdf
PWC	https://paperswithcode.com/paper/k-space-deep-learning-for-parallel-mri
Repo
Framework

Boosted Training of Convolutional Neural Networks for Multi-Class Segmentation


Title	Boosted Training of Convolutional Neural Networks for Multi-Class Segmentation
Authors	Lorenz Berger, Eoin Hyde, Matt Gibb, Nevil Pavithran, Garin Kelly, Faiz Mumtaz, Sébastien Ourselin
Abstract	Training deep neural networks on large and sparse datasets is still challenging and can require large amounts of computation and memory. In this work, we address the task of performing semantic segmentation on large volumetric data sets, such as CT scans. Our contribution is threefold: 1) We propose a boosted sampling scheme that uses a posteriori error maps, generated throughout training, to focus sampling on difficult regions, resulting in a more informative loss. This yields a significant training speed-up and improves learning performance for image segmentation. 2) We propose a novel algorithm for boosting the SGD learning rate schedule by adaptively increasing and lowering the learning rate, avoiding the need for extensive hyperparameter tuning. 3) We show that our method is able to attain new state-of-the-art results on the VISCERAL Anatomy benchmark.
Tasks	Semantic Segmentation
Published	2018-06-13
URL	http://arxiv.org/abs/1806.05974v2
PDF	http://arxiv.org/pdf/1806.05974v2.pdf
PWC	https://paperswithcode.com/paper/boosted-training-of-convolutional-neural
Repo
Framework
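
The first contribution, focusing sampling on difficult regions by means of an error map, can be illustrated with a small sketch. The abstract does not specify the sampling rule, so the version below is an assumption: patch centres are drawn with probability proportional to the per-cell error plus a small `floor`, which keeps easy regions from never being visited. The function name `sample_patch_centres` and the 2D toy map are hypothetical.

```python
import random


def sample_patch_centres(error_map, n_samples, rng, floor=1e-3):
    """Draw (row, col) centres with probability proportional to error + floor.

    Hypothetical helper: one plausible reading of error-driven sampling,
    not the scheme from the paper.
    """
    cells = [(r, c) for r, row in enumerate(error_map) for c, _ in enumerate(row)]
    weights = [error_map[r][c] + floor for r, c in cells]
    return rng.choices(cells, weights=weights, k=n_samples)


# Toy error map: most of the error is concentrated in the centre cell.
error_map = [
    [0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.1],
]
rng = random.Random(0)  # seeded for repeatability
centres = sample_patch_centres(error_map, 1000, rng)
centre_share = centres.count((1, 1)) / len(centres)
```

With this toy map, the large majority of sampled centres fall on the high-error cell, so a training loop fed from these centres would spend most of its updates where the current model is worst.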

One-class Collective Anomaly Detection based on Long Short-Term Memory Recurrent Neural Networks


Title	One-class Collective Anomaly Detection based on Long Short-Term Memory Recurrent Neural Networks
Authors	Nga Nguyen Thi, Van Loi Cao, Nhien-An Le-Khac
Abstract	Intrusion detection for computer network systems has become one of the most critical tasks for network administrators today. It plays an important role for organizations, governments and our society due to the valuable resources hosted on computer networks. Traditional misuse detection strategies are unable to detect new and unknown intrusion types. In contrast, anomaly detection in network security aims to distinguish between illegal or malicious events and normal behavior of network systems. Anomaly detection can be considered as a classification problem in which models of normal network behavior are built and then used to detect new patterns that significantly deviate from the model. Most current approaches to anomaly detection are based on learning normal behavior and anomalous actions. They do not include memory; that is, they do not take previous events into account when classifying new ones. In this paper, we propose a one-class collective anomaly detection model based on neural network learning. Normally a Long Short-Term Memory Recurrent Neural Network (LSTM RNN) is trained only on normal data, and it is capable of predicting several time steps ahead of an input. In our approach, an LSTM RNN is trained on normal time series data before performing a prediction for each time step. Instead of considering each time step separately, the observation of prediction errors from a certain number of time steps is now proposed as a new idea for detecting collective anomalies. The prediction errors of a certain number of the latest time steps above a threshold will indicate a collective anomaly. The model is evaluated on a time series version of the KDD 1999 dataset. The experiments demonstrate that the proposed model is capable of detecting collective anomalies efficiently.
Tasks	Anomaly Detection, Intrusion Detection, Time Series
Published	2018-01-31
URL	http://arxiv.org/abs/1802.00324v1
PDF	http://arxiv.org/pdf/1802.00324v1.pdf
PWC	https://paperswithcode.com/paper/one-class-collective-anomaly-detection-based
Repo
Framework
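
The decision rule described at the end of the abstract can be sketched without the LSTM itself, given only a sequence of per-step prediction errors. The abstract leaves the exact rule open; the sketch below assumes one reading, namely that a collective anomaly is flagged when every one of the latest `window` errors exceeds `threshold`. The function name and both parameter values are hypothetical.

```python
def collective_anomalies(errors, window=3, threshold=0.5):
    """Return indices t at which the last `window` errors all exceed `threshold`.

    Hypothetical helper: one possible interpretation of the rule in the
    abstract, applied to precomputed prediction errors.
    """
    flagged = []
    run = 0  # length of the current streak of above-threshold errors
    for t, err in enumerate(errors):
        run = run + 1 if err > threshold else 0
        if run >= window:
            flagged.append(t)
    return flagged


errors = [0.1, 0.9, 0.2, 0.7, 0.8, 0.9, 0.6, 0.1]
flags = collective_anomalies(errors)  # [5, 6]
```

The isolated spike at index 1 is ignored, while the sustained run of large errors from index 3 onwards is flagged once it reaches three steps, which is the distinction between a point anomaly and a collective one.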

Discriminative analysis of the human cortex using spherical CNNs - a study on Alzheimer’s disease diagnosis


Title	Discriminative analysis of the human cortex using spherical CNNs - a study on Alzheimer’s disease diagnosis
Authors	Xinyang Feng, Jie Yang, Andrew F. Laine, Elsa D. Angelini
Abstract	In neuroimaging studies, the human cortex is commonly modeled as a sphere to preserve the topological structure of the cortical surface. Cortical neuroimaging measures hence can be modeled in spherical representation. In this work, we explore analyzing the human cortex using spherical CNNs in an Alzheimer’s disease (AD) classification task using cortical morphometric measures derived from structural MRI. Our results show superior performance in classifying AD versus cognitively normal and in predicting MCI progression within two years, using structural MRI information only. This work demonstrates for the first time the potential of the spherical CNNs framework in the discriminative analysis of the human cortex and could be extended to other modalities and other neurological diseases.
Tasks
Published	2018-12-19
URL	http://arxiv.org/abs/1812.07749v1
PDF	http://arxiv.org/pdf/1812.07749v1.pdf
PWC	https://paperswithcode.com/paper/discriminative-analysis-of-the-human-cortex
Repo
Framework

DIRECT: Deep Discriminative Embedding for Clustering of LIGO Data


Title	DIRECT: Deep Discriminative Embedding for Clustering of LIGO Data
Authors	Sara Bahaadini, Vahid Noroozi, Neda Rohani, Scott Coughlin, Michael Zevin, Aggelos K. Katsaggelos
Abstract	In this paper, benefiting from the strong ability of deep neural networks in estimating non-linear functions, we propose a discriminative embedding function to be used as a feature extractor for clustering tasks. The trained embedding function transfers knowledge from the domain of a labeled set of morphologically-distinct images, known as classes, to a new domain within which new classes can potentially be isolated and identified. Our target application in this paper is the Gravity Spy Project, which is an effort to characterize transient, non-Gaussian noise present in data from the Advanced Laser Interferometer Gravitational-wave Observatory, or LIGO. Accumulating large, labeled sets of noise features and identifying new classes of noise lead to a better understanding of their origin, which makes their removal from the data and/or detectors possible.
Tasks
Published	2018-05-07
URL	http://arxiv.org/abs/1805.02296v1
PDF	http://arxiv.org/pdf/1805.02296v1.pdf
PWC	https://paperswithcode.com/paper/direct-deep-discriminative-embedding-for
Repo
Framework

PAC-Bayes bounds for stable algorithms with instance-dependent priors


Title	PAC-Bayes bounds for stable algorithms with instance-dependent priors
Authors	Omar Rivasplata, Emilio Parrado-Hernandez, John Shawe-Taylor, Shiliang Sun, Csaba Szepesvari
Abstract	PAC-Bayes bounds have been proposed to get risk estimates based on a training sample. In this paper the PAC-Bayes approach is combined with stability of the hypothesis learned by a Hilbert space valued algorithm. The PAC-Bayes setting is used with a Gaussian prior centered at the expected output. Thus a novelty of our paper is using priors defined in terms of the data-generating distribution. Our main result estimates the risk of the randomized algorithm in terms of the hypothesis stability coefficients. We also provide a new bound for the SVM classifier, which is compared to other known bounds experimentally. Ours appears to be the first stability-based bound that evaluates to non-trivial values.
Tasks
Published	2018-06-18
URL	http://arxiv.org/abs/1806.06827v2
PDF	http://arxiv.org/pdf/1806.06827v2.pdf
PWC	https://paperswithcode.com/paper/pac-bayes-bounds-for-stable-algorithms-with
Repo
Framework

Unreasonable Effectivness of Deep Learning


Title	Unreasonable Effectivness of Deep Learning
Authors	Finn Macleod
Abstract	We show how well-known rules of back propagation arise from a weighted combination of finite automata. By redefining a finite automaton as a predictor we combine the set of all $k$-state finite automata using a weighted majority algorithm. This aggregated prediction algorithm can be simplified using symmetry, and we prove the equivalence of an algorithm that does this. We demonstrate that this algorithm is equivalent to a form of back propagation acting in a completely connected $k$-node neural network. Thus the use of the weighted majority algorithm allows a bound on the general performance of deep learning approaches to prediction via known results from online statistics. The presented framework opens more detailed questions about network topology; it is a bridge to the well-studied techniques of semigroup theory and to applying these techniques to answer what specific network topologies are capable of predicting. This informs both the design of artificial networks and the exploration of neuroscience models.
Tasks
Published	2018-03-28
URL	http://arxiv.org/abs/1803.10768v1
PDF	http://arxiv.org/pdf/1803.10768v1.pdf
PWC	https://paperswithcode.com/paper/unreasonable-effectivness-of-deep-learning
Repo
Framework
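
The weighted majority algorithm that the abstract builds on can be sketched in a few lines. This is the textbook deterministic version for binary predictions: every predictor starts with weight 1, the aggregate predicts with the weighted vote, and each predictor that errs has its weight multiplied by `beta`. The three toy predictors below are hypothetical stand-ins; the paper combines all $k$-state finite automata, which this sketch does not attempt.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Run the weighted majority algorithm on binary predictions.

    expert_predictions[t][i] is predictor i's 0/1 guess at round t.
    Returns (aggregate predictions, final predictor weights).
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    made = []
    for preds, truth in zip(expert_predictions, outcomes):
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        made.append(1 if vote_one >= vote_zero else 0)
        # Penalise every predictor that was wrong this round.
        weights = [w * beta if p != truth else w for w, p in zip(weights, preds)]
    return made, weights


# Toy predictors: always 0, always 1, and alternating 0/1.
rounds = [[0, 1, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
outcomes = [1, 1, 1, 1]
made, weights = weighted_majority(rounds, outcomes)
# made == [0, 1, 1, 1], weights == [0.0625, 1.0, 0.25]
```

After a single early mistake the aggregate follows the predictor that is always right, which is the behaviour behind the mistake bounds from online learning that the abstract refers to.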

Unknown Examples & Machine Learning Model Generalization


Title	Unknown Examples & Machine Learning Model Generalization
Authors	Yeounoh Chung, Peter J. Haas, Eli Upfal, Tim Kraska
Abstract	Over the past decades, researchers and ML practitioners have come up with better and better ways to build, understand and improve the quality of ML models, but mostly under the key assumption that the training data is distributed identically to the testing data. In many real-world applications, however, some potential training examples are unknown to the modeler, due to sample selection bias or, more generally, covariate shift, i.e., a distribution shift between the training and deployment stage. The resulting discrepancy between training and testing distributions leads to poor generalization performance of the ML model and hence biased predictions. We provide novel algorithms that estimate the number and properties of these unknown training examples—unknown unknowns. This information can then be used to correct the training set, prior to seeing any test data. The key idea is to combine species-estimation techniques with data-driven methods for estimating the feature values for the unknown unknowns. Experiments on a variety of ML models and datasets indicate that taking the unknown examples into account can yield a more robust ML model that generalizes better.
Tasks
Published	2018-08-24
URL	https://arxiv.org/abs/1808.08294v2
PDF	https://arxiv.org/pdf/1808.08294v2.pdf
PWC	https://paperswithcode.com/paper/unknown-examples-machine-learning-model
Repo
Framework

End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator


Title	End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator
Authors	Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Abstract	The speech chain mechanism integrates automatic speech recognition (ASR) and text-to-speech synthesis (TTS) modules into a single cycle during training. In our previous work, we applied a speech chain mechanism as a semi-supervised learning method. It provides the ability for ASR and TTS to assist each other when they receive unpaired data, and lets them infer the missing pair and optimize the model with reconstruction loss. If we only have speech without transcription, ASR generates the most likely transcription from the speech data, and then TTS uses the generated transcription to reconstruct the original speech features. However, in previous papers, we limited our back-propagation to the closest module, which is the TTS part. One reason is that back-propagating the error through the ASR is challenging because the output of the ASR consists of discrete tokens, creating non-differentiability between the TTS and ASR. In this paper, we address this problem and describe how to thoroughly train a speech chain end-to-end for reconstruction loss using a straight-through estimator (ST). Experimental results revealed that, with sampling from ST-Gumbel-Softmax, we were able to update ASR parameters and improve ASR performance with an 11% relative CER reduction compared to the baseline.
Tasks	Speech Recognition, Speech Synthesis, Text-To-Speech Synthesis
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13107v1
PDF	http://arxiv.org/pdf/1810.13107v1.pdf
PWC	https://paperswithcode.com/paper/end-to-end-feedback-loss-in-speech-chain
Repo
Framework
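
The sampling step behind ST-Gumbel-Softmax can be sketched in plain Python. The block below covers only the forward pass: it perturbs the logits with Gumbel noise, applies a temperature-scaled softmax, and returns both the relaxed probabilities and the one-hot token that would be passed on to the next module. The straight-through part, in which gradients flow through the soft probabilities while the forward value stays discrete, needs an automatic differentiation framework and is only described in a comment. The function name, the temperature value, and the toy logits are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Return (soft, hard): relaxed probabilities and their one-hot argmax.

    Hypothetical helper showing the forward pass only.
    """
    noisy = []
    for logit in logits:
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the Gumbel transform
            u = rng.random()
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    best = soft.index(max(soft))
    hard = [1.0 if i == best else 0.0 for i in range(len(soft))]
    # In an autodiff framework, the straight-through output is commonly
    # written as hard + soft - stop_gradient(soft): the forward value is
    # `hard`, while the gradient is taken with respect to `soft`.
    return soft, hard


rng = random.Random(0)  # seeded for repeatability
soft, hard = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=1.0, rng=rng)
```

Lower temperatures push `soft` closer to `hard`, which reduces the mismatch between the discrete forward value and the relaxed gradient at the cost of higher gradient variance.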