July 28, 2019

Paper Group ANR 445

A wake-sleep algorithm for recurrent, spiking neural networks

Title A wake-sleep algorithm for recurrent, spiking neural networks
Authors Johannes Thiele, Peter Diehl, Matthew Cook
Abstract We investigate a recently proposed model for cortical computation which performs relational inference. It consists of several interconnected, structurally equivalent populations of leaky integrate-and-fire (LIF) neurons, which are trained in a self-organized fashion with spike-timing dependent plasticity (STDP). Despite its robust learning dynamics, the model is susceptible to a problem typical for recurrent networks which use a correlation based (Hebbian) learning rule: if trained with high learning rates, the recurrent connections can cause strong feedback loops in the network dynamics, which lead to the emergence of attractor states. This causes a strong reduction in the number of representable patterns and a decay in the inference ability of the network. As a solution, we introduce a conceptually very simple “wake-sleep” algorithm: during the wake phase, training is executed normally, while during the sleep phase, the network “dreams” samples from its generative model, which are induced by random input. This process allows us to activate the attractor states in the network, which can then be unlearned effectively by an anti-Hebbian mechanism. The algorithm allows us to increase learning rates up to a factor of ten while avoiding clustering, which allows the network to learn several times faster. Also for low learning rates, where clustering is not an issue, it improves convergence speed and reduces the final inference error.
Tasks
Published 2017-03-18
URL http://arxiv.org/abs/1703.06290v1
PDF http://arxiv.org/pdf/1703.06290v1.pdf
PWC https://paperswithcode.com/paper/a-wake-sleep-algorithm-for-recurrent-spiking
Repo
Framework

On Quadratic Penalties in Elastic Weight Consolidation

Title On Quadratic Penalties in Elastic Weight Consolidation
Authors Ferenc Huszár
Abstract Elastic weight consolidation (EWC, Kirkpatrick et al, 2017) is a novel algorithm designed to safeguard against catastrophic forgetting in neural networks. EWC can be seen as an approximation to Laplace propagation (Eskin et al, 2004), and this view is consistent with the motivation given by Kirkpatrick et al (2017). In this note, I present an extended derivation that covers the case when there are more than two tasks. I show that the quadratic penalties in EWC are inconsistent with this derivation and might lead to double-counting data from earlier tasks.
Tasks
Published 2017-12-11
URL http://arxiv.org/abs/1712.03847v1
PDF http://arxiv.org/pdf/1712.03847v1.pdf
PWC https://paperswithcode.com/paper/on-quadratic-penalties-in-elastic-weight
Repo
Framework
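
The double-counting concern in the abstract above can be made concrete with a one-parameter sketch. The code below is not taken from the note: it simply contrasts two ways of writing a quadratic penalty once two earlier tasks have been seen, under the assumption that each task contributes a scalar Fisher value (`fishers`) and an anchor (the parameter value found after training on that task). The function names and numbers are hypothetical.

```python
def multi_anchor_penalty(theta, anchors, fishers):
    """One quadratic term per earlier task, each centred on that task's anchor."""
    return 0.5 * sum(f * (theta - a) ** 2 for a, f in zip(anchors, fishers))


def single_anchor_penalty(theta, latest_anchor, fishers):
    """A single quadratic term centred on the most recent anchor,
    with the per-task Fisher values accumulated into one precision."""
    return 0.5 * sum(fishers) * (theta - latest_anchor) ** 2


# Hypothetical values: anchor after task A, anchor after tasks A and B.
theta_a, theta_ab = 0.0, 1.0
fisher_a, fisher_b = 2.0, 3.0

theta = theta_ab  # evaluate both penalties at the latest anchor
print(multi_anchor_penalty(theta, [theta_a, theta_ab], [fisher_a, fisher_b]))
print(single_anchor_penalty(theta, theta_ab, [fisher_a, fisher_b]))
```

In this toy setting the single-anchor form is zero at the latest anchor, while the multi-anchor form still pulls the parameter back toward the first task's anchor, even though that anchor already influenced where the latest one ended up.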

A Generative Approach to Question Answering

Title A Generative Approach to Question Answering
Authors Rajarshee Mitra
Abstract Question answering has come a long way, from answer sentence selection and relational QA to reading comprehension. We shift our attention to generative question answering (gQA), in which a machine reads passages and answers questions by learning to generate the answers. We frame the problem as a generative task in which the encoder is a network that models the relationship between question and passage and encodes them into a vector, enabling the decoder to directly form an abstraction of the answer. Failing to retain facts and making repetitions are common mistakes that affect the overall legibility of answers. To counter these issues, we employ a copying mechanism and maintain a coverage vector in our model, respectively. Our results on MS-MARCO demonstrate its superiority over baselines, and we also show qualitative examples where we improved in terms of correctness and readability.
Tasks Question Answering
Published 2017-11-16
URL http://arxiv.org/abs/1711.06238v2
PDF http://arxiv.org/pdf/1711.06238v2.pdf
PWC https://paperswithcode.com/paper/a-generative-approach-to-question-answering
Repo
Framework

Segmentation of Intracranial Arterial Calcification with Deeply Supervised Residual Dropout Networks

Title Segmentation of Intracranial Arterial Calcification with Deeply Supervised Residual Dropout Networks
Authors Gerda Bortsova, Gijs van Tulder, Florian Dubost, Tingying Peng, Nassir Navab, Aad van der Lugt, Daniel Bos, Marleen de Bruijne
Abstract Intracranial carotid artery calcification (ICAC) is a major risk factor for stroke, and might contribute to dementia and cognitive decline. Reliance on time-consuming manual annotation of ICAC hampers much demanded further research into the relationship between ICAC and neurological diseases. Automation of ICAC segmentation is therefore highly desirable, but difficult due to the proximity of the lesions to bony structures with a similar attenuation coefficient. In this paper, we propose a method for automatic segmentation of ICAC; the first to our knowledge. Our method is based on a 3D fully convolutional neural network that we extend with two regularization techniques. Firstly, we use deep supervision (hidden layers supervision) to encourage discriminative features in the hidden layers. Secondly, we augment the network with skip connections, as in the recently developed ResNet, and dropout layers, inserted in a way that skip connections circumvent them. We investigate the effect of skip connections and dropout. In addition, we propose a simple problem-specific modification of the network objective function that restricts the focus to the most important image regions and simplifies the optimization. We train and validate our model using 882 CT scans and test on 1,000. Our regularization techniques and objective improve the average Dice score by 7.1%, yielding an average Dice of 76.2% and 97.7% correlation between predicted ICAC volumes and manual annotations.
Tasks
Published 2017-06-04
URL http://arxiv.org/abs/1706.01148v1
PDF http://arxiv.org/pdf/1706.01148v1.pdf
PWC https://paperswithcode.com/paper/segmentation-of-intracranial-arterial
Repo
Framework
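
The abstract above reports its results as Dice scores. As a reminder of what that metric measures, here is a minimal sketch of the Dice coefficient for two binary masks given as flat sequences of 0/1 values. The function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the paper.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


# One voxel in common, two predicted, one annotated: 2*1 / (2+1).
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```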

LDMNet: Low Dimensional Manifold Regularized Neural Networks

Title LDMNet: Low Dimensional Manifold Regularized Neural Networks
Authors Wei Zhu, Qiang Qiu, Jiaji Huang, Robert Calderbank, Guillermo Sapiro, Ingrid Daubechies
Abstract Deep neural networks have proved very successful on archetypal tasks for which large training sets are available, but when the training data are scarce, their performance suffers from overfitting. Many existing methods of reducing overfitting are data-independent, and their efficacy is often limited when the training set is very small. Data-dependent regularizations are mostly motivated by the observation that data of interest lie close to a manifold, which is typically hard to parametrize explicitly and often requires human input of tangent vectors. These methods typically only focus on the geometry of the input data, and do not necessarily encourage the networks to produce geometrically meaningful features. To resolve this, we propose a new framework, the Low-Dimensional-Manifold-regularized neural Network (LDMNet), which incorporates a feature regularization method that focuses on the geometry of both the input data and the output features. In LDMNet, we regularize the network by encouraging the combination of the input data and the output features to sample a collection of low dimensional manifolds, which are searched efficiently without explicit parametrization. To achieve this, we directly use the manifold dimension as a regularization term in a variational functional. The resulting Euler-Lagrange equation is a Laplace-Beltrami equation over a point cloud, which is solved by the point integral method without increasing the computational complexity. We demonstrate two benefits of LDMNet in the experiments. First, we show that LDMNet significantly outperforms widely-used network regularizers such as weight decay and DropOut. Second, we show that LDMNet can be designed to extract common features of an object imaged via different modalities, which proves to be very useful in real-world applications such as cross-spectral face recognition.
Tasks Face Recognition
Published 2017-11-16
URL http://arxiv.org/abs/1711.06246v1
PDF http://arxiv.org/pdf/1711.06246v1.pdf
PWC https://paperswithcode.com/paper/ldmnet-low-dimensional-manifold-regularized
Repo
Framework

Recent Progress of Face Image Synthesis

Title Recent Progress of Face Image Synthesis
Authors Zhihe Lu, Zhihang Li, Jie Cao, Ran He, Zhenan Sun
Abstract Face synthesis has been a fascinating yet challenging problem in computer vision and machine learning. Its main research effort is to design algorithms that generate photo-realistic face images from a given semantic domain. It has been a crucial preprocessing step of mainstream face recognition approaches and an excellent test of AI's ability to use complicated probability distributions. In this paper, we provide a comprehensive review of typical face synthesis works that involve traditional methods as well as advanced deep learning approaches. In particular, the Generative Adversarial Net (GAN) is highlighted for generating photo-realistic and identity-preserving results. Furthermore, the publicly available databases and evaluation metrics are introduced in detail. We end the review by discussing unsolved difficulties and promising directions for future research.
Tasks Face Generation, Face Recognition, Image Generation
Published 2017-06-15
URL http://arxiv.org/abs/1706.04717v1
PDF http://arxiv.org/pdf/1706.04717v1.pdf
PWC https://paperswithcode.com/paper/recent-progress-of-face-image-synthesis
Repo
Framework

Multibiometric Secure System Based on Deep Learning

Title Multibiometric Secure System Based on Deep Learning
Authors Veeru Talreja, Matthew C. Valenti, Nasser M. Nasrabadi
Abstract In this paper, we propose a secure multibiometric system that uses deep neural networks and error-correction coding. We present a feature-level fusion framework to generate a secure multibiometric template from each user’s multiple biometrics. Two fusion architectures, fully connected architecture and bilinear architecture, are implemented to develop a robust multibiometric shared representation. The shared representation is used to generate a cancelable biometric template that involves the selection of a different set of reliable and discriminative features for each user. This cancelable template is a binary vector and is passed through an appropriate error-correcting decoder to find a closest codeword and this codeword is hashed to generate the final secure template. The efficacy of the proposed approach is shown using a multimodal database where we achieve state-of-the-art matching performance, along with cancelability and security.
Tasks
Published 2017-08-07
URL http://arxiv.org/abs/1708.02314v1
PDF http://arxiv.org/pdf/1708.02314v1.pdf
PWC https://paperswithcode.com/paper/multibiometric-secure-system-based-on-deep
Repo
Framework

Avoiding Discrimination through Causal Reasoning

Title Avoiding Discrimination through Causal Reasoning
Authors Niki Kilbertus, Mateo Rojas-Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, Bernhard Schölkopf
Abstract Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively. Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from “What is the right fairness criterion?” to “What do we want to assume about the causal data generating process?” Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them.
Tasks
Published 2017-06-08
URL http://arxiv.org/abs/1706.02744v2
PDF http://arxiv.org/pdf/1706.02744v2.pdf
PWC https://paperswithcode.com/paper/avoiding-discrimination-through-causal
Repo
Framework

Adaptivity to Noise Parameters in Nonparametric Active Learning

Title Adaptivity to Noise Parameters in Nonparametric Active Learning
Authors Andrea Locatelli, Alexandra Carpentier, Samory Kpotufe
Abstract This work addresses various open questions in the theory of active learning for nonparametric classification. Our contributions are both statistical and algorithmic: (i) We establish new minimax rates for active learning under common noise conditions. These rates display interesting transitions, due to the interaction between noise smoothness and margin, that are not present in the passive setting. Some such transitions were previously conjectured, but remained unconfirmed. (ii) We present a generic algorithmic strategy for adaptivity to unknown noise smoothness and margin; our strategy achieves optimal rates in many general situations; furthermore, unlike in previous work, we avoid the need for adaptive confidence sets, resulting in strictly milder distributional requirements.
Tasks Active Learning
Published 2017-03-16
URL http://arxiv.org/abs/1703.05841v1
PDF http://arxiv.org/pdf/1703.05841v1.pdf
PWC https://paperswithcode.com/paper/adaptivity-to-noise-parameters-in
Repo
Framework

Bayesian stochastic blockmodeling

Title Bayesian stochastic blockmodeling
Authors Tiago P. Peixoto
Abstract This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM), as well as its degree-corrected and overlapping generalizations. We focus on nonparametric formulations that allow their inference in a manner that prevents overfitting, and enables model selection. We discuss aspects of the choice of priors, in particular how to avoid underfitting via increased Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with finding the single point estimate that maximizes it, while describing efficient algorithms to perform either one. We also show how inferring the SBM can be used to predict missing and spurious links, and shed light on the fundamental limitations of the detectability of modular structures in networks.
Tasks Bayesian Inference, Model Selection
Published 2017-05-29
URL https://arxiv.org/abs/1705.10225v8
PDF https://arxiv.org/pdf/1705.10225v8.pdf
PWC https://paperswithcode.com/paper/bayesian-stochastic-blockmodeling
Repo
Framework

Image Forgery Localization Based on Multi-Scale Convolutional Neural Networks

Title Image Forgery Localization Based on Multi-Scale Convolutional Neural Networks
Authors Yaqi Liu, Qingxiao Guan, Xianfeng Zhao, Yun Cao
Abstract In this paper, we propose to utilize Convolutional Neural Networks (CNNs) and the segmentation-based multi-scale analysis to locate tampered areas in digital images. First, to deal with color input sliding windows of different scales, a unified CNN architecture is designed. Then, we elaborately design the training procedures of CNNs on sampled training patches. With a set of robust multi-scale tampering detectors based on CNNs, complementary tampering possibility maps can be generated. Last but not least, a segmentation-based method is proposed to fuse the maps and generate the final decision map. By exploiting the benefits of both the small-scale and large-scale analyses, the segmentation-based multi-scale analysis can lead to a performance leap in forgery localization of CNNs. Numerous experiments are conducted to demonstrate the effectiveness and efficiency of our method.
Tasks
Published 2017-06-13
URL http://arxiv.org/abs/1706.07842v4
PDF http://arxiv.org/pdf/1706.07842v4.pdf
PWC https://paperswithcode.com/paper/image-forgery-localization-based-on-multi
Repo
Framework

Evaluating vector-space models of analogy

Title Evaluating vector-space models of analogy
Authors Dawn Chen, Joshua C. Peterson, Thomas L. Griffiths
Abstract Vector-space representations provide geometric tools for reasoning about the similarity of a set of objects and their relationships. Recent machine learning methods for deriving vector-space embeddings of words (e.g., word2vec) have achieved considerable success in natural language processing. These vector spaces have also been shown to exhibit a surprising capacity to capture verbal analogies, with similar results for natural images, giving new life to a classic model of analogies as parallelograms that was first proposed by cognitive scientists. We evaluate the parallelogram model of analogy as applied to modern word embeddings, providing a detailed analysis of the extent to which this approach captures human relational similarity judgments in a large benchmark dataset. We find that some semantic relationships are better captured than others. We then provide evidence for deeper limitations of the parallelogram model based on the intrinsic geometric constraints of vector spaces, paralleling classic results for first-order similarity.
Tasks Word Embeddings
Published 2017-05-12
URL http://arxiv.org/abs/1705.04416v2
PDF http://arxiv.org/pdf/1705.04416v2.pdf
PWC https://paperswithcode.com/paper/evaluating-vector-space-models-of-analogy
Repo
Framework
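
The parallelogram model mentioned in the abstract above treats an analogy a : b :: c : d as a statement that the vectors form a parallelogram, so d should lie near b - a + c. The sketch below illustrates that idea on a tiny hand-made vocabulary; the two-dimensional vectors and the function names are hypothetical and are not taken from the paper or from any trained embedding.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def complete_analogy(a, b, c, vocab):
    """Return the word whose vector is most similar to b - a + c,
    excluding the three query words themselves."""
    target = [vb - va + vc for va, vb, vc in zip(vocab[a], vocab[b], vocab[c])]
    candidates = {w: vec for w, vec in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


# Toy vectors chosen by hand so that the parallelogram closes exactly.
toy_vocab = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 2.0],
}
print(complete_analogy("man", "woman", "king", toy_vocab))
```

With real embeddings the parallelogram rarely closes exactly, which is the kind of mismatch the abstract says it evaluates against human judgments.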

Embedding Feature Selection for Large-scale Hierarchical Classification

Title Embedding Feature Selection for Large-scale Hierarchical Classification
Authors Azad Naik, Huzefa Rangwala
Abstract Large-scale Hierarchical Classification (HC) involves datasets consisting of thousands of classes and millions of training instances with high-dimensional features, posing several big data challenges. Feature selection, which aims to select the subset of discriminant features, is an effective strategy for dealing with the large-scale HC problem. It speeds up the training process, reduces the prediction time and minimizes the memory requirements by compressing the total size of the learned model weight vectors. The majority of studies have also shown feature selection to be competent and successful in improving classification accuracy by removing irrelevant features. In this work, we investigate various filter-based feature selection methods for dimensionality reduction to solve the large-scale HC problem. Our experimental evaluation on text and image datasets with varying distributions of features, classes and instances shows up to 3x order of speed-up on massive datasets and up to 45% lower memory requirements for storing the weight vectors of the learned model, without any significant loss (and with improvement for some datasets) in classification accuracy. Source Code: https://cs.gmu.edu/~mlbio/featureselection.
Tasks Dimensionality Reduction, Feature Selection
Published 2017-06-06
URL http://arxiv.org/abs/1706.01581v1
PDF http://arxiv.org/pdf/1706.01581v1.pdf
PWC https://paperswithcode.com/paper/embedding-feature-selection-for-large-scale
Repo
Framework

Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap

Title Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap
Authors Aryan Mokhtari, Hamed Hassani, Amin Karbasi
Abstract In this paper, we study the problem of constrained and stochastic continuous submodular maximization. Even though the objective function is not concave (nor convex) and is defined in terms of an expectation, we develop a variant of the conditional gradient method, called \alg, which achieves a tight approximation guarantee. More precisely, for a monotone and continuous DR-submodular function and subject to a general convex body constraint, we prove that \alg achieves a $[(1-1/e)\text{OPT} - \epsilon]$ guarantee (in expectation) with $\mathcal{O}(1/\epsilon^3)$ stochastic gradient computations. This guarantee matches the known hardness results and closes the gap between deterministic and stochastic continuous submodular maximization. By using stochastic continuous optimization as an interface, we also provide the first $(1-1/e)$ tight approximation guarantee for maximizing a monotone but stochastic submodular set function subject to a general matroid constraint.
Tasks
Published 2017-11-05
URL http://arxiv.org/abs/1711.01660v1
PDF http://arxiv.org/pdf/1711.01660v1.pdf
PWC https://paperswithcode.com/paper/conditional-gradient-method-for-stochastic
Repo
Framework
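
To give a feel for what a conditional gradient method with stochastic gradients can look like, here is a small sketch written for this listing; it is not the paper's code. It averages noisy gradients over time, calls a linear maximization oracle over the constraint set, and moves a step of size 1/T toward the oracle's answer, starting from zero. The averaging schedule `rho`, the toy objective (a weighted sum of log(1 + x_i), which is monotone and DR-submodular), the box-plus-budget constraint, and all function names are assumptions made for illustration.

```python
import math
import random


def objective(x, weights):
    """Toy monotone DR-submodular function: sum_i w_i * log(1 + x_i)."""
    return sum(w * math.log(1.0 + xi) for w, xi in zip(weights, x))


def make_noisy_gradient(weights, sigma):
    """Return a function giving the exact gradient plus Gaussian noise."""
    def grad(x, rng):
        return [w / (1.0 + xi) + rng.gauss(0.0, sigma) for w, xi in zip(weights, x)]
    return grad


def make_top_k_lmo(k):
    """Linear maximization over {0 <= v_i <= 1, sum(v) <= k}:
    put mass 1 on the (at most k) largest positive coordinates."""
    def lmo(d):
        order = sorted(range(len(d)), key=lambda i: d[i], reverse=True)
        v = [0.0] * len(d)
        for i in order[:k]:
            if d[i] > 0.0:
                v[i] = 1.0
        return v
    return lmo


def stochastic_conditional_gradient(stoch_grad, lmo, dim, num_steps, rng):
    x = [0.0] * dim  # start at the origin
    d = [0.0] * dim  # running average of stochastic gradients
    for t in range(1, num_steps + 1):
        rho = 4.0 / (t + 8) ** (2.0 / 3.0)  # assumed averaging schedule
        g = stoch_grad(x, rng)
        d = [(1.0 - rho) * di + rho * gi for di, gi in zip(d, g)]
        v = lmo(d)
        x = [xi + vi / num_steps for xi, vi in zip(x, v)]
    return x


weights = [3.0, 2.0, 1.0, 0.5]
rng = random.Random(0)
x = stochastic_conditional_gradient(
    make_noisy_gradient(weights, sigma=0.1), make_top_k_lmo(2), 4, 200, rng
)
print(x, objective(x, weights))
```

Because the iterate is an average of T feasible points, it stays inside the constraint set without any projection step, which is the usual appeal of conditional gradient methods.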

An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation

Title An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation
Authors Conrad Sanderson, Ryan Curtin
Abstract Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded versions of the Expectation Maximisation (EM) and k-means training algorithms. Multi-threading is achieved through reformulation of the EM and k-means algorithms into a MapReduce-like framework. Furthermore, the implementation uses several techniques to improve numerical stability and modelling accuracy. We demonstrate that the multi-threaded implementation achieves a speedup of an order of magnitude on a recent 16 core machine, and that it can achieve higher modelling accuracy than a previously well-established publicly accessible implementation. The multi-threaded implementation is included as a user-friendly class in recent releases of the open source Armadillo C++ linear algebra library. The library is provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
Tasks
Published 2017-07-28
URL http://arxiv.org/abs/1707.09094v1
PDF http://arxiv.org/pdf/1707.09094v1.pdf
PWC https://paperswithcode.com/paper/an-open-source-c-implementation-of-multi
Repo
Framework
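
The abstract above says the k-means and EM algorithms were reformulated into a MapReduce-like framework. The following sketch shows one way such a reformulation can look for k-means: each chunk of data independently produces per-cluster sums and counts (the map step), and the partial results are combined into new means (the reduce step). It is written in Python for this listing and is not the Armadillo implementation; all names are hypothetical. Python threads are used only to show the structure, since the interpreter lock means they will not give the speedup that native threads do.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, means):
    """Index of the mean closest to the point (squared Euclidean distance)."""
    return min(
        range(len(means)),
        key=lambda k: sum((p - m) ** 2 for p, m in zip(point, means[k])),
    )


def map_chunk(chunk, means):
    """Map step: per-cluster coordinate sums and counts for one chunk."""
    dim = len(means[0])
    sums = [[0.0] * dim for _ in means]
    counts = [0] * len(means)
    for point in chunk:
        k = nearest(point, means)
        counts[k] += 1
        for j, p in enumerate(point):
            sums[k][j] += p
    return sums, counts


def reduce_partials(partials, means):
    """Reduce step: merge partial sums and counts into updated means."""
    new_means = []
    for k, old in enumerate(means):
        count = sum(c[k] for _, c in partials)
        if count == 0:
            new_means.append(list(old))  # keep an empty cluster where it was
            continue
        total = [sum(s[k][j] for s, _ in partials) for j in range(len(old))]
        new_means.append([t / count for t in total])
    return new_means


def kmeans_mapreduce(data, means, n_chunks=4, n_iter=10):
    chunks = [data[i::n_chunks] for i in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for _ in range(n_iter):
            partials = list(pool.map(lambda c: map_chunk(c, means), chunks))
            means = reduce_partials(partials, means)
    return means


data = [[0.0], [0.2], [-0.2], [10.0], [10.2], [9.8]]
print(kmeans_mapreduce(data, [[1.0], [8.0]]))
```

The same split applies to EM: the map step would accumulate responsibility-weighted statistics per chunk instead of hard-assignment sums.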