July 28, 2019


Paper Group ANR 445


A wake-sleep algorithm for recurrent, spiking neural networks


Title	A wake-sleep algorithm for recurrent, spiking neural networks
Authors	Johannes Thiele, Peter Diehl, Matthew Cook
Abstract	We investigate a recently proposed model for cortical computation which performs relational inference. It consists of several interconnected, structurally equivalent populations of leaky integrate-and-fire (LIF) neurons, which are trained in a self-organized fashion with spike-timing dependent plasticity (STDP). Despite its robust learning dynamics, the model is susceptible to a problem typical for recurrent networks which use a correlation based (Hebbian) learning rule: if trained with high learning rates, the recurrent connections can cause strong feedback loops in the network dynamics, which lead to the emergence of attractor states. This causes a strong reduction in the number of representable patterns and a decay in the inference ability of the network. As a solution, we introduce a conceptually very simple “wake-sleep” algorithm: during the wake phase, training is executed normally, while during the sleep phase, the network “dreams” samples from its generative model, which are induced by random input. This process allows us to activate the attractor states in the network, which can then be unlearned effectively by an anti-Hebbian mechanism. The algorithm allows us to increase learning rates up to a factor of ten while avoiding clustering, which allows the network to learn several times faster. Also for low learning rates, where clustering is not an issue, it improves convergence speed and reduces the final inference error.
Tasks
Published	2017-03-18
URL	http://arxiv.org/abs/1703.06290v1
PDF	http://arxiv.org/pdf/1703.06290v1.pdf
PWC	https://paperswithcode.com/paper/a-wake-sleep-algorithm-for-recurrent-spiking
Repo
Framework
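
The alternation between a Hebbian wake phase and an anti-Hebbian sleep phase can be illustrated with a minimal sketch. It assumes a rate-based linear layer rather than the spiking LIF/STDP model of the paper; the function names, learning rates, and uniform random "dream" input are all hypothetical choices made for illustration.

```python
import random


def forward(weights, pre):
    """Linear response of each output unit to the input vector."""
    return [sum(w * x for w, x in zip(row, pre)) for row in weights]


def hebbian_step(weights, pre, post, lr):
    """Correlation-based update; a negative lr gives the anti-Hebbian case."""
    return [[w + lr * post[i] * pre[j] for j, w in enumerate(row)]
            for i, row in enumerate(weights)]


def wake_sleep_epoch(weights, data, lr_wake=0.1, lr_sleep=0.05, seed=0):
    """Alternate one wake update (real input) with one sleep update (random input)."""
    rng = random.Random(seed)
    for pre in data:
        # Wake phase: strengthen correlations driven by real data.
        weights = hebbian_step(weights, pre, forward(weights, pre), lr_wake)
        # Sleep phase: drive the network with noise and unlearn what it produces.
        noise = [rng.random() for _ in pre]
        weights = hebbian_step(weights, noise, forward(weights, noise), -lr_sleep)
    return weights
```

The sleep update uses the same rule as the wake update with the sign flipped, which is one simple way to realise "unlearning" of activity that arises from random input.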

On Quadratic Penalties in Elastic Weight Consolidation


Title	On Quadratic Penalties in Elastic Weight Consolidation
Authors	Ferenc Huszár
Abstract	Elastic weight consolidation (EWC, Kirkpatrick et al, 2017) is a novel algorithm designed to safeguard against catastrophic forgetting in neural networks. EWC can be seen as an approximation to Laplace propagation (Eskin et al, 2004), and this view is consistent with the motivation given by Kirkpatrick et al (2017). In this note, I present an extended derivation that covers the case when there are more than two tasks. I show that the quadratic penalties in EWC are inconsistent with this derivation and might lead to double-counting data from earlier tasks.
Tasks
Published	2017-12-11
URL	http://arxiv.org/abs/1712.03847v1
PDF	http://arxiv.org/pdf/1712.03847v1.pdf
PWC	https://paperswithcode.com/paper/on-quadratic-penalties-in-elastic-weight
Repo
Framework
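
To make the phrase "quadratic penalties" concrete, here is a small sketch of two ways such penalties could be combined across tasks. It assumes a diagonal Fisher approximation and is one reading of the double-counting concern, not the note's own derivation; `per_task_penalty` and `single_anchor_penalty` are hypothetical names.

```python
def quadratic_penalty(theta, anchor, fisher, lam=1.0):
    """0.5 * lam * sum_i F_i * (theta_i - anchor_i)^2 with a diagonal Fisher."""
    return 0.5 * lam * sum(f * (t - a) ** 2
                           for t, a, f in zip(theta, anchor, fisher))


def per_task_penalty(theta, anchors, fishers, lam=1.0):
    """One penalty per earlier task, each centred on that task's own optimum."""
    return sum(quadratic_penalty(theta, a, f, lam)
               for a, f in zip(anchors, fishers))


def single_anchor_penalty(theta, latest_anchor, fishers, lam=1.0):
    """One penalty centred on the most recent optimum, with summed precisions."""
    total = [sum(column) for column in zip(*fishers)]
    return quadratic_penalty(theta, latest_anchor, total, lam)
```

The two functions generally disagree when the per-task optima differ, which is the situation in which the choice of penalty structure matters.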

A Generative Approach to Question Answering


Title	A Generative Approach to Question Answering
Authors	Rajarshee Mitra
Abstract	Question answering has come a long way, from answer sentence selection and relational QA to reading comprehension. We shift our attention to generative question answering (gQA), in which a machine reads passages and answers questions by learning to generate the answers. We frame the problem as a generative task in which the encoder is a network that models the relationship between question and passage and encodes them into a vector, allowing the decoder to directly form an abstraction of the answer. Failing to retain facts and making repetitions are common mistakes that affect the overall legibility of answers. To counter these issues, we employ a copying mechanism and maintain a coverage vector in our model, respectively. Our results on MS-MARCO demonstrate its superiority over baselines, and we also show qualitative examples where we improved in terms of correctness and readability.
Tasks	Question Answering
Published	2017-11-16
URL	http://arxiv.org/abs/1711.06238v2
PDF	http://arxiv.org/pdf/1711.06238v2.pdf
PWC	https://paperswithcode.com/paper/a-generative-approach-to-question-answering
Repo
Framework
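
The role of a coverage vector in discouraging repetition can be sketched as follows. The abstract does not give the exact formulation, so this assumes the common variant in which coverage is the running sum of past attention distributions and the penalty is the overlap between current attention and accumulated coverage; `coverage_loss` is a hypothetical name.

```python
def coverage_loss(attention_steps):
    """Accumulate attention over decoding steps and penalise re-attending.

    attention_steps: list of attention distributions, one per decoder step,
    each a list of weights over the source positions.
    Returns (total_penalty, final_coverage_vector).
    """
    coverage = [0.0] * len(attention_steps[0])
    penalty = 0.0
    for attention in attention_steps:
        # Overlap between what is attended now and what was already covered.
        penalty += sum(min(a, c) for a, c in zip(attention, coverage))
        coverage = [c + a for c, a in zip(coverage, attention)]
    return penalty, coverage
```

A decoder that attends to the same source position twice incurs a positive penalty, while one that moves to a new position at each step incurs none.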

Segmentation of Intracranial Arterial Calcification with Deeply Supervised Residual Dropout Networks


Title	Segmentation of Intracranial Arterial Calcification with Deeply Supervised Residual Dropout Networks
Authors	Gerda Bortsova, Gijs van Tulder, Florian Dubost, Tingying Peng, Nassir Navab, Aad van der Lugt, Daniel Bos, Marleen de Bruijne
Abstract	Intracranial carotid artery calcification (ICAC) is a major risk factor for stroke, and might contribute to dementia and cognitive decline. Reliance on time-consuming manual annotation of ICAC hampers much demanded further research into the relationship between ICAC and neurological diseases. Automation of ICAC segmentation is therefore highly desirable, but difficult due to the proximity of the lesions to bony structures with a similar attenuation coefficient. In this paper, we propose a method for automatic segmentation of ICAC; the first to our knowledge. Our method is based on a 3D fully convolutional neural network that we extend with two regularization techniques. Firstly, we use deep supervision (hidden layers supervision) to encourage discriminative features in the hidden layers. Secondly, we augment the network with skip connections, as in the recently developed ResNet, and dropout layers, inserted in a way that skip connections circumvent them. We investigate the effect of skip connections and dropout. In addition, we propose a simple problem-specific modification of the network objective function that restricts the focus to the most important image regions and simplifies the optimization. We train and validate our model using 882 CT scans and test on 1,000. Our regularization techniques and objective improve the average Dice score by 7.1%, yielding an average Dice of 76.2% and 97.7% correlation between predicted ICAC volumes and manual annotations.
Tasks
Published	2017-06-04
URL	http://arxiv.org/abs/1706.01148v1
PDF	http://arxiv.org/pdf/1706.01148v1.pdf
PWC	https://paperswithcode.com/paper/segmentation-of-intracranial-arterial
Repo
Framework
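
The Dice score reported in the abstract measures overlap between a predicted and a reference segmentation. A minimal sketch on flattened binary masks is shown below; returning 1.0 when both masks are empty is a convention assumed here, not something the abstract specifies.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size
```

For a 3D CT volume the masks would be flattened first; the formula itself does not depend on the spatial layout.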

LDMNet: Low Dimensional Manifold Regularized Neural Networks


Title	LDMNet: Low Dimensional Manifold Regularized Neural Networks
Authors	Wei Zhu, Qiang Qiu, Jiaji Huang, Robert Calderbank, Guillermo Sapiro, Ingrid Daubechies
Abstract	Deep neural networks have proved very successful on archetypal tasks for which large training sets are available, but when the training data are scarce, their performance suffers from overfitting. Many existing methods of reducing overfitting are data-independent, and their efficacy is often limited when the training set is very small. Data-dependent regularizations are mostly motivated by the observation that data of interest lie close to a manifold, which is typically hard to parametrize explicitly and often requires human input of tangent vectors. These methods typically only focus on the geometry of the input data, and do not necessarily encourage the networks to produce geometrically meaningful features. To resolve this, we propose a new framework, the Low-Dimensional-Manifold-regularized neural Network (LDMNet), which incorporates a feature regularization method that focuses on the geometry of both the input data and the output features. In LDMNet, we regularize the network by encouraging the combination of the input data and the output features to sample a collection of low dimensional manifolds, which are searched efficiently without explicit parametrization. To achieve this, we directly use the manifold dimension as a regularization term in a variational functional. The resulting Euler-Lagrange equation is a Laplace-Beltrami equation over a point cloud, which is solved by the point integral method without increasing the computational complexity. We demonstrate two benefits of LDMNet in the experiments. First, we show that LDMNet significantly outperforms widely-used network regularizers such as weight decay and DropOut. Second, we show that LDMNet can be designed to extract common features of an object imaged via different modalities, which proves to be very useful in real-world applications such as cross-spectral face recognition.
Tasks	Face Recognition
Published	2017-11-16
URL	http://arxiv.org/abs/1711.06246v1
PDF	http://arxiv.org/pdf/1711.06246v1.pdf
PWC	https://paperswithcode.com/paper/ldmnet-low-dimensional-manifold-regularized
Repo
Framework

Recent Progress of Face Image Synthesis


Title	Recent Progress of Face Image Synthesis
Authors	Zhihe Lu, Zhihang Li, Jie Cao, Ran He, Zhenan Sun
Abstract	Face synthesis has been a fascinating yet challenging problem in computer vision and machine learning. Its main research effort is to design algorithms that generate photo-realistic face images from a given semantic domain. It has been a crucial preprocessing step of mainstream face recognition approaches and an excellent test of AI's ability to use complicated probability distributions. In this paper, we provide a comprehensive review of typical face synthesis works that involve traditional methods as well as advanced deep learning approaches. In particular, the Generative Adversarial Net (GAN) is highlighted for generating photo-realistic and identity-preserving results. Furthermore, the publicly available databases and evaluation metrics are introduced in detail. We end the review by discussing unsolved difficulties and promising directions for future research.
Tasks	Face Generation, Face Recognition, Image Generation
Published	2017-06-15
URL	http://arxiv.org/abs/1706.04717v1
PDF	http://arxiv.org/pdf/1706.04717v1.pdf
PWC	https://paperswithcode.com/paper/recent-progress-of-face-image-synthesis
Repo
Framework

Multibiometric Secure System Based on Deep Learning


Title	Multibiometric Secure System Based on Deep Learning
Authors	Veeru Talreja, Matthew C. Valenti, Nasser M. Nasrabadi
Abstract	In this paper, we propose a secure multibiometric system that uses deep neural networks and error-correction coding. We present a feature-level fusion framework to generate a secure multibiometric template from each user’s multiple biometrics. Two fusion architectures, fully connected architecture and bilinear architecture, are implemented to develop a robust multibiometric shared representation. The shared representation is used to generate a cancelable biometric template that involves the selection of a different set of reliable and discriminative features for each user. This cancelable template is a binary vector and is passed through an appropriate error-correcting decoder to find a closest codeword and this codeword is hashed to generate the final secure template. The efficacy of the proposed approach is shown using a multimodal database where we achieve state-of-the-art matching performance, along with cancelability and security.
Tasks
Published	2017-08-07
URL	http://arxiv.org/abs/1708.02314v1
PDF	http://arxiv.org/pdf/1708.02314v1.pdf
PWC	https://paperswithcode.com/paper/multibiometric-secure-system-based-on-deep
Repo
Framework
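
The last two steps in the abstract, decoding a binary template to its closest codeword and hashing that codeword, can be sketched as below. The abstract does not name the code or the hash, so a length-3 repetition code and SHA-256 are stand-ins assumed purely for illustration.

```python
import hashlib


def nearest_repetition_codeword(bits, n=3):
    """Map each block of n bits to all-0 or all-1 by majority vote (n odd)."""
    if len(bits) % n != 0:
        raise ValueError("template length must be a multiple of n")
    codeword = []
    for start in range(0, len(bits), n):
        block = bits[start:start + n]
        majority = 1 if 2 * sum(block) > n else 0
        codeword.extend([majority] * n)
    return codeword


def secure_template(bits, n=3):
    """Hash the nearest codeword so the stored value hides the raw template."""
    codeword = nearest_repetition_codeword(bits, n)
    return hashlib.sha256(bytes(codeword)).hexdigest()
```

Two noisy readings that decode to the same codeword produce the same hash, which is what lets matching happen on the hashed template.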

Avoiding Discrimination through Causal Reasoning


Title	Avoiding Discrimination through Causal Reasoning
Authors	Niki Kilbertus, Mateo Rojas-Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, Bernhard Schölkopf
Abstract	Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively. Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from “What is the right fairness criterion?” to “What do we want to assume about the causal data generating process?” Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them.
Tasks
Published	2017-06-08
URL	http://arxiv.org/abs/1706.02744v2
PDF	http://arxiv.org/pdf/1706.02744v2.pdf
PWC	https://paperswithcode.com/paper/avoiding-discrimination-through-causal
Repo
Framework
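
An observational criterion is one that can be computed from the joint distribution of predictions and attributes alone. As an example, the sketch below computes the demographic parity gap, a criterion chosen here for illustration; the abstract does not single out any specific criterion.

```python
def demographic_parity_gap(predictions, protected):
    """Largest difference in positive-prediction rate between protected groups.

    predictions: list of 0/1 predictions.
    protected:   list of group labels, aligned with predictions.
    """
    rates = []
    for group in set(protected):
        members = [p for p, a in zip(predictions, protected) if a == group]
        rates.append(sum(members) / len(members))
    return max(rates) - min(rates)
```

The function never looks at how the data were generated, which is exactly the limitation the causal viewpoint is meant to address.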

Adaptivity to Noise Parameters in Nonparametric Active Learning


Title	Adaptivity to Noise Parameters in Nonparametric Active Learning
Authors	Andrea Locatelli, Alexandra Carpentier, Samory Kpotufe
Abstract	This work addresses various open questions in the theory of active learning for nonparametric classification. Our contributions are both statistical and algorithmic. First, we establish new minimax rates for active learning under common noise conditions. These rates display interesting transitions, due to the interaction between noise smoothness and margin, that are not present in the passive setting. Some such transitions were previously conjectured, but remained unconfirmed. Second, we present a generic algorithmic strategy for adaptivity to unknown noise smoothness and margin; our strategy achieves optimal rates in many general situations; furthermore, unlike in previous work, we avoid the need for adaptive confidence sets, resulting in strictly milder distributional requirements.
Tasks	Active Learning
Published	2017-03-16
URL	http://arxiv.org/abs/1703.05841v1
PDF	http://arxiv.org/pdf/1703.05841v1.pdf
PWC	https://paperswithcode.com/paper/adaptivity-to-noise-parameters-in
Repo
Framework

Bayesian stochastic blockmodeling


Title	Bayesian stochastic blockmodeling
Authors	Tiago P. Peixoto
Abstract	This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM), as well as its degree-corrected and overlapping generalizations. We focus on nonparametric formulations that allow their inference in a manner that prevents overfitting, and enables model selection. We discuss aspects of the choice of priors, in particular how to avoid underfitting via increased Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with finding the single point estimate that maximizes it, while describing efficient algorithms to perform either one. We also show how inferring the SBM can be used to predict missing and spurious links, and shed light on the fundamental limitations of the detectability of modular structures in networks.
Tasks	Bayesian Inference, Model Selection
Published	2017-05-29
URL	https://arxiv.org/abs/1705.10225v8
PDF	https://arxiv.org/pdf/1705.10225v8.pdf
PWC	https://paperswithcode.com/paper/bayesian-stochastic-blockmodeling
Repo
Framework
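
For readers new to the model, the generative side of the basic stochastic blockmodel can be sketched in a few lines: each pair of nodes is connected with a probability that depends only on the blocks the two nodes belong to. This covers only the plain, undirected SBM, not the degree-corrected or overlapping variants or the inference algorithms; `sample_sbm` is a hypothetical name.

```python
import random


def sample_sbm(blocks, edge_prob, seed=0):
    """Sample an undirected simple graph from a stochastic blockmodel.

    blocks:    block index of each node, e.g. [0, 0, 1, 1].
    edge_prob: edge_prob[r][s] is the connection probability between
               a node in block r and a node in block s.
    Returns a list of edges (i, j) with i < j.
    """
    rng = random.Random(seed)
    edges = []
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            if rng.random() < edge_prob[blocks[i]][blocks[j]]:
                edges.append((i, j))
    return edges
```

Inference runs this process in reverse: given the edges, it asks which partition into blocks makes the observed graph most plausible.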

Image Forgery Localization Based on Multi-Scale Convolutional Neural Networks


Title	Image Forgery Localization Based on Multi-Scale Convolutional Neural Networks
Authors	Yaqi Liu, Qingxiao Guan, Xianfeng Zhao, Yun Cao
Abstract	In this paper, we propose to utilize Convolutional Neural Networks (CNNs) and the segmentation-based multi-scale analysis to locate tampered areas in digital images. First, to deal with color input sliding windows of different scales, a unified CNN architecture is designed. Then, we elaborately design the training procedures of CNNs on sampled training patches. With a set of robust multi-scale tampering detectors based on CNNs, complementary tampering possibility maps can be generated. Last but not least, a segmentation-based method is proposed to fuse the maps and generate the final decision map. By exploiting the benefits of both the small-scale and large-scale analyses, the segmentation-based multi-scale analysis can lead to a performance leap in forgery localization of CNNs. Numerous experiments are conducted to demonstrate the effectiveness and efficiency of our method.
Tasks
Published	2017-06-13
URL	http://arxiv.org/abs/1706.07842v4
PDF	http://arxiv.org/pdf/1706.07842v4.pdf
PWC	https://paperswithcode.com/paper/image-forgery-localization-based-on-multi
Repo
Framework

Evaluating vector-space models of analogy


Title	Evaluating vector-space models of analogy
Authors	Dawn Chen, Joshua C. Peterson, Thomas L. Griffiths
Abstract	Vector-space representations provide geometric tools for reasoning about the similarity of a set of objects and their relationships. Recent machine learning methods for deriving vector-space embeddings of words (e.g., word2vec) have achieved considerable success in natural language processing. These vector spaces have also been shown to exhibit a surprising capacity to capture verbal analogies, with similar results for natural images, giving new life to a classic model of analogies as parallelograms that was first proposed by cognitive scientists. We evaluate the parallelogram model of analogy as applied to modern word embeddings, providing a detailed analysis of the extent to which this approach captures human relational similarity judgments in a large benchmark dataset. We find that some semantic relationships are better captured than others. We then provide evidence for deeper limitations of the parallelogram model based on the intrinsic geometric constraints of vector spaces, paralleling classic results for first-order similarity.
Tasks	Word Embeddings
Published	2017-05-12
URL	http://arxiv.org/abs/1705.04416v2
PDF	http://arxiv.org/pdf/1705.04416v2.pdf
PWC	https://paperswithcode.com/paper/evaluating-vector-space-models-of-analogy
Repo
Framework
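
The parallelogram model completes an analogy a : b :: c : ? by looking for the word whose vector is closest to b - a + c. A minimal sketch follows; the two-dimensional vectors in `TOY_EMBEDDINGS` are made up for illustration and are not taken from word2vec or from the paper.

```python
import math

# Hypothetical toy vectors, chosen so that the parallelogram closes exactly.
TOY_EMBEDDINGS = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, -1.0],
}


def cosine(u, v):
    """Cosine similarity, defined as 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def complete_analogy(a, b, c, embeddings):
    """Return the word closest (by cosine) to b - a + c, excluding a, b and c."""
    target = [y - x + z
              for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [word for word in embeddings if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(embeddings[word], target))
```

With the toy vectors above, `complete_analogy("man", "woman", "king", TOY_EMBEDDINGS)` selects "queen", since b - a + c equals the "queen" vector exactly.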

Embedding Feature Selection for Large-scale Hierarchical Classification


Title	Embedding Feature Selection for Large-scale Hierarchical Classification
Authors	Azad Naik, Huzefa Rangwala
Abstract	Large-scale Hierarchical Classification (HC) involves datasets consisting of thousands of classes and millions of training instances with high-dimensional features, posing several big data challenges. Feature selection, which aims to select the subset of discriminant features, is an effective strategy to deal with the large-scale HC problem. It speeds up the training process, reduces the prediction time and minimizes the memory requirements by compressing the total size of learned model weight vectors. The majority of studies have also shown feature selection to be competent and successful in improving the classification accuracy by removing irrelevant features. In this work, we investigate various filter-based feature selection methods for dimensionality reduction to solve the large-scale HC problem. Our experimental evaluation on text and image datasets with varying distribution of features, classes and instances shows up to 3x order of speed-up on massive datasets and up to 45% less memory requirements for storing the weight vectors of the learned model without any significant loss (improvement for some datasets) in the classification accuracy. Source Code: https://cs.gmu.edu/~mlbio/featureselection.
Tasks	Dimensionality Reduction, Feature Selection
Published	2017-06-06
URL	http://arxiv.org/abs/1706.01581v1
PDF	http://arxiv.org/pdf/1706.01581v1.pdf
PWC	https://paperswithcode.com/paper/embedding-feature-selection-for-large-scale
Repo
Framework
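
A filter-based method scores each feature independently of the classifier and keeps the top-scoring ones. The sketch below uses the absolute difference of class-conditional means on a binary problem as the score; that score is a simple stand-in assumed here, and the abstract does not say which scoring functions the paper evaluates.

```python
def select_top_k(X, y, k):
    """Keep the k features whose class-conditional means differ the most.

    X: list of rows (each a list of numeric features).
    y: list of binary labels (0 or 1), with both classes present.
    Returns (kept_feature_indices, reduced_rows).
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    ranked = sorted(range(n_features), key=lambda j: -scores[j])
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]
```

Because the weight vector of a linear model has one entry per feature, dropping features shrinks the stored model in direct proportion.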

Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap


Title	Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap
Authors	Aryan Mokhtari, Hamed Hassani, Amin Karbasi
Abstract	In this paper, we study the problem of constrained and stochastic continuous submodular maximization. Even though the objective function is not concave (nor convex) and is defined in terms of an expectation, we develop a variant of the conditional gradient method, called Stochastic Continuous Greedy, which achieves a tight approximation guarantee. More precisely, for a monotone and continuous DR-submodular function and subject to a general convex body constraint, we prove that Stochastic Continuous Greedy achieves a $[(1-1/e)\text{OPT} - \epsilon]$ guarantee (in expectation) with $\mathcal{O}(1/\epsilon^3)$ stochastic gradient computations. This guarantee matches the known hardness results and closes the gap between deterministic and stochastic continuous submodular maximization. By using stochastic continuous optimization as an interface, we also provide the first $(1-1/e)$ tight approximation guarantee for maximizing a monotone but stochastic submodular set function subject to a general matroid constraint.
Tasks
Published	2017-11-05
URL	http://arxiv.org/abs/1711.01660v1
PDF	http://arxiv.org/pdf/1711.01660v1.pdf
PWC	https://paperswithcode.com/paper/conditional-gradient-method-for-stochastic
Repo
Framework
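
A conditional-gradient loop with noisy gradients can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm as specified: the objective is a toy probabilistic-coverage function, the constraint set is {v in [0,1]^n, sum(v) <= budget}, the gradient noise is Gaussian, and the averaging weight `rho` is a hypothetical schedule.

```python
import random


def coverage_value(x, p):
    """F(x) = 1 - prod_i (1 - p_i * x_i), monotone DR-submodular on [0,1]^n."""
    prod = 1.0
    for xi, pi in zip(x, p):
        prod *= 1.0 - pi * xi
    return 1.0 - prod


def noisy_gradient(x, p, rng, noise=0.05):
    """Exact partial derivatives of F plus zero-mean Gaussian noise."""
    grad = []
    for j in range(len(x)):
        partial = p[j]
        for i in range(len(x)):
            if i != j:
                partial *= 1.0 - p[i] * x[i]
        grad.append(partial + rng.gauss(0.0, noise))
    return grad


def linear_maximizer(d, budget):
    """argmax of <d, v> over {v in [0,1]^n, sum(v) <= budget} for integer budget."""
    order = sorted(range(len(d)), key=lambda j: -d[j])
    v = [0.0] * len(d)
    for j in order[:budget]:
        if d[j] > 0:
            v[j] = 1.0
    return v


def stochastic_conditional_gradient(p, budget, steps=200, seed=0):
    """Start at 0 and take `steps` moves of size 1/steps along averaged gradients."""
    rng = random.Random(seed)
    x = [0.0] * len(p)
    d = [0.0] * len(p)
    for t in range(1, steps + 1):
        rho = 4.0 / (t + 8) ** (2.0 / 3.0)  # assumed averaging schedule
        g = noisy_gradient(x, p, rng)
        # Average noisy gradients over time to reduce their variance.
        d = [(1.0 - rho) * di + rho * gi for di, gi in zip(d, g)]
        v = linear_maximizer(d, budget)
        x = [xi + vi / steps for xi, vi in zip(x, v)]
    return x
```

Because every step adds at most 1/steps of a feasible direction, the final point stays inside the constraint set without any projection, which is the main appeal of conditional-gradient methods.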

An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation


Title	An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation
Authors	Conrad Sanderson, Ryan Curtin
Abstract	Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded versions of the Expectation Maximisation (EM) and k-means training algorithms. Multi-threading is achieved through reformulation of the EM and k-means algorithms into a MapReduce-like framework. Furthermore, the implementation uses several techniques to improve numerical stability and modelling accuracy. We demonstrate that the multi-threaded implementation achieves a speedup of an order of magnitude on a recent 16 core machine, and that it can achieve higher modelling accuracy than a previously well-established publicly accessible implementation. The multi-threaded implementation is included as a user-friendly class in recent releases of the open source Armadillo C++ linear algebra library. The library is provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
Tasks
Published	2017-07-28
URL	http://arxiv.org/abs/1707.09094v1
PDF	http://arxiv.org/pdf/1707.09094v1.pdf
PWC	https://paperswithcode.com/paper/an-open-source-c-implementation-of-multi
Repo
Framework
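
The MapReduce-like reformulation of k-means can be sketched as a map step that computes per-chunk sums and counts, followed by a reduce step that merges them into new centroids. The sketch below is in Python for brevity and only shows the structure; it is not the Armadillo implementation, and Python threads will not reproduce the reported speedup because of the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk, centroids):
    """Map step: per-centroid coordinate sums and point counts for one chunk."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        nearest = min(range(k), key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(point, centroids[c])))
        counts[nearest] += 1
        for d, value in enumerate(point):
            sums[nearest][d] += value
    return sums, counts


def kmeans_step(chunks, centroids):
    """One k-means iteration: map over chunks in threads, then reduce."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda ch: map_chunk(ch, centroids), chunks))
    k, dim = len(centroids), len(centroids[0])
    updated = []
    for c in range(k):
        total = sum(counts[c] for _, counts in partials)
        if total == 0:
            updated.append(list(centroids[c]))  # keep an empty cluster in place
            continue
        updated.append([sum(sums[c][d] for sums, _ in partials) / total
                        for d in range(dim)])
    return updated
```

Each worker touches only its own chunk and returns small summary statistics, so the reduce step is cheap and no locking is needed on the data.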