May 5, 2019

2675 words 13 mins read

Paper Group ANR 489
No bad local minima: Data independent training error guarantees for multilayer neural networks. Learning Gaussian Graphical Models With Fractional Marginal Pseudo-likelihood. Accelerating Deep Learning with Shrinkage and Recall. Neural Coarse-Graining: Extracting slowly-varying latent degrees of freedom with neural networks. Correct classification …

No bad local minima: Data independent training error guarantees for multilayer neural networks

Title No bad local minima: Data independent training error guarantees for multilayer neural networks
Authors Daniel Soudry, Yair Carmon
Abstract We use smoothed analysis techniques to provide guarantees on the training loss of Multilayer Neural Networks (MNNs) at differentiable local minima. Specifically, we examine MNNs with piecewise linear activation functions, quadratic loss and a single output, under mild over-parametrization. We prove that for an MNN with one hidden layer, the training error is zero at every differentiable local minimum, for almost every dataset and dropout-like noise realization. We then extend these results to the case of more than one hidden layer. Our theoretical guarantees assume essentially nothing about the training data, and are verified numerically. These results suggest why the highly non-convex loss of such MNNs can be easily optimized using local updates (e.g., stochastic gradient descent), as observed empirically.
Tasks
Published 2016-05-26
URL http://arxiv.org/abs/1605.08361v2
PDF http://arxiv.org/pdf/1605.08361v2.pdf
PWC https://paperswithcode.com/paper/no-bad-local-minima-data-independent-training
Repo
Framework
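The paper's claim that over-parameterized piecewise-linear MNNs reach zero training error at local minima can be probed numerically, as the authors note. The sketch below is not the paper's experiment: it is a minimal, hedged illustration with assumed sizes (20 samples, 100 hidden units, leaky-ReLU slope 0.01) showing that full-batch gradient descent on a one-hidden-layer piecewise-linear net with quadratic loss drives the training loss toward zero on arbitrary random data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 5, 100          # n samples, d inputs, h hidden units (h >> n: over-parameterized)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)    # arbitrary targets: essentially nothing assumed about the data

W1 = rng.standard_normal((d, h)) / np.sqrt(d)
w2 = rng.standard_normal(h) / np.sqrt(h)

def forward(X):
    z = X @ W1
    a = np.where(z > 0, z, 0.01 * z)   # leaky ReLU: piecewise linear activation
    return a @ w2, z, a

lr = 0.05
for _ in range(3000):
    pred, z, a = forward(X)
    err = pred - y                      # quadratic loss: 0.5 * mean(err**2)
    g2 = a.T @ err / n                  # gradient w.r.t. output weights
    dz = np.outer(err, w2) / n * np.where(z > 0, 1.0, 0.01)
    g1 = X.T @ dz                       # gradient w.r.t. hidden weights
    W1 -= lr * g1
    w2 -= lr * g2

loss = 0.5 * np.mean((forward(X)[0] - y) ** 2)
print(f"final training loss: {loss:.2e}")
```

With this much over-parameterization the final loss is numerically negligible, consistent with the paper's prediction that differentiable local minima attain zero training error.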

Learning Gaussian Graphical Models With Fractional Marginal Pseudo-likelihood

Title Learning Gaussian Graphical Models With Fractional Marginal Pseudo-likelihood
Authors Janne Leppä-aho, Johan Pensar, Teemu Roos, Jukka Corander
Abstract We propose a Bayesian approximate inference method for learning the dependence structure of a Gaussian graphical model. Using pseudo-likelihood, we derive an analytical expression to approximate the marginal likelihood for an arbitrary graph structure without invoking any assumptions about decomposability. The majority of the existing methods for learning Gaussian graphical models are either restricted to decomposable graphs or require specification of a tuning parameter that may have a substantial impact on learned structures. By combining a simple sparsity inducing prior for the graph structures with a default reference prior for the model parameters, we obtain a fast and easily applicable scoring function that works well for even high-dimensional data. We demonstrate the favourable performance of our approach by large-scale comparisons against the leading methods for learning non-decomposable Gaussian graphical models. A theoretical justification for our method is provided by showing that it yields a consistent estimator of the graph structure.
Tasks
Published 2016-02-25
URL http://arxiv.org/abs/1602.07863v1
PDF http://arxiv.org/pdf/1602.07863v1.pdf
PWC https://paperswithcode.com/paper/learning-gaussian-graphical-models-with-1
Repo
Framework
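The pseudo-likelihood idea at the core of this paper factors the joint density into node-wise conditionals; for a Gaussian model, each conditional is a linear regression of a node on its neighbors. The sketch below is only an illustration of that decomposition with a plug-in maximum-likelihood score on a toy chain graph; the paper's actual contribution, the analytical *fractional marginal* pseudo-likelihood that integrates out the parameters, is not reproduced here.

```python
import numpy as np

def pseudo_loglik(X, graph):
    """Plug-in pseudo-log-likelihood of a graph (dict: node -> neighbor list).

    For a Gaussian model, p(x_j | x_rest) depends only on x_j's neighbors,
    so each node contributes a linear-regression log-likelihood.
    """
    n, p = X.shape
    total = 0.0
    for j in range(p):
        nb = graph[j]
        y = X[:, j]
        A = np.column_stack([X[:, nb], np.ones(n)]) if nb else np.ones((n, 1))
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        sigma2 = max(resid @ resid / n, 1e-12)
        total += -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    return total

# Toy chain x0 -> x1 -> x2: the chain structure should score above the empty graph.
rng = np.random.default_rng(1)
x0 = rng.standard_normal(500)
x1 = x0 + 0.3 * rng.standard_normal(500)
x2 = x1 + 0.3 * rng.standard_normal(500)
X = np.column_stack([x0, x1, x2])

chain = {0: [1], 1: [0, 2], 2: [1]}
empty = {0: [], 1: [], 2: []}
print(pseudo_loglik(X, chain) > pseudo_loglik(X, empty))
```

In practice the plug-in score above would need a complexity penalty or, as in the paper, a marginal (Bayesian) treatment, since otherwise denser graphs always score at least as well.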

Accelerating Deep Learning with Shrinkage and Recall

Title Accelerating Deep Learning with Shrinkage and Recall
Authors Shuai Zheng, Abhinav Vishnu, Chris Ding
Abstract Deep learning is a very powerful machine learning approach, but training its large number of parameters across multiple layers is very slow when the data scale and the architecture size are large. Inspired by the shrinking technique used to accelerate the computation of Support Vector Machines (SVM) and the screening technique used in LASSO, we propose shrinking Deep Learning with recall (sDLr) to speed up deep learning computation. We evaluate sDLr with Deep Neural Networks (DNN), Deep Belief Networks (DBN) and Convolutional Neural Networks (CNN) on 4 datasets. Results show that the speedup from sDLr can exceed 2.0 while still giving competitive classification performance.
Tasks
Published 2016-05-04
URL http://arxiv.org/abs/1605.01369v2
PDF http://arxiv.org/pdf/1605.01369v2.pdf
PWC https://paperswithcode.com/paper/accelerating-deep-learning-with-shrinkage-and
Repo
Framework
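The shrink-and-recall pattern described in the abstract can be sketched generically: periodically drop training examples the model already handles confidently (shrinking), then restore the full set late in training (recall). The toy below is a hedged illustration on logistic regression with made-up thresholds and schedules, not the paper's sDLr algorithm or its DNN/DBN/CNN experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (X @ w_true > 0).astype(float)     # linearly separable toy labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(d)
active = np.arange(n)                  # indices currently trained on

for epoch in range(60):
    Xa, ya = X[active], y[active]
    p = sigmoid(Xa @ w)
    w -= 0.5 * Xa.T @ (p - ya) / len(active)   # gradient step on the active set only
    if epoch % 10 == 9:
        margin = np.abs(sigmoid(X @ w) - y)
        if epoch < 40:
            active = np.where(margin > 0.05)[0]   # shrink: keep hard examples
            if len(active) == 0:
                active = np.arange(n)
        else:
            active = np.arange(n)                 # recall: restore the full set

acc = np.mean((sigmoid(X @ w) > 0.5) == y.astype(bool))
print(f"accuracy: {acc:.3f}")
```

The speedup comes from the gradient step costing O(|active|) rather than O(n) between recalls; the recall phase guards against discarding examples that later become hard again.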

Neural Coarse-Graining: Extracting slowly-varying latent degrees of freedom with neural networks

Title Neural Coarse-Graining: Extracting slowly-varying latent degrees of freedom with neural networks
Authors Nicholas Guttenberg, Martin Biehl, Ryota Kanai
Abstract We present a loss function for neural networks that encompasses an idea of trivial versus non-trivial predictions, such that the network jointly determines its own prediction goals and learns to satisfy them. This permits the network to choose the sub-problems most amenable to its abilities and focus on solving them, while discarding ‘distracting’ elements that interfere with its learning. To do this, the network first transforms the raw data into a higher-level categorical representation, and then trains a predictor from that new time series to its future. To prevent the trivial solution of mapping the signal to zero, we introduce a measure of non-triviality via a contrast between the prediction error of the learned model and that of a naive model of the overall signal statistics. The transform can learn to discard uninformative and unpredictable components of the signal in favor of features that are both highly predictive and highly predictable. This creates a coarse-grained model of the time-series dynamics that focuses on predicting the slowly varying latent parameters controlling the statistics of the time series, rather than predicting the fast details directly. The result is a semi-supervised algorithm capable of extracting latent parameters, segmenting sections of time series with differing statistics, and building a higher-level representation of the underlying dynamics from unlabeled data.
Tasks Time Series
Published 2016-09-01
URL http://arxiv.org/abs/1609.00116v1
PDF http://arxiv.org/pdf/1609.00116v1.pdf
PWC https://paperswithcode.com/paper/neural-coarse-graining-extracting-slowly
Repo
Framework
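The non-triviality contrast can be illustrated without a neural network: score a representation by how much a learned predictor beats a naive predictor that only knows the overall signal statistics (here, its mean). The sketch below uses a least-squares AR(1) predictor as a stand-in for the paper's learned model; the specific score and the AR(1) choice are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def nontriviality(rep, horizon=1):
    """Relative gain of an AR(1)-style predictor over the naive mean predictor.

    rep: 1-D representation time series. Returns (naive_mse - model_mse) / naive_mse;
    a constant (trivial) representation scores 0.
    """
    past, future = rep[:-horizon], rep[horizon:]
    naive_mse = np.mean((future - future.mean()) ** 2)
    if naive_mse < 1e-12:
        return 0.0                                   # constant signal: trivial
    a = past @ future / max(past @ past, 1e-12)      # least-squares AR(1) coefficient
    model_mse = np.mean((future - a * past) ** 2)
    return (naive_mse - model_mse) / naive_mse

rng = np.random.default_rng(2)
t = np.arange(2000)
slow = np.sin(2 * np.pi * t / 400)     # slowly varying latent component
noise = rng.standard_normal(2000)      # unpredictable fast details

print(round(nontriviality(slow), 3))   # close to 1: highly predictable
print(round(nontriviality(noise), 3))  # close to 0: nothing beats the mean
print(nontriviality(np.zeros(2000)))   # trivial constant map gets no credit
```

A transform trained to maximize such a score is pushed toward the slowly varying, predictable components and away from both the unpredictable noise and the degenerate constant solution.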

Correct classification for big/smart/fast data machine learning

Title Correct classification for big/smart/fast data machine learning
Authors Sander Stepanov
Abstract Classification of table (relational database) data for big/smart/fast data machine learning is one of the most important tasks in predictive analytics and in extracting valuable information from data. It is a core applied technique underlying what is now understood as data science and artificial intelligence. The widely used Decision Tree (Random Forest) classifiers, and the rarely used rule-based classifiers such as PRISM and VFST, are empirical substitutes for the theoretically correct approach of minimizing Boolean functions. The development of Boolean function minimization algorithms began long ago, with Edward Veitch’s work in 1952, and the broader scientific and industrial community has since devoted great effort to finding feasible solutions to Boolean function minimization. In this paper we propose considering table data classification from a mathematical point of view, as minimization of Boolean functions. We show that the data representation may be transformed into Boolean function form and how known minimization algorithms can then be applied. For simplicity, a binary output function is used in the development, which opens the door to extensions for multi-valued outputs.
Tasks
Published 2016-09-27
URL http://arxiv.org/abs/1609.08550v1
PDF http://arxiv.org/pdf/1609.08550v1.pdf
PWC https://paperswithcode.com/paper/correct-classification-for-bigsmartfast-data
Repo
Framework
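The transformation the abstract describes, treating positive table rows as minterms of a Boolean function and then minimizing, can be sketched with a single Quine-McCluskey-style merging pass. This is a generic illustration of Boolean function minimization, not the paper's specific algorithm; the truth table is a made-up example for the function x0 AND (x1 OR x2).

```python
from itertools import combinations

def to_implicants(rows):
    """Each positive row becomes a minterm string, e.g. (1, 0, 1) -> '101'."""
    return {''.join(map(str, r)) for r in rows}

def merge_once(terms):
    """One Quine-McCluskey pass: combine terms differing in exactly one position,
    replacing the differing bit with a don't-care '-'."""
    merged, used = set(), set()
    for a, b in combinations(sorted(terms), 2):
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diff) == 1:
            merged.add(a[:diff[0]] + '-' + a[diff[0] + 1:])
            used.update({a, b})
    return merged | (terms - used)

# Positive rows (x0, x1, x2) of the target function x0 AND (x1 OR x2):
positives = [(1, 0, 1), (1, 1, 0), (1, 1, 1)]
terms = merge_once(to_implicants(positives))
print(sorted(terms))  # ['1-1', '11-']  i.e. (x0 AND x2) OR (x0 AND x1)
```

A full minimizer iterates the merge until no pair combines and then selects a prime-implicant cover; one pass already shows how three table rows collapse into two rules.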

Differentiable Programs with Neural Libraries

Title Differentiable Programs with Neural Libraries
Authors Alexander L. Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow
Abstract We develop a framework for combining differentiable programming languages with neural networks. Using this framework we create end-to-end trainable systems that learn to write interpretable algorithms with perceptual components. We explore the benefits of inductive biases for strong generalization and modularity that come from the program-like structure of our models. In particular, modularity allows us to learn a library of (neural) functions which grows and improves as more tasks are solved. Empirically, we show that this leads to lifelong learning systems that transfer knowledge to new tasks more effectively than baselines.
Tasks
Published 2016-11-07
URL http://arxiv.org/abs/1611.02109v2
PDF http://arxiv.org/pdf/1611.02109v2.pdf
PWC https://paperswithcode.com/paper/differentiable-programs-with-neural-libraries
Repo
Framework

Appearance Harmonization for Single Image Shadow Removal

Title Appearance Harmonization for Single Image Shadow Removal
Authors Liqian Ma, Jue Wang, Eli Shechtman, Kalyan Sunkavalli, Shimin Hu
Abstract Shadows often create unwanted artifacts in photographs, and removing them can be very challenging. Previous shadow removal methods often produce de-shadowed regions that are visually inconsistent with the rest of the image. In this work we propose a fully automatic shadow region harmonization approach that improves the appearance compatibility of the de-shadowed region as typically produced by previous methods. It is based on a shadow-guided patch-based image synthesis approach that reconstructs the shadow region using patches sampled from non-shadowed regions. The result is then refined based on the reconstruction confidence to handle unique image patterns. Extensive shadow removal results and comparisons demonstrate the effectiveness of our improvement. Quantitative evaluation on a benchmark dataset suggests that our automatic shadow harmonization approach effectively improves upon the state-of-the-art.
Tasks Image Generation, Image Shadow Removal
Published 2016-03-21
URL http://arxiv.org/abs/1603.06398v1
PDF http://arxiv.org/pdf/1603.06398v1.pdf
PWC https://paperswithcode.com/paper/appearance-harmonization-for-single-image
Repo
Framework

Proceedings First International Workshop on Hammers for Type Theories

Title Proceedings First International Workshop on Hammers for Type Theories
Authors Jasmin Christian Blanchette, Cezary Kaliszyk
Abstract This volume of EPTCS contains the proceedings of the First Workshop on Hammers for Type Theories (HaTT 2016), held on 1 July 2016 as part of the International Joint Conference on Automated Reasoning (IJCAR 2016) in Coimbra, Portugal. The proceedings contain four regular papers, as well as abstracts of the two invited talks by Pierre Corbineau (Verimag, France) and Aleksy Schubert (University of Warsaw, Poland).
Tasks
Published 2016-06-17
URL http://arxiv.org/abs/1606.05427v1
PDF http://arxiv.org/pdf/1606.05427v1.pdf
PWC https://paperswithcode.com/paper/proceedings-first-international-workshop-on
Repo
Framework

Patterns of Scalable Bayesian Inference

Title Patterns of Scalable Bayesian Inference
Authors Elaine Angelino, Matthew James Johnson, Ryan P. Adams
Abstract Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.
Tasks Bayesian Inference
Published 2016-02-16
URL http://arxiv.org/abs/1602.05221v2
PDF http://arxiv.org/pdf/1602.05221v2.pdf
PWC https://paperswithcode.com/paper/patterns-of-scalable-bayesian-inference
Repo
Framework
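One of the recurring patterns this survey covers is minibatch MCMC, where stochastic gradients replace full-data gradients inside a sampler. A standard representative is stochastic gradient Langevin dynamics (SGLD); the sketch below runs SGLD on a toy conjugate-Gaussian posterior with assumed hyperparameters (prior N(0, 10), step size 5e-4), chosen only to make the pattern concrete.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 2000
theta_true = 1.5
data = theta_true + rng.standard_normal(N)      # x_i ~ N(theta, 1)

# SGLD: each step uses a minibatch gradient rescaled by N/batch, plus injected
# Gaussian noise of variance eps, yielding approximate posterior samples.
theta, eps, batch = 0.0, 5e-4, 100
samples = []
for step in range(4000):
    idx = rng.integers(0, N, batch)
    grad_prior = -theta / 10.0                          # d/dtheta log N(theta | 0, 10)
    grad_lik = (N / batch) * np.sum(data[idx] - theta)  # rescaled minibatch gradient
    theta += 0.5 * eps * (grad_prior + grad_lik) + np.sqrt(eps) * rng.standard_normal()
    if step > 1000:                                     # discard burn-in
        samples.append(theta)

print(f"posterior mean estimate: {np.mean(samples):.3f}  (true theta: {theta_true})")
```

Each step touches only 100 of the 2000 observations, which is the scalability win; the cost, discussed in the survey's taxonomy, is an asymptotic bias controlled by the step size.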

Generalizing Prototype Theory: A Formal Quantum Framework

Title Generalizing Prototype Theory: A Formal Quantum Framework
Authors Diederik Aerts, Jan Broekaert, Liane Gabora, Sandro Sozzo
Abstract Theories of natural language and concepts have been unable to model the flexibility, creativity, context-dependence, and emergence exhibited by words, concepts and their combinations. The mathematical formalism of quantum theory has instead been successful in capturing phenomena such as graded membership, situational meaning, composition of categories, and also more complex decision-making situations, which cannot be modeled in traditional probabilistic approaches. We show how a formal quantum approach to concepts and their combinations can provide a powerful extension of prototype theory. We explain how prototypes can interfere in conceptual combinations as a consequence of their contextual interactions, and provide an illustration of this using an intuitive wave-like diagram. This quantum-conceptual approach gives new life to original prototype theory, without however making it a privileged concept theory, as we explain at the end of our paper.
Tasks Decision Making
Published 2016-01-25
URL http://arxiv.org/abs/1601.06610v1
PDF http://arxiv.org/pdf/1601.06610v1.pdf
PWC https://paperswithcode.com/paper/generalizing-prototype-theory-a-formal
Repo
Framework
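The interference effect the abstract refers to can be made concrete with a tiny amplitude calculation: in quantum-style concept models, the membership of an item in a combination "A or B" picks up a cross term beyond the classical average of the two memberships. All the numbers below (membership weights 0.4 and 0.3, relative phase π/3) are hypothetical values chosen for illustration, not from the paper.

```python
import numpy as np

# Membership amplitudes of an item in concepts A and B.
mu_a, mu_b = 0.4, 0.3                    # classical membership weights (hypothetical)
phase = np.pi / 3                        # relative phase between the two concept states
amp_a = np.sqrt(mu_a)
amp_b = np.sqrt(mu_b) * np.exp(1j * phase)

# For an equal-weight superposition (|A> + |B>)/sqrt(2), the membership in
# "A or B" is the classical average plus an interference cross term:
classical = 0.5 * (mu_a + mu_b)                  # average, no interference
interference = np.real(np.conj(amp_a) * amp_b)   # Re(<A|x><x|B>)-style cross term
quantum = classical + interference

print(f"classical average: {classical:.3f}")
print(f"with interference: {quantum:.3f}")
```

Depending on the phase, the cross term can raise the combined membership above both components (constructive) or push it below their average (destructive), which is how these models account for the "guppy"-type deviations that classical prototype averaging cannot produce.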

Confidence-Constrained Maximum Entropy Framework for Learning from Multi-Instance Data

Title Confidence-Constrained Maximum Entropy Framework for Learning from Multi-Instance Data
Authors Behrouz Behmardi, Forrest Briggs, Xiaoli Z. Fern, Raviv Raich
Abstract Multi-instance data, in which each object (bag) contains a collection of instances, are widespread in machine learning, computer vision, bioinformatics, signal processing, and social sciences. We present a maximum entropy (ME) framework for learning from multi-instance data. In this approach each bag is represented as a distribution using the principle of ME. We introduce the concept of confidence-constrained ME (CME) to simultaneously learn the structure of distribution space and infer each distribution. The shared structure underlying each density is used to learn from instances inside each bag. The proposed CME is free of tuning parameters. We devise a fast optimization algorithm capable of handling large scale multi-instance data. In the experimental section, we evaluate the performance of the proposed approach in terms of exact rank recovery in the space of distributions and compare it with the regularized ME approach. Moreover, we compare the performance of CME with Multi-Instance Learning (MIL) state-of-the-art algorithms and show a comparable performance in terms of accuracy with reduced computational complexity.
Tasks
Published 2016-03-07
URL http://arxiv.org/abs/1603.01901v1
PDF http://arxiv.org/pdf/1603.01901v1.pdf
PWC https://paperswithcode.com/paper/confidence-constrained-maximum-entropy
Repo
Framework
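The first step of the framework, representing each bag as a maximum-entropy distribution matched to the bag's empirical feature expectations, can be sketched for a single 1-D feature on a discrete support. This shows only the per-bag ME fit via gradient ascent on the dual; the paper's actual contribution, the confidence constraint that couples bags through a shared low-rank structure, is not reproduced here, and the bag values and support are made up.

```python
import numpy as np

def maxent_bag(instances, bins, iters=500, lr=0.5):
    """Fit a discrete max-entropy distribution p(x) proportional to exp(lam * x)
    on `bins`, whose mean matches the bag's empirical mean (one feature for brevity)."""
    target = np.mean(instances)
    lam = 0.0
    for _ in range(iters):
        logits = lam * bins
        p = np.exp(logits - logits.max())   # stabilized softmax over the support
        p /= p.sum()
        lam += lr * (target - p @ bins)     # gradient of the dual objective
    return p

bins = np.linspace(-3, 3, 61)
bag = np.array([0.8, 1.1, 1.4, 0.9])        # one bag's instances (hypothetical)
p = maxent_bag(bag, bins)
print(f"matched mean: {p @ bins:.3f} (target {bag.mean():.3f})")
```

Among all distributions with the required mean, this exponential-family form has maximum entropy, so the bag is summarized by a single dual parameter per feature rather than by its raw instances.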

Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations

Title Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations
Authors Behnam Neyshabur, Yuhuai Wu, Ruslan Salakhutdinov, Nathan Srebro
Abstract We investigate the parameter-space geometry of recurrent neural networks (RNNs), and develop an adaptation of path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require capturing long-term dependency structure, we show that path-SGD can significantly improve trainability of ReLU RNNs compared to RNNs trained with SGD, even with various recently suggested initialization schemes.
Tasks
Published 2016-05-23
URL http://arxiv.org/abs/1605.07154v1
PDF http://arxiv.org/pdf/1605.07154v1.pdf
PWC https://paperswithcode.com/paper/path-normalized-optimization-of-recurrent
Repo
Framework
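The geometry path-SGD exploits is the path norm: the sum over all input-to-output paths of the product of squared edge weights, which is invariant to the node-wise rescalings that leave a ReLU network's function unchanged. The sketch below computes it for a small two-layer net by forwarding an all-ones vector through the element-wise squared weight matrices, and checks the rescaling invariance; layer sizes are arbitrary, and the optimizer itself is not implemented.

```python
import numpy as np

def path_norm(weights):
    """Squared-path norm of a layered ReLU net: sum over all input-output paths
    of the product of squared edge weights, computed by propagating an all-ones
    vector through the element-wise squared weight matrices."""
    v = np.ones(weights[0].shape[0])
    for W in weights:
        v = (W ** 2).T @ v
    return v.sum()

rng = np.random.default_rng(4)
W1 = rng.standard_normal((3, 4))    # input -> hidden
W2 = rng.standard_normal((4, 2))    # hidden -> output
pn = path_norm([W1, W2])

# ReLU nets are invariant to node-wise rescaling: multiply a hidden unit's
# incoming weights by c and its outgoing weights by 1/c. The path norm,
# unlike per-layer norms, is unchanged by this.
c = 2.5
W1s, W2s = W1.copy(), W2.copy()
W1s[:, 0] *= c
W2s[0, :] /= c
print(np.isclose(pn, path_norm([W1s, W2s])))  # True
```

Path-SGD rescales gradient steps by quantities derived from this norm, so optimization behaves identically across all rescaled parameterizations of the same function, which is the property the paper argues matters for training plain ReLU RNNs.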

A Scalable and Robust Framework for Intelligent Real-time Video Surveillance

Title A Scalable and Robust Framework for Intelligent Real-time Video Surveillance
Authors Shreenath Dutt, Ankita Kalra
Abstract In this paper, we present an intelligent, reliable and storage-efficient video surveillance system using Apache Storm and OpenCV. As a Storm topology, we have added multiple information extraction modules that only write important content to the disk. Our topology is extensible and capable of adding novel algorithms as the use case requires, without affecting the existing ones, since all the processing is independent. This framework is also highly scalable and fault tolerant, which makes it an excellent option for organisations that need to monitor a large network of surveillance cameras.
Tasks
Published 2016-10-30
URL http://arxiv.org/abs/1610.09590v1
PDF http://arxiv.org/pdf/1610.09590v1.pdf
PWC https://paperswithcode.com/paper/a-scalable-and-robust-framework-for
Repo
Framework
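The storage-efficiency idea, writing only "important" frames to disk, can be sketched independently of the Storm/OpenCV stack with simple frame differencing. This is a hedged stand-in for one extraction module, not the paper's topology; the threshold and synthetic frames are assumptions for illustration.

```python
import numpy as np

def important_frames(frames, threshold=10.0):
    """Yield indices of frames whose mean absolute difference from the previous
    frame exceeds `threshold` - only these would be written to disk."""
    prev = None
    for i, f in enumerate(frames):
        cur = f.astype(float)
        if prev is None or np.mean(np.abs(cur - prev)) > threshold:
            yield i
        prev = cur

# Synthetic feed: a static scene with one sudden change at frame 6.
rng = np.random.default_rng(5)
static = rng.integers(0, 255, (48, 64), dtype=np.uint8)
frames = [static.copy() for _ in range(10)]
frames[6] = rng.integers(0, 255, (48, 64), dtype=np.uint8)   # "motion" event

kept = list(important_frames(frames))
print(kept)  # [0, 6, 7]: first frame, the change, and the change back
```

In the described architecture each such module would run as an independent bolt in the Storm topology, so adding a new detector (faces, licence plates) does not disturb this one.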

Word sense disambiguation: a complex network approach

Title Word sense disambiguation: a complex network approach
Authors Edilson A. Correa Jr., Alneu de Andrade Lopes, Diego R. Amancio
Abstract In recent years, concepts and methods of complex networks have been employed to tackle the word sense disambiguation (WSD) task by representing words as nodes, which are connected if they are semantically similar. Despite the increasing number of studies carried out with such models, most of them use networks only to represent the data, while the pattern recognition on the attribute space is performed with traditional learning techniques. In other words, the structural relationships between words have not been explicitly used in the pattern recognition process. In addition, only a few investigations have probed the suitability of representations based on bipartite networks and graphs (bigraphs) for the problem, as many approaches consider all possible links between words. In this context, we assess the relevance of a bipartite network model representing both feature words (i.e. the words characterizing the context) and target (ambiguous) words to solve ambiguities in written texts. Here, we focus on the semantic relationships between these two types of words, disregarding the relationships between feature words. In particular, the proposed method not only serves to represent texts as graphs, but also constructs a structure on which the discrimination of senses is accomplished. Our results reveal that the proposed learning algorithm on such bipartite networks provides excellent results, mostly when topical features are employed to characterize the context. Surprisingly, our method even outperformed the support vector machine algorithm in particular cases, with the advantage of being robust even when only a small training dataset is available. Taken together, the results obtained here show that the proposed representation/classification method might be useful for improving the semantic characterization of written texts.
Tasks Word Sense Disambiguation
Published 2016-06-25
URL http://arxiv.org/abs/1606.07950v2
PDF http://arxiv.org/pdf/1606.07950v2.pdf
PWC https://paperswithcode.com/paper/word-sense-disambiguation-a-complex-network
Repo
Framework
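The bipartite idea, edges only between feature words and target senses, never between feature words, can be sketched with a minimal co-occurrence graph and a score that sums edge weights reachable from a test context. The training contexts and the max-score decision rule below are hypothetical simplifications, not the paper's corpus or learning algorithm.

```python
from collections import defaultdict

# Hypothetical labelled contexts for the ambiguous word "bank":
train = [
    (["money", "deposit", "loan"], "bank/finance"),
    (["loan", "interest", "account"], "bank/finance"),
    (["river", "water", "fishing"], "bank/river"),
    (["water", "shore", "mud"], "bank/river"),
]

# Build the bipartite network: feature word -> sense -> co-occurrence weight.
# Note there are no feature-word-to-feature-word edges.
edges = defaultdict(lambda: defaultdict(int))
for context, sense in train:
    for w in context:
        edges[w][sense] += 1

def disambiguate(context):
    """Score each sense by summing the bipartite edge weights reachable
    from the context's feature words; return the best-scoring sense."""
    scores = defaultdict(int)
    for w in context:
        for sense, weight in edges[w].items():
            scores[sense] += weight
    return max(scores, key=scores.get) if scores else None

print(disambiguate(["loan", "money"]))      # bank/finance
print(disambiguate(["river", "fishing"]))   # bank/river
```

Because classification happens directly on the graph structure, the structural relationships between words participate in the pattern recognition step itself, which is the gap in prior network-based WSD work the abstract points out.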

FusionNet: 3D Object Classification Using Multiple Data Representations

Title FusionNet: 3D Object Classification Using Multiple Data Representations
Authors Vishakh Hegde, Reza Zadeh
Abstract High-quality 3D object recognition is an important component of many vision and robotics systems. We tackle the object recognition problem using two data representations, to achieve leading results on the Princeton ModelNet challenge. The two representations are: 1. Volumetric representation: the 3D object is discretized spatially as binary voxels - $1$ if the voxel is occupied and $0$ otherwise. 2. Pixel representation: the 3D object is represented as a set of projected 2D pixel images. Current leading submissions to the ModelNet Challenge use Convolutional Neural Networks (CNNs) on pixel representations. However, we diverge from this trend and additionally use Volumetric CNNs, bridging the gap between the two representations. We combine both representations and exploit them to learn new features, which yield a significantly better classifier than using either representation in isolation. To do this, we introduce new Volumetric CNN (V-CNN) architectures.
Tasks 3D Object Classification, 3D Object Recognition, Object Classification, Object Recognition
Published 2016-07-19
URL http://arxiv.org/abs/1607.05695v4
PDF http://arxiv.org/pdf/1607.05695v4.pdf
PWC https://paperswithcode.com/paper/fusionnet-3d-object-classification-using
Repo
Framework
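The volumetric representation in the abstract, a binary occupancy grid, is straightforward to construct from a point cloud. The sketch below is a generic voxelizer on a toy sphere, not FusionNet's preprocessing pipeline; the 32-cube resolution is a common choice but assumed here.

```python
import numpy as np

def voxelize(points, grid=32):
    """Binary voxel occupancy grid: 1 if any point falls in a cell, else 0.
    Points are rescaled to fit the [0, grid) cube."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    scaled = (points - lo) / np.maximum(hi - lo, 1e-12) * (grid - 1)
    idx = scaled.astype(int)
    vox = np.zeros((grid, grid, grid), dtype=np.uint8)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return vox

# Toy "object": points sampled on the surface of a unit sphere.
rng = np.random.default_rng(6)
p = rng.standard_normal((5000, 3))
p /= np.linalg.norm(p, axis=1, keepdims=True)

vox = voxelize(p, grid=32)
print(f"occupied voxels: {int(vox.sum())} / {32**3}")
```

A Volumetric CNN then convolves 3D filters over this grid, while the pixel branch renders the same object to 2D views; the paper's contribution is fusing the two.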