April 2, 2020

Paper Group ANR 90

Almost Sure Convergence of Dropout Algorithms for Neural Networks

Title Almost Sure Convergence of Dropout Algorithms for Neural Networks
Authors Albert Senen-Cerda, Jaron Sanders
Abstract We investigate the convergence and convergence rate of stochastic training algorithms for Neural Networks (NNs) that, over the years, have spawned from Dropout (Hinton et al., 2012). Modeling that neurons in the brain may not fire, dropout algorithms consist in practice of multiplying the weight matrices of a NN component-wise by independently drawn random matrices with $\{0,1\}$-valued entries during each iteration of the Feedforward-Backpropagation algorithm. This paper presents a probability theoretical proof that for any NN topology and differentiable polynomially bounded activation functions, if we project the NN’s weights into a compact set and use a dropout algorithm, then the weights converge to a unique stationary set of a projected system of Ordinary Differential Equations (ODEs). We also establish an upper bound on the rate of convergence of Gradient Descent (GD) on the limiting ODEs of dropout algorithms for arborescences (a class of trees) of arbitrary depth and with linear activation functions.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02247v1
PDF https://arxiv.org/pdf/2002.02247v1.pdf
PWC https://paperswithcode.com/paper/almost-sure-convergence-of-dropout-algorithms
Repo
Framework
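
The masking step described in the abstract above (multiplying a weight matrix component-wise by a random matrix with entries in {0,1}) can be illustrated with a minimal sketch. The function names, the keep probability of 0.5, and the toy weight matrix are assumptions made for illustration; they are not taken from the paper, and the sketch omits the projection and the training loop.

```python
import random

def dropout_mask(rows, cols, keep_prob, rng):
    """Draw a rows x cols matrix of independent {0,1}-valued entries.

    Each entry is 1 with probability keep_prob (an assumed parameter).
    """
    return [[1 if rng.random() < keep_prob else 0 for _ in range(cols)]
            for _ in range(rows)]

def apply_mask(weights, mask):
    """Multiply the weight matrix by the mask component-wise."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]

rng = random.Random(0)                      # seeded for reproducibility
W = [[0.5, -1.2, 0.3], [2.0, 0.1, -0.7]]    # hypothetical toy weights
mask = dropout_mask(2, 3, keep_prob=0.5, rng=rng)
W_dropped = apply_mask(W, mask)             # a fresh mask would be drawn each iteration
```

In a training loop, a new mask would be drawn at every iteration and the masked weights used in both the forward and backward pass.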

Modelisation de l’incertitude et de l’imprecision de donnees de crowdsourcing : MONITOR

Title Modelisation de l’incertitude et de l’imprecision de donnees de crowdsourcing : MONITOR
Authors Constance Thierry, Jean-Christophe Dubois, Yolande Le Gall, Arnaud Martin
Abstract Crowdsourcing is defined as the outsourcing of tasks to a crowd of contributors. The crowd is very diverse on these platforms and includes malicious contributors attracted by the remuneration of tasks and not conscientiously performing them. It is essential to identify these contributors in order to avoid considering their responses. As not all contributors have the same aptitude for a task, it seems appropriate to give weight to their answers according to their qualifications. This paper, published at the ICTAI 2019 conference, proposes a method, MONITOR, for estimating the profile of the contributor and aggregating the responses using belief function theory.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11717v1
PDF https://arxiv.org/pdf/2002.11717v1.pdf
PWC https://paperswithcode.com/paper/modelisation-de-lincertitude-et-de
Repo
Framework

Facets of the PIE Environment for Proving, Interpolating and Eliminating on the Basis of First-Order Logic

Title Facets of the PIE Environment for Proving, Interpolating and Eliminating on the Basis of First-Order Logic
Authors Christoph Wernhard
Abstract PIE is a Prolog-embedded environment for automated reasoning on the basis of first-order logic. Its main focus is on formulas, as constituents of complex formalizations that are structured through formula macros, and as outputs of reasoning tasks such as second-order quantifier elimination and Craig interpolation. It supports a workflow based on documents that intersperse macro definitions, invocations of reasoners, and LaTeX-formatted natural language text. Starting from various examples, the paper discusses features and application possibilities of PIE along with current limitations and issues for future research.
Tasks
Published 2020-02-24
URL https://arxiv.org/abs/2002.10892v1
PDF https://arxiv.org/pdf/2002.10892v1.pdf
PWC https://paperswithcode.com/paper/facets-of-the-pie-environment-for-proving
Repo
Framework

Semantic Relatedness and Taxonomic Word Embeddings

Title Semantic Relatedness and Taxonomic Word Embeddings
Authors Magdalena Kacmajor, John D. Kelleher, Filip Klubicka, Alfredo Maldonado
Abstract This paper connects a series of papers dealing with taxonomic word embeddings. It begins by noting that there are different types of semantic relatedness and that different lexical representations encode different forms of relatedness. A particularly important distinction within semantic relatedness is that of thematic versus taxonomic relatedness. Next, we present a number of experiments that analyse taxonomic embeddings that have been trained on a synthetic corpus that has been generated via a random walk over a taxonomy. These experiments demonstrate how the properties of the synthetic corpus, such as the percentage of rare words, are affected by the shape of the knowledge graph the corpus is generated from. Finally, we explore the interactions between the relative sizes of natural and synthetic corpora on the performance of embeddings when taxonomic and thematic embeddings are combined.
Tasks Word Embeddings
Published 2020-02-14
URL https://arxiv.org/abs/2002.06235v1
PDF https://arxiv.org/pdf/2002.06235v1.pdf
PWC https://paperswithcode.com/paper/semantic-relatedness-and-taxonomic-word
Repo
Framework
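
The synthetic-corpus idea mentioned in the abstract above (generating pseudo-sentences by a random walk over a taxonomy) can be sketched as follows. The toy taxonomy, the uniform choice of neighbours, and the fixed sentence length are all assumptions for illustration; the paper's actual walk procedure and knowledge graph are not specified here.

```python
import random

# Hypothetical toy taxonomy, stored as an undirected adjacency list.
TAXONOMY = {
    "animal": ["dog", "cat"],
    "dog": ["animal", "poodle"],
    "cat": ["animal"],
    "poodle": ["dog"],
}

def random_walk_sentence(graph, start, length, rng):
    """Walk `length` nodes from `start`, moving to a uniformly chosen neighbour."""
    node = start
    sentence = [node]
    for _ in range(length - 1):
        node = rng.choice(graph[node])
        sentence.append(node)
    return sentence

def synthetic_corpus(graph, n_sentences, length, seed=0):
    """Generate pseudo-sentences, each starting from a random node."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    return [random_walk_sentence(graph, rng.choice(nodes), length, rng)
            for _ in range(n_sentences)]

corpus = synthetic_corpus(TAXONOMY, n_sentences=5, length=4)
```

Each pseudo-sentence could then be fed to a standard word-embedding trainer; leaf nodes with a single neighbour are visited less often, which is one way the graph's shape can influence word frequencies.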

Communication-Efficient Distributed SGD with Error-Feedback, Revisited

Title Communication-Efficient Distributed SGD with Error-Feedback, Revisited
Authors Tran Thi Phuong, Le Trieu Phong
Abstract We show that the convergence proof of a recent algorithm called dist-EF-SGD for distributed stochastic gradient descent with communication efficiency using error-feedback of Zheng et al. (NeurIPS 2019) is mathematically problematic. Concretely, the original error bound for arbitrary learning-rate sequences is unfortunately incorrect, leading to an invalidated upper bound in the convergence theorem for the algorithm. As evidence, we explicitly provide several counter-examples, for both convex and non-convex cases, to show the incorrectness of the error bound. We fix the issue by providing a new error bound and its corresponding proof, leading to a new convergence theorem for the dist-EF-SGD algorithm, and therefore recovering its mathematical analysis.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.04706v1
PDF https://arxiv.org/pdf/2003.04706v1.pdf
PWC https://paperswithcode.com/paper/communication-efficient-distributed-sgd-with-2
Repo
Framework
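
For readers unfamiliar with error-feedback, the general mechanism can be sketched on a single worker: the update is compressed before it is applied, and whatever the compressor discards is stored and added back at the next step. This is a generic illustration, not the dist-EF-SGD algorithm analysed in the paper; the top-1 compressor, the quadratic objective, and the step size are assumptions.

```python
def top1_compress(vec):
    """Keep only the largest-magnitude coordinate (a simple lossy compressor)."""
    k = max(range(len(vec)), key=lambda i: abs(vec[i]))
    return [v if i == k else 0.0 for i, v in enumerate(vec)]

def ef_sgd_step(x, grad, error, lr):
    """One error-feedback step: compress (lr * grad + error), keep the residual."""
    p = [lr * g + e for g, e in zip(grad, error)]       # corrected update
    delta = top1_compress(p)                            # what is actually transmitted
    new_error = [pi - di for pi, di in zip(p, delta)]   # residual kept for next step
    new_x = [xi - di for xi, di in zip(x, delta)]
    return new_x, new_error

# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = [1.0, -2.0, 0.5]
error = [0.0, 0.0, 0.0]
for _ in range(200):
    x, error = ef_sgd_step(x, grad=x, error=error, lr=0.1)
```

The key invariant is that the transmitted part plus the stored residual always equals the corrected update, so no gradient information is permanently dropped.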

CIFAR-10 Image Classification Using Feature Ensembles

Title CIFAR-10 Image Classification Using Feature Ensembles
Authors Felipe O. Giuste, Juan C. Vizcarra
Abstract Image classification requires the generation of features capable of detecting image patterns informative of group identity. The objective of this study was to classify images from the public CIFAR-10 image dataset by leveraging combinations of disparate image feature sources from both manual and deep learning approaches. Histogram of oriented gradients (HOG) and pixel intensities successfully inform classification (53% and 59% classification accuracy, respectively), yet there is much room for improvement. VGG16 with ImageNet trained weights and a CIFAR-10 optimized model (CIFAR-VGG) further improve upon image classification (60% and 93.43% accuracy, respectively). We further improved classification by utilizing transfer learning to re-establish optimal network weights for VGG16 (TL-VGG) and Inception ResNet v2 (TL-Inception), resulting in significant performance increases (85% and 90.74%, respectively), yet these models fail to surpass CIFAR-VGG. We hypothesized that if each generated feature set obtained some unique insight into the classification problem, then combining these features would result in greater classification accuracy, surpassing that of CIFAR-VGG. Upon selection of the top 1000 principal components from TL-VGG, TL-Inception, HOG, pixel intensities, and CIFAR-VGG, we achieved testing accuracy of 94.6%, lending support to our hypothesis.
Tasks Image Classification, Transfer Learning
Published 2020-02-07
URL https://arxiv.org/abs/2002.03846v2
PDF https://arxiv.org/pdf/2002.03846v2.pdf
PWC https://paperswithcode.com/paper/cifar-10-image-classification-using-feature
Repo
Framework

Structured Compression and Sharing of Representational Space for Continual Learning

Title Structured Compression and Sharing of Representational Space for Continual Learning
Authors Gobinda Saha, Isha Garg, Aayush Ankit, Kaushik Roy
Abstract Humans are skilled at learning adaptively and efficiently throughout their lives, but learning tasks incrementally causes artificial neural networks to overwrite relevant information learned about older tasks, resulting in ‘Catastrophic Forgetting’. Efforts to overcome this phenomenon suffer from poor utilization of resources in many ways, such as through the need to save older data or parametric importance scores, or to grow the network architecture. We propose an algorithm that enables a network to learn continually and efficiently by partitioning the representational space into a Core space, that contains the condensed information from previously learned tasks, and a Residual space, which is akin to a scratch space for learning the current task. The information in the Residual space is then compressed using Principal Component Analysis and added to the Core space, freeing up parameters for the next task. We evaluate our algorithm on P-MNIST, CIFAR-10 and CIFAR-100 datasets. We achieve comparable accuracy to state-of-the-art methods while overcoming the problem of catastrophic forgetting completely. Additionally, we get up to 4.5x improvement in energy efficiency during inference due to the structured nature of the resulting architecture.
Tasks Continual Learning
Published 2020-01-23
URL https://arxiv.org/abs/2001.08650v2
PDF https://arxiv.org/pdf/2001.08650v2.pdf
PWC https://paperswithcode.com/paper/structured-compression-and-sharing-of
Repo
Framework

Ternary Feature Masks: continual learning without any forgetting

Title Ternary Feature Masks: continual learning without any forgetting
Authors Marc Masana, Tinne Tuytelaars, Joost van de Weijer
Abstract In this paper, we propose an approach without any forgetting to continual learning for the task-aware regime, where at inference the task-label is known. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue – and show experimentally – that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters to be added for each new task, by more than three orders of magnitude for most networks. The encoding of the ternary masks into two bits per feature creates very little overhead to the network, avoiding scalability issues. Our masks do not permit any changes to features which are used by previous tasks. As this may be too restrictive to allow learning of new tasks, we add task-specific feature normalization. This way, already learned features can adapt to the current task without changing the behavior of these features for previous tasks. Extensive experiments on several fine-grained datasets and ImageNet show that our method outperforms current state-of-the-art while reducing memory overhead in comparison to weight-based approaches.
Tasks Continual Learning
Published 2020-01-23
URL https://arxiv.org/abs/2001.08714v1
PDF https://arxiv.org/pdf/2001.08714v1.pdf
PWC https://paperswithcode.com/paper/ternary-feature-masks-continual-learning
Repo
Framework

What’s a Good Prediction? Issues in Evaluating General Value Functions Through Error

Title What’s a Good Prediction? Issues in Evaluating General Value Functions Through Error
Authors Alex Kearney, Anna Koop, Patrick M. Pilarski
Abstract Constructing and maintaining knowledge of the world is a central problem for artificial intelligence research. Approaches to constructing an agent’s knowledge using predictions have received increased amounts of interest in recent years. A particularly promising collection of research centres itself around architectures that formulate predictions as General Value Functions (GVFs), an approach commonly referred to as predictive knowledge. A pernicious challenge for predictive knowledge architectures is determining what to predict. In this paper, we argue that evaluation methods—i.e., return error and RUPEE—are not well suited for the challenges of determining what to predict. As a primary contribution, we provide extended examples that evaluate predictions in terms of how they are used in further prediction tasks: a key motivation of predictive knowledge systems. We demonstrate that simply because a GVF’s error is low, it does not necessarily follow that the prediction is useful as a cumulant. We suggest 1) evaluating the relevance of a GVF’s features to the prediction task at hand, and 2) evaluating GVFs by how they are used. To determine feature relevance, we generalize AutoStep to GTD, producing a step-size learning method suited to the life-long continual learning settings that predictive knowledge architectures are commonly deployed in. This paper contributes a first look into evaluation of predictions through their use, an integral component of predictive knowledge which is as of yet unexplored.
Tasks Continual Learning
Published 2020-01-23
URL https://arxiv.org/abs/2001.08823v1
PDF https://arxiv.org/pdf/2001.08823v1.pdf
PWC https://paperswithcode.com/paper/whats-a-good-prediction-issues-in-evaluating
Repo
Framework

Ensemble of Deep Convolutional Neural Networks for Automatic Pavement Crack Detection and Measurement

Title Ensemble of Deep Convolutional Neural Networks for Automatic Pavement Crack Detection and Measurement
Authors Zhun Fan, Chong Li, Ying Chen, Paola Di Mascio, Xiaopeng Chen, Guijie Zhu, Giuseppe Loprencipe
Abstract Automated pavement crack detection and measurement are important road issues. Agencies have to guarantee the improvement of road safety. Conventional crack detection and measurement algorithms can be extremely time-consuming and inefficient. Therefore, recently, innovative algorithms have received increased attention from researchers. In this paper, we propose an ensemble of convolutional neural networks (without a pooling layer) based on probability fusion for automated pavement crack detection and measurement. First, an ensemble of convolutional neural networks was employed to identify the structure of small cracks from raw images. Secondly, the outputs of the individual convolutional neural network models in the ensemble were averaged to produce the final crack probability value of each pixel, yielding a predicted probability map. Finally, the predicted morphological features of the cracks were measured by using the skeleton extraction algorithm. To validate the proposed method, experiments were performed on two public crack databases (CFD and AigleRN) and the results were compared with those of different state-of-the-art methods. The experimental results show that the proposed method outperforms the other methods. For crack measurement, the crack length and width can be measured based on different crack types (complex, common, thin, and intersecting cracks). The results show that the proposed algorithm can be effectively applied for crack measurement.
Tasks
Published 2020-02-08
URL https://arxiv.org/abs/2002.03241v1
PDF https://arxiv.org/pdf/2002.03241v1.pdf
PWC https://paperswithcode.com/paper/ensemble-of-deep-convolutional-neural-1
Repo
Framework
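
The probability-fusion step described in the abstract above (averaging the per-pixel outputs of the individual networks into one probability map) can be sketched in a few lines. The toy 2x2 maps and the 0.5 cutoff used to binarize the fused map are assumptions for illustration; the paper's networks and its skeleton-extraction step are not reproduced here.

```python
def average_probability_maps(maps):
    """Average per-pixel crack probabilities across the ensemble members."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / n for c in range(cols)]
            for r in range(rows)]

def threshold(prob_map, cutoff=0.5):
    """Binarize a probability map; the cutoff value is an assumed choice."""
    return [[1 if p >= cutoff else 0 for p in row] for row in prob_map]

# Hypothetical outputs of three ensemble members on a 2x2 image.
maps = [
    [[0.9, 0.2], [0.4, 0.1]],
    [[0.8, 0.4], [0.6, 0.0]],
    [[0.7, 0.3], [0.8, 0.2]],
]
fused = average_probability_maps(maps)   # predicted probability map
crack_mask = threshold(fused)            # pixels labeled as crack
```

Averaging lets a pixel be labeled as crack even when one member is unsure (the lower-left pixel here), as long as the ensemble mean clears the cutoff.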

Learning Directed Locomotion in Modular Robots with Evolvable Morphologies

Title Learning Directed Locomotion in Modular Robots with Evolvable Morphologies
Authors Gongjin Lan, Matteo De Carlo, Fuda van Diggelen, Jakub M. Tomczak, Diederik M. Roijers, A. E. Eiben
Abstract We generalize the well-studied problem of gait learning in modular robots in two dimensions. Firstly, we address locomotion in a given target direction that goes beyond learning a typical undirected gait. Secondly, rather than studying one fixed robot morphology we consider a test suite of different modular robots. This study is based on our interest in evolutionary robot systems where both morphologies and controllers evolve. In such a system, newborn robots have to learn to control their own body that is a random combination of the bodies of the parents. We apply and compare two learning algorithms, Bayesian optimization and HyperNEAT. The results of the experiments in simulation show that both methods successfully learn good controllers, but Bayesian optimization is more effective and efficient. We validate the best learned controllers by constructing three robots from the test suite in the real world and observe their fitness and actual trajectories. The obtained results indicate a reality gap that depends on the controllers and the shape of the robots, but overall the trajectories are adequate and follow the target directions successfully.
Tasks
Published 2020-01-21
URL https://arxiv.org/abs/2001.07804v1
PDF https://arxiv.org/pdf/2001.07804v1.pdf
PWC https://paperswithcode.com/paper/learning-directed-locomotion-in-modular
Repo
Framework

Private Stochastic Convex Optimization: Efficient Algorithms for Non-smooth Objectives

Title Private Stochastic Convex Optimization: Efficient Algorithms for Non-smooth Objectives
Authors Raman Arora, Teodor V. Marinov, Enayat Ullah
Abstract In this paper, we revisit the problem of private stochastic convex optimization. We propose an algorithm, based on noisy mirror descent, which achieves optimal rates up to a logarithmic factor, both in terms of statistical complexity and number of queries to a first-order stochastic oracle. Unlike prior work, we do not require Lipschitz continuity of stochastic gradients to achieve optimal rates. Our algorithm generalizes beyond the Euclidean setting and yields anytime utility and privacy guarantees.
Tasks
Published 2020-02-22
URL https://arxiv.org/abs/2002.09609v1
PDF https://arxiv.org/pdf/2002.09609v1.pdf
PWC https://paperswithcode.com/paper/private-stochastic-convex-optimization
Repo
Framework

Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume

Title Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume
Authors Adrian Johnston, Gustavo Carneiro
Abstract Monocular depth estimation has become one of the most studied applications in computer vision, where the most accurate approaches are based on fully supervised learning models. However, the acquisition of accurate and large ground truth data sets to model these fully supervised methods is a major challenge for the further development of the area. Self-supervised methods trained with monocular videos constitute one of the most promising approaches to mitigate the challenge mentioned above due to the wide-spread availability of training data. Consequently, they have been intensively studied, where the main ideas explored consist of different types of model architectures, loss functions, and occlusion masks to address non-rigid motion. In this paper, we propose two new ideas to improve self-supervised monocular trained depth estimation: 1) self-attention, and 2) discrete disparity prediction. Compared with the usual localised convolution operation, self-attention can explore more general contextual information that allows the inference of similar disparity values at non-contiguous regions of the image. Discrete disparity prediction has been shown by fully supervised methods to provide a more robust and sharper depth estimation than the more common continuous disparity prediction, besides enabling the estimation of depth uncertainty. We show that the extension of the state-of-the-art self-supervised monocular trained depth estimator Monodepth2 with these two ideas allows us to design a model that produces the best results in the field in KITTI 2015 and Make3D, closing the gap with respect to self-supervised stereo training and fully supervised approaches.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2020-03-31
URL https://arxiv.org/abs/2003.13951v1
PDF https://arxiv.org/pdf/2003.13951v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-monocular-trained-depth
Repo
Framework

Generalization of Change-Point Detection in Time Series Data Based on Direct Density Ratio Estimation

Title Generalization of Change-Point Detection in Time Series Data Based on Direct Density Ratio Estimation
Authors Mikhail Hushchyn, Andrey Ustyuzhanin
Abstract The goal of change-point detection is to discover changes in the distribution of a time series. One of the state-of-the-art approaches to change-point detection is based on direct density ratio estimation. In this work we show how existing algorithms can be generalized using various binary classification and regression models. In particular, we show that Gradient Boosting over Decision Trees and Neural Networks can be used for this purpose. The algorithms are tested on several synthetic and real-world datasets. The results show that the proposed methods outperform the classical RuLSIF algorithm. A discussion of cases where the proposed algorithms have advantages over existing methods is also provided.
Tasks Change Point Detection, Time Series
Published 2020-01-17
URL https://arxiv.org/abs/2001.06386v1
PDF https://arxiv.org/pdf/2001.06386v1.pdf
PWC https://paperswithcode.com/paper/generalization-of-change-point-detection-in
Repo
Framework

PatchPerPix for Instance Segmentation

Title PatchPerPix for Instance Segmentation
Authors Peter Hirsch, Lisa Mais, Dagmar Kainmueller
Abstract In this paper we present a novel method for proposal-free instance segmentation that can handle sophisticated object shapes that span large parts of an image and form dense object clusters with crossovers. Our method is based on predicting dense local shape descriptors, which we assemble to form instances. All instances are assembled simultaneously in one go. To our knowledge, our method is the first non-iterative method that yields instances that are composed of learnt shape patches. We evaluate our method on a diverse range of data domains, where it defines the new state of the art on four benchmarks, namely the ISBI 2012 EM segmentation benchmark, the BBBC010 C. elegans dataset, and 2d as well as 3d fluorescence microscopy datasets of cell nuclei. We furthermore show that our method also applies to 3d light microscopy data of drosophila neurons, which exhibit extreme cases of complex shape clusters.
Tasks Instance Segmentation, Semantic Segmentation
Published 2020-01-21
URL https://arxiv.org/abs/2001.07626v2
PDF https://arxiv.org/pdf/2001.07626v2.pdf
PWC https://paperswithcode.com/paper/patchperpix-for-instance-segmentation
Repo
Framework