October 20, 2019

2748 words 13 mins read

Paper Group AWR 334

Quadrature-based features for kernel approximation. Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters. Recurrent machines for likelihood-free inference. A minimax near-optimal algorithm for adaptive rejection sampling. Deep Frank-Wolfe For Neural Network Optimization. Multistep Neural Networks for Data-driven …

Quadrature-based features for kernel approximation

Title Quadrature-based features for kernel approximation
Authors Marina Munkhoeva, Yermek Kapushev, Evgeny Burnaev, Ivan Oseledets
Abstract We consider the problem of improving kernel approximation via randomized feature maps. These maps arise as Monte Carlo approximations to integral representations of kernel functions and scale up kernel methods for larger datasets. Based on an efficient numerical integration technique, we propose a unifying approach that reinterprets previous random feature methods and extends them to better estimates of the kernel approximation. We derive the convergence behaviour and conduct an extensive empirical study that supports our hypothesis.
Tasks
Published 2018-02-11
URL http://arxiv.org/abs/1802.03832v4
PDF http://arxiv.org/pdf/1802.03832v4.pdf
PWC https://paperswithcode.com/paper/quadrature-based-features-for-kernel
Repo https://github.com/quffka/quffka
Framework none
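
The abstract frames random feature maps as Monte Carlo approximations to integral representations of kernels. As a reference point, here is a minimal numpy sketch of the classical random Fourier features baseline that the paper's quadrature-based estimates improve upon; it is not the paper's quadrature construction (see the quffka repo for that).

```python
import numpy as np

def rff_features(X, n_features=512, gamma=1.0, seed=0):
    """Monte Carlo random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # The spectral density of this RBF kernel is Gaussian with variance 2*gamma.
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.randn(100, 5)
Z = rff_features(X)
K_approx = Z @ Z.T  # approximates the exact RBF Gram matrix
```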

Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters

Title Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters
Authors Anil Goyal, Emilie Morvant, Pascal Germain, Massih-Reza Amini
Abstract In this paper we propose a boosting-based multiview learning algorithm, referred to as PB-MVBoost, which iteratively learns i) weights over view-specific voters capturing view-specific information; and ii) weights over views by optimizing a PAC-Bayes multiview C-Bound that takes into account the accuracy of view-specific classifiers and the diversity between the views. We derive a generalization bound for this strategy following PAC-Bayes theory, which is a suitable tool for models expressed as a weighted combination over a set of voters. Experiments on three publicly available datasets show the efficiency of the proposed approach with respect to state-of-the-art models.
Tasks Document Classification, Multilingual text classification, Multiview Learning, Text Classification
Published 2018-08-17
URL http://arxiv.org/abs/1808.05784v2
PDF http://arxiv.org/pdf/1808.05784v2.pdf
PWC https://paperswithcode.com/paper/multiview-boosting-by-controlling-the
Repo https://github.com/goyalanil/Multiview_Dataset_MNIST
Framework none
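
As a rough sketch of the prediction rule described in the abstract, the snippet below implements a generic two-level weighted majority vote: weights over voters within each view, then weights over the views themselves. How PB-MVBoost actually learns these weights (by optimizing the PAC-Bayes multiview C-Bound) is not reproduced here.

```python
import numpy as np

def two_level_vote(view_predictions, voter_weights, view_weights):
    """Two-level weighted majority vote over binary (+/-1) predictions.

    view_predictions[v] : array (n_voters_v, n_samples) for view v
    voter_weights[v]    : weights over the voters within view v
    view_weights        : weights over the views themselves
    """
    # First level: weighted vote within each view.
    view_scores = [w @ P for w, P in zip(voter_weights, view_predictions)]
    # Second level: weighted vote across views.
    final = sum(rho * s for rho, s in zip(view_weights, view_scores))
    return np.sign(final)
```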

Recurrent machines for likelihood-free inference

Title Recurrent machines for likelihood-free inference
Authors Arthur Pesah, Antoine Wehenkel, Gilles Louppe
Abstract Likelihood-free inference is concerned with the estimation of the parameters of a non-differentiable stochastic simulator that best reproduce real observations. In the absence of a likelihood function, most of the existing inference methods optimize the simulator parameters through a handcrafted iterative procedure that tries to make the simulated data more similar to the observations. In this work, we explore whether meta-learning can be used in the likelihood-free context, to learn, directly from data, an iterative optimization procedure that solves likelihood-free inference problems. We design a recurrent inference machine that learns a sequence of parameter updates leading to good parameter estimates, without ever specifying an explicit notion of divergence between the simulated and real data distributions. We demonstrate our approach on toy simulators, showing promising results both in terms of performance and robustness.
Tasks Meta-Learning
Published 2018-11-30
URL http://arxiv.org/abs/1811.12932v2
PDF http://arxiv.org/pdf/1811.12932v2.pdf
PWC https://paperswithcode.com/paper/recurrent-machines-for-likelihood-free
Repo https://github.com/artix41/ALFI-pytorch
Framework pytorch
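
A minimal PyTorch sketch of the recurrent update idea described above: a GRU cell proposes parameter updates from summary statistics of the simulated and observed data. The architecture, dimensions, and names here are illustrative assumptions, not the ALFI-pytorch implementation.

```python
import torch
import torch.nn as nn

class RecurrentInferenceMachine(nn.Module):
    """Illustrative update rule: theta_{t+1} = theta_t + delta_t, where delta_t
    is produced by a GRU cell from the current parameters and summary
    statistics of the simulated and observed data."""
    def __init__(self, param_dim, stat_dim, hidden=64):
        super().__init__()
        self.cell = nn.GRUCell(param_dim + 2 * stat_dim, hidden)
        self.head = nn.Linear(hidden, param_dim)

    def forward(self, theta, sim_stats, obs_stats, h):
        x = torch.cat([theta, sim_stats, obs_stats], dim=-1)
        h = self.cell(x, h)
        return theta + self.head(h), h

# Rollout: simulate at theta_t, summarize, update, repeat; the machine is
# trained end-to-end on the quality of the final parameter estimates.
```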

A minimax near-optimal algorithm for adaptive rejection sampling

Title A minimax near-optimal algorithm for adaptive rejection sampling
Authors Juliette Achdou, Joseph C. Lam, Alexandra Carpentier, Gilles Blanchard
Abstract Rejection Sampling is a fundamental Monte Carlo method. It is used to sample from distributions admitting a probability density function which can be evaluated exactly at any given point, albeit at a high computational cost. However, without proper tuning, this technique incurs a high rejection rate. Several methods have been explored to cope with this problem, based on the principle of adaptively estimating the density by a simpler function, using information from previous samples. Most of them either rely on strong assumptions on the form of the density, or do not offer any theoretical performance guarantee. We give the first theoretical lower bound for the problem of adaptive rejection sampling and introduce a new algorithm which guarantees a near-optimal rejection rate in a minimax sense.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.09390v1
PDF http://arxiv.org/pdf/1810.09390v1.pdf
PWC https://paperswithcode.com/paper/a-minimax-near-optimal-algorithm-for-adaptive
Repo https://github.com/josephclam/NNARS
Framework none
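
For reference, here is vanilla (non-adaptive) rejection sampling in numpy: the baseline whose rejection rate the paper's adaptive algorithm drives toward the minimax-optimal rate. The adaptive proposal construction itself is not shown.

```python
import numpy as np

def rejection_sample(target_pdf, proposal_sample, proposal_pdf, M, n, seed=0):
    """Draw n samples from target_pdf, assuming the envelope condition
    target_pdf(x) <= M * proposal_pdf(x) holds everywhere."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        x = proposal_sample(rng)
        u = rng.uniform()
        if u * M * proposal_pdf(x) <= target_pdf(x):  # accept
            out.append(x)
    return np.array(out)

# Example: sample a Beta(2, 2) density (max value 1.5) with a uniform proposal.
beta_pdf = lambda x: 6 * x * (1 - x)
samples = rejection_sample(beta_pdf, lambda rng: rng.uniform(),
                           lambda x: 1.0, M=1.5, n=1000)
```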

Deep Frank-Wolfe For Neural Network Optimization

Title Deep Frank-Wolfe For Neural Network Optimization
Authors Leonard Berrada, Andrew Zisserman, M. Pawan Kumar
Abstract Learning a deep neural network requires solving a challenging optimization problem: it is a high-dimensional, non-convex and non-smooth minimization problem with a large number of terms. The current practice in neural network optimization is to rely on the stochastic gradient descent (SGD) algorithm or its adaptive variants. However, SGD requires a hand-designed schedule for the learning rate. In addition, its adaptive variants tend to produce solutions that generalize less well on unseen data than SGD with a hand-designed schedule. We present an optimization method that offers empirically the best of both worlds: our algorithm yields good generalization performance while requiring only one hyper-parameter. Our approach is based on a composite proximal framework, which exploits the compositional nature of deep neural networks and can leverage powerful convex optimization algorithms by design. Specifically, we employ the Frank-Wolfe (FW) algorithm for SVM, which computes an optimal step-size in closed-form at each time-step. We further show that the descent direction is given by a simple backward pass in the network, yielding the same computational cost per iteration as SGD. We present experiments on the CIFAR and SNLI data sets, where we demonstrate the significant superiority of our method over Adam, Adagrad, as well as the recently proposed BPGrad and AMSGrad. Furthermore, we compare our algorithm to SGD with a hand-designed learning rate schedule, and show that it provides similar generalization while converging faster. The code is publicly available at https://github.com/oval-group/dfw.
Tasks
Published 2018-11-19
URL http://arxiv.org/abs/1811.07591v2
PDF http://arxiv.org/pdf/1811.07591v2.pdf
PWC https://paperswithcode.com/paper/deep-frank-wolfe-for-neural-network
Repo https://github.com/oval-group/dfw
Framework pytorch
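
To make the abstract's appeal to Frank-Wolfe concrete, here is the classical FW algorithm on a toy convex problem. DFW's step-size comes in closed form from the SVM dual; the sketch below uses the standard 2/(t+2) schedule instead, so it illustrates the building block rather than the paper's method.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, n_steps=200):
    """Classical Frank-Wolfe over the probability simplex. The linear
    minimization oracle over the simplex is just a vertex: the coordinate
    with the smallest partial derivative."""
    x = x0.copy()
    for t in range(n_steps):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0      # LMO: best vertex of the simplex
        gamma = 2.0 / (t + 2.0)    # standard step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

# Minimize ||x - c||^2 over the simplex.
c = np.array([0.2, 0.5, 0.9])
x_star = frank_wolfe_simplex(lambda x: 2 * (x - c), np.ones(3) / 3)
```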

Multistep Neural Networks for Data-driven Discovery of Nonlinear Dynamical Systems

Title Multistep Neural Networks for Data-driven Discovery of Nonlinear Dynamical Systems
Authors Maziar Raissi, Paris Perdikaris, George Em Karniadakis
Abstract The process of transforming observed data into predictive mathematical models of the physical world has always been paramount in science and engineering. Although data is currently being collected at an ever-increasing pace, devising meaningful models out of such observations in an automated fashion still remains an open problem. In this work, we put forth a machine learning approach for identifying nonlinear dynamical systems from data. Specifically, we blend classical tools from numerical analysis, namely multistep time-stepping schemes, with powerful nonlinear function approximators, namely deep neural networks, to distill the mechanisms that govern the evolution of a given dataset. We test the effectiveness of our approach for several benchmark problems involving the identification of complex, nonlinear and chaotic dynamics, and we demonstrate how this allows us to accurately learn the dynamics, forecast future states, and identify basins of attraction. In particular, we study the Lorenz system, the fluid flow behind a cylinder, the Hopf bifurcation, and the glycolytic oscillator model as an example of complicated nonlinear dynamics typical of biological systems.
Tasks
Published 2018-01-04
URL http://arxiv.org/abs/1801.01236v1
PDF http://arxiv.org/pdf/1801.01236v1.pdf
PWC https://paperswithcode.com/paper/multistep-neural-networks-for-data-driven
Repo https://github.com/maziarraissi/MultistepNNs
Framework none
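
A hedged PyTorch sketch of the core idea: learn the dynamics f(x) by minimizing the residual of a time-stepping scheme evaluated on the observed trajectory. For brevity this uses the one-step trapezoidal (Adams-Moulton) rule; the paper considers general M-step schemes.

```python
import torch
import torch.nn as nn

# Learn dynamics f from a trajectory x_0, ..., x_N sampled at time step h,
# by minimizing the residual of the trapezoidal scheme:
#   x_{n+1} - x_n = (h / 2) * (f(x_{n+1}) + f(x_n)).
f = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 3))

def multistep_loss(x, h):
    """x: (N+1, 3) trajectory tensor; returns the mean squared scheme residual."""
    fx = f(x)
    residual = (x[1:] - x[:-1]) - 0.5 * h * (fx[1:] + fx[:-1])
    return residual.pow(2).mean()

# Training step (given a trajectory tensor):
# opt = torch.optim.Adam(f.parameters(), lr=1e-3)
# loss = multistep_loss(trajectory, h=0.01); loss.backward(); opt.step()
```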

Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs

Title Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs
Authors Xuhao Chen
Abstract Deep neural networks have achieved remarkable accuracy in many artificial intelligence applications, e.g. computer vision, at the cost of a large number of parameters and high computational complexity. Weight pruning can compress DNN models by removing redundant parameters in the networks, but it brings sparsity in the weight matrix, and therefore makes the computation inefficient on GPUs. Although pruning can remove more than 80% of the weights, it actually hurts inference performance (speed) when running models on GPUs. Two major problems cause this unsatisfactory performance on GPUs. First, lowering convolution onto matrix multiplication reduces data reuse opportunities and wastes memory bandwidth. Second, the sparsity brought by pruning makes the computation irregular, which leads to inefficiency when running on massively parallel GPUs. To overcome these two limitations, we propose Escort, an efficient sparse convolutional neural network inference scheme on GPUs. Instead of using the lowering method, we choose to compute the sparse convolutions directly. We then orchestrate the parallelism and locality for the direct sparse convolution kernel, and apply customized optimization techniques to further improve performance. Evaluation on NVIDIA GPUs shows that Escort can improve sparse convolution speed by 2.63x and 3.07x, and inference speed by 1.43x and 1.69x, compared to CUBLAS and CUSPARSE respectively.
Tasks
Published 2018-02-28
URL http://arxiv.org/abs/1802.10280v2
PDF http://arxiv.org/pdf/1802.10280v2.pdf
PWC https://paperswithcode.com/paper/escort-efficient-sparse-convolutional-neural
Repo https://github.com/gkUwen/learning-material
Framework none
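
A schematic numpy version of direct sparse convolution, the approach the abstract contrasts with lowering (im2col + GEMM): iterate only over nonzero weights and accumulate shifted input slices. This illustrates the algorithmic idea, not the paper's GPU kernel.

```python
import numpy as np

def direct_sparse_conv2d(x, w):
    """x: (C_in, H, W) input; w: (C_out, C_in, K, K) mostly-zero weights.
    Stride 1, 'valid' padding. Zero weights are skipped entirely, unlike
    the lowering approach, which materializes a dense matrix multiply."""
    C_out, C_in, K, _ = w.shape
    H, W = x.shape[1] - K + 1, x.shape[2] - K + 1
    y = np.zeros((C_out, H, W))
    for co, ci, kh, kw in zip(*np.nonzero(w)):
        y[co] += w[co, ci, kh, kw] * x[ci, kh:kh + H, kw:kw + W]
    return y
```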

The Road to Success: Assessing the Fate of Linguistic Innovations in Online Communities

Title The Road to Success: Assessing the Fate of Linguistic Innovations in Online Communities
Authors Marco Del Tredici, Raquel Fernández
Abstract We investigate the birth and diffusion of lexical innovations in a large dataset of online social communities. We build on sociolinguistic theories and focus on the relation between the spread of a novel term and the social role of the individuals who use it, uncovering characteristics of innovators and adopters. Finally, we perform a prediction task that allows us to anticipate whether an innovation will successfully spread within a community.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.05838v1
PDF http://arxiv.org/pdf/1806.05838v1.pdf
PWC https://paperswithcode.com/paper/the-road-to-success-assessing-the-fate-of
Repo https://github.com/marcodel13/The-Road-to-Success
Framework none

Hateminers : Detecting Hate speech against Women

Title Hateminers : Detecting Hate speech against Women
Authors Punyajoy Saha, Binny Mathew, Pawan Goyal, Animesh Mukherjee
Abstract With the online proliferation of hate speech, there is an urgent need for systems that can detect such harmful content. In this paper, we present the machine learning models developed for the Automatic Misogyny Identification (AMI) shared task at EVALITA 2018. We generate three types of features: Sentence Embeddings, TF-IDF Vectors, and BOW Vectors to represent each tweet. These features are then concatenated and fed into the machine learning models. Our model came first for the English Subtask A and fifth for the English Subtask B. We release our winning model for public use; it is available at https://github.com/punyajoy/Hateminers-EVALITA.
Tasks Hate Speech Detection, Sentence Embeddings
Published 2018-12-17
URL http://arxiv.org/abs/1812.06700v1
PDF http://arxiv.org/pdf/1812.06700v1.pdf
PWC https://paperswithcode.com/paper/hateminers-detecting-hate-speech-against
Repo https://github.com/punyajoy/Hateminers-EVALITA
Framework none
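
The feature pipeline in the abstract (TF-IDF and BOW vectors concatenated and fed into a classifier) maps naturally onto scikit-learn's FeatureUnion. The sketch below assumes that pairing and omits the sentence-embedding component, so it is a simplification rather than the released winning model.

```python
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.linear_model import LogisticRegression

# Concatenate TF-IDF and bag-of-words features, then classify.
model = Pipeline([
    ("features", FeatureUnion([
        ("tfidf", TfidfVectorizer()),
        ("bow", CountVectorizer()),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])
# model.fit(train_tweets, train_labels); model.predict(test_tweets)
```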

Reduced-order modeling with artificial neurons for gravitational-wave inference

Title Reduced-order modeling with artificial neurons for gravitational-wave inference
Authors Alvin J. K. Chua, Chad R. Galley, Michele Vallisneri
Abstract Gravitational-wave data analysis is rapidly absorbing techniques from deep learning, with a focus on convolutional networks and related methods that treat noisy time series as images. We pursue an alternative approach, in which waveforms are first represented as weighted sums over reduced bases (reduced-order modeling); we then train artificial neural networks to map gravitational-wave source parameters into basis coefficients. Statistical inference proceeds directly in coefficient space, where it is theoretically straightforward and computationally efficient. The neural networks also provide analytic waveform derivatives, which are useful for gradient-based sampling schemes. We demonstrate fast and accurate coefficient interpolation for the case of a four-dimensional binary-inspiral waveform family, and discuss promising applications of our framework in parameter estimation.
Tasks Time Series
Published 2018-11-13
URL https://arxiv.org/abs/1811.05491v2
PDF https://arxiv.org/pdf/1811.05491v2.pdf
PWC https://paperswithcode.com/paper/roman-reduced-order-modeling-with-artificial
Repo https://github.com/vallis/truebayes
Framework pytorch
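
A sketch of the two ingredients described in the abstract: build a reduced basis from training waveforms (here via SVD), then train a small network to map source parameters to basis coefficients. Dimensions and architecture are illustrative; the paper's four-dimensional binary-inspiral family motivates the 4-dimensional input.

```python
import numpy as np
import torch.nn as nn

def reduced_basis(waveforms, n_basis):
    """waveforms: (n_waveforms, waveform_length) matrix of training waveforms.
    Returns the leading right singular vectors as a reduced basis, plus the
    projection coefficients of each training waveform onto that basis."""
    _, _, Vt = np.linalg.svd(waveforms, full_matrices=False)
    basis = Vt[:n_basis]          # (n_basis, waveform_length)
    coeffs = waveforms @ basis.T  # (n_waveforms, n_basis)
    return basis, coeffs

# Network mapping source parameters to basis coefficients; a waveform is
# then reconstructed as net(params) @ basis, and inference runs in
# coefficient space.
net = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 20))
```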

GLAD: GLocalized Anomaly Detection via Active Feature Space Suppression

Title GLAD: GLocalized Anomaly Detection via Active Feature Space Suppression
Authors Shubhomoy Das, Janardhan Rao Doppa
Abstract We propose an algorithm called GLAD (GLocalized Anomaly Detection) that allows end-users to retain the use of simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances using label feedback. The key idea is to place a uniform prior on the relevance of each member of the anomaly detection ensemble over the input feature space via a neural network trained on unlabeled instances, and tune the weights of the neural network to adjust the local relevance of each ensemble member using all labeled instances. Our experiments on synthetic and real-world data show the effectiveness of GLAD in learning the local relevance of ensemble members and discovering anomalies via label feedback.
Tasks Anomaly Detection
Published 2018-10-02
URL http://arxiv.org/abs/1810.01403v3
PDF http://arxiv.org/pdf/1810.01403v3.pdf
PWC https://paperswithcode.com/paper/glad-glocalized-anomaly-detection-via-active
Repo https://github.com/freedombenLiu/ad_examples
Framework tf
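
A minimal PyTorch sketch of the scoring idea (the linked reference code is TensorFlow-based): a neural network outputs per-instance relevance weights for each ensemble member, and the anomaly score is the relevance-weighted combination of member scores. Training with the uniform prior and label feedback is omitted.

```python
import torch
import torch.nn as nn

class GladScorer(nn.Module):
    """Relevance-weighted ensemble score: s(x) = sum_m p_m(x) * score_m(x)."""
    def __init__(self, input_dim, n_members, hidden=32):
        super().__init__()
        self.relevance = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_members), nn.Sigmoid(),  # relevance in (0, 1)
        )

    def forward(self, x, member_scores):
        # x: (batch, input_dim); member_scores: (batch, n_members)
        return (self.relevance(x) * member_scores).sum(dim=-1)
```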

Seglearn: A Python Package for Learning Sequences and Time Series

Title Seglearn: A Python Package for Learning Sequences and Time Series
Authors David M. Burns, Cari M. Whyne
Abstract Seglearn is an open-source python package for machine learning on time series and sequences using a sliding window segmentation approach. The implementation provides a flexible pipeline for tackling classification, regression, and forecasting problems with multivariate sequence and contextual data. This package is compatible with scikit-learn and is listed under scikit-learn Related Projects. The package depends on numpy, scipy, and scikit-learn. Seglearn is distributed under the BSD 3-Clause License. Documentation includes a detailed API description, user guide, and examples. Unit tests provide a high degree of code coverage.
Tasks Time Series
Published 2018-03-21
URL http://arxiv.org/abs/1803.08118v3
PDF http://arxiv.org/pdf/1803.08118v3.pdf
PWC https://paperswithcode.com/paper/seglearn-a-python-package-for-learning
Repo https://github.com/dmbee/seglearn
Framework tf
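
The core sliding-window segmentation step, written as a plain numpy function rather than seglearn's own API, to show how a long multivariate series becomes fixed-width learning instances.

```python
import numpy as np

def sliding_window(series, width, step):
    """series: (n_samples, n_channels) -> (n_segments, width, n_channels)."""
    starts = range(0, len(series) - width + 1, step)
    return np.stack([series[s:s + width] for s in starts])

ts = np.random.randn(1000, 3)  # e.g. a 3-channel sensor recording
segments = sliding_window(ts, width=100, step=50)  # shape (19, 100, 3)
```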

Deep Reinforcement One-Shot Learning for Artificially Intelligent Classification Systems

Title Deep Reinforcement One-Shot Learning for Artificially Intelligent Classification Systems
Authors Anton Puzanov, Kobi Cohen
Abstract In recent years there has been a sharp rise in networking applications, in which significant events need to be classified but only a few training instances are available. These are known as cases of one-shot learning. Examples include analyzing network traffic under zero-day attacks, and computer vision tasks by sensor networks deployed in the field. To handle this challenging task, organizations often use human analysts to classify events under high uncertainty. Existing algorithms use a threshold-based mechanism to decide whether to classify an object automatically or send it to an analyst for deeper inspection. However, this approach leads to a significant waste of resources since it does not take the practical temporal constraints of system resources into account. Our contribution is threefold. First, we develop a novel Deep Reinforcement One-shot Learning (DeROL) framework to address this challenge. The basic idea of the DeROL algorithm is to train a deep-Q network to obtain a policy which is oblivious to the unseen classes in the testing data. Then, in real-time, DeROL maps the current state of the one-shot learning process to operational actions based on the trained deep-Q network, to maximize the objective function. Second, we develop the first open-source software for practical artificially intelligent one-shot classification systems with limited resources for the benefit of researchers in related fields. Third, we present an extensive experimental study using the OMNIGLOT dataset for computer vision tasks and the UNSW-NB15 dataset for intrusion detection tasks that demonstrates the versatility and efficiency of the DeROL framework.
Tasks Intrusion Detection, Omniglot, One-Shot Learning
Published 2018-08-04
URL http://arxiv.org/abs/1808.01527v1
PDF http://arxiv.org/pdf/1808.01527v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-one-shot-learning-for
Repo https://github.com/antonpuz/DeROL
Framework tf
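
A minimal PyTorch sketch of the decision layer the abstract describes: a deep-Q network maps the current state of the one-shot learning process to operational actions such as classifying automatically or sending the object to a human analyst. The action set, state features, and sizes here are illustrative assumptions, not DeROL's.

```python
import torch
import torch.nn as nn

ACTIONS = ["auto_classify", "hold", "send_to_analyst"]  # illustrative action set

# Q-network: maps an 8-dimensional state summary to one value per action.
q_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, len(ACTIONS)))

def select_action(state, epsilon=0.1):
    """Epsilon-greedy action selection over the learned Q-values."""
    if torch.rand(1).item() < epsilon:
        return torch.randint(len(ACTIONS), (1,)).item()
    with torch.no_grad():
        return q_net(state).argmax().item()
```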

Set Aggregation Network as a Trainable Pooling Layer

Title Set Aggregation Network as a Trainable Pooling Layer
Authors Łukasz Maziarka, Marek Śmieja, Aleksandra Nowak, Jacek Tabor, Łukasz Struski, Przemysław Spurek
Abstract Global pooling, such as max- or sum-pooling, is one of the key ingredients in deep neural networks used for processing images, texts, graphs and other types of structured data. Based on the recent DeepSets architecture proposed by Zaheer et al. (NIPS 2017), we introduce a Set Aggregation Network (SAN) as an alternative global pooling layer. In contrast to typical pooling operators, SAN can embed a given set of features into a vector representation of arbitrary size. We show that by adjusting the size of the embedding, SAN is capable of preserving the whole information from the input. In experiments, we demonstrate that replacing the global pooling layer with SAN improves classification accuracy. Moreover, it is less prone to overfitting and can be used as a regularizer.
Tasks
Published 2018-10-03
URL https://arxiv.org/abs/1810.01868v3
PDF https://arxiv.org/pdf/1810.01868v3.pdf
PWC https://paperswithcode.com/paper/set-aggregation-network-for-structured-data
Repo https://github.com/gmum/set-aggregation
Framework tf
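
A PyTorch sketch of the aggregation idea (the reference code is TensorFlow-based): project each set element through a learned nonlinearity and sum, which embeds a variable-size set into a vector of any chosen size, in the spirit of DeepSets-style pooling. The exact parametrization of SAN may differ.

```python
import torch
import torch.nn as nn

class SetAggregation(nn.Module):
    """Embed a variable-size set into a fixed vector of chosen size:
    agg(X) = sum_i relu(W x_i + b), one output per aggregation neuron."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, X):  # X: (set_size, in_dim)
        return torch.relu(self.proj(X)).sum(dim=0)  # (out_dim,)

# Output size is independent of the set size:
pooled = SetAggregation(16, 128)(torch.randn(10, 16))
```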

Low Cost Edge Sensing for High Quality Demosaicking

Title Low Cost Edge Sensing for High Quality Demosaicking
Authors Yan Niu, Jihong Ouyang, Wanli Zuo, Fuxin Wang
Abstract Digital cameras that use Color Filter Arrays (CFA) entail a demosaicking procedure to form full RGB images. As today’s camera users generally require images to be viewed instantly, demosaicking algorithms for real applications must be fast. Moreover, the associated cost should be lower than the cost saved by using CFA. For this purpose, we revisit the classical Hamilton-Adams (HA) algorithm, which outperforms many sophisticated techniques in both speed and accuracy. Inspired by HA’s strength and weakness, we design a very low cost edge sensing scheme. Briefly, it guides demosaicking by a logistic functional of the difference between directional variations. We extensively compare our algorithm with 28 demosaicking algorithms by running their open source codes on benchmark datasets. Compared to methods of similar computational cost, our method achieves substantially higher accuracy, whereas compared to methods of similar accuracy, our method has significantly lower cost. Moreover, on test images of currently popular resolution, the quality of our algorithm is comparable to top performers, whereas its speed is tens of times faster.
Tasks Demosaicking
Published 2018-06-03
URL http://arxiv.org/abs/1806.00771v2
PDF http://arxiv.org/pdf/1806.00771v2.pdf
PWC https://paperswithcode.com/paper/low-cost-edge-sensing-for-high-quality
Repo https://github.com/shmilyo/Low-Cost-Edge-Sensing-for-High-Quality-Demosaicking
Framework none
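
A minimal numpy sketch of the weighting rule the abstract names: a logistic function of the difference between directional variations decides how much to trust the horizontal versus the vertical interpolation. The steepness constant and the blend are illustrative, not the paper's exact scheme.

```python
import numpy as np

def logistic_edge_weight(var_h, var_v, k=1.0):
    """Weight for the horizontally interpolated value: close to 1 when the
    vertical variation dominates (suggesting a horizontal edge), close to 0
    when the horizontal variation dominates."""
    return 1.0 / (1.0 + np.exp(-k * (var_v - var_h)))

# Blend two directional interpolations of the green channel:
#   G_hat = w * G_horizontal + (1 - w) * G_vertical
```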