October 17, 2019

Paper Group ANR 839

MVTec D2S: Densely Segmented Supermarket Dataset

Title MVTec D2S: Densely Segmented Supermarket Dataset
Authors Patrick Follmann, Tobias Böttger, Philipp Härtinger, Rebecca König, Markus Ulrich
Abstract We introduce the Densely Segmented Supermarket (D2S) dataset, a novel benchmark for instance-aware semantic segmentation in an industrial domain. It contains 21,000 high-resolution images with pixel-wise labels of all object instances. The objects comprise groceries and everyday products from 60 categories. The benchmark is designed such that it resembles the real-world setting of an automatic checkout, inventory, or warehouse system. The training images only contain objects of a single class on a homogeneous background, while the validation and test sets are much more complex and diverse. To further benchmark the robustness of instance segmentation methods, the scenes are acquired with different lightings, rotations, and backgrounds. We ensure that there are no ambiguities in the labels and that every instance is labeled comprehensively. The annotations are pixel-precise and allow using crops of single instances for artificial data augmentation. The dataset covers several challenges highly relevant in the field, such as a limited amount of training data and a high diversity in the test and validation sets. The evaluation of state-of-the-art object detection and instance segmentation methods on D2S reveals significant room for improvement.
Tasks Data Augmentation, Instance Segmentation, Object Detection, Semantic Segmentation
Published 2018-04-23
URL http://arxiv.org/abs/1804.08292v2
PDF http://arxiv.org/pdf/1804.08292v2.pdf
PWC https://paperswithcode.com/paper/mvtec-d2s-densely-segmented-supermarket
Repo
Framework

Non-stationary Douglas-Rachford and alternating direction method of multipliers: adaptive stepsizes and convergence

Title Non-stationary Douglas-Rachford and alternating direction method of multipliers: adaptive stepsizes and convergence
Authors Dirk A. Lorenz, Quoc Tran-Dinh
Abstract We revisit the classical Douglas-Rachford (DR) method for finding a zero of the sum of two maximal monotone operators. Since the practical performance of the DR method crucially depends on the stepsizes, we aim at developing an adaptive stepsize rule. To that end, we take a closer look at a linear case of the problem and use our findings to develop a stepsize strategy that eliminates the need for stepsize tuning. We analyze a general non-stationary DR scheme and prove its convergence for a convergent sequence of stepsizes with summable increments. This, in turn, proves the convergence of the method with the new adaptive stepsize rule. We also derive the related non-stationary alternating direction method of multipliers (ADMM) from such a non-stationary DR method. We illustrate the efficiency of the proposed methods on several numerical examples.
Tasks
Published 2018-01-11
URL http://arxiv.org/abs/1801.03765v2
PDF http://arxiv.org/pdf/1801.03765v2.pdf
PWC https://paperswithcode.com/paper/non-stationary-douglas-rachford-and
Repo
Framework
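
To make the iteration behind this abstract concrete, here is a minimal sketch of the classical, stationary Douglas-Rachford scheme with a fixed stepsize t. It is not the adaptive non-stationary rule proposed in the paper. The toy problem (minimize 0.5*||x - a||^2 + lam*||x||_1), the function names, and the parameter values are all assumptions chosen for illustration.

```python
def soft_threshold(v, tau):
    # Proximal operator of tau * ||x||_1, applied componentwise.
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def prox_quadratic(v, a, t):
    # Proximal operator of t * 0.5 * ||x - a||^2.
    return [(vi + t * ai) / (1.0 + t) for vi, ai in zip(v, a)]


def douglas_rachford(a, lam, t=1.0, iters=200):
    # Classical DR iteration with a FIXED stepsize t (assumption: the
    # adaptive stepsize rule from the paper is not reproduced here).
    y = [0.0] * len(a)
    x = list(y)
    for _ in range(iters):
        x = prox_quadratic(y, a, t)
        reflected = [2.0 * xi - yi for xi, yi in zip(x, y)]
        z = soft_threshold(reflected, t * lam)
        y = [yi + zi - xi for yi, zi, xi in zip(y, z, x)]
    return x


if __name__ == "__main__":
    print(douglas_rachford([3.0, 0.5, -2.0], lam=1.0))
```

For this toy problem the minimizer is known in closed form (the soft-thresholded vector a), so the iterates can be checked against [2.0, 0.0, -1.0]. How fast they get there depends on t, which is the sensitivity the abstract refers to.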

Large-Scale Cox Process Inference using Variational Fourier Features

Title Large-Scale Cox Process Inference using Variational Fourier Features
Authors S. T. John, James Hensman
Abstract Gaussian process modulated Poisson processes provide a flexible framework for modelling spatiotemporal point patterns. So far this has been restricted to one dimension, binning to a pre-determined grid, or small data sets of up to a few thousand data points. Here we introduce Cox process inference based on Fourier features. This sparse representation induces global rather than local constraints on the function space and is computationally efficient. This allows us to formulate a grid-free approximation that scales well with the number of data points and the size of the domain. We demonstrate that this allows MCMC approximations to the non-Gaussian posterior. We also find that, in practice, Fourier features have more consistent optimization behavior than previous approaches. Our approximate Bayesian method can fit over 100,000 events with complex spatiotemporal patterns in three dimensions on a single GPU.
Tasks
Published 2018-04-03
URL http://arxiv.org/abs/1804.01016v1
PDF http://arxiv.org/pdf/1804.01016v1.pdf
PWC https://paperswithcode.com/paper/large-scale-cox-process-inference-using
Repo
Framework

Discovering Markov Blanket from Multiple interventional Datasets

Title Discovering Markov Blanket from Multiple interventional Datasets
Authors Kui Yu, Lin Liu, Jiuyong Li
Abstract In this paper, we study the problem of discovering the Markov blanket (MB) of a target variable from multiple interventional datasets. Datasets attained from interventional experiments contain richer causal information than passively observed data (observational data) for MB discovery. However, almost all existing MB discovery methods are designed for finding MBs from a single observational dataset. To identify MBs from multiple interventional datasets, we face two challenges: (1) unknown intervention variables; (2) nonidentical data distributions. To tackle the challenges, we theoretically analyze (a) under what conditions we can find the correct MB of a target variable, and (b) under what conditions we can identify the causes of the target variable via discovering its MB. Based on the theoretical analysis, we propose a new algorithm for discovering MBs from multiple interventional datasets, and present the conditions/assumptions which assure the correctness of the algorithm. To our knowledge, this work is the first to present the theoretical analyses about the conditions for MB discovery in multiple interventional datasets and the algorithm to find the MBs in relation to the conditions. Using benchmark Bayesian networks and real-world datasets, the experiments have validated the effectiveness and efficiency of the proposed algorithm in the paper.
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1801.08295v1
PDF http://arxiv.org/pdf/1801.08295v1.pdf
PWC https://paperswithcode.com/paper/discovering-markov-blanket-from-multiple
Repo
Framework

Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

Title Poisson Multi-Bernoulli Mapping Using Gibbs Sampling
Authors Maryam Fatemi, Karl Granström, Lennart Svensson, Francisco J. R. Ruiz, Lars Hammarstrand
Abstract This paper addresses the mapping problem. Using a conjugate prior form, we derive the exact theoretical batch multi-object posterior density of the map given a set of measurements. The landmarks in the map are modeled as extended objects, and the measurements are described as a Poisson process, conditioned on the map. We use a Poisson process prior on the map and prove that the posterior distribution is a hybrid Poisson, multi-Bernoulli mixture distribution. We devise a Gibbs sampling algorithm to sample from the batch multi-object posterior. The proposed method can handle uncertainties in the data associations and the cardinality of the set of landmarks, and is parallelizable, making it suitable for large-scale problems. The performance of the proposed method is evaluated on synthetic data and is shown to outperform a state-of-the-art method.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.03154v1
PDF http://arxiv.org/pdf/1811.03154v1.pdf
PWC https://paperswithcode.com/paper/poisson-multi-bernoulli-mapping-using-gibbs
Repo
Framework

Pay More Attention - Neural Architectures for Question-Answering

Title Pay More Attention - Neural Architectures for Question-Answering
Authors Zia Hasan, Sebastian Fischer
Abstract Machine comprehension is a representative task of natural language understanding. Typically, we are given a context paragraph and the objective is to answer a question that depends on the context. Such a problem requires modeling the complex interactions between the context paragraph and the question. Lately, attention mechanisms have been found to be quite successful at these tasks and in particular, attention mechanisms with attention flow from both context-to-question and question-to-context have been proven to be quite useful. In this paper, we study two state-of-the-art attention mechanisms called Bi-Directional Attention Flow (BiDAF) and Dynamic Co-Attention Network (DCN) and propose a hybrid scheme combining these two architectures that gives better overall performance. Moreover, we also suggest a new simpler attention mechanism that we call Double Cross Attention (DCA) that provides better results compared to both BiDAF and Co-Attention mechanisms while providing similar performance as the hybrid scheme. The objective of our paper is to focus particularly on the attention layer and to suggest improvements on that. Our experimental evaluations show that both our proposed models achieve superior results on the Stanford Question Answering Dataset (SQuAD) compared to BiDAF and DCN attention mechanisms.
Tasks Question Answering, Reading Comprehension
Published 2018-03-25
URL http://arxiv.org/abs/1803.09230v1
PDF http://arxiv.org/pdf/1803.09230v1.pdf
PWC https://paperswithcode.com/paper/pay-more-attention-neural-architectures-for
Repo
Framework

Determining Optimal Number of k-Clusters based on Predefined Level-of-Similarity

Title Determining Optimal Number of k-Clusters based on Predefined Level-of-Similarity
Authors Rabindra Lamsal, Shubham Katiyar
Abstract This paper proposes a centroid-based clustering algorithm which is capable of clustering data-points with n-features, without having to specify the number of clusters to be formed. The core logic behind the algorithm is a similarity measure, which collectively decides whether to assign an incoming data-point to a pre-existing cluster, or create a new cluster and assign the data-point to it. The proposed clustering algorithm is application-specific and is applicable when the need is to perform clustering analysis of a stream of data-points, where the similarity measure between an incoming data-point and the cluster with which the data-point is to be associated is greater than the predefined Level-of-Similarity.
Tasks
Published 2018-10-03
URL https://arxiv.org/abs/1810.01878v2
PDF https://arxiv.org/pdf/1810.01878v2.pdf
PWC https://paperswithcode.com/paper/real-time-clustering-algorithm-based-on
Repo
Framework
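
The assign-or-create logic described in this abstract can be sketched as follows. The abstract does not specify the similarity measure, so this sketch assumes a hypothetical one, 1 / (1 + Euclidean distance), and a running-mean centroid update. The names `similarity`, `stream_cluster`, and `level_of_similarity` are illustrative and not taken from the paper.

```python
import math


def similarity(p, q):
    # Assumption: similarity in (0, 1] derived from Euclidean distance.
    return 1.0 / (1.0 + math.dist(p, q))


def stream_cluster(points, level_of_similarity):
    centroids, counts, labels = [], [], []
    for p in points:
        # Find the most similar existing centroid, if any.
        best, best_sim = None, -1.0
        for i, c in enumerate(centroids):
            s = similarity(p, c)
            if s > best_sim:
                best, best_sim = i, s
        if best is not None and best_sim > level_of_similarity:
            # Assign to the existing cluster and update its running mean.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [ci + (pi - ci) / n
                               for ci, pi in zip(centroids[best], p)]
            labels.append(best)
        else:
            # No cluster is similar enough: open a new one.
            centroids.append(list(p))
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels, centroids


if __name__ == "__main__":
    stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
    print(stream_cluster(stream, level_of_similarity=0.5))
```

The number of clusters is never passed in. It emerges from the threshold: a higher level of similarity yields more, tighter clusters.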

Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems

Title Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems
Authors Dongkun Zhang, Lu Lu, Ling Guo, George Em Karniadakis
Abstract Physics-informed neural networks (PINNs) have recently emerged as an alternative way of solving partial differential equations (PDEs) without the need to build elaborate grids, using a straightforward implementation instead. In particular, in addition to the deep neural network (DNN) for the solution, a second DNN is considered that represents the residual of the PDE. The residual is then combined with the mismatch in the given data of the solution in order to formulate the loss function. This framework is effective but is lacking uncertainty quantification of the solution due to the inherent randomness in the data or due to the approximation limitations of the DNN architecture. Here, we propose a new method with the objective of endowing the DNN with uncertainty quantification for both sources of uncertainty, i.e., the parametric uncertainty and the approximation uncertainty. We first account for the parametric uncertainty when the parameter in the differential equation is represented as a stochastic process. Multiple DNNs are designed to learn the modal functions of the arbitrary polynomial chaos (aPC) expansion of its solution by using stochastic data from sparse sensors. We can then make predictions from new sensor measurements very efficiently with the trained DNNs. Moreover, we employ dropout to correct the over-fitting and also to quantify the uncertainty of DNNs in approximating the modal functions. We then design an active learning strategy based on the dropout uncertainty to place new sensors in the domain to improve the predictions of DNNs. Several numerical tests are conducted for both the forward and the inverse problems to quantify the effectiveness of PINNs combined with uncertainty quantification. This new NN-aPC paradigm of physics-informed deep learning with uncertainty quantification can be readily applied to other types of stochastic PDEs in multi-dimensions.
Tasks Active Learning
Published 2018-09-21
URL http://arxiv.org/abs/1809.08327v1
PDF http://arxiv.org/pdf/1809.08327v1.pdf
PWC https://paperswithcode.com/paper/quantifying-total-uncertainty-in-physics
Repo
Framework

A sequential sampling strategy for extreme event statistics in nonlinear dynamical systems

Title A sequential sampling strategy for extreme event statistics in nonlinear dynamical systems
Authors Mustafa A. Mohamad, Themistoklis P. Sapsis
Abstract We develop a method for the evaluation of extreme event statistics associated with nonlinear dynamical systems, using a small number of samples. From an initial dataset of design points, we formulate a sequential strategy that provides the ‘next-best’ data point (set of parameters) that when evaluated results in improved estimates of the probability density function (pdf) for a scalar quantity of interest. The approach utilizes Gaussian process regression to perform Bayesian inference on the parameter-to-observation map describing the quantity of interest. We then approximate the desired pdf along with uncertainty bounds utilizing the posterior distribution of the inferred map. The ‘next-best’ design point is sequentially determined through an optimization procedure that selects the point in parameter space that maximally reduces uncertainty between the estimated bounds of the pdf prediction. Since the optimization process utilizes only information from the inferred map, it has minimal computational cost. Moreover, the special form of the metric emphasizes the tails of the pdf. The method is practical for systems where the dimensionality of the parameter space is of moderate size, i.e., of order O(10). We apply the method to estimate the extreme event statistics for a very high-dimensional system with millions of degrees of freedom: an offshore platform subjected to three-dimensional irregular waves. It is demonstrated that the developed approach can accurately determine the extreme event statistics using a limited number of samples.
Tasks Bayesian Inference
Published 2018-04-19
URL http://arxiv.org/abs/1804.07240v1
PDF http://arxiv.org/pdf/1804.07240v1.pdf
PWC https://paperswithcode.com/paper/a-sequential-sampling-strategy-for-extreme
Repo
Framework

Online Fall Detection using Recurrent Neural Networks

Title Online Fall Detection using Recurrent Neural Networks
Authors Mirto Musci, Daniele De Martini, Nicola Blago, Tullio Facchinetti, Marco Piastra
Abstract Unintentional falls can cause severe injuries and even death, especially if no immediate assistance is given. The aim of Fall Detection Systems (FDSs) is to detect an occurring fall. This information can be used to trigger the necessary assistance in case of injury. This can be done by using either ambient-based sensors, e.g. cameras, or wearable devices. The aim of this work is to study the technical aspects of FDSs based on wearable devices and artificial intelligence techniques, in particular Deep Learning (DL), to implement an effective algorithm for on-line fall detection. The proposed classifier is based on a Recurrent Neural Network (RNN) model with underlying Long Short-Term Memory (LSTM) blocks. The method is tested on the publicly available SisFall dataset, with extended annotation, and compared with the results obtained by the SisFall authors.
Tasks
Published 2018-04-13
URL http://arxiv.org/abs/1804.04976v1
PDF http://arxiv.org/pdf/1804.04976v1.pdf
PWC https://paperswithcode.com/paper/online-fall-detection-using-recurrent-neural
Repo
Framework

Advancing Acoustic-to-Word CTC Model with Attention and Mixed-Units

Title Advancing Acoustic-to-Word CTC Model with Attention and Mixed-Units
Authors Amit Das, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong
Abstract The acoustic-to-word model based on the Connectionist Temporal Classification (CTC) criterion is a natural end-to-end (E2E) system directly targeting words as output units. Two issues exist in the system: first, the current output of the CTC model relies on the current input and does not account for context weighted inputs. This is the hard alignment issue. Second, the word-based CTC model suffers from the out-of-vocabulary (OOV) issue. This means it can model only frequently occurring words while tagging the remaining words as OOV. Hence, such a model is limited in its capacity in recognizing only a fixed set of frequent words. In this study, we propose addressing these problems using a combination of attention mechanism and mixed-units. In particular, we introduce Attention CTC, Self-Attention CTC, Hybrid CTC, and Mixed-unit CTC. First, we blend attention modeling capabilities directly into the CTC network using Attention CTC and Self-Attention CTC. Second, to alleviate the OOV issue, we present Hybrid CTC which uses a word and letter CTC with shared hidden layers. The Hybrid CTC consults the letter CTC when the word CTC emits an OOV. Then, we propose a much better solution by training a Mixed-unit CTC which decomposes all the OOV words into sequences of frequent words and multi-letter units. Evaluated on a 3400-hour Microsoft Cortana voice assistant task, our final acoustic-to-word solution using attention and mixed-units achieves a relative reduction in word error rate (WER) over the vanilla word CTC by 12.09%. Such an E2E model without using any language model (LM) or complex decoder also outperforms a traditional context-dependent (CD) phoneme CTC with strong LM and decoder by 6.79% relative.
Tasks Language Modelling
Published 2018-12-31
URL https://arxiv.org/abs/1812.11928v2
PDF https://arxiv.org/pdf/1812.11928v2.pdf
PWC https://paperswithcode.com/paper/advancing-acoustic-to-word-ctc-model-with
Repo
Framework

Logistic Regression, Neural Networks and Dempster-Shafer Theory: a New Perspective

Title Logistic Regression, Neural Networks and Dempster-Shafer Theory: a New Perspective
Authors Thierry Denoeux
Abstract We revisit logistic regression and its nonlinear extensions, including multilayer feedforward neural networks, by showing that these classifiers can be viewed as converting input or higher-level features into Dempster-Shafer mass functions and aggregating them by Dempster’s rule of combination. The probabilistic outputs of these classifiers are the normalized plausibilities corresponding to the underlying combined mass function. This mass function is more informative than the output probability distribution. In particular, it makes it possible to distinguish lack of evidence (when none of the features provides discriminant information) from conflicting evidence (when different features support different classes). This expressivity of mass functions allows us to gain insight into the role played by each input feature in logistic regression, and to interpret hidden unit outputs in multilayer neural networks. It also makes it possible to use alternative decision rules, such as interval dominance, which select a set of classes when the available evidence does not unambiguously point to a single class, thus trading reduced error rate for higher imprecision.
Tasks
Published 2018-07-05
URL https://arxiv.org/abs/1807.01846v3
PDF https://arxiv.org/pdf/1807.01846v3.pdf
PWC https://paperswithcode.com/paper/logistic-regression-neural-networks-and
Repo
Framework
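
Dempster's rule of combination, on which this abstract builds, can be illustrated with a short sketch. This is the generic textbook rule applied to two hand-written mass functions over a two-class frame. It does not reproduce the paper's construction of mass functions from logistic regression weights, and the function name and example masses are assumptions.

```python
def dempster_combine(m1, m2):
    # m1, m2: dicts mapping frozenset focal elements to masses summing to 1.
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                # Mass falling on the empty set measures conflict.
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the rule is undefined")
    # Renormalize the remaining mass.
    normalized = {k: v / (1.0 - conflict) for k, v in combined.items()}
    return normalized, conflict


if __name__ == "__main__":
    A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
    # One feature supports class a, the other supports class b.
    conflicting = dempster_combine({A: 0.6, AB: 0.4}, {B: 0.5, AB: 0.5})
    # Neither feature is informative: all mass stays on the whole frame.
    ignorant = dempster_combine({AB: 1.0}, {AB: 1.0})
    print(conflicting)
    print(ignorant)
```

The two calls illustrate the distinction drawn in the abstract. Conflicting evidence shows up as nonzero conflict before normalization, whereas lack of evidence leaves the mass on the full set of classes with zero conflict.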

Object-oriented lexical encoding of multiword expressions: Short and sweet

Title Object-oriented lexical encoding of multiword expressions: Short and sweet
Authors Agata Savary, Simon Petitjean, Timm Lichte, Laura Kallmeyer, Jakub Waszczuk
Abstract Multiword expressions (MWEs) exhibit both regular and idiosyncratic properties. Their idiosyncrasy requires lexical encoding in parallel with their component words. Their (at times intricate) regularity, on the other hand, calls for means of flexible factorization to avoid redundant descriptions of shared properties. However, so far, non-redundant general-purpose lexical encoding of MWEs has not received a satisfactory solution. We offer a proof of concept that this challenge might be effectively addressed within eXtensible MetaGrammar (XMG), an object-oriented metagrammar framework. We first make an existing metagrammatical resource, the FrenchTAG grammar, MWE-aware. We then evaluate the factorization gain during incremental implementation with XMG on a dataset extracted from an MWE-annotated reference corpus.
Tasks
Published 2018-10-23
URL http://arxiv.org/abs/1810.09947v1
PDF http://arxiv.org/pdf/1810.09947v1.pdf
PWC https://paperswithcode.com/paper/object-oriented-lexical-encoding-of-multiword
Repo
Framework

Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: a winning solution to the NIJ “Real-Time Crime Forecasting Challenge”

Title Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: a winning solution to the NIJ “Real-Time Crime Forecasting Challenge”
Authors Seth Flaxman, Michael Chirico, Pau Pereira, Charles Loeffler
Abstract We propose a generic spatiotemporal event forecasting method, which we developed for the National Institute of Justice’s (NIJ) Real-Time Crime Forecasting Challenge. Our method is a spatiotemporal forecasting model combining scalable randomized Reproducing Kernel Hilbert Space (RKHS) methods for approximating Gaussian processes with autoregressive smoothing kernels in a regularized supervised learning framework. While the smoothing kernels capture the two main approaches in current use in the field of crime forecasting, kernel density estimation (KDE) and self-exciting point process (SEPP) models, the RKHS component of the model can be understood as an approximation to the popular log-Gaussian Cox Process model. For inference, we discretize the spatiotemporal point pattern and learn a log-intensity function using the Poisson likelihood and highly efficient gradient-based optimization methods. Model hyperparameters including quality of RKHS approximation, spatial and temporal kernel lengthscales, number of autoregressive lags, bandwidths for smoothing kernels, as well as cell shape, size, and rotation, were learned using crossvalidation. Resulting predictions significantly exceeded baseline KDE estimates and SEPP models for sparse events.
Tasks Density Estimation, Gaussian Processes
Published 2018-01-09
URL https://arxiv.org/abs/1801.02858v4
PDF https://arxiv.org/pdf/1801.02858v4.pdf
PWC https://paperswithcode.com/paper/scalable-high-resolution-forecasting-of
Repo
Framework
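
The abstract mentions scalable randomized RKHS methods for approximating Gaussian processes. One widely used technique of that kind is random Fourier features, sketched below for the squared-exponential (RBF) kernel. Whether the authors use exactly this construction is an assumption, and the function names, feature count, and lengthscale are illustrative only.

```python
import math
import random


def make_rff(dim, n_features, lengthscale, seed=0):
    # Random Fourier features for k(x, y) = exp(-||x - y||^2 / (2 l^2)).
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
         for _ in range(n_features)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bi)
                for w, bi in zip(W, b)]

    return features


def rbf(x, y, lengthscale):
    # Exact kernel value, used only for comparison.
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * lengthscale ** 2))


if __name__ == "__main__":
    phi = make_rff(dim=2, n_features=2000, lengthscale=1.0)
    x, y = (0.2, 0.4), (0.9, -0.1)
    approx = sum(a * b for a, b in zip(phi(x), phi(y)))
    print(approx, rbf(x, y, 1.0))
```

The inner product of the two feature vectors approximates the kernel value, so a linear model on these features can stand in for a kernel method at a cost that grows with the number of features instead of the number of data points.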

End-to-End Retrieval in Continuous Space

Title End-to-End Retrieval in Continuous Space
Authors Daniel Gillick, Alessandro Presta, Gaurav Singh Tomar
Abstract Most text-based information retrieval (IR) systems index objects by words or phrases. These discrete systems have been augmented by models that use embeddings to measure similarity in continuous space. But continuous-space models are typically used just to re-rank the top candidates. We consider the problem of end-to-end continuous retrieval, where standard approximate nearest neighbor (ANN) search replaces the usual discrete inverted index, and rely entirely on distances between learned embeddings. By training simple models specifically for retrieval, with an appropriate model architecture, we improve on a discrete baseline by 8% and 26% (MAP) on two similar-question retrieval tasks. We also discuss the problem of evaluation for retrieval systems, and show how to modify existing pairwise similarity datasets for this purpose.
Tasks Information Retrieval
Published 2018-11-19
URL http://arxiv.org/abs/1811.08008v1
PDF http://arxiv.org/pdf/1811.08008v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-retrieval-in-continuous-space
Repo
Framework
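
Retrieval that relies entirely on distances between embeddings can be sketched with an exact brute-force search, used here as a simple stand-in for the approximate nearest neighbor (ANN) index mentioned in the abstract. The toy vectors below are hand-written and not learned, and the names `cosine` and `retrieve` are hypothetical.

```python
import math


def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, index, k=2):
    # Rank every indexed item by similarity to the query (exact search;
    # an ANN library would replace this loop at scale).
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


if __name__ == "__main__":
    index = {"q1": [1.0, 0.0], "q2": [0.9, 0.1], "q3": [0.0, 1.0]}
    print(retrieve([1.0, 0.0], index, k=2))
```

No inverted index over words is consulted. The ranking depends only on the geometry of the embedding space, which is why the quality of the learned embeddings matters so much in this setting.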