May 5, 2019


Paper Group ANR 467


Probabilistic Failure Analysis in Model Validation & Verification


Title	Probabilistic Failure Analysis in Model Validation & Verification
Authors	Ning Ge, Marc Pantel, Xavier Crégut
Abstract	Automated fault localization is an important issue in model validation and verification. It helps end users analyze the origin of a failure. In this work, we present early experiments with probabilistic analysis approaches to fault localization. Inspired by the Kullback-Leibler Divergence from Bayesian probabilistic theory, we propose a suspiciousness factor that computes the fault contribution of each transition in the reachability graph of model checking and is used to rank the potentially faulty transitions. To automatically locate design faults in the simulation model of a detailed design, we propose to use a Hidden Markov Model (HMM), a statistical model that provides information statistically identical to the component’s real behavior. The core of this method is a fault localization algorithm that outputs a ranked set of suspicious components, together with a backward algorithm that computes the degree of matching between the HMM and the simulation model in order to evaluate the confidence of the localization conclusion.
Tasks
Published	2016-11-15
URL	http://arxiv.org/abs/1611.05083v2
PDF	http://arxiv.org/pdf/1611.05083v2.pdf
PWC	https://paperswithcode.com/paper/probabilistic-failure-analysis-in-model
Repo
Framework
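
The abstract names the Kullback-Leibler divergence as the inspiration for its suspiciousness factor but does not define the factor itself. As a reminder of the underlying quantity only, here is a minimal sketch of the discrete KL divergence. The function name `kl_divergence` and the example distributions are illustrative assumptions and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) in nats.

    p and q are sequences of probabilities over the same outcomes.
    This is the textbook definition, not the paper's suspiciousness factor.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical distributions: observed vs. expected transition frequencies.
observed = [0.7, 0.2, 0.1]
expected = [0.5, 0.3, 0.2]
divergence = kl_divergence(observed, expected)
```

The divergence is zero when the two distributions coincide and grows as they differ, which is the kind of signal a ranking score could be built on.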

k-variates++: more pluses in the k-means++


Title	k-variates++: more pluses in the k-means++
Authors	Richard Nock, Raphaël Canyasse, Roksana Boreli, Frank Nielsen
Abstract	k-means++ seeding has become a de facto standard for hard clustering algorithms. In this paper, our first contribution is a two-way generalisation of this seeding, k-variates++, that includes the sampling of general densities rather than just a discrete set of Dirac densities anchored at the point locations, and a generalisation of the well known Arthur-Vassilvitskii (AV) approximation guarantee, in the form of a bias+variance approximation bound of the global optimum. This approximation exhibits a reduced dependency on the “noise” component with respect to the optimal potential, actually approaching the statistical lower bound. We show that k-variates++ reduces to efficient (biased seeding) clustering algorithms tailored to specific frameworks; these include distributed, streaming and on-line clustering, with direct approximation results for these algorithms. Finally, we present a novel application of k-variates++ to differential privacy. For either the specific frameworks considered here, or for the differential privacy setting, there are few to no prior results on the direct application of k-means++ and its approximation bounds; state of the art contenders appear to be significantly more complex and / or display less favorable (approximation) properties. We stress that our algorithms can still be run in cases where there is no closed form solution for the population minimizer. We demonstrate the applicability of our analysis via experimental evaluation on several domains and settings, displaying competitive performances vs state of the art.
Tasks
Published	2016-02-03
URL	http://arxiv.org/abs/1602.01198v2
PDF	http://arxiv.org/pdf/1602.01198v2.pdf
PWC	https://paperswithcode.com/paper/k-variates-more-pluses-in-the-k-means
Repo
Framework
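
For context, the baseline that the abstract generalises can be sketched in a few lines. The code below is the standard k-means++ seeding (sampling each new center with probability proportional to its squared distance to the nearest center already chosen), not k-variates++ itself. The names `squared_distance` and `kmeanspp_seed` and the toy points are illustrative assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seed(points, k, rng=None):
    """Standard k-means++ seeding: pick k centers from `points`."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    # First center: uniform at random.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0.0:
            # Every point coincides with a center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample proportionally to squared distance (D^2 sampling).
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(d, squared_distance(p, nxt)) for d, p in zip(d2, points)]
    return centers


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1)]
centers = kmeanspp_seed(points, 2, random.Random(0))
```

Read against the abstract, this sampler draws only from the data points themselves, which corresponds to the "discrete set of Dirac densities anchored at the point locations" that k-variates++ replaces with general densities.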

Warped Convolutions: Efficient Invariance to Spatial Transformations


Title	Warped Convolutions: Efficient Invariance to Spatial Transformations
Authors	João F. Henriques, Andrea Vedaldi
Abstract	Convolutional Neural Networks (CNNs) are extremely efficient, since they exploit the inherent translation-invariance of natural images. However, translation is just one of a myriad of useful spatial transformations. Can the same efficiency be attained when considering other spatial invariances? Such generalized convolutions have been considered in the past, but at a high computational cost. We present a construction that is simple and exact, yet has the same computational complexity that standard convolutions enjoy. It consists of a constant image warp followed by a simple convolution, which are standard blocks in deep learning toolboxes. With a carefully crafted warp, the resulting architecture can be made equivariant to a wide range of two-parameter spatial transformations. We show encouraging results in realistic scenarios, including the estimation of vehicle poses in the Google Earth dataset (rotation and scale), and face poses in Annotated Facial Landmarks in the Wild (3D rotations under perspective).
Tasks
Published	2016-09-14
URL	http://arxiv.org/abs/1609.04382v4
PDF	http://arxiv.org/pdf/1609.04382v4.pdf
PWC	https://paperswithcode.com/paper/warped-convolutions-efficient-invariance-to
Repo
Framework
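
The idea of "a constant image warp followed by a simple convolution" can be illustrated with a familiar special case. Under a log-polar change of coordinates, rotating and scaling a point about the origin becomes a plain translation, which is the kind of transformation an ordinary convolution already handles. Whether the paper's experiments use exactly this parameterization is an assumption; the function names below are hypothetical.

```python
import math


def log_polar(x, y):
    """Map a point (x, y) other than the origin to (log radius, angle)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def rotate_and_scale(x, y, angle, scale):
    """Rotate (x, y) by `angle` radians about the origin, then scale it."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


# A point before and after a rotation of 0.3 rad and a scaling by 2.
u1, v1 = log_polar(1.0, 0.5)
x2, y2 = rotate_and_scale(1.0, 0.5, 0.3, 2.0)
u2, v2 = log_polar(x2, y2)

# In warped coordinates the transformation is a shift by (log 2, 0.3).
shift = (u2 - u1, v2 - v1)
```

The angle coordinate wraps around at ±π, so a practical implementation would treat that axis as periodic.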

Forecasting Volatility in Indian Stock Market using Artificial Neural Network with Multiple Inputs and Outputs


Title	Forecasting Volatility in Indian Stock Market using Artificial Neural Network with Multiple Inputs and Outputs
Authors	Tamal Datta Chaudhuri, Indranil Ghosh
Abstract	Volatility in stock markets has been extensively studied in the applied finance literature. In this paper, Artificial Neural Network models based on various back propagation algorithms have been constructed to predict volatility in the Indian stock market through volatility of NIFTY returns and volatility of gold returns. This model considers India VIX, CBOE VIX, volatility of crude oil returns (CRUDESDR), volatility of DJIA returns (DJIASDR), volatility of DAX returns (DAXSDR), volatility of Hang Seng returns (HANGSDR) and volatility of Nikkei returns (NIKKEISDR) as predictor variables. Three sets of experiments have been performed over three time periods to judge the effectiveness of the approach.
Tasks
Published	2016-04-18
URL	http://arxiv.org/abs/1604.05008v1
PDF	http://arxiv.org/pdf/1604.05008v1.pdf
PWC	https://paperswithcode.com/paper/forecasting-volatility-in-indian-stock-market
Repo
Framework
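
The predictor names in the abstract (CRUDESDR, DJIASDR, and so on) suggest that volatility is measured as a standard deviation of returns. The abstract does not give the exact definition, so the sketch below assumes a rolling sample standard deviation of log returns; the function names, window length and prices are all hypothetical.

```python
import math
import statistics


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_volatility(prices, window):
    """Sample standard deviation of log returns over a sliding window."""
    returns = log_returns(prices)
    if window < 2 or window > len(returns):
        raise ValueError("window must be in [2, number of returns]")
    return [
        statistics.stdev(returns[i:i + window])
        for i in range(len(returns) - window + 1)
    ]


# Made-up closing prices, for illustration only.
prices = [100.0, 101.5, 99.8, 102.3, 103.0, 101.2]
volatility = rolling_volatility(prices, window=3)
```

A series like `volatility` would then be one input or target column for the neural network; how the paper aligns and scales these series is not stated in the abstract.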

Savu: A Python-based, MPI Framework for Simultaneous Processing of Multiple, N-dimensional, Large Tomography Datasets


Title	Savu: A Python-based, MPI Framework for Simultaneous Processing of Multiple, N-dimensional, Large Tomography Datasets
Authors	Nicola Wadeson, Mark Basham
Abstract	Diamond Light Source (DLS), the UK synchrotron facility, attracts scientists from across the world to perform ground-breaking x-ray experiments. With over 3000 scientific users per year, vast amounts of data are collected across the experimental beamlines, with the highest volume of data collected during tomographic imaging experiments. A growing interest in tomography as an imaging technique, has led to an expansion in the range of experiments performed, in addition to a growth in the size of the data per experiment. Savu is a portable, flexible, scientific processing pipeline capable of processing multiple, n-dimensional datasets in serial on a PC, or in parallel across a cluster. Developed at DLS, and successfully deployed across the beamlines, it uses a modular plugin format to enable experiment-specific processing and utilises parallel HDF5 to remove RAM restrictions. The Savu design, described throughout this paper, focuses on easy integration of existing and new functionality, flexibility and ease of use for users and developers alike.
Tasks
Published	2016-10-24
URL	http://arxiv.org/abs/1610.08015v1
PDF	http://arxiv.org/pdf/1610.08015v1.pdf
PWC	https://paperswithcode.com/paper/savu-a-python-based-mpi-framework-for
Repo
Framework

Revisiting Multiple Instance Neural Networks


Title	Revisiting Multiple Instance Neural Networks
Authors	Xinggang Wang, Yongluan Yan, Peng Tang, Xiang Bai, Wenyu Liu
Abstract	Neural networks and multiple instance learning are both active topics in artificial intelligence research. Deep neural networks have achieved great success in supervised learning problems, and multiple instance learning, a typical weakly-supervised learning method, is effective for many applications in computer vision, biometrics, natural language processing, etc. In this paper, we revisit the problem of solving multiple instance learning problems using neural networks. Neural networks are appealing for solving multiple instance learning problems. Multiple instance neural networks perform multiple instance learning in an end-to-end way: they take a bag with a varying number of instances as input and directly output the bag label. All of the parameters in a multiple instance network can be optimized via back-propagation. We propose a new multiple instance neural network that learns bag representations, which differs from existing multiple instance neural networks that focus on estimating instance labels. In addition, we study recent tricks developed in deep learning in the context of multiple instance networks and find that deep supervision is effective for boosting bag classification accuracy. In the experiments, the proposed multiple instance networks achieve state-of-the-art or competitive performance on several MIL benchmarks. Moreover, they are extremely fast in both testing and training, e.g., it takes only 0.0003 seconds to predict a bag and a few seconds to train on a MIL dataset on a moderate CPU.
Tasks	Multiple Instance Learning
Published	2016-10-08
URL	http://arxiv.org/abs/1610.02501v1
PDF	http://arxiv.org/pdf/1610.02501v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-multiple-instance-neural-networks
Repo
Framework
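
The step of turning a bag with a varying number of instances into a single bag label can be sketched as pooling followed by a classifier. The abstract does not say which pooling operator or layers the paper uses, so the element-wise max pooling, the single linear layer, and the weights below are assumptions chosen for illustration.

```python
import math


def bag_representation(instances):
    """Element-wise max over the instance feature vectors of one bag."""
    if not instances:
        raise ValueError("a bag must contain at least one instance")
    return [max(column) for column in zip(*instances)]


def bag_score(instances, weights, bias):
    """Probability-like bag score: linear layer plus sigmoid on the pooled bag."""
    rep = bag_representation(instances)
    z = sum(w * r for w, r in zip(weights, rep)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# A bag of three instances with two features each (made-up numbers).
bag = [[0.1, 0.9], [0.4, 0.2], [0.3, 0.5]]
score = bag_score(bag, weights=[1.0, -0.5], bias=0.0)
```

Because the pooling is taken over instances, the score does not depend on the order or the number of instances in the bag, which is what lets the network accept bags of different sizes.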

Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models


Title	Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models
Authors	Bernardo Ávila Pires, Csaba Szepesvári
Abstract	In this paper we study a model-based approach to calculating approximately optimal policies in Markovian Decision Processes. In particular, we derive novel bounds on the loss of using a policy derived from a factored linear model, a class of models which generalize numerous previous models out of those that come with strong computational guarantees. For the first time in the literature, we derive performance bounds for model-based techniques where the model inaccuracy is measured in weighted norms. Moreover, our bounds show a decreased sensitivity to the discount factor and, unlike similar bounds derived for other approaches, they are insensitive to measure mismatch. Similarly to previous works, our proofs are also based on contraction arguments, but with the main differences that we use carefully constructed norms building on Banach lattices, and the contraction property is only assumed for operators acting on “compressed” spaces, thus weakening previous assumptions, while strengthening previous results.
Tasks
Published	2016-02-19
URL	http://arxiv.org/abs/1602.06346v2
PDF	http://arxiv.org/pdf/1602.06346v2.pdf
PWC	https://paperswithcode.com/paper/policy-error-bounds-for-model-based
Repo
Framework

Exact Lower Bounds for the Agnostic Probably-Approximately-Correct (PAC) Machine Learning Model


Title	Exact Lower Bounds for the Agnostic Probably-Approximately-Correct (PAC) Machine Learning Model
Authors	Aryeh Kontorovich, Iosif Pinelis
Abstract	We provide an exact non-asymptotic lower bound on the minimax expected excess risk (EER) in the agnostic probably-approximately-correct (PAC) machine learning classification model and identify minimax learning algorithms as certain maximally symmetric and minimally randomized “voting” procedures. Based on this result, an exact asymptotic lower bound on the minimax EER is provided. This bound is of the simple form $c_\infty/\sqrt{\nu}$ as $\nu\to\infty$, where $c_\infty=0.16997\dots$ is a universal constant, $\nu=m/d$, $m$ is the size of the training sample, and $d$ is the Vapnik–Chervonenkis dimension of the hypothesis class. It is shown that the differences between these asymptotic and non-asymptotic bounds, as well as the differences between these two bounds and the maximum EER of any learning algorithms that minimize the empirical risk, are asymptotically negligible, and all these differences are due to ties in the mentioned “voting” procedures. A few easy to compute non-asymptotic lower bounds on the minimax EER are also obtained, which are shown to be close to the exact asymptotic lower bound $c_\infty/\sqrt{\nu}$ even for rather small values of the ratio $\nu=m/d$. As an application of these results, we substantially improve existing lower bounds on the tail probability of the excess risk. Among the tools used are Bayes estimation and apparently new identities and inequalities for binomial distributions.
Tasks
Published	2016-06-29
URL	http://arxiv.org/abs/1606.08920v2
PDF	http://arxiv.org/pdf/1606.08920v2.pdf
PWC	https://paperswithcode.com/paper/exact-lower-bounds-for-the-agnostic-probably
Repo
Framework
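
To get a feel for the asymptotic expression $c_\infty/\sqrt{\nu}$ with $\nu=m/d$, it can be evaluated for concrete sample sizes. The sketch below uses only the leading digits of the constant quoted in the abstract; the function name and the values of `m` and `d` are illustrative assumptions, and the result is the asymptotic form, not the paper's exact non-asymptotic bound.

```python
import math

# Leading digits of the universal constant, as quoted in the abstract.
C_INF = 0.16997


def asymptotic_eer_lower_bound(m, d):
    """Evaluate c_inf / sqrt(nu) with nu = m / d.

    m is the training sample size and d is the VC dimension.
    """
    if m <= 0 or d <= 0:
        raise ValueError("m and d must be positive")
    nu = m / d
    return C_INF / math.sqrt(nu)


# Hypothetical setting: 10,000 samples, VC dimension 100, so nu = 100.
bound = asymptotic_eer_lower_bound(10_000, 100)
```

Since the expression scales as $1/\sqrt{\nu}$, quadrupling the sample size at fixed VC dimension halves its value.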

Product Classification in E-Commerce using Distributional Semantics


Title	Product Classification in E-Commerce using Distributional Semantics
Authors	Vivek Gupta, Harish Karnick, Ashendra Bansal, Pradhuman Jhala
Abstract	Product classification is the task of automatically predicting a taxonomy path for a product in a predefined taxonomy hierarchy given a textual product description or title. For efficient product classification we require a suitable representation for a document (the textual description of a product) feature vector and efficient and fast algorithms for prediction. To address the above challenges, we propose a new distributional semantics representation for document vector formation. We also develop a new two-level ensemble approach utilizing (with respect to the taxonomy tree) a path-wise, node-wise and depth-wise classifiers for error reduction in the final product classification. Our experiments show the effectiveness of the distributional representation and the ensemble approach on data sets from a leading e-commerce platform and achieve better results on various evaluation metrics compared to earlier approaches.
Tasks
Published	2016-06-20
URL	http://arxiv.org/abs/1606.06083v2
PDF	http://arxiv.org/pdf/1606.06083v2.pdf
PWC	https://paperswithcode.com/paper/product-classification-in-e-commerce-using
Repo
Framework

A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward


Title	A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward
Authors	S. A. Murphy, Y. Deng, E. B. Laber, H. R. Maei, R. S. Sutton, K. Witkiewitz
Abstract	We develop an off-policy actor-critic algorithm for learning an optimal policy from a training set composed of data from multiple individuals. This algorithm is developed with a view towards its use in mobile health.
Tasks
Published	2016-07-18
URL	http://arxiv.org/abs/1607.05047v1
PDF	http://arxiv.org/pdf/1607.05047v1.pdf
PWC	https://paperswithcode.com/paper/a-batch-off-policy-actor-critic-algorithm-for
Repo
Framework

Comparing Fifty Natural Languages and Twelve Genetic Languages Using Word Embedding Language Divergence (WELD) as a Quantitative Measure of Language Distance


Title	Comparing Fifty Natural Languages and Twelve Genetic Languages Using Word Embedding Language Divergence (WELD) as a Quantitative Measure of Language Distance
Authors	Ehsaneddin Asgari, Mohammad R. K. Mofrad
Abstract	We introduce a new measure of distance between languages based on word embedding, called word embedding language divergence (WELD). WELD is defined as divergence between unified similarity distribution of words between languages. Using such a measure, we perform language comparison for fifty natural languages and twelve genetic languages. Our natural language dataset is a collection of sentence-aligned parallel corpora from bible translations for fifty languages spanning a variety of language families. Although we use parallel corpora, which guarantees having the same content in all languages, interestingly in many cases languages within the same family cluster together. In addition to natural languages, we perform language comparison for the coding regions in the genomes of 12 different organisms (4 plants, 6 animals, and two human subjects). Our result confirms a significant high-level difference in the genetic language model of humans/animals versus plants. The proposed method is a step toward defining a quantitative measure of similarity between languages, with applications in languages classification, genre identification, dialect identification, and evaluation of translations.
Tasks	Language Modelling
Published	2016-04-28
URL	http://arxiv.org/abs/1604.08561v1
PDF	http://arxiv.org/pdf/1604.08561v1.pdf
PWC	https://paperswithcode.com/paper/comparing-fifty-natural-languages-and-twelve
Repo
Framework

The SVM Classifier Based on the Modified Particle Swarm Optimization


Title	The SVM Classifier Based on the Modified Particle Swarm Optimization
Authors	L. Demidova, E. Nikulchev, Yu. Sokolova
Abstract	The problem of developing an SVM classifier based on a modified particle swarm optimization algorithm is considered. The algorithm simultaneously searches for the kernel function type, the values of the kernel function parameters, and the value of the regularization parameter of the SVM classifier. Such an SVM classifier provides high-quality data classification. The modified particle swarm optimization algorithm is built on the idea of particle “regeneration”: some particles change their kernel function type to the one that corresponds to the particle with the best classification accuracy. The proposed particle swarm optimization algorithm reduces the time needed to develop the SVM classifier. The results of experimental studies confirm the efficiency of this algorithm.
Tasks
Published	2016-03-21
URL	http://arxiv.org/abs/1603.08296v1
PDF	http://arxiv.org/pdf/1603.08296v1.pdf
PWC	https://paperswithcode.com/paper/the-svm-classifier-based-on-the-modified
Repo
Framework

The Mehler-Fock Transform and some Applications in Texture Analysis and Color Processing


Title	The Mehler-Fock Transform and some Applications in Texture Analysis and Color Processing
Authors	Reiner Lenz
Abstract	Many stochastic processes are defined on special geometrical objects like spheres and cones. We describe how tools from harmonic analysis, i.e. Fourier analysis on groups, can be used to investigate probability density functions (pdfs) on groups and homogeneous spaces. We consider the special case of the Lorentz group SU(1,1) and the unit disk with its hyperbolic geometry, but the procedure can be generalized to a much wider class of Lie groups. We mainly concentrate on the Mehler-Fock transform, which is the radial part of the Fourier transform on the disk. Some of the characteristic features of this transform are the relation to group convolutions, the isometry between signal and transform space, the relation to the Laplace-Beltrami operator and the relation to group representation theory. We will give an overview of these properties and their applications in signal processing. We will illustrate the theory with two examples from low-level vision and color image processing.
Tasks	Texture Classification
Published	2016-12-14
URL	http://arxiv.org/abs/1612.04573v1
PDF	http://arxiv.org/pdf/1612.04573v1.pdf
PWC	https://paperswithcode.com/paper/the-mehler-fock-transform-and-some
Repo
Framework

Graph Based Sinogram Denoising for Tomographic Reconstructions


Title	Graph Based Sinogram Denoising for Tomographic Reconstructions
Authors	Faisal Mahmood, Nauman Shahid, Pierre Vandergheynst, Ulf Skoglund
Abstract	Limited data and low dose constraints are common problems in a variety of tomographic reconstruction paradigms which lead to noisy and incomplete data. Over the past few years sinogram denoising has become an essential pre-processing step for low dose Computed Tomographic (CT) reconstructions. We propose a novel sinogram denoising algorithm inspired by the modern field of signal processing on graphs. Graph based methods often perform better than standard filtering operations since they can exploit the signal structure. This makes the sinogram an ideal candidate for graph based denoising since it generally has a piecewise smooth structure. We test our method with a variety of phantoms and different reconstruction methods. Our numerical study shows that the proposed algorithm improves the performance of analytical filtered back-projection (FBP) and iterative methods ART (Kaczmarz) and SIRT (Cimmino). We observed that graph denoised sinogram always minimizes the error measure and improves the accuracy of the solution as compared to regular reconstructions.
Tasks	Denoising, Tomographic Reconstructions
Published	2016-03-14
URL	http://arxiv.org/abs/1603.04203v1
PDF	http://arxiv.org/pdf/1603.04203v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-sinogram-denoising-for
Repo
Framework

The Multiscale Laplacian Graph Kernel


Title	The Multiscale Laplacian Graph Kernel
Authors	Risi Kondor, Horace Pan
Abstract	Many real world graphs, such as the graphs of molecules, exhibit structure at multiple different scales, but most existing kernels between graphs are either purely local or purely global in character. In contrast, by building a hierarchy of nested subgraphs, the Multiscale Laplacian Graph kernels (MLG kernels) that we define in this paper can account for structure at a range of different scales. At the heart of the MLG construction is another new graph kernel, called the Feature Space Laplacian Graph kernel (FLG kernel), which has the property that it can lift a base kernel defined on the vertices of two graphs to a kernel between the graphs. The MLG kernel applies such FLG kernels to subgraphs recursively. To make the MLG kernel computationally feasible, we also introduce a randomized projection procedure, similar to the Nyström method, but for RKHS operators.
Tasks	Graph Classification
Published	2016-03-20
URL	http://arxiv.org/abs/1603.06186v2
PDF	http://arxiv.org/pdf/1603.06186v2.pdf
PWC	https://paperswithcode.com/paper/the-multiscale-laplacian-graph-kernel
Repo
Framework