May 6, 2019


Paper Group ANR 366

K-Bit-Swap: A New Operator For Real-Coded Evolutionary Algorithms. Perceptually Consistent Color-to-Gray Image Conversion. A short note on extension theorems and their connection to universal consistency in machine learning. Abducing Compliance of Incomplete Event Logs. An Empirical Evaluation of various Deep Learning Architectures for Bi-Sequence …

K-Bit-Swap: A New Operator For Real-Coded Evolutionary Algorithms

Title K-Bit-Swap: A New Operator For Real-Coded Evolutionary Algorithms
Authors Aram Ter-Sarkisov, Stephen Marsland
Abstract There has been a variety of crossover operators proposed for Real-Coded Genetic Algorithms (RCGAs), which recombine values from the same location in pairs of strings. In this article we present a recombination operator for RCGAs that selects the locations randomly in both parents, and compare it to mainstream crossover operators in a set of experiments on a range of standard multidimensional optimization problems and a clustering problem. We present two variants of the operator, either selecting both bits uniformly at random in the strings, or sampling the second bit from a normal distribution centered at the selected location in the first string. While the operator is biased towards exploitation of fitness space, the random selection of the second bit for swapping makes it slightly less exploitation-biased. Extensive statistical analysis using a non-parametric test shows the advantage of the new recombination operators on a range of test functions.
Tasks
Published 2016-04-22
URL http://arxiv.org/abs/1604.06607v1
PDF http://arxiv.org/pdf/1604.06607v1.pdf
PWC https://paperswithcode.com/paper/k-bit-swap-a-new-operator-for-real-coded
Repo
Framework
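The operator described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function name, the modular wrap-around for the normally sampled index, and the single-swap-per-step structure are assumptions based only on the abstract's description of the two variants.

```python
import random

def k_bit_swap(p1, p2, k=1, sigma=None, rng=random):
    """Sketch of a K-Bit-Swap-style recombination operator.

    For each of the k swaps, pick a position i in parent 1 uniformly at
    random, then pick a position j in parent 2 either uniformly at random
    (sigma=None) or from a normal distribution centered at i (sigma given),
    and swap the two values across the parents.
    """
    c1, c2 = list(p1), list(p2)   # work on copies; parents stay intact
    n = len(c1)
    for _ in range(k):
        i = rng.randrange(n)
        if sigma is None:
            j = rng.randrange(n)                      # uniform variant
        else:
            j = int(round(rng.gauss(i, sigma))) % n   # normal variant (index wrapped)
        c1[i], c2[j] = c2[j], c1[i]                   # swap values across parents
    return c1, c2
```

Note that because each step only exchanges values between the two offspring, the combined multiset of values across both children always equals that of the parents.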

Perceptually Consistent Color-to-Gray Image Conversion

Title Perceptually Consistent Color-to-Gray Image Conversion
Authors Shaodi You, Nick Barnes, Janine Walker
Abstract In this paper, we propose a color to grayscale image conversion algorithm (C2G) that aims to preserve the perceptual properties of the color image as much as possible. To this end, we propose measures for two perceptual properties based on contemporary research in vision science: brightness and multi-scale contrast. The brightness measurement is based on the idea that the brightness of a grayscale image will affect the perception of the probability of color information. The color contrast measurement is based on the idea that the contrast of a given pixel to its surroundings can be measured as a linear combination of color contrast at different scales. Based on these measures we propose a graph based optimization framework to balance the brightness and contrast measurements. To solve the optimization, an $\ell_1$-norm based method is provided which converts color discontinuities to brightness discontinuities. To validate our methods, we evaluate against the existing Čadík and Color250 datasets, and against NeoColor, a new dataset that improves over existing C2G datasets. NeoColor contains around 300 images from typical C2G scenarios, including: commercial photography, printing, books, magazines, masterpiece artworks and computer designed graphics. We show improvements in metrics of performance, and further through a user study, we validate the performance of both the algorithm and the metric.
Tasks
Published 2016-05-06
URL http://arxiv.org/abs/1605.01843v1
PDF http://arxiv.org/pdf/1605.01843v1.pdf
PWC https://paperswithcode.com/paper/perceptually-consistent-color-to-gray-image
Repo
Framework

A short note on extension theorems and their connection to universal consistency in machine learning

Title A short note on extension theorems and their connection to universal consistency in machine learning
Authors Andreas Christmann, Florian Dumpert, Dao-Hong Xiang
Abstract Statistical machine learning plays an important role in modern statistics and computer science. One main goal of statistical machine learning is to provide universally consistent algorithms, i.e., the estimator converges in probability or in some stronger sense to the Bayes risk or to the Bayes decision function. Kernel methods based on minimizing the regularized risk over a reproducing kernel Hilbert space (RKHS) belong to these statistical machine learning methods. It is in general unknown which kernel yields optimal results for a particular data set or for the unknown probability measure. Hence various kernel learning methods were proposed to choose the kernel and therefore also its RKHS in a data adaptive manner. Nevertheless, many practitioners often use the classical Gaussian RBF kernel or certain Sobolev kernels with good success. The goal of this short note is to offer one possible theoretical explanation for this empirical fact.
Tasks
Published 2016-04-15
URL http://arxiv.org/abs/1604.04505v1
PDF http://arxiv.org/pdf/1604.04505v1.pdf
PWC https://paperswithcode.com/paper/a-short-note-on-extension-theorems-and-their
Repo
Framework

Abducing Compliance of Incomplete Event Logs

Title Abducing Compliance of Incomplete Event Logs
Authors Federico Chesani, Riccardo De Masellis, Chiara Di Francescomarino, Chiara Ghidini, Paola Mello, Marco Montali, Sergio Tessaris
Abstract The capability to store data about business process executions in so-called Event Logs has led to the diffusion of tools for the analysis of process executions and for the assessment of the goodness of a process model. Nonetheless, these tools are often very rigid in dealing with Event Logs that include incomplete information about the process execution. Thus, while the ability of handling incomplete event data is one of the challenges mentioned in the process mining manifesto, the evaluation of compliance of an execution trace still requires a complete end-to-end trace. This paper exploits the power of abduction to provide a flexible, yet computationally effective, framework to deal with different forms of incompleteness in an Event Log. Moreover it proposes a refinement of the classical notion of compliance into strong and conditional compliance to take incomplete logs into account. Finally, a performance evaluation in an experimental setting shows the feasibility of the presented approach.
Tasks
Published 2016-06-17
URL http://arxiv.org/abs/1606.05446v1
PDF http://arxiv.org/pdf/1606.05446v1.pdf
PWC https://paperswithcode.com/paper/abducing-compliance-of-incomplete-event-logs
Repo
Framework

An Empirical Evaluation of various Deep Learning Architectures for Bi-Sequence Classification Tasks

Title An Empirical Evaluation of various Deep Learning Architectures for Bi-Sequence Classification Tasks
Authors Anirban Laha, Vikas Raykar
Abstract Several tasks in argumentation mining and debating, question-answering, and natural language inference involve classifying a sequence in the context of another sequence (referred to as bi-sequence classification). For several single sequence classification tasks, the current state-of-the-art approaches are based on recurrent and convolutional neural networks. On the other hand, for bi-sequence classification problems, there is not much understanding as to the best deep learning architecture. In this paper, we attempt to get an understanding of this category of problems by extensive empirical evaluation of 19 different deep learning architectures (specifically on different ways of handling context) for various problems originating in natural language processing like debating, textual entailment and question-answering. Following the empirical evaluation, we offer our insights and conclusions regarding the architectures we have considered. We also establish the first deep learning baselines for three argumentation mining tasks.
Tasks Natural Language Inference, Question Answering
Published 2016-07-17
URL http://arxiv.org/abs/1607.04853v2
PDF http://arxiv.org/pdf/1607.04853v2.pdf
PWC https://paperswithcode.com/paper/an-empirical-evaluation-of-various-deep
Repo
Framework

Causal Compression

Title Causal Compression
Authors Aleksander Wieczorek, Volker Roth
Abstract We propose a new method of discovering causal relationships in temporal data based on the notion of causal compression. To this end, we adopt the Pearlian graph setting and the directed information as an information theoretic tool for quantifying causality. We introduce a chain rule for directed information and use it to motivate causal sparsity. We show two applications of the proposed method: causal time series segmentation, which selects time points capturing the incoming and outgoing causal flow between time points belonging to different signals, and causal bipartite graph recovery. We prove that modelling of causality in the adopted set-up only requires estimating the copula density of the data distribution and thus does not depend on its marginals. We evaluate the method on time resolved gene expression data.
Tasks Time Series
Published 2016-11-01
URL http://arxiv.org/abs/1611.00261v1
PDF http://arxiv.org/pdf/1611.00261v1.pdf
PWC https://paperswithcode.com/paper/causal-compression
Repo
Framework

Combining multiscale features for classification of hyperspectral images: a sequence based kernel approach

Title Combining multiscale features for classification of hyperspectral images: a sequence based kernel approach
Authors Yanwei Cui, Laetitia Chapel, Sébastien Lefèvre
Abstract Nowadays, hyperspectral image classification widely copes with spatial information to improve accuracy. One of the most popular ways to integrate such information is to extract hierarchical features from a multiscale segmentation. In the classification context, the extracted features are commonly concatenated into a long vector (also called a stacked vector), to which a conventional vector-based machine learning technique (e.g., an SVM with a Gaussian kernel) is applied. In this paper, we rather propose to use a sequence structured kernel: the spectrum kernel. We show that the conventional stacked vector-based kernel is actually a special case of this kernel. Experiments conducted on various publicly available hyperspectral datasets illustrate the improvement of the proposed kernel w.r.t. conventional ones using the same hierarchical spatial features.
Tasks Classification Of Hyperspectral Images, Hyperspectral Image Classification, Image Classification
Published 2016-06-15
URL http://arxiv.org/abs/1606.04985v1
PDF http://arxiv.org/pdf/1606.04985v1.pdf
PWC https://paperswithcode.com/paper/combining-multiscale-features-for
Repo
Framework
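The idea of treating the multiscale hierarchy as a sequence rather than a stacked vector can be illustrated with a toy sequence kernel. This is a sketch in the spirit of a spectrum kernel over sequences of per-scale feature vectors; the function names, the Gaussian base kernel, and the comparison over all contiguous subsequences of length p are illustrative assumptions, and the paper's exact kernel definition may differ.

```python
import math

def rbf(u, v, gamma=1.0):
    """Gaussian base kernel between two per-scale feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def spectrum_kernel(X, Y, p=2, gamma=1.0):
    """Toy sequence kernel over hierarchies of per-scale features.

    X and Y are sequences of feature vectors (e.g., coarse to fine scales).
    The kernel sums, over all pairs of contiguous length-p subsequences,
    the product of base kernels -- comparing the hierarchies as sequences
    instead of concatenating them into one stacked vector.
    """
    total = 0.0
    for i in range(len(X) - p + 1):
        for j in range(len(Y) - p + 1):
            prod = 1.0
            for t in range(p):
                prod *= rbf(X[i + t], Y[j + t], gamma)
            total += prod
    return total
```

With p equal to the full sequence length, the double sum collapses to a single product of per-scale base kernels, which is one way to see how a stacked-vector Gaussian kernel arises as a special case.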

Partial Sum Minimization of Singular Values Representation on Grassmann Manifolds

Title Partial Sum Minimization of Singular Values Representation on Grassmann Manifolds
Authors Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin
Abstract As a significant subspace clustering method, low rank representation (LRR) has attracted great attention in recent years. To further improve the performance of LRR and extend its applications, there are several issues to be resolved. The nuclear norm in LRR does not sufficiently use the prior knowledge of the rank, which is known in many practical problems. LRR is designed for vectorial data from linear spaces and is thus not suitable for high dimensional data with an intrinsic non-linear manifold structure. This paper proposes an extended LRR model for manifold-valued Grassmann data which incorporates prior knowledge by minimizing the partial sum of singular values instead of the nuclear norm, namely Partial Sum minimization of Singular Values Representation (GPSSVR). The new model not only enforces the global structure of data in low rank, but also retains important information by minimizing only the smaller singular values. To further maintain the local structures among Grassmann points, we also integrate the Laplacian penalty with GPSSVR. An effective algorithm is proposed to solve the optimization problem based on the GPSSVR model. The proposed model and algorithms are assessed on some widely used human action video datasets and a real scenery dataset. The experimental results show that the proposed methods clearly outperform other state-of-the-art methods.
Tasks
Published 2016-01-21
URL http://arxiv.org/abs/1601.05613v3
PDF http://arxiv.org/pdf/1601.05613v3.pdf
PWC https://paperswithcode.com/paper/partial-sum-minimization-of-singular-values
Repo
Framework

A polynomial-time relaxation of the Gromov-Hausdorff distance

Title A polynomial-time relaxation of the Gromov-Hausdorff distance
Authors Soledad Villar, Afonso S. Bandeira, Andrew J. Blumberg, Rachel Ward
Abstract The Gromov-Hausdorff distance provides a metric on the set of isometry classes of compact metric spaces. Unfortunately, computing this metric directly is believed to be computationally intractable. Motivated by applications in shape matching and point-cloud comparison, we study a semidefinite programming relaxation of the Gromov-Hausdorff metric. This relaxation can be computed in polynomial time, and somewhat surprisingly is itself a pseudometric. We describe the induced topology on the set of compact metric spaces. Finally, we demonstrate the numerical performance of various algorithms for computing the relaxed distance and apply these algorithms to several relevant data sets. In particular we propose a greedy algorithm for finding the best correspondence between finite metric spaces that can handle hundreds of points.
Tasks
Published 2016-10-17
URL http://arxiv.org/abs/1610.05214v2
PDF http://arxiv.org/pdf/1610.05214v2.pdf
PWC https://paperswithcode.com/paper/a-polynomial-time-relaxation-of-the-gromov
Repo
Framework

Exploring Context with Deep Structured models for Semantic Segmentation

Title Exploring Context with Deep Structured models for Semantic Segmentation
Authors Guosheng Lin, Chunhua Shen, Anton van den Hengel, Ian Reid
Abstract State-of-the-art semantic image segmentation methods are mostly based on training deep convolutional neural networks (CNNs). In this work, we propose to improve semantic segmentation with the use of contextual information. In particular, we explore "patch-patch" context and "patch-background" context in deep CNNs. We formulate deep structured models by combining CNNs and Conditional Random Fields (CRFs) for learning the patch-patch context between image regions. Specifically, we formulate CNN-based pairwise potential functions to capture semantic correlations between neighboring patches. Efficient piecewise training of the proposed deep structured model is then applied in order to avoid repeated expensive CRF inference during the course of back propagation. For capturing the patch-background context, we show that a network design with traditional multi-scale image inputs and sliding pyramid pooling is very effective for improving performance. We perform comprehensive evaluation of the proposed method. We achieve new state-of-the-art performance on a number of challenging semantic segmentation datasets including $NYUDv2$, $PASCAL$-$VOC2012$, $Cityscapes$, $PASCAL$-$Context$, $SUN$-$RGBD$, $SIFT$-$flow$, and $KITTI$ datasets. Particularly, we report an intersection-over-union score of $77.8$ on the $PASCAL$-$VOC2012$ dataset.
Tasks Semantic Segmentation
Published 2016-03-10
URL http://arxiv.org/abs/1603.03183v3
PDF http://arxiv.org/pdf/1603.03183v3.pdf
PWC https://paperswithcode.com/paper/exploring-context-with-deep-structured-models
Repo
Framework

Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

Title Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers
Authors Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Ryan Tibshirani
Abstract We consider the problem of estimating a function defined over $n$ locations on a $d$-dimensional grid (having all side lengths equal to $n^{1/d}$). When the function is constrained to have discrete total variation bounded by $C_n$, we derive the minimax optimal (squared) $\ell_2$ estimation error rate, parametrized by $n$ and $C_n$. Total variation denoising, also known as the fused lasso, is seen to be rate optimal. Several simpler estimators exist, such as Laplacian smoothing and Laplacian eigenmaps. A natural question is: can these simpler estimators perform just as well? We prove that these estimators, and more broadly all estimators given by linear transformations of the input data, are suboptimal over the class of functions with bounded variation. This extends fundamental findings of Donoho and Johnstone [1998] on 1-dimensional total variation spaces to higher dimensions. The implication is that the computationally simpler methods cannot be used for such sophisticated denoising tasks, without sacrificing statistical accuracy. We also derive minimax rates for discrete Sobolev spaces over $d$-dimensional grids, which are, in some sense, smaller than the total variation function spaces. Indeed, these are small enough spaces that linear estimators can be optimal—and a few well-known ones are, such as Laplacian smoothing and Laplacian eigenmaps, as we show. Lastly, we investigate the problem of adaptivity of the total variation denoiser to these smaller Sobolev function spaces.
Tasks Denoising
Published 2016-05-26
URL http://arxiv.org/abs/1605.08400v1
PDF http://arxiv.org/pdf/1605.08400v1.pdf
PWC https://paperswithcode.com/paper/total-variation-classes-beyond-1d-minimax
Repo
Framework
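Laplacian smoothing, one of the linear smoothers the abstract mentions, makes for a small concrete example: on a 1-d chain graph it amounts to solving the tridiagonal linear system $(I + \lambda L)\hat\theta = y$, where $L$ is the path-graph Laplacian. The sketch below is illustrative only (the function name and the Thomas-algorithm solver are this note's choices, not the paper's code), meant to show what "an estimator given by a linear transformation of the input data" looks like in practice.

```python
def laplacian_smooth(y, lam):
    """Laplacian smoothing on a 1-d chain: solve (I + lam*L) theta = y,
    where L is the path-graph Laplacian (degree minus adjacency).
    The solution is a fixed linear transformation of y, i.e. a linear
    smoother of the kind discussed in the paper."""
    n = len(y)
    deg = [1] + [2] * (n - 2) + [1] if n > 1 else [0]
    # Tridiagonal system: main diagonal 1 + lam*deg[i], off-diagonals -lam.
    a = [-lam] * n                      # sub-diagonal (a[0] unused)
    b = [1.0 + lam * d for d in deg]    # main diagonal
    c = [-lam] * n                      # super-diagonal (c[-1] unused)
    # Thomas algorithm: forward elimination, then back substitution.
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], y[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (y[i] - a[i] * dp[i - 1]) / m
    theta = [0.0] * n
    theta[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        theta[i] = dp[i] - cp[i] * theta[i + 1]
    return theta
```

A constant signal is a fixed point (since $L\mathbf{1} = 0$), and any non-constant signal has its total variation shrunk, which is exactly the bias-variance trade-off a linear smoother is stuck with over TV classes.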

Convergence guarantees for kernel-based quadrature rules in misspecified settings

Title Convergence guarantees for kernel-based quadrature rules in misspecified settings
Authors Motonobu Kanagawa, Bharath K. Sriperumbudur, Kenji Fukumizu
Abstract Kernel-based quadrature rules are becoming important in machine learning and statistics, as they achieve super-$\sqrt{n}$ convergence rates in numerical integration, and thus provide alternatives to Monte Carlo integration in challenging settings where integrands are expensive to evaluate or where integrands are high dimensional. These rules are based on the assumption that the integrand has a certain degree of smoothness, which is expressed as that the integrand belongs to a certain reproducing kernel Hilbert space (RKHS). However, this assumption can be violated in practice (e.g., when the integrand is a black box function), and no general theory has been established for the convergence of kernel quadratures in such misspecified settings. Our contribution is in proving that kernel quadratures can be consistent even when the integrand does not belong to the assumed RKHS, i.e., when the integrand is less smooth than assumed. Specifically, we derive convergence rates that depend on the (unknown) lesser smoothness of the integrand, where the degree of smoothness is expressed via powers of RKHSs or via Sobolev spaces.
Tasks
Published 2016-05-24
URL http://arxiv.org/abs/1605.07254v2
PDF http://arxiv.org/pdf/1605.07254v2.pdf
PWC https://paperswithcode.com/paper/convergence-guarantees-for-kernel-based
Repo
Framework

A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization

Title A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization
Authors Hildo Bijl, Thomas B. Schön, Jan-Willem van Wingerden, Michel Verhaegen
Abstract Bayesian optimization through Gaussian process regression is an effective method of optimizing an unknown function for which every measurement is expensive. It approximates the objective function and then recommends a new measurement point to try out. This recommendation is usually selected by optimizing a given acquisition function. After a sufficient number of measurements, a recommendation about the maximum is made. However, a key realization is that the maximum of a Gaussian process is not a deterministic point, but a random variable with a distribution of its own. This distribution cannot be calculated analytically. Our main contribution is an algorithm, inspired by sequential Monte Carlo samplers, that approximates this maximum distribution. Subsequently, by taking samples from this distribution, we enable Thompson sampling to be applied to (armed-bandit) optimization problems with a continuous input space. All this is done without requiring the optimization of a nonlinear acquisition function. Experiments have shown that the resulting optimization method performs competitively at keeping the cumulative regret limited.
Tasks
Published 2016-04-01
URL http://arxiv.org/abs/1604.00169v3
PDF http://arxiv.org/pdf/1604.00169v3.pdf
PWC https://paperswithcode.com/paper/a-sequential-monte-carlo-approach-to-thompson
Repo
Framework
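The "maximum distribution" idea in the abstract can be sketched with plain Monte Carlo: draw joint posterior samples over a candidate grid, record where each draw attains its maximum, and Thompson-sample from the resulting histogram. This is a simplified stand-in for the paper's sequential Monte Carlo sampler; the interface (a user-supplied `sample_posterior` function) and the function names are hypothetical.

```python
import random
from collections import Counter

def max_distribution(sample_posterior, candidates, n_draws=1000):
    """Approximate the distribution of the argmax of an unknown function.

    sample_posterior(candidates) must return one joint draw of function
    values at the candidate points (e.g., a sample from a GP posterior).
    Each draw votes for the candidate where it attains its maximum.
    """
    counts = Counter()
    for _ in range(n_draws):
        values = sample_posterior(candidates)
        best = max(range(len(candidates)), key=values.__getitem__)
        counts[best] += 1
    return {candidates[i]: c / n_draws for i, c in counts.items()}

def thompson_sample(max_dist, rng=random):
    """Pick the next measurement point from the approximated maximum
    distribution -- Thompson sampling over a continuous-input surrogate."""
    points, probs = zip(*max_dist.items())
    return rng.choices(points, weights=probs, k=1)[0]
```

In the paper's setting the draws come from a Gaussian process posterior and are reweighted with SMC machinery rather than redrawn from scratch, but the histogram-of-argmaxes picture is the same.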

From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning

Title From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning
Authors Lieke Gelderloos, Grzegorz Chrupała
Abstract We present a model of visually-grounded language learning based on stacked gated recurrent neural networks which learns to predict visual features given an image description in the form of a sequence of phonemes. The learning task resembles that faced by human language learners who need to discover both structure and meaning from noisy and ambiguous data across modalities. We show that our model indeed learns to predict features of the visual context given phonetically transcribed image descriptions, and show that it represents linguistic information in a hierarchy of levels: lower layers in the stack are comparatively more sensitive to form, whereas higher layers are more sensitive to meaning.
Tasks
Published 2016-10-11
URL http://arxiv.org/abs/1610.03342v1
PDF http://arxiv.org/pdf/1610.03342v1.pdf
PWC https://paperswithcode.com/paper/from-phonemes-to-images-levels-of
Repo
Framework

Dynamic Multi-Objectives Optimization with a Changing Number of Objectives

Title Dynamic Multi-Objectives Optimization with a Changing Number of Objectives
Authors Renzhi Chen, Ke Li, Xin Yao
Abstract Existing studies on dynamic multi-objective optimization focus on problems with time-dependent objective functions, while the ones with a changing number of objectives have rarely been considered in the literature. Instead of changing the shape or position of the Pareto-optimal front/set as time-dependent objective functions do, increasing or decreasing the number of objectives usually leads to the expansion or contraction of the dimension of the Pareto-optimal front/set manifold. Unfortunately, most existing dynamic handling techniques can hardly be adapted to this type of dynamics. In this paper, we report our attempt toward tackling dynamic multi-objective optimization problems with a changing number of objectives. We implement a new two-archive evolutionary algorithm which maintains two co-evolving populations simultaneously. In particular, these two populations are complementary to each other: one is concerned more with convergence, while the other is concerned more with diversity. The compositions of these two populations are adaptively reconstructed once the environment changes. In addition, these two populations interact with each other via a mating selection mechanism. Comprehensive experiments are conducted on various benchmark problems with a time-dependent number of objectives. Empirical results fully demonstrate the effectiveness of our proposed algorithm.
Tasks
Published 2016-08-23
URL http://arxiv.org/abs/1608.06514v2
PDF http://arxiv.org/pdf/1608.06514v2.pdf
PWC https://paperswithcode.com/paper/dynamic-multi-objectives-optimization-with-a
Repo
Framework