May 5, 2019

2783 words 14 mins read

Paper Group ANR 527

Graphical Model Sketch. Using Apache Lucene to Search Vector of Locally Aggregated Descriptors. Smoothing parameter estimation framework for IBM word alignment models. High Dimensional Inference with Random Maximum A-Posteriori Perturbations. ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms. Unsupervised preprocessing for Tactile Data. Linear Hy …

Graphical Model Sketch


Title	Graphical Model Sketch
Authors	Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, Siqi Sun
Abstract	Structured high-cardinality data arises in many domains, and poses a major challenge for both modeling and inference. Graphical models are a popular approach to modeling structured data but they are unsuitable for high-cardinality variables. The count-min (CM) sketch is a popular approach to estimating probabilities in high-cardinality data but it does not scale well beyond a few variables. In this work, we bring together the ideas of graphical models and count sketches; and propose and analyze several approaches to estimating probabilities in structured high-cardinality streams of data. The key idea of our approximations is to use the structure of a graphical model and approximately estimate its factors by “sketches”, which hash high-cardinality variables using random projections. Our approximations are computationally efficient and their space complexity is independent of the cardinality of variables. Our error bounds are multiplicative and significantly improve upon those of the CM sketch, a state-of-the-art approach to estimating probabilities in streams. We evaluate our approximations on synthetic and real-world problems, and report an order of magnitude improvements over the CM sketch.
Tasks
Published	2016-02-09
URL	http://arxiv.org/abs/1602.03105v2
PDF	http://arxiv.org/pdf/1602.03105v2.pdf
PWC	https://paperswithcode.com/paper/graphical-model-sketch
Repo
Framework

Using Apache Lucene to Search Vector of Locally Aggregated Descriptors


Title	Using Apache Lucene to Search Vector of Locally Aggregated Descriptors
Authors	Giuseppe Amato, Paolo Bolettieri, Fabrizio Falchi, Claudio Gennaro, Lucia Vadicamo
Abstract	Surrogate Text Representation (STR) is a profitable solution to efficient similarity search on metric space using conventional text search engines, such as Apache Lucene. This technique is based on comparing the permutations of some reference objects in place of the original metric distance. However, the Achilles heel of STR approach is the need to reorder the result set of the search according to the metric distance. This forces to use a support database to store the original objects, which requires efficient random I/O on a fast secondary memory (such as flash-based storages). In this paper, we propose to extend the Surrogate Text Representation to specifically address a class of visual metric objects known as Vector of Locally Aggregated Descriptors (VLAD). This approach is based on representing the individual sub-vectors forming the VLAD vector with the STR, providing a finer representation of the vector and enabling us to get rid of the reordering phase. The experiments on a publicly available dataset show that the extended STR outperforms the baseline STR achieving satisfactory performance near to the one obtained with the original VLAD vectors.
Tasks
Published	2016-04-19
URL	http://arxiv.org/abs/1604.05576v1
PDF	http://arxiv.org/pdf/1604.05576v1.pdf
PWC	https://paperswithcode.com/paper/using-apache-lucene-to-search-vector-of
Repo
Framework

Smoothing parameter estimation framework for IBM word alignment models


Title	Smoothing parameter estimation framework for IBM word alignment models
Authors	Vuong Van Bui, Cuong Anh Le
Abstract	IBM models are very important word alignment models in Machine Translation. Following the Maximum Likelihood Estimation principle to estimate their parameters, the models will easily overfit the training data when the data are sparse. While smoothing is a very popular solution in Language Model, there still lacks studies on smoothing for word alignment. In this paper, we propose a framework which generalizes the notable work Moore [2004] of applying additive smoothing to word alignment models. The framework allows developers to customize the smoothing amount for each pair of word. The added amount will be scaled appropriately by a common factor which reflects how much the framework trusts the adding strategy according to the performance on data. We also carefully examine various performance criteria and propose a smoothened version of the error count, which generally gives the best result.
Tasks	Language Modelling, Machine Translation, Word Alignment
Published	2016-01-14
URL	http://arxiv.org/abs/1601.03650v4
PDF	http://arxiv.org/pdf/1601.03650v4.pdf
PWC	https://paperswithcode.com/paper/smoothing-parameter-estimation-framework-for
Repo
Framework

High Dimensional Inference with Random Maximum A-Posteriori Perturbations


Title	High Dimensional Inference with Random Maximum A-Posteriori Perturbations
Authors	Tamir Hazan, Francesco Orabona, Anand D. Sarwate, Subhransu Maji, Tommi Jaakkola
Abstract	This paper presents a new approach, called perturb-max, for high-dimensional statistical inference that is based on applying random perturbations followed by optimization. This framework injects randomness to maximum a-posteriori (MAP) predictors by randomly perturbing the potential function for the input. A classic result from extreme value statistics asserts that perturb-max operations generate unbiased samples from the Gibbs distribution using high-dimensional perturbations. Unfortunately, the computational cost of generating so many high-dimensional random variables can be prohibitive. However, when the perturbations are of low dimension, sampling the perturb-max prediction is as efficient as MAP optimization. This paper shows that the expected value of perturb-max inference with low dimensional perturbations can be used sequentially to generate unbiased samples from the Gibbs distribution. Furthermore the expected value of the maximal perturbations is a natural bound on the entropy of such perturb-max models. A measure concentration result for perturb-max values shows that the deviation of their sampled average from its expectation decays exponentially in the number of samples, allowing effective approximation of the expectation.
Tasks
Published	2016-02-10
URL	http://arxiv.org/abs/1602.03571v3
PDF	http://arxiv.org/pdf/1602.03571v3.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-inference-with-random
Repo
Framework

ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms


Title	ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms
Authors	Enrico Santus, Tin-Shing Chiu, Qin Lu, Alessandro Lenci, Chu-Ren Huang
Abstract	In this paper, we describe ROOT13, a supervised system for the classification of hypernyms, co-hyponyms and random words. The system relies on a Random Forest algorithm and 13 unsupervised corpus-based features. We evaluate it with a 10-fold cross validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i.e. adjectives, nouns and verbs). When all the classes are present, ROOT13 achieves an F1 score of 88.3%, against a baseline of 57.6% (vector cosine). When the classification is binary, ROOT13 achieves the following results: hypernyms-co-hyponyms (93.4% vs. 60.2%), hypernymsrandom (92.3% vs. 65.5%) and co-hyponyms-random (97.3% vs. 81.5%). Our results are competitive with stateof-the-art models.
Tasks
Published	2016-03-29
URL	http://arxiv.org/abs/1603.08705v1
PDF	http://arxiv.org/pdf/1603.08705v1.pdf
PWC	https://paperswithcode.com/paper/root13-spotting-hypernyms-co-hyponyms-and
Repo
Framework

Unsupervised preprocessing for Tactile Data


Title	Unsupervised preprocessing for Tactile Data
Authors	Maximilian Karl, Justin Bayer, Patrick van der Smagt
Abstract	Tactile information is important for gripping, stable grasp, and in-hand manipulation, yet the complexity of tactile data prevents widespread use of such sensors. We make use of an unsupervised learning algorithm that transforms the complex tactile data into a compact, latent representation without the need to record ground truth reference data. These compact representations can either be used directly in a reinforcement learning based controller or can be used to calibrate the tactile sensor to physical quantities with only a few datapoints. We show the quality of our latent representation by predicting important features and with a simple control task.
Tasks
Published	2016-06-23
URL	http://arxiv.org/abs/1606.07312v1
PDF	http://arxiv.org/pdf/1606.07312v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-preprocessing-for-tactile-data
Repo
Framework

Linear Hypothesis Testing in Dense High-Dimensional Linear Models


Title	Linear Hypothesis Testing in Dense High-Dimensional Linear Models
Authors	Yinchu Zhu, Jelena Bradic
Abstract	We propose a methodology for testing linear hypothesis in high-dimensional linear models. The proposed test does not impose any restriction on the size of the model, i.e. model sparsity or the loading vector representing the hypothesis. Providing asymptotically valid methods for testing general linear functions of the regression parameters in high-dimensions is extremely challenging – especially without making restrictive or unverifiable assumptions on the number of non-zero elements. We propose to test the moment conditions related to the newly designed restructured regression, where the inputs are transformed and augmented features. These new features incorporate the structure of the null hypothesis directly. The test statistics are constructed in such a way that lack of sparsity in the original model parameter does not present a problem for the theoretical justification of our procedures. We establish asymptotically exact control on Type I error without imposing any sparsity assumptions on model parameter or the vector representing the linear hypothesis. Our method is also shown to achieve certain optimality in detecting deviations from the null hypothesis. We demonstrate the favorable finite-sample performance of the proposed methods, via a number of numerical and a real data example.
Tasks
Published	2016-10-10
URL	http://arxiv.org/abs/1610.02987v2
PDF	http://arxiv.org/pdf/1610.02987v2.pdf
PWC	https://paperswithcode.com/paper/linear-hypothesis-testing-in-dense-high
Repo
Framework

Communication-efficient Distributed Sparse Linear Discriminant Analysis


Title	Communication-efficient Distributed Sparse Linear Discriminant Analysis
Authors	Lu Tian, Quanquan Gu
Abstract	We propose a communication-efficient distributed estimation method for sparse linear discriminant analysis (LDA) in the high dimensional regime. Our method distributes the data of size $N$ into $m$ machines, and estimates a local sparse LDA estimator on each machine using the data subset of size $N/m$. After the distributed estimation, our method aggregates the debiased local estimators from $m$ machines, and sparsifies the aggregated estimator. We show that the aggregated estimator attains the same statistical rate as the centralized estimation method, as long as the number of machines $m$ is chosen appropriately. Moreover, we prove that our method can attain the model selection consistency under a milder condition than the centralized method. Experiments on both synthetic and real datasets corroborate our theory.
Tasks	Model Selection
Published	2016-10-15
URL	http://arxiv.org/abs/1610.04798v1
PDF	http://arxiv.org/pdf/1610.04798v1.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-distributed-sparse
Repo
Framework

Does Distributionally Robust Supervised Learning Give Robust Classifiers?


Title	Does Distributionally Robust Supervised Learning Give Robust Classifiers?
Authors	Weihua Hu, Gang Niu, Issei Sato, Masashi Sugiyama
Abstract	Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow a different distribution from training data. DRSL with f-divergences explicitly considers the worst-case distribution shift by minimizing the adversarially reweighted training loss. In this paper, we analyze this DRSL, focusing on the classification scenario. Since the DRSL is explicitly formulated for a distribution shift scenario, we naturally expect it to give a robust classifier that can aggressively handle shifted distributions. However, surprisingly, we prove that the DRSL just ends up giving a classifier that exactly fits the given training distribution, which is too pessimistic. This pessimism comes from two sources: the particular losses used in classification and the fact that the variety of distributions to which the DRSL tries to be robust is too wide. Motivated by our analysis, we propose simple DRSL that overcomes this pessimism and empirically demonstrate its effectiveness.
Tasks
Published	2016-11-07
URL	http://arxiv.org/abs/1611.02041v6
PDF	http://arxiv.org/pdf/1611.02041v6.pdf
PWC	https://paperswithcode.com/paper/does-distributionally-robust-supervised
Repo
Framework

Simple and Efficient Learning using Privileged Information


Title	Simple and Efficient Learning using Privileged Information
Authors	Xinxing Xu, Joey Tianyi Zhou, IvorW. Tsang, Zheng Qin, Rick Siow Mong Goh, Yong Liu
Abstract	The Support Vector Machine using Privileged Information (SVM+) has been proposed to train a classifier to utilize the additional privileged information that is only available in the training phase but not available in the test phase. In this work, we propose an efficient solution for SVM+ by simply utilizing the squared hinge loss instead of the hinge loss as in the existing SVM+ formulation, which interestingly leads to a dual form with less variables and in the same form with the dual of the standard SVM. The proposed algorithm is utilized to leverage the additional web knowledge that is only available during training for the image categorization tasks. The extensive experimental results on both Caltech101 andWebQueries datasets show that our proposed method can achieve a factor of up to hundred times speedup with the comparable accuracy when compared with the existing SVM+ method.
Tasks	Image Categorization
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01518v1
PDF	http://arxiv.org/pdf/1604.01518v1.pdf
PWC	https://paperswithcode.com/paper/simple-and-efficient-learning-using
Repo
Framework

A Constraint-Handling Technique for Genetic Algorithms using a Violation Factor


Title	A Constraint-Handling Technique for Genetic Algorithms using a Violation Factor
Authors	Adam Chehouri, Rafic Younes, Jean Perron, Adrian Ilinca
Abstract	Over the years, several meta-heuristic algorithms were proposed and are now emerging as common methods for constrained optimization problems. Among them, genetic algorithms (GA’s) shine as popular evolutionary algorithms (EA’s) in engineering optimization. Most engineering design problems are difficult to resolve with conventional optimization algorithms because they are highly nonlinear and contain constraints. In order to handle these constraints, the most common technique is to apply penalty functions. The major drawback is that they require tuning of parameters, which can be very challenging. In this paper, we present a constraint-handling technique for GA’s solely using the violation factor, called VCH (Violation Constraint-Handling) method. Several benchmark problems from the literature are examined. The VCH technique was able to provide a consistent performance and match results from other GA-based techniques.
Tasks
Published	2016-10-04
URL	http://arxiv.org/abs/1610.00976v1
PDF	http://arxiv.org/pdf/1610.00976v1.pdf
PWC	https://paperswithcode.com/paper/a-constraint-handling-technique-for-genetic
Repo
Framework

Evaluation of the Effect of Improper Segmentation on Word Spotting


Title	Evaluation of the Effect of Improper Segmentation on Word Spotting
Authors	Sounak Dey, Anguelos Nicolaou, Josep Llados, Umapada Pal
Abstract	Word spotting is an important recognition task in historical document analysis. In most cases methods are developed and evaluated assuming perfect word segmentations. In this paper we propose an experimental framework to quantify the effect of goodness of word segmentation has on the performance achieved by word spotting methods in identical unbiased conditions. The framework consists of generating systematic distortions on segmentation and retrieving the original queries from the distorted dataset. We apply the framework on the George Washington and Barcelona Marriage Dataset and on several established and state-of-the-art methods. The experiments allow for an estimate of the end-to-end performance of word spotting methods.
Tasks
Published	2016-04-21
URL	http://arxiv.org/abs/1604.06243v1
PDF	http://arxiv.org/pdf/1604.06243v1.pdf
PWC	https://paperswithcode.com/paper/evaluation-of-the-effect-of-improper
Repo
Framework

Linear-time Outlier Detection via Sensitivity


Title	Linear-time Outlier Detection via Sensitivity
Authors	Mario Lucic, Olivier Bachem, Andreas Krause
Abstract	Outliers are ubiquitous in modern data sets. Distance-based techniques are a popular non-parametric approach to outlier detection as they require no prior assumptions on the data generating distribution and are simple to implement. Scaling these techniques to massive data sets without sacrificing accuracy is a challenging task. We propose a novel algorithm based on the intuition that outliers have a significant influence on the quality of divergence-based clustering solutions. We propose sensitivity - the worst-case impact of a data point on the clustering objective - as a measure of outlierness. We then prove that influence, a (non-trivial) upper-bound on the sensitivity, can be computed by a simple linear time algorithm. To scale beyond a single machine, we propose a communication efficient distributed algorithm. In an extensive experimental evaluation, we demonstrate the effectiveness and establish the statistical significance of the proposed approach. In particular, it outperforms the most popular distance-based approaches while being several orders of magnitude faster.
Tasks	Outlier Detection
Published	2016-05-02
URL	http://arxiv.org/abs/1605.00519v1
PDF	http://arxiv.org/pdf/1605.00519v1.pdf
PWC	https://paperswithcode.com/paper/linear-time-outlier-detection-via-sensitivity
Repo
Framework

Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints


Title	Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
Authors	Greg Durrett, Taylor Berg-Kirkpatrick, Dan Klein
Abstract	We present a discriminative model for single-document summarization that integrally combines compression and anaphoricity constraints. Our model selects textual units to include in the summary based on a rich set of sparse features whose weights are learned on a large corpus. We allow for the deletion of content within a sentence when that deletion is licensed by compression rules; in our framework, these are implemented as dependencies between subsentential units of text. Anaphoricity constraints then improve cross-sentence coherence by guaranteeing that, for each pronoun included in the summary, the pronoun’s antecedent is included as well or the pronoun is rewritten as a full mention. When trained end-to-end, our final system outperforms prior work on both ROUGE as well as on human judgments of linguistic quality.
Tasks	Document Summarization
Published	2016-03-29
URL	http://arxiv.org/abs/1603.08887v2
PDF	http://arxiv.org/pdf/1603.08887v2.pdf
PWC	https://paperswithcode.com/paper/learning-based-single-document-summarization
Repo
Framework

Chinese Song Iambics Generation with Neural Attention-based Model


Title	Chinese Song Iambics Generation with Neural Attention-based Model
Authors	Qixin Wang, Tianyi Luo, Dong Wang, Chao Xing
Abstract	Learning and generating Chinese poems is a charming yet challenging task. Traditional approaches involve various language modeling and machine translation techniques, however, they perform not as well when generating poems with complex pattern constraints, for example Song iambics, a famous type of poems that involve variable-length sentences and strict rhythmic patterns. This paper applies the attention-based sequence-to-sequence model to generate Chinese Song iambics. Specifically, we encode the cue sentences by a bi-directional Long-Short Term Memory (LSTM) model and then predict the entire iambic with the information provided by the encoder, in the form of an attention-based LSTM that can regularize the generation process by the fine structure of the input cues. Several techniques are investigated to improve the model, including global context integration, hybrid style training, character vector initialization and adaptation. Both the automatic and subjective evaluation results show that our model indeed can learn the complex structural and rhythmic patterns of Song iambics, and the generation is rather successful.
Tasks	Language Modelling, Machine Translation
Published	2016-04-21
URL	http://arxiv.org/abs/1604.06274v2
PDF	http://arxiv.org/pdf/1604.06274v2.pdf
PWC	https://paperswithcode.com/paper/chinese-song-iambics-generation-with-neural
Repo
Framework