Paper Group ANR 252
Towards Geo-Distributed Machine Learning. Online Isotonic Regression. Self-localization from Images with Small Overlap. Combining Gradient Boosting Machines with Collective Inference to Predict Continuous Values. Predicting the future relevance of research institutions - The winning solution of the KDD Cup 2016. Sparse Activity and Sparse Connectivity in Supervised Learning. Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling. Characterizing Realizability in Abstract Argumentation. Distribution-Specific Hardness of Learning Neural Networks. New Steps on the Exact Learning of CNF. Getting Started with Neural Models for Semantic Matching in Web Search. Fuzzy paraphrases in learning word representations with a lexicon. Reinforcement-based Simultaneous Algorithm and its Hyperparameters Selection. Learning Generic Sentence Representations Using Convolutional Neural Networks. Neural Networks Models for Entity Discovery and Linking.
Towards Geo-Distributed Machine Learning
| Title | Towards Geo-Distributed Machine Learning |
| --- | --- |
| Authors | Ignacio Cano, Markus Weimer, Dhruv Mahajan, Carlo Curino, Giovanni Matteo Fumarola |
| Abstract | Latency to end-users and regulatory requirements push large companies to build data centers all around the world. The resulting data is “born” geographically distributed. On the other hand, many machine learning applications require a global view of such data in order to achieve the best results. These types of applications form a new class of learning problems, which we call Geo-Distributed Machine Learning (GDML). Such applications need to cope with: 1) scarce and expensive cross-data center bandwidth, and 2) growing privacy concerns that are pushing for stricter data sovereignty regulations. Current solutions to learning from geo-distributed data sources revolve around the idea of first centralizing the data in one data center, and then training locally. As machine learning algorithms are communication-intensive, the cost of centralizing the data is thought to be offset by the lower cost of intra-data center communication during training. In this work, we show that the current centralized practice can be far from optimal, and propose a system for doing geo-distributed training. Furthermore, we argue that the geo-distributed approach is structurally more amenable to dealing with regulatory constraints, as raw data never leaves the source data center. Our empirical evaluation on three real datasets confirms the general validity of our approach, and shows that GDML is not only possible but also advisable in many scenarios. |
| Tasks | |
| Published | 2016-03-30 |
| URL | http://arxiv.org/abs/1603.09035v1 |
| PDF | http://arxiv.org/pdf/1603.09035v1.pdf |
| PWC | https://paperswithcode.com/paper/towards-geo-distributed-machine-learning |
| Repo | |
| Framework | |
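The core trade-off the abstract describes, keeping raw data in place and paying WAN cost only for model updates, can be illustrated with a toy distributed-gradient loop. This is a minimal sketch of the communication pattern, not the paper's system; the synthetic data, logistic model, and learning rate are our own stand-ins.

```python
# A minimal sketch (not the paper's system) of the geo-distributed pattern:
# raw data stays in its "data center"; only gradients cross the WAN.
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(w, X, y):
    """Logistic-regression gradient computed entirely inside one data center."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

# Three data centers, each holding its own shard of synthetic data.
shards = []
w_true = rng.normal(size=5)
for _ in range(3):
    X = rng.normal(size=(200, 5))
    y = (X @ w_true + 0.1 * rng.normal(size=200) > 0).astype(float)
    shards.append((X, y))

w = np.zeros(5)
for step in range(100):
    # Each site ships only a 5-float gradient, never its raw examples.
    grads = [local_gradient(w, X, y) for X, y in shards]
    w -= 0.5 * np.mean(grads, axis=0)

print("weights after training:", np.round(w, 2))
```

Each round moves a handful of floats across the simulated WAN instead of the full shards, which is where the bandwidth savings of the geo-distributed approach come from.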
Online Isotonic Regression
| Title | Online Isotonic Regression |
| --- | --- |
| Authors | Wojciech Kotłowski, Wouter M. Koolen, Alan Malek |
| Abstract | We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially chosen positions and is evaluated by her total squared loss compared against the best isotonic (non-decreasing) function in hindsight. We survey several standard online learning algorithms and show that none of them achieve the optimal regret exponent; in fact, most of them (including Online Gradient Descent, Follow the Leader and Exponential Weights) incur linear regret. We then prove that the Exponential Weights algorithm played over a covering net of isotonic functions has a regret bounded by $O\big(T^{1/3} \log^{2/3}(T)\big)$ and present a matching $\Omega(T^{1/3})$ lower bound on regret. We provide a computationally efficient version of this algorithm. We also analyze the noise-free case, in which the revealed labels are isotonic, and show that the bound can be improved to $O(\log T)$ or even to $O(1)$ (when the labels are revealed in isotonic order). Finally, we extend the analysis beyond squared loss and give bounds for entropic loss and absolute loss. |
| Tasks | |
| Published | 2016-03-14 |
| URL | http://arxiv.org/abs/1603.04190v2 |
| PDF | http://arxiv.org/pdf/1603.04190v2.pdf |
| PWC | https://paperswithcode.com/paper/online-isotonic-regression |
| Repo | |
| Framework | |
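To make the setup concrete, here is a small sketch of the online protocol, with a naive forecaster standing in for the paper's Exponential Weights algorithm; the hindsight comparator uses scikit-learn's IsotonicRegression. Everything except the protocol itself is our own choice.

```python
# A small sketch of the online isotonic protocol: labels arrive at
# adversarially chosen positions, the learner predicts, and regret is
# measured against the best isotonic function fit in hindsight.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
T = 200
xs = rng.permutation(np.linspace(0, 1, T))      # adversarial arrival order
ys = np.clip(xs + 0.2 * rng.normal(size=T), 0, 1)

loss, seen_y = 0.0, []
for x, y in zip(xs, ys):
    # Naive forecaster (mean of revealed labels); the paper's algorithm is
    # Exponential Weights over a covering net of isotonic functions.
    pred = np.mean(seen_y) if seen_y else 0.5
    loss += (pred - y) ** 2
    seen_y.append(y)

# Best isotonic (non-decreasing) function in hindsight.
iso = IsotonicRegression(y_min=0, y_max=1).fit(xs, ys)
hindsight = np.sum((iso.predict(xs) - ys) ** 2)
print(f"regret of naive forecaster: {loss - hindsight:.2f}")
```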
Self-localization from Images with Small Overlap
| Title | Self-localization from Images with Small Overlap |
| --- | --- |
| Authors | Tanaka Kanji |
| Abstract | With the recent success of visual features from deep convolutional neural networks (DCNN) in visual robot self-localization, it has become important and practical to address more general self-localization scenarios. In this paper, we address the scenario of self-localization from images with small overlap. We explicitly introduce a localization difficulty index as a decreasing function of view overlap between query and relevant database images, and investigate performance versus difficulty for challenging cross-view self-localization tasks. We then reformulate self-localization as scalable bag-of-visual-features (BoVF) scene retrieval and present an efficient solution called PCA-NBNN, aiming to facilitate fast yet discriminative correspondence between partially overlapping images. The proposed approach adopts recent findings in discriminativity-preserving encoding of DCNN features using principal component analysis (PCA) and cross-domain scene matching using the naive Bayes nearest neighbor (NBNN) distance metric. We experimentally demonstrate that the proposed PCA-NBNN framework frequently achieves results comparable to previous DCNN features and that the BoVF model is significantly more efficient. We further address an important alternative scenario of “self-localization from images with NO overlap” and report the result. |
| Tasks | |
| Published | 2016-03-03 |
| URL | http://arxiv.org/abs/1603.00993v1 |
| PDF | http://arxiv.org/pdf/1603.00993v1.pdf |
| PWC | https://paperswithcode.com/paper/self-localization-from-images-with-small |
| Repo | |
| Framework | |
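The NBNN side of the method can be sketched in a few lines: each of a query image's (PCA-compressed) local descriptors is matched to the nearest descriptor of a candidate database image, and the summed distances rank the database. The descriptor shapes and random data below are placeholders, not the paper's DCNN features.

```python
# A rough sketch of the NBNN image-to-image distance used by PCA-NBNN.
import numpy as np

def nbnn_distance(query_desc, db_desc):
    """Sum over query descriptors of the squared distance to the nearest
    descriptor of one database image (naive Bayes nearest neighbor)."""
    # Pairwise squared distances: shape (n_query, n_db).
    d2 = ((query_desc[:, None, :] - db_desc[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).sum()

def rank_database(query_desc, database):
    """Rank database images by ascending NBNN distance."""
    scores = [nbnn_distance(query_desc, d) for d in database]
    return np.argsort(scores)

rng = np.random.default_rng(2)
# Pretend these are PCA-compressed DCNN descriptors (e.g. 16-D after PCA).
query = rng.normal(size=(50, 16))
database = [rng.normal(size=(60, 16)) for _ in range(5)]
print("retrieval order:", rank_database(query, database))
```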
Combining Gradient Boosting Machines with Collective Inference to Predict Continuous Values
| Title | Combining Gradient Boosting Machines with Collective Inference to Predict Continuous Values |
| --- | --- |
| Authors | Iman Alodah, Jennifer Neville |
| Abstract | Gradient boosting of regression trees is a competitive procedure for learning predictive models of continuous data that fits the data with an additive non-parametric model. The classic version of gradient boosting assumes that the data is independent and identically distributed. However, relational data with interdependent, linked instances is now common, and the dependencies in such data can be exploited to improve predictive performance. Collective inference is one approach to exploit relational correlation patterns and significantly reduce classification error. However, much of the work on collective learning and inference has focused on discrete prediction tasks rather than continuous ones. In this work, we investigate how to combine these two paradigms to improve regression in relational domains. Specifically, we propose a boosting algorithm for learning a collective inference model that predicts a continuous target variable. In the algorithm, we learn a basic relational model, collectively infer the target values, and then iteratively learn relational models to predict the residuals. We evaluate our proposed algorithm on a real network dataset and show that it outperforms alternative boosting methods. Our investigation also revealed that the relational features interact to produce better predictions. |
| Tasks | |
| Published | 2016-07-01 |
| URL | http://arxiv.org/abs/1607.00110v1 |
| PDF | http://arxiv.org/pdf/1607.00110v1.pdf |
| PWC | https://paperswithcode.com/paper/combining-gradient-boosting-machines-with |
| Repo | |
| Framework | |
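The loop the abstract outlines, fit a base model, build relational features from neighbors' current predictions, then boost on the residuals, might look roughly like the following sketch. The random graph, shrinkage, and tree depth are our assumptions, not the paper's settings.

```python
# A hedged sketch of residual boosting with a relational feature: each round
# recomputes the mean of neighbors' current predictions and fits a tree to
# what the ensemble still gets wrong.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 4))
A = (rng.random((n, n)) < 0.02).astype(float)    # random adjacency matrix
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0)
deg = np.maximum(A.sum(1), 1)
# Target depends on a node's own feature and its neighbors' features.
y = X[:, 0] + 0.5 * (A @ X[:, 0]) / deg + 0.1 * rng.normal(size=n)

pred = np.full(n, y.mean())                      # base model: global mean
models, lr = [], 0.3
for _ in range(20):
    rel = ((A @ pred) / deg)[:, None]            # neighbors' mean prediction
    feats = np.hstack([X, rel])
    tree = DecisionTreeRegressor(max_depth=3).fit(feats, y - pred)
    pred += lr * tree.predict(feats)             # collective update
    models.append(tree)

print("train MSE:", round(float(np.mean((pred - y) ** 2)), 4))
```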
Predicting the future relevance of research institutions - The winning solution of the KDD Cup 2016
| Title | Predicting the future relevance of research institutions - The winning solution of the KDD Cup 2016 |
| --- | --- |
| Authors | Vlad Sandulescu, Mihai Chiru |
| Abstract | The world’s collective knowledge is evolving through research and new scientific discoveries. It is becoming increasingly difficult to objectively rank the impact research institutes have on global advancements. However, since funding, governmental support, and the quality of staff and students all mirror the projected quality of an institution, it is essential to measure an affiliation’s rating in a transparent and widely accepted way. We propose and investigate several methods to rank affiliations based on the number of their accepted papers at future academic conferences. We carry out our investigation using publicly available datasets such as the Microsoft Academic Graph, a heterogeneous graph which contains various information about academic papers. We analyze several models, starting with a simple probabilities-based method, and then gradually expand our training dataset, engineer many more features, and use mixed models and gradient-boosted decision trees to improve our predictions. |
| Tasks | |
| Published | 2016-09-09 |
| URL | http://arxiv.org/abs/1609.02728v1 |
| PDF | http://arxiv.org/pdf/1609.02728v1.pdf |
| PWC | https://paperswithcode.com/paper/predicting-the-future-relevance-of-research |
| Repo | |
| Framework | |
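The "simple probabilities-based method" the abstract starts from can be imagined as a recency-weighted share of past accepted papers per conference, as in this toy pandas sketch; the data and the decay weight are fabricated for illustration and are not the paper's.

```python
# Toy baseline: score each affiliation by its recency-weighted share of
# accepted papers at a conference, then rank by that probability.
import pandas as pd

papers = pd.DataFrame({
    "affiliation": ["MIT", "MIT", "CMU", "CMU", "CMU", "ETH"],
    "conference":  ["KDD", "KDD", "KDD", "KDD", "KDD", "KDD"],
    "year":        [2013, 2015, 2014, 2015, 2015, 2013],
})

# Weight recent years more heavily (our assumption, not the paper's weights).
papers["w"] = 0.5 ** (2015 - papers["year"])
counts = papers.groupby(["conference", "affiliation"])["w"].sum()
scores = counts / counts.groupby(level="conference").transform("sum")
print(scores.sort_values(ascending=False))
```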
Sparse Activity and Sparse Connectivity in Supervised Learning
| Title | Sparse Activity and Sparse Connectivity in Supervised Learning |
| --- | --- |
| Authors | Markus Thom, Günther Palm |
| Abstract | Sparseness is a useful regularizer for learning in a wide range of applications, in particular in neural networks. This paper proposes a model targeted at classification tasks, where sparse activity and sparse connectivity are used to enhance classification capabilities. The tool for achieving this is a sparseness-enforcing projection operator which finds the closest vector with a pre-defined sparseness for any given vector. In the theoretical part of this paper, a comprehensive theory for such a projection is developed. It is shown that the projection is differentiable almost everywhere and can thus be implemented as a smooth neuronal transfer function. The entire model can hence be tuned end-to-end using gradient-based methods. Experiments on the MNIST database of handwritten digits show that classification performance can be boosted by sparse activity or sparse connectivity. With a combination of both, performance can be significantly better compared to classical non-sparse approaches. |
| Tasks | Image Classification |
| Published | 2016-03-28 |
| URL | http://arxiv.org/abs/1603.08367v1 |
| PDF | http://arxiv.org/pdf/1603.08367v1.pdf |
| PWC | https://paperswithcode.com/paper/sparse-activity-and-sparse-connectivity-in |
| Repo | |
| Framework | |
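The projection at the heart of the model targets Hoyer's sparseness measure, $\sigma(x) = (\sqrt{n} - \|x\|_1/\|x\|_2)/(\sqrt{n} - 1)$. The sketch below implements the measure plus a crude soft-threshold bisection that only approximates a sparseness-enforcing map; the paper derives the exact, almost-everywhere differentiable projection operator.

```python
# Hoyer sparseness measure plus a rough approximation of sparsification to a
# target sparseness via bisection on a soft-threshold (NOT the paper's exact
# closest-point projection).
import numpy as np

def hoyer_sparseness(x):
    n = x.size
    return (np.sqrt(n) - np.abs(x).sum() / np.linalg.norm(x)) / (np.sqrt(n) - 1)

def approx_sparsify(x, target, iters=50):
    """Bisect a soft-threshold until sigma(x) is approximately `target`."""
    lo, hi = 0.0, np.abs(x).max()
    for _ in range(iters):
        t = (lo + hi) / 2
        z = np.sign(x) * np.maximum(np.abs(x) - t, 0)
        if hoyer_sparseness(z) < target:
            lo = t          # not sparse enough: threshold harder
        else:
            hi = t
    return np.sign(x) * np.maximum(np.abs(x) - (lo + hi) / 2, 0)

x = np.random.default_rng(4).normal(size=100)
z = approx_sparsify(x, target=0.7)
print(round(hoyer_sparseness(z), 3))  # close to 0.7
```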
Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling
| Title | Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling |
| --- | --- |
| Authors | Gakuto Kurata, Bing Xiang, Bowen Zhou, Mo Yu |
| Abstract | Recurrent Neural Network (RNN) and one of its specific architectures, Long Short-Term Memory (LSTM), have been widely used for sequence labeling. In this paper, we first enhance LSTM-based sequence labeling to explicitly model label dependencies. Then we propose another enhancement to incorporate the global information spanning over the whole input sequence. The latter proposed method, encoder-labeler LSTM, first encodes the whole input sequence into a fixed-length vector with the encoder LSTM, and then uses this encoded vector as the initial state of another LSTM for sequence labeling. Combining these methods, we can predict the label sequence while taking into account both label dependencies and information from the whole input sequence. In experiments on a slot filling task, an essential component of natural language understanding, using the standard ATIS corpus, we achieved a state-of-the-art F1-score of 95.66%. |
| Tasks | Slot Filling |
| Published | 2016-01-07 |
| URL | http://arxiv.org/abs/1601.01530v4 |
| PDF | http://arxiv.org/pdf/1601.01530v4.pdf |
| PWC | https://paperswithcode.com/paper/leveraging-sentence-level-information-with |
| Repo | |
| Framework | |
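The encoder-labeler idea is compact enough to show as a model skeleton: one LSTM encodes the full sentence, and its final state initializes the LSTM that emits a label per token. This PyTorch sketch uses our own dimensions and omits the paper's refinements, such as explicit label-dependency modeling.

```python
# Minimal encoder-labeler LSTM sketch: the labeler starts from a summary of
# the whole input instead of a zero state.
import torch
import torch.nn as nn

class EncoderLabeler(nn.Module):
    def __init__(self, vocab, n_labels, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.labeler = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_labels)

    def forward(self, tokens):
        x = self.emb(tokens)                 # (B, T, dim)
        _, (h, c) = self.encoder(x)          # fixed-length sentence summary
        y, _ = self.labeler(x, (h, c))       # labeler starts from the summary
        return self.out(y)                   # (B, T, n_labels) slot logits

model = EncoderLabeler(vocab=1000, n_labels=20)
logits = model(torch.randint(0, 1000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 20])
```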
Characterizing Realizability in Abstract Argumentation
| Title | Characterizing Realizability in Abstract Argumentation |
| --- | --- |
| Authors | Thomas Linsbichler, Jörg Pührer, Hannes Strass |
| Abstract | Realizability for knowledge representation formalisms studies the following question: given a semantics and a set of interpretations, is there a knowledge base whose semantics coincides exactly with the given interpretation set? We introduce a general framework for analyzing realizability in abstract dialectical frameworks (ADFs) and several of its subclasses. In particular, the framework applies to Dung argumentation frameworks, SETAFs by Nielsen and Parsons, and bipolar ADFs. We present a uniform characterization method for the admissible, complete, preferred and model/stable semantics. We employ this method to devise an algorithm that decides realizability for the mentioned formalisms and semantics; moreover, the algorithm allows for constructing a desired knowledge base whenever one exists. The algorithm is built in a modular way and thus easily extensible to new formalisms and semantics. We have also implemented our approach in answer set programming, and used the implementation to obtain several novel results on the relative expressiveness of the aforementioned formalisms. |
| Tasks | Abstract Argumentation |
| Published | 2016-03-31 |
| URL | http://arxiv.org/abs/1603.09545v1 |
| PDF | http://arxiv.org/pdf/1603.09545v1.pdf |
| PWC | https://paperswithcode.com/paper/characterizing-realizability-in-abstract |
| Repo | |
| Framework | |
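For plain Dung frameworks the realizability question can at least be made concrete by brute force on small instances: enumerate subsets, compute the extensions under a semantics, and compare against the target set. The sketch below does exactly that for the admissible and complete semantics; the paper's ASP-based method is uniform across ADFs, SETAFs, and bipolar ADFs and far more scalable than this.

```python
# Brute-force extensions of a tiny Dung argumentation framework, then a
# naive realizability check: does this AF's complete semantics equal the
# target set of extensions?
from itertools import chain, combinations

def subsets(args):
    return chain.from_iterable(combinations(args, r) for r in range(len(args) + 1))

def conflict_free(S, att):
    return not any((a, b) in att for a in S for b in S)

def defends(S, a, args, att):
    # Every attacker of `a` is counter-attacked by some member of S.
    return all(any((c, b) in att for c in S) for b in args if (b, a) in att)

def admissible(args, att):
    return [set(S) for S in subsets(args)
            if conflict_free(S, att) and all(defends(S, a, args, att) for a in S)]

def complete(args, att):
    # Admissible sets containing every argument they defend.
    return [S for S in admissible(args, att)
            if all(a in S for a in args if defends(S, a, args, att))]

args = {"a", "b", "c"}
att = {("a", "b"), ("b", "a"), ("b", "c")}
target = [set(), {"a", "c"}, {"b"}]   # candidate extension set to realize
found = complete(args, att)
print("complete extensions:", found)
print("realized:", sorted(map(sorted, found)) == sorted(map(sorted, target)))
```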
Distribution-Specific Hardness of Learning Neural Networks
| Title | Distribution-Specific Hardness of Learning Neural Networks |
| --- | --- |
| Authors | Ohad Shamir |
| Abstract | Although neural networks are routinely and successfully trained in practice using simple gradient-based methods, most existing theoretical results are negative, showing that learning such networks is difficult, in a worst-case sense over all data distributions. In this paper, we take a more nuanced view, and consider whether specific assumptions on the “niceness” of the input distribution, or “niceness” of the target function (e.g. in terms of smoothness, non-degeneracy, incoherence, random choice of parameters etc.), are sufficient to guarantee learnability using gradient-based methods. We provide evidence that neither class of assumptions alone is sufficient: On the one hand, for any member of a class of “nice” target functions, there are difficult input distributions. On the other hand, we identify a family of simple target functions, which are difficult to learn even if the input distribution is “nice”. To prove our results, we develop some tools which may be of independent interest, such as extending Fourier-based hardness techniques developed in the context of statistical queries \cite{blum1994weakly}, from the Boolean cube to Euclidean space and to more general classes of functions. |
| Tasks | |
| Published | 2016-09-05 |
| URL | http://arxiv.org/abs/1609.01037v2 |
| PDF | http://arxiv.org/pdf/1609.01037v2.pdf |
| PWC | https://paperswithcode.com/paper/distribution-specific-hardness-of-learning |
| Repo | |
| Framework | |
New Steps on the Exact Learning of CNF
| Title | New Steps on the Exact Learning of CNF |
| --- | --- |
| Authors | Montserrat Hermo, Ana Ozaki |
| Abstract | A major problem in computational learning theory is whether the class of formulas in conjunctive normal form (CNF) is efficiently learnable. Although it is known that this class cannot be polynomially learned using either membership or equivalence queries alone, it is open whether CNF can be polynomially learned using both types of queries. One of the most important results concerning a restriction of the class CNF is that propositional Horn formulas are polynomial time learnable in Angluin’s exact learning model with membership and equivalence queries. In this work we push this boundary and show that the class of multivalued dependency formulas (MVDF) is polynomially learnable from interpretations. We then provide a notion of reduction between learning problems in Angluin’s model, showing that a transformation of the algorithm suffices to efficiently learn multivalued database dependencies from data relations. We also show via reductions that our main result extends well known previous results and allows us to find alternative solutions for them. |
| Tasks | |
| Published | 2016-09-10 |
| URL | http://arxiv.org/abs/1609.03054v1 |
| PDF | http://arxiv.org/pdf/1609.03054v1.pdf |
| PWC | https://paperswithcode.com/paper/new-steps-on-the-exact-learning-of-cnf |
| Repo | |
| Framework | |
Getting Started with Neural Models for Semantic Matching in Web Search
| Title | Getting Started with Neural Models for Semantic Matching in Web Search |
| --- | --- |
| Authors | Kezban Dilek Onal, Ismail Sengor Altingovde, Pinar Karagoz, Maarten de Rijke |
| Abstract | The vocabulary mismatch problem is a long-standing problem in information retrieval. Semantic matching holds the promise of solving the problem. Recent advances in language technology have given rise to unsupervised neural models for learning representations of words as well as bigger textual units. Such representations enable powerful semantic matching methods. This survey is meant as an introduction to the use of neural models for semantic matching. To remain focused we limit ourselves to web search. We detail the required background and terminology, a taxonomy grouping the rapidly growing body of work in the area, and then survey work on neural models for semantic matching in the context of three tasks: query suggestion, ad retrieval, and document retrieval. We include a section on resources and best practices that we believe will help readers who are new to the area. We conclude with an assessment of the state-of-the-art and suggestions for future work. |
| Tasks | Information Retrieval |
| Published | 2016-11-08 |
| URL | http://arxiv.org/abs/1611.03305v1 |
| PDF | http://arxiv.org/pdf/1611.03305v1.pdf |
| PWC | https://paperswithcode.com/paper/getting-started-with-neural-models-for |
| Repo | |
| Framework | |
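As a first step in the survey's spirit, the simplest neural semantic matcher represents query and document as averaged word vectors and scores them by cosine similarity, so matching happens in embedding space rather than by exact term overlap. The vocabulary and vectors below are fabricated stand-ins for trained embeddings.

```python
# Baseline semantic matching sketch: average word vectors, compare by cosine.
import numpy as np

rng = np.random.default_rng(5)
vocab = ["cheap", "flight", "ticket", "airfare", "hotel", "room"]
E = {w: rng.normal(size=8) for w in vocab}      # stand-in word embeddings

def embed(text):
    """Average the embeddings of in-vocabulary tokens."""
    vecs = [E[w] for w in text.split() if w in E]
    return np.mean(vecs, axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

query = "cheap airfare"
for doc in ["flight ticket", "hotel room"]:
    print(doc, "->", round(cosine(embed(query), embed(doc)), 3))
```

With real trained embeddings, "cheap airfare" would score closer to "flight ticket" than to "hotel room" despite sharing no terms, which is exactly the vocabulary-mismatch bridging the survey discusses.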
Fuzzy paraphrases in learning word representations with a lexicon
| Title | Fuzzy paraphrases in learning word representations with a lexicon |
| --- | --- |
| Authors | Yuanzhi Ke, Masafumi Hagiwara |
| Abstract | A synonym of a polysemous word is usually a paraphrase of only one sense among many. When lexicons are used to improve vector-space word representations, such paraphrases are unreliable and bring noise to the vector space. Prior work uses a coefficient to adjust the overall influence of the lexicon and treats all paraphrases equally. In this paper, we propose a novel approach that treats paraphrases differently to alleviate the adverse effects of polysemy. We annotate each paraphrase with a degree of reliability, and paraphrases are randomly eliminated according to these degrees while our model learns word representations. In this way, our approach drops the unreliable paraphrases while keeping the more reliable ones. The experimental results show that the proposed method improves the word vectors. Our approach addresses the polysemy problem while keeping one vector per word, which makes it easier to use than conventional methods that estimate multiple vectors per word. Our approach also outperforms the prior works in the experiments. |
| Tasks | |
| Published | 2016-11-02 |
| URL | http://arxiv.org/abs/1611.00674v9 |
| PDF | http://arxiv.org/pdf/1611.00674v9.pdf |
| PWC | https://paperswithcode.com/paper/fuzzy-paraphrases-in-learning-word |
| Repo | |
| Framework | |
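The random-elimination mechanism is easy to sketch: each paraphrase pair carries a reliability degree and survives a training pass with that probability, so paraphrases of a word's rarer senses rarely contribute. The lexicon entries and degrees below are invented for illustration.

```python
# Sketch of reliability-weighted paraphrase sampling: on each pass, a pair
# is kept with probability equal to its reliability degree.
import random

lexicon = [                      # (word, paraphrase, reliability degree)
    ("bank", "shore", 0.3),      # paraphrase of a rarer sense: low degree
    ("bank", "lender", 0.8),
    ("happy", "glad", 0.95),
]

def sample_pairs(lexicon, rng):
    """Paraphrase pairs surviving one training pass."""
    return [(w, p) for w, p, degree in lexicon if rng.random() < degree]

rng = random.Random(6)
for epoch in range(3):
    kept = sample_pairs(lexicon, rng)
    # ...here the kept pairs would feed the word-vector training objective...
    print(f"epoch {epoch}: training on {kept}")
```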
Reinforcement-based Simultaneous Algorithm and its Hyperparameters Selection
| Title | Reinforcement-based Simultaneous Algorithm and its Hyperparameters Selection |
| --- | --- |
| Authors | Valeria Efimova, Andrey Filchenkov, Anatoly Shalyto |
| Abstract | Many algorithms for data analysis exist, especially for classification problems. To solve a data analysis problem, a proper algorithm must be chosen, and its hyperparameters must be selected. In this paper, we present a new method for the simultaneous selection of an algorithm and its hyperparameters. To do so, we reduce the problem to the multi-armed bandit problem: we treat each algorithm as an arm, and a fixed-time hyperparameter search for that algorithm as playing the corresponding arm. We also suggest a problem-specific reward function. We performed experiments on 10 real datasets and compared the suggested method with the existing one implemented in Auto-WEKA. The results show that our method is significantly better in most cases and never worse than Auto-WEKA. |
| Tasks | |
| Published | 2016-11-07 |
| URL | http://arxiv.org/abs/1611.02053v1 |
| PDF | http://arxiv.org/pdf/1611.02053v1.pdf |
| PWC | https://paperswithcode.com/paper/reinforcement-based-simultaneous-algorithm |
| Repo | |
| Framework | |
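The reduction maps naturally onto a standard bandit strategy such as UCB1: each learning algorithm is an arm, and a pull spends a fixed budget on random hyperparameter search, returning the validation score as reward. The sketch below uses plain cross-validated accuracy and two scikit-learn classifiers as arms; the paper's problem-specific reward function differs.

```python
# UCB1 over algorithms; each pull tries one random hyperparameter setting.
import math
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
X, y = load_digits(return_X_y=True)

def pull(arm):
    """One fixed-budget play: sample a hyperparameter, return the CV score."""
    if arm == 0:
        model = KNeighborsClassifier(n_neighbors=int(rng.integers(1, 15)))
    else:
        model = DecisionTreeClassifier(max_depth=int(rng.integers(2, 20)))
    return cross_val_score(model, X, y, cv=3).mean()

counts, sums = [0, 0], [0.0, 0.0]
for t in range(1, 21):
    # UCB1 index; unplayed arms get infinite priority.
    ucb = [sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
           if counts[a] else float("inf") for a in range(2)]
    a = int(np.argmax(ucb))
    counts[a] += 1
    sums[a] += pull(a)

print("plays per arm:", counts, "mean rewards:",
      [round(s / c, 3) for s, c in zip(sums, counts)])
```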
Learning Generic Sentence Representations Using Convolutional Neural Networks
| Title | Learning Generic Sentence Representations Using Convolutional Neural Networks |
| --- | --- |
| Authors | Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin |
| Abstract | We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a continuous vector, and using a long short-term memory recurrent neural network as a decoder. Several tasks are considered, including sentence reconstruction and future sentence prediction. Further, a hierarchical encoder-decoder model is proposed to encode a sentence to predict multiple future sentences. By training our models on a large collection of novels, we obtain a highly generic convolutional sentence encoder that performs well in practice. Experimental results on several benchmark datasets, and across a broad range of applications, demonstrate the superiority of the proposed model over competing methods. |
| Tasks | |
| Published | 2016-11-23 |
| URL | http://arxiv.org/abs/1611.07897v2 |
| PDF | http://arxiv.org/pdf/1611.07897v2.pdf |
| PWC | https://paperswithcode.com/paper/learning-generic-sentence-representations |
| Repo | |
| Framework | |
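The architecture reduces to a compact skeleton: a convolutional encoder with max-over-time pooling produces one sentence vector, which initializes an LSTM decoder that reconstructs or predicts a sentence. All dimensions and layer choices in this PyTorch sketch are ours, not the paper's.

```python
# Minimal CNN-encoder / LSTM-decoder sketch for sentence representations.
import torch
import torch.nn as nn

class CNNEncoderLSTMDecoder(nn.Module):
    def __init__(self, vocab, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src, tgt):
        x = self.emb(src).transpose(1, 2)               # (B, dim, T) for conv
        h = torch.relu(self.conv(x)).max(dim=2).values  # max-over-time pool
        state = (h.unsqueeze(0), torch.zeros_like(h).unsqueeze(0))
        y, _ = self.decoder(self.emb(tgt), state)       # decode from code
        return self.out(y)                              # (B, T_tgt, vocab)

model = CNNEncoderLSTMDecoder(vocab=5000)
logits = model(torch.randint(0, 5000, (4, 10)), torch.randint(0, 5000, (4, 9)))
print(logits.shape)  # torch.Size([4, 9, 5000])
```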
Neural Networks Models for Entity Discovery and Linking
| Title | Neural Networks Models for Entity Discovery and Linking |
| --- | --- |
| Authors | Dan Liu, Wei Lin, Shiliang Zhang, Si Wei, Hui Jiang |
| Abstract | This paper describes the USTC_NELSLIP systems submitted to the Trilingual Entity Detection and Linking (EDL) track of the 2016 TAC Knowledge Base Population (KBP) contest. We built two systems for entity discovery and mention detection (MD): one uses a conditional RNNLM and the other uses an attention-based encoder-decoder framework. The entity linking (EL) system consists of two modules: rule-based candidate generation and a neural-network probability ranking model. Moreover, some simple string matching rules are used for NIL clustering. In the end, our best system achieved an F1 score of 0.624 under the end-to-end typed mention CEAF plus metric. |
| Tasks | Entity Linking, Knowledge Base Population |
| Published | 2016-11-11 |
| URL | http://arxiv.org/abs/1611.03558v1 |
| PDF | http://arxiv.org/pdf/1611.03558v1.pdf |
| PWC | https://paperswithcode.com/paper/neural-networks-models-for-entity-discovery |
| Repo | |
| Framework | |