Paper Group ANR 401
MML is not consistent for Neyman-Scott. Defending Non-Bayesian Learning against Adversarial Attacks. Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data. Geometric descent method for convex composite minimization. Making Tree Ensembles Interpretable. Building a robust sentiment lexicon with (almost) no resource. Imp …
MML is not consistent for Neyman-Scott
Title | MML is not consistent for Neyman-Scott |
Authors | Michael Brand |
Abstract | Strict Minimum Message Length (SMML) is an information-theoretic statistical inference method widely cited (but only with informal arguments) as providing estimations that are consistent for general estimation problems. It is, however, almost invariably intractable to compute, for which reason only approximations of it (known as MML algorithms) are ever used in practice. Using novel techniques that allow for the first time direct, non-approximated analysis of SMML solutions, we investigate the Neyman-Scott estimation problem, an oft-cited showcase for the consistency of MML, and show that even with a natural choice of prior neither SMML nor its popular approximations are consistent for it, thereby providing a counterexample to the general claim. This is the first known explicit construction of an SMML solution for a natural, high-dimensional problem. |
Tasks | |
Published | 2016-10-14 |
URL | https://arxiv.org/abs/1610.04336v5 |
https://arxiv.org/pdf/1610.04336v5.pdf | |
PWC | https://paperswithcode.com/paper/mml-is-not-consistent-for-neyman-scott |
Repo | |
Framework | |
Defending Non-Bayesian Learning against Adversarial Attacks
Title | Defending Non-Bayesian Learning against Adversarial Attacks |
Authors | Lili Su, Nitin H. Vaidya |
Abstract | This paper addresses the problem of non-Bayesian learning over multi-agent networks, where agents repeatedly collect partially informative observations about an unknown state of the world, and try to collaboratively learn the true state. We focus on the impact of the adversarial agents on the performance of consensus-based non-Bayesian learning, where non-faulty agents combine local learning updates with consensus primitives. In particular, we consider the scenario where an unknown subset of agents suffer Byzantine faults – agents suffering Byzantine faults behave arbitrarily. Two different learning rules are proposed. |
Tasks | |
Published | 2016-06-28 |
URL | http://arxiv.org/abs/1606.08883v1 |
http://arxiv.org/pdf/1606.08883v1.pdf | |
PWC | https://paperswithcode.com/paper/defending-non-bayesian-learning-against |
Repo | |
Framework | |
Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data
Title | Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data |
Authors | Teng Long, Ryan Lowe, Jackie Chi Kit Cheung, Doina Precup |
Abstract | Recent work in learning vector-space embeddings for multi-relational data has focused on combining relational information derived from knowledge bases with distributional information derived from large text corpora. We propose a simple approach that leverages the descriptions of entities or phrases available in lexical resources, in conjunction with distributional semantics, in order to derive a better initialization for training relational models. Applying this initialization to the TransE model results in significant new state-of-the-art performances on the WordNet dataset, decreasing the mean rank from the previous best of 212 to 51. It also results in faster convergence of the entity representations. We find that there is a trade-off between improving the mean rank and the hits@10 with this approach. This illustrates that much remains to be understood regarding performance improvements in relational models. |
Tasks | Entity Embeddings |
Published | 2016-05-18 |
URL | http://arxiv.org/abs/1605.05416v1 |
http://arxiv.org/pdf/1605.05416v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-lexical-resources-for-learning |
Repo | |
Framework | |
Geometric descent method for convex composite minimization
Title | Geometric descent method for convex composite minimization |
Authors | Shixiang Chen, Shiqian Ma, Wei Liu |
Abstract | In this paper, we extend the geometric descent method recently proposed by Bubeck, Lee and Singh to tackle nonsmooth and strongly convex composite problems. We prove that our proposed algorithm, dubbed geometric proximal gradient method (GeoPG), converges with a linear rate $(1-1/\sqrt{\kappa})$ and thus achieves the optimal rate among first-order methods, where $\kappa$ is the condition number of the problem. Numerical results on linear regression and logistic regression with elastic net regularization show that GeoPG compares favorably with Nesterov’s accelerated proximal gradient method, especially when the problem is ill-conditioned. |
Tasks | |
Published | 2016-12-29 |
URL | http://arxiv.org/abs/1612.09034v4 |
http://arxiv.org/pdf/1612.09034v4.pdf | |
PWC | https://paperswithcode.com/paper/geometric-descent-method-for-convex-composite |
Repo | |
Framework | |
Making Tree Ensembles Interpretable
Title | Making Tree Ensembles Interpretable |
Authors | Satoshi Hara, Kohei Hayashi |
Abstract | Tree ensembles, such as random forest and boosted trees, are renowned for their high prediction performance, whereas their interpretability is critically limited. In this paper, we propose a post processing method that improves the model interpretability of tree ensembles. After learning a complex tree ensembles in a standard way, we approximate it by a simpler model that is interpretable for human. To obtain the simpler model, we derive the EM algorithm minimizing the KL divergence from the complex ensemble. A synthetic experiment showed that a complicated tree ensemble was approximated reasonably as interpretable. |
Tasks | |
Published | 2016-06-17 |
URL | http://arxiv.org/abs/1606.05390v1 |
http://arxiv.org/pdf/1606.05390v1.pdf | |
PWC | https://paperswithcode.com/paper/making-tree-ensembles-interpretable |
Repo | |
Framework | |
Building a robust sentiment lexicon with (almost) no resource
Title | Building a robust sentiment lexicon with (almost) no resource |
Authors | Mickael Rouvier, Benoit Favre |
Abstract | Creating sentiment polarity lexicons is labor intensive. Automatically translating them from resourceful languages requires in-domain machine translation systems, which rely on large quantities of bi-texts. In this paper, we propose to replace machine translation by transferring words from the lexicon through word embeddings aligned across languages with a simple linear transform. The approach leads to no degradation, compared to machine translation, when tested on sentiment polarity classification on tweets from four languages. |
Tasks | Machine Translation, Word Embeddings |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05202v1 |
http://arxiv.org/pdf/1612.05202v1.pdf | |
PWC | https://paperswithcode.com/paper/building-a-robust-sentiment-lexicon-with |
Repo | |
Framework | |
Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning
Title | Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning |
Authors | Nanyun Peng, Mark Dredze |
Abstract | Named entity recognition, and other information extraction tasks, frequently use linguistic features such as part of speech tags or chunkings. For languages where word boundaries are not readily identified in text, word segmentation is a key first step to generating features for an NER system. While using word boundary tags as features are helpful, the signals that aid in identifying these boundaries may provide richer information for an NER system. New state-of-the-art word segmentation systems use neural models to learn representations for predicting word boundaries. We show that these same representations, jointly trained with an NER system, yield significant improvements in NER for Chinese social media. In our experiments, jointly training NER and word segmentation with an LSTM-CRF model yields nearly 5% absolute improvement over previously published results. |
Tasks | Named Entity Recognition, Representation Learning |
Published | 2016-03-02 |
URL | http://arxiv.org/abs/1603.00786v2 |
http://arxiv.org/pdf/1603.00786v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-named-entity-recognition-for |
Repo | |
Framework | |
Bayesian Inference of Bijective Non-Rigid Shape Correspondence
Title | Bayesian Inference of Bijective Non-Rigid Shape Correspondence |
Authors | Matthias Vestner, Roee Litman, Alex Bronstein, Emanuele Rodolà, Daniel Cremers |
Abstract | Many algorithms for the computation of correspondences between deformable shapes rely on some variant of nearest neighbor matching in a descriptor space. Such are, for example, various point-wise correspondence recovery algorithms used as a postprocessing stage in the functional correspondence framework. In this paper, we show that such frequently used techniques in practice suffer from lack of accuracy and result in poor surjectivity. We propose an alternative recovery technique guaranteeing a bijective correspondence and producing significantly higher accuracy. We derive the proposed method from a statistical framework of Bayesian inference and demonstrate its performance on several challenging deformable 3D shape matching datasets. |
Tasks | Bayesian Inference |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03425v1 |
http://arxiv.org/pdf/1607.03425v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-inference-of-bijective-non-rigid |
Repo | |
Framework | |
On Reward Function for Survival
Title | On Reward Function for Survival |
Authors | Naoto Yoshida |
Abstract | Obtaining a survival strategy (policy) is one of the fundamental problems of biological agents. In this paper, we generalize the formulation of previous research related to the survival of an agent and we formulate the survival problem as a maximization of the multi-step survival probability in future time steps. We introduce a method for converting the maximization of multi-step survival probability into a classical reinforcement learning problem. Using this conversion, the reward function (negative temporal cost function) is expressed as the log of the temporal survival probability. And we show that the objective function of the reinforcement learning in this sense is proportional to the variational lower bound of the original problem. Finally, We empirically demonstrate that the agent learns survival behavior by using the reward function introduced in this paper. |
Tasks | |
Published | 2016-06-18 |
URL | http://arxiv.org/abs/1606.05767v2 |
http://arxiv.org/pdf/1606.05767v2.pdf | |
PWC | https://paperswithcode.com/paper/on-reward-function-for-survival |
Repo | |
Framework | |
A scalable end-to-end Gaussian process adapter for irregularly sampled time series classification
Title | A scalable end-to-end Gaussian process adapter for irregularly sampled time series classification |
Authors | Steven Cheng-Xian Li, Benjamin Marlin |
Abstract | We present a general framework for classification of sparse and irregularly-sampled time series. The properties of such time series can result in substantial uncertainty about the values of the underlying temporal processes, while making the data difficult to deal with using standard classification methods that assume fixed-dimensional feature spaces. To address these challenges, we propose an uncertainty-aware classification framework based on a special computational layer we refer to as the Gaussian process adapter that can connect irregularly sampled time series data to any black-box classifier learnable using gradient descent. We show how to scale up the required computations based on combining the structured kernel interpolation framework and the Lanczos approximation method, and how to discriminatively train the Gaussian process adapter in combination with a number of classifiers end-to-end using backpropagation. |
Tasks | Time Series, Time Series Classification |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04443v2 |
http://arxiv.org/pdf/1606.04443v2.pdf | |
PWC | https://paperswithcode.com/paper/a-scalable-end-to-end-gaussian-process |
Repo | |
Framework | |
Learning Bound for Parameter Transfer Learning
Title | Learning Bound for Parameter Transfer Learning |
Authors | Wataru Kumagai |
Abstract | We consider a transfer-learning problem by using the parameter transfer approach, where a suitable parameter of feature mapping is learned through one task and applied to another objective task. Then, we introduce the notion of the local stability and parameter transfer learnability of parametric feature mapping,and thereby derive a learning bound for parameter transfer algorithms. As an application of parameter transfer learning, we discuss the performance of sparse coding in self-taught learning. Although self-taught learning algorithms with plentiful unlabeled data often show excellent empirical performance, their theoretical analysis has not been studied. In this paper, we also provide the first theoretical learning bound for self-taught learning. |
Tasks | Transfer Learning |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08696v3 |
http://arxiv.org/pdf/1610.08696v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-bound-for-parameter-transfer |
Repo | |
Framework | |
Lecture Notes on Spectral Graph Methods
Title | Lecture Notes on Spectral Graph Methods |
Authors | Michael W. Mahoney |
Abstract | These are lecture notes that are based on the lectures from a class I taught on the topic of Spectral Graph Methods at UC Berkeley during the Spring 2015 semester. |
Tasks | |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.04845v1 |
http://arxiv.org/pdf/1608.04845v1.pdf | |
PWC | https://paperswithcode.com/paper/lecture-notes-on-spectral-graph-methods |
Repo | |
Framework | |
The Dem@Care Experiments and Datasets: a Technical Report
Title | The Dem@Care Experiments and Datasets: a Technical Report |
Authors | Anastasios Karakostas, Alexia Briassouli, Konstantinos Avgerinakis, Ioannis Kompatsiaris, Magda Tsolaki |
Abstract | The objective of Dem@Care is the development of a complete system providing personal health services to people with dementia, as well as medical professionals and caregivers, by using a multitude of sensors, for context-aware, multi-parametric monitoring of lifestyle, ambient environment, and health parameters. Multi-sensor data analysis, combined with intelligent decision making mechanisms, will allow an accurate representation of the person’s current status and will provide the appropriate feedback, both to the person and the associated caregivers, enhancing the standard clinical workflow. Within the project framework, several data collection activities have taken place to assist technical development and evaluation tasks. In all these activities, particular attention has been paid to adhere to ethical guidelines and preserve the participants’ privacy. This technical report describes shorty the (a) the main objectives of the project, (b) the main ethical principles and (c) the datasets that have been already created. |
Tasks | Decision Making |
Published | 2016-12-17 |
URL | http://arxiv.org/abs/1701.01142v1 |
http://arxiv.org/pdf/1701.01142v1.pdf | |
PWC | https://paperswithcode.com/paper/the-demcare-experiments-and-datasets-a |
Repo | |
Framework | |
ggRandomForests: Exploring Random Forest Survival
Title | ggRandomForests: Exploring Random Forest Survival |
Authors | John Ehrlinger |
Abstract | Random forest (Leo Breiman 2001a) (RF) is a non-parametric statistical method requiring no distributional assumptions on covariate relation to the response. RF is a robust, nonlinear technique that optimizes predictive accuracy by fitting an ensemble of trees to stabilize model estimates. Random survival forests (RSF) (Ishwaran and Kogalur 2007; Ishwaran et al. 2008) are an extension of Breimans RF techniques allowing efficient nonparametric analysis of time to event data. The randomForestSRC package (Ishwaran and Kogalur 2014) is a unified treatment of Breimans random forest for survival, regression and classification problems. Predictive accuracy makes RF an attractive alternative to parametric models, though complexity and interpretability of the forest hinder wider application of the method. We introduce the ggRandomForests package, tools for visually understand random forest models grown in R (R Core Team 2014) with the randomForestSRC package. The ggRandomForests package is structured to extract intermediate data objects from randomForestSRC objects and generate figures using the ggplot2 (Wickham 2009) graphics package. This document is structured as a tutorial for building random forest for survival with the randomForestSRC package and using the ggRandomForests package for investigating how the forest is constructed. We analyse the Primary Biliary Cirrhosis of the liver data from a clinical trial at the Mayo Clinic (Fleming and Harrington 1991). Our aim is to demonstrate the strength of using Random Forest methods for both prediction and information retrieval, specifically in time to event data settings. |
Tasks | Information Retrieval |
Published | 2016-12-28 |
URL | http://arxiv.org/abs/1612.08974v1 |
http://arxiv.org/pdf/1612.08974v1.pdf | |
PWC | https://paperswithcode.com/paper/ggrandomforests-exploring-random-forest |
Repo | |
Framework | |
Randomized Mechanisms for Selling Reserved Instances in Cloud
Title | Randomized Mechanisms for Selling Reserved Instances in Cloud |
Authors | Jia Zhang, Weidong Ma, Tao Qin, Xiaoming Sun, Tie-Yan Liu |
Abstract | Selling reserved instances (or virtual machines) is a basic service in cloud computing. In this paper, we consider a more flexible pricing model for instance reservation, in which a customer can propose the time length and number of resources of her request, while in today’s industry, customers can only choose from several predefined reservation packages. Under this model, we design randomized mechanisms for customers coming online to optimize social welfare and providers’ revenue. We first consider a simple case, where the requests from the customers do not vary too much in terms of both length and value density. We design a randomized mechanism that achieves a competitive ratio $\frac{1}{42}$ for both \emph{social welfare} and \emph{revenue}, which is a improvement as there is usually no revenue guarantee in previous works such as \cite{azar2015ec,wang2015selling}. This ratio can be improved up to $\frac{1}{11}$ when we impose a realistic constraint on the maximum number of resources used by each request. On the hardness side, we show an upper bound $\frac{1}{3}$ on competitive ratio for any randomized mechanism. We then extend our mechanism to the general case and achieve a competitive ratio $\frac{1}{42\log k\log T}$ for both social welfare and revenue, where $T$ is the ratio of the maximum request length to the minimum request length and $k$ is the ratio of the maximum request value density to the minimum request value density. This result outperforms the previous upper bound $\frac{1}{CkT}$ for deterministic mechanisms \cite{wang2015selling}. We also prove an upper bound $\frac{2}{\log 8kT}$ for any randomized mechanism. All the mechanisms we provide are in a greedy style. They are truthful and easy to be integrated into practical cloud systems. |
Tasks | |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07379v1 |
http://arxiv.org/pdf/1611.07379v1.pdf | |
PWC | https://paperswithcode.com/paper/randomized-mechanisms-for-selling-reserved |
Repo | |
Framework | |