Paper Group NANR 146
Boosted Fitted Q-Iteration. Enumerating Distinct Decision Trees. Comparison of String Similarity Measures for Obscenity Filtering. An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages. SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling. BE Is Not the Unique Homomorphism That Makes the Partee Tr …
Boosted Fitted Q-Iteration
Title | Boosted Fitted Q-Iteration |
Authors | Samuele Tosatto, Matteo Pirotta, Carlo D’Eramo, Marcello Restelli |
Abstract | This paper studies B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems. B-FQI is an iterative off-line algorithm that, given a dataset of transitions, builds an approximation of the optimal action-value function by summing the approximations of the Bellman residuals across all iterations. The advantage of such an approach w.r.t. other AVI methods is twofold: (1) while keeping the same function space at each iteration, B-FQI can represent more complex functions by considering an additive model; (2) since the Bellman residual decreases as the optimal value function is approached, regression problems become easier as iterations proceed. We study B-FQI both theoretically, also providing a finite-sample error upper bound for it, and empirically, by comparing its performance to that of FQI in different domains and using different regression techniques. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=676 |
http://proceedings.mlr.press/v70/tosatto17a/tosatto17a.pdf | |
PWC | https://paperswithcode.com/paper/boosted-fitted-q-iteration |
Repo | |
Framework | |
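The additive estimator described in the abstract can be sketched on a toy problem. Everything below (the 4-state chain MDP, the constants) is illustrative and not from the paper; with an exact tabular "regressor" the boosted sum of Bellman residuals coincides with value iteration, but the sketch exposes the additive structure Q_K = sum of per-iteration residual fits.

```python
# Illustrative toy: deterministic 4-state chain MDP, states 0..3, actions
# 0 (stay) and 1 (move right); reward 1.0 for entering terminal state 3.
N_STATES, GAMMA, TERMINAL = 4, 0.9, 3

def step(s, a):
    ns = min(s + a, TERMINAL)
    r = 1.0 if ns == TERMINAL and s != TERMINAL else 0.0
    return ns, r

# B-FQI-style additive model: Q_K = sum_k rho_k, where each stage rho_k
# approximates the Bellman residual T Q_{k-1} - Q_{k-1}.  Here the
# "regressor" is an exact table, so the sketch reduces to value
# iteration, but it shows the additive structure of the estimator.
stages = []  # one residual table per iteration: {(s, a): value}

def q(s, a):
    return sum(rho[(s, a)] for rho in stages)

def bellman_target(s, a):
    ns, r = step(s, a)
    if ns == TERMINAL:
        return r
    return r + GAMMA * max(q(ns, b) for b in (0, 1))

for _ in range(30):
    # Fit the current Bellman residual and append it as a new stage.
    residual = {(s, a): bellman_target(s, a) - q(s, a)
                for s in range(N_STATES) for a in (0, 1)}
    stages.append(residual)

greedy = [max((0, 1), key=lambda a: q(s, a)) for s in range(N_STATES - 1)]
print(greedy, round(q(2, 1), 3))  # greedy policy moves right; Q(2,1) = 1.0
```

Because the residuals shrink as Q approaches the fixed point, the later stages contribute less and less, which is the second advantage the abstract highlights.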
Enumerating Distinct Decision Trees
Title | Enumerating Distinct Decision Trees |
Authors | Salvatore Ruggieri |
Abstract | The search space for the feature selection problem in decision tree learning is the lattice of subsets of the available features. We provide an exact enumeration procedure of the subsets that lead to all and only the distinct decision trees. The procedure can be adopted to prune the search space of complete and heuristics search methods in wrapper models for feature selection. Based on this, we design a computational optimization of the sequential backward elimination heuristics with a performance improvement of up to 100X. |
Tasks | Feature Selection |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=538 |
http://proceedings.mlr.press/v70/ruggieri17a/ruggieri17a.pdf | |
PWC | https://paperswithcode.com/paper/enumerating-distinct-decision-trees |
Repo | |
Framework | |
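The observation motivating the paper can be reproduced by brute force on a tiny example: many feature subsets induce the same learned tree, so the lattice of subsets can be pruned. The sketch below is not the paper's enumeration procedure (which avoids building trees for equivalent subsets); it just learns a greedy depth-1 tree for every subset of an invented boolean dataset and counts the distinct results.

```python
from itertools import combinations

# Tiny boolean dataset: feature 2 duplicates feature 0, so many feature
# subsets collapse onto the same learned tree.
X = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
y = [0, 0, 1, 1]
N_FEATURES = 3

def best_stump(features):
    """Greedy depth-1 tree over the given feature subset (illustrative)."""
    best = None
    for f in sorted(features):
        left = [y[i] for i, row in enumerate(X) if row[f] == 0]
        right = [y[i] for i, row in enumerate(X) if row[f] == 1]
        if not left or not right:
            continue
        # Leaf labels: majority class on each side of the split.
        leaf_l = max(set(left), key=left.count)
        leaf_r = max(set(right), key=right.count)
        acc = sum((leaf_l if row[f] == 0 else leaf_r) == y[i]
                  for i, row in enumerate(X))
        if best is None or acc > best[0]:
            best = (acc, f, leaf_l, leaf_r)
    return best[1:]  # canonical form: (split feature, left leaf, right leaf)

subsets = [set(c)
           for r in range(1, N_FEATURES + 1)
           for c in combinations(range(N_FEATURES), r)]
trees = {frozenset(s): best_stump(s) for s in subsets}
print(len(subsets), "subsets,", len(set(trees.values())), "distinct trees")
```

Here 7 non-empty subsets yield only 3 distinct trees; an exact enumeration of the distinct-tree-inducing subsets, as in the paper, would visit only those 3 representatives.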
Comparison of String Similarity Measures for Obscenity Filtering
Title | Comparison of String Similarity Measures for Obscenity Filtering |
Authors | Ekaterina Chernyak |
Abstract | In this paper we address the problem of filtering obscene lexis in Russian texts. We use string similarity measures to find words similar or identical to words from a stop list and establish both a test collection and a baseline for the task. Our experiments show that a novel string similarity measure based on the notion of an annotated suffix tree outperforms some of the other well known measures. |
Tasks | Information Retrieval, Sentiment Analysis, Spelling Correction |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1415/ |
https://www.aclweb.org/anthology/W17-1415 | |
PWC | https://paperswithcode.com/paper/comparison-of-string-similarity-measures-for |
Repo | |
Framework | |
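The annotated-suffix-tree measure is specific to the paper, but the stop-list filtering setup it is compared against can be sketched with one of the well-known baselines: normalized Levenshtein similarity. The stop list, threshold, and example tokens below are placeholders, not the paper's data.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Normalize distance to a [0, 1] similarity score."""
    m = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / m if m else 1.0

# Placeholder stop list and threshold; a real filter would use a curated
# list of obscene lexis and a threshold tuned on labeled data.
STOP_LIST = ["badword", "nasty"]
THRESHOLD = 0.8

def is_obscene(token: str) -> bool:
    return any(similarity(token.lower(), w) >= THRESHOLD for w in STOP_LIST)

print(is_obscene("badw0rd"))  # obfuscated variant is still caught
print(is_obscene("hello"))
```

String similarity (rather than exact matching) is what lets the filter catch deliberately obfuscated spellings, which is the core difficulty the paper's measures are evaluated on.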
An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages
Title | An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages |
Authors | Georg Heigold, Guenter Neumann, Josef van Genabith |
Abstract | This paper investigates neural character-based morphological tagging for languages with complex morphology and large tag sets. Character-based approaches are attractive as they can handle rare and unseen words gracefully. We evaluate on 14 languages and observe consistent gains over a state-of-the-art morphological tagger across all languages except for English and French, where we match the state of the art. We compare two architectures for computing character-based word vectors using recurrent (RNN) and convolutional (CNN) nets. We show that the CNN-based approach performs slightly worse and less consistently than the RNN-based approach. Small but systematic gains are observed when combining the two architectures by ensembling. |
Tasks | Language Modelling, Machine Translation, Morphological Analysis, Morphological Tagging, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1048/ |
https://www.aclweb.org/anthology/E17-1048 | |
PWC | https://paperswithcode.com/paper/an-extensive-empirical-evaluation-of |
Repo | |
Framework | |
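The CNN variant of the character-based word vectors compared in the paper can be sketched without a deep-learning framework: embed each character, slide fixed-width filters over the embeddings, and max-pool over positions to get a word vector of fixed size regardless of word length. The weights below are random (in the paper they are trained end-to-end with the tagger), and all dimensions are illustrative.

```python
import random

random.seed(0)

EMB_DIM, N_FILTERS, WIDTH = 4, 6, 3
ALPHABET = "abcdefghijklmnopqrstuvwxyz#"   # '#' pads word boundaries

# Random character embeddings and convolution filters (training omitted).
char_emb = {c: [random.gauss(0, 1) for _ in range(EMB_DIM)] for c in ALPHABET}
filters = [[random.gauss(0, 1) for _ in range(WIDTH * EMB_DIM)]
           for _ in range(N_FILTERS)]

def word_vector(word):
    """Character CNN: convolve width-3 filters over character embeddings,
    then max-pool over positions -> fixed-size word representation."""
    padded = "#" * (WIDTH - 1) + word.lower() + "#" * (WIDTH - 1)
    embs = [char_emb.get(c, char_emb["#"]) for c in padded]
    vec = []
    for f in filters:
        acts = []
        for i in range(len(embs) - WIDTH + 1):
            window = [x for e in embs[i:i + WIDTH] for x in e]
            acts.append(sum(w * x for w, x in zip(f, window)))
        vec.append(max(acts))   # max-pool over window positions
    return vec

v1, v2 = word_vector("walk"), word_vector("walking")
print(len(v1), len(v2))   # same dimensionality regardless of word length
```

The fixed-size output is what allows these vectors to replace a word-embedding lookup in the tagger, which is why rare and unseen words are handled gracefully.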
SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling
Title | SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling |
Authors | Jun-ichiro Hirayama, Aapo Hyvärinen, Motoaki Kawanabe |
Abstract | We present a novel probabilistic framework for a hierarchical extension of independent component analysis (ICA), with a particular motivation in neuroscientific data analysis and modeling. The framework incorporates a general subspace pooling with linear ICA-like layers stacked recursively. Unlike related previous models, our generative model is fully tractable: both the likelihood and the posterior estimates of latent variables can readily be computed with analytically simple formulae. The model is particularly simple in the case of complex-valued data since the pooling can be reduced to taking the modulus of complex numbers. Experiments on electroencephalography (EEG) and natural images demonstrate the validity of the method. |
Tasks | EEG |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=691 |
http://proceedings.mlr.press/v70/hirayama17a/hirayama17a.pdf | |
PWC | https://paperswithcode.com/paper/splice-fully-tractable-hierarchical-extension |
Repo | |
Framework | |
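The complex-valued special case mentioned in the abstract, where pooling reduces to taking the modulus of complex numbers, is easy to illustrate. The toy "sources" below are ours: complex numbers with fixed amplitude and random phase, so the pooled representation recovers the amplitude and discards the phase (the kind of invariance subspace pooling is meant to provide).

```python
import cmath
import random

random.seed(1)

def pool(z: complex) -> float:
    """Subspace pooling of a size-2 subspace (re, im): the modulus."""
    return abs(z)   # sqrt(re^2 + im^2)

# Toy sources: fixed amplitudes, random phases.
amplitudes = [1.0, 2.0, 3.0]
sources = [a * cmath.exp(1j * random.uniform(0, 2 * cmath.pi))
           for a in amplitudes]

pooled = [pool(s) for s in sources]
print([round(p, 6) for p in pooled])   # amplitudes recovered, phase dropped
```

In the full model such a pooling layer sits on top of an ICA-like linear layer, stacked recursively; tractability comes from the fact that both the likelihood and posterior estimates remain in closed form through these layers.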
BE Is Not the Unique Homomorphism That Makes the Partee Triangle Commute
Title | BE Is Not the Unique Homomorphism That Makes the Partee Triangle Commute |
Authors | Junri Shimada |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/W17-3401/ |
https://www.aclweb.org/anthology/W17-3401 | |
PWC | https://paperswithcode.com/paper/be-is-not-the-unique-homomorphism-that-makes |
Repo | |
Framework | |
Unsupervised Bilingual Segmentation using MDL for Machine Translation
Title | Unsupervised Bilingual Segmentation using MDL for Machine Translation |
Authors | Bin Shan, Hao Wang, Yves Lepage |
Abstract | |
Tasks | Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1015/ |
https://www.aclweb.org/anthology/Y17-1015 | |
PWC | https://paperswithcode.com/paper/unsupervised-bilingual-segmentation-using-mdl |
Repo | |
Framework | |
Latent-Variable PCFGs: Background and Applications
Title | Latent-Variable PCFGs: Background and Applications |
Authors | Shay Cohen |
Abstract | |
Tasks | Machine Translation, Question Answering, Text Generation |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/W17-3405/ |
https://www.aclweb.org/anthology/W17-3405 | |
PWC | https://paperswithcode.com/paper/latent-variable-pcfgs-background-and |
Repo | |
Framework | |
Evaluating Bayesian Models with Posterior Dispersion Indices
Title | Evaluating Bayesian Models with Posterior Dispersion Indices |
Authors | Alp Kucukelbir, Yixin Wang, David M. Blei |
Abstract | Probabilistic modeling is cyclical: we specify a model, infer its posterior, and evaluate its performance. Evaluation drives the cycle, as we revise our model based on how it performs. This requires a metric. Traditionally, predictive accuracy prevails. Yet, predictive accuracy does not tell the whole story. We propose to evaluate a model through posterior dispersion. The idea is to analyze how each datapoint fares in relation to posterior uncertainty around the hidden structure. This highlights datapoints the model struggles to explain and provides complementary insight to datapoints with low predictive accuracy. We present a family of posterior dispersion indices (PDI) that capture this idea. We show how a PDI identifies patterns of model mismatch in three real data examples: voting preferences, supermarket shopping, and population genetics. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=514 |
http://proceedings.mlr.press/v70/kucukelbir17a/kucukelbir17a.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-bayesian-models-with-posterior |
Repo | |
Framework | |
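The per-datapoint idea can be sketched concretely. The paper defines a family of dispersion indices; the particular choice below (variance of each datapoint's log-likelihood across posterior samples) is ours, for illustration, and the "posterior samples" are simulated rather than produced by real inference.

```python
import math
import random
import statistics

random.seed(0)

# Toy setup: data mostly near 0 plus one outlier; stand-in posterior
# samples for the Gaussian mean (a real analysis would use actual
# posterior draws from inference).
data = [0.1, -0.2, 0.05, 0.3, 5.0]          # 5.0 is the odd one out
post_mu = [random.gauss(0.05, 0.3) for _ in range(2000)]

def loglik(x, mu, sigma=1.0):
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))

# One simple dispersion index per datapoint: the variance of its
# log-likelihood across posterior samples.
pdi = [statistics.pvariance([loglik(x, mu) for mu in post_mu]) for x in data]

worst = max(range(len(data)), key=lambda i: pdi[i])
print(data[worst])   # the outlier has the largest posterior dispersion
```

The outlier's likelihood swings far more across plausible posterior values of the mean than the well-explained points' do, which is exactly the "struggles to explain" signal that predictive accuracy alone can miss.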
Adapting SimpleNLG to Spanish
Title | Adapting SimpleNLG to Spanish |
Authors | Alejandro Ramos-Soto, Julio Janeiro-Gallardo, Alberto Bugarín Diz |
Abstract | We describe SimpleNLG-ES, an adaptation of the SimpleNLG realization library for the Spanish language. Our implementation is based on the bilingual English-French SimpleNLG-EnFr adaptation. The library has been tested using a battery of examples that ensure that the most common syntax, morphology and orthography rules for Spanish are met. The library is currently being used in three different projects for the development of data-to-text systems in the meteorological, statistical data information, and business intelligence application domains. |
Tasks | Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3521/ |
https://www.aclweb.org/anthology/W17-3521 | |
PWC | https://paperswithcode.com/paper/adapting-simplenlg-to-spanish |
Repo | |
Framework | |
Latent Feature Lasso
Title | Latent Feature Lasso |
Authors | Ian En-Hsu Yen, Wei-Cheng Lee, Sung-En Chang, Arun Sai Suggala, Shou-De Lin, Pradeep Ravikumar |
Abstract | The latent feature model (LFM), proposed in (Griffiths & Ghahramani, 2005), but possibly with earlier origins, is a generalization of a mixture model, where each instance is generated not from a single latent class but from a combination of latent features. Thus, each instance has an associated latent binary feature incidence vector indicating the presence or absence of a feature. Due to its combinatorial nature, inference in LFMs is computationally intractable, and accordingly, most of the attention has focused on nonparametric LFMs, with priors such as the Indian Buffet Process (IBP) on infinite binary matrices. Recent efforts to tackle this complexity either still have computational complexity that is exponential, or sample complexity that is high-order polynomial w.r.t. the number of latent features. In this paper, we address this outstanding problem of tractable estimation of LFMs via a novel atomic-norm regularization, which gives an algorithm with polynomial run-time and sample complexity without impractical assumptions on the data distribution. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=867 |
http://proceedings.mlr.press/v70/yen17a/yen17a.pdf | |
PWC | https://paperswithcode.com/paper/latent-feature-lasso |
Repo | |
Framework | |
SRHR at SemEval-2017 Task 6: Word Associations for Humour Recognition
Title | SRHR at SemEval-2017 Task 6: Word Associations for Humour Recognition |
Authors | Andrew Cattle, Xiaojuan Ma |
Abstract | This paper explores the role of semantic relatedness features, such as word associations, in humour recognition. Specifically, we examine the task of inferring pairwise humour judgments in Twitter hashtag wars. We examine a variety of word association features derived from University of Southern Florida Free Association Norms (USF) and the Edinburgh Associative Thesaurus (EAT) and find that word association-based features outperform Word2Vec similarity, a popular semantic relatedness measure. Our system achieves an accuracy of 56.42% using a combination of unigram perplexity, bigram perplexity, EAT difference (tweet-avg), USF forward (max), EAT difference (word-avg), USF difference (word-avg), EAT forward (min), USF difference (tweet-max), and EAT backward (min). |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2067/ |
https://www.aclweb.org/anthology/S17-2067 | |
PWC | https://paperswithcode.com/paper/srhr-at-semeval-2017-task-6-word-associations |
Repo | |
Framework | |
Entity-Centric Information Access with Human in the Loop for the Biomedical Domain
Title | Entity-Centric Information Access with Human in the Loop for the Biomedical Domain |
Authors | Seid Muhie Yimam, Steffen Remus, Alexander Panchenko, Andreas Holzinger, Chris Biemann |
Abstract | In this paper, we describe the concept of entity-centric information access for the biomedical domain. With entity recognition technologies approaching acceptable levels of accuracy, we put forward a paradigm of document browsing and searching where the entities of the domain and their relations are explicitly modeled to provide users the possibility of collecting exhaustive information on relations of interest. We describe three working prototypes along these lines: NEW/S/LEAK, which was developed for investigative journalists who need a quick overview of large leaked document collections; STORYFINDER, which is a personalized organizer for information found in web pages that allows adding entities as well as relations, and is capable of personalized information management; and adaptive annotation capabilities of WEBANNO, which is a general-purpose linguistic annotation tool. We will discuss future steps towards the adaptation of these tools to biomedical data, which is subject to a recently started project on biomedical knowledge acquisition. A key difference to other approaches is the centering around the user in a Human-in-the-Loop machine learning approach, where users define and extend categories and enable the system to improve via feedback and interaction. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-8006/ |
https://doi.org/10.26615/978-954-452-044-1_006 | |
PWC | https://paperswithcode.com/paper/entity-centric-information-access-with-human |
Repo | |
Framework | |
When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness
Title | When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness |
Authors | Chris Russell, Matt J. Kusner, Joshua Loftus, Ricardo Silva |
Abstract | Machine learning is now being used to make crucial decisions about people’s lives. For nearly all of these decisions there is a risk that individuals of a certain race, gender, sexual orientation, or any other subpopulation are unfairly discriminated against. Our recent method has demonstrated how to use techniques from counterfactual inference to make predictions fair across different subpopulations. This method requires that one provides the causal model that generated the data at hand. In general, validating all causal implications of the model is not possible without further assumptions. Hence, it is desirable to integrate competing causal models to provide counterfactually fair decisions, regardless of which causal “world” is the correct one. In this paper, we show how it is possible to make predictions that are approximately fair with respect to multiple possible causal models at once, thus mitigating the problem of exact causal specification. We frame the goal of learning a fair classifier as an optimization problem with fairness constraints entailed by competing causal explanations. We show how this optimization problem can be efficiently solved using gradient-based methods. We demonstrate the flexibility of our model on two real-world fair classification problems. We show that our model can seamlessly balance fairness in multiple worlds with prediction accuracy. |
Tasks | Counterfactual Inference |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7220-when-worlds-collide-integrating-different-counterfactual-assumptions-in-fairness |
http://papers.nips.cc/paper/7220-when-worlds-collide-integrating-different-counterfactual-assumptions-in-fairness.pdf | |
PWC | https://paperswithcode.com/paper/when-worlds-collide-integrating-different |
Repo | |
Framework | |
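The penalized formulation can be sketched in miniature. Everything below is invented for illustration: a toy dataset, a logistic model, two hand-written "worlds" that each map an input to its counterfactual under a flipped protected attribute, and a soft penalty on the counterfactual prediction gap optimized by plain gradient descent. The paper's actual counterfactual construction and constrained optimization are more involved.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Synthetic data: x = (protected attribute a, legitimate feature f).
data = [((a, f), int(f + 0.3 * a > 0.5))
        for a in (0.0, 1.0) for f in (0.0, 0.4, 0.6, 1.0)]

# Two hypothetical causal "worlds": each maps x to the counterfactual
# input had the protected attribute been flipped.
worlds = [
    lambda x: (1 - x[0], x[1]),                         # a affects nothing else
    lambda x: (1 - x[0], x[1] + 0.2 * (1 - 2 * x[0])),  # a also shifts f
]

w = [0.0, 0.0, 0.0]          # weights for a and f, plus bias
LAMBDA, LR = 5.0, 0.05       # fairness penalty strength, learning rate

def predict(x, w):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])

for _ in range(10000):
    grad = [0.0, 0.0, 0.0]
    for x, y in data:
        p = predict(x, w)
        xa = (x[0], x[1], 1.0)                 # input augmented with bias
        for j in range(3):
            grad[j] += (p - y) * xa[j]         # log-loss gradient
        for wf in worlds:
            cf = wf(x)
            q = predict(cf, w)
            ca = (cf[0], cf[1], 1.0)
            g = 2 * LAMBDA * (p - q)           # d/dw of LAMBDA * (p - q)^2
            for j in range(3):
                grad[j] += g * (p * (1 - p) * xa[j] - q * (1 - q) * ca[j])
    for j in range(3):
        w[j] -= LR * grad[j] / len(data)

# Largest counterfactual prediction gap across datapoints and both worlds.
gap = max(abs(predict(x, w) - predict(wf(x), w))
          for x, _ in data for wf in worlds)
print(round(gap, 3))
```

Because the penalty covers both worlds at once, the learned classifier keeps its predictions close to their counterfactuals under either causal model, trading a little accuracy for approximate fairness regardless of which world is correct.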
Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions
Title | Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions |
Authors | Yichen Chen, Dongdong Ge, Mengdi Wang, Zizhuo Wang, Yinyu Ye, Hao Yin |
Abstract | Consider the regularized sparse minimization problem, which involves empirical sums of loss functions for $n$ data points (each of dimension $d$) and a nonconvex sparsity penalty. We prove that finding an $\mathcal{O}(n^{c_1}d^{c_2})$-optimal solution to the regularized sparse optimization problem is strongly NP-hard for any $c_1, c_2\in [0,1)$ such that $c_1+c_2<1$. The result applies to a broad class of loss functions and sparse penalty functions. It suggests that one cannot even approximately solve the sparse optimization problem in polynomial time, unless P $=$ NP. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=726 |
http://proceedings.mlr.press/v70/chen17d/chen17d.pdf | |
PWC | https://paperswithcode.com/paper/strong-np-hardness-for-sparse-optimization |
Repo | |
Framework | |