Paper Group NANR 146
Boosted Fitted Q-Iteration. Enumerating Distinct Decision Trees. Comparison of String Similarity Measures for Obscenity Filtering. An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages. SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling. BE Is Not the Unique Homomorphism That Makes the Partee Tr …
Boosted Fitted Q-Iteration
Title | Boosted Fitted Q-Iteration |
Authors | Samuele Tosatto, Matteo Pirotta, Carlo D’Eramo, Marcello Restelli |
Abstract | This paper studies B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems. B-FQI is an iterative off-line algorithm that, given a dataset of transitions, builds an approximation of the optimal action-value function by summing the approximations of the Bellman residuals across all iterations. The advantage of such an approach w.r.t. other AVI methods is twofold: (1) while keeping the same function space at each iteration, B-FQI can represent more complex functions by considering an additive model; (2) since the Bellman residual decreases as the optimal value function is approached, regression problems become easier as iterations proceed. We study B-FQI both theoretically, also providing a finite-sample error upper bound for it, and empirically, by comparing its performance to that of FQI in different domains and using different regression techniques. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=676 |
http://proceedings.mlr.press/v70/tosatto17a/tosatto17a.pdf | |
PWC | https://paperswithcode.com/paper/boosted-fitted-q-iteration |
Repo | |
Framework | |
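The additive estimator described in the abstract can be sketched on a toy problem. Everything below (the 4-state chain MDP, the constants) is illustrative and not from the paper; with an exact tabular "regressor" the boosted sum of Bellman residuals coincides with value iteration, but the sketch exposes the additive structure Q_K = sum of per-iteration residual fits.

```python
# Illustrative toy: deterministic 4-state chain MDP, states 0..3, actions
# 0 (stay) and 1 (move right); reward 1.0 for entering terminal state 3.
N_STATES, GAMMA, TERMINAL = 4, 0.9, 3

def step(s, a):
    ns = min(s + a, TERMINAL)
    r = 1.0 if ns == TERMINAL and s != TERMINAL else 0.0
    return ns, r

# B-FQI-style additive model: Q_K = sum_k rho_k, where each stage rho_k
# approximates the Bellman residual T Q_{k-1} - Q_{k-1}.  Here the
# "regressor" is an exact table, so the sketch reduces to value
# iteration, but it shows the additive structure of the estimator.
stages = []  # one residual table per iteration: {(s, a): value}

def q(s, a):
    return sum(rho[(s, a)] for rho in stages)

def bellman_target(s, a):
    ns, r = step(s, a)
    if ns == TERMINAL:
        return r
    return r + GAMMA * max(q(ns, b) for b in (0, 1))

for _ in range(30):
    # Fit the current Bellman residual and append it as a new stage.
    residual = {(s, a): bellman_target(s, a) - q(s, a)
                for s in range(N_STATES) for a in (0, 1)}
    stages.append(residual)

greedy = [max((0, 1), key=lambda a: q(s, a)) for s in range(N_STATES - 1)]
print(greedy, round(q(2, 1), 3))  # greedy policy moves right; Q(2,1) = 1.0
```

Because the residuals shrink as Q approaches the fixed point, the later stages contribute less and less, which is the second advantage the abstract highlights.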
Enumerating Distinct Decision Trees
Title | Enumerating Distinct Decision Trees |
Authors | Salvatore Ruggieri |
Abstract | The search space for the feature selection problem in decision tree learning is the lattice of subsets of the available features. We provide an exact enumeration procedure of the subsets that lead to all and only the distinct decision trees. The procedure can be adopted to prune the search space of complete and heuristics search methods in wrapper models for feature selection. Based on this, we design a computational optimization of the sequential backward elimination heuristics with a performance improvement of up to 100X. |
Tasks | Feature Selection |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=538 |
http://proceedings.mlr.press/v70/ruggieri17a/ruggieri17a.pdf | |
PWC | https://paperswithcode.com/paper/enumerating-distinct-decision-trees |
Repo | |
Framework | |
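The observation motivating the paper can be reproduced by brute force on a tiny example: many feature subsets induce the same learned tree, so the lattice of subsets can be pruned. The sketch below is not the paper's enumeration procedure (which avoids building trees for equivalent subsets); it just learns a greedy depth-1 tree for every subset of an invented boolean dataset and counts the distinct results.

```python
from itertools import combinations

# Tiny boolean dataset: feature 2 duplicates feature 0, so many feature
# subsets collapse onto the same learned tree.
X = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
y = [0, 0, 1, 1]
N_FEATURES = 3

def best_stump(features):
    """Greedy depth-1 tree over the given feature subset (illustrative)."""
    best = None
    for f in sorted(features):
        left = [y[i] for i, row in enumerate(X) if row[f] == 0]
        right = [y[i] for i, row in enumerate(X) if row[f] == 1]
        if not left or not right:
            continue
        # Leaf labels: majority class on each side of the split.
        leaf_l = max(set(left), key=left.count)
        leaf_r = max(set(right), key=right.count)
        acc = sum((leaf_l if row[f] == 0 else leaf_r) == y[i]
                  for i, row in enumerate(X))
        if best is None or acc > best[0]:
            best = (acc, f, leaf_l, leaf_r)
    return best[1:]  # canonical form: (split feature, left leaf, right leaf)

subsets = [set(c)
           for r in range(1, N_FEATURES + 1)
           for c in combinations(range(N_FEATURES), r)]
trees = {frozenset(s): best_stump(s) for s in subsets}
print(len(subsets), "subsets,", len(set(trees.values())), "distinct trees")
```

Here 7 non-empty subsets yield only 3 distinct trees; an exact enumeration of the distinct-tree-inducing subsets, as in the paper, would visit only those 3 representatives.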
Comparison of String Similarity Measures for Obscenity Filtering
Title | Comparison of String Similarity Measures for Obscenity Filtering |
Authors | Ekaterina Chernyak |
Abstract | In this paper we address the problem of filtering obscene lexis in Russian texts. We use string similarity measures to find words similar or identical to words from a stop list and establish both a test collection and a baseline for the task. Our experiments show that a novel string similarity measure based on the notion of an annotated suffix tree outperforms some of the other well known measures. |
Tasks | Information Retrieval, Sentiment Analysis, Spelling Correction |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1415/ |
https://www.aclweb.org/anthology/W17-1415 | |
PWC | https://paperswithcode.com/paper/comparison-of-string-similarity-measures-for |
Repo | |
Framework | |
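The annotated-suffix-tree measure is specific to the paper, but the stop-list filtering setup it is compared against can be sketched with one of the well-known baselines: normalized Levenshtein similarity. The stop list, threshold, and example tokens below are placeholders, not the paper's data.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Normalize distance to a [0, 1] similarity score."""
    m = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / m if m else 1.0

# Placeholder stop list and threshold; a real filter would use a curated
# list of obscene lexis and a threshold tuned on labeled data.
STOP_LIST = ["badword", "nasty"]
THRESHOLD = 0.8

def is_obscene(token: str) -> bool:
    return any(similarity(token.lower(), w) >= THRESHOLD for w in STOP_LIST)

print(is_obscene("badw0rd"))  # obfuscated variant is still caught
print(is_obscene("hello"))
```

String similarity (rather than exact matching) is what lets the filter catch deliberately obfuscated spellings, which is the core difficulty the paper's measures are evaluated on.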
An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages
Title | An Extensive Empirical Evaluation of Character-Based Morphological Tagging for 14 Languages |
Authors | Georg Heigold, Guenter Neumann, Josef van Genabith |
Abstract | This paper investigates neural character-based morphological tagging for languages with complex morphology and large tag sets. Character-based approaches are attractive as they can handle rare and unseen words gracefully. We evaluate on 14 languages and observe consistent gains over a state-of-the-art morphological tagger across all languages except for English and French, where we match the state of the art. We compare two architectures for computing character-based word vectors using recurrent (RNN) and convolutional (CNN) nets. We show that the CNN-based approach performs slightly worse and less consistently than the RNN-based approach. Small but systematic gains are observed when combining the two architectures by ensembling. |
Tasks | Language Modelling, Machine Translation, Morphological Analysis, Morphological Tagging, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1048/ |
https://www.aclweb.org/anthology/E17-1048 | |
PWC | https://paperswithcode.com/paper/an-extensive-empirical-evaluation-of |
Repo | |
Framework | |
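The CNN variant of the character-based word vectors compared in the paper can be sketched without a deep-learning framework: embed each character, slide fixed-width filters over the embeddings, and max-pool over positions to get a word vector of fixed size regardless of word length. The weights below are random (in the paper they are trained end-to-end with the tagger), and all dimensions are illustrative.

```python
import random

random.seed(0)

EMB_DIM, N_FILTERS, WIDTH = 4, 6, 3
ALPHABET = "abcdefghijklmnopqrstuvwxyz#"   # '#' pads word boundaries

# Random character embeddings and convolution filters (training omitted).
char_emb = {c: [random.gauss(0, 1) for _ in range(EMB_DIM)] for c in ALPHABET}
filters = [[random.gauss(0, 1) for _ in range(WIDTH * EMB_DIM)]
           for _ in range(N_FILTERS)]

def word_vector(word):
    """Character CNN: convolve width-3 filters over character embeddings,
    then max-pool over positions -> fixed-size word representation."""
    padded = "#" * (WIDTH - 1) + word.lower() + "#" * (WIDTH - 1)
    embs = [char_emb.get(c, char_emb["#"]) for c in padded]
    vec = []
    for f in filters:
        acts = []
        for i in range(len(embs) - WIDTH + 1):
            window = [x for e in embs[i:i + WIDTH] for x in e]
            acts.append(sum(w * x for w, x in zip(f, window)))
        vec.append(max(acts))   # max-pool over window positions
    return vec

v1, v2 = word_vector("walk"), word_vector("walking")
print(len(v1), len(v2))   # same dimensionality regardless of word length
```

The fixed-size output is what allows these vectors to replace a word-embedding lookup in the tagger, which is why rare and unseen words are handled gracefully.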
SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling
Title | SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling |
Authors | Jun-ichiro Hirayama, Aapo Hyvärinen, Motoaki Kawanabe |
Abstract | We present a novel probabilistic framework for a hierarchical extension of independent component analysis (ICA), with a particular motivation in neuroscientific data analysis and modeling. The framework incorporates a general subspace pooling with linear ICA-like layers stacked recursively. Unlike related previous models, our generative model is fully tractable: both the likelihood and the posterior estimates of latent variables can readily be computed with analytically simple formulae. The model is particularly simple in the case of complex-valued data since the pooling can be reduced to taking the modulus of complex numbers. Experiments on electroencephalography (EEG) and natural images demonstrate the validity of the method. |
Tasks | EEG |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=691 |
http://proceedings.mlr.press/v70/hirayama17a/hirayama17a.pdf | |
PWC | https://paperswithcode.com/paper/splice-fully-tractable-hierarchical-extension |
Repo | |
Framework | |
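The complex-valued special case mentioned in the abstract, where pooling reduces to taking the modulus of complex numbers, is easy to illustrate. The toy "sources" below are ours: complex numbers with fixed amplitude and random phase, so the pooled representation recovers the amplitude and discards the phase (the kind of invariance subspace pooling is meant to provide).

```python
import cmath
import random

random.seed(1)

def pool(z: complex) -> float:
    """Subspace pooling of a size-2 subspace (re, im): the modulus."""
    return abs(z)   # sqrt(re^2 + im^2)

# Toy sources: fixed amplitudes, random phases.
amplitudes = [1.0, 2.0, 3.0]
sources = [a * cmath.exp(1j * random.uniform(0, 2 * cmath.pi))
           for a in amplitudes]

pooled = [pool(s) for s in sources]
print([round(p, 6) for p in pooled])   # amplitudes recovered, phase dropped
```

In the full model such a pooling layer sits on top of an ICA-like linear layer, stacked recursively; tractability comes from the fact that both the likelihood and posterior estimates remain in closed form through these layers.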
BE Is Not the Unique Homomorphism That Makes the Partee Triangle Commute
Title | BE Is Not the Unique Homomorphism That Makes the Partee Triangle Commute |
Authors | Junri Shimada |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/W17-3401/ |
https://www.aclweb.org/anthology/W17-3401 | |
PWC | https://paperswithcode.com/paper/be-is-not-the-unique-homomorphism-that-makes |
Repo | |
Framework | |
Unsupervised Bilingual Segmentation using MDL for Machine Translation
Title | Unsupervised Bilingual Segmentation using MDL for Machine Translation |
Authors | Bin Shan, Hao Wang, Yves Lepage |
Abstract | |
Tasks | Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1015/ |
https://www.aclweb.org/anthology/Y17-1015 | |
PWC | https://paperswithcode.com/paper/unsupervised-bilingual-segmentation-using-mdl |
Repo | |
Framework | |
Latent-Variable PCFGs: Background and Applications
Title | Latent-Variable PCFGs: Background and Applications |
Authors | Shay Cohen |
Abstract | |
Tasks | Machine Translation, Question Answering, Text Generation |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/W17-3405/ |
https://www.aclweb.org/anthology/W17-3405 | |
PWC | https://paperswithcode.com/paper/latent-variable-pcfgs-background-and |
Repo | |
Framework | |
Evaluating Bayesian Models with Posterior Dispersion Indices
Title | Evaluating Bayesian Models with Posterior Dispersion Indices |
Authors | Alp Kucukelbir, Yixin Wang, David M. Blei |
Abstract | Probabilistic modeling is cyclical: we specify a model, infer its posterior, and evaluate its performance. Evaluation drives the cycle, as we revise our model based on how it performs. This requires a metric. Traditionally, predictive accuracy prevails. Yet, predictive accuracy does not tell the whole story. We propose to evaluate a model through posterior dispersion. The idea is to analyze how each datapoint fares in relation to posterior uncertainty around the hidden structure. This highlights datapoints the model struggles to explain and provides complementary insight to datapoints with low predictive accuracy. We present a family of posterior dispersion indices (PDI) that capture this idea. We show how a PDI identifies patterns of model mismatch in three real data examples: voting preferences, supermarket shopping, and population genetics. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=514 |
http://proceedings.mlr.press/v70/kucukelbir17a/kucukelbir17a.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-bayesian-models-with-posterior |
Repo | |
Framework | |
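The per-datapoint idea can be sketched concretely. The paper defines a family of dispersion indices; the particular choice below (variance of each datapoint's log-likelihood across posterior samples) is ours, for illustration, and the "posterior samples" are simulated rather than produced by real inference.

```python
import math
import random
import statistics

random.seed(0)

# Toy setup: data mostly near 0 plus one outlier; stand-in posterior
# samples for the Gaussian mean (a real analysis would use actual
# posterior draws from inference).
data = [0.1, -0.2, 0.05, 0.3, 5.0]          # 5.0 is the odd one out
post_mu = [random.gauss(0.05, 0.3) for _ in range(2000)]

def loglik(x, mu, sigma=1.0):
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))

# One simple dispersion index per datapoint: the variance of its
# log-likelihood across posterior samples.
pdi = [statistics.pvariance([loglik(x, mu) for mu in post_mu]) for x in data]

worst = max(range(len(data)), key=lambda i: pdi[i])
print(data[worst])   # the outlier has the largest posterior dispersion
```

The outlier's likelihood swings far more across plausible posterior values of the mean than the well-explained points' do, which is exactly the "struggles to explain" signal that predictive accuracy alone can miss.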
Adapting SimpleNLG to Spanish
Title | Adapting SimpleNLG to Spanish |
Authors | Alejandro Ramos-Soto, Julio Janeiro-Gallardo, Alberto Bugarín Diz |
Abstract | We describe SimpleNLG-ES, an adaptation of the SimpleNLG realization library for the Spanish language. Our implementation is based on the bilingual English-French SimpleNLG-EnFr adaptation. The library has been tested using a battery of examples that ensure that the most common syntax, morphology and orthography rules for Spanish are met. The library is currently being used in three different projects for the development of data-to-text systems in the meteorological, statistical data information, and business intelligence application domains. |
Tasks | Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3521/ |
https://www.aclweb.org/anthology/W17-3521 | |
PWC | https://paperswithcode.com/paper/adapting-simplenlg-to-spanish |
Repo | |
Framework | |
Latent Feature Lasso
Title | Latent Feature Lasso |
Authors | Ian En-Hsu Yen, Wei-Cheng Lee, Sung-En Chang, Arun Sai Suggala, Shou-De Lin, Pradeep Ravikumar |
Abstract | The latent feature model (LFM), proposed in (Griffiths & Ghahramani, 2005), but possibly with earlier origins, is a generalization of a mixture model, where each instance is generated not from a single latent class but from a combination of latent features. Thus, each instance has an associated latent binary feature incidence vector indicating the presence or absence of a feature. Due to its combinatorial nature, inference in LFMs is computationally intractable, and accordingly, most of the attention has focused on nonparametric LFMs, with priors such as the Indian Buffet Process (IBP) on infinite binary matrices. Recent efforts to tackle this complexity either still have computational complexity that is exponential, or sample complexity that is high-order polynomial w.r.t. the number of latent features. In this paper, we address this outstanding problem of tractable estimation of LFMs via a novel atomic-norm regularization, which gives an algorithm with polynomial run-time and sample complexity without impractical assumptions on the data distribution. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=867 |
http://proceedings.mlr.press/v70/yen17a/yen17a.pdf | |
PWC | https://paperswithcode.com/paper/latent-feature-lasso |
Repo | |
Framework | |
SRHR at SemEval-2017 Task 6: Word Associations for Humour Recognition
Title | SRHR at SemEval-2017 Task 6: Word Associations for Humour Recognition |
Authors | Andrew Cattle, Xiaojuan Ma |
Abstract | This paper explores the role of semantic relatedness features, such as word associations, in humour recognition. Specifically, we examine the task of inferring pairwise humour judgments in Twitter hashtag wars. We examine a variety of word association features derived from University of Southern Florida Free Association Norms (USF) and the Edinburgh Associative Thesaurus (EAT) and find that word association-based features outperform Word2Vec similarity, a popular semantic relatedness measure. Our system achieves an accuracy of 56.42% using a combination of unigram perplexity, bigram perplexity, EAT difference (tweet-avg), USF forward (max), EAT difference (word-avg), USF difference (word-avg), EAT forward (min), USF difference (tweet-max), and EAT backward (min). |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2067/ |
https://www.aclweb.org/anthology/S17-2067 | |
PWC | https://paperswithcode.com/paper/srhr-at-semeval-2017-task-6-word-associations |
Repo | |
Framework | |
Entity-Centric Information Access with Human in the Loop for the Biomedical Domain
Title | Entity-Centric Information Access with Human in the Loop for the Biomedical Domain |
Authors | Seid Muhie Yimam, Steffen Remus, Alexander Panchenko, Andreas Holzinger, Chris Biemann |
Abstract | In this paper, we describe the concept of entity-centric information access for the biomedical domain. With entity recognition technologies approaching acceptable levels of accuracy, we put forward a paradigm of document browsing and searching where the entities of the domain and their relations are explicitly modeled to provide users the possibility of collecting exhaustive information on relations of interest. We describe three working prototypes along these lines: NEW/S/LEAK, which was developed for investigative journalists who need a quick overview of large leaked document collections; STORYFINDER, which is a personalized organizer for information found in web pages that allows adding entities as well as relations, and is capable of personalized information management; and adaptive annotation capabilities of WEBANNO, which is a general-purpose linguistic annotation tool. We will discuss future steps towards the adaptation of these tools to biomedical data, which is subject to a recently started project on biomedical knowledge acquisition. A key difference to other approaches is the centering around the user in a Human-in-the-Loop machine learning approach, where users define and extend categories and enable the system to improve via feedback and interaction. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-8006/ |
https://doi.org/10.26615/978-954-452-044-1_006 | |
PWC | https://paperswithcode.com/paper/entity-centric-information-access-with-human |
Repo | |
Framework | |
When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness
Title | When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness |
Authors | Chris Russell, Matt J. Kusner, Joshua Loftus, Ricardo Silva |
Abstract | Machine learning is now being used to make crucial decisions about people’s lives. For nearly all of these decisions there is a risk that individuals of a certain race, gender, sexual orientation, or any other subpopulation are unfairly discriminated against. Our recent method has demonstrated how to use techniques from counterfactual inference to make predictions fair across different subpopulations. This method requires that one provides the causal model that generated the data at hand. In general, validating all causal implications of the model is not possible without further assumptions. Hence, it is desirable to integrate competing causal models to provide counterfactually fair decisions, regardless of which causal “world” is the correct one. In this paper, we show how it is possible to make predictions that are approximately fair with respect to multiple possible causal models at once, thus mitigating the problem of exact causal specification. We frame the goal of learning a fair classifier as an optimization problem with fairness constraints entailed by competing causal explanations. We show how this optimization problem can be efficiently solved using gradient-based methods. We demonstrate the flexibility of our model on two real-world fair classification problems. We show that our model can seamlessly balance fairness in multiple worlds with prediction accuracy. |
Tasks | Counterfactual Inference |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7220-when-worlds-collide-integrating-different-counterfactual-assumptions-in-fairness |
http://papers.nips.cc/paper/7220-when-worlds-collide-integrating-different-counterfactual-assumptions-in-fairness.pdf | |
PWC | https://paperswithcode.com/paper/when-worlds-collide-integrating-different |
Repo | |
Framework | |
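The penalized formulation can be sketched in miniature. Everything below is invented for illustration: a toy dataset, a logistic model, two hand-written "worlds" that each map an input to its counterfactual under a flipped protected attribute, and a soft penalty on the counterfactual prediction gap optimized by plain gradient descent. The paper's actual counterfactual construction and constrained optimization are more involved.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Synthetic data: x = (protected attribute a, legitimate feature f).
data = [((a, f), int(f + 0.3 * a > 0.5))
        for a in (0.0, 1.0) for f in (0.0, 0.4, 0.6, 1.0)]

# Two hypothetical causal "worlds": each maps x to the counterfactual
# input had the protected attribute been flipped.
worlds = [
    lambda x: (1 - x[0], x[1]),                         # a affects nothing else
    lambda x: (1 - x[0], x[1] + 0.2 * (1 - 2 * x[0])),  # a also shifts f
]

w = [0.0, 0.0, 0.0]          # weights for a and f, plus bias
LAMBDA, LR = 5.0, 0.05       # fairness penalty strength, learning rate

def predict(x, w):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])

for _ in range(10000):
    grad = [0.0, 0.0, 0.0]
    for x, y in data:
        p = predict(x, w)
        xa = (x[0], x[1], 1.0)                 # input augmented with bias
        for j in range(3):
            grad[j] += (p - y) * xa[j]         # log-loss gradient
        for wf in worlds:
            cf = wf(x)
            q = predict(cf, w)
            ca = (cf[0], cf[1], 1.0)
            g = 2 * LAMBDA * (p - q)           # d/dw of LAMBDA * (p - q)^2
            for j in range(3):
                grad[j] += g * (p * (1 - p) * xa[j] - q * (1 - q) * ca[j])
    for j in range(3):
        w[j] -= LR * grad[j] / len(data)

# Largest counterfactual prediction gap across datapoints and both worlds.
gap = max(abs(predict(x, w) - predict(wf(x), w))
          for x, _ in data for wf in worlds)
print(round(gap, 3))
```

Because the penalty covers both worlds at once, the learned classifier keeps its predictions close to their counterfactuals under either causal model, trading a little accuracy for approximate fairness regardless of which world is correct.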
Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions
Title | Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions |
Authors | Yichen Chen, Dongdong Ge, Mengdi Wang, Zizhuo Wang, Yinyu Ye, Hao Yin |
Abstract | Consider the regularized sparse minimization problem, which involves empirical sums of loss functions for $n$ data points (each of dimension $d$) and a nonconvex sparsity penalty. We prove that finding an $\mathcal{O}(n^{c_1}d^{c_2})$-optimal solution to the regularized sparse optimization problem is strongly NP-hard for any $c_1, c_2\in [0,1)$ such that $c_1+c_2<1$. The result applies to a broad class of loss functions and sparse penalty functions. It suggests that one cannot even approximately solve the sparse optimization problem in polynomial time, unless P $=$ NP. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=726 |
http://proceedings.mlr.press/v70/chen17d/chen17d.pdf | |
PWC | https://paperswithcode.com/paper/strong-np-hardness-for-sparse-optimization |
Repo | |
Framework | |