Paper Group NANR 11
Joint quantile regression in vector-valued RKHSs. How does Dictionary Size Influence Performance of Vietnamese Word Segmentation?. Linear Relaxations for Finding Diverse Elements in Metric Spaces. Vectors or Graphs? On Differences of Representations for Distributional Semantic Models. SemEval-2016 Task 10: Detecting Minimal Semantic Units and their …
Joint quantile regression in vector-valued RKHSs
Title | Joint quantile regression in vector-valued RKHSs |
Authors | Maxime Sangnier, Olivier Fercoq, Florence D’Alché-Buc |
Abstract | To give a more complete picture than the single average relationship provided by standard regression, a novel framework for simultaneously estimating and predicting several conditional quantiles is introduced. The proposed methodology leverages kernel-based multi-task learning to curb the embarrassing phenomenon of quantile crossing, with a one-step estimation procedure and no post-processing. Moreover, this framework comes with theoretical guarantees and an efficient coordinate descent learning algorithm. Numerical experiments on benchmark and real datasets highlight the enhancements of our approach in terms of prediction error, crossing occurrences and training time. |
Tasks | Multi-Task Learning |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6239-joint-quantile-regression-in-vector-valued-rkhss |
http://papers.nips.cc/paper/6239-joint-quantile-regression-in-vector-valued-rkhss.pdf | |
PWC | https://paperswithcode.com/paper/joint-quantile-regression-in-vector-valued |
Repo | |
Framework | |
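The quantile crossing the abstract refers to can be made concrete with a short sketch. This is not the paper's vector-valued RKHS estimator; the pinball loss and the crossing-rate check below are standard tools, and the function names are my own:

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Standard pinball (quantile) loss at level tau: the quantity each
    conditional quantile estimate minimizes in expectation."""
    diff = y - q
    return float(np.mean(np.maximum(tau * diff, (tau - 1) * diff)))

def crossing_rate(quantile_preds):
    """Fraction of inputs where predicted quantile curves cross, i.e. a
    higher quantile level receives a lower predicted value. The paper's
    joint multi-task estimator is designed to keep this near zero without
    post-processing."""
    q = np.asarray(quantile_preds)  # shape (n_levels, n_points), levels ascending
    return float(np.mean(np.any(np.diff(q, axis=0) < 0, axis=0)))
```

Fitting each quantile independently minimizes the pinball loss per level but gives no guarantee that `crossing_rate` stays at zero, which is the motivation for estimating all levels jointly.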
How does Dictionary Size Influence Performance of Vietnamese Word Segmentation?
Title | How does Dictionary Size Influence Performance of Vietnamese Word Segmentation? |
Authors | Wuying Liu, Lin Wang |
Abstract | Vietnamese word segmentation (VWS) is a challenging basic issue for natural language processing. This paper addresses the question of how dictionary size influences VWS performance, proposes two novel measures, the square overlap ratio (SOR) and the relaxed square overlap ratio (RSOR), and validates their effectiveness. The SOR measure is the product of the dictionary overlap ratio and the corpus overlap ratio, and the RSOR measure is a relaxed version of SOR for the unsupervised setting. Both measures indicate how well a segmentation dictionary suits the corpus awaiting segmentation. The experimental results show that a suitably sized dictionary, neither too small nor too large, is what allows dictionary-based Vietnamese word segmenters to achieve state-of-the-art performance. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1172/ |
https://www.aclweb.org/anthology/L16-1172 | |
PWC | https://paperswithcode.com/paper/how-does-dictionary-size-influence |
Repo | |
Framework | |
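The abstract defines SOR only as the product of a dictionary overlap ratio and a corpus overlap ratio. A minimal sketch under that definition follows; the exact denominators of the two ratios (dictionary size and corpus vocabulary size) are my assumptions, not taken from the paper:

```python
def square_overlap_ratio(dictionary, corpus_vocab):
    """One plausible reading of the SOR measure: the product of a
    dictionary overlap ratio (shared entries / dictionary size) and a
    corpus overlap ratio (shared entries / corpus vocabulary size).
    The ratio denominators are assumptions, not the paper's definitions."""
    d, c = set(dictionary), set(corpus_vocab)
    shared = d & c
    dict_overlap = len(shared) / len(d)
    corpus_overlap = len(shared) / len(c)
    return dict_overlap * corpus_overlap
```

Under this reading the measure penalizes both an oversized dictionary (low dictionary overlap ratio) and an undersized one (low corpus overlap ratio), matching the "neither too small nor too large" finding.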
Linear Relaxations for Finding Diverse Elements in Metric Spaces
Title | Linear Relaxations for Finding Diverse Elements in Metric Spaces |
Authors | Aditya Bhaskara, Mehrdad Ghadiri, Vahab Mirrokni, Ola Svensson |
Abstract | Choosing a diverse subset of a large collection of points in a metric space is a fundamental problem, with applications in feature selection, recommender systems, web search, data summarization, etc. Various notions of diversity have been proposed, tailored to different applications. The general algorithmic goal is to find a subset of points that maximize diversity, while obeying a cardinality (or more generally, matroid) constraint. The goal of this paper is to develop a novel linear programming (LP) framework that allows us to design approximation algorithms for such problems. We study an objective known as {\em sum-min} diversity, which is known to be effective in many applications, and give the first constant factor approximation algorithm. Our LP framework allows us to easily incorporate additional constraints, as well as secondary objectives. We also prove a hardness result for two natural diversity objectives, under the so-called {\em planted clique} assumption. Finally, we study the empirical performance of our algorithm on several standard datasets. We first study the approximation quality of the algorithm by comparing with the LP objective. Then, we compare the quality of the solutions produced by our method with other popular diversity maximization algorithms. |
Tasks | Data Summarization, Feature Selection, Recommendation Systems |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6500-linear-relaxations-for-finding-diverse-elements-in-metric-spaces |
http://papers.nips.cc/paper/6500-linear-relaxations-for-finding-diverse-elements-in-metric-spaces.pdf | |
PWC | https://paperswithcode.com/paper/linear-relaxations-for-finding-diverse |
Repo | |
Framework | |
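The sum-min objective studied in the paper has a simple form: each selected point contributes its distance to the nearest other selected point. A sketch that evaluates the objective for a candidate subset (evaluation only; the paper's contribution is the LP relaxation that *selects* a subset with a constant-factor guarantee):

```python
def sum_min_diversity(points, dist):
    """Sum-min diversity of a selected subset: each point contributes its
    distance to the nearest *other* selected point. Maximizing this sum
    spreads the subset out in the metric space."""
    n = len(points)
    return sum(
        min(dist(points[i], points[j]) for j in range(n) if j != i)
        for i in range(n)
    )
```

For instance, on the one-dimensional subset {0, 1, 3} with absolute-difference distance, the nearest-neighbor distances are 1, 1 and 2, giving an objective value of 4.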
Vectors or Graphs? On Differences of Representations for Distributional Semantic Models
Title | Vectors or Graphs? On Differences of Representations for Distributional Semantic Models |
Authors | Chris Biemann |
Abstract | Distributional Semantic Models (DSMs) have recently received increased attention, together with the rise of neural architectures for scalable training of dense vector embeddings. While some of the literature even includes terms like {`}vectors{'} and {`}dimensionality{'} in the definition of DSMs, there are some good reasons why we should consider alternative formulations of distributional models. As an instance, I present a scalable graph-based solution to distributional semantics. The model belongs to the family of {`}count-based{'} DSMs, keeps its representation sparse and explicit, and is thus fully interpretable. I will highlight some important differences between sparse graph-based and dense vector approaches to DSMs: while dense vector-based models are computationally easier to handle and provide a nice uniform representation that can be compared and combined in many ways, they lack interpretability, provenance and robustness. On the other hand, graph-based sparse models have a more straightforward interpretation, handle sense distinctions more naturally and can straightforwardly be linked to knowledge bases, while lacking the ability to compare arbitrary lexical units and a compositionality operation. Since both representations have their merits, I opt for exploring their combination in the outlook. |
Tasks | Information Retrieval |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5301/ |
https://www.aclweb.org/anthology/W16-5301 | |
PWC | https://paperswithcode.com/paper/vectors-or-graphs-on-differences-of |
Repo | |
Framework | |
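The sparse-versus-dense contrast the abstract draws can be made tangible with a minimal count-based co-occurrence model. This is a generic sketch, not the author's graph construction:

```python
from collections import Counter, defaultdict

def count_based_dsm(tokens, window=2):
    """Minimal count-based DSM: each word maps to an explicit Counter of
    co-occurring words within a symmetric window. Every entry is a directly
    interpretable (word, context, count) triple -- the property the paper's
    sparse graph-based models preserve and dense embeddings give up."""
    model = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                model[word][tokens[j]] += 1
    return model
```

Every nonzero entry here has a provenance (the exact co-occurrences that produced it), whereas a dense embedding dimension has no such reading, which is the interpretability trade-off the abstract describes.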
SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)
Title | SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM) |
Authors | Nathan Schneider, Dirk Hovy, Anders Johannsen, Marine Carpuat |
Abstract | |
Tasks | Part-Of-Speech Tagging, Word Sense Disambiguation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1084/ |
https://www.aclweb.org/anthology/S16-1084 | |
PWC | https://paperswithcode.com/paper/semeval-2016-task-10-detecting-minimal |
Repo | |
Framework | |
Automatic Extraction of Learner Errors in ESL Sentences Using Linguistically Enhanced Alignments
Title | Automatic Extraction of Learner Errors in ESL Sentences Using Linguistically Enhanced Alignments |
Authors | Mariano Felice, Christopher Bryant, Ted Briscoe |
Abstract | We propose a new method of automatically extracting learner errors from parallel English as a Second Language (ESL) sentences in an effort to regularise annotation formats and reduce inconsistencies. Specifically, given an original and corrected sentence, our method first uses a linguistically enhanced alignment algorithm to determine the most likely mappings between tokens, and secondly employs a rule-based function to decide which alignments should be merged. Our method beats all previous approaches on the tested datasets, achieving state-of-the-art results for automatic error extraction. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1079/ |
https://www.aclweb.org/anthology/C16-1079 | |
PWC | https://paperswithcode.com/paper/automatic-extraction-of-learner-errors-in-esl |
Repo | |
Framework | |
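The extract-edits-from-a-sentence-pair step described in the abstract can be sketched with a generic token aligner. `difflib`'s matcher stands in for the paper's linguistically enhanced alignment algorithm and rule-based merging:

```python
import difflib

def extract_edits(orig_tokens, corr_tokens):
    """Token-level error extraction via sequence alignment: align the
    original and corrected sentences, then keep only the non-identical
    spans as (operation, original span, corrected span) edits."""
    sm = difflib.SequenceMatcher(a=orig_tokens, b=corr_tokens, autojunk=False)
    return [
        (op, orig_tokens[i1:i2], corr_tokens[j1:j2])
        for op, i1, i2, j1, j2 in sm.get_opcodes()
        if op != "equal"
    ]
```

For example, aligning "I has a cat" against "I have a cat" yields a single replace edit over "has"/"have"; the paper's contribution lies in making such alignments linguistically informed and in merging adjacent operations consistently.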
Incremental Variational Sparse Gaussian Process Regression
Title | Incremental Variational Sparse Gaussian Process Regression |
Authors | Ching-An Cheng, Byron Boots |
Abstract | Recent work on scaling up Gaussian process regression (GPR) to large datasets has primarily focused on sparse GPR, which leverages a small set of basis functions to approximate the full Gaussian process during inference. However, the majority of these approaches are batch methods that operate on the entire training dataset at once, precluding the use of datasets that are streaming or too large to fit into memory. Although previous work has considered incrementally solving variational sparse GPR, most algorithms fail to update the basis functions and therefore perform suboptimally. We propose a novel incremental learning algorithm for variational sparse GPR based on stochastic mirror ascent of probability densities in reproducing kernel Hilbert space. This new formulation allows our algorithm to update basis functions online in accordance with the manifold structure of probability densities for fast convergence. We conduct several experiments and show that our proposed approach achieves better empirical performance in terms of prediction error than the recent state-of-the-art incremental solutions to variational sparse GPR. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6473-incremental-variational-sparse-gaussian-process-regression |
http://papers.nips.cc/paper/6473-incremental-variational-sparse-gaussian-process-regression.pdf | |
PWC | https://paperswithcode.com/paper/incremental-variational-sparse-gaussian |
Repo | |
Framework | |
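For context, here is a sketch of the kind of batch sparse-GP baseline the abstract says incremental variational methods improve on: a Subset-of-Regressors predictive mean with a fixed set of inducing inputs. This is not the paper's mirror-ascent algorithm; the kernel choice and names are my own:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def sor_predictive_mean(X, y, Z, X_star, noise_var=0.05):
    """Batch Subset-of-Regressors sparse GP predictive mean with inducing
    inputs Z. The basis (Z) is fixed up front and the whole dataset is
    processed at once -- exactly the limitations the paper's incremental
    variational method addresses by updating the basis online."""
    Kzz = rbf_kernel(Z, Z)
    Kzx = rbf_kernel(Z, X)
    Ksz = rbf_kernel(X_star, Z)
    A = Kzz + Kzx @ Kzx.T / noise_var + 1e-8 * np.eye(len(Z))
    w = np.linalg.solve(A, Kzx @ y / noise_var)
    return Ksz @ w
```

With `Z` equal to the full training set this reduces to the exact GP posterior mean; sparsity comes from choosing far fewer inducing inputs than data points, and the incremental setting must additionally revise `Z` as data stream in.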
Adapting Event Embedding for Implicit Discourse Relation Recognition
Title | Adapting Event Embedding for Implicit Discourse Relation Recognition |
Authors | Maria Leonor Pacheco, I-Ta Lee, Xiao Zhang, Abdullah Khan Zehady, Pranjal Daga, Di Jin, Ayush Parolia, Dan Goldwasser |
Abstract | |
Tasks | Feature Engineering |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-2019/ |
https://www.aclweb.org/anthology/K16-2019 | |
PWC | https://paperswithcode.com/paper/adapting-event-embedding-for-implicit |
Repo | |
Framework | |
Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity.
Title | Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity. |
Authors | Barbara Rychalska, Katarzyna Pakulska, Krystyna Chodorowska, Wojciech Walczak, Piotr Andruszkiewicz |
Abstract | |
Tasks | Machine Translation, Question Answering, Semantic Similarity, Semantic Textual Similarity |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1091/ |
https://www.aclweb.org/anthology/S16-1091 | |
PWC | https://paperswithcode.com/paper/samsung-poland-nlp-team-at-semeval-2016-task |
Repo | |
Framework | |
Leveraging Captions in the Wild to Improve Object Detection
Title | Leveraging Captions in the Wild to Improve Object Detection |
Authors | Mert Kilickaya, Nazli Ikizler-Cinbis, Erkut Erdem, Aykut Erdem |
Abstract | |
Tasks | Object Detection |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-3204/ |
https://www.aclweb.org/anthology/W16-3204 | |
PWC | https://paperswithcode.com/paper/leveraging-captions-in-the-wild-to-improve |
Repo | |
Framework | |
Generalized Correspondence-LDA Models (GC-LDA) for Identifying Functional Regions in the Brain
Title | Generalized Correspondence-LDA Models (GC-LDA) for Identifying Functional Regions in the Brain |
Authors | Timothy Rubin, Oluwasanmi O. Koyejo, Michael N. Jones, Tal Yarkoni |
Abstract | This paper presents Generalized Correspondence-LDA (GC-LDA), a generalization of the Correspondence-LDA model that allows for variable spatial representations to be associated with topics, and increased flexibility in terms of the strength of the correspondence between data types induced by the model. We present three variants of GC-LDA, each of which associates topics with a different spatial representation, and apply them to a corpus of neuroimaging data. In the context of this dataset, each topic corresponds to a functional brain region, where the region’s spatial extent is captured by a probability distribution over neural activity, and the region’s cognitive function is captured by a probability distribution over linguistic terms. We illustrate the qualitative improvements offered by GC-LDA in terms of the types of topics extracted with alternative spatial representations, as well as the model’s ability to incorporate a-priori knowledge from the neuroimaging literature. We furthermore demonstrate that the novel features of GC-LDA improve predictions for missing data. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6274-generalized-correspondence-lda-models-gc-lda-for-identifying-functional-regions-in-the-brain |
http://papers.nips.cc/paper/6274-generalized-correspondence-lda-models-gc-lda-for-identifying-functional-regions-in-the-brain.pdf | |
PWC | https://paperswithcode.com/paper/generalized-correspondence-lda-models-gc-lda |
Repo | |
Framework | |
SAARSHEFF at SemEval-2016 Task 1: Semantic Textual Similarity with Machine Translation Evaluation Metrics and (eXtreme) Boosted Tree Ensembles
Title | SAARSHEFF at SemEval-2016 Task 1: Semantic Textual Similarity with Machine Translation Evaluation Metrics and (eXtreme) Boosted Tree Ensembles |
Authors | Liling Tan, Carolina Scarton, Lucia Specia, Josef van Genabith |
Abstract | |
Tasks | Machine Translation, Semantic Textual Similarity |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1095/ |
https://www.aclweb.org/anthology/S16-1095 | |
PWC | https://paperswithcode.com/paper/saarsheff-at-semeval-2016-task-1-semantic |
Repo | |
Framework | |
Recognizing Salient Entities in Shopping Queries
Title | Recognizing Salient Entities in Shopping Queries |
Authors | Zornitsa Kozareva, Qi Li, Ke Zhai, Weiwei Guo |
Abstract | |
Tasks | Feature Engineering, Structured Prediction, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2018/ |
https://www.aclweb.org/anthology/P16-2018 | |
PWC | https://paperswithcode.com/paper/recognizing-salient-entities-in-shopping |
Repo | |
Framework | |
Towards a resource based on users’ knowledge to overcome the Tip of the Tongue problem.
Title | Towards a resource based on users’ knowledge to overcome the Tip of the Tongue problem. |
Authors | Michael Zock, Chris Biemann |
Abstract | Language production is largely a matter of words which, in the case of access problems, can be searched for in an external resource (lexicon, thesaurus). In this kind of dialogue the user provides the momentarily available knowledge concerning the target and the system responds with the best guess(es) it can make given this input. As tip-of-the-tongue (ToT) studies have shown, people always have some knowledge concerning the target (meaning fragments, number of syllables, …) even if its complete form eludes them. We will show here how to tap into this knowledge to build a resource likely to help authors (speakers/writers) overcome the ToT problem. Yet, before doing so we need a better understanding of the various kinds of knowledge people have when looking for a word. To this end, we asked crowdworkers to provide some cues describing a given target and to then specify how each cue relates to the target, in the hope that this could help others find the elusive word. Next, we checked how well a given search strategy worked when applied to differently built lexical networks. The results showed quite dramatic differences, which is not really surprising: different networks are built for different purposes, hence each is more or less suited to a given task. What was more surprising, though, is that the relational information given by the users did not allow us to find the elusive word in WordNet any better than without it. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5308/ |
https://www.aclweb.org/anthology/W16-5308 | |
PWC | https://paperswithcode.com/paper/towards-a-resource-based-on-users-knowledge |
Repo | |
Framework | |
Exploring Different Preposition Sets, Models and Feature Sets in Automatic Generation of Spatial Image Descriptions
Title | Exploring Different Preposition Sets, Models and Feature Sets in Automatic Generation of Spatial Image Descriptions |
Authors | Anja Belz, Adrian Muscat, Brandon Birmingham |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-3209/ |
https://www.aclweb.org/anthology/W16-3209 | |
PWC | https://paperswithcode.com/paper/exploring-different-preposition-sets-models |
Repo | |
Framework | |