May 5, 2019

1989 words 10 mins read

Paper Group NANR 11

Joint quantile regression in vector-valued RKHSs. How does Dictionary Size Influence Performance of Vietnamese Word Segmentation?. Linear Relaxations for Finding Diverse Elements in Metric Spaces. Vectors or Graphs? On Differences of Representations for Distributional Semantic Models. SemEval-2016 Task 10: Detecting Minimal Semantic Units and their …

Joint quantile regression in vector-valued RKHSs

Title Joint quantile regression in vector-valued RKHSs
Authors Maxime Sangnier, Olivier Fercoq, Florence D’Alché-Buc
Abstract Motivated by the goal of giving a more complete picture than the average relationship provided by standard regression, a novel framework for estimating and predicting several conditional quantiles simultaneously is introduced. The proposed methodology leverages kernel-based multi-task learning to curb the undesirable phenomenon of quantile crossing, with a one-step estimation procedure and no post-processing. Moreover, this framework comes with theoretical guarantees and an efficient coordinate descent learning algorithm. Numerical experiments on benchmark and real datasets highlight the improvements of our approach in terms of prediction error, crossing occurrences and training time.
Tasks Multi-Task Learning
Published 2016-12-01
URL http://papers.nips.cc/paper/6239-joint-quantile-regression-in-vector-valued-rkhss
PDF http://papers.nips.cc/paper/6239-joint-quantile-regression-in-vector-valued-rkhss.pdf
PWC https://paperswithcode.com/paper/joint-quantile-regression-in-vector-valued
Repo
Framework
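Estimating a conditional quantile rather than the mean comes down to minimising the pinball loss, and "quantile crossing" is when a higher quantile level predicts a lower value than a lower level. The paper's joint, non-crossing RKHS estimator is more involved; this is only a minimal sketch of the two notions the abstract relies on:

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    # Pinball (quantile) loss for level tau:
    # tau * r when r >= 0, (tau - 1) * r otherwise, averaged over samples.
    r = y_true - y_pred
    return float(np.mean(np.maximum(tau * r, (tau - 1) * r)))

def crossings(preds_by_level):
    # Count points where a HIGHER quantile level predicts a LOWER value
    # than a lower level -- the crossing phenomenon the paper avoids.
    levels = sorted(preds_by_level)
    n = 0
    for lo, hi in zip(levels, levels[1:]):
        n += int(np.sum(preds_by_level[hi] < preds_by_level[lo]))
    return n
```

Minimising `pinball_loss` independently per level can yield crossings; the paper's contribution is estimating all levels jointly so that none occur, without post-processing.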

How does Dictionary Size Influence Performance of Vietnamese Word Segmentation?

Title How does Dictionary Size Influence Performance of Vietnamese Word Segmentation?
Authors Wuying Liu, Lin Wang
Abstract Vietnamese word segmentation (VWS) is a challenging basic issue for natural language processing. This paper addresses the problem of how dictionary size influences VWS performance, proposes two novel measures: square overlap ratio (SOR) and relaxed square overlap ratio (RSOR), and validates their effectiveness. The SOR measure is the product of the dictionary overlap ratio and the corpus overlap ratio, and the RSOR measure is the relaxed version of SOR under an unsupervised condition. Both measures indicate how well a segmentation dictionary suits the target corpus to be segmented. The experimental results show that a suitably sized dictionary, neither too small nor too large, is best for achieving state-of-the-art performance with dictionary-based Vietnamese word segmenters.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1172/
PDF https://www.aclweb.org/anthology/L16-1172
PWC https://paperswithcode.com/paper/how-does-dictionary-size-influence
Repo
Framework
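On one reading of the abstract, SOR multiplies two coverage ratios: the share of dictionary entries attested in the corpus, and the share of the corpus vocabulary covered by the dictionary. The exact definitions are in the paper; this sketch only illustrates that reading:

```python
def square_overlap_ratio(dictionary, corpus_vocab):
    # Illustrative SOR: (dictionary overlap ratio) * (corpus overlap ratio),
    # both computed from the overlap between dictionary and corpus vocabulary.
    d, c = set(dictionary), set(corpus_vocab)
    if not d or not c:
        return 0.0
    overlap = len(d & c)
    return (overlap / len(d)) * (overlap / len(c))
```

Under this reading the measure peaks when the dictionary matches the corpus vocabulary, and falls off when the dictionary is either too small (poor corpus coverage) or too large (many unused entries), consistent with the "neither too small nor too large" finding.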

Linear Relaxations for Finding Diverse Elements in Metric Spaces

Title Linear Relaxations for Finding Diverse Elements in Metric Spaces
Authors Aditya Bhaskara, Mehrdad Ghadiri, Vahab Mirrokni, Ola Svensson
Abstract Choosing a diverse subset of a large collection of points in a metric space is a fundamental problem, with applications in feature selection, recommender systems, web search, data summarization, etc. Various notions of diversity have been proposed, tailored to different applications. The general algorithmic goal is to find a subset of points that maximizes diversity, while obeying a cardinality (or more generally, matroid) constraint. The goal of this paper is to develop a novel linear programming (LP) framework that allows us to design approximation algorithms for such problems. We study an objective known as “sum-min” diversity, which is known to be effective in many applications, and give the first constant factor approximation algorithm. Our LP framework allows us to easily incorporate additional constraints, as well as secondary objectives. We also prove a hardness result for two natural diversity objectives, under the so-called “planted clique” assumption. Finally, we study the empirical performance of our algorithm on several standard datasets. We first study the approximation quality of the algorithm by comparing with the LP objective. Then, we compare the quality of the solutions produced by our method with other popular diversity maximization algorithms.
Tasks Data Summarization, Feature Selection, Recommendation Systems
Published 2016-12-01
URL http://papers.nips.cc/paper/6500-linear-relaxations-for-finding-diverse-elements-in-metric-spaces
PDF http://papers.nips.cc/paper/6500-linear-relaxations-for-finding-diverse-elements-in-metric-spaces.pdf
PWC https://paperswithcode.com/paper/linear-relaxations-for-finding-diverse
Repo
Framework
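The "sum-min" objective the abstract refers to scores a selected subset by summing, over its points, the distance to the nearest other selected point. A direct evaluation of that objective (not the paper's LP-based approximation algorithm) looks like:

```python
def sum_min_diversity(points, dist):
    # Sum-min diversity of a selected subset: each point contributes its
    # distance to the closest OTHER point in the subset.
    return sum(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )
```

Spread-out subsets score higher than clustered ones under this objective, which is what makes it a diversity measure; the hard part, and the paper's subject, is selecting the subset that maximises it under a cardinality or matroid constraint.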

Vectors or Graphs? On Differences of Representations for Distributional Semantic Models

Title Vectors or Graphs? On Differences of Representations for Distributional Semantic Models
Authors Chris Biemann
Abstract Distributional Semantic Models (DSMs) have recently received increased attention, together with the rise of neural architectures for scalable training of dense vector embeddings. While some of the literature even includes terms like ‘vectors’ and ‘dimensionality’ in the definition of DSMs, there are some good reasons why we should consider alternative formulations of distributional models. As an instance, I present a scalable graph-based solution to distributional semantics. The model belongs to the family of ‘count-based’ DSMs, keeps its representation sparse and explicit, and thus fully interpretable. I will highlight some important differences between sparse graph-based and dense vector approaches to DSMs: while dense vector-based models are computationally easier to handle and provide a nice uniform representation that can be compared and combined in many ways, they lack interpretability, provenance and robustness. On the other hand, graph-based sparse models have a more straightforward interpretation, handle sense distinctions more naturally and can straightforwardly be linked to knowledge bases, while lacking the ability to compare arbitrary lexical units and a compositionality operation. Since both representations have their merits, I opt for exploring their combination in the outlook.
Tasks Information Retrieval
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5301/
PDF https://www.aclweb.org/anthology/W16-5301
PWC https://paperswithcode.com/paper/vectors-or-graphs-on-differences-of
Repo
Framework
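The sparse, explicit representation argued for in the abstract can be pictured as a co-occurrence graph, where each word's neighbours are actual context words rather than anonymous vector dimensions. A toy count-based sketch (not the paper's scalable system):

```python
from collections import Counter, defaultdict

def cooccurrence_graph(sentences, window=2):
    # Count-based, sparse and explicit: each word maps to a Counter over the
    # context words it co-occurs with inside a symmetric window.
    graph = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    graph[w][sent[j]] += 1
    return graph
```

Every non-zero entry is directly interpretable and traceable to corpus evidence, which is the provenance advantage the abstract claims for graph-based models over dense embeddings.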

SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

Title SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)
Authors Nathan Schneider, Dirk Hovy, Anders Johannsen, Marine Carpuat
Abstract
Tasks Part-Of-Speech Tagging, Word Sense Disambiguation
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1084/
PDF https://www.aclweb.org/anthology/S16-1084
PWC https://paperswithcode.com/paper/semeval-2016-task-10-detecting-minimal
Repo
Framework

Automatic Extraction of Learner Errors in ESL Sentences Using Linguistically Enhanced Alignments

Title Automatic Extraction of Learner Errors in ESL Sentences Using Linguistically Enhanced Alignments
Authors Mariano Felice, Christopher Bryant, Ted Briscoe
Abstract We propose a new method of automatically extracting learner errors from parallel English as a Second Language (ESL) sentences in an effort to regularise annotation formats and reduce inconsistencies. Specifically, given an original and corrected sentence, our method first uses a linguistically enhanced alignment algorithm to determine the most likely mappings between tokens, and secondly employs a rule-based function to decide which alignments should be merged. Our method beats all previous approaches on the tested datasets, achieving state-of-the-art results for automatic error extraction.
Tasks Grammatical Error Correction, Machine Translation
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1079/
PDF https://www.aclweb.org/anthology/C16-1079
PWC https://paperswithcode.com/paper/automatic-extraction-of-learner-errors-in-esl
Repo
Framework
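The extraction step the abstract describes, aligning original and corrected tokens and reading off the differing spans as edits, can be approximated with a standard sequence alignment. The paper uses a linguistically enhanced aligner plus rule-based merging; `difflib`'s generic matcher below is only a stand-in:

```python
import difflib

def extract_edits(orig_tokens, corr_tokens):
    # Align the two token sequences and emit the non-equal spans as edits
    # of the form (operation, original span, corrected span).
    sm = difflib.SequenceMatcher(a=orig_tokens, b=corr_tokens)
    return [
        (op, orig_tokens[i1:i2], corr_tokens[j1:j2])
        for op, i1, i2, j1, j2 in sm.get_opcodes()
        if op != "equal"
    ]
```

A purely string-level aligner like this can mis-segment edits that a linguistically informed one would merge or split correctly (e.g. multi-token rewrites), which is exactly the inconsistency the paper's method targets.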

Incremental Variational Sparse Gaussian Process Regression

Title Incremental Variational Sparse Gaussian Process Regression
Authors Ching-An Cheng, Byron Boots
Abstract Recent work on scaling up Gaussian process regression (GPR) to large datasets has primarily focused on sparse GPR, which leverages a small set of basis functions to approximate the full Gaussian process during inference. However, the majority of these approaches are batch methods that operate on the entire training dataset at once, precluding the use of datasets that are streaming or too large to fit into memory. Although previous work has considered incrementally solving variational sparse GPR, most algorithms fail to update the basis functions and therefore perform suboptimally. We propose a novel incremental learning algorithm for variational sparse GPR based on stochastic mirror ascent of probability densities in reproducing kernel Hilbert space. This new formulation allows our algorithm to update basis functions online in accordance with the manifold structure of probability densities for fast convergence. We conduct several experiments and show that our proposed approach achieves better empirical performance in terms of prediction error than the recent state-of-the-art incremental solutions to variational sparse GPR.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6473-incremental-variational-sparse-gaussian-process-regression
PDF http://papers.nips.cc/paper/6473-incremental-variational-sparse-gaussian-process-regression.pdf
PWC https://paperswithcode.com/paper/incremental-variational-sparse-gaussian
Repo
Framework
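The basis-function approximation the abstract builds on can be illustrated with the classic Subset-of-Regressors predictive mean, where a small set of inducing inputs Z stands in for the full training set. This is a batch sketch of sparse GPR in general, not the paper's incremental mirror-ascent algorithm:

```python
import numpy as np

def rbf(a, b, ls=1.0):
    # Squared-exponential kernel matrix between the rows of a and b.
    d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-0.5 * d2 / ls**2)

def sor_mean(X, y, Z, X_star, noise=0.1):
    # Subset-of-Regressors predictive mean with inducing inputs Z:
    # mu(x*) = K(x*,Z) (K(Z,X) K(X,Z) + noise^2 K(Z,Z))^{-1} K(Z,X) y
    Kzx, Kzz = rbf(Z, X), rbf(Z, Z)
    w = np.linalg.solve(Kzx @ Kzx.T + noise**2 * Kzz, Kzx @ y)
    return rbf(X_star, Z) @ w
```

With Z equal to the full training inputs this recovers ordinary GP regression; the interesting regime, and the one the paper's algorithm adapts online, is |Z| much smaller than the dataset, where the choice of basis functions matters and fixed-basis incremental methods perform suboptimally.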

Adapting Event Embedding for Implicit Discourse Relation Recognition

Title Adapting Event Embedding for Implicit Discourse Relation Recognition
Authors Maria Leonor Pacheco, I-Ta Lee, Xiao Zhang, Abdullah Khan Zehady, Pranjal Daga, Di Jin, Ayush Parolia, Dan Goldwasser
Abstract
Tasks Feature Engineering
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-2019/
PDF https://www.aclweb.org/anthology/K16-2019
PWC https://paperswithcode.com/paper/adapting-event-embedding-for-implicit
Repo
Framework

Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity.

Title Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity.
Authors Barbara Rychalska, Katarzyna Pakulska, Krystyna Chodorowska, Wojciech Walczak, Piotr Andruszkiewicz
Abstract
Tasks Machine Translation, Question Answering, Semantic Similarity, Semantic Textual Similarity
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1091/
PDF https://www.aclweb.org/anthology/S16-1091
PWC https://paperswithcode.com/paper/samsung-poland-nlp-team-at-semeval-2016-task
Repo
Framework

Leveraging Captions in the Wild to Improve Object Detection

Title Leveraging Captions in the Wild to Improve Object Detection
Authors Mert Kilickaya, Nazli Ikizler-Cinbis, Erkut Erdem, Aykut Erdem
Abstract
Tasks Object Detection
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-3204/
PDF https://www.aclweb.org/anthology/W16-3204
PWC https://paperswithcode.com/paper/leveraging-captions-in-the-wild-to-improve
Repo
Framework

Generalized Correspondence-LDA Models (GC-LDA) for Identifying Functional Regions in the Brain

Title Generalized Correspondence-LDA Models (GC-LDA) for Identifying Functional Regions in the Brain
Authors Timothy Rubin, Oluwasanmi O. Koyejo, Michael N. Jones, Tal Yarkoni
Abstract This paper presents Generalized Correspondence-LDA (GC-LDA), a generalization of the Correspondence-LDA model that allows for variable spatial representations to be associated with topics, and increased flexibility in terms of the strength of the correspondence between data types induced by the model. We present three variants of GC-LDA, each of which associates topics with a different spatial representation, and apply them to a corpus of neuroimaging data. In the context of this dataset, each topic corresponds to a functional brain region, where the region’s spatial extent is captured by a probability distribution over neural activity, and the region’s cognitive function is captured by a probability distribution over linguistic terms. We illustrate the qualitative improvements offered by GC-LDA in terms of the types of topics extracted with alternative spatial representations, as well as the model’s ability to incorporate a-priori knowledge from the neuroimaging literature. We furthermore demonstrate that the novel features of GC-LDA improve predictions for missing data.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6274-generalized-correspondence-lda-models-gc-lda-for-identifying-functional-regions-in-the-brain
PDF http://papers.nips.cc/paper/6274-generalized-correspondence-lda-models-gc-lda-for-identifying-functional-regions-in-the-brain.pdf
PWC https://paperswithcode.com/paper/generalized-correspondence-lda-models-gc-lda
Repo
Framework

SAARSHEFF at SemEval-2016 Task 1: Semantic Textual Similarity with Machine Translation Evaluation Metrics and (eXtreme) Boosted Tree Ensembles

Title SAARSHEFF at SemEval-2016 Task 1: Semantic Textual Similarity with Machine Translation Evaluation Metrics and (eXtreme) Boosted Tree Ensembles
Authors Liling Tan, Carolina Scarton, Lucia Specia, Josef van Genabith
Abstract
Tasks Machine Translation, Semantic Textual Similarity
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1095/
PDF https://www.aclweb.org/anthology/S16-1095
PWC https://paperswithcode.com/paper/saarsheff-at-semeval-2016-task-1-semantic
Repo
Framework

Recognizing Salient Entities in Shopping Queries

Title Recognizing Salient Entities in Shopping Queries
Authors Zornitsa Kozareva, Qi Li, Ke Zhai, Weiwei Guo
Abstract
Tasks Feature Engineering, Structured Prediction, Word Embeddings
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2018/
PDF https://www.aclweb.org/anthology/P16-2018
PWC https://paperswithcode.com/paper/recognizing-salient-entities-in-shopping
Repo
Framework

Towards a resource based on users’ knowledge to overcome the Tip of the Tongue problem.

Title Towards a resource based on users’ knowledge to overcome the Tip of the Tongue problem.
Authors Michael Zock, Chris Biemann
Abstract Language production is largely a matter of words which, in the case of access problems, can be searched for in an external resource (lexicon, thesaurus). In this kind of dialogue the user provides the momentarily available knowledge concerning the target and the system responds with the best guess(es) it can make given this input. As tip-of-the-tongue (ToT) studies have shown, people always have some knowledge concerning the target (meaning fragments, number of syllables, …) even if its complete form is eluding them. We will show here how to tap into this knowledge to build a resource likely to help authors (speakers/writers) overcome the ToT problem. Yet, before doing so we need a better understanding of the various kinds of knowledge people have when looking for a word. To this end, we asked crowdworkers to provide some cues to describe a given target and to specify then how each one of them relates to the target, in the hope that this could help others to find the elusive word. Next, we checked how well a given search strategy worked when applied to differently built lexical networks. The results showed quite dramatic differences, which is not really surprising. After all, different networks are built for different purposes; hence each one of them is more or less suited for a given task. What was more surprising though is the fact that the relational information given by the users did not allow us to find the elusive word in WordNet better than without it.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5308/
PDF https://www.aclweb.org/anthology/W16-5308
PWC https://paperswithcode.com/paper/towards-a-resource-based-on-users-knowledge
Repo
Framework

Exploring Different Preposition Sets, Models and Feature Sets in Automatic Generation of Spatial Image Descriptions

Title Exploring Different Preposition Sets, Models and Feature Sets in Automatic Generation of Spatial Image Descriptions
Authors Anja Belz, Adrian Muscat, Brandon Birmingham
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-3209/
PDF https://www.aclweb.org/anthology/W16-3209
PWC https://paperswithcode.com/paper/exploring-different-preposition-sets-models
Repo
Framework