May 4, 2019

Paper Group NANR 153

Convolutional Neural Networks vs. Convolution Kernels: Feature Engineering for Answer Sentence Reranking

Title Convolutional Neural Networks vs. Convolution Kernels: Feature Engineering for Answer Sentence Reranking
Authors Kateryna Tymoshenko, Daniele Bonadiman, Alessandro Moschitti
Abstract
Tasks Feature Engineering, Learning-To-Rank, Question Answering
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1152/
PDF https://www.aclweb.org/anthology/N16-1152
PWC https://paperswithcode.com/paper/convolutional-neural-networks-vs-convolution
Repo
Framework

Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer’s Disease

Title Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer’s Disease
Authors Hao Zhou, Vamsi K. Ithapu, Sathya Narayanan Ravi, Vikas Singh, Grace Wahba, Sterling C. Johnson
Abstract Consider samples from two different data sources ${\mathbf{x_s^i}} \sim P_{\rm source}$ and ${\mathbf{x_t^i}} \sim P_{\rm target}$. We only observe their transformed versions $h(\mathbf{x_s^i})$ and $g(\mathbf{x_t^i})$, for some known function class $h(\cdot)$ and $g(\cdot)$. Our goal is to perform a statistical test checking if $P_{\rm source} = P_{\rm target}$ while removing the distortions induced by the transformations. This problem is closely related to concepts underlying numerous domain adaptation algorithms, and in our case, is motivated by the need to combine clinical and imaging based biomarkers from multiple sites and/or batches, where this problem is fairly common and an impediment in the conduct of analyses with much larger sample sizes. We develop a framework that addresses this problem using ideas from hypothesis testing on the transformed measurements, wherein the distortions need to be estimated in tandem with the testing. We derive a simple algorithm and study its convergence and consistency properties in detail, and we also provide lower-bound strategies based on recent work in continuous optimization. On a dataset of individuals at risk for neurological disease, our results are competitive with alternative procedures that are twice as expensive and in some cases operationally infeasible to implement.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2016-12-01
URL http://papers.nips.cc/paper/6209-hypothesis-testing-in-unsupervised-domain-adaptation-with-applications-in-alzheimers-disease
PDF http://papers.nips.cc/paper/6209-hypothesis-testing-in-unsupervised-domain-adaptation-with-applications-in-alzheimers-disease.pdf
PWC https://paperswithcode.com/paper/hypothesis-testing-in-unsupervised-domain
Repo
Framework

Shallow Parsing Pipeline - Hindi-English Code-Mixed Social Media Text

Title Shallow Parsing Pipeline - Hindi-English Code-Mixed Social Media Text
Authors Arnav Sharma, Sakshi Gupta, Raveesh Motlani, Piyush Bansal, Manish Shrivastava, Radhika Mamidi, Dipti M. Sharma
Abstract
Tasks Language Identification
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1159/
PDF https://www.aclweb.org/anthology/N16-1159
PWC https://paperswithcode.com/paper/shallow-parsing-pipeline-hindi-english-code
Repo
Framework

Quantized Random Projections and Non-Linear Estimation of Cosine Similarity

Title Quantized Random Projections and Non-Linear Estimation of Cosine Similarity
Authors Ping Li, Michael Mitzenmacher, Martin Slawski
Abstract Random projections constitute a simple, yet effective technique for dimensionality reduction with applications in learning and search problems. In the present paper, we consider the problem of estimating cosine similarities when the projected data undergo scalar quantization to $b$ bits. We here argue that the maximum likelihood estimator (MLE) is a principled approach to deal with the non-linearity resulting from quantization, and subsequently study its computational and statistical properties. A specific focus is on the trade-off between bit depth and the number of projections given a fixed budget of bits for storage or transmission. Along the way, we also touch upon the existence of a qualitative counterpart to the Johnson-Lindenstrauss lemma in the presence of quantization.
Tasks Dimensionality Reduction, Quantization
Published 2016-12-01
URL http://papers.nips.cc/paper/6492-quantized-random-projections-and-non-linear-estimation-of-cosine-similarity
PDF http://papers.nips.cc/paper/6492-quantized-random-projections-and-non-linear-estimation-of-cosine-similarity.pdf
PWC https://paperswithcode.com/paper/quantized-random-projections-and-non-linear
Repo
Framework
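
As a rough illustration of the setting in the abstract above, the following sketch covers only the simplest case, $b = 1$, where each projection is quantized to its sign. It uses the classical sign-agreement estimator (for Gaussian directions, the probability that two sign bits differ equals $\theta / \pi$), not the general $b$-bit MLE studied in the paper. The function names, dimensions, and test vectors are all hypothetical choices made for this example.

```python
import math
import random


def sign_projections(x, directions):
    """Project x onto each random direction and keep only the sign (1 bit)."""
    return [1 if sum(xi * di for xi, di in zip(x, d)) >= 0 else 0
            for d in directions]


def estimate_cosine(bits_x, bits_y):
    """Estimate cos(theta) from the fraction of disagreeing sign bits.

    For Gaussian directions, P(bits differ) = theta / pi, so theta is
    estimated as pi * (fraction of disagreements).
    """
    disagreements = sum(a != b for a, b in zip(bits_x, bits_y))
    return math.cos(math.pi * disagreements / len(bits_x))


def exact_cosine(x, y):
    """Reference value computed from the unprojected vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Illustrative setup (assumed values): 20-dimensional data, 4000 projections.
rng = random.Random(0)
dim, k = 20, 4000
directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
y = [xi + 0.5 * rng.gauss(0.0, 1.0) for xi in x]  # a noisy copy of x

exact = exact_cosine(x, y)
estimate = estimate_cosine(sign_projections(x, directions),
                           sign_projections(y, directions))
print(f"exact cosine: {exact:.3f}, 1-bit estimate: {estimate:.3f}")
```

Storing `k` sign bits per vector costs `k` bits in total. The trade-off mentioned in the abstract is whether the same budget is better spent on fewer projections with more bits each, which this sketch does not attempt to answer.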

Retrofitting Sense-Specific Word Vectors Using Parallel Text

Title Retrofitting Sense-Specific Word Vectors Using Parallel Text
Authors Allyson Ettinger, Philip Resnik, Marine Carpuat
Abstract
Tasks Word Alignment, Word Sense Disambiguation
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1163/
PDF https://www.aclweb.org/anthology/N16-1163
PWC https://paperswithcode.com/paper/retrofitting-sense-specific-word-vectors
Repo
Framework

An Empirical Study of Arabic Formulaic Sequence Extraction Methods

Title An Empirical Study of Arabic Formulaic Sequence Extraction Methods
Authors Ayman Alghamdi, Eric Atwell, Claire Brierley
Abstract This paper aims to implement what is referred to as the collocation of the Arabic keywords approach for extracting formulaic sequences (FSs) in the form of high frequency but semantically regular formulas that are not restricted to any syntactic construction or semantic domain. The study applies several distributional semantic models in order to automatically extract relevant FSs related to Arabic keywords. The data sets used in this experiment are rendered from a new developed corpus-based Arabic wordlist consisting of 5,189 lexical items which represent a variety of modern standard Arabic (MSA) genres and regions, the new wordlist being based on an overlapping frequency based on a comprehensive comparison of four large Arabic corpora with a total size of over 8 billion running words. Empirical n-best precision evaluation methods are used to determine the best association measures (AMs) for extracting high frequency and meaningful FSs. The gold standard reference FSs list was developed in previous studies and manually evaluated against well-established quantitative and qualitative criteria. The results demonstrate that the MI.log_f AM achieved the highest results in extracting significant FSs from the large MSA corpus, while the T-score association measure achieved the worst results.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1080/
PDF https://www.aclweb.org/anthology/L16-1080
PWC https://paperswithcode.com/paper/an-empirical-study-of-arabic-formulaic
Repo
Framework
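
The association measures named in the abstract above are not defined on this page, so the following is only a sketch using textbook formulations: pointwise mutual information as $\log_2(O/E)$ and the T-score as $(O - E)/\sqrt{O}$, where $O$ is the observed co-occurrence count and $E = f_x f_y / N$ is the count expected under independence. The `mi_log_f` variant here assumes one common formulation (MI multiplied by the natural log of $O + 1$); the paper's exact definition may differ. All counts in the example are made up for illustration.

```python
import math


def expected_count(f_x, f_y, n):
    """Co-occurrence count expected if the two words were independent."""
    return f_x * f_y / n


def mi(observed, f_x, f_y, n):
    """Mutual information score: log2 of observed over expected."""
    return math.log2(observed / expected_count(f_x, f_y, n))


def t_score(observed, f_x, f_y, n):
    """T-score: (observed - expected) scaled by sqrt(observed)."""
    return (observed - expected_count(f_x, f_y, n)) / math.sqrt(observed)


def mi_log_f(observed, f_x, f_y, n):
    """Assumed formulation: MI weighted by log(observed + 1)."""
    return mi(observed, f_x, f_y, n) * math.log(observed + 1)


# Hypothetical counts: a word pair seen 8 times, each word seen 16 times,
# in a corpus of 1024 tokens.
observed, f_x, f_y, n = 8, 16, 16, 1024
print("MI       :", mi(observed, f_x, f_y, n))
print("T-score  :", round(t_score(observed, f_x, f_y, n), 3))
print("MI.log_f :", round(mi_log_f(observed, f_x, f_y, n), 3))
```

Plain MI tends to favor rare pairs, while the frequency weighting in `mi_log_f` pulls the ranking back toward pairs that are also common. That is one plausible reading of why a measure of this kind suits high-frequency formulaic sequences.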

Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS

Title Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS
Authors Wejdene Khiari, Mathieu Roche, Asma Bouhafs Hafsia
Abstract With the explosive growth of online social media (forums, blogs, and social networks), exploitation of these new information sources has become essential. Our work is based on the sud4science project. The goal of this project is to perform multidisciplinary work on a corpus of authentic SMS, in French, collected in 2011 and anonymised (88milSMS corpus: http://88milsms.huma-num.fr). This paper highlights a new method to integrate opinion detection knowledge from an SMS corpus by combining lexical and semantic information. More precisely, our approach gives more weight to words with a sentiment (i.e. presence of words in a dedicated dictionary) for a classification task based on three classes: positive, negative, and neutral. The experiments were conducted on two corpora: an elongated SMS corpus (i.e. repetitions of characters in messages) and a non-elongated SMS corpus. We noted that non-elongated SMS were much better classified than elongated SMS. Overall, this study highlighted that the integration of semantic knowledge always improves classification.
Tasks Sentiment Analysis
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1188/
PDF https://www.aclweb.org/anthology/L16-1188
PWC https://paperswithcode.com/paper/integration-of-lexical-and-semantic-knowledge
Repo
Framework

End-to-End Argumentation Mining in Student Essays

Title End-to-End Argumentation Mining in Student Essays
Authors Isaac Persing, Vincent Ng
Abstract
Tasks Argument Mining
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1164/
PDF https://www.aclweb.org/anthology/N16-1164
PWC https://paperswithcode.com/paper/end-to-end-argumentation-mining-in-student
Repo
Framework

Activity Modeling in Email

Title Activity Modeling in Email
Authors Ashequl Qadir, Michael Gamon, Patrick Pantel, Ahmed Hassan Awadallah
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1171/
PDF https://www.aclweb.org/anthology/N16-1171
PWC https://paperswithcode.com/paper/activity-modeling-in-email
Repo
Framework

Modeling Complement Types in Phrase-Based SMT

Title Modeling Complement Types in Phrase-Based SMT
Authors Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2205/
PDF https://www.aclweb.org/anthology/W16-2205
PWC https://paperswithcode.com/paper/modeling-complement-types-in-phrase-based-smt
Repo
Framework

Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn’t.

Title Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn’t.
Authors Anna Gladkova, Aleksandr Drozd, Satoshi Matsuoka
Abstract
Tasks Morphological Analysis, Word Embeddings, Word Sense Disambiguation
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-2002/
PDF https://www.aclweb.org/anthology/N16-2002
PWC https://paperswithcode.com/paper/analogy-based-detection-of-morphological-and
Repo
Framework

ArgRewrite: A Web-based Revision Assistant for Argumentative Writings

Title ArgRewrite: A Web-based Revision Assistant for Argumentative Writings
Authors Fan Zhang, Rebecca Hwa, Diane Litman, Homa B. Hashemi
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-3008/
PDF https://www.aclweb.org/anthology/N16-3008
PWC https://paperswithcode.com/paper/argrewrite-a-web-based-revision-assistant-for
Repo
Framework

Improving Translation Selection with Supersenses

Title Improving Translation Selection with Supersenses
Authors Haiqing Tang, Deyi Xiong, Oier Lopez de Lacalle, Eneko Agirre
Abstract Selecting appropriate translations for source words with multiple meanings still remains a challenge for statistical machine translation (SMT). One reason for this is that most SMT systems are not good at detecting the proper sense for a polysemic word when it appears in different contexts. In this paper, we adopt a supersense tagging method to annotate source words with coarse-grained ontological concepts. In order to enable the system to choose an appropriate translation for a word or phrase according to the annotated supersense of the word or phrase, we propose two translation models with supersense knowledge: a maximum entropy based model and a supersense embedding model. The effectiveness of our proposed models is validated on a large-scale English-to-Spanish translation task. Results indicate that our method can significantly improve translation quality via correctly conveying the meaning of the source language to the target language.
Tasks Machine Translation, Word Sense Disambiguation
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1293/
PDF https://www.aclweb.org/anthology/C16-1293
PWC https://paperswithcode.com/paper/improving-translation-selection-with
Repo
Framework

Latent Topic Embedding

Title Latent Topic Embedding
Authors Di Jiang, Lei Shi, Rongzhong Lian, Hua Wu
Abstract Topic modeling and word embedding are two important techniques for deriving latent semantics from data. General-purpose topic models typically work in coarse granularity by capturing word co-occurrence at the document/sentence level. In contrast, word embedding models usually work in much finer granularity by modeling word co-occurrence within small sliding windows. With the aim of deriving latent semantics by considering word co-occurrence at different levels of granularity, we propose a novel model named Latent Topic Embedding (LTE), which seamlessly integrates topic generation and embedding learning in one unified framework. We further propose an efficient Monte Carlo EM algorithm to estimate the parameters of interest. By retaining the individual advantages of topic modeling and word embedding, LTE results in better latent topics and word embedding. Extensive experiments verify the superiority of LTE over the state-of-the-arts.
Tasks Topic Models, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1253/
PDF https://www.aclweb.org/anthology/C16-1253
PWC https://paperswithcode.com/paper/latent-topic-embedding
Repo
Framework

Inherently Pronominal Verbs in Czech: Description and Conversion Based on Treebank Annotation

Title Inherently Pronominal Verbs in Czech: Description and Conversion Based on Treebank Annotation
Authors Zdeňka Urešová, Eduard Bejček, Jan Hajič
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1812/
PDF https://www.aclweb.org/anthology/W16-1812
PWC https://paperswithcode.com/paper/inherently-pronominal-verbs-in-czech
Repo
Framework