Paper Group NANR 153
Convolutional Neural Networks vs. Convolution Kernels: Feature Engineering for Answer Sentence Reranking
Title | Convolutional Neural Networks vs. Convolution Kernels: Feature Engineering for Answer Sentence Reranking |
Authors | Kateryna Tymoshenko, Daniele Bonadiman, Alessandro Moschitti |
Abstract | |
Tasks | Feature Engineering, Learning-To-Rank, Question Answering |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1152/ |
https://www.aclweb.org/anthology/N16-1152 | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-vs-convolution |
Repo | |
Framework | |
Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer’s Disease
Title | Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer’s Disease |
Authors | Hao Zhou, Vamsi K. Ithapu, Sathya Narayanan Ravi, Vikas Singh, Grace Wahba, Sterling C. Johnson |
Abstract | Consider samples from two different data sources ${\mathbf{x_s^i}} \sim P_{\rm source}$ and ${\mathbf{x_t^i}} \sim P_{\rm target}$. We only observe their transformed versions $h(\mathbf{x_s^i})$ and $g(\mathbf{x_t^i})$, for some known function class $h(\cdot)$ and $g(\cdot)$. Our goal is to perform a statistical test checking if $P_{\rm source} = P_{\rm target}$ while removing the distortions induced by the transformations. This problem is closely related to concepts underlying numerous domain adaptation algorithms, and in our case, is motivated by the need to combine clinical and imaging based biomarkers from multiple sites and/or batches, where this problem is fairly common and an impediment in the conduct of analyses with much larger sample sizes. We develop a framework that addresses this problem using ideas from hypothesis testing on the transformed measurements, wherein the distortions need to be estimated in tandem with the testing. We derive a simple algorithm and study its convergence and consistency properties in detail, and we also provide lower-bound strategies based on recent work in continuous optimization. On a dataset of individuals at risk for neurological disease, our results are competitive with alternative procedures that are twice as expensive and in some cases operationally infeasible to implement. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6209-hypothesis-testing-in-unsupervised-domain-adaptation-with-applications-in-alzheimers-disease |
http://papers.nips.cc/paper/6209-hypothesis-testing-in-unsupervised-domain-adaptation-with-applications-in-alzheimers-disease.pdf | |
PWC | https://paperswithcode.com/paper/hypothesis-testing-in-unsupervised-domain |
Repo | |
Framework | |
Shallow Parsing Pipeline - Hindi-English Code-Mixed Social Media Text
Title | Shallow Parsing Pipeline - Hindi-English Code-Mixed Social Media Text |
Authors | Arnav Sharma, Sakshi Gupta, Raveesh Motlani, Piyush Bansal, Manish Shrivastava, Radhika Mamidi, Dipti M. Sharma |
Abstract | |
Tasks | Language Identification |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1159/ |
https://www.aclweb.org/anthology/N16-1159 | |
PWC | https://paperswithcode.com/paper/shallow-parsing-pipeline-hindi-english-code |
Repo | |
Framework | |
Quantized Random Projections and Non-Linear Estimation of Cosine Similarity
Title | Quantized Random Projections and Non-Linear Estimation of Cosine Similarity |
Authors | Ping Li, Michael Mitzenmacher, Martin Slawski |
Abstract | Random projections constitute a simple, yet effective technique for dimensionality reduction with applications in learning and search problems. In the present paper, we consider the problem of estimating cosine similarities when the projected data undergo scalar quantization to $b$ bits. We here argue that the maximum likelihood estimator (MLE) is a principled approach to deal with the non-linearity resulting from quantization, and subsequently study its computational and statistical properties. A specific focus is on the trade-off between bit depth and the number of projections given a fixed budget of bits for storage or transmission. Along the way, we also touch upon the existence of a qualitative counterpart to the Johnson-Lindenstrauss lemma in the presence of quantization. |
Tasks | Dimensionality Reduction, Quantization |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6492-quantized-random-projections-and-non-linear-estimation-of-cosine-similarity |
http://papers.nips.cc/paper/6492-quantized-random-projections-and-non-linear-estimation-of-cosine-similarity.pdf | |
PWC | https://paperswithcode.com/paper/quantized-random-projections-and-non-linear |
Repo | |
Framework | |
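The abstract above concerns estimating cosine similarity from random projections quantized to $b$ bits. As a minimal sketch of the simplest case only ($b = 1$, keeping just the sign of each Gaussian projection), the following uses the standard fact that two sign bits agree with probability $1 - \theta/\pi$, where $\theta$ is the angle between the vectors. The function name `sign_projection_cosine` and its parameters are hypothetical, chosen for illustration; this is not the paper's general $b$-bit MLE.

```python
import numpy as np


def sign_projection_cosine(x, y, num_projections=4096, seed=0):
    """Estimate cos(x, y) from 1-bit quantized Gaussian random projections.

    Illustrative sketch for b = 1 only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Each row of R is one Gaussian random projection direction.
    R = rng.standard_normal((num_projections, x.shape[0]))
    # 1-bit scalar quantization: keep only the sign of each projection.
    bits_x = (R @ x) >= 0
    bits_y = (R @ y) >= 0
    # Fraction of projections on which the two sign bits agree.
    agree = np.mean(bits_x == bits_y)
    # P(agree) = 1 - theta / pi, so invert to recover cos(theta).
    return float(np.cos(np.pi * (1.0 - agree)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(50)
    y = x + 0.5 * rng.standard_normal(50)
    exact = float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
    print("exact:", exact, "estimate:", sign_projection_cosine(x, y))
```

Under a fixed bit budget, this 1-bit scheme spends every bit on a new projection; the trade-off mentioned in the abstract is between that choice and using fewer projections with more bits each.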
Retrofitting Sense-Specific Word Vectors Using Parallel Text
Title | Retrofitting Sense-Specific Word Vectors Using Parallel Text |
Authors | Allyson Ettinger, Philip Resnik, Marine Carpuat |
Abstract | |
Tasks | Word Alignment, Word Sense Disambiguation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1163/ |
https://www.aclweb.org/anthology/N16-1163 | |
PWC | https://paperswithcode.com/paper/retrofitting-sense-specific-word-vectors |
Repo | |
Framework | |
An Empirical Study of Arabic Formulaic Sequence Extraction Methods
Title | An Empirical Study of Arabic Formulaic Sequence Extraction Methods |
Authors | Ayman Alghamdi, Eric Atwell, Claire Brierley |
Abstract | This paper aims to implement what is referred to as the collocation of the Arabic keywords approach for extracting formulaic sequences (FSs) in the form of high frequency but semantically regular formulas that are not restricted to any syntactic construction or semantic domain. The study applies several distributional semantic models in order to automatically extract relevant FSs related to Arabic keywords. The data sets used in this experiment are rendered from a newly developed corpus-based Arabic wordlist consisting of 5,189 lexical items which represent a variety of modern standard Arabic (MSA) genres and regions, the new wordlist being based on an overlapping frequency based on a comprehensive comparison of four large Arabic corpora with a total size of over 8 billion running words. Empirical n-best precision evaluation methods are used to determine the best association measures (AMs) for extracting high frequency and meaningful FSs. The gold standard reference FSs list was developed in previous studies and manually evaluated against well-established quantitative and qualitative criteria. The results demonstrate that the MI.log_f AM achieved the highest results in extracting significant FSs from the large MSA corpus, while the T-score association measure achieved the worst results. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1080/ |
https://www.aclweb.org/anthology/L16-1080 | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-arabic-formulaic |
Repo | |
Framework | |
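The abstract above compares association measures (AMs) such as MI.log_f and T-score using n-best precision. The sketch below shows how such scores are commonly computed from co-occurrence counts. The formulas are the widely used corpus-linguistics definitions and are an assumption here, since the abstract does not spell them out; in particular, the natural logarithm in `mi_log_f` and all function names are illustrative choices.

```python
import math


def mi_score(f_xy, f_x, f_y, n):
    """Mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(f_xy * n / (f_x * f_y))


def t_score(f_xy, f_x, f_y, n):
    """T-score: (observed - expected) scaled by sqrt(observed)."""
    return (f_xy - f_x * f_y / n) / math.sqrt(f_xy)


def mi_log_f(f_xy, f_x, f_y, n):
    """MI weighted by log frequency (assumed form: MI * ln(f_xy + 1))."""
    return mi_score(f_xy, f_x, f_y, n) * math.log(f_xy + 1)


def n_best_precision(ranked, gold, n):
    """Fraction of the top-n ranked candidates found in the gold list (n >= 1)."""
    top = ranked[:n]
    return sum(1 for candidate in top if candidate in gold) / len(top)


if __name__ == "__main__":
    # Hypothetical counts: (co-occurrence, keyword freq, collocate freq).
    corpus_size = 1_000_000
    candidates = {"seq_a": (120, 900, 400), "seq_b": (15, 900, 20000)}
    ranked = sorted(
        candidates,
        key=lambda c: mi_log_f(*candidates[c], corpus_size),
        reverse=True,
    )
    print(ranked, n_best_precision(ranked, {"seq_a"}, 1))
```

MI alone tends to favor rare pairs, and T-score tends to favor frequent ones; weighting MI by log frequency is one way to balance the two.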
Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS
Title | Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS |
Authors | Wejdene Khiari, Mathieu Roche, Asma Bouhafs Hafsia |
Abstract | With the explosive growth of online social media (forums, blogs, and social networks), exploitation of these new information sources has become essential. Our work is based on the sud4science project. The goal of this project is to perform multidisciplinary work on a corpus of authentic SMS, in French, collected in 2011 and anonymised (88milSMS corpus: http://88milsms.huma-num.fr). This paper highlights a new method to integrate opinion detection knowledge from an SMS corpus by combining lexical and semantic information. More precisely, our approach gives more weight to words with a sentiment (i.e. presence of words in a dedicated dictionary) for a classification task based on three classes: positive, negative, and neutral. The experiments were conducted on two corpora: an elongated SMS corpus (i.e. repetitions of characters in messages) and a non-elongated SMS corpus. We noted that non-elongated SMS were much better classified than elongated SMS. Overall, this study highlighted that the integration of semantic knowledge always improves classification. |
Tasks | Sentiment Analysis |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1188/ |
https://www.aclweb.org/anthology/L16-1188 | |
PWC | https://paperswithcode.com/paper/integration-of-lexical-and-semantic-knowledge |
Repo | |
Framework | |
End-to-End Argumentation Mining in Student Essays
Title | End-to-End Argumentation Mining in Student Essays |
Authors | Isaac Persing, Vincent Ng |
Abstract | |
Tasks | Argument Mining |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1164/ |
https://www.aclweb.org/anthology/N16-1164 | |
PWC | https://paperswithcode.com/paper/end-to-end-argumentation-mining-in-student |
Repo | |
Framework | |
Activity Modeling in Email
Title | Activity Modeling in Email |
Authors | Ashequl Qadir, Michael Gamon, Patrick Pantel, Ahmed Hassan Awadallah |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1171/ |
https://www.aclweb.org/anthology/N16-1171 | |
PWC | https://paperswithcode.com/paper/activity-modeling-in-email |
Repo | |
Framework | |
Modeling Complement Types in Phrase-Based SMT
Title | Modeling Complement Types in Phrase-Based SMT |
Authors | Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2205/ |
https://www.aclweb.org/anthology/W16-2205 | |
PWC | https://paperswithcode.com/paper/modeling-complement-types-in-phrase-based-smt |
Repo | |
Framework | |
Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn’t.
Title | Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn’t. |
Authors | Anna Gladkova, Aleksandr Drozd, Satoshi Matsuoka |
Abstract | |
Tasks | Morphological Analysis, Word Embeddings, Word Sense Disambiguation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-2002/ |
https://www.aclweb.org/anthology/N16-2002 | |
PWC | https://paperswithcode.com/paper/analogy-based-detection-of-morphological-and |
Repo | |
Framework | |
ArgRewrite: A Web-based Revision Assistant for Argumentative Writings
Title | ArgRewrite: A Web-based Revision Assistant for Argumentative Writings |
Authors | Fan Zhang, Rebecca Hwa, Diane Litman, Homa B. Hashemi |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-3008/ |
https://www.aclweb.org/anthology/N16-3008 | |
PWC | https://paperswithcode.com/paper/argrewrite-a-web-based-revision-assistant-for |
Repo | |
Framework | |
Improving Translation Selection with Supersenses
Title | Improving Translation Selection with Supersenses |
Authors | Haiqing Tang, Deyi Xiong, Oier Lopez de Lacalle, Eneko Agirre |
Abstract | Selecting appropriate translations for source words with multiple meanings still remains a challenge for statistical machine translation (SMT). One reason for this is that most SMT systems are not good at detecting the proper sense for a polysemic word when it appears in different contexts. In this paper, we adopt a supersense tagging method to annotate source words with coarse-grained ontological concepts. In order to enable the system to choose an appropriate translation for a word or phrase according to the annotated supersense of the word or phrase, we propose two translation models with supersense knowledge: a maximum entropy based model and a supersense embedding model. The effectiveness of our proposed models is validated on a large-scale English-to-Spanish translation task. Results indicate that our method can significantly improve translation quality via correctly conveying the meaning of the source language to the target language. |
Tasks | Machine Translation, Word Sense Disambiguation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1293/ |
https://www.aclweb.org/anthology/C16-1293 | |
PWC | https://paperswithcode.com/paper/improving-translation-selection-with |
Repo | |
Framework | |
Latent Topic Embedding
Title | Latent Topic Embedding |
Authors | Di Jiang, Lei Shi, Rongzhong Lian, Hua Wu |
Abstract | Topic modeling and word embedding are two important techniques for deriving latent semantics from data. General-purpose topic models typically work in coarse granularity by capturing word co-occurrence at the document/sentence level. In contrast, word embedding models usually work in much finer granularity by modeling word co-occurrence within small sliding windows. With the aim of deriving latent semantics by considering word co-occurrence at different levels of granularity, we propose a novel model named Latent Topic Embedding (LTE), which seamlessly integrates topic generation and embedding learning in one unified framework. We further propose an efficient Monte Carlo EM algorithm to estimate the parameters of interest. By retaining the individual advantages of topic modeling and word embedding, LTE results in better latent topics and word embeddings. Extensive experiments verify the superiority of LTE over the state of the art. |
Tasks | Topic Models, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1253/ |
https://www.aclweb.org/anthology/C16-1253 | |
PWC | https://paperswithcode.com/paper/latent-topic-embedding |
Repo | |
Framework | |
Inherently Pronominal Verbs in Czech: Description and Conversion Based on Treebank Annotation
Title | Inherently Pronominal Verbs in Czech: Description and Conversion Based on Treebank Annotation |
Authors | Zdeňka Urešová, Eduard Bejček, Jan Hajič |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1812/ |
https://www.aclweb.org/anthology/W16-1812 | |
PWC | https://paperswithcode.com/paper/inherently-pronominal-verbs-in-czech |
Repo | |
Framework | |