Paper Group NANR 122
Prediction of Prospective User Engagement with Intelligent Assistants
Title | Prediction of Prospective User Engagement with Intelligent Assistants |
Authors | Shumpei Sano, Nobuhiro Kaji, Manabu Sassano |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1114/ |
https://www.aclweb.org/anthology/P16-1114 | |
PWC | https://paperswithcode.com/paper/prediction-of-prospective-user-engagement |
Repo | |
Framework | |
Discourse Relation Sense Classification Systems for CoNLL-2016 Shared Task
Title | Discourse Relation Sense Classification Systems for CoNLL-2016 Shared Task |
Authors | Ping Jian, Xiaohan She, Chenwei Zhang, Pengcheng Zhang, Jian Feng |
Abstract | |
Tasks | Relation Classification |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-2022/ |
https://www.aclweb.org/anthology/K16-2022 | |
PWC | https://paperswithcode.com/paper/discourse-relation-sense-classification-1 |
Repo | |
Framework | |
RTM at SemEval-2016 Task 1: Predicting Semantic Similarity with Referential Translation Machines and Related Statistics
Title | RTM at SemEval-2016 Task 1: Predicting Semantic Similarity with Referential Translation Machines and Related Statistics |
Authors | Ergun Biçici |
Abstract | |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1117/ |
https://www.aclweb.org/anthology/S16-1117 | |
PWC | https://paperswithcode.com/paper/rtm-at-semeval-2016-task-1-predicting-1 |
Repo | |
Framework | |
Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus
Title | Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus |
Authors | Nicholas Asher, Julie Hunter, Mathieu Morey, Benamara Farah, Stergos Afantenos |
Abstract | This paper describes the STAC resource, a corpus of multi-party chats annotated for discourse structure in the style of SDRT (Asher and Lascarides, 2003; Lascarides and Asher, 2009). The main goal of the STAC project is to study the discourse structure of multi-party dialogues in order to understand the linguistic strategies adopted by interlocutors to achieve their conversational goals, especially when these goals are opposed. The STAC corpus is not only a rich source of data on strategic conversation, but also the first corpus that we are aware of that provides full discourse structures for multi-party dialogues. It has other remarkable features that make it an interesting resource for other topics: interleaved threads, creative language, and interactions between linguistic and extra-linguistic contexts. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1432/ |
https://www.aclweb.org/anthology/L16-1432 | |
PWC | https://paperswithcode.com/paper/discourse-structure-and-dialogue-acts-in |
Repo | |
Framework | |
Local-Global Vectors to Improve Unigram Terminology Extraction
Title | Local-Global Vectors to Improve Unigram Terminology Extraction |
Authors | Ehsan Amjadian, Diana Inkpen, Tahereh Paribakht, Farahnaz Faez |
Abstract | The present paper explores a novel method that integrates efficient distributed representations with terminology extraction. We show that the information from a small number of observed instances can be combined with local and global word embeddings to remarkably improve the term extraction results on unigram terms. To do so we pass the terms extracted by other tools to a filter made of the local-global embeddings and a classifier which in turn decides whether or not a term candidate is a term. The filter can also be used as a hub to merge different term extraction tools into a single higher-performing system. We compare filters that use the skip-gram architecture and filters that employ the CBOW architecture for the task at hand. |
Tasks | Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4702/ |
https://www.aclweb.org/anthology/W16-4702 | |
PWC | https://paperswithcode.com/paper/local-global-vectors-to-improve-unigram |
Repo | |
Framework | |
GoWvis: A Web Application for Graph-of-Words-based Text Visualization and Summarization
Title | GoWvis: A Web Application for Graph-of-Words-based Text Visualization and Summarization |
Authors | Antoine Tixier, Konstantinos Skianis, Michalis Vazirgiannis |
Abstract | |
Tasks | Ad-Hoc Information Retrieval, Community Detection, Document Classification, Document Summarization, Graph Clustering, Information Retrieval, Keyword Extraction |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4026/ |
https://www.aclweb.org/anthology/P16-4026 | |
PWC | https://paperswithcode.com/paper/gowvis-a-web-application-for-graph-of-words |
Repo | |
Framework | |
Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data
Title | Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data |
Authors | Zhen Hai, Peilin Zhao, Peng Cheng, Peng Yang, Xiao-Li Li, Guangxia Li |
Abstract | |
Tasks | Feature Engineering, Multi-Task Learning |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1187/ |
https://www.aclweb.org/anthology/D16-1187 | |
PWC | https://paperswithcode.com/paper/deceptive-review-spam-detection-via |
Repo | |
Framework | |
Improving Semantic Parsing via Answer Type Inference
Title | Improving Semantic Parsing via Answer Type Inference |
Authors | Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, Xifeng Yan |
Abstract | |
Tasks | Knowledge Base Population, Question Answering, Relation Extraction, Semantic Parsing |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1015/ |
https://www.aclweb.org/anthology/D16-1015 | |
PWC | https://paperswithcode.com/paper/improving-semantic-parsing-via-answer-type |
Repo | |
Framework | |
Topically-focused Blog Corpora for Multiple Languages
Title | Topically-focused Blog Corpora for Multiple Languages |
Authors | Andrew Salway, Dag Elgesem, Knut Hofland, Øystein Reigem, Lubos Steskal |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/papers/W16-2603/w16-2603 |
https://www.aclweb.org/anthology/W16-2603 | |
PWC | https://paperswithcode.com/paper/topically-focused-blog-corpora-for-multiple |
Repo | |
Framework | |
MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking
Title | MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking |
Authors | Michael Schlichtkrull, Héctor Martínez Alonso |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1209/ |
https://www.aclweb.org/anthology/S16-1209 | |
PWC | https://paperswithcode.com/paper/msejrku-at-semeval-2016-task-14-taxonomy |
Repo | |
Framework | |
Dimensionality Reduction of Massive Sparse Datasets Using Coresets
Title | Dimensionality Reduction of Massive Sparse Datasets Using Coresets |
Authors | Dan Feldman, Mikhail Volkov, Daniela Rus |
Abstract | In this paper we present a practical solution with performance guarantees to the problem of dimensionality reduction for very large scale sparse matrices. We show applications of our approach to computing the Principal Component Analysis (PCA) of any $n\times d$ matrix, using one pass over the stream of its rows. Our solution uses coresets: a scaled subset of the $n$ rows that approximates their sum of squared distances to every $k$-dimensional affine subspace. An open theoretical problem has been to compute such a coreset that is independent of both $n$ and $d$. An open practical problem has been to compute a non-trivial approximation to the PCA of very large but sparse databases such as the Wikipedia document-term matrix in a reasonable time. We answer both of these questions affirmatively. Our main technical result is a new framework for deterministic coreset constructions based on a reduction to the problem of counting items in a stream. |
Tasks | Dimensionality Reduction |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets |
http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets.pdf | |
PWC | https://paperswithcode.com/paper/dimensionality-reduction-of-massive-sparse |
Repo | |
Framework | |
L2 Acquisition of Korean locative construction by English L1 speakers: Learnability problem in Korean Figure non-alternating verbs
Title | L2 Acquisition of Korean locative construction by English L1 speakers: Learnability problem in Korean Figure non-alternating verbs |
Authors | Sun Hee Park |
Abstract | |
Tasks | |
Published | 2016-10-01 |
URL | https://www.aclweb.org/anthology/Y16-3022/ |
https://www.aclweb.org/anthology/Y16-3022 | |
PWC | https://paperswithcode.com/paper/l2-acquisition-of-korean-locative |
Repo | |
Framework | |
Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning
Title | Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning |
Authors | Wafia Adouane, Nasredine Semmar, Richard Johansson |
Abstract | The identification of the language of text/speech input is the first step to be able to properly do any language-dependent natural language processing. The task is called Automatic Language Identification (ALI). As the field has been well studied since the early 1960s, various methods have been applied to many standard languages. The standard ALI methods require datasets for training and use character/word-based n-gram models. However, social media and new technologies have contributed to the rise of informal and minority languages on the Web. The state-of-the-art automatic language identifiers fail to properly identify many of them. Romanized Arabic (RA) and Romanized Berber (RB) are cases of these informal languages which are under-resourced. The goal of this paper is twofold: detect RA and RB, at a document level, as separate languages and distinguish between them as they coexist in North Africa. We consider the task as a classification problem and use supervised machine learning to solve it. For both languages, character-based 5-grams combined with additional lexicons score the best, with F-scores of 99.75% and 97.77% for RB and RA respectively. |
Tasks | Language Identification, Transliteration |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4807/ |
https://www.aclweb.org/anthology/W16-4807 | |
PWC | https://paperswithcode.com/paper/romanized-berber-and-romanized-arabic |
Repo | |
Framework | |
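The abstract above frames language identification as supervised classification over character 5-grams. The following is a minimal sketch of that general idea only, not the authors' system: it uses a small multinomial naive Bayes classifier written for illustration, omits the additional lexicons the paper mentions, and the class name `NgramNaiveBayes`, the labels `lang_x` / `lang_y`, and the training strings are all invented placeholders rather than real Romanized Arabic or Berber data.

```python
from collections import Counter
import math


def char_ngrams(text, n=5):
    """Return all overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams with add-one smoothing.

    Hypothetical helper for illustration; the paper's classifier may differ.
    """

    def __init__(self, n=5):
        self.n = n
        self.counts = {}             # label -> Counter of n-gram frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # all n-grams seen during training

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            grams = char_ngrams(doc, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, doc):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            # Smoothed denominator: n-grams seen for this label plus vocabulary size.
            denom = sum(grams.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for g in char_ngrams(doc, self.n):
                score += math.log((grams[g] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy documents; real use would need labeled corpora per language.
train_docs = ["aaaaaa bbbbb aaaaa", "zzzzzz yyyyy zzzzz"]
train_labels = ["lang_x", "lang_y"]
model = NgramNaiveBayes(n=5).fit(train_docs, train_labels)
prediction = model.predict("aaaaa bbbb")
```

Document-level prediction here simply sums log-probabilities over every 5-gram in the input, so longer documents give the classifier more evidence.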
A Word Labeling Approach to Thai Sentence Boundary Detection and POS Tagging
Title | A Word Labeling Approach to Thai Sentence Boundary Detection and POS Tagging |
Authors | Nina Zhou, AiTi Aw, Nattadaporn Lertcheva, Xuancong Wang |
Abstract | Previous studies on Thai Sentence Boundary Detection (SBD) mostly treated it as a space disambiguation problem, classifying each space as either an indicator of a Sentence Boundary (SB) or a non-Sentence Boundary (nSB). In this paper, we propose a word labeling approach which treats space as a normal word and detects SB between any two words. This removes the restriction that SB can occur only at a space and makes our system more robust for modern Thai writing, in which space is not consistently used to indicate SB. As syntactic information contributes to better SBD, we further propose a joint Part-Of-Speech (POS) tagging and SBD framework based on a Factorial Conditional Random Field (FCRF) model. We compare the performance of our proposed approach with reported methods on the ORCHID corpus. We also performed experiments with the FCRF model on the TaLAPi corpus. The results show that the word labeling approach performs better than previous space-based classification approaches and that the FCRF joint model outperforms the LCRF model in terms of SBD in all experiments. |
Tasks | Boundary Detection, Machine Translation, Part-Of-Speech Tagging |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1031/ |
https://www.aclweb.org/anthology/C16-1031 | |
PWC | https://paperswithcode.com/paper/a-word-labeling-approach-to-thai-sentence |
Repo | |
Framework | |
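The word labeling formulation in the abstract above can be illustrated with a small data-preparation sketch. It assumes, purely for illustration, that each token (including a placeholder token standing for a space) receives the label `SB` if a sentence boundary follows it and `nSB` otherwise; the paper's actual label set, the `<sp>` token, and the function names are assumptions, and the CRF model itself is not shown.

```python
SPACE = "<sp>"  # hypothetical placeholder: a space is treated as an ordinary token


def to_labeled_sequence(sentences):
    """Flatten tokenized sentences into one token stream with SB/nSB labels.

    The last token of each sentence is labeled "SB"; all others "nSB".
    """
    tokens, labels = [], []
    for sent in sentences:
        for i, tok in enumerate(sent):
            tokens.append(tok)
            labels.append("SB" if i == len(sent) - 1 else "nSB")
    return tokens, labels


def split_sentences(tokens, labels):
    """Recover sentences from a token stream and its predicted labels."""
    sentences, current = [], []
    for tok, lab in zip(tokens, labels):
        current.append(tok)
        if lab == "SB":
            sentences.append(current)
            current = []
    if current:  # trailing tokens with no closing boundary
        sentences.append(current)
    return sentences


# Placeholder tokens w1..w4 stand in for segmented Thai words.
gold = [["w1", SPACE, "w2"], ["w3", "w4"]]
tokens, labels = to_labeled_sequence(gold)
restored = split_sentences(tokens, labels)
```

Because every token carries a label, a boundary can be predicted between `w3` and `w4` even though no space separates them, which is the restriction the abstract says the approach removes.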
A Probabilistic Programming Approach To Probabilistic Data Analysis
Title | A Probabilistic Programming Approach To Probabilistic Data Analysis |
Authors | Feras Saad, Vikash K. Mansinghka |
Abstract | Probabilistic techniques are central to data analysis, but different approaches can be challenging to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include discriminative machine learning, hierarchical Bayesian models, multivariate kernel methods, clustering algorithms, and arbitrary probabilistic programs. We demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling definition language and structured query language. The practical value is illustrated in two ways. First, the paper describes an analysis on a database of Earth satellites, which identifies records that probably violate Kepler’s Third Law by composing causal probabilistic programs with non-parametric Bayes in 50 lines of probabilistic code. Second, it reports the lines of code and accuracy of CGPMs compared with baseline solutions from standard machine learning libraries. |
Tasks | Probabilistic Programming |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis |
http://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-programming-approach-to |
Repo | |
Framework | |
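The satellite analysis mentioned in the abstract above looks for records that are inconsistent with Kepler's Third Law. As a rough, deterministic illustration of the physical relation only (not the CGPM or BayesDB method, which is probabilistic), the sketch below computes the period implied by a semi-major axis via T = 2π·sqrt(a³/μ) and flags a record whose reported period deviates by more than a tolerance. The function names, the 5% tolerance, and the example values are assumptions chosen for illustration.

```python
import math

# Earth's standard gravitational parameter in km^3 / s^2.
MU_EARTH_KM3_S2 = 398600.4418


def kepler_period_minutes(semi_major_axis_km):
    """Orbital period implied by Kepler's Third Law: T = 2*pi*sqrt(a^3 / mu)."""
    seconds = 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return seconds / 60.0


def violates_kepler(semi_major_axis_km, reported_period_min, rel_tol=0.05):
    """Flag a record whose reported period deviates from the implied one.

    rel_tol is a hypothetical threshold, not a value taken from the paper.
    """
    expected = kepler_period_minutes(semi_major_axis_km)
    return abs(reported_period_min - expected) / expected > rel_tol


# A geostationary orbit has a semi-major axis of roughly 42,164 km.
geo_period = kepler_period_minutes(42164.0)
```

A hard threshold like this cannot express uncertainty about noisy or missing fields, which is one reason a probabilistic treatment such as the one the paper describes may be preferable for real catalog data.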