May 5, 2019


Paper Group NANR 122


Prediction of Prospective User Engagement with Intelligent Assistants


Title	Prediction of Prospective User Engagement with Intelligent Assistants
Authors	Shumpei Sano, Nobuhiro Kaji, Manabu Sassano
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1114/
PDF	https://www.aclweb.org/anthology/P16-1114
PWC	https://paperswithcode.com/paper/prediction-of-prospective-user-engagement
Repo
Framework

Discourse Relation Sense Classification Systems for CoNLL-2016 Shared Task


Title	Discourse Relation Sense Classification Systems for CoNLL-2016 Shared Task
Authors	Ping Jian, Xiaohan She, Chenwei Zhang, Pengcheng Zhang, Jian Feng
Abstract
Tasks	Relation Classification
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-2022/
PDF	https://www.aclweb.org/anthology/K16-2022
PWC	https://paperswithcode.com/paper/discourse-relation-sense-classification-1
Repo
Framework


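RTM at SemEval-2016 Task 1: Predicting Semantic Similarity with Referential Translation Machines and Related Statistics


Title	RTM at SemEval-2016 Task 1: Predicting Semantic Similarity with Referential Translation Machines and Related Statistics
Authors	Ergun Biçici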
Abstract
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1117/
PDF	https://www.aclweb.org/anthology/S16-1117
PWC	https://paperswithcode.com/paper/rtm-at-semeval-2016-task-1-predicting-1
Repo
Framework

Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus


Title	Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus
Authors	Nicholas Asher, Julie Hunter, Mathieu Morey, Farah Benamara, Stergos Afantenos
Abstract	This paper describes the STAC resource, a corpus of multi-party chats annotated for discourse structure in the style of SDRT (Asher and Lascarides, 2003; Lascarides and Asher, 2009). The main goal of the STAC project is to study the discourse structure of multi-party dialogues in order to understand the linguistic strategies adopted by interlocutors to achieve their conversational goals, especially when these goals are opposed. The STAC corpus is not only a rich source of data on strategic conversation, but also the first corpus that we are aware of that provides full discourse structures for multi-party dialogues. It has other remarkable features that make it an interesting resource for other topics: interleaved threads, creative language, and interactions between linguistic and extra-linguistic contexts.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1432/
PDF	https://www.aclweb.org/anthology/L16-1432
PWC	https://paperswithcode.com/paper/discourse-structure-and-dialogue-acts-in
Repo
Framework

Local-Global Vectors to Improve Unigram Terminology Extraction


Title	Local-Global Vectors to Improve Unigram Terminology Extraction
Authors	Ehsan Amjadian, Diana Inkpen, Tahereh Paribakht, Farahnaz Faez
Abstract	The present paper explores a novel method that integrates efficient distributed representations with terminology extraction. We show that the information from a small number of observed instances can be combined with local and global word embeddings to remarkably improve the term extraction results on unigram terms. To do so we pass the terms extracted by other tools to a filter made of the local-global embeddings and a classifier which in turn decides whether or not a term candidate is a term. The filter can also be used as a hub to merge different term extraction tools into a single higher-performing system. We compare filters that use the skip-gram architecture and filters that employ the CBOW architecture for the task at hand.
Tasks	Word Embeddings
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4702/
PDF	https://www.aclweb.org/anthology/W16-4702
PWC	https://paperswithcode.com/paper/local-global-vectors-to-improve-unigram
Repo
Framework

GoWvis: A Web Application for Graph-of-Words-based Text Visualization and Summarization


Title	GoWvis: A Web Application for Graph-of-Words-based Text Visualization and Summarization
Authors	Antoine Tixier, Konstantinos Skianis, Michalis Vazirgiannis
Abstract
Tasks	Ad-Hoc Information Retrieval, Community Detection, Document Classification, Document Summarization, Graph Clustering, Information Retrieval, Keyword Extraction
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-4026/
PDF	https://www.aclweb.org/anthology/P16-4026
PWC	https://paperswithcode.com/paper/gowvis-a-web-application-for-graph-of-words
Repo
Framework

Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data


Title	Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data
Authors	Zhen Hai, Peilin Zhao, Peng Cheng, Peng Yang, Xiao-Li Li, Guangxia Li
Abstract
Tasks	Feature Engineering, Multi-Task Learning
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1187/
PDF	https://www.aclweb.org/anthology/D16-1187
PWC	https://paperswithcode.com/paper/deceptive-review-spam-detection-via
Repo
Framework

Improving Semantic Parsing via Answer Type Inference


Title	Improving Semantic Parsing via Answer Type Inference
Authors	Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, Xifeng Yan
Abstract
Tasks	Knowledge Base Population, Question Answering, Relation Extraction, Semantic Parsing
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1015/
PDF	https://www.aclweb.org/anthology/D16-1015
PWC	https://paperswithcode.com/paper/improving-semantic-parsing-via-answer-type
Repo
Framework

Topically-focused Blog Corpora for Multiple Languages


Title	Topically-focused Blog Corpora for Multiple Languages
Authors	Andrew Salway, Dag Elgesem, Knut Hofland, Øystein Reigem, Lubos Steskal
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/papers/W16-2603/w16-2603
PDF	https://www.aclweb.org/anthology/W16-2603
PWC	https://paperswithcode.com/paper/topically-focused-blog-corpora-for-multiple
Repo
Framework

MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking


Title	MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking
Authors	Michael Schlichtkrull, Héctor Martínez Alonso
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1209/
PDF	https://www.aclweb.org/anthology/S16-1209
PWC	https://paperswithcode.com/paper/msejrku-at-semeval-2016-task-14-taxonomy
Repo
Framework

Dimensionality Reduction of Massive Sparse Datasets Using Coresets


Title	Dimensionality Reduction of Massive Sparse Datasets Using Coresets
Authors	Dan Feldman, Mikhail Volkov, Daniela Rus
Abstract	In this paper we present a practical solution with performance guarantees to the problem of dimensionality reduction for very large scale sparse matrices. We show applications of our approach to computing the Principal Component Analysis (PCA) of any $n\times d$ matrix, using one pass over the stream of its rows. Our solution uses coresets: a scaled subset of the $n$ rows that approximates their sum of squared distances to every $k$-dimensional affine subspace. An open theoretical problem has been to compute such a coreset that is independent of both $n$ and $d$. An open practical problem has been to compute a non-trivial approximation to the PCA of very large but sparse databases such as the Wikipedia document-term matrix in a reasonable time. We answer both of these questions affirmatively. Our main technical result is a new framework for deterministic coreset constructions based on a reduction to the problem of counting items in a stream.
Tasks	Dimensionality Reduction
Published	2016-12-01
URL	http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets
PDF	http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets.pdf
PWC	https://paperswithcode.com/paper/dimensionality-reduction-of-massive-sparse
Repo
Framework

L2 Acquisition of Korean locative construction by English L1 speakers: Learnability problem in Korean Figure non-alternating verbs


Title	L2 Acquisition of Korean locative construction by English L1 speakers: Learnability problem in Korean Figure non-alternating verbs
Authors	Sun Hee Park
Abstract
Tasks
Published	2016-10-01
URL	https://www.aclweb.org/anthology/Y16-3022/
PDF	https://www.aclweb.org/anthology/Y16-3022
PWC	https://paperswithcode.com/paper/l2-acquisition-of-korean-locative
Repo
Framework

Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning


Title	Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning
Authors	Wafia Adouane, Nasredine Semmar, Richard Johansson
Abstract	Identifying the language of text or speech input is the first step in any language-dependent natural language processing. The task is called Automatic Language Identification (ALI). The field has been well studied since the early 1960s, and various methods have been applied to many standard languages. Standard ALI methods require datasets for training and use character- or word-based n-gram models. However, social media and new technologies have contributed to the rise of informal and minority languages on the Web. State-of-the-art automatic language identifiers fail to properly identify many of them. Romanized Arabic (RA) and Romanized Berber (RB) are examples of these informal, under-resourced languages. The goal of this paper is twofold: detect RA and RB, at a document level, as separate languages, and distinguish between them as they coexist in North Africa. We consider the task as a classification problem and use supervised machine learning to solve it. For both languages, character-based 5-grams combined with additional lexicons score best, with F-scores of 99.75% and 97.77% for RB and RA respectively.
Tasks	Language Identification, Transliteration
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4807/
PDF	https://www.aclweb.org/anthology/W16-4807
PWC	https://paperswithcode.com/paper/romanized-berber-and-romanized-arabic
Repo
Framework
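
The abstract above names character-based 5-grams as the best-scoring features. As a rough illustration only (this is not the authors' system; the function name `char_ngrams` and the sample string are hypothetical, and the lexicons and classifier are left out), overlapping character n-gram counts could be extracted like this:

```python
from collections import Counter


def char_ngrams(text: str, n: int = 5) -> Counter:
    """Count overlapping character n-grams in a lowercased string."""
    text = text.lower()
    # A string shorter than n yields an empty range, hence an empty Counter.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical sample input; any document string would do.
features = char_ngrams("salam salam")
```

Counts like these would typically be turned into a feature vector per document and passed to a supervised classifier.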

A Word Labeling Approach to Thai Sentence Boundary Detection and POS Tagging


Title	A Word Labeling Approach to Thai Sentence Boundary Detection and POS Tagging
Authors	Nina Zhou, AiTi Aw, Nattadaporn Lertcheva, Xuancong Wang
Abstract	Previous studies on Thai Sentence Boundary Detection (SBD) mostly treated the task as a space disambiguation problem, assuming that sentences end at a space and classifying each space as either an indicator of a Sentence Boundary (SB) or a non-Sentence Boundary (nSB). In this paper, we propose a word labeling approach which treats space as a normal word and detects SB between any two words. This removes the restriction that SB can occur only at a space and makes our system more robust for modern Thai writing, in which space is not consistently used to indicate SB. As syntactic information contributes to better SBD, we further propose a joint Part-Of-Speech (POS) tagging and SBD framework based on a Factorial Conditional Random Field (FCRF) model. We compare the performance of our proposed approach with reported methods on the ORCHID corpus. We also performed experiments with the FCRF model on the TaLAPi corpus. The results show that the word labeling approach performs better than previous space-based classification approaches and that the FCRF joint model outperforms the LCRF model in terms of SBD in all experiments.
Tasks	Boundary Detection, Machine Translation, Part-Of-Speech Tagging
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1031/
PDF	https://www.aclweb.org/anthology/C16-1031
PWC	https://paperswithcode.com/paper/a-word-labeling-approach-to-thai-sentence
Repo
Framework
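
The word labeling idea in the abstract above (space treated as an ordinary token, with a boundary decision at every token) can be sketched as a data-preparation step. This is an assumption-laden illustration, not the authors' code: the function name `to_word_labels`, the `<sp>` space token, and the placeholder words are hypothetical, and only the SB/nSB label names come from the abstract.

```python
def to_word_labels(sentences):
    """Flatten tokenized sentences into (token, label) pairs.

    The last token of each sentence is labelled "SB"; every other token,
    including the space token, is labelled "nSB".
    """
    pairs = []
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            label = "SB" if i == len(tokens) - 1 else "nSB"
            pairs.append((tok, label))
    return pairs


# Placeholder tokens; "<sp>" stands for a space treated as a normal word.
example = to_word_labels([["w1", "<sp>", "w2"], ["w3", "w4"]])
```

A sequence labeler trained on such pairs can place a boundary after any token, not only after a space.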

A Probabilistic Programming Approach To Probabilistic Data Analysis


Title	A Probabilistic Programming Approach To Probabilistic Data Analysis
Authors	Feras Saad, Vikash K. Mansinghka
Abstract	Probabilistic techniques are central to data analysis, but different approaches can be challenging to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include discriminative machine learning, hierarchical Bayesian models, multivariate kernel methods, clustering algorithms, and arbitrary probabilistic programs. We demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling definition language and structured query language. The practical value is illustrated in two ways. First, the paper describes an analysis on a database of Earth satellites, which identifies records that probably violate Kepler’s Third Law by composing causal probabilistic programs with non-parametric Bayes in 50 lines of probabilistic code. Second, it reports the lines of code and accuracy of CGPMs compared with baseline solutions from standard machine learning libraries.
Tasks	Probabilistic Programming
Published	2016-12-01
URL	http://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis
PDF	http://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis.pdf
PWC	https://paperswithcode.com/paper/a-probabilistic-programming-approach-to
Repo
Framework
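
The satellite example in the abstract above rests on Kepler's Third Law, which ties an orbit's period to its semi-major axis. The sketch below is a plain deterministic check, not the paper's CGPM or BayesDB code; the function names, the record layout, and the 5% tolerance are assumptions made for illustration.

```python
import math

# Standard gravitational parameter of Earth (GM), in m^3 / s^2.
MU_EARTH = 3.986004418e14


def kepler_period_seconds(semi_major_axis_m: float) -> float:
    """Orbital period from Kepler's Third Law: T = 2*pi*sqrt(a^3 / mu)."""
    return 2.0 * math.pi * math.sqrt(semi_major_axis_m ** 3 / MU_EARTH)


def violates_kepler(semi_major_axis_m: float, reported_period_s: float,
                    rel_tol: float = 0.05) -> bool:
    """Flag a record whose reported period deviates by more than rel_tol."""
    expected = kepler_period_seconds(semi_major_axis_m)
    return abs(reported_period_s - expected) > rel_tol * expected


# A geostationary orbit (a of about 42,164 km) has a period near one sidereal day.
geo_period = kepler_period_seconds(42_164_000.0)
```

A probabilistic treatment, as described in the abstract, would model the deviation rather than apply a fixed threshold.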