May 5, 2019

Paper Group NANR 122

Prediction of Prospective User Engagement with Intelligent Assistants

Title Prediction of Prospective User Engagement with Intelligent Assistants
Authors Shumpei Sano, Nobuhiro Kaji, Manabu Sassano
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1114/
PDF https://www.aclweb.org/anthology/P16-1114
PWC https://paperswithcode.com/paper/prediction-of-prospective-user-engagement
Repo
Framework

Discourse Relation Sense Classification Systems for CoNLL-2016 Shared Task

Title Discourse Relation Sense Classification Systems for CoNLL-2016 Shared Task
Authors Ping Jian, Xiaohan She, Chenwei Zhang, Pengcheng Zhang, Jian Feng
Abstract
Tasks Relation Classification
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-2022/
PDF https://www.aclweb.org/anthology/K16-2022
PWC https://paperswithcode.com/paper/discourse-relation-sense-classification-1
Repo
Framework
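
RTM at SemEval-2016 Task 1: Predicting Semantic Similarity with Referential Translation Machines and Related Statistics

Title RTM at SemEval-2016 Task 1: Predicting Semantic Similarity with Referential Translation Machines and Related Statistics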
Authors Ergun Biçici
Abstract
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1117/
PDF https://www.aclweb.org/anthology/S16-1117
PWC https://paperswithcode.com/paper/rtm-at-semeval-2016-task-1-predicting-1
Repo
Framework

Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus

Title Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus
Authors Nicholas Asher, Julie Hunter, Mathieu Morey, Benamara Farah, Stergos Afantenos
Abstract This paper describes the STAC resource, a corpus of multi-party chats annotated for discourse structure in the style of SDRT (Asher and Lascarides, 2003; Lascarides and Asher, 2009). The main goal of the STAC project is to study the discourse structure of multi-party dialogues in order to understand the linguistic strategies adopted by interlocutors to achieve their conversational goals, especially when these goals are opposed. The STAC corpus is not only a rich source of data on strategic conversation, but also the first corpus that we are aware of that provides full discourse structures for multi-party dialogues. It has other remarkable features that make it an interesting resource for other topics: interleaved threads, creative language, and interactions between linguistic and extra-linguistic contexts.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1432/
PDF https://www.aclweb.org/anthology/L16-1432
PWC https://paperswithcode.com/paper/discourse-structure-and-dialogue-acts-in
Repo
Framework

Local-Global Vectors to Improve Unigram Terminology Extraction

Title Local-Global Vectors to Improve Unigram Terminology Extraction
Authors Ehsan Amjadian, Diana Inkpen, Tahereh Paribakht, Farahnaz Faez
Abstract The present paper explores a novel method that integrates efficient distributed representations with terminology extraction. We show that the information from a small number of observed instances can be combined with local and global word embeddings to remarkably improve the term extraction results on unigram terms. To do so we pass the terms extracted by other tools to a filter made of the local-global embeddings and a classifier which in turn decides whether or not a term candidate is a term. The filter can also be used as a hub to merge different term extraction tools into a single higher-performing system. We compare filters that use the skip-gram architecture and filters that employ the CBOW architecture for the task at hand.
Tasks Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4702/
PDF https://www.aclweb.org/anthology/W16-4702
PWC https://paperswithcode.com/paper/local-global-vectors-to-improve-unigram
Repo
Framework

GoWvis: A Web Application for Graph-of-Words-based Text Visualization and Summarization

Title GoWvis: A Web Application for Graph-of-Words-based Text Visualization and Summarization
Authors Antoine Tixier, Konstantinos Skianis, Michalis Vazirgiannis
Abstract
Tasks Ad-Hoc Information Retrieval, Community Detection, Document Classification, Document Summarization, Graph Clustering, Information Retrieval, Keyword Extraction
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-4026/
PDF https://www.aclweb.org/anthology/P16-4026
PWC https://paperswithcode.com/paper/gowvis-a-web-application-for-graph-of-words
Repo
Framework

Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data

Title Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data
Authors Zhen Hai, Peilin Zhao, Peng Cheng, Peng Yang, Xiao-Li Li, Guangxia Li
Abstract
Tasks Feature Engineering, Multi-Task Learning
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1187/
PDF https://www.aclweb.org/anthology/D16-1187
PWC https://paperswithcode.com/paper/deceptive-review-spam-detection-via
Repo
Framework

Improving Semantic Parsing via Answer Type Inference

Title Improving Semantic Parsing via Answer Type Inference
Authors Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, Xifeng Yan
Abstract
Tasks Knowledge Base Population, Question Answering, Relation Extraction, Semantic Parsing
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1015/
PDF https://www.aclweb.org/anthology/D16-1015
PWC https://paperswithcode.com/paper/improving-semantic-parsing-via-answer-type
Repo
Framework

Topically-focused Blog Corpora for Multiple Languages

Title Topically-focused Blog Corpora for Multiple Languages
Authors Andrew Salway, Dag Elgesem, Knut Hofland, Øystein Reigem, Lubos Steskal
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/papers/W16-2603/w16-2603
PDF https://www.aclweb.org/anthology/W16-2603
PWC https://paperswithcode.com/paper/topically-focused-blog-corpora-for-multiple
Repo
Framework

MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking

Title MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking
Authors Michael Schlichtkrull, Héctor Martínez Alonso
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1209/
PDF https://www.aclweb.org/anthology/S16-1209
PWC https://paperswithcode.com/paper/msejrku-at-semeval-2016-task-14-taxonomy
Repo
Framework

Dimensionality Reduction of Massive Sparse Datasets Using Coresets

Title Dimensionality Reduction of Massive Sparse Datasets Using Coresets
Authors Dan Feldman, Mikhail Volkov, Daniela Rus
Abstract In this paper we present a practical solution with performance guarantees to the problem of dimensionality reduction for very large scale sparse matrices. We show applications of our approach to computing the Principal Component Analysis (PCA) of any $n\times d$ matrix, using one pass over the stream of its rows. Our solution uses coresets: a scaled subset of the $n$ rows that approximates their sum of squared distances to every $k$-dimensional affine subspace. An open theoretical problem has been to compute such a coreset that is independent of both $n$ and $d$. An open practical problem has been to compute a non-trivial approximation to the PCA of very large but sparse databases such as the Wikipedia document-term matrix in a reasonable time. We answer both of these questions affirmatively. Our main technical result is a new framework for deterministic coreset constructions based on a reduction to the problem of counting items in a stream.
Tasks Dimensionality Reduction
Published 2016-12-01
URL http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets
PDF http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets.pdf
PWC https://paperswithcode.com/paper/dimensionality-reduction-of-massive-sparse
Repo
Framework

L2 Acquisition of Korean locative construction by English L1 speakers: Learnability problem in Korean Figure non-alternating verbs

Title L2 Acquisition of Korean locative construction by English L1 speakers: Learnability problem in Korean Figure non-alternating verbs
Authors Sun Hee Park
Abstract
Tasks
Published 2016-10-01
URL https://www.aclweb.org/anthology/Y16-3022/
PDF https://www.aclweb.org/anthology/Y16-3022
PWC https://paperswithcode.com/paper/l2-acquisition-of-korean-locative
Repo
Framework

Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning

Title Romanized Berber and Romanized Arabic Automatic Language Identification Using Machine Learning
Authors Wafia Adouane, Nasredine Semmar, Richard Johansson
Abstract The identification of the language of text/speech input is the first step to be able to properly do any language-dependent natural language processing. The task is called Automatic Language Identification (ALI). The field has been well studied since the early 1960s, and various methods have been applied to many standard languages. Standard ALI methods require datasets for training and use character/word-based n-gram models. However, social media and new technologies have contributed to the rise of informal and minority languages on the Web. The state-of-the-art automatic language identifiers fail to properly identify many of them. Romanized Arabic (RA) and Romanized Berber (RB) are cases of these informal languages which are under-resourced. The goal of this paper is twofold: detect RA and RB, at a document level, as separate languages and distinguish between them as they coexist in North Africa. We consider the task as a classification problem and use supervised machine learning to solve it. For both languages, character-based 5-grams combined with additional lexicons score the best, with F-scores of 99.75% and 97.77% for RB and RA respectively.
Tasks Language Identification, Transliteration
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4807/
PDF https://www.aclweb.org/anthology/W16-4807
PWC https://paperswithcode.com/paper/romanized-berber-and-romanized-arabic
Repo
Framework
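
The character 5-gram features mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' system: it swaps their supervised classifier and lexicons for a simple n-gram profile overlap score, and the function names (`char_ngrams`, `train_profiles`, `classify`) and the toy training strings are hypothetical placeholders chosen only for illustration.

```python
from collections import Counter


def char_ngrams(text, n=5):
    """Return all overlapping character n-grams of text, padded with one space on each side."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train_profiles(labeled_docs, n=5):
    """Build one n-gram count profile per label from (label, text) pairs."""
    profiles = {}
    for label, text in labeled_docs:
        profiles.setdefault(label, Counter()).update(char_ngrams(text, n))
    return profiles


def classify(text, profiles, n=5):
    """Pick the label whose profile shares the most n-gram occurrences with text."""
    grams = Counter(char_ngrams(text, n))

    def score(label):
        profile = profiles[label]
        return sum(min(count, profile[g]) for g, count in grams.items())

    # sorted() makes tie-breaking deterministic (first label alphabetically wins).
    return max(sorted(profiles), key=score)


# Toy placeholder data (assumption): two made-up "languages" X and Y.
toy_docs = [("X", "hello world hello"), ("Y", "bonjour monde bonjour")]
toy_profiles = train_profiles(toy_docs)
```

With these toy profiles, `classify("hello there", toy_profiles)` selects the label whose training text shares 5-grams with the input. A real system would need labeled corpora and, per the abstract, additional lexicons.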

A Word Labeling Approach to Thai Sentence Boundary Detection and POS Tagging

Title A Word Labeling Approach to Thai Sentence Boundary Detection and POS Tagging
Authors Nina Zhou, AiTi Aw, Nattadaporn Lertcheva, Xuancong Wang
Abstract Previous studies on Thai Sentence Boundary Detection (SBD) mostly treated the task as a space disambiguation problem, classifying each space as either an indicator of a Sentence Boundary (SB) or a non-Sentence Boundary (nSB). In this paper, we propose a word labeling approach which treats space as a normal word and detects SB between any two words. This removes the restriction that SB can occur only at a space and makes our system more robust for modern Thai writing, in which space is not consistently used to indicate SB. As syntactic information contributes to better SBD, we further propose a joint Part-Of-Speech (POS) tagging and SBD framework based on a Factorial Conditional Random Field (FCRF) model. We compare the performance of our proposed approach with reported methods on the ORCHID corpus. We also performed experiments with the FCRF model on the TaLAPi corpus. The results show that the word labeling approach performs better than previous space-based classification approaches and that the FCRF joint model outperforms the LCRF model in terms of SBD in all experiments.
Tasks Boundary Detection, Machine Translation, Part-Of-Speech Tagging
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1031/
PDF https://www.aclweb.org/anthology/C16-1031
PWC https://paperswithcode.com/paper/a-word-labeling-approach-to-thai-sentence
Repo
Framework
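
The word labeling formulation in the abstract above, where a space is just another token and a boundary may follow any token, can be sketched as a data transformation. The label scheme below (tagging the last token of each sentence `SB` and every other token `nSB`), the `<space>` token, and the function names are assumptions made for illustration; the paper's actual tag set and its FCRF model are not reproduced here.

```python
SPACE = "<space>"  # hypothetical token standing in for a written space


def to_labeled_sequence(sentences):
    """Flatten tokenized sentences into (tokens, labels), marking sentence-final tokens as SB."""
    tokens, labels = [], []
    for sentence in sentences:
        for i, tok in enumerate(sentence):
            tokens.append(tok)
            labels.append("SB" if i == len(sentence) - 1 else "nSB")
    return tokens, labels


def split_on_labels(tokens, labels):
    """Rebuild sentences from a token sequence and its predicted SB/nSB labels."""
    sentences, current = [], []
    for tok, lab in zip(tokens, labels):
        current.append(tok)
        if lab == "SB":
            sentences.append(current)
            current = []
    if current:  # trailing tokens with no closing SB label
        sentences.append(current)
    return sentences


# Placeholder tokens (assumption): the first sentence ends on a word, not a space.
example = [["w1", SPACE, "w2"], ["w3", "w4", SPACE]]
example_tokens, example_labels = to_labeled_sequence(example)
```

Because every token carries a label, a boundary can fall after `w2` even though no space follows it, which is the restriction the abstract says the space-based formulation could not lift.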

A Probabilistic Programming Approach To Probabilistic Data Analysis

Title A Probabilistic Programming Approach To Probabilistic Data Analysis
Authors Feras Saad, Vikash K. Mansinghka
Abstract Probabilistic techniques are central to data analysis, but different approaches can be challenging to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include discriminative machine learning, hierarchical Bayesian models, multivariate kernel methods, clustering algorithms, and arbitrary probabilistic programs. We demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling definition language and structured query language. The practical value is illustrated in two ways. First, the paper describes an analysis on a database of Earth satellites, which identifies records that probably violate Kepler’s Third Law by composing causal probabilistic programs with non-parametric Bayes in 50 lines of probabilistic code. Second, it reports the lines of code and accuracy of CGPMs compared with baseline solutions from standard machine learning libraries.
Tasks Probabilistic Programming
Published 2016-12-01
URL http://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis
PDF http://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis.pdf
PWC https://paperswithcode.com/paper/a-probabilistic-programming-approach-to
Repo
Framework
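
The satellite analysis in the abstract above rests on Kepler's Third Law, which ties an orbit's period to its semi-major axis. The sketch below is a plain deterministic check, not the CGPM or BayesDB approach the paper describes: it computes the period implied by a semi-major axis and flags records whose reported period deviates by more than a relative tolerance. The function names, the tolerance, and the idea of a per-record threshold are assumptions for illustration.

```python
import math

# Earth's standard gravitational parameter (GM), in km^3 / s^2.
MU_EARTH_KM3_S2 = 398600.4418


def kepler_period_minutes(semi_major_axis_km):
    """Orbital period implied by Kepler's Third Law: T = 2*pi*sqrt(a^3 / GM)."""
    seconds = 2 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return seconds / 60


def violates_kepler(semi_major_axis_km, reported_period_min, rel_tol=0.05):
    """Flag a record whose reported period is more than rel_tol away from the implied one."""
    expected = kepler_period_minutes(semi_major_axis_km)
    return abs(reported_period_min - expected) / expected > rel_tol
```

For a geostationary semi-major axis of about 42164 km the implied period is close to one sidereal day (roughly 1436 minutes), so a record pairing that axis with a 100-minute period would be flagged. The paper instead scores such records probabilistically, which a hard threshold like this does not capture.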