May 5, 2019

Paper Group NANR 119

thecerealkiller at SemEval-2016 Task 4: Deep Learning based System for Classifying Sentiment of Tweets on Two Point Scale. What’s the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems. Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing. Leveraging Entity Linking and Related Language …

thecerealkiller at SemEval-2016 Task 4: Deep Learning based System for Classifying Sentiment of Tweets on Two Point Scale

Title thecerealkiller at SemEval-2016 Task 4: Deep Learning based System for Classifying Sentiment of Tweets on Two Point Scale
Authors Vikrant Yadav
Abstract
Tasks Sentiment Analysis, Word Embeddings
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1013/
PDF https://www.aclweb.org/anthology/S16-1013
PWC https://paperswithcode.com/paper/thecerealkiller-at-semeval-2016-task-4-deep
Repo
Framework

What’s the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems

Title What’s the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems
Authors Emma Barker, Monica Paramita, Adam Funk, Emina Kurtic, Ahmet Aker, Jonathan Foster, Mark Hepple, Robert Gaizauskas
Abstract Automatic summarization of reader comments in on-line news is an extremely challenging task and a capability for which there is a clear need. Work to date has focussed on producing extractive summaries using well-known techniques imported from other areas of language processing. But are extractive summaries of comments what users really want? Do they support users in performing the sorts of tasks they are likely to want to perform with reader comments? In this paper we address these questions by doing three things. First, we offer a specification of one possible summary type for reader comment, based on an analysis of reader comment in terms of issues and viewpoints. Second, we define a task-based evaluation framework for reader comment summarization that allows summarization systems to be assessed in terms of how well they support users in a time-limited task of identifying issues and characterising opinion on issues in comments. Third, we describe a pilot evaluation in which we used the task-based evaluation framework to evaluate a prototype reader comment clustering and summarization system, demonstrating the viability of the evaluation framework and illustrating the sorts of insight such an evaluation affords.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1494/
PDF https://www.aclweb.org/anthology/L16-1494
PWC https://paperswithcode.com/paper/whats-the-issue-here-task-based-evaluation-of
Repo
Framework

Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing

Title Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing
Authors Manuel Burghardt, Daniel Granvogl, Christian Wolff
Abstract Data acquisition in dialectology is typically a tedious task, as dialect samples of spoken language have to be collected via questionnaires or interviews. In this article, we suggest to use the "web as a corpus" approach for dialectology. We present a case study that demonstrates how authentic language data for the Bavarian dialect (ISO 639-3:bar) can be collected automatically from the social network Facebook. We also show that Facebook can be used effectively as a crowdsourcing platform, where users are willing to translate dialect words collaboratively in order to create a common lexicon of their Bavarian dialect. Key insights from the case study are summarized as "lessons learned", together with suggestions for future enhancements of the lexicon creation approach.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1321/
PDF https://www.aclweb.org/anthology/L16-1321
PWC https://paperswithcode.com/paper/creating-a-lexicon-of-bavarian-dialect-by
Repo
Framework

Leveraging Entity Linking and Related Language Projection to Improve Name Transliteration

Title Leveraging Entity Linking and Related Language Projection to Improve Name Transliteration
Authors Ying Lin, Xiaoman Pan, Aliya Deri, Heng Ji, Kevin Knight
Abstract
Tasks Entity Disambiguation, Entity Linking, Language Modelling, Machine Translation, Transliteration
Published 2016-08-01
URL https://www.aclweb.org/anthology/papers/W16-2701/w16-2701
PDF https://www.aclweb.org/anthology/W16-2701v2
PWC https://paperswithcode.com/paper/leveraging-entity-linking-and-related
Repo
Framework

An LFG Account of Discontinuous Nominal Expressions

Title An LFG Account of Discontinuous Nominal Expressions
Authors Liselotte Snijders
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0901/
PDF https://www.aclweb.org/anthology/W16-0901
PWC https://paperswithcode.com/paper/an-lfg-account-of-discontinuous-nominal
Repo
Framework

A Discourse-Annotated Corpus of Conjoined VPs

Title A Discourse-Annotated Corpus of Conjoined VPs
Authors Bonnie Webber, Rashmi Prasad, Alan Lee, Aravind Joshi
Abstract
Tasks Common Sense Reasoning, Machine Translation, Question Answering
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1704/
PDF https://www.aclweb.org/anthology/W16-1704
PWC https://paperswithcode.com/paper/a-discourse-annotated-corpus-of-conjoined-vps
Repo
Framework

Morphological Reinflection via Discriminative String Transduction

Title Morphological Reinflection via Discriminative String Transduction
Authors Garrett Nicolai, Bradley Hauer, Adam St Arnaud, Grzegorz Kondrak
Abstract
Tasks Lemmatization
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2005/
PDF https://www.aclweb.org/anthology/W16-2005
PWC https://paperswithcode.com/paper/morphological-reinflection-via-discriminative
Repo
Framework

Discovery of Treatments from Text Corpora

Title Discovery of Treatments from Text Corpora
Authors Christian Fong, Justin Grimmer
Abstract
Tasks Topic Models
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1151/
PDF https://www.aclweb.org/anthology/P16-1151
PWC https://paperswithcode.com/paper/discovery-of-treatments-from-text-corpora
Repo
Framework

基於詞語分布均勻度的核心詞彙選擇 (A Study on Dispersion Measures for Core Vocabulary Compilation) [In Chinese]

Title 基於詞語分布均勻度的核心詞彙選擇 (A Study on Dispersion Measures for Core Vocabulary Compilation) [In Chinese]
Authors Ming-Hong Bai, Jian-Cheng Wu, Ying-Ni Chien, Shu-Ling Huang, Ching-Lung Lin
Abstract
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/O16-3001/
PDF https://www.aclweb.org/anthology/O16-3001
PWC https://paperswithcode.com/paper/ao14eeaaa-aaaoc-aea12e-a-study-on-dispersion
Repo
Framework

Character Sequence Models for Colorful Words

Title Character Sequence Models for Colorful Words
Authors Kazuya Kawakami, Chris Dyer, Bryan Routledge, Noah A. Smith
Abstract
Tasks Language Modelling
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1202/
PDF https://www.aclweb.org/anthology/D16-1202
PWC https://paperswithcode.com/paper/character-sequence-models-for-colorful-words
Repo
Framework

Name Translation based on Fine-grained Named Entity Recognition in a Single Language

Title Name Translation based on Fine-grained Named Entity Recognition in a Single Language
Authors Kugatsu Sadamitsu, Itsumi Saito, Taichi Katayama, Hisako Asano, Yoshihiro Matsuo
Abstract We propose named entity abstraction methods with fine-grained named entity labels for improving statistical machine translation (SMT). The methods are based on a bilingual named entity recognizer that uses a monolingual named entity recognizer with transliteration. Through experiments, we demonstrate that incorporating fine-grained named entities into statistical machine translation improves the accuracy of SMT with more adequate granularity compared with the standard SMT, which is a non-named entity abstraction method.
Tasks Machine Translation, Named Entity Recognition, Transliteration
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1097/
PDF https://www.aclweb.org/anthology/L16-1097
PWC https://paperswithcode.com/paper/name-translation-based-on-fine-grained-named
Repo
Framework

Improving Patent Translation using Bilingual Term Extraction and Re-tokenization for Chinese–Japanese

Title Improving Patent Translation using Bilingual Term Extraction and Re-tokenization for Chinese–Japanese
Authors Wei Yang, Yves Lepage
Abstract Unlike European languages, many Asian languages like Chinese and Japanese do not have typographic boundaries in written system. Word segmentation (tokenization) that break sentences down into individual words (tokens) is normally treated as the first step for machine translation (MT). For Chinese and Japanese, different rules and segmentation tools lead different segmentation results in different level of granularity between Chinese and Japanese. To improve the translation accuracy, we adjust and balance the granularity of segmentation results around terms for Chinese–Japanese patent corpus for training translation model. In this paper, we describe a statistical machine translation (SMT) system which is built on re-tokenized Chinese-Japanese patent training corpus using extracted bilingual multi-word terms.
Tasks Chinese Word Segmentation, Machine Translation, Tokenization
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4619/
PDF https://www.aclweb.org/anthology/W16-4619
PWC https://paperswithcode.com/paper/improving-patent-translation-using-bilingual
Repo
Framework

PKUSUMSUM : A Java Platform for Multilingual Document Summarization

Title PKUSUMSUM : A Java Platform for Multilingual Document Summarization
Authors Jianmin Zhang, Tianming Wang, Xiaojun Wan
Abstract PKUSUMSUM is a Java platform for multilingual document summarization, and it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks. The summarization platform has been released and users can easily use and update it. In this paper, we make a brief description of the characteristics, the summarization methods, and the evaluation results of the platform, and also compare PKUSUMSUM with other summarization toolkits.
Tasks Chinese Word Segmentation, Document Summarization, Information Retrieval
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2060/
PDF https://www.aclweb.org/anthology/C16-2060
PWC https://paperswithcode.com/paper/pkusumsum-a-java-platform-for-multilingual
Repo
Framework

A Corpus of Read and Spontaneous Upper Saxon German Speech for ASR Evaluation

Title A Corpus of Read and Spontaneous Upper Saxon German Speech for ASR Evaluation
Authors Robert Herms, Laura Seelig, Stefanie Münch, Maximilian Eibl
Abstract In this Paper we present a corpus named SXUCorpus which contains read and spontaneous speech of the Upper Saxon German dialect. The data has been collected from eight archives of local television stations located in the Free State of Saxony. The recordings include broadcasted topics of news, economy, weather, sport, and documentation from the years 1992 to 1996 and have been manually transcribed and labeled. In the paper, we report the methodology of collecting and processing analog audiovisual material, constructing the corpus and describe the properties of the data. In its current version, the corpus is available to the scientific community and is designed for automatic speech recognition (ASR) evaluation with a development set and a test set. We performed ASR experiments with the open-source framework sphinx-4 including a configuration for Standard German on the dataset. Additionally, we show the influence of acoustic model and language model adaptation by the utilization of the development set.
Tasks Language Modelling, Speech Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1736/
PDF https://www.aclweb.org/anthology/L16-1736
PWC https://paperswithcode.com/paper/a-corpus-of-read-and-spontaneous-upper-saxon
Repo
Framework

A Fluctuation Smoothing Approach for Unsupervised Automatic Short Answer Grading

Title A Fluctuation Smoothing Approach for Unsupervised Automatic Short Answer Grading
Authors Shourya Roy, Sandipan Dandapat, Y. Narahari
Abstract We offer a fluctuation smoothing computational approach for unsupervised automatic short answer grading (ASAG) techniques in the educational ecosystem. A major drawback of the existing techniques is the significant effect that variations in model answers could have on their performances. The proposed fluctuation smoothing approach, based on classical sequential pattern mining, exploits lexical overlap in students' answers to any typical question. We empirically demonstrate using multiple datasets that the proposed approach improves the overall performance and significantly reduces (up to 63%) variation in performance (standard deviation) of unsupervised ASAG techniques. We bring in additional benchmarks such as (a) paraphrasing of model answers and (b) using answers by k top performing students as model answers, to amplify the benefits of the proposed approach.
Tasks Sequential Pattern Mining
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4911/
PDF https://www.aclweb.org/anthology/W16-4911
PWC https://paperswithcode.com/paper/a-fluctuation-smoothing-approach-for
Repo
Framework