July 26, 2019

2090 words 10 mins read

Paper Group NANR 24

On the annotation of vague expressions: a case study on Romanian historical texts. Summarizing Lengthy Questions. Identifying Speakers and Listeners of Quoted Speech in Literary Works. Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes. Selective Decoding for Cross-lingual Open Information Extraction. Collecting fluency correct …

On the annotation of vague expressions: a case study on Romanian historical texts

Title On the annotation of vague expressions: a case study on Romanian historical texts
Authors Anca Dinu, Walther von Hahn, Cristina Vertan
Abstract Current approaches in Digital Humanities tend to ignore a central aspect of any hermeneutic introspection: the intrinsic vagueness of the analyzed texts. Especially when dealing with historical documents, neglecting vagueness has important implications for the interpretation of the results. In this paper we present the limitations of current annotation approaches and describe a methodology for annotating vagueness in historical Romanian texts.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-8104/
PDF http://doi.org/10.26615/978-954-452-046-5_004
PWC https://paperswithcode.com/paper/on-the-annotation-of-vague-expressions-a-case
Repo
Framework

Summarizing Lengthy Questions

Title Summarizing Lengthy Questions
Authors Tatsuya Ishigaki, Hiroya Takamura, Manabu Okumura
Abstract In this research, we propose the task of question summarization. We first analyzed question-summary pairs extracted from a Community Question Answering (CQA) site, and found that a proportion of questions cannot be summarized by extractive approaches but instead require abstractive approaches. We created a dataset by regarding the question-title pairs posted on the CQA site as question-summary pairs. Using this data, we trained extractive and abstractive summarization models and compared them based on ROUGE scores and manual evaluations. Our experimental results show that an abstractive method using an encoder-decoder model with a copying mechanism achieves better scores both for ROUGE-2 F-measure and in the evaluations by human judges.
Tasks Abstractive Text Summarization, Community Question Answering, Question Answering
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1080/
PDF https://www.aclweb.org/anthology/I17-1080
PWC https://paperswithcode.com/paper/summarizing-lengthy-questions
Repo
Framework
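
To make the reported metric concrete, here is a minimal, illustrative Python sketch of the ROUGE-2 F-measure, the bigram-overlap score the paper above uses to compare extractive and abstractive summaries. It is not the official ROUGE toolkit; tokenization and smoothing are deliberately simplified.

```python
# Minimal ROUGE-2 F-measure sketch (illustrative, not the official toolkit).
from collections import Counter

def bigrams(tokens):
    """Return the multiset of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))

def rouge2_f(candidate, reference):
    """ROUGE-2 F-measure between two whitespace-tokenized strings."""
    cand, ref = bigrams(candidate.split()), bigrams(reference.split())
    overlap = sum((cand & ref).values())  # clipped bigram matches
    if not cand or not ref or overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge2_f("how do i reset my password", "how to reset my password"))  # ~0.444
```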

Identifying Speakers and Listeners of Quoted Speech in Literary Works

Title Identifying Speakers and Listeners of Quoted Speech in Literary Works
Authors Chak Yan Yeung, John Lee
Abstract We present the first study that evaluates both speaker and listener identification for direct speech in literary texts. Our approach consists of two steps: identification of speakers and listeners near the quotes, and dialogue chain segmentation. Evaluation results show that this approach outperforms a rule-based approach that is state-of-the-art on a corpus of literary texts.
Tasks Speaker Identification
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2055/
PDF https://www.aclweb.org/anthology/I17-2055
PWC https://paperswithcode.com/paper/identifying-speakers-and-listeners-of-quoted
Repo
Framework

Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes

Title Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes
Authors Francesco Barbieri, Luis Espinosa-Anke, Miguel Ballesteros, Juan Soler-Company, Horacio Saggion
Abstract Videogame streaming platforms have become a paramount example of noisy user-generated text. These are websites where gaming is broadcast and where viewers can interact via integrated chatrooms. Probably the best-known platform of this kind is Twitch, which has more than 100 million monthly viewers. Despite these numbers, and unlike other platforms featuring short messages (e.g. Twitter), Twitch has not received much attention from the Natural Language Processing community. In this paper we aim at bridging this gap by proposing two important tasks specific to the Twitch platform, namely (1) emote prediction and (2) trolling detection. In our experiments, we evaluate three models: a bag-of-words baseline, a supervised logistic classifier based on word embeddings, and a bidirectional long short-term memory recurrent neural network (LSTM). Our results show that the LSTM model outperforms the other two models, in which explicit features with proven effectiveness for similar tasks were encoded.
Tasks Information Retrieval, Stance Detection, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4402/
PDF https://www.aclweb.org/anthology/W17-4402
PWC https://paperswithcode.com/paper/towards-the-understanding-of-gaming-audiences
Repo
Framework
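
For readers unfamiliar with the second baseline in the paper above, here is a rough sketch of a supervised logistic classifier over averaged word embeddings. The tiny random embedding table, the example messages, and the labels are all hypothetical placeholders, not the authors' data or features.

```python
# Logistic classifier over averaged word embeddings (hypothetical toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression

dim = 50
rng = np.random.default_rng(0)
# Stand-in embedding table; a real system would load pretrained vectors.
emb = {w: rng.normal(size=dim) for w in "kappa gg lol wtf nice troll".split()}

def message_vector(text):
    """Average the embeddings of known tokens; zeros if none are known."""
    vecs = [emb[t] for t in text.lower().split() if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

messages = ["gg nice", "kappa lol", "wtf troll", "nice gg lol"]
labels = [0, 1, 1, 0]  # invented binary labels for illustration

X = np.stack([message_vector(m) for m in messages])
clf = LogisticRegression().fit(X, labels)
print(clf.predict([message_vector("gg kappa")]))
```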

Selective Decoding for Cross-lingual Open Information Extraction

Title Selective Decoding for Cross-lingual Open Information Extraction
Authors Sheng Zhang, Kevin Duh, Benjamin Van Durme
Abstract Cross-lingual open information extraction is the task of distilling facts from the source language into representations in the target language. We propose a novel encoder-decoder model for this problem. It employs a selective decoding mechanism, which explicitly models the sequence labeling process as well as the sequence generation process on the decoder side. Compared to a standard encoder-decoder model, selective decoding significantly increases performance on a Chinese-English cross-lingual open IE dataset by 3.87-4.49 BLEU and 1.91-5.92 F1. We also extend our approach to low-resource scenarios and obtain promising improvements.
Tasks Machine Translation, Open Information Extraction
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1084/
PDF https://www.aclweb.org/anthology/I17-1084
PWC https://paperswithcode.com/paper/selective-decoding-for-cross-lingual-open
Repo
Framework
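
The abstract above only names the mechanism, so the following is a speculative, simplified sketch of the general idea behind selective decoding: at each decoder step a learned gate mixes a distribution over source positions (the sequence-labeling side) with a distribution over the target vocabulary (the generation side). This is a pointer-generator-style illustration, not the authors' architecture; all layer names and sizes are invented.

```python
# Hypothetical gate mixing a copy distribution with a generation distribution.
import torch
import torch.nn as nn

class SelectiveStep(nn.Module):
    def __init__(self, hidden, vocab):
        super().__init__()
        self.gate = nn.Linear(hidden, 1)     # p(copy) vs. p(generate)
        self.gen = nn.Linear(hidden, vocab)  # target-vocabulary logits

    def forward(self, dec_state, enc_states):
        # dec_state: (batch, hidden); enc_states: (batch, src_len, hidden)
        p_copy = torch.sigmoid(self.gate(dec_state))  # (batch, 1)
        # Attention-style scores over source positions act as the copy distribution.
        copy_dist = torch.softmax(
            torch.bmm(enc_states, dec_state.unsqueeze(2)).squeeze(2), dim=1)
        gen_dist = torch.softmax(self.gen(dec_state), dim=1)
        # Concatenate source-position and vocabulary probabilities.
        return torch.cat([p_copy * copy_dist, (1 - p_copy) * gen_dist], dim=1)

step = SelectiveStep(hidden=16, vocab=100)
out = step(torch.randn(2, 16), torch.randn(2, 7, 16))
print(out.shape)  # (2, 107): 7 source positions + 100 vocabulary entries
```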

Collecting fluency corrections for spoken learner English

Title Collecting fluency corrections for spoken learner English
Authors Andrew Caines, Emma Flint, Paula Buttery
Abstract We present a crowdsourced collection of error annotations for transcriptions of spoken learner English. Our emphasis in data collection is on fluency corrections, a more complete form of correction than has traditionally been aimed for in grammatical error correction (GEC) research. Fluency corrections require improvements to the text that take discourse- and utterance-level semantics into account: the result is a more naturalistic, holistic version of the original. We propose that this shifted emphasis be reflected in a new name for the task: 'holistic error correction' (HEC). We analyse crowdworker behaviour in HEC and conclude that the method is useful, with certain amendments, for future work.
Tasks Grammatical Error Correction, Grammatical Error Detection, Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5010/
PDF https://www.aclweb.org/anthology/W17-5010
PWC https://paperswithcode.com/paper/collecting-fluency-corrections-for-spoken
Repo
Framework

Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Title Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Authors
Abstract
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1000/
PDF https://www.aclweb.org/anthology/K17-1000
PWC https://paperswithcode.com/paper/proceedings-of-the-21st-conference-on
Repo
Framework

Neural Net Models of Open-domain Discourse Coherence

Title Neural Net Models of Open-domain Discourse Coherence
Authors Jiwei Li, Dan Jurafsky
Abstract Discourse coherence is strongly associated with text quality, making it important to natural language generation and understanding. Yet existing models of coherence focus on measuring individual aspects of coherence (lexical overlap, rhetorical structure, entity centering) in narrow domains. In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences. We study both discriminative models that learn to distinguish coherent from incoherent discourse, and generative models that produce coherent text, including a novel neural latent-variable Markovian generative model that captures the latent discourse dependencies between sentences in a text. Our work achieves state-of-the-art performance on multiple coherence evaluations, and marks an initial step in generating coherent texts given discourse contexts.
Tasks Abstractive Text Summarization, Question Answering, Sentence Embeddings, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1019/
PDF https://www.aclweb.org/anthology/D17-1019
PWC https://paperswithcode.com/paper/neural-net-models-of-open-domain-discourse
Repo
Framework
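
A common way to train the discriminative models the abstract above mentions is to treat a document's original sentence order as a coherent example and a random permutation of the same sentences as an incoherent one. The sketch below shows only this data-construction step; it is an assumption about the setup, not code from the paper.

```python
# Build (sentence_list, label) training pairs for a coherence discriminator.
import random

def coherence_pairs(document, seed=0):
    """Yield (sentences, label) pairs: 1 = original order, 0 = shuffled."""
    assert len(set(document)) > 1, "need at least two distinct sentences"
    rng = random.Random(seed)
    yield document, 1
    shuffled = document[:]
    while shuffled == document:  # ensure the permutation actually differs
        rng.shuffle(shuffled)
    yield shuffled, 0

doc = ["He woke up late.", "He missed the bus.", "He arrived after the meeting."]
for sents, label in coherence_pairs(doc):
    print(label, sents)
```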

Automatic Extraction of News Values from Headline Text

Title Automatic Extraction of News Values from Headline Text
Authors Alicja Piotrkowicz, Vania Dimitrova, Katja Markert
Abstract Headlines play a crucial role in attracting audiences' attention to online artefacts (e.g. news articles, videos, blogs). The ability to carry out an automatic, large-scale analysis of headlines is critical to facilitate the selection and prioritisation of a large volume of digital content. In journalism studies, news content has been extensively studied using manually annotated news values - factors used implicitly and explicitly when making decisions on the selection and prioritisation of news items. This paper presents the first attempt at a fully automatic extraction of news values from headline text. The news values extraction methods are applied to a large headline corpus collected from The Guardian and evaluated by comparing them with a manually annotated gold standard. A crowdsourcing survey indicates that news values affect people's decisions to click on a headline, supporting the need for automatic news values detection.
Tasks Keyword Spotting, Recommendation Systems
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-4007/
PDF https://www.aclweb.org/anthology/E17-4007
PWC https://paperswithcode.com/paper/automatic-extraction-of-news-values-from
Repo
Framework

YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction

Title YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction
Authors You Zhang, Hang Yuan, Jin Wang, Xuejie Zhang
Abstract In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task. The CNN-LSTM model has two combined parts: the CNN extracts local n-gram features within tweets, and the LSTM composes these features to capture long-distance dependencies across tweets. Additionally, we used three other models (CNN, LSTM, BiLSTM) as baseline algorithms. The proposed model showed good performance in the experimental results.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5227/
PDF https://www.aclweb.org/anthology/W17-5227
PWC https://paperswithcode.com/paper/ynu-hpcc-at-emoint-2017-using-a-cnn-lstm
Repo
Framework
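
As a concrete illustration of the architecture the abstract above describes, here is a minimal PyTorch sketch of a CNN-LSTM: a 1-D convolution extracts local n-gram features and an LSTM composes them into a single intensity score. Layer sizes, the vocabulary size, and the regression head are assumptions, not the authors' exact configuration.

```python
# Minimal CNN-LSTM sketch for sentiment intensity regression (sizes invented).
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, vocab=5000, emb=100, filters=64, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, filters, kernel_size=3, padding=1)  # n-gram features
        self.lstm = nn.LSTM(filters, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)  # scalar intensity score

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)        # (batch, emb, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq_len, filters)
        _, (h, _) = self.lstm(x)                      # h: (1, batch, hidden)
        return self.out(h.squeeze(0)).squeeze(1)      # (batch,)

model = CNNLSTM()
print(model(torch.randint(0, 5000, (2, 20))).shape)  # torch.Size([2])
```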

Temporal Coherency based Criteria for Predicting Video Frames using Deep Multi-stage Generative Adversarial Networks

Title Temporal Coherency based Criteria for Predicting Video Frames using Deep Multi-stage Generative Adversarial Networks
Authors Prateep Bhattacharjee, Sukhendu Das
Abstract Predicting the future from a sequence of video frames has recently been a sought-after yet challenging task in the fields of computer vision and machine learning. Although there have been efforts at tracking using motion trajectories and flow features, the complex problem of generating unseen frames has not been studied extensively. In this paper, we address this problem using convolutional models within a multi-stage Generative Adversarial Network (GAN) framework. The proposed method uses two stages of GANs to generate a crisp and clear set of future frames. Although GANs have been used in the past for predicting the future, none of those works consider the relation between subsequent frames in the temporal dimension. Our main contribution lies in formulating two objective functions based on Normalized Cross Correlation (NCC) and Pairwise Contrastive Divergence (PCD) for solving this problem. This method, coupled with the traditional L1 loss, has been evaluated on three real-world video datasets, viz. Sports-1M, UCF-101 and KITTI. Performance analysis reveals superior results over recent state-of-the-art methods.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7014-temporal-coherency-based-criteria-for-predicting-video-frames-using-deep-multi-stage-generative-adversarial-networks
PDF http://papers.nips.cc/paper/7014-temporal-coherency-based-criteria-for-predicting-video-frames-using-deep-multi-stage-generative-adversarial-networks.pdf
PWC https://paperswithcode.com/paper/temporal-coherency-based-criteria-for
Repo
Framework
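
To ground the NCC-based objective mentioned above, here is a small NumPy sketch of Normalized Cross Correlation between a predicted and a ground-truth frame; `1 - NCC` then behaves like a loss to minimize. The paper's actual patch-wise formulation is more involved, so treat this as a simplified illustration.

```python
# Normalized Cross Correlation between two frames (simplified, whole-frame).
import numpy as np

def ncc(pred, target, eps=1e-8):
    """NCC in [-1, 1]; higher means the frames are more similar."""
    p = pred - pred.mean()
    t = target - target.mean()
    return float((p * t).sum() / (np.sqrt((p**2).sum() * (t**2).sum()) + eps))

frame_t = np.random.rand(64, 64)
frame_pred = frame_t + 0.05 * np.random.randn(64, 64)  # near-perfect prediction
print(ncc(frame_pred, frame_t))        # close to 1.0
print(1.0 - ncc(frame_pred, frame_t))  # a loss-style score to minimize
```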

Gradient Emotional Analysis

Title Gradient Emotional Analysis
Authors Lilia Simeonova
Abstract Over the past few years a lot of research has been done on sentiment analysis; however, emotional analysis, being so subjective, is not a well-examined discipline. The main focus of this proposal is to categorize a given sentence in two dimensions: sentiment and arousal. For this purpose two techniques will be combined: a machine learning approach and a lexicon-based approach. The first dimension will give the sentiment value, positive versus negative. This will be resolved by using a Naïve Bayes classifier. The second and more interesting dimension will determine the level of arousal. This will be achieved by evaluating a given phrase or sentence based on a lexicon with affective ratings for 14,000 English words.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-2006/
PDF https://doi.org/10.26615/issn.1314-9156.2017_006
PWC https://paperswithcode.com/paper/gradient-emotional-analysis
Repo
Framework
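
The lexicon-based arousal step described above can be sketched very simply: average the affective ratings of the words a sentence shares with the lexicon. The four-entry `arousal` dict below is a made-up stand-in for the 14,000-word rating lexicon the abstract mentions.

```python
# Lexicon-based arousal scoring (toy lexicon standing in for the real one).
arousal = {"calm": 2.0, "happy": 6.0, "furious": 8.3, "thrilled": 8.0}

def arousal_score(sentence):
    """Mean arousal rating of covered words; None if no word is in the lexicon."""
    hits = [arousal[w] for w in sentence.lower().split() if w in arousal]
    return sum(hits) / len(hits) if hits else None

print(arousal_score("I am thrilled and happy"))  # 7.0
```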

Supervised Methods For Ranking Relations In Web Search

Title Supervised Methods For Ranking Relations In Web Search
Authors Sumit Asthana, Asif Ekbal
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-7529/
PDF https://www.aclweb.org/anthology/W17-7529
PWC https://paperswithcode.com/paper/supervised-methods-for-ranking-relations-in
Repo
Framework

CKIP at IJCNLP-2017 Task 2: Neural Valence-Arousal Prediction for Phrases

Title CKIP at IJCNLP-2017 Task 2: Neural Valence-Arousal Prediction for Phrases
Authors Peng-Hsuan Li, Wei-Yun Ma, Hsin-Yang Wang
Abstract CKIP takes part in solving the Dimensional Sentiment Analysis for Chinese Phrases (DSAP) shared task of IJCNLP 2017. This task calls for systems that can predict the valence and the arousal of Chinese phrases, which are real values between 1 and 9. To achieve this, functions mapping Chinese character sequences to real numbers are built by regression techniques. In addition, the CKIP phrase Valence-Arousal (VA) predictor depends on knowledge of modifier words and head words, including the types of known modifier words, the VA of head words, and the distributional semantics of both. The predictor took second place out of 13 teams on phrase VA prediction, with 0.444 MAE and 0.935 PCC on valence, and 0.395 MAE and 0.904 PCC on arousal.
Tasks Sentiment Analysis
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4014/
PDF https://www.aclweb.org/anthology/I17-4014
PWC https://paperswithcode.com/paper/ckip-at-ijcnlp-2017-task-2-neural-valence
Repo
Framework
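
For reference, the two numbers reported above are standard metrics; a minimal NumPy sketch of mean absolute error (MAE, lower is better) and the Pearson correlation coefficient (PCC, higher is better) follows, with made-up predictions and gold values.

```python
# MAE and Pearson correlation between predicted and gold VA values.
import numpy as np

def mae(pred, gold):
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(gold))))

def pcc(pred, gold):
    return float(np.corrcoef(pred, gold)[0, 1])

gold = [3.2, 7.5, 5.0, 8.1]  # invented gold valence ratings
pred = [3.6, 7.1, 5.4, 7.8]  # invented system predictions
print(mae(pred, gold), pcc(pred, gold))
```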

Segmentation-Free Word Embedding for Unsegmented Languages

Title Segmentation-Free Word Embedding for Unsegmented Languages
Authors Takamasa Oshikiri
Abstract In this paper, we propose a new pipeline for word embedding in unsegmented languages, called segmentation-free word embedding, which does not require word segmentation as a preprocessing step. Unlike space-delimited languages, unsegmented languages such as Chinese and Japanese require word segmentation before further processing. However, word segmentation, which often requires manually annotated resources, is difficult and expensive, and unavoidable segmentation errors affect downstream tasks. To avoid these problems when learning word vectors for unsegmented languages, we consider word co-occurrence statistics over all possible segmentation candidates based on frequent character n-grams, instead of over segmented sentences produced by conventional word segmenters. Our experiments on noun category prediction tasks with raw Twitter, Weibo, and Wikipedia corpora show that the proposed method outperforms conventional approaches that require word segmenters.
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1080/
PDF https://www.aclweb.org/anthology/D17-1080
PWC https://paperswithcode.com/paper/segmentation-free-word-embedding-for
Repo
Framework
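
The first step of the pipeline above, collecting frequent character n-grams from raw text as segmentation candidates, can be sketched in a few lines; the n-gram length cap and frequency threshold here are assumptions, not the paper's settings.

```python
# Collect frequent character n-grams from unsegmented text as candidates.
from collections import Counter

def frequent_ngrams(corpus, n_max=3, min_count=2):
    counts = Counter()
    for line in corpus:
        for n in range(1, n_max + 1):
            counts.update(line[i:i + n] for i in range(len(line) - n + 1))
    return {g for g, c in counts.items() if c >= min_count}

corpus = ["彼は東京に住む", "東京は大きい"]  # unsegmented Japanese
print(sorted(frequent_ngrams(corpus)))  # frequent candidates include "東京"
```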