July 26, 2019

2090 words 10 mins read

Paper Group NANR 24

On the annotation of vague expressions: a case study on Romanian historical texts. Summarizing Lengthy Questions. Identifying Speakers and Listeners of Quoted Speech in Literary Works. Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes. Selective Decoding for Cross-lingual Open Information Extraction. Collecting fluency correct …

On the annotation of vague expressions: a case study on Romanian historical texts

Title On the annotation of vague expressions: a case study on Romanian historical texts
Authors Anca Dinu, Walther von Hahn, Cristina Vertan
Abstract Current approaches in Digital Humanities tend to ignore a central aspect of any hermeneutic introspection: the intrinsic vagueness of the analyzed texts. Especially when dealing with historical documents, neglecting vagueness has important implications for the interpretation of the results. In this paper we present the limitations of current annotation approaches and describe a methodology for annotating vagueness in historical Romanian texts.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-8104/
PDF http://doi.org/10.26615/978-954-452-046-5_004
PWC https://paperswithcode.com/paper/on-the-annotation-of-vague-expressions-a-case
Repo
Framework

Summarizing Lengthy Questions

Title Summarizing Lengthy Questions
Authors Tatsuya Ishigaki, Hiroya Takamura, Manabu Okumura
Abstract In this research, we propose the task of question summarization. We first analyzed question-summary pairs extracted from a Community Question Answering (CQA) site, and found that a proportion of questions cannot be summarized by extractive approaches but instead require abstractive approaches. We created a dataset by regarding the question-title pairs posted on the CQA site as question-summary pairs. Using this data, we trained extractive and abstractive summarization models and compared them based on ROUGE scores and manual evaluations. Our experimental results show that an abstractive method using an encoder-decoder model with a copying mechanism achieves better scores both for ROUGE-2 F-measure and in the evaluations by human judges.
Tasks Abstractive Text Summarization, Community Question Answering, Question Answering
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1080/
PDF https://www.aclweb.org/anthology/I17-1080
PWC https://paperswithcode.com/paper/summarizing-lengthy-questions
Repo
Framework
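
To make the reported metric concrete, here is a minimal, illustrative Python sketch of the ROUGE-2 F-measure, the bigram-overlap score the paper above uses to compare extractive and abstractive summaries. It is not the official ROUGE toolkit; tokenization and smoothing are deliberately simplified.

```python
# Minimal ROUGE-2 F-measure sketch (illustrative, not the official toolkit).
from collections import Counter

def bigrams(tokens):
    """Return the multiset of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))

def rouge2_f(candidate, reference):
    """ROUGE-2 F-measure between two whitespace-tokenized strings."""
    cand, ref = bigrams(candidate.split()), bigrams(reference.split())
    overlap = sum((cand & ref).values())  # clipped bigram matches
    if not cand or not ref or overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge2_f("how do i reset my password", "how to reset my password"))  # ~0.444
```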

Identifying Speakers and Listeners of Quoted Speech in Literary Works

Title Identifying Speakers and Listeners of Quoted Speech in Literary Works
Authors Chak Yan Yeung, John Lee
Abstract We present the first study that evaluates both speaker and listener identification for direct speech in literary texts. Our approach consists of two steps: identification of speakers and listeners near the quotes, and dialogue chain segmentation. Evaluation results show that this approach outperforms a rule-based approach that is state-of-the-art on a corpus of literary texts.
Tasks Speaker Identification
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2055/
PDF https://www.aclweb.org/anthology/I17-2055
PWC https://paperswithcode.com/paper/identifying-speakers-and-listeners-of-quoted
Repo
Framework

Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes

Title Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes
Authors Francesco Barbieri, Luis Espinosa-Anke, Miguel Ballesteros, Juan Soler-Company, Horacio Saggion
Abstract Videogame streaming platforms have become a paramount example of noisy user-generated text. These are websites where gaming is broadcast and where viewers can interact via integrated chatrooms. Probably the best-known platform of this kind is Twitch, which has more than 100 million monthly viewers. Despite these numbers, and unlike other platforms featuring short messages (e.g. Twitter), Twitch has not received much attention from the Natural Language Processing community. In this paper we aim at bridging this gap by proposing two important tasks specific to the Twitch platform, namely (1) emote prediction and (2) trolling detection. In our experiments, we evaluate three models: a bag-of-words baseline, a supervised logistic classifier based on word embeddings, and a bidirectional long short-term memory recurrent neural network (LSTM). Our results show that the LSTM model outperforms the other two models, in which explicit features with proven effectiveness for similar tasks were encoded.
Tasks Information Retrieval, Stance Detection, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4402/
PDF https://www.aclweb.org/anthology/W17-4402
PWC https://paperswithcode.com/paper/towards-the-understanding-of-gaming-audiences
Repo
Framework
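
For readers unfamiliar with the second baseline in the paper above, here is a rough sketch of a supervised logistic classifier over averaged word embeddings. The tiny random embedding table, the example messages, and the labels are all hypothetical placeholders, not the authors' data or features.

```python
# Logistic classifier over averaged word embeddings (hypothetical toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression

dim = 50
rng = np.random.default_rng(0)
# Stand-in embedding table; a real system would load pretrained vectors.
emb = {w: rng.normal(size=dim) for w in "kappa gg lol wtf nice troll".split()}

def message_vector(text):
    """Average the embeddings of known tokens; zeros if none are known."""
    vecs = [emb[t] for t in text.lower().split() if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

messages = ["gg nice", "kappa lol", "wtf troll", "nice gg lol"]
labels = [0, 1, 1, 0]  # invented binary labels for illustration

X = np.stack([message_vector(m) for m in messages])
clf = LogisticRegression().fit(X, labels)
print(clf.predict([message_vector("gg kappa")]))
```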

Selective Decoding for Cross-lingual Open Information Extraction

Title Selective Decoding for Cross-lingual Open Information Extraction
Authors Sheng Zhang, Kevin Duh, Benjamin Van Durme
Abstract Cross-lingual open information extraction is the task of distilling facts from the source language into representations in the target language. We propose a novel encoder-decoder model for this problem. It employs a selective decoding mechanism, which explicitly models the sequence labeling process as well as the sequence generation process on the decoder side. Compared to a standard encoder-decoder model, selective decoding significantly increases performance on a Chinese-English cross-lingual open IE dataset by 3.87-4.49 BLEU and 1.91-5.92 F1. We also extend our approach to low-resource scenarios and obtain promising improvements.
Tasks Machine Translation, Open Information Extraction
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1084/
PDF https://www.aclweb.org/anthology/I17-1084
PWC https://paperswithcode.com/paper/selective-decoding-for-cross-lingual-open
Repo
Framework
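
The abstract above only names the mechanism, so the following is a speculative, simplified sketch of the general idea behind selective decoding: at each decoder step a learned gate mixes a distribution over source positions (the sequence-labeling side) with a distribution over the target vocabulary (the generation side). This is a pointer-generator-style illustration, not the authors' architecture; all layer names and sizes are invented.

```python
# Hypothetical gate mixing a copy distribution with a generation distribution.
import torch
import torch.nn as nn

class SelectiveStep(nn.Module):
    def __init__(self, hidden, vocab):
        super().__init__()
        self.gate = nn.Linear(hidden, 1)     # p(copy) vs. p(generate)
        self.gen = nn.Linear(hidden, vocab)  # target-vocabulary logits

    def forward(self, dec_state, enc_states):
        # dec_state: (batch, hidden); enc_states: (batch, src_len, hidden)
        p_copy = torch.sigmoid(self.gate(dec_state))  # (batch, 1)
        # Attention-style scores over source positions act as the copy distribution.
        copy_dist = torch.softmax(
            torch.bmm(enc_states, dec_state.unsqueeze(2)).squeeze(2), dim=1)
        gen_dist = torch.softmax(self.gen(dec_state), dim=1)
        # Concatenate source-position and vocabulary probabilities.
        return torch.cat([p_copy * copy_dist, (1 - p_copy) * gen_dist], dim=1)

step = SelectiveStep(hidden=16, vocab=100)
out = step(torch.randn(2, 16), torch.randn(2, 7, 16))
print(out.shape)  # (2, 107): 7 source positions + 100 vocabulary entries
```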

Collecting fluency corrections for spoken learner English

Title Collecting fluency corrections for spoken learner English
Authors Andrew Caines, Emma Flint, Paula Buttery
Abstract We present a crowdsourced collection of error annotations for transcriptions of spoken learner English. Our emphasis in data collection is on fluency corrections, a more complete form of correction than has traditionally been aimed for in grammatical error correction (GEC) research. Fluency corrections require improvements to the text that take discourse- and utterance-level semantics into account: the result is a more naturalistic, holistic version of the original. We propose that this shifted emphasis be reflected in a new name for the task: 'holistic error correction' (HEC). We analyse crowdworker behaviour in HEC and conclude that the method is useful, with certain amendments, for future work.
Tasks Grammatical Error Correction, Grammatical Error Detection, Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5010/
PDF https://www.aclweb.org/anthology/W17-5010
PWC https://paperswithcode.com/paper/collecting-fluency-corrections-for-spoken
Repo
Framework

Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Title Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Authors
Abstract
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1000/
PDF https://www.aclweb.org/anthology/K17-1000
PWC https://paperswithcode.com/paper/proceedings-of-the-21st-conference-on
Repo
Framework

Neural Net Models of Open-domain Discourse Coherence

Title Neural Net Models of Open-domain Discourse Coherence
Authors Jiwei Li, Dan Jurafsky
Abstract Discourse coherence is strongly associated with text quality, making it important to natural language generation and understanding. Yet existing models of coherence focus on measuring individual aspects of coherence (lexical overlap, rhetorical structure, entity centering) in narrow domains. In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences. We study both discriminative models that learn to distinguish coherent from incoherent discourse, and generative models that produce coherent text, including a novel neural latent-variable Markovian generative model that captures the latent discourse dependencies between sentences in a text. Our work achieves state-of-the-art performance on multiple coherence evaluations, and marks an initial step in generating coherent texts given discourse contexts.
Tasks Abstractive Text Summarization, Question Answering, Sentence Embeddings, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1019/
PDF https://www.aclweb.org/anthology/D17-1019
PWC https://paperswithcode.com/paper/neural-net-models-of-open-domain-discourse
Repo
Framework
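
A common way to train the discriminative models the abstract above mentions is to treat a document's original sentence order as a coherent example and a random permutation of the same sentences as an incoherent one. The sketch below shows only this data-construction step; it is an assumption about the setup, not code from the paper.

```python
# Build (sentence_list, label) training pairs for a coherence discriminator.
import random

def coherence_pairs(document, seed=0):
    """Yield (sentences, label) pairs: 1 = original order, 0 = shuffled."""
    assert len(set(document)) > 1, "need at least two distinct sentences"
    rng = random.Random(seed)
    yield document, 1
    shuffled = document[:]
    while shuffled == document:  # ensure the permutation actually differs
        rng.shuffle(shuffled)
    yield shuffled, 0

doc = ["He woke up late.", "He missed the bus.", "He arrived after the meeting."]
for sents, label in coherence_pairs(doc):
    print(label, sents)
```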

Automatic Extraction of News Values from Headline Text

Title Automatic Extraction of News Values from Headline Text
Authors Alicja Piotrkowicz, Vania Dimitrova, Katja Markert
Abstract Headlines play a crucial role in attracting audiences' attention to online artefacts (e.g. news articles, videos, blogs). The ability to carry out an automatic, large-scale analysis of headlines is critical to facilitate the selection and prioritisation of a large volume of digital content. In journalism studies, news content has been extensively studied using manually annotated news values - factors used implicitly and explicitly when making decisions on the selection and prioritisation of news items. This paper presents the first attempt at a fully automatic extraction of news values from headline text. The news values extraction methods are applied to a large headline corpus collected from The Guardian and evaluated by comparing them with a manually annotated gold standard. A crowdsourcing survey indicates that news values affect people's decisions to click on a headline, supporting the need for automatic news values detection.
Tasks Keyword Spotting, Recommendation Systems
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-4007/
PDF https://www.aclweb.org/anthology/E17-4007
PWC https://paperswithcode.com/paper/automatic-extraction-of-news-values-from
Repo
Framework

YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction

Title YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction
Authors You Zhang, Hang Yuan, Jin Wang, Xuejie Zhang
Abstract In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task. The CNN-LSTM model has two combined parts: the CNN extracts local n-gram features within tweets, and the LSTM composes these features to capture long-distance dependencies across tweets. Additionally, we used three other models (CNN, LSTM, BiLSTM) as baseline algorithms. The proposed model showed good performance in the experimental results.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5227/
PDF https://www.aclweb.org/anthology/W17-5227
PWC https://paperswithcode.com/paper/ynu-hpcc-at-emoint-2017-using-a-cnn-lstm
Repo
Framework
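
As a concrete illustration of the architecture the abstract above describes, here is a minimal PyTorch sketch of a CNN-LSTM: a 1-D convolution extracts local n-gram features and an LSTM composes them into a single intensity score. Layer sizes, the vocabulary size, and the regression head are assumptions, not the authors' exact configuration.

```python
# Minimal CNN-LSTM sketch for sentiment intensity regression (sizes invented).
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, vocab=5000, emb=100, filters=64, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, filters, kernel_size=3, padding=1)  # n-gram features
        self.lstm = nn.LSTM(filters, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)  # scalar intensity score

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)        # (batch, emb, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq_len, filters)
        _, (h, _) = self.lstm(x)                      # h: (1, batch, hidden)
        return self.out(h.squeeze(0)).squeeze(1)      # (batch,)

model = CNNLSTM()
print(model(torch.randint(0, 5000, (2, 20))).shape)  # torch.Size([2])
```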

Temporal Coherency based Criteria for Predicting Video Frames using Deep Multi-stage Generative Adversarial Networks

Title Temporal Coherency based Criteria for Predicting Video Frames using Deep Multi-stage Generative Adversarial Networks
Authors Prateep Bhattacharjee, Sukhendu Das
Abstract Predicting the future from a sequence of video frames has recently been a sought-after yet challenging task in the fields of computer vision and machine learning. Although there have been efforts at tracking using motion trajectories and flow features, the complex problem of generating unseen frames has not been studied extensively. In this paper, we address this problem using convolutional models within a multi-stage Generative Adversarial Network (GAN) framework. The proposed method uses two stages of GANs to generate a crisp and clear set of future frames. Although GANs have been used in the past for predicting the future, none of those works consider the relation between subsequent frames in the temporal dimension. Our main contribution lies in formulating two objective functions based on Normalized Cross Correlation (NCC) and Pairwise Contrastive Divergence (PCD) for solving this problem. This method, coupled with the traditional L1 loss, has been evaluated on three real-world video datasets, viz. Sports-1M, UCF-101 and KITTI. Performance analysis reveals superior results over recent state-of-the-art methods.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7014-temporal-coherency-based-criteria-for-predicting-video-frames-using-deep-multi-stage-generative-adversarial-networks
PDF http://papers.nips.cc/paper/7014-temporal-coherency-based-criteria-for-predicting-video-frames-using-deep-multi-stage-generative-adversarial-networks.pdf
PWC https://paperswithcode.com/paper/temporal-coherency-based-criteria-for
Repo
Framework
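
To ground the NCC-based objective mentioned above, here is a small NumPy sketch of Normalized Cross Correlation between a predicted and a ground-truth frame; `1 - NCC` then behaves like a loss to minimize. The paper's actual patch-wise formulation is more involved, so treat this as a simplified illustration.

```python
# Normalized Cross Correlation between two frames (simplified, whole-frame).
import numpy as np

def ncc(pred, target, eps=1e-8):
    """NCC in [-1, 1]; higher means the frames are more similar."""
    p = pred - pred.mean()
    t = target - target.mean()
    return float((p * t).sum() / (np.sqrt((p**2).sum() * (t**2).sum()) + eps))

frame_t = np.random.rand(64, 64)
frame_pred = frame_t + 0.05 * np.random.randn(64, 64)  # near-perfect prediction
print(ncc(frame_pred, frame_t))        # close to 1.0
print(1.0 - ncc(frame_pred, frame_t))  # a loss-style score to minimize
```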

Gradient Emotional Analysis

Title Gradient Emotional Analysis
Authors Lilia Simeonova
Abstract Over the past few years a lot of research has been done on sentiment analysis; however, emotional analysis, being so subjective, is not a well-examined discipline. The main focus of this proposal is to categorize a given sentence in two dimensions: sentiment and arousal. For this purpose two techniques will be combined: a machine learning approach and a lexicon-based approach. The first dimension will give the sentiment value, positive versus negative. This will be resolved by using a Naïve Bayes classifier. The second and more interesting dimension will determine the level of arousal. This will be achieved by evaluating a given phrase or sentence based on a lexicon with affective ratings for 14,000 English words.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-2006/
PDF https://doi.org/10.26615/issn.1314-9156.2017_006
PWC https://paperswithcode.com/paper/gradient-emotional-analysis
Repo
Framework
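
The lexicon-based arousal step described above can be sketched very simply: average the affective ratings of the words a sentence shares with the lexicon. The four-entry `arousal` dict below is a made-up stand-in for the 14,000-word rating lexicon the abstract mentions.

```python
# Lexicon-based arousal scoring (toy lexicon standing in for the real one).
arousal = {"calm": 2.0, "happy": 6.0, "furious": 8.3, "thrilled": 8.0}

def arousal_score(sentence):
    """Mean arousal rating of covered words; None if no word is in the lexicon."""
    hits = [arousal[w] for w in sentence.lower().split() if w in arousal]
    return sum(hits) / len(hits) if hits else None

print(arousal_score("I am thrilled and happy"))  # 7.0
```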

Supervised Methods For Ranking Relations In Web Search

Title Supervised Methods For Ranking Relations In Web Search
Authors Sumit Asthana, Asif Ekbal
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-7529/
PDF https://www.aclweb.org/anthology/W17-7529
PWC https://paperswithcode.com/paper/supervised-methods-for-ranking-relations-in
Repo
Framework

CKIP at IJCNLP-2017 Task 2: Neural Valence-Arousal Prediction for Phrases

Title CKIP at IJCNLP-2017 Task 2: Neural Valence-Arousal Prediction for Phrases
Authors Peng-Hsuan Li, Wei-Yun Ma, Hsin-Yang Wang
Abstract CKIP takes part in solving the Dimensional Sentiment Analysis for Chinese Phrases (DSAP) shared task of IJCNLP 2017. This task calls for systems that can predict the valence and the arousal of Chinese phrases, which are real values between 1 and 9. To achieve this, functions mapping Chinese character sequences to real numbers are built by regression techniques. In addition, the CKIP phrase Valence-Arousal (VA) predictor depends on knowledge of modifier words and head words, including the types of known modifier words, the VA of head words, and the distributional semantics of both. The predictor took second place out of 13 teams on phrase VA prediction, with 0.444 MAE and 0.935 PCC on valence, and 0.395 MAE and 0.904 PCC on arousal.
Tasks Sentiment Analysis
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4014/
PDF https://www.aclweb.org/anthology/I17-4014
PWC https://paperswithcode.com/paper/ckip-at-ijcnlp-2017-task-2-neural-valence
Repo
Framework
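
For reference, the two numbers reported above are standard metrics; a minimal NumPy sketch of mean absolute error (MAE, lower is better) and the Pearson correlation coefficient (PCC, higher is better) follows, with made-up predictions and gold values.

```python
# MAE and Pearson correlation between predicted and gold VA values.
import numpy as np

def mae(pred, gold):
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(gold))))

def pcc(pred, gold):
    return float(np.corrcoef(pred, gold)[0, 1])

gold = [3.2, 7.5, 5.0, 8.1]  # invented gold valence ratings
pred = [3.6, 7.1, 5.4, 7.8]  # invented system predictions
print(mae(pred, gold), pcc(pred, gold))
```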

Segmentation-Free Word Embedding for Unsegmented Languages

Title Segmentation-Free Word Embedding for Unsegmented Languages
Authors Takamasa Oshikiri
Abstract In this paper, we propose a new pipeline for word embedding in unsegmented languages, called segmentation-free word embedding, which does not require word segmentation as a preprocessing step. Unlike space-delimited languages, unsegmented languages such as Chinese and Japanese require word segmentation before further processing. However, word segmentation, which often requires manually annotated resources, is difficult and expensive, and unavoidable segmentation errors affect downstream tasks. To avoid these problems when learning word vectors for unsegmented languages, we consider word co-occurrence statistics over all possible segmentation candidates based on frequent character n-grams, instead of over segmented sentences produced by conventional word segmenters. Our experiments on noun category prediction tasks with raw Twitter, Weibo, and Wikipedia corpora show that the proposed method outperforms conventional approaches that require word segmenters.
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1080/
PDF https://www.aclweb.org/anthology/D17-1080
PWC https://paperswithcode.com/paper/segmentation-free-word-embedding-for
Repo
Framework
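
The first step of the pipeline above, collecting frequent character n-grams from raw text as segmentation candidates, can be sketched in a few lines; the n-gram length cap and frequency threshold here are assumptions, not the paper's settings.

```python
# Collect frequent character n-grams from unsegmented text as candidates.
from collections import Counter

def frequent_ngrams(corpus, n_max=3, min_count=2):
    counts = Counter()
    for line in corpus:
        for n in range(1, n_max + 1):
            counts.update(line[i:i + n] for i in range(len(line) - n + 1))
    return {g for g, c in counts.items() if c >= min_count}

corpus = ["彼は東京に住む", "東京は大きい"]  # unsegmented Japanese
print(sorted(frequent_ngrams(corpus)))  # frequent candidates include "東京"
```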