January 24, 2020

Paper Group NANR 271

D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension

Title D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension
Authors Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu, Haifeng Wang
Abstract In this paper, we introduce the simple system Baidu submitted to the MRQA (Machine Reading for Question Answering) 2019 Shared Task, which focused on the generalization of machine reading comprehension (MRC) models. Our system is built on a pre-training and fine-tuning framework, namely D-NET. We explore pre-trained language models and multi-task learning to improve the generalization of MRC models, and conduct experiments to examine the effectiveness of these strategies. Our system is ranked first among all participants in terms of averaged F1 score. Our code and models will be released at PaddleNLP.
Tasks Machine Reading Comprehension, Multi-Task Learning, Question Answering, Reading Comprehension
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5828/
PDF https://www.aclweb.org/anthology/D19-5828
PWC https://paperswithcode.com/paper/d-net-a-pre-training-and-fine-tuning
Repo
Framework
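D-NET fine-tunes one shared encoder on several MRC datasets at once via multi-task learning. A common way to schedule such training is to sample, at each step, which task's batch to draw, in proportion to dataset size. Below is a minimal pure-Python sketch of that scheduling idea; the dataset names and sizes are illustrative assumptions, not the shared task's actual splits or the authors' implementation:

```python
import random

def proportional_task_sampler(task_sizes, num_steps, seed=0):
    """Return a per-step task schedule, sampling each task with
    probability proportional to its dataset size."""
    rng = random.Random(seed)
    tasks = list(task_sizes)
    weights = [task_sizes[t] for t in tasks]
    return [rng.choices(tasks, weights=weights)[0] for _ in range(num_steps)]

# Hypothetical dataset sizes, for demonstration only.
schedule = proportional_task_sampler(
    {"squad": 90000, "newsqa": 74000, "triviaqa": 61000}, num_steps=1000
)
```

Each training step would then draw a mini-batch from `schedule[step]`'s dataset, so larger datasets are visited more often without starving the smaller ones.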

Generalizing Question Answering System with Pre-trained Language Model Fine-tuning

Title Generalizing Question Answering System with Pre-trained Language Model Fine-tuning
Authors Dan Su, Yan Xu, Genta Indra Winata, Peng Xu, Hyeondey Kim, Zihan Liu, Pascale Fung
Abstract With a large number of datasets being released and new techniques being proposed, question answering (QA) systems have witnessed great breakthroughs in reading comprehension (RC) tasks. However, most existing methods focus on improving in-domain performance, leaving open the research question of how these models and techniques can generalize to out-of-domain and unseen RC tasks. To enhance the generalization ability, we propose a multi-task learning framework that learns the shared representation across different tasks. Our model is built on top of a large pre-trained language model, such as XLNet, and then fine-tuned on multiple RC datasets. Experimental results show the effectiveness of our methods, with an average Exact Match score of 56.59 and an average F1 score of 68.98, which significantly improves the BERT-Large baseline by 8.39 and 7.22, respectively.
Tasks Language Modelling, Multi-Task Learning, Question Answering, Reading Comprehension
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5827/
PDF https://www.aclweb.org/anthology/D19-5827
PWC https://paperswithcode.com/paper/generalizing-question-answering-system-with
Repo
Framework
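The MRQA systems above are scored by Exact Match and F1 averaged over datasets. As a hedged sketch of how these SQuAD-style answer-string metrics are conventionally computed (lowercase, strip punctuation and articles, then compare token overlap), which may differ in detail from the shared task's official scorer:

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1(prediction, gold):
    """Token-level F1 between normalized prediction and gold answer."""
    pred_toks = normalize(prediction).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower", "eiffel tower.")` is 1.0 after normalization, while a prediction with extra tokens is rewarded partially by F1 rather than scored zero.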

Crowd-sourcing annotation of complex NLU tasks: A case study of argumentative content annotation

Title Crowd-sourcing annotation of complex NLU tasks: A case study of argumentative content annotation
Authors Tamar Lavee, Lili Kotlerman, Matan Orbach, Yonatan Bilu, Michal Jacovi, Ranit Aharonov, Noam Slonim
Abstract Recent advancements in machine reading and listening comprehension involve the annotation of long texts. Such tasks are typically time-consuming, making crowd annotation an attractive solution, yet their complexity often makes such a solution unfeasible. In particular, a major concern is that crowd annotators may be tempted to skim through long texts and answer questions without reading thoroughly. We present a case study of adapting this type of task to the crowd. The task is to identify claims in a debate speech several minutes long. We show that sentence-by-sentence annotation does not scale and that labeling only a subset of sentences is insufficient. Instead, we propose a scheme for effectively performing the full, complex task with crowd annotators, allowing the collection of large-scale annotated datasets. We believe that the encountered challenges and pitfalls, as well as lessons learned, are relevant in general when collecting data for large-scale natural language understanding (NLU) tasks.
Tasks Reading Comprehension
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5905/
PDF https://www.aclweb.org/anthology/D19-5905
PWC https://paperswithcode.com/paper/crowd-sourcing-annotation-of-complex-nlu
Repo
Framework

TakeLab at SemEval-2019 Task 4: Hyperpartisan News Detection

Title TakeLab at SemEval-2019 Task 4: Hyperpartisan News Detection
Authors Niko Palić, Juraj Vladika, Dominik Čubelić, Ivan Lovrenčić, Maja Buljan, Jan Šnajder
Abstract In this paper, we demonstrate the system built to solve SemEval-2019 Task 4: Hyperpartisan News Detection (Kiesel et al., 2019), the task of automatically determining whether an article is heavily biased towards one side of the political spectrum. Our system receives an article in its raw, textual form, analyzes it, and predicts with moderate accuracy whether the article is hyperpartisan. The learning model was primarily trained on a manually prelabeled dataset of news articles. The system relies on an SVM model, available in the Python Scikit-Learn library. We ranked 6th out of 42 teams with an accuracy of 79.1% (the winning team had 82.2%).
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2172/
PDF https://www.aclweb.org/anthology/S19-2172
PWC https://paperswithcode.com/paper/takelab-at-semeval-2019-task-4-hyperpartisan
Repo
Framework