Paper Group NANR 134
Team Peter-Parker at SemEval-2019 Task 4: BERT-Based Method in Hyperpartisan News Detection. The University of Maryland’s Kazakh-English Neural Machine Translation System at WMT19. Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task. Times Are Changing: Investigating the Pace of Language Change in Diachronic Word Em …
Team Peter-Parker at SemEval-2019 Task 4: BERT-Based Method in Hyperpartisan News Detection
Title | Team Peter-Parker at SemEval-2019 Task 4: BERT-Based Method in Hyperpartisan News Detection |
Authors | Zhiyuan Ning, Yuanzhen Lin, Ruichao Zhong |
Abstract | This paper describes team Peter-Parker's participation in the Hyperpartisan News Detection task (SemEval-2019 Task 4), which requires classifying whether a given news article is biased or not. We used Java to build the article parsing tool and a BERT-based model for bias prediction. We also present experimental results with analysis. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2181/ |
PWC | https://paperswithcode.com/paper/team-peter-parker-at-semeval-2019-task-4-bert |
Repo | |
Framework | |
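The entry above describes a straightforward BERT fine-tuning setup for binary article classification. A minimal sketch of that kind of pipeline using the Hugging Face `transformers` library follows; the checkpoint name and label mapping are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: fine-tuned BERT for binary hyperpartisan classification.
# Checkpoint and label mapping are illustrative, not the authors' exact setup.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def predict_bias(article_text: str) -> str:
    # BERT accepts at most 512 subword tokens, so long articles are truncated.
    inputs = tokenizer(article_text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return "hyperpartisan" if logits.argmax(dim=-1).item() == 1 else "mainstream"
```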
The University of Maryland’s Kazakh-English Neural Machine Translation System at WMT19
Title | The University of Maryland’s Kazakh-English Neural Machine Translation System at WMT19 |
Authors | Eleftheria Briakou, Marine Carpuat |
Abstract | This paper describes the University of Maryland's submission to the WMT 2019 Kazakh-English news translation task. We study the impact of transfer learning from another low-resource but related language. We experiment with different ways of encoding lexical units to maximize lexical overlap between the two language pairs, as well as back-translation and ensembling. The submitted system improves over a Kazakh-only baseline by +5.45 BLEU on newstest2019. |
Tasks | Machine Translation, Transfer Learning |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5308/ |
PWC | https://paperswithcode.com/paper/the-university-of-marylands-kazakh-english |
Repo | |
Framework | |
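One common way to realize the "maximize lexical overlap" idea from the abstract above is a joint subword vocabulary over the related parent language and Kazakh, so shared subwords map to shared embeddings. A hedged sketch with `sentencepiece`; the file names and vocabulary size are assumptions, and the paper's exact encoding choices may differ.

```python
# Train one joint BPE model over both languages' corpora so that transfer
# learning can reuse subword embeddings. File names are hypothetical.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="parent_lang.txt,kazakh.txt",  # concatenated monolingual corpora
    model_prefix="joint_bpe",
    vocab_size=32000,
    model_type="bpe",
)

sp = spm.SentencePieceProcessor(model_file="joint_bpe.model")
print(sp.encode("Мысал сөйлем", out_type=str))  # shared subword segmentation
```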
Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task
Title | Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task |
Authors | Frances Yung, Vera Demberg, Merel Scholman |
Abstract | The perspective of being able to crowd-source coherence relations bears the promise of acquiring annotations for new texts quickly, which could then increase the size and variety of discourse-annotated corpora. It would also open the avenue to answering new research questions: collecting annotations from a larger number of individuals per instance would allow us to investigate the distribution of inferred relations, and to study individual differences in coherence relation interpretation. However, annotating coherence relations with untrained workers is not trivial. We here propose a novel two-step annotation procedure, which extends an earlier method by Scholman and Demberg (2017a). In our approach, coherence relation labels are inferred from connectives that workers insert into the text. We show that the proposed method leads to replicable coherence annotations, and analyse the agreement between the obtained relation labels and annotations from PDTB and RST-DT on the same texts. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4003/ |
PWC | https://paperswithcode.com/paper/crowdsourcing-discourse-relation-annotations |
Repo | |
Framework | |
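The core mechanism in the abstract above, inferring a relation label from an inserted connective, reduces to a lookup once the connective inventory is fixed. A toy sketch; the mapping below is a hypothetical subset, not the authors' inventory.

```python
# Illustrative sketch: map a crowd worker's inserted connective to a coherence
# relation label. The toy mapping is hypothetical; the real inventory and the
# two-step disambiguation procedure are defined by the authors.
CONNECTIVE_TO_RELATION = {
    "because": "Cause",
    "as a result": "Result",
    "however": "Contrast",
    "for example": "Instantiation",
    "afterwards": "Temporal",
}

def infer_relation(inserted_connective: str) -> str:
    return CONNECTIVE_TO_RELATION.get(inserted_connective.lower(), "Unknown")

print(infer_relation("However"))  # -> "Contrast"
```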
Times Are Changing: Investigating the Pace of Language Change in Diachronic Word Embeddings
Title | Times Are Changing: Investigating the Pace of Language Change in Diachronic Word Embeddings |
Authors | Stephanie Brandl, David Lassner |
Abstract | We propose Word Embedding Networks, a novel method that is able to learn word embeddings of individual data slices while simultaneously aligning and ordering them without feeding temporal information a priori to the model. This gives us the opportunity to analyse the dynamics in word embeddings on a large scale in a purely data-driven manner. In experiments on two different newspaper corpora, the New York Times (English) and die Zeit (German), we were able to show that time actually determines the dynamics of semantic change. However, there is by no means a uniform evolution, but instead times of faster and times of slower change. |
Tasks | Word Embeddings |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4718/ |
PWC | https://paperswithcode.com/paper/times-are-changing-investigating-the-pace-of |
Repo | |
Framework | |
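Given embeddings of time slices that already live in a shared space (the paper above learns the alignment and ordering jointly; here it is assumed as input), the pace of change can be estimated as the mean cosine distance between a word's vectors in neighbouring slices. A minimal sketch with illustrative variable names:

```python
# Hedged sketch: estimate the pace of semantic change per time step as the
# average cosine distance between consecutive, already-aligned embedding
# matrices. This is a simplification of the paper's joint model.
import numpy as np

def pace_of_change(slices: list[np.ndarray]) -> list[float]:
    """slices: aligned (vocab_size, dim) matrices, one per time slice."""
    rates = []
    for prev, curr in zip(slices, slices[1:]):
        prev_n = prev / np.linalg.norm(prev, axis=1, keepdims=True)
        curr_n = curr / np.linalg.norm(curr, axis=1, keepdims=True)
        cosine = (prev_n * curr_n).sum(axis=1)   # rowwise cosine similarity
        rates.append(float(np.mean(1.0 - cosine)))  # mean distance = pace
    return rates
```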
The making of the Litkey Corpus, a richly annotated longitudinal corpus of German texts written by primary school children
Title | The making of the Litkey Corpus, a richly annotated longitudinal corpus of German texts written by primary school children |
Authors | Ronja Laarmann-Quante, Stefanie Dipper, Eva Belke |
Abstract | To date, corpus and computational linguistic work on written language acquisition has mostly dealt with second language learners who have usually already mastered orthography acquisition in their first language. In this paper, we present the Litkey Corpus, a richly-annotated longitudinal corpus of written texts produced by primary school children in Germany from grades 2 to 4. The paper focuses on the (semi-)automatic annotation procedure at various linguistic levels, which include POS tags, features of the word-internal structure (phonemes, syllables, morphemes) and key orthographic features of the target words as well as a categorization of spelling errors. Comprehensive evaluations show that high accuracy was achieved on all levels, making the Litkey Corpus a useful resource for corpus-based research on literacy acquisition of German primary school children and for developing NLP tools for educational purposes. The corpus is freely available under https://www.linguistics.rub.de/litkeycorpus/. |
Tasks | Language Acquisition |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4006/ |
PWC | https://paperswithcode.com/paper/the-making-of-the-litkey-corpus-a-richly |
Repo | |
Framework | |
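One small ingredient of such an annotation pipeline, locating where a produced spelling diverges from the target word, can be sketched with character-level alignment. This toy example is illustrative only; the Litkey scheme annotates much richer phoneme-, syllable-, and morpheme-level structure.

```python
# Toy sketch: find where a child's spelling diverges from the target word,
# a precursor to categorizing the spelling error. Illustrative only.
from difflib import SequenceMatcher

def spelling_diffs(produced: str, target: str) -> list[tuple[str, str, str]]:
    ops = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, produced, target).get_opcodes():
        if tag != "equal":
            ops.append((tag, produced[i1:i2], target[j1:j2]))
    return ops

print(spelling_diffs("Farat", "Fahrrad"))
# -> [('insert', '', 'hr'), ('replace', 't', 'd')]
```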
Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019)
Title | Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019) |
Authors | |
Abstract | |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6500/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-fourth-workshop-on-5 |
Repo | |
Framework | |
Zero-Resource Neural Machine Translation with Monolingual Pivot Data
Title | Zero-Resource Neural Machine Translation with Monolingual Pivot Data |
Authors | Anna Currey, Kenneth Heafield |
Abstract | Zero-shot neural machine translation (NMT) is a framework that uses source-pivot and target-pivot parallel data to train a source-target NMT system. An extension to zero-shot NMT is zero-resource NMT, which generates pseudo-parallel corpora using a zero-shot system and further trains the zero-shot system on that data. In this paper, we expand on zero-resource NMT by incorporating monolingual data in the pivot language into training; since the pivot language is usually the highest-resource language of the three, we expect monolingual pivot-language data to be most abundant. We propose methods for generating pseudo-parallel corpora using pivot-language monolingual data and for leveraging the pseudo-parallel corpora to improve the zero-shot NMT system. We evaluate these methods for a high-resource language pair (German-Russian) using English as the pivot. We show that our proposed methods yield consistent improvements over strong zero-shot and zero-resource baselines and even catch up to pivot-based models in BLEU (while not requiring the two-pass inference that pivot models require). |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5610/ |
PWC | https://paperswithcode.com/paper/zero-resource-neural-machine-translation-with-1 |
Repo | |
Framework | |
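The pseudo-parallel generation step from the abstract above can be sketched as translating each monolingual pivot sentence into both source and target with existing pivot-to-X systems. `translate` below is a hypothetical stand-in for an NMT decode call, not an API from the paper.

```python
# Hedged sketch: build a pseudo-parallel de-ru corpus from monolingual English
# (pivot) data using en->de and en->ru systems, matching the paper's
# German-Russian-with-English-pivot setting.
def translate(sentence: str, direction: str) -> str:
    raise NotImplementedError("decode with a trained NMT model, e.g. en->de")

def make_pseudo_parallel(pivot_mono: list[str]) -> list[tuple[str, str]]:
    corpus = []
    for en_sent in pivot_mono:
        de_sent = translate(en_sent, "en->de")  # pivot -> source
        ru_sent = translate(en_sent, "en->ru")  # pivot -> target
        corpus.append((de_sent, ru_sent))       # train de->ru on this pair
    return corpus
```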
Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings
Title | Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings |
Authors | Philippa Shoemark, Farhana Ferdousi Liza, Dong Nguyen, Scott Hale, Barbara McGillivray |
Abstract | Word embeddings are increasingly used for the automatic detection of semantic change; yet, a robust evaluation and systematic comparison of the choices involved has been lacking. We propose a new evaluation framework for semantic change detection and find that (i) using the whole time series is preferable over only comparing between the first and last time points; (ii) independently trained and aligned embeddings perform better than continuously trained embeddings for long time periods; and (iii) the reference point for comparison matters. We also present an analysis of the changes detected on a large Twitter dataset spanning 5.5 years. |
Tasks | Time Series, Word Embeddings |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1007/ |
PWC | https://paperswithcode.com/paper/room-to-glo-a-systematic-comparison-of |
Repo | |
Framework | |
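Finding (ii) above relies on aligning independently trained embedding spaces before comparing them. A standard way to do this is orthogonal Procrustes; a minimal sketch follows (whether the paper scores change exactly this way is an assumption).

```python
# Sketch: align two independently trained embedding matrices with orthogonal
# Procrustes, then score a word's change as cosine distance between its
# aligned vectors. Variable names are illustrative.
import numpy as np
from scipy.linalg import orthogonal_procrustes

def change_score(emb_t0: np.ndarray, emb_t1: np.ndarray, idx: int) -> float:
    """emb_t0, emb_t1: (vocab, dim) matrices over a shared vocabulary."""
    R, _ = orthogonal_procrustes(emb_t0, emb_t1)  # rotation mapping t0 -> t1
    v0, v1 = emb_t0[idx] @ R, emb_t1[idx]
    return 1.0 - float(v0 @ v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
```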
Attention and Lexicon Regularized LSTM for Aspect-based Sentiment Analysis
Title | Attention and Lexicon Regularized LSTM for Aspect-based Sentiment Analysis |
Authors | Lingxian Bao, Patrik Lambert, Toni Badia |
Abstract | Attention-based deep learning systems have been demonstrated to be the state-of-the-art approach for aspect-level sentiment analysis; however, end-to-end deep neural networks lack flexibility, as one cannot easily adjust the network to fix an obvious problem, especially when more training data is not available: e.g. when it always predicts "positive" when seeing the word "disappointed". Meanwhile, it is less often noted that the attention mechanism is likely to "over-focus" on particular parts of a sentence, while ignoring positions which provide key information for judging the polarity. In this paper, we describe a simple yet effective approach to leverage lexicon information so that the model becomes more flexible and robust. We also explore the effect of regularizing attention vectors to allow the network to have a broader "focus" on different parts of the sentence. The experimental results demonstrate the effectiveness of our approach. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-2035/ |
PWC | https://paperswithcode.com/paper/attention-and-lexicon-regularized-lstm-for |
Repo | |
Framework | |
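A hedged sketch of one way to "regularize attention vectors" as described above: penalize over-peaked (low-entropy) attention so the model keeps a broader focus over the sentence. The paper's exact regularizer and its lexicon term may differ.

```python
# Illustrative attention regularizer: an entropy penalty that discourages
# attention from collapsing onto a single position.
import torch

def attention_entropy_penalty(attn: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """attn: (batch, seq_len) attention weights summing to 1 per row."""
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # high = broad focus
    return -entropy.mean()  # minimizing this maximizes attention entropy

# Hypothetical training objective combining the task loss with the penalty:
# total_loss = cross_entropy + lambda_attn * attention_entropy_penalty(attn)
```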
Entity-level Classification of Adverse Drug Reactions: a Comparison of Neural Network Models
Title | Entity-level Classification of Adverse Drug Reactions: a Comparison of Neural Network Models |
Authors | Ilseyar Alimova, Elena Tutubalina |
Abstract | This paper presents our experimental work on exploring the potential of neural network models developed for aspect-based sentiment analysis for entity-level adverse drug reaction (ADR) classification. Our goal is to explore how to represent local context around ADR mentions and learn an entity representation, interacting with its context. We conducted extensive experiments on various sources of text-based information, including social media, electronic health records, and abstracts of scientific articles from PubMed. The results show that Interactive Attention Neural Network (IAN) outperformed other models on four corpora in terms of macro F-measure. This work is an abridged version of our recent paper accepted to Programming and Computer Software journal in 2019. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3641/
PWC | https://paperswithcode.com/paper/entity-level-classification-of-adverse-drug |
Repo | |
Framework | |
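The Interactive Attention Network highlighted above lets the entity mention and its context attend to each other's averaged representation. A compact PyTorch-style sketch of that interaction; dimensions and details are illustrative, not IAN's full architecture.

```python
# Sketch of interactive attention: the ADR mention and its context each
# attend to the other's averaged representation; the attended vectors are
# concatenated for classification.
import torch
import torch.nn.functional as F

def interactive_attention(ctx: torch.Tensor, ent: torch.Tensor) -> torch.Tensor:
    """ctx: (batch, n_ctx, d) context states; ent: (batch, n_ent, d) entity states."""
    ctx_avg, ent_avg = ctx.mean(dim=1), ent.mean(dim=1)            # (batch, d)
    a_ctx = F.softmax(torch.bmm(ctx, ent_avg.unsqueeze(2)), dim=1)  # ctx attends to entity
    a_ent = F.softmax(torch.bmm(ent, ctx_avg.unsqueeze(2)), dim=1)  # entity attends to ctx
    ctx_vec = (a_ctx * ctx).sum(dim=1)
    ent_vec = (a_ent * ent).sum(dim=1)
    return torch.cat([ctx_vec, ent_vec], dim=-1)                    # feed to a classifier
```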
A Working Memory Model for Task-oriented Dialog Response Generation
Title | A Working Memory Model for Task-oriented Dialog Response Generation |
Authors | Xiuyi Chen, Jiaming Xu, Bo Xu |
Abstract | Recently, to incorporate external Knowledge Base (KB) information, one form of world knowledge, several end-to-end task-oriented dialog systems have been proposed. These models, however, tend to confound the dialog history with KB tuples and simply store them into one memory. Inspired by the psychological studies on working memory, we propose a working memory model (WMM2Seq) for dialog response generation. Our WMM2Seq adopts a working memory to interact with two separated long-term memories, which are the episodic memory for memorizing dialog history and the semantic memory for storing KB tuples. The working memory consists of a central executive to attend to the aforementioned memories, and a short-term storage system to store the "activated" contents from the long-term memories. Furthermore, we introduce a context-sensitive perceptual process for the token representations of dialog history, and then feed them into the episodic memory. Extensive experiments on two task-oriented dialog datasets demonstrate that our WMM2Seq significantly outperforms the state-of-the-art results in several evaluation metrics. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1258/ |
PWC | https://paperswithcode.com/paper/a-working-memory-model-for-task-oriented |
Repo | |
Framework | |
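A loose sketch of the working-memory idea above: a central-executive query attends separately over episodic memory (dialog history states) and semantic memory (KB tuple embeddings), and the retrieved contents form the short-term "activated" state. This simplifies the WMM2Seq architecture considerably.

```python
# Simplified central executive: one attention pass over each long-term
# memory, with the retrieved contents summed into a short-term state.
import torch
import torch.nn.functional as F

def central_executive(query, episodic, semantic):
    """query: (b, d); episodic: (b, n_hist, d); semantic: (b, n_kb, d)."""
    def attend(memory):
        scores = torch.bmm(memory, query.unsqueeze(2)).squeeze(2)   # (b, n)
        weights = F.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)   # (b, d)
    return attend(episodic) + attend(semantic)  # "activated" contents
```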
Building a Comprehensive Romanian Knowledge Base for Drug Administration
Title | Building a Comprehensive Romanian Knowledge Base for Drug Administration |
Authors | Bogdan Nicula, Mihai Dascalu, Maria-Dorinela Sîrbu, Ștefan Trăușan-Matu, Alexandru Nuță |
Abstract | Information on drug administration is obtained traditionally from doctors and pharmacists, as well as leaflets which provide in most cases cumbersome and hard-to-follow details. Thus, the need for medical knowledge bases emerges to provide access to concrete and well-structured information which can play an important role in informing patients. This paper introduces a Romanian medical knowledge base focused on drug-drug interactions, on representing relevant drug information, and on symptom-disease relations. The knowledge base was created by extracting and transforming information using Natural Language Processing techniques from both structured and unstructured sources, together with manual annotations. The resulting Romanian ontologies are aligned with larger English medical ontologies. Our knowledge base supports queries regarding drugs (e.g., active ingredients, concentration, expiration date), drug-drug interaction, symptom-disease relations, as well as drug-symptom relations. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1096/ |
PWC | https://paperswithcode.com/paper/building-a-comprehensive-romanian-knowledge |
Repo | |
Framework | |
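A toy illustration of the kind of drug-drug interaction query the abstract above mentions; the real knowledge base is backed by aligned Romanian and English medical ontologies, not a Python dict, and the example interaction is illustrative.

```python
# Toy drug-drug interaction lookup, standing in for ontology-backed queries.
DRUG_INTERACTIONS = {
    ("aspirin", "ibuprofen"): "increased risk of gastrointestinal bleeding",
}

def check_interaction(drug_a: str, drug_b: str) -> str:
    # Normalize order so (a, b) and (b, a) hit the same key.
    key = tuple(sorted((drug_a.lower(), drug_b.lower())))
    return DRUG_INTERACTIONS.get(key, "no known interaction recorded")

print(check_interaction("Ibuprofen", "Aspirin"))
```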
Presupposition Projection and Repair Strategies in Trivalent Semantics
Title | Presupposition Projection and Repair Strategies in Trivalent Semantics |
Authors | Yoad Winter |
Abstract | |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/W19-5703/ |
PWC | https://paperswithcode.com/paper/presupposition-projection-and-repair |
Repo | |
Framework | |
Nonsense!: Quality Control via Two-Step Reason Selection for Annotating Local Acceptability and Related Attributes in News Editorials
Title | Nonsense!: Quality Control via Two-Step Reason Selection for Annotating Local Acceptability and Related Attributes in News Editorials |
Authors | Wonsuk Yang, Seungwon Yoon, Ada Carpenter, Jong Park |
Abstract | Annotation quality control is a critical aspect of building reliable corpora through linguistic annotation. In this study, we present a simple but powerful quality control method using two-step reason selection. We gathered sentential annotations of local acceptability and three related attributes through a crowdsourcing platform. For each attribute, the reason for the choice of the attribute value is selected in a two-step manner. The options given for reason selection were designed to facilitate the detection of a nonsensical reason selection. We assume that a sentential annotation that contains a nonsensical reason is less reliable than one without such a reason. Our method, based solely on this assumption, is found to retain annotations of satisfactory quality from the full set of annotations, which includes those of low quality. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1293/ |
PWC | https://paperswithcode.com/paper/nonsense-quality-control-via-two-step-reason |
Repo | |
Framework | |
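The quality-control rule above boils down to a filter: discard any annotation whose two-step reason selection includes a nonsensical option. A minimal sketch with hypothetical field names and options.

```python
# Minimal sketch of the filtering rule: keep only annotations whose two
# reason-selection steps avoid the designed-in nonsensical options.
NONSENSICAL = {"none of the above makes sense", "random choice"}  # hypothetical

def filter_reliable(annotations: list[dict]) -> list[dict]:
    return [
        ann for ann in annotations
        if ann["reason_step1"] not in NONSENSICAL
        and ann["reason_step2"] not in NONSENSICAL
    ]
```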
The IIIT-H Gujarati-English Machine Translation System for WMT19
Title | The IIIT-H Gujarati-English Machine Translation System for WMT19 |
Authors | Vikrant Goyal, Dipti Misra Sharma |
Abstract | This paper describes the Neural Machine Translation system of IIIT-Hyderabad for the Gujarati→English news translation shared task of WMT19. Our system is based on an encoder-decoder framework with an attention mechanism. We experimented with multilingual neural MT models. Our experiments show that multilingual Neural Machine Translation leveraging parallel data from related language pairs yields significant BLEU improvements of up to 11.5 for low-resource language pairs like Gujarati-English. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5316/ |
PWC | https://paperswithcode.com/paper/the-iiit-h-gujarati-english-machine |
Repo | |
Framework | |
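Multilingual NMT systems of the kind described above commonly share one model across related language pairs by prepending a target-language token to each source sentence (Johnson et al., 2017). Whether the IIIT-H system uses exactly this tagging scheme is an assumption; a minimal sketch:

```python
# Sketch of the target-language-token trick that lets one model serve several
# related low-resource pairs (e.g. Gujarati->English plus Hindi->English).
def tag_for_multilingual(src_sentence: str, target_lang: str) -> str:
    return f"<2{target_lang}> {src_sentence}"

train_pairs = [
    (tag_for_multilingual("હું ઘરે જાઉં છું", "en"), "I am going home"),   # Gujarati->English
    (tag_for_multilingual("मैं घर जा रहा हूँ", "en"), "I am going home"),  # Hindi->English (related pair)
]
```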