January 24, 2020

2483 words 12 mins read

Paper Group NANR 225

“A Buster Keaton of Linguistics”: First Automated Approaches for the Extraction of Vossian Antonomasia

Title “A Buster Keaton of Linguistics”: First Automated Approaches for the Extraction of Vossian Antonomasia
Authors Michel Schwab, Robert Jäschke, Frank Fischer, Jannik Strötgen
Abstract Attributing a particular property to a person by naming another person, who is typically well-known for the respective property, is called a Vossian Antonomasia (VA). This subtype of metonymy, which overlaps with metaphor, has a specific syntax and is especially frequent in journalistic texts. While identifying Vossian Antonomasia is of particular interest in the study of stylistics, it is also a source of errors in relation and fact extraction, as an explicitly mentioned entity occurs only metaphorically and should not be associated with the respective contexts. Despite rather simple syntactic variations, the automatic extraction of VA has not been addressed yet, since it requires a deeper semantic understanding of mentioned entities and underlying relations. In this paper, we propose a first method for the extraction of VAs that works completely automatically. Our approaches use named entity recognition, distant supervision based on Wikidata, and a bi-directional LSTM for postprocessing. The evaluation on 1.8 million articles of the New York Times corpus shows that our approach significantly outperforms the only existing semi-automatic approach for VA identification by more than 30 percentage points in precision.
Tasks Named Entity Recognition
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1647/
PDF https://www.aclweb.org/anthology/D19-1647
PWC https://paperswithcode.com/paper/a-buster-keaton-of-linguistics-first
Repo
Framework
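
As a rough illustration of the candidate-detection stage described in the abstract above, the sketch below flags phrases that follow the typical “a/the PERSON of X” surface pattern of Vossian Antonomasia using off-the-shelf NER (spaCy is assumed here). The Wikidata-based distant supervision and the BiLSTM postprocessing are not reproduced, and the regular expression is an illustrative simplification, not the authors' actual extraction rules.

```python
# Hypothetical sketch of VA candidate extraction: find "a/the <PERSON> of <X>" phrases.
# The full pipeline would further filter candidates with Wikidata-based distant
# supervision and a BiLSTM classifier (not shown here).
import re
import spacy  # assumes the en_core_web_sm model is installed

nlp = spacy.load("en_core_web_sm")

def va_candidates(text):
    """Return (person, modifier) pairs matching the rough 'a X of Y' VA syntax."""
    doc = nlp(text)
    candidates = []
    for ent in doc.ents:
        if ent.label_ != "PERSON":
            continue
        # Inspect a small window: an article before the entity, an "of"-phrase after it.
        window = doc[max(ent.start - 1, 0):min(ent.end + 4, len(doc))]
        m = re.match(r"(?:a|an|the)\s+" + re.escape(ent.text) + r"\s+of\s+(\w+(?:\s\w+)?)",
                     window.text, flags=re.IGNORECASE)
        if m:
            candidates.append((ent.text, m.group(1)))
    return candidates

print(va_candidates("He was a Buster Keaton of linguistics, deadpan and precise."))
```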

Proceedings of TyP-NLP: The First Workshop on Typology for Polyglot NLP

Title Proceedings of TyP-NLP: The First Workshop on Typology for Polyglot NLP
Authors
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4900/
PDF https://www.aclweb.org/anthology/W19-4900
PWC https://paperswithcode.com/paper/proceedings-of-typ-nlp-the-first-workshop-on
Repo
Framework

sthruggle at SemEval-2019 Task 5: An Ensemble Approach to Hate Speech Detection

Title sthruggle at SemEval-2019 Task 5: An Ensemble Approach to Hate Speech Detection
Authors Aria Nourbakhsh, Frida Vermeer, Gijs Wiltvank, Rob van der Goot
Abstract In this paper, we present our approach to detection of hate speech against women and immigrants in tweets for our participation in the SemEval-2019 Task 5. We trained an SVM and an RF classifier using character bi- and trigram features and a BiLSTM pre-initialized with external word embeddings. We combined the predictions of the SVM, RF and BiLSTM in two different ensemble models. The first was a majority vote of the binary values, and the second used the average of the confidence scores. For development, we got the highest accuracy (75%) by the final ensemble model with majority voting. For testing, all models scored substantially lower and the scores between the classifiers varied more. We believe that these large differences between the higher accuracies in the development phase and the lower accuracies we obtained in the testing phase have partly to do with differences between the training, development and testing data.
Tasks Hate Speech Detection, Word Embeddings
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2086/
PDF https://www.aclweb.org/anthology/S19-2086
PWC https://paperswithcode.com/paper/sthruggle-at-semeval-2019-task-5-an-ensemble
Repo
Framework
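
A minimal sketch of the two ensembling schemes mentioned in the abstract above: majority voting over binary predictions and averaging of confidence scores. The toy scores, classifier order, and the 0.5 threshold are assumptions for illustration, not the authors' configuration.

```python
# Two simple ensembling rules over per-classifier outputs.
import numpy as np

def majority_vote(binary_preds):
    """binary_preds: array of shape (n_classifiers, n_examples) with 0/1 labels."""
    votes = np.asarray(binary_preds).sum(axis=0)
    return (votes > len(binary_preds) / 2).astype(int)

def average_confidence(scores, threshold=0.5):
    """scores: array of shape (n_classifiers, n_examples) with P(hate speech)."""
    return (np.asarray(scores).mean(axis=0) >= threshold).astype(int)

# Toy example: SVM, RF and BiLSTM confidences for three tweets.
svm, rf, bilstm = [0.9, 0.2, 0.6], [0.7, 0.4, 0.4], [0.3, 0.1, 0.8]
scores = np.array([svm, rf, bilstm])
print(majority_vote(scores >= 0.5))   # [1 0 1]
print(average_confidence(scores))     # [1 0 1]
```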

On the Relation between Position Information and Sentence Length in Neural Machine Translation

Title On the Relation between Position Information and Sentence Length in Neural Machine Translation
Authors Masato Neishi, Naoki Yoshinaga
Abstract Long sentences have been one of the major challenges in neural machine translation (NMT). Although some approaches such as the attention mechanism have partially remedied the problem, we found that the current standard NMT model, Transformer, has difficulty in translating long sentences compared to the former standard, the Recurrent Neural Network (RNN)-based model. One of the key differences between these NMT models is how the model handles position information, which is essential to process sequential data. In this study, we focus on the position information type of NMT models, and hypothesize that relative position is better than absolute position. To examine the hypothesis, we propose RNN-Transformer, which replaces the positional encoding layer of the Transformer with an RNN, and then compare the RNN-based model and four variants of the Transformer. Experiments on ASPEC English-to-Japanese and WMT2014 English-to-German translation tasks demonstrate that relative position helps translate sentences longer than those in the training data. Further experiments on length-controlled training data reveal that absolute position actually causes overfitting to the sentence length.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/K19-1031/
PDF https://www.aclweb.org/anthology/K19-1031
PWC https://paperswithcode.com/paper/on-the-relation-between-position-information
Repo
Framework
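
The sketch below illustrates the RNN-Transformer idea from the abstract above in PyTorch: the absolute positional-encoding layer is replaced by a unidirectional GRU over the token embeddings before the Transformer encoder layers. Dimensions, layer counts, and the encoder-only setup are illustrative assumptions; the paper works with a full NMT model.

```python
# Order information is injected by a GRU instead of an absolute positional encoding.
import torch
import torch.nn as nn

class RNNTransformerEncoder(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # The GRU plays the role of the positional encoding layer.
        self.pos_rnn = nn.GRU(d_model, d_model, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, token_ids):                 # (batch, seq_len)
        x = self.embed(token_ids)                 # (batch, seq_len, d_model)
        x, _ = self.pos_rnn(x)                    # order information added here
        return self.encoder(x)                    # (batch, seq_len, d_model)

model = RNNTransformerEncoder(vocab_size=1000)
out = model(torch.randint(0, 1000, (2, 17)))
print(out.shape)  # torch.Size([2, 17, 256])
```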

Biomedical Named Entity Recognition with Multilingual BERT

Title Biomedical Named Entity Recognition with Multilingual BERT
Authors Kai Hakala, Sampo Pyysalo
Abstract We present the approach of the Turku NLP group to the PharmaCoNER task on Spanish biomedical named entity recognition. We apply a CRF-based baseline approach and multilingual BERT to the task, achieving an F-score of 88% on the development data and 87% on the test set with BERT. Our approach reflects a straightforward application of a state-of-the-art multilingual model that is not specifically tailored to either the language or the application domain. The source code is available at: https://github.com/chaanim/pharmaconer
Tasks Named Entity Recognition
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5709/
PDF https://www.aclweb.org/anthology/D19-5709
PWC https://paperswithcode.com/paper/biomedical-named-entity-recognition-with
Repo
Framework
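
As a hedged sketch of the mBERT-based tagger described above, the snippet below loads the public bert-base-multilingual-cased checkpoint for token classification with the Hugging Face transformers library. The BIO label set, the example sentence, and the absence of a CRF layer and training loop are simplifications, not the authors' setup.

```python
# Multilingual BERT as a token-classification (NER) model; untrained head shown only
# to illustrate the wiring, so its predictions are arbitrary.
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

labels = ["O", "B-ENT", "I-ENT"]  # illustrative BIO tag set, not the PharmaCoNER one
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels))

enc = tokenizer("El paciente recibió ibuprofeno.", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits                  # (1, seq_len, num_labels)
pred = logits.argmax(-1)[0]
print([labels[int(i)] for i in pred])             # one tag per subword token
```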

Rubric Reliability and Annotation of Content and Argument in Source-Based Argument Essays

Title Rubric Reliability and Annotation of Content and Argument in Source-Based Argument Essays
Authors Yanjun Gao, Alex Driban, Brennan Xavier McManus, Elena Musi, Patricia Davies, Smaranda Muresan, Rebecca J. Passonneau
Abstract We present a unique dataset of student source-based argument essays to facilitate research on the relations between content, argumentation skills, and assessment. Two classroom writing assignments were given to college students in a STEM major, accompanied by a carefully designed rubric. The paper presents a reliability study of the rubric, showing it to be highly reliable, and initial annotation of content and argumentation in the essays.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4452/
PDF https://www.aclweb.org/anthology/W19-4452
PWC https://paperswithcode.com/paper/rubric-reliability-and-annotation-of-content
Repo
Framework
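
As an illustration of how rubric reliability is commonly quantified, the sketch below computes chance-corrected agreement between two graders with Cohen's kappa; the choice of statistic and the toy scores are assumptions, not the reliability measure or data reported in the paper.

```python
# Chance-corrected agreement between two graders scoring the same essays on a rubric.
from sklearn.metrics import cohen_kappa_score

rater_a = [4, 3, 5, 2, 4, 3, 4, 5]   # hypothetical rubric scores from grader A
rater_b = [4, 3, 4, 2, 4, 3, 4, 5]   # hypothetical rubric scores from grader B

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values near 1 indicate high reliability
```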

RACAI’s System at PharmaCoNER 2019

Title RACAI’s System at PharmaCoNER 2019
Authors Radu Ion, Vasile Florian Păiș, Maria Mitrofan
Abstract This paper describes the Named Entity Recognition system of the Institute for Artificial Intelligence “Mihai Drăgănescu” of the Romanian Academy (RACAI for short). Our best F1 score of 0.84984 was achieved using an ensemble of two systems: a gazetteer-based baseline and an RNN-based NER system, developed specially for PharmaCoNER 2019. We will describe the individual systems and the ensemble algorithm, compare the final system to the current state of the art, as well as discuss our results with respect to the quality of the training data and its annotation strategy. The resulting NER system is language independent, provided that language-dependent resources and preprocessing tools exist, such as tokenizers and POS taggers.
Tasks Named Entity Recognition
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5714/
PDF https://www.aclweb.org/anthology/D19-5714
PWC https://paperswithcode.com/paper/racais-system-at-pharmaconer-2019
Repo
Framework
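
A hedged sketch of combining a gazetteer baseline with a learned NER tagger, in the spirit of the ensemble described above. The backoff rule (trust the model, fall back to the gazetteer for tokens it leaves unlabeled), the tag names, and the toy gazetteer are assumptions, not the authors' ensemble algorithm.

```python
# Combine model predictions with a gazetteer lookup at the token level.
def gazetteer_tags(tokens, gazetteer):
    return ["B-DRUG" if t.lower() in gazetteer else "O" for t in tokens]

def ensemble(tokens, model_tags, gazetteer):
    """Prefer the model's prediction, but back off to the gazetteer for 'O' tokens."""
    gaz_tags = gazetteer_tags(tokens, gazetteer)
    return [m if m != "O" else g for m, g in zip(model_tags, gaz_tags)]

tokens = ["El", "paciente", "tomó", "paracetamol", "ayer"]
model_tags = ["O", "O", "O", "O", "O"]          # e.g. the RNN missed the entity
print(ensemble(tokens, model_tags, {"paracetamol", "ibuprofeno"}))
# ['O', 'O', 'O', 'B-DRUG', 'O']
```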

Identifying Predictive Causal Factors from News Streams

Title Identifying Predictive Causal Factors from News Streams
Authors Ananth Balashankar, Sunandan Chakraborty, Samuel Fraiberger, Lakshminarayanan Subramanian
Abstract We propose a new framework to uncover the relationship between news events and real world phenomena. We present the Predictive Causal Graph (PCG), which allows us to detect latent relationships between events mentioned in news streams. This graph is constructed by measuring how the occurrence of a word in the news influences the occurrence of another (set of) word(s) in the future. We show that PCG can be used to extract latent features from news streams, outperforming other graph-based methods in prediction error of 10 stock price time series for 12 months. We then extended PCG to be applicable for longer time windows by allowing time-varying factors, leading to stock price prediction error rates between 1.5% and 5% for about 4 years. We then manually validated PCG, finding that 67% of the causation semantic frame arguments present in the news corpus were directly connected in the PCG, the remaining being connected through a semantically relevant intermediate node.
Tasks Stock Price Prediction, Time Series
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1238/
PDF https://www.aclweb.org/anthology/D19-1238
PWC https://paperswithcode.com/paper/identifying-predictive-causal-factors-from
Repo
Framework
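
To make the PCG construction described above concrete, the sketch below scores how often one word's occurrence in the news is followed by another word's occurrence, relative to the latter's base rate, and keeps strong pairs as directed edges. The lift-based score, the lag of one day, and the threshold are illustrative assumptions, not the paper's exact formulation.

```python
# Build directed "influence" edges between words from daily occurrence indicators.
import numpy as np

def influence_score(u_series, v_series, lag=1):
    """u_series, v_series: 0/1 arrays marking whether each day's news mentions the word."""
    u, v = np.asarray(u_series), np.asarray(v_series)
    v_after_u = v[lag:][u[:-lag] == 1]           # v's occurrences on days following u
    if len(v_after_u) == 0:
        return 0.0
    return v_after_u.mean() - v.mean()           # lift over v's unconditional rate

def build_pcg(series_by_word, threshold=0.2):
    edges = []
    for u, us in series_by_word.items():
        for v, vs in series_by_word.items():
            if u != v and influence_score(us, vs) >= threshold:
                edges.append((u, v))
    return edges

daily = {"strike":   [1, 0, 1, 0, 0, 1, 0, 0],
         "shortage": [0, 1, 0, 1, 0, 0, 1, 0]}
print(build_pcg(daily))   # [('strike', 'shortage')]
```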

The LMU Munich Unsupervised Machine Translation System for WMT19

Title The LMU Munich Unsupervised Machine Translation System for WMT19
Authors Dario Stojanovski, Viktor Hangya, Matthias Huck, Alexander Fraser
Abstract We describe LMU Munich's machine translation system for German→Czech translation which was used to participate in the WMT19 shared task on unsupervised news translation. We train our model using monolingual data only from both languages. The final model is an unsupervised neural model using established techniques for unsupervised translation such as denoising autoencoding and online back-translation. We bootstrap the model with masked language model pretraining and enhance it with back-translations from an unsupervised phrase-based system which is itself bootstrapped using unsupervised bilingual word embeddings.
Tasks Denoising, Language Modelling, Machine Translation, Unsupervised Machine Translation, Word Embeddings
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5344/
PDF https://www.aclweb.org/anthology/W19-5344
PWC https://paperswithcode.com/paper/the-lmu-munich-unsupervised-machine-1
Repo
Framework
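
As context for the denoising-autoencoding objective mentioned above, the sketch below shows the kind of noise function commonly used in unsupervised NMT (random word dropping plus local shuffling); the noise parameters are assumptions and this system's actual noise model may differ.

```python
# Corrupt a sentence so a denoising autoencoder can learn to reconstruct the original.
import random

def add_noise(tokens, drop_prob=0.1, max_shuffle_dist=3, seed=None):
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() > drop_prob]      # random word dropping
    # Local shuffling: each token may move at most max_shuffle_dist positions.
    keys = [i + rng.uniform(0, max_shuffle_dist) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

src = "the model is trained on monolingual data only".split()
noisy = add_noise(src, seed=0)
print(noisy)   # the denoising autoencoder learns to reconstruct `src` from `noisy`
```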

UZH@CRAFT-ST: a Sequence-labeling Approach to Concept Recognition

Title UZH@CRAFT-ST: a Sequence-labeling Approach to Concept Recognition
Authors Lenz Furrer, Joseph Cornelius, Fabio Rinaldi
Abstract As our submission to the CRAFT shared task 2019, we present two neural approaches to concept recognition. We propose two different systems for joint named entity recognition (NER) and normalization (NEN), both of which model the task as a sequence labeling problem. Our first system is a BiLSTM network with two separate outputs for NER and NEN trained from scratch, whereas the second system is an instance of BioBERT fine-tuned on the concept-recognition task. We exploit two strategies for extending concept coverage, ontology pretraining and backoff with a dictionary lookup. Our results show that the backoff strategy effectively tackles the problem of unseen concepts, addressing a major limitation of the chosen design. In the cross-system comparison, BioBERT proves to be a strong basis for creating a concept-recognition system, although some entity types are predicted more accurately by the BiLSTM-based system.
Tasks Named Entity Recognition
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5726/
PDF https://www.aclweb.org/anthology/D19-5726
PWC https://paperswithcode.com/paper/uzhcraft-st-a-sequence-labeling-approach-to
Repo
Framework
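
A minimal sketch of the dictionary-lookup backoff described above: when the sequence-labeling model tags a span but cannot normalize it to a known concept identifier, an exact-match lookup against an ontology-derived dictionary is used instead. The UNKNOWN marker and the tiny dictionary are illustrative assumptions.

```python
# Backoff normalization: use the dictionary only when the model's concept ID is unknown.
concept_dict = {"heart": "UBERON:0000948", "zebrafish": "NCBITaxon:7955"}

def normalize(span_text, predicted_id):
    if predicted_id != "UNKNOWN":
        return predicted_id                                    # trust the tagger
    return concept_dict.get(span_text.lower(), "UNKNOWN")      # backoff lookup

print(normalize("zebrafish", "UNKNOWN"))     # NCBITaxon:7955
print(normalize("heart", "UBERON:0000948"))  # model prediction kept as-is
```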

Monocular Depth Estimation Using Relative Depth Maps

Title Monocular Depth Estimation Using Relative Depth Maps
Authors Jae-Han Lee, Chang-Su Kim
Abstract We propose a novel algorithm for monocular depth estimation using relative depth maps. First, using a convolutional neural network, we estimate relative depths between pairs of regions, as well as ordinary depths, at various scales. Second, we restore relative depth maps from selectively estimated data based on the rank-1 property of pairwise comparison matrices. Third, we decompose ordinary and relative depth maps into components and recombine them optimally to reconstruct a final depth map. Experimental results show that the proposed algorithm provides state-of-the-art depth estimation performance.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Lee_Monocular_Depth_Estimation_Using_Relative_Depth_Maps_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Lee_Monocular_Depth_Estimation_Using_Relative_Depth_Maps_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/monocular-depth-estimation-using-relative
Repo
Framework
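
The rank-1 property mentioned above can be made concrete as follows: the matrix of pairwise depth ratios R, with R[i, j] = d_i / d_j, equals d (1/d)^T and thus has rank 1, so depths can be recovered up to scale from a handful of observed ratios by least squares in log space. The solver below is a simplified stand-in for the paper's reconstruction step.

```python
# Recover depths (up to global scale) from a few pairwise depth ratios.
import numpy as np

def depths_from_ratios(n, observed):
    """observed: list of (i, j, ratio) with ratio ≈ d_i / d_j."""
    A = np.zeros((len(observed) + 1, n))
    b = np.zeros(len(observed) + 1)
    for k, (i, j, r) in enumerate(observed):
        A[k, i], A[k, j], b[k] = 1.0, -1.0, np.log(r)   # log d_i - log d_j = log r
    A[-1, 0] = 1.0                                       # fix the scale: log d_0 = 0
    log_d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.exp(log_d)

obs = [(0, 1, 0.5), (1, 2, 0.5), (2, 3, 8.0)]
print(depths_from_ratios(4, obs))   # ≈ [1.0, 2.0, 4.0, 0.5]
```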

Creation of a corpus with semantic role labels for Hungarian

Title Creation of a corpus with semantic role labels for Hungarian
Authors Attila Novák, László Laki, Borbála Novák, Andrea Dömötör, Noémi Ligeti-Nagy, Ágnes Kalivoda
Abstract In this article, ongoing research is presented, the immediate goal of which is to create a corpus annotated with semantic role labels for Hungarian that can be used to train a parser-based system capable of formulating relevant questions about the text it processes. We briefly describe the objectives of our research and our efforts at eliminating errors in the Hungarian Universal Dependencies corpus, which we use as the base of our annotation effort, at creating a Hungarian verbal argument database annotated with thematic roles, at classifying adjuncts, and at matching verbal argument frames to specific occurrences of verbs and participles in the corpus.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4026/
PDF https://www.aclweb.org/anthology/W19-4026
PWC https://paperswithcode.com/paper/creation-of-a-corpus-with-semantic-role
Repo
Framework

AE2-Nets: Autoencoder in Autoencoder Networks

Title AE2-Nets: Autoencoder in Autoencoder Networks
Authors Changqing Zhang, Yeqing Liu, Huazhu Fu
Abstract Learning on data represented with multiple views (e.g., multiple types of descriptors or modalities) is a rapidly growing direction in machine learning and computer vision. Despite their effectiveness, most existing algorithms usually focus on classification or clustering tasks. In contrast, in this paper we focus on unsupervised representation learning and propose a novel framework termed Autoencoder in Autoencoder Networks (AE^2-Nets), which integrates information from heterogeneous sources into an intact representation by the nested autoencoder framework. The proposed method has the following merits: (1) our model jointly performs view-specific representation learning (with the inner autoencoder networks) and multi-view information encoding (with the outer autoencoder networks) in a unified framework; (2) due to the degradation process from the latent representation to each single view, our model flexibly balances the complementarity and consistency among multiple views. The proposed model is efficiently solved by the alternating direction method (ADM), and demonstrates its effectiveness compared with state-of-the-art algorithms.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_AE2-Nets_Autoencoder_in_Autoencoder_Networks_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_AE2-Nets_Autoencoder_in_Autoencoder_Networks_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/ae2-nets-autoencoder-in-autoencoder-networks
Repo
Framework
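
A simplified PyTorch sketch of the nested autoencoder idea described above: inner autoencoders learn view-specific codes and an outer autoencoder compresses the concatenated codes into a single intact representation. Layer sizes, the concatenation-based coupling, and plain forward wiring (the paper uses an alternating-direction optimization scheme) are assumptions.

```python
# Inner autoencoders per view, outer autoencoder over the concatenated view codes.
import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, dim_in, dim_code):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, dim_code), nn.ReLU())
        self.dec = nn.Linear(dim_code, dim_in)

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)          # code and reconstruction

view_dims, code_dim, intact_dim = [20, 30], 8, 10
inner = nn.ModuleList([AE(d, code_dim) for d in view_dims])   # view-specific AEs
outer = AE(code_dim * len(view_dims), intact_dim)             # encodes all view codes

views = [torch.randn(4, d) for d in view_dims]                # a toy 2-view batch
codes, recons = zip(*(ae(v) for ae, v in zip(inner, views)))
intact, code_recon = outer(torch.cat(codes, dim=1))
print(intact.shape)   # torch.Size([4, 10]) -- the multi-view representation
```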

Propagation Mechanism for Deep and Wide Neural Networks

Title Propagation Mechanism for Deep and Wide Neural Networks
Authors Dejiang Xu, Mong Li Lee, Wynne Hsu
Abstract Recent deep neural networks (DNN) utilize identity mappings, involving either element-wise addition or channel-wise concatenation, for the propagation of features. In this paper, we propose a new propagation mechanism called channel-wise addition (cAdd) to deal with the vanishing gradients problem without sacrificing the complexity of the learned features. Unlike channel-wise concatenation, cAdd is able to eliminate the need to store feature maps, thus reducing the memory requirement. The proposed cAdd mechanism can deepen and widen existing neural architectures with fewer parameters compared to channel-wise concatenation and element-wise addition. We incorporate cAdd into state-of-the-art architectures such as ResNet, WideResNet, and CondenseNet and carry out extensive experiments on CIFAR10, CIFAR100, SVHN and ImageNet to demonstrate that cAdd-based architectures are able to achieve much higher accuracy with fewer parameters compared to their corresponding base architectures.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Xu_Propagation_Mechanism_for_Deep_and_Wide_Neural_Networks_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Xu_Propagation_Mechanism_for_Deep_and_Wide_Neural_Networks_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/propagation-mechanism-for-deep-and-wide
Repo
Framework
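
For context on the propagation mechanisms discussed above, the snippet contrasts element-wise addition (ResNet-style) and channel-wise concatenation (DenseNet-style), the two baselines the abstract compares against. The final cAdd-like variant, which adds a narrower block output into a slice of the input channels so the channel count stays fixed, is only a hedged guess at the mechanism and not the paper's definition.

```python
# Shapes of the three propagation variants on toy feature maps.
import torch

x = torch.randn(1, 64, 8, 8)        # input feature maps (N, C, H, W)
f = torch.randn(1, 64, 8, 8)        # block output with the same channel count
g = torch.randn(1, 16, 8, 8)        # narrower block output

elementwise_add = x + f                       # ResNet: identical shapes required
concat = torch.cat([x, g], dim=1)             # DenseNet: channels grow (64 -> 80)

cadd = x.clone()                              # assumed cAdd-style update:
cadd[:, :16] = cadd[:, :16] + g               # add into a channel slice, channels stay at 64
print(elementwise_add.shape, concat.shape, cadd.shape)
# torch.Size([1, 64, 8, 8]) torch.Size([1, 80, 8, 8]) torch.Size([1, 64, 8, 8])
```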

A Fast Machine Learning Model for ECG-Based Heartbeat Classification and Arrhythmia Detection

Title A Fast Machine Learning Model for ECG-Based Heartbeat Classification and Arrhythmia Detection
Authors Miquel Alfaras, Miguel C. Soriano, Silvia Ortín
Abstract We present a fully automatic and fast ECG arrhythmia classifier based on a simple brain-inspired machine learning approach known as Echo State Networks. Our classifier has a low-demanding feature processing that only requires a single ECG lead. Its training and validation follows an inter-patient procedure. Our approach is compatible with an online classification that aligns well with recent advances in health-monitoring wireless devices and wearables. The use of a combination of ensembles allows us to exploit parallelism to train the classifier with remarkable speeds. The heartbeat classifier is evaluated over two ECG databases, the MIT-BIH AR and the AHA. In the MIT-BIH AR database, our classification approach provides a sensitivity of 92.7% and positive predictive value of 86.1% for the ventricular ectopic beats, using the single lead II, and a sensitivity of 95.7% and positive predictive value of 75.1% when using the lead V1. These results are comparable with the state of the art in fully automatic ECG classifiers and even outperform other ECG classifiers that follow more complex feature-selection approaches.
Tasks Arrhythmia Detection, Electrocardiography (ECG), Feature Selection, Heartbeat Classification
Published 2019-07-18
URL https://doi.org/10.3389/fphy.2019.00103
PDF https://www.frontiersin.org/articles/10.3389/fphy.2019.00103/pdf
PWC https://paperswithcode.com/paper/a-fast-machine-learning-model-for-ecg-based
Repo
Framework
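
A minimal NumPy sketch of an echo state network of the kind described above: a fixed random reservoir is driven by a single-lead ECG window and only a linear readout is trained (ridge regression here). The reservoir size, spectral radius, use of the final state as the feature vector, and the toy data are illustrative assumptions.

```python
# Fixed random reservoir + trained linear readout, i.e. a basic echo state network.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 100
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))        # keep the spectral radius below 1

def reservoir_features(signal):
    """Drive the reservoir with one heartbeat window; return the final state."""
    x = np.zeros(n_res)
    for u in signal:
        x = np.tanh(W_in @ np.array([u]) + W @ x)
    return x

def train_readout(windows, labels, ridge=1e-2):
    X = np.stack([reservoir_features(w) for w in windows])
    Y = np.eye(int(labels.max()) + 1)[labels]     # one-hot targets
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)

# Toy data: 20 random "beats" of 50 samples with binary labels.
beats = rng.standard_normal((20, 50))
labels = rng.integers(0, 2, 20)
W_out = train_readout(beats, labels)
pred = np.argmax(reservoir_features(beats[0]) @ W_out)
print(pred)   # predicted class index for the first beat (toy-quality model)
```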