July 26, 2019

Paper Group NANR 68

Adversarial Training for Relation Extraction. Towards efficient string processing of annotated events. An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts. Learning to Model the Tail. SINAI at SemEval-2017 Task 4: User based classification. The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTrage …

Adversarial Training for Relation Extraction

Title Adversarial Training for Relation Extraction
Authors Yi Wu, David Bamman, Stuart Russell
Abstract Adversarial training is a means of regularizing classification algorithms by adding adversarial noise to the training data. We apply adversarial training in relation extraction within the multi-instance multi-label learning framework. We evaluate various neural network architectures on two different datasets. Experimental results demonstrate that adversarial training is generally effective for both CNN and RNN models and significantly improves the precision of predicted relations.
Tasks Image Classification, Multi-Label Learning, Relation Extraction, Text Classification
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1187/
PDF https://www.aclweb.org/anthology/D17-1187
PWC https://paperswithcode.com/paper/adversarial-training-for-relation-extraction
Repo
Framework
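
The abstract above does not say how the adversarial noise is computed. As a minimal sketch, assuming the common gradient-based recipe (shift the input by a small step `epsilon` along the L2-normalized gradient of the loss), the idea can be shown on a toy logistic classifier. The classifier, the weights, and the value of `epsilon` are illustrative assumptions, not the authors' setup.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of a logistic classifier on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def adversarial_example(w, b, x, y, epsilon=0.1):
    """Return x shifted by epsilon along the L2-normalized loss gradient."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # For logistic regression, d(loss)/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(x)  # no gradient direction available; leave x unchanged
    return [xi + epsilon * g / norm for xi, g in zip(x, grad)]

# Toy weights and input (assumed values, for illustration only).
w, b = [1.5, -2.0], 0.1
x, y = [0.4, 0.3], 1

x_adv = adversarial_example(w, b, x, y, epsilon=0.1)
clean_loss = loss(w, b, x, y)
adv_loss = loss(w, b, x_adv, y)  # larger than clean_loss for this input
```

In adversarial training, the loss on `x_adv` would be added to the loss on `x` at each update, so the model is penalized for being sensitive to small worst-case shifts.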

Towards efficient string processing of annotated events

Title Towards efficient string processing of annotated events
Authors David Woods, Tim Fernando, Carl Vogel
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-7414/
PDF https://www.aclweb.org/anthology/W17-7414
PWC https://paperswithcode.com/paper/towards-efficient-string-processing-of
Repo
Framework

An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts

Title An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts
Authors Cecilia Ovesdotter Alm, Benjamin Meyers, Emily Prud'hommeaux
Abstract We present an educational tool that integrates computational linguistics resources for use in non-technical undergraduate language science courses. By using the tool in conjunction with evidence-driven pedagogical case studies, we strive to provide opportunities for students to gain an understanding of linguistic concepts and analysis through the lens of realistic problems in feasible ways. Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and learning of linguistics. The approach introduced also has potential to encourage students across training backgrounds to continue on to computational language analysis coursework.
Tasks Active Learning
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-2003/
PDF https://www.aclweb.org/anthology/D17-2003
PWC https://paperswithcode.com/paper/an-analysis-and-visualization-tool-for-case
Repo
Framework

Learning to Model the Tail

Title Learning to Model the Tail
Authors Yu-Xiong Wang, Deva Ramanan, Martial Hebert
Abstract We describe an approach to learning from long-tailed, imbalanced datasets that are prevalent in real-world settings. Here, the challenge is to learn accurate "few-shot" models for classes in the tail of the class distribution, for which little data is available. We cast this problem as transfer learning, where knowledge from the data-rich classes in the head of the distribution is transferred to the data-poor classes in the tail. Our key insights are as follows. First, we propose to transfer meta-knowledge about learning-to-learn from the head classes. This knowledge is encoded with a meta-network that operates on the space of model parameters, that is trained to predict many-shot model parameters from few-shot model parameters. Second, we transfer this meta-knowledge in a progressive manner, from classes in the head to the "body", and from the "body" to the tail. That is, we transfer knowledge in a gradual fashion, regularizing meta-networks for few-shot regression with those trained with more training data. This allows our final network to capture a notion of model dynamics, that predicts how model parameters are likely to change as more training data is gradually added. We demonstrate results on image classification datasets (SUN, Places, and ImageNet) tuned for the long-tailed setting, that significantly outperform common heuristics, such as data resampling or reweighting.
Tasks few-shot regression, Image Classification, Transfer Learning
Published 2017-12-01
URL http://papers.nips.cc/paper/7278-learning-to-model-the-tail
PDF http://papers.nips.cc/paper/7278-learning-to-model-the-tail.pdf
PWC https://paperswithcode.com/paper/learning-to-model-the-tail
Repo
Framework

SINAI at SemEval-2017 Task 4: User based classification

Title SINAI at SemEval-2017 Task 4: User based classification
Authors Salud María Jiménez-Zafra, Arturo Montejo-Ráez, Maite Martin, L. Alfonso Ureña-López
Abstract This document describes our participation in SemEval-2017 Task 4: Sentiment Analysis in Twitter. We have only reported results for subtask B - English, determining the polarity towards a topic on a two-point scale (positive or negative sentiment). Our main contribution is the integration of user information in the classification process. An SVM model is trained with Word2Vec vectors from the user's tweets extracted from their timeline. The results show that user-specific classifiers trained on tweets from a user's timeline can introduce noise, because those tweets are labeled by an imperfect system and are therefore error-prone. This encourages us to explore further integration of user information for author-based Sentiment Analysis.
Tasks Sentiment Analysis, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2104/
PDF https://www.aclweb.org/anthology/S17-2104
PWC https://paperswithcode.com/paper/sinai-at-semeval-2017-task-4-user-based
Repo
Framework

The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy

Title The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy
Authors Éric de La Clergerie, Benoît Sagot, Djamé Seddah
Abstract We present the ParisNLP entry at the UD CoNLL 2017 parsing shared task. In addition to the UDpipe models provided, we built our own data-driven tokenization models, sentence segmenter and lexicon-based morphological analyzers. All of these were used with a range of different parsing models (neural or not, feature-rich or not, transition or graph-based, etc.) and the best combination for each language was selected. Unfortunately, a glitch in the shared task's Matrix led our model selector to run generic, weakly lexicalized models, tailored for surprise languages, instead of our dataset-specific models. Because of this #ParsingTragedy, we officially ranked 27th, whereas our real models finally unofficially ranked 6th.
Tasks Dependency Parsing, Tokenization
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-3026/
PDF https://www.aclweb.org/anthology/K17-3026
PWC https://paperswithcode.com/paper/the-parisnlp-entry-at-the-conll-ud-shared
Repo
Framework

Tagging Funding Agencies and Grants in Scientific Articles using Sequential Learning Models

Title Tagging Funding Agencies and Grants in Scientific Articles using Sequential Learning Models
Authors Subhradeep Kayal, Zubair Afzal, George Tsatsaronis, Sophia Katrenko, Pascal Coupet, Marius Doornenbal, Michelle Gregory
Abstract In this paper we present a solution for tagging funding bodies and grants in scientific articles using a combination of trained sequential learning models, namely conditional random fields (CRF), hidden Markov models (HMM) and maximum entropy models (MaxEnt), on a benchmark set created in-house. We apply the trained models to address the BioASQ challenge 5c, which is a newly introduced task that aims to solve the problem of funding information extraction from scientific articles. Results in the dry-run data set of BioASQ task 5c show that the suggested approach can achieve a micro-recall of more than 85% in tagging both funding bodies and grants.
Tasks Document Summarization, Information Retrieval, Multi-Document Summarization, Text Classification
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2327/
PDF https://www.aclweb.org/anthology/W17-2327
PWC https://paperswithcode.com/paper/tagging-funding-agencies-and-grants-in
Repo
Framework

Integrating Vision and Language Datasets to Measure Word Concreteness

Title Integrating Vision and Language Datasets to Measure Word Concreteness
Authors Gitit Kehat, James Pustejovsky
Abstract We present and take advantage of the inherent visualizability properties of words in visual corpora (the textual components of vision-language datasets) to compute concreteness scores for words. Our simple method does not require hand-annotated concreteness score lists for training, and yields state-of-the-art results when evaluated against concreteness score lists and previously derived scores, as well as when used for metaphor detection.
Tasks Image Captioning, Image Retrieval, Question Answering
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2018/
PDF https://www.aclweb.org/anthology/I17-2018
PWC https://paperswithcode.com/paper/integrating-vision-and-language-datasets-to
Repo
Framework

Russian-Tatar Socio-Political Thesaurus: Methodology, Challenges, the Status of the Project

Title Russian-Tatar Socio-Political Thesaurus: Methodology, Challenges, the Status of the Project
Authors Alfiya Galieva, Olga Nevzorova, Dilyara Yakubova
Abstract This paper discusses the general methodology and important practical aspects of implementing a new bilingual lexical resource, the Russian-Tatar Socio-Political Thesaurus, which is being developed on the basis of the Russian RuThes thesaurus format as a hierarchy of concepts viewed as units of thought. Each concept is linked with a set of language expressions (words and collocations) referring to it in texts (text entries). Currently the Russian-Tatar Socio-Political Thesaurus includes 6,000 concepts, while new concepts and text entries are being constantly added to it. The paper outlines the main challenges of translating concept names and their text entries into Tatar, and describes ways of reflecting the specificity of the Tatar lexical-semantic system.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1034/
PDF https://doi.org/10.26615/978-954-452-049-6_034
PWC https://paperswithcode.com/paper/russian-tatar-socio-political-thesaurus
Repo
Framework

The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields

Title The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields
Authors Lung-Hao Lee, Kuei-Ching Lee, Yuen-Hsien Tseng
Abstract This study describes the design of the NTNU system for the ScienceIE task at the SemEval 2017 workshop. We use self-defined feature templates and multiple conditional random fields with extracted features to identify keyphrases along with categorized labels and their relations from scientific publications. A total of 16 teams participated in evaluation scenario 1 (subtasks A, B, and C), with only 7 teams competing in all sub-tasks. Our best micro-averaging F1 across the three subtasks is 0.23, ranking in the middle among all 16 submissions.
Tasks Multi-Task Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2165/
PDF https://www.aclweb.org/anthology/S17-2165
PWC https://paperswithcode.com/paper/the-ntnu-system-at-semeval-2017-task-10
Repo
Framework
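
The abstract above reports a micro-averaged F1 across three subtasks. As a rough sketch of what micro-averaging means, the function below pools true positives, false positives, and false negatives over all subtasks before computing precision and recall. The counts in the usage lines are invented for illustration and are not the NTNU system's results; the official ScienceIE scorer may differ in detail.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-subtask (true_pos, false_pos, false_neg) counts."""
    counts = list(counts)
    # Pool the raw counts first; this is what distinguishes micro from macro averaging.
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for three subtasks (made up for this example).
example_counts = [(10, 10, 30), (5, 5, 15), (5, 5, 15)]
score = micro_f1(example_counts)  # pooled precision 0.5, pooled recall 0.25
```

Because the counts are pooled, subtasks with more instances weigh more heavily in the final score than they would under macro-averaging.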

Machine translation with North Saami as a pivot language

Title Machine translation with North Saami as a pivot language
Authors Lene Antonsen, Ciprian Gerstenberger, Maja Kappfjell, Sandra Nystø Rahka, Marja-Liisa Olthuis, Trond Trosterud, Francis M. Tyers
Abstract
Tasks Machine Translation
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0215/
PDF https://www.aclweb.org/anthology/W17-0215
PWC https://paperswithcode.com/paper/machine-translation-with-north-saami-as-a
Repo
Framework

Tracking Bias in News Sources Using Social Media: the Russia-Ukraine Maidan Crisis of 2013–2014

Title Tracking Bias in News Sources Using Social Media: the Russia-Ukraine Maidan Crisis of 2013–2014
Authors Peter Potash, Alexey Romanov, Mikhail Gronas, Anna Rumshisky
Abstract This paper addresses the task of identifying the bias in news articles published during a political or social conflict. We create a silver-standard corpus based on the actions of users in social media. Specifically, we reconceptualize bias in terms of how likely a given article is to be shared or liked by each of the opposing sides. We apply our methodology to a dataset of links collected in relation to the Russia-Ukraine Maidan crisis from 2013-2014. We show that on the task of predicting which side is likely to prefer a given article, a Naive Bayes classifier can record 90.3% accuracy looking only at domain names of the news sources. The best accuracy of 93.5% is achieved by a feed-forward neural network. We also apply our methodology to a gold-labeled set of articles annotated for bias, where the aforementioned Naive Bayes classifier records 82.6% accuracy and a feed-forward neural network records 85.6% accuracy.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4203/
PDF https://www.aclweb.org/anthology/W17-4203
PWC https://paperswithcode.com/paper/tracking-bias-in-news-sources-using-social
Repo
Framework
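
The abstract above mentions a Naive Bayes classifier that looks only at the domain names of news sources. A minimal sketch of that idea follows, assuming a single categorical feature (the domain) with add-one smoothing. The class name `DomainNaiveBayes`, the URLs, and the side labels are hypothetical; the authors' features and data are not reproduced here.

```python
import math
from collections import Counter, defaultdict
from urllib.parse import urlparse

def domain_of(url):
    """Extract the lower-cased host name from a URL."""
    return urlparse(url).netloc.lower()

class DomainNaiveBayes:
    """Naive Bayes over one categorical feature: the article's domain."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # smoothing strength
        self.side_counts = Counter()              # articles seen per side
        self.domain_counts = defaultdict(Counter) # per-side domain counts
        self.domains = set()

    def fit(self, urls, sides):
        for url, side in zip(urls, sides):
            d = domain_of(url)
            self.side_counts[side] += 1
            self.domain_counts[side][d] += 1
            self.domains.add(d)
        return self

    def predict(self, url):
        d = domain_of(url)
        total = sum(self.side_counts.values())
        vocab = len(self.domains) + 1  # one extra slot for unseen domains
        best_side, best_score = None, -math.inf
        for side, n in self.side_counts.items():
            log_prior = math.log(n / total)
            log_lik = math.log(
                (self.domain_counts[side][d] + self.alpha) / (n + self.alpha * vocab)
            )
            if log_prior + log_lik > best_score:
                best_side, best_score = side, log_prior + log_lik
        return best_side

# Made-up training links and side labels, for illustration only.
train_urls = [
    "https://news-a.example/story/1", "https://news-a.example/story/2",
    "https://news-a.example/story/3", "https://news-a.example/story/4",
    "https://news-b.example/story/1", "https://news-b.example/story/2",
    "https://news-b.example/story/3",
]
train_sides = ["side_1", "side_1", "side_1", "side_2", "side_2", "side_2", "side_2"]

model = DomainNaiveBayes().fit(train_urls, train_sides)
prediction_a = model.predict("https://news-a.example/story/9")
prediction_b = model.predict("https://news-b.example/story/9")
```

With one feature, the classifier reduces to comparing smoothed per-side domain frequencies weighted by how many articles each side shared overall.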

Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data

Title Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data
Authors Philippa Shoemark, James Kirby, Sharon Goldwater
Abstract Sociolinguistic research suggests that speakers modulate their language style in response to their audience. Similar effects have recently been claimed to occur in the informal written context of Twitter, with users choosing less region-specific and non-standard vocabulary when addressing larger audiences. However, these studies have not carefully controlled for the possible confound of topic: that is, tweets addressed to a broad audience might also tend towards topics that engender a more formal style. In addition, it is not clear to what extent previous results generalize to different samples of users. Using mixed-effects models, we show that audience and topic have independent effects on the rate of distinctively Scottish usage in two demographically distinct Twitter user samples. However, not all effects are consistent between the two groups, underscoring the importance of replicating studies on distinct user samples before drawing strong conclusions from social media data.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4908/
PDF https://www.aclweb.org/anthology/W17-4908
PWC https://paperswithcode.com/paper/topic-and-audience-effects-on-distinctively
Repo
Framework

Using Convolutional Neural Networks to Classify Hate-Speech

Title Using Convolutional Neural Networks to Classify Hate-Speech
Authors Björn Gambäck, Utpal Kumar Sikdar
Abstract The paper introduces a deep learning-based Twitter hate-speech text classification system. The classifier assigns each tweet to one of four predefined categories: racism, sexism, both (racism and sexism) and non-hate-speech. Four Convolutional Neural Network models were trained on, respectively, character 4-grams, word vectors based on semantic information built using word2vec, randomly generated word vectors, and word vectors combined with character n-grams. The feature set was down-sized in the networks by max-pooling, and a softmax function was used to classify tweets. Tested by 10-fold cross-validation, the model based on word2vec embeddings performed best, with higher precision than recall, and a 78.3% F-score.
Tasks Named Entity Recognition, Part-Of-Speech Tagging, Sentiment Analysis, Text Classification
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-3013/
PDF https://www.aclweb.org/anthology/W17-3013
PWC https://paperswithcode.com/paper/using-convolutional-neural-networks-to-1
Repo
Framework
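
The abstract above names three building blocks: character 4-gram features, max-pooling to shrink the feature set, and a softmax over the four categories. The sketch below shows each piece in isolation in plain Python. It is not the authors' network; the feature maps and class scores are toy values chosen for illustration.

```python
import math

def char_ngrams(text, n=4):
    """All overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def max_pool(feature_maps):
    """Max-over-time pooling: keep the largest value in each feature column.

    feature_maps is a list of rows (one per position), each a list of
    feature values of equal length.
    """
    return [max(column) for column in zip(*feature_maps)]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

grams = char_ngrams("tweet", 4)
pooled = max_pool([[1.0, 5.0], [3.0, 2.0]])

# Toy scores for the four categories named in the abstract.
labels = ["racism", "sexism", "both", "non-hate-speech"]
probs = softmax([0.2, 0.1, -1.0, 2.0])
predicted = labels[probs.index(max(probs))]
```

In a real convolutional model, the rows passed to `max_pool` would be filter activations at each position of the tweet, and the softmax input would come from a final dense layer.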

A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations

Title A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations
Authors KyungTae Lim, Thierry Poibeau
Abstract In this paper, we present our multilingual dependency parser developed for the CoNLL 2017 UD Shared Task dealing with "Multilingual Parsing from Raw Text to Universal Dependencies". Our parser extends the monolingual BIST-parser as a multi-source multilingual trainable parser. Thanks to multilingual word embeddings and one-hot encodings for languages, our system can use both monolingual and multi-source training. We trained 69 monolingual language models and 13 multilingual models for the shared task. Our multilingual approach, which makes use of different resources, yields better results than the monolingual approach for 11 languages. Our system ranked 5th and achieved an overall LAS score of 70.93 over the 81 test corpora (macro-averaged LAS F1 score).
Tasks Dependency Parsing, Multilingual Word Embeddings, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-3006/
PDF https://www.aclweb.org/anthology/K17-3006
PWC https://paperswithcode.com/paper/a-system-for-multilingual-dependency-parsing
Repo
Framework
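
The abstract above reports results as a labeled attachment score (LAS). As a simplified sketch, assuming the predicted and gold analyses share the same tokenization, LAS is the percentage of tokens whose predicted head and relation label both match the gold annotation. The shared task's LAS F1 also accounts for tokenization mismatches, which this sketch ignores. The sentence and labels below are invented for illustration.

```python
def las(gold, pred):
    """Labeled attachment score in percent.

    gold and pred are lists of (head_index, relation_label) pairs,
    one pair per token, over the same tokenization.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same tokens")
    if not gold:
        return 0.0
    # A token counts only if both its head and its label are correct.
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * correct / len(gold)

# Toy three-token sentence: the last token gets the right head but the wrong label.
gold_tree = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred_tree = [(2, "nsubj"), (0, "root"), (2, "nmod")]
score = las(gold_tree, pred_tree)
```

Macro-averaging, as in the abstract, would compute this score per test corpus and then take the unweighted mean over corpora.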