July 26, 2019


Paper Group NANR 68

Adversarial Training for Relation Extraction. Towards efficient string processing of annotated events. An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts. Learning to Model the Tail. SINAI at SemEval-2017 Task 4: User based classification. The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTrage …

Adversarial Training for Relation Extraction


Title	Adversarial Training for Relation Extraction
Authors	Yi Wu, David Bamman, Stuart Russell
Abstract	Adversarial training is a means of regularizing classification algorithms by adding adversarial noise to the training data. We apply adversarial training to relation extraction within the multi-instance multi-label learning framework. We evaluate various neural network architectures on two different datasets. Experimental results demonstrate that adversarial training is generally effective for both CNN and RNN models and significantly improves the precision of predicted relations.
Tasks	Image Classification, Multi-Label Learning, Relation Extraction, Text Classification
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1187/
PDF	https://www.aclweb.org/anthology/D17-1187
PWC	https://paperswithcode.com/paper/adversarial-training-for-relation-extraction
Repo
Framework
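
The idea of adding adversarial noise to the training data can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a plain logistic classifier instead of the CNN and RNN models in the paper, perturbs the input vector along the L2-normalized loss gradient (one common formulation), and weights the clean and adversarial losses equally. The function names and the `epsilon` parameter are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of the positive class for input vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    """Binary cross-entropy of a logistic classifier on one example."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def adversarial_input(w, b, x, y, epsilon=0.1):
    """Shift x by epsilon along the L2-normalized loss gradient."""
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]  # d(loss)/dx for logistic regression
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(x)  # no gradient direction available; leave x unchanged
    return [xi + epsilon * g / norm for xi, g in zip(x, grad)]

def adversarial_objective(w, b, x, y, epsilon=0.1):
    """Clean loss plus loss on the perturbed input (equal weighting assumed)."""
    x_adv = adversarial_input(w, b, x, y, epsilon)
    return loss(w, b, x, y) + loss(w, b, x_adv, y)
```

Minimizing `adversarial_objective` instead of `loss` penalizes a model whose prediction changes sharply under a small, worst-case shift of the input, which is the regularizing effect the abstract refers to.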

Towards efficient string processing of annotated events


Title	Towards efficient string processing of annotated events
Authors	David Woods, Tim Fernando, Carl Vogel
Abstract
Tasks
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-7414/
PDF	https://www.aclweb.org/anthology/W17-7414
PWC	https://paperswithcode.com/paper/towards-efficient-string-processing-of
Repo
Framework

An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts


Title	An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts
Authors	Cecilia Ovesdotter Alm, Benjamin Meyers, Emily Prud'hommeaux
Abstract	We present an educational tool that integrates computational linguistics resources for use in non-technical undergraduate language science courses. By using the tool in conjunction with evidence-driven pedagogical case studies, we strive to provide opportunities for students to gain an understanding of linguistic concepts and analysis through the lens of realistic problems in feasible ways. Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and learning of linguistics. The approach introduced also has potential to encourage students across training backgrounds to continue on to computational language analysis coursework.
Tasks	Active Learning
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-2003/
PDF	https://www.aclweb.org/anthology/D17-2003
PWC	https://paperswithcode.com/paper/an-analysis-and-visualization-tool-for-case
Repo
Framework

Learning to Model the Tail


Title	Learning to Model the Tail
Authors	Yu-Xiong Wang, Deva Ramanan, Martial Hebert
Abstract	We describe an approach to learning from long-tailed, imbalanced datasets that are prevalent in real-world settings. Here, the challenge is to learn accurate “few-shot” models for classes in the tail of the class distribution, for which little data is available. We cast this problem as transfer learning, where knowledge from the data-rich classes in the head of the distribution is transferred to the data-poor classes in the tail. Our key insights are as follows. First, we propose to transfer meta-knowledge about learning-to-learn from the head classes. This knowledge is encoded with a meta-network that operates on the space of model parameters and is trained to predict many-shot model parameters from few-shot model parameters. Second, we transfer this meta-knowledge in a progressive manner, from classes in the head to the “body”, and from the “body” to the tail. That is, we transfer knowledge in a gradual fashion, regularizing meta-networks for few-shot regression with those trained with more training data. This allows our final network to capture a notion of model dynamics that predicts how model parameters are likely to change as more training data is gradually added. We demonstrate results on image classification datasets (SUN, Places, and ImageNet) tuned for the long-tailed setting that significantly outperform common heuristics, such as data resampling or reweighting.
Tasks	few-shot regression, Image Classification, Transfer Learning
Published	2017-12-01
URL	http://papers.nips.cc/paper/7278-learning-to-model-the-tail
PDF	http://papers.nips.cc/paper/7278-learning-to-model-the-tail.pdf
PWC	https://paperswithcode.com/paper/learning-to-model-the-tail
Repo
Framework

SINAI at SemEval-2017 Task 4: User based classification


Title	SINAI at SemEval-2017 Task 4: User based classification
Authors	Salud María Jiménez-Zafra, Arturo Montejo-Ráez, Maite Martin, L. Alfonso Ureña-López
Abstract	This document describes our participation in SemEval-2017 Task 4: Sentiment Analysis in Twitter. We have only reported results for subtask B - English, determining the polarity towards a topic on a two-point scale (positive or negative sentiment). Our main contribution is the integration of user information in the classification process. An SVM model is trained with Word2Vec vectors from the user's tweets extracted from their timeline. The obtained results show that user-specific classifiers trained on tweets from the user's timeline can introduce noise, as they are error-prone because the tweets are classified by an imperfect system. This encourages us to explore further integration of user information for author-based Sentiment Analysis.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2104/
PDF	https://www.aclweb.org/anthology/S17-2104
PWC	https://paperswithcode.com/paper/sinai-at-semeval-2017-task-4-user-based
Repo
Framework

The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy


Title	The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy
Authors	Éric de La Clergerie, Benoît Sagot, Djamé Seddah
Abstract	We present the ParisNLP entry at the UD CoNLL 2017 parsing shared task. In addition to the UDPipe models provided, we built our own data-driven tokenization models, sentence segmenter and lexicon-based morphological analyzers. All of these were used with a range of different parsing models (neural or not, feature-rich or not, transition or graph-based, etc.) and the best combination for each language was selected. Unfortunately, a glitch in the shared task's Matrix led our model selector to run generic, weakly lexicalized models, tailored for surprise languages, instead of our dataset-specific models. Because of this #ParsingTragedy, we officially ranked 27th, whereas our real models finally unofficially ranked 6th.
Tasks	Dependency Parsing, Tokenization
Published	2017-08-01
URL	https://www.aclweb.org/anthology/K17-3026/
PDF	https://www.aclweb.org/anthology/K17-3026
PWC	https://paperswithcode.com/paper/the-parisnlp-entry-at-the-conll-ud-shared
Repo
Framework

Tagging Funding Agencies and Grants in Scientific Articles using Sequential Learning Models


Title	Tagging Funding Agencies and Grants in Scientific Articles using Sequential Learning Models
Authors	Subhradeep Kayal, Zubair Afzal, George Tsatsaronis, Sophia Katrenko, Pascal Coupet, Marius Doornenbal, Michelle Gregory
Abstract	In this paper we present a solution for tagging funding bodies and grants in scientific articles using a combination of trained sequential learning models, namely conditional random fields (CRF), hidden Markov models (HMM) and maximum entropy models (MaxEnt), on a benchmark set created in-house. We apply the trained models to address the BioASQ challenge 5c, which is a newly introduced task that aims to solve the problem of funding information extraction from scientific articles. Results in the dry-run data set of BioASQ task 5c show that the suggested approach can achieve a micro-recall of more than 85% in tagging both funding bodies and grants.
Tasks	Document Summarization, Information Retrieval, Multi-Document Summarization, Text Classification
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2327/
PDF	https://www.aclweb.org/anthology/W17-2327
PWC	https://paperswithcode.com/paper/tagging-funding-agencies-and-grants-in
Repo
Framework

Integrating Vision and Language Datasets to Measure Word Concreteness


Title	Integrating Vision and Language Datasets to Measure Word Concreteness
Authors	Gitit Kehat, James Pustejovsky
Abstract	We present and take advantage of the inherent visualizability properties of words in visual corpora (the textual components of vision-language datasets) to compute concreteness scores for words. Our simple method does not require hand-annotated concreteness score lists for training, and yields state-of-the-art results when evaluated against concreteness score lists and previously derived scores, as well as when used for metaphor detection.
Tasks	Image Captioning, Image Retrieval, Question Answering
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-2018/
PDF	https://www.aclweb.org/anthology/I17-2018
PWC	https://paperswithcode.com/paper/integrating-vision-and-language-datasets-to
Repo
Framework

Russian-Tatar Socio-Political Thesaurus: Methodology, Challenges, the Status of the Project


Title	Russian-Tatar Socio-Political Thesaurus: Methodology, Challenges, the Status of the Project
Authors	Alfiya Galieva, Olga Nevzorova, Dilyara Yakubova
Abstract	This paper discusses the general methodology and important practical aspects of implementing a new bilingual lexical resource, the Russian-Tatar Socio-Political Thesaurus, which is being developed on the basis of the Russian RuThes thesaurus format as a hierarchy of concepts viewed as units of thought. Each concept is linked with a set of language expressions (words and collocations) referring to it in texts (text entries). Currently the Russian-Tatar Socio-Political Thesaurus includes 6,000 concepts, while new concepts and text entries are constantly being added to it. The paper outlines the main challenges of translating concept names and their text entries into Tatar, and describes ways of reflecting the specificity of the Tatar lexical-semantic system.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1034/
PDF	https://doi.org/10.26615/978-954-452-049-6_034
PWC	https://paperswithcode.com/paper/russian-tatar-socio-political-thesaurus
Repo
Framework

The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields


Title	The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields
Authors	Lung-Hao Lee, Kuei-Ching Lee, Yuen-Hsien Tseng
Abstract	This study describes the design of the NTNU system for the ScienceIE task at the SemEval 2017 workshop. We use self-defined feature templates and multiple conditional random fields with extracted features to identify keyphrases along with categorized labels and their relations from scientific publications. A total of 16 teams participated in evaluation scenario 1 (subtasks A, B, and C), with only 7 teams competing in all sub-tasks. Our best micro-averaging F1 across the three subtasks is 0.23, ranking in the middle among all 16 submissions.
Tasks	Multi-Task Learning
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2165/
PDF	https://www.aclweb.org/anthology/S17-2165
PWC	https://paperswithcode.com/paper/the-ntnu-system-at-semeval-2017-task-10
Repo
Framework
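
The micro-averaged F1 reported above pools true positives, false positives and false negatives over all items before computing precision and recall. A minimal sketch of that calculation follows; the `(tp, fp, fn)` tuple layout and the function name are assumptions made for illustration, and the counts in the usage note are invented, not taken from the paper.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-class (tp, fp, fn) count tuples."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    if tp == 0:
        return 0.0  # no correct predictions, so precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, `micro_f1([(2, 1, 1), (1, 3, 0)])` pools the counts to 3 true positives, 4 false positives and 1 false negative, giving precision 3/7, recall 3/4 and an F1 of 6/11. Because the counts are pooled, frequent classes dominate the score, unlike a macro average that weights every class equally.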

Machine translation with North Saami as a pivot language


Title	Machine translation with North Saami as a pivot language
Authors	Lene Antonsen, Ciprian Gerstenberger, Maja Kappfjell, Sandra Nystø Rahka, Marja-Liisa Olthuis, Trond Trosterud, Francis M. Tyers
Abstract
Tasks	Machine Translation
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0215/
PDF	https://www.aclweb.org/anthology/W17-0215
PWC	https://paperswithcode.com/paper/machine-translation-with-north-saami-as-a
Repo
Framework

Tracking Bias in News Sources Using Social Media: the Russia-Ukraine Maidan Crisis of 2013–2014


Title	Tracking Bias in News Sources Using Social Media: the Russia-Ukraine Maidan Crisis of 2013–2014
Authors	Peter Potash, Alexey Romanov, Mikhail Gronas, Anna Rumshisky
Abstract	This paper addresses the task of identifying the bias in news articles published during a political or social conflict. We create a silver-standard corpus based on the actions of users in social media. Specifically, we reconceptualize bias in terms of how likely a given article is to be shared or liked by each of the opposing sides. We apply our methodology to a dataset of links collected in relation to the Russia-Ukraine Maidan crisis from 2013-2014. We show that on the task of predicting which side is likely to prefer a given article, a Naive Bayes classifier can record 90.3% accuracy looking only at domain names of the news sources. The best accuracy of 93.5% is achieved by a feed-forward neural network. We also apply our methodology to a gold-labeled set of articles annotated for bias, where the aforementioned Naive Bayes classifier records 82.6% accuracy and a feed-forward neural network records 85.6% accuracy.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4203/
PDF	https://www.aclweb.org/anthology/W17-4203
PWC	https://paperswithcode.com/paper/tracking-bias-in-news-sources-using-social
Repo
Framework
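
A Naive Bayes classifier that looks only at the domain name of a link, as described in the abstract, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' system: the class name, the add-one smoothing and the extra bucket reserved for unseen domains are choices made here, and the `.example` URLs in the usage note are made up.

```python
import math
from collections import Counter, defaultdict
from urllib.parse import urlparse

def domain_of(url):
    """Extract the lower-cased host name from a URL."""
    return urlparse(url).netloc.lower()

class DomainNaiveBayes:
    """Naive Bayes with a single categorical feature: the link's domain."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # additive smoothing strength
        self.class_counts = Counter()             # links seen per side
        self.domain_counts = defaultdict(Counter) # per-side domain frequencies
        self.domains = set()

    def fit(self, urls, labels):
        for url, label in zip(urls, labels):
            d = domain_of(url)
            self.class_counts[label] += 1
            self.domain_counts[label][d] += 1
            self.domains.add(d)
        return self

    def predict(self, url):
        d = domain_of(url)
        total = sum(self.class_counts.values())
        vocab = len(self.domains) + 1  # one extra bucket for unseen domains
        best_label, best_score = None, float("-inf")
        for label, n in self.class_counts.items():
            log_prior = math.log(n / total)
            log_lik = math.log(
                (self.domain_counts[label][d] + self.alpha)
                / (n + self.alpha * vocab)
            )
            if log_prior + log_lik > best_score:
                best_label, best_score = label, log_prior + log_lik
        return best_label
```

Trained on two links from `news-a.example` shared by side "A" and one link from `news-b.example` shared by side "B", the model assigns a new `news-a.example` link to "A" and a new `news-b.example` link to "B".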

Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data


Title	Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data
Authors	Philippa Shoemark, James Kirby, Sharon Goldwater
Abstract	Sociolinguistic research suggests that speakers modulate their language style in response to their audience. Similar effects have recently been claimed to occur in the informal written context of Twitter, with users choosing less region-specific and non-standard vocabulary when addressing larger audiences. However, these studies have not carefully controlled for the possible confound of topic: that is, tweets addressed to a broad audience might also tend towards topics that engender a more formal style. In addition, it is not clear to what extent previous results generalize to different samples of users. Using mixed-effects models, we show that audience and topic have independent effects on the rate of distinctively Scottish usage in two demographically distinct Twitter user samples. However, not all effects are consistent between the two groups, underscoring the importance of replicating studies on distinct user samples before drawing strong conclusions from social media data.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4908/
PDF	https://www.aclweb.org/anthology/W17-4908
PWC	https://paperswithcode.com/paper/topic-and-audience-effects-on-distinctively
Repo
Framework

Using Convolutional Neural Networks to Classify Hate-Speech


Title	Using Convolutional Neural Networks to Classify Hate-Speech
Authors	Björn Gambäck, Utpal Kumar Sikdar
Abstract	The paper introduces a deep learning-based Twitter hate-speech text classification system. The classifier assigns each tweet to one of four predefined categories: racism, sexism, both (racism and sexism) and non-hate-speech. Four Convolutional Neural Network models were trained on, respectively, character 4-grams, word vectors based on semantic information built using word2vec, randomly generated word vectors, and word vectors combined with character n-grams. The feature set was down-sized in the networks by max-pooling, and a softmax function was used to classify tweets. Tested by 10-fold cross-validation, the model based on word2vec embeddings performed best, with higher precision than recall, and a 78.3% F-score.
Tasks	Named Entity Recognition, Part-Of-Speech Tagging, Sentiment Analysis, Text Classification
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-3013/
PDF	https://www.aclweb.org/anthology/W17-3013
PWC	https://paperswithcode.com/paper/using-convolutional-neural-networks-to-1
Repo
Framework
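
Three building blocks named in the abstract, character 4-grams, max-pooling and a softmax output over the four categories, can be sketched in plain Python as below. This is a simplified illustration and not the authors' network: there are no learned convolution filters here, `feature_maps` stands in for the activations such filters would produce, and `class_weights` and `class_biases` are hypothetical parameters.

```python
import math

LABELS = ["racism", "sexism", "both", "non-hate-speech"]

def char_ngrams(text, n=4):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def max_pool(feature_map):
    """Max-over-time pooling: keep the largest activation of one filter."""
    return max(feature_map)

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature_maps, class_weights, class_biases):
    """Pool each filter's activations, score each class, pick the best."""
    pooled = [max_pool(fm) for fm in feature_maps]
    scores = [
        sum(w * p for w, p in zip(weights, pooled)) + bias
        for weights, bias in zip(class_weights, class_biases)
    ]
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))], probs
```

Pooling reduces each filter's variable-length activation sequence to a single number, so tweets of different lengths yield a fixed-size vector for the softmax layer.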

A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations


Title	A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations
Authors	KyungTae Lim, Thierry Poibeau
Abstract	In this paper, we present our multilingual dependency parser developed for the CoNLL 2017 UD Shared Task dealing with “Multilingual Parsing from Raw Text to Universal Dependencies”. Our parser extends the monolingual BIST-parser as a multi-source multilingual trainable parser. Thanks to multilingual word embeddings and one-hot encodings for languages, our system can use both monolingual and multi-source training. We trained 69 monolingual language models and 13 multilingual models for the shared task. Our multilingual approach, making use of different resources, yields better results than the monolingual approach for 11 languages. Our system ranked 5th and achieved 70.93 overall LAS score over the 81 test corpora (macro-averaged LAS F1 score).
Tasks	Dependency Parsing, Multilingual Word Embeddings, Word Embeddings
Published	2017-08-01
URL	https://www.aclweb.org/anthology/K17-3006/
PDF	https://www.aclweb.org/anthology/K17-3006
PWC	https://paperswithcode.com/paper/a-system-for-multilingual-dependency-parsing
Repo
Framework
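
One simple way to combine multilingual word embeddings with one-hot language encodings is to concatenate the two vectors for each token before they reach the BiLSTM. The sketch below shows only that concatenation step and the macro-averaging of per-corpus scores; it is an assumption about how such inputs could be built, not the authors' code, and the function names, language codes and vector values are hypothetical.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def token_input(word_vector, language, languages):
    """Concatenate a word embedding with a one-hot language indicator."""
    return list(word_vector) + one_hot(languages.index(language), len(languages))

def macro_average(scores):
    """Unweighted mean of per-corpus scores (each corpus counts equally)."""
    return sum(scores) / len(scores)
```

For instance, `token_input([0.2, -0.1], "fr", ["en", "fr", "ko"])` returns `[0.2, -0.1, 0.0, 1.0, 0.0]`: the shared embedding is unchanged and the trailing three values tell the parser which language the token comes from.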