Paper Group NANR 68
Adversarial Training for Relation Extraction
Title | Adversarial Training for Relation Extraction |
Authors | Yi Wu, David Bamman, Stuart Russell |
Abstract | Adversarial training is a means of regularizing classification algorithms by adding adversarial noise to the training data. We apply adversarial training to relation extraction within the multi-instance multi-label learning framework. We evaluate various neural network architectures on two different datasets. Experimental results demonstrate that adversarial training is generally effective for both CNN and RNN models and significantly improves the precision of predicted relations. |
Tasks | Image Classification, Multi-Label Learning, Relation Extraction, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1187/ |
https://www.aclweb.org/anthology/D17-1187 | |
PWC | https://paperswithcode.com/paper/adversarial-training-for-relation-extraction |
Repo | |
Framework | |
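The abstract above does not say how the adversarial noise is computed. A minimal sketch, assuming the common gradient-based formulation in which the perturbation is the loss gradient with respect to the input, rescaled to a fixed L2 norm `epsilon`. The toy linear classifier and all names here are hypothetical stand-ins for the CNN and RNN models the paper evaluates.

```python
import math


def logistic_loss(x, w, y):
    """Logistic loss log(1 + exp(-y * w.x)) for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log(1.0 + math.exp(-margin))


def logistic_loss_grad(x, w, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]


def adversarial_perturbation(grad, epsilon):
    """Rescale the gradient to L2 norm epsilon (assumed formulation)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]


# Hypothetical input features, weights, and label.
x = [0.5, -1.0, 2.0]
w = [1.0, 2.0, -1.0]
y = 1

grad = logistic_loss_grad(x, w, y)
delta = adversarial_perturbation(grad, epsilon=0.1)
x_adv = [xi + di for xi, di in zip(x, delta)]

# Training would then minimize the loss on x_adv in addition to x.
clean_loss = logistic_loss(x, w, y)
adv_loss = logistic_loss(x_adv, w, y)
```

Because the perturbation follows the gradient, `adv_loss` is larger than `clean_loss` for this toy model; adding the loss on `x_adv` to the training objective is what acts as the regularizer.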
Towards efficient string processing of annotated events
Title | Towards efficient string processing of annotated events |
Authors | David Woods, Tim Fernando, Carl Vogel |
Abstract | |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-7414/ |
https://www.aclweb.org/anthology/W17-7414 | |
PWC | https://paperswithcode.com/paper/towards-efficient-string-processing-of |
Repo | |
Framework | |
An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts
Title | An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts |
Authors | Cecilia Ovesdotter Alm, Benjamin Meyers, Emily Prud'hommeaux |
Abstract | We present an educational tool that integrates computational linguistics resources for use in non-technical undergraduate language science courses. By using the tool in conjunction with evidence-driven pedagogical case studies, we strive to provide opportunities for students to gain an understanding of linguistic concepts and analysis through the lens of realistic problems in feasible ways. Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and learning of linguistics. The approach introduced also has potential to encourage students across training backgrounds to continue on to computational language analysis coursework. |
Tasks | Active Learning |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-2003/ |
https://www.aclweb.org/anthology/D17-2003 | |
PWC | https://paperswithcode.com/paper/an-analysis-and-visualization-tool-for-case |
Repo | |
Framework | |
Learning to Model the Tail
Title | Learning to Model the Tail |
Authors | Yu-Xiong Wang, Deva Ramanan, Martial Hebert |
Abstract | We describe an approach to learning from long-tailed, imbalanced datasets that are prevalent in real-world settings. Here, the challenge is to learn accurate “few-shot” models for classes in the tail of the class distribution, for which little data is available. We cast this problem as transfer learning, where knowledge from the data-rich classes in the head of the distribution is transferred to the data-poor classes in the tail. Our key insights are as follows. First, we propose to transfer meta-knowledge about learning-to-learn from the head classes. This knowledge is encoded with a meta-network that operates on the space of model parameters, that is trained to predict many-shot model parameters from few-shot model parameters. Second, we transfer this meta-knowledge in a progressive manner, from classes in the head to the “body”, and from the “body” to the tail. That is, we transfer knowledge in a gradual fashion, regularizing meta-networks for few-shot regression with those trained with more training data. This allows our final network to capture a notion of model dynamics, that predicts how model parameters are likely to change as more training data is gradually added. We demonstrate results on image classification datasets (SUN, Places, and ImageNet) tuned for the long-tailed setting, that significantly outperform common heuristics, such as data resampling or reweighting. |
Tasks | few-shot regression, Image Classification, Transfer Learning |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7278-learning-to-model-the-tail |
http://papers.nips.cc/paper/7278-learning-to-model-the-tail.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-model-the-tail |
Repo | |
Framework | |
SINAI at SemEval-2017 Task 4: User based classification
Title | SINAI at SemEval-2017 Task 4: User based classification |
Authors | Salud María Jiménez-Zafra, Arturo Montejo-Ráez, Maite Martin, L. Alfonso Ureña-López |
Abstract | This document describes our participation in SemEval-2017 Task 4: Sentiment Analysis in Twitter. We have only reported results for subtask B - English, determining the polarity towards a topic on a two-point scale (positive or negative sentiment). Our main contribution is the integration of user information in the classification process. An SVM model is trained with Word2Vec vectors from a user's tweets extracted from their timeline. The obtained results show that user-specific classifiers trained on tweets from the user's timeline can introduce noise, as they are error-prone because the tweets are classified by an imperfect system. This encourages us to explore further integration of user information for author-based Sentiment Analysis. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2104/ |
https://www.aclweb.org/anthology/S17-2104 | |
PWC | https://paperswithcode.com/paper/sinai-at-semeval-2017-task-4-user-based |
Repo | |
Framework | |
The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy
Title | The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy |
Authors | Éric de La Clergerie, Benoît Sagot, Djamé Seddah |
Abstract | We present the ParisNLP entry at the UD CoNLL 2017 parsing shared task. In addition to the UDpipe models provided, we built our own data-driven tokenization models, sentence segmenter and lexicon-based morphological analyzers. All of these were used with a range of different parsing models (neural or not, feature-rich or not, transition or graph-based, etc.) and the best combination for each language was selected. Unfortunately, a glitch in the shared task's Matrix led our model selector to run generic, weakly lexicalized models, tailored for surprise languages, instead of our dataset-specific models. Because of this #ParsingTragedy, we officially ranked 27th, whereas our real models finally unofficially ranked 6th. |
Tasks | Dependency Parsing, Tokenization |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-3026/ |
https://www.aclweb.org/anthology/K17-3026 | |
PWC | https://paperswithcode.com/paper/the-parisnlp-entry-at-the-conll-ud-shared |
Repo | |
Framework | |
Tagging Funding Agencies and Grants in Scientific Articles using Sequential Learning Models
Title | Tagging Funding Agencies and Grants in Scientific Articles using Sequential Learning Models |
Authors | Subhradeep Kayal, Zubair Afzal, George Tsatsaronis, Sophia Katrenko, Pascal Coupet, Marius Doornenbal, Michelle Gregory |
Abstract | In this paper we present a solution for tagging funding bodies and grants in scientific articles using a combination of trained sequential learning models, namely conditional random fields (CRF), hidden Markov models (HMM) and maximum entropy models (MaxEnt), on a benchmark set created in-house. We apply the trained models to address the BioASQ challenge 5c, which is a newly introduced task that aims to solve the problem of funding information extraction from scientific articles. Results in the dry-run data set of BioASQ task 5c show that the suggested approach can achieve a micro-recall of more than 85% in tagging both funding bodies and grants. |
Tasks | Document Summarization, Information Retrieval, Multi-Document Summarization, Text Classification |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2327/ |
https://www.aclweb.org/anthology/W17-2327 | |
PWC | https://paperswithcode.com/paper/tagging-funding-agencies-and-grants-in |
Repo | |
Framework | |
Integrating Vision and Language Datasets to Measure Word Concreteness
Title | Integrating Vision and Language Datasets to Measure Word Concreteness |
Authors | Gitit Kehat, James Pustejovsky |
Abstract | We present and take advantage of the inherent visualizability properties of words in visual corpora (the textual components of vision-language datasets) to compute concreteness scores for words. Our simple method does not require hand-annotated concreteness score lists for training, and yields state-of-the-art results when evaluated against concreteness scores lists and previously derived scores, as well as when used for metaphor detection. |
Tasks | Image Captioning, Image Retrieval, Question Answering |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2018/ |
https://www.aclweb.org/anthology/I17-2018 | |
PWC | https://paperswithcode.com/paper/integrating-vision-and-language-datasets-to |
Repo | |
Framework | |
Russian-Tatar Socio-Political Thesaurus: Methodology, Challenges, the Status of the Project
Title | Russian-Tatar Socio-Political Thesaurus: Methodology, Challenges, the Status of the Project |
Authors | Alfiya Galieva, Olga Nevzorova, Dilyara Yakubova |
Abstract | This paper discusses the general methodology and important practical aspects of implementing a new bilingual lexical resource, the Russian-Tatar Socio-Political Thesaurus, which is being developed on the basis of the Russian RuThes thesaurus format as a hierarchy of concepts viewed as units of thought. Each concept is linked with a set of language expressions (words and collocations) referring to it in texts (text entries). Currently the Russian-Tatar Socio-Political Thesaurus includes 6,000 concepts, while new concepts and text entries are being constantly added to it. The paper outlines the main challenges of translating concept names and their text entries into Tatar, and describes ways of reflecting the specificity of the Tatar lexical-semantic system. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1034/ |
https://doi.org/10.26615/978-954-452-049-6_034 | |
PWC | https://paperswithcode.com/paper/russian-tatar-socio-political-thesaurus |
Repo | |
Framework | |
The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields
Title | The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields |
Authors | Lung-Hao Lee, Kuei-Ching Lee, Yuen-Hsien Tseng |
Abstract | This study describes the design of the NTNU system for the ScienceIE task at the SemEval 2017 workshop. We use self-defined feature templates and multiple conditional random fields with extracted features to identify keyphrases along with categorized labels and their relations from scientific publications. A total of 16 teams participated in evaluation scenario 1 (subtasks A, B, and C), with only 7 teams competing in all sub-tasks. Our best micro-averaging F1 across the three subtasks is 0.23, ranking in the middle among all 16 submissions. |
Tasks | Multi-Task Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2165/ |
https://www.aclweb.org/anthology/S17-2165 | |
PWC | https://paperswithcode.com/paper/the-ntnu-system-at-semeval-2017-task-10 |
Repo | |
Framework | |
Machine translation with North Saami as a pivot language
Title | Machine translation with North Saami as a pivot language |
Authors | Lene Antonsen, Ciprian Gerstenberger, Maja Kappfjell, Sandra Nystø Rahka, Marja-Liisa Olthuis, Trond Trosterud, Francis M. Tyers |
Abstract | |
Tasks | Machine Translation |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0215/ |
https://www.aclweb.org/anthology/W17-0215 | |
PWC | https://paperswithcode.com/paper/machine-translation-with-north-saami-as-a |
Repo | |
Framework | |
Tracking Bias in News Sources Using Social Media: the Russia-Ukraine Maidan Crisis of 2013–2014
Title | Tracking Bias in News Sources Using Social Media: the Russia-Ukraine Maidan Crisis of 2013–2014 |
Authors | Peter Potash, Alexey Romanov, Mikhail Gronas, Anna Rumshisky |
Abstract | This paper addresses the task of identifying the bias in news articles published during a political or social conflict. We create a silver-standard corpus based on the actions of users in social media. Specifically, we reconceptualize bias in terms of how likely a given article is to be shared or liked by each of the opposing sides. We apply our methodology to a dataset of links collected in relation to the Russia-Ukraine Maidan crisis from 2013-2014. We show that on the task of predicting which side is likely to prefer a given article, a Naive Bayes classifier can record 90.3% accuracy looking only at domain names of the news sources. The best accuracy of 93.5% is achieved by a feed-forward neural network. We also apply our methodology to a gold-labeled set of articles annotated for bias, where the aforementioned Naive Bayes classifier records 82.6% accuracy and a feed-forward neural network records 85.6% accuracy. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4203/ |
https://www.aclweb.org/anthology/W17-4203 | |
PWC | https://paperswithcode.com/paper/tracking-bias-in-news-sources-using-social |
Repo | |
Framework | |
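The abstract above reports a Naive Bayes classifier that looks only at the domain name of the news source. A minimal sketch of that idea, assuming a single categorical feature with Laplace smoothing; the class name, the `example` domains, and the `side_1`/`side_2` labels are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter, defaultdict


class DomainNaiveBayes:
    """Naive Bayes over one categorical feature: the source domain."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed)
        self.class_counts = Counter()
        self.domain_counts = defaultdict(Counter)
        self.domains = set()

    def fit(self, domains, labels):
        for domain, label in zip(domains, labels):
            self.class_counts[label] += 1
            self.domain_counts[label][domain] += 1
            self.domains.add(domain)
        return self

    def predict(self, domain):
        total = sum(self.class_counts.values())
        vocab = len(self.domains) + 1  # one extra slot for unseen domains
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            prior = math.log(n_label / total)
            count = self.domain_counts[label][domain]
            likelihood = math.log(
                (count + self.alpha) / (n_label + self.alpha * vocab)
            )
            score = prior + likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training links: (domain, side that shared or liked it).
train_domains = ["a.example", "a.example", "b.example", "b.example", "c.example"]
train_labels = ["side_1", "side_1", "side_2", "side_2", "side_1"]

model = DomainNaiveBayes().fit(train_domains, train_labels)
prediction = model.predict("b.example")
```

With a single feature, the model reduces to comparing smoothed per-side domain frequencies weighted by the class prior, which is why an unseen domain falls back to the majority side.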
Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data
Title | Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data |
Authors | Philippa Shoemark, James Kirby, Sharon Goldwater |
Abstract | Sociolinguistic research suggests that speakers modulate their language style in response to their audience. Similar effects have recently been claimed to occur in the informal written context of Twitter, with users choosing less region-specific and non-standard vocabulary when addressing larger audiences. However, these studies have not carefully controlled for the possible confound of topic: that is, tweets addressed to a broad audience might also tend towards topics that engender a more formal style. In addition, it is not clear to what extent previous results generalize to different samples of users. Using mixed-effects models, we show that audience and topic have independent effects on the rate of distinctively Scottish usage in two demographically distinct Twitter user samples. However, not all effects are consistent between the two groups, underscoring the importance of replicating studies on distinct user samples before drawing strong conclusions from social media data. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4908/ |
https://www.aclweb.org/anthology/W17-4908 | |
PWC | https://paperswithcode.com/paper/topic-and-audience-effects-on-distinctively |
Repo | |
Framework | |
Using Convolutional Neural Networks to Classify Hate-Speech
Title | Using Convolutional Neural Networks to Classify Hate-Speech |
Authors | Björn Gambäck, Utpal Kumar Sikdar |
Abstract | The paper introduces a deep learning-based Twitter hate-speech text classification system. The classifier assigns each tweet to one of four predefined categories: racism, sexism, both (racism and sexism) and non-hate-speech. Four Convolutional Neural Network models were trained on, respectively, character 4-grams, word vectors based on semantic information built using word2vec, randomly generated word vectors, and word vectors combined with character n-grams. The feature set was down-sized in the networks by max-pooling, and a softmax function was used to classify tweets. Tested by 10-fold cross-validation, the model based on word2vec embeddings performed best, with higher precision than recall, and a 78.3% F-score. |
Tasks | Named Entity Recognition, Part-Of-Speech Tagging, Sentiment Analysis, Text Classification |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-3013/ |
https://www.aclweb.org/anthology/W17-3013 | |
PWC | https://paperswithcode.com/paper/using-convolutional-neural-networks-to-1 |
Repo | |
Framework | |
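The abstract above says the feature set is down-sized by max-pooling and that a softmax assigns each tweet to one of four categories. A minimal sketch of those two steps only, assuming max-over-time pooling followed by a single linear layer; the feature maps, weights, and function names are hypothetical toy values, not the paper's trained model.

```python
import math

LABELS = ["racism", "sexism", "both", "non-hate-speech"]


def max_pool_over_time(feature_maps):
    """Keep the largest activation of each filter across all positions."""
    return [max(feature_map) for feature_map in feature_maps]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_maps, weights, biases):
    """Pool the feature maps, apply a linear layer, and pick the top class."""
    pooled = max_pool_over_time(feature_maps)
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs


# Two hypothetical convolution filters applied at three tweet positions.
feature_maps = [[0.1, 0.9, 0.3], [0.2, 0.0, 0.4]]
# One weight row and one bias per output class (toy values).
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0, 0.0]

label, probs = classify(feature_maps, weights, biases)
```

Pooling reduces each filter's output to one number regardless of tweet length, so the linear layer always sees a fixed-size vector.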
A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations
Title | A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations |
Authors | KyungTae Lim, Thierry Poibeau |
Abstract | In this paper, we present our multilingual dependency parser developed for the CoNLL 2017 UD Shared Task dealing with “Multilingual Parsing from Raw Text to Universal Dependencies”. Our parser extends the monolingual BIST-parser as a multi-source multilingual trainable parser. Thanks to multilingual word embeddings and one-hot encodings for languages, our system can use both monolingual and multi-source training. We trained 69 monolingual language models and 13 multilingual models for the shared task. Our multilingual approach, making use of different resources, yields better results than the monolingual approach for 11 languages. Our system ranked 5th and achieved 70.93 overall LAS score over the 81 test corpora (macro-averaged LAS F1 score). |
Tasks | Dependency Parsing, Multilingual Word Embeddings, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-3006/ |
https://www.aclweb.org/anthology/K17-3006 | |
PWC | https://paperswithcode.com/paper/a-system-for-multilingual-dependency-parsing |
Repo | |
Framework | |
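The abstract above reports a macro-averaged LAS over the test corpora. A minimal sketch of that averaging, assuming gold tokenization so that the labeled attachment score reduces to the share of tokens whose head and relation label are both correct; the shared task's LAS F1 also accounts for tokenization differences, which this sketch ignores. The corpus names and parses are hypothetical.

```python
def las(gold, predicted):
    """Percentage of tokens whose (head, relation) pair matches the gold parse."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty corpus")
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return 100.0 * correct / len(gold)


def macro_las(corpora):
    """Average the per-corpus LAS so every corpus counts equally."""
    scores = [las(gold, predicted) for gold, predicted in corpora.values()]
    return sum(scores) / len(scores)


# Each token is a (head_index, relation_label) pair; toy parses only.
corpora = {
    "corpus_a": (
        [(2, "nsubj"), (0, "root"), (2, "obj")],
        [(2, "nsubj"), (0, "root"), (2, "nmod")],
    ),
    "corpus_b": (
        [(0, "root"), (1, "obj")],
        [(0, "root"), (1, "obj")],
    ),
}

score = macro_las(corpora)
```

Macro-averaging weights a small treebank as heavily as a large one: here the per-corpus scores are about 66.7 and 100, giving a macro average of about 83.3, whereas pooling all five tokens would give 80.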