Paper Group NANR 56
Language Muse: Automated Linguistic Activity Generation for English Language Learners
Title | Language Muse: Automated Linguistic Activity Generation for English Language Learners |
Authors | Nitin Madnani, Jill Burstein, John Sabatini, Kietha Biggers, Slava Andreyev |
Abstract | |
Tasks | Question Generation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4014/ |
PWC | https://paperswithcode.com/paper/language-muse-automated-linguistic-activity |
Repo | |
Framework | |
Pronoun Prediction with Linguistic Features and Example Weighing
Title | Pronoun Prediction with Linguistic Features and Example Weighing |
Authors | Michal Novák |
Abstract | |
Tasks | Language Modelling, Machine Translation, Word Alignment |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2354/ |
PWC | https://paperswithcode.com/paper/pronoun-prediction-with-linguistic-features |
Repo | |
Framework | |
USAAR: An Operation Sequential Model for Automatic Statistical Post-Editing
Title | USAAR: An Operation Sequential Model for Automatic Statistical Post-Editing |
Authors | Santanu Pal, Marcos Zampieri, Josef van Genabith |
Abstract | |
Tasks | Automatic Post-Editing, Machine Translation, Word Alignment |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2379/ |
PWC | https://paperswithcode.com/paper/usaar-an-operation-sequential-model-for |
Repo | |
Framework | |
Bilingual Embeddings and Word Alignments for Translation Quality Estimation
Title | Bilingual Embeddings and Word Alignments for Translation Quality Estimation |
Authors | Amal Abdelsalam, Ondřej Bojar, Samhaa El-Beltagy |
Abstract | |
Tasks | Machine Translation, Word Alignment, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2380/ |
PWC | https://paperswithcode.com/paper/bilingual-embeddings-and-word-alignments-for |
Repo | |
Framework | |
DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus
Title | DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus |
Authors | Martin Brümmer, Milan Dojchinovski, Sebastian Hellmann |
Abstract | The ever increasing importance of machine learning in Natural Language Processing is accompanied by an equally increasing need in large-scale training and evaluation corpora. Due to its size, its openness and relative quality, the Wikipedia has already been a source of such data, but on a limited scale. This paper introduces the DBpedia Abstract Corpus, a large-scale, open corpus of annotated Wikipedia texts in six languages, featuring over 11 million texts and over 97 million entity links. The properties of the Wikipedia texts are being described, as well as the corpus creation process, its format and interesting use-cases, like Named Entity Linking training and evaluation. |
Tasks | Entity Linking |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1532/ |
PWC | https://paperswithcode.com/paper/dbpedia-abstracts-a-large-scale-open |
Repo | |
Framework | |
A Dataset for Detecting Stance in Tweets
Title | A Dataset for Detecting Stance in Tweets |
Authors | Saif Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, Colin Cherry |
Abstract | We can often detect from a person's utterances whether he/she is in favor of or against a given target entity (a product, topic, another person, etc.). Here for the first time we present a dataset of tweets annotated for whether the tweeter is in favor of or against pre-chosen targets of interest: their stance. The targets of interest may or may not be referred to in the tweets, and they may or may not be the target of opinion in the tweets. The data pertains to six targets of interest commonly known and debated in the United States. Apart from stance, the tweets are also annotated for whether the target of interest is the target of opinion in the tweet. The annotations were performed by crowdsourcing. Several techniques were employed to encourage high-quality annotations (for example, providing clear and simple instructions) and to identify and discard poor annotations (for example, using a small set of check questions annotated by the authors). This Stance Dataset, which was subsequently also annotated for sentiment, can be used to better understand the relationship between stance, sentiment, entity relationships, and textual inference. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1623/ |
PWC | https://paperswithcode.com/paper/a-dataset-for-detecting-stance-in-tweets |
Repo | |
Framework | |
Named Entity Resources - Overview and Outlook
Title | Named Entity Resources - Overview and Outlook |
Authors | Maud Ehrmann, Damien Nouvel, Sophie Rosset |
Abstract | Recognition of real-world entities is crucial for most NLP applications. Since its introduction some twenty years ago, named entity processing has undergone a significant evolution with, among others, the definition of new tasks (e.g. entity linking) and the emergence of new types of data (e.g. speech transcriptions, micro-blogging). These pose certainly new challenges which affect not only methods and algorithms but especially linguistic resources. Where do we stand with respect to named entity resources? This paper aims at providing a systematic overview of named entity resources, accounting for qualities such as multilingualism, dynamicity and interoperability, and to identify shortfalls in order to guide future developments. |
Tasks | Entity Linking |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1534/ |
PWC | https://paperswithcode.com/paper/named-entity-resources-overview-and-outlook |
Repo | |
Framework | |
Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level
Title | Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level |
Authors | Marcos Garcia |
Abstract | This paper explores the incorporation of lexico-semantic heuristics into a deterministic Coreference Resolution (CR) system for classifying named entities at document-level. The highest precise sieves of a CR tool are enriched with both a set of heuristics for merging named entities labeled with different classes and also with some constraints that avoid the incorrect merging of similar mentions. Several tests show that this strategy improves both NER labeling and CR. The CR tool can be applied in combination with any system for named entity recognition using the CoNLL format, and brings benefits to text analytics tasks such as Information Extraction. Experiments were carried out in Spanish, using three different NER tools. |
Tasks | Coreference Resolution, Named Entity Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1535/ |
PWC | https://paperswithcode.com/paper/incorporating-lexico-semantic-heuristics-into |
Repo | |
Framework | |
Inter-document Contextual Language model
Title | Inter-document Contextual Language model |
Authors | Quan Hung Tran, Ingrid Zukerman, Gholamreza Haffari |
Abstract | |
Tasks | Language Modelling |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1090/ |
PWC | https://paperswithcode.com/paper/inter-document-contextual-language-model |
Repo | |
Framework | |
Temporal Information Annotation: Crowd vs. Experts
Title | Temporal Information Annotation: Crowd vs. Experts |
Authors | Tommaso Caselli, Rachele Sprugnoli, Oana Inel |
Abstract | This paper describes two sets of crowdsourcing experiments on temporal information annotation conducted on two languages, i.e., English and Italian. The first experiment, launched on the CrowdFlower platform, was aimed at classifying temporal relations given target entities. The second one, relying on the CrowdTruth metric, consisted in two subtasks: one devoted to the recognition of events and temporal expressions and one to the detection and classification of temporal relations. The outcomes of the experiments suggest a valuable use of crowdsourcing annotations also for a complex task like Temporal Processing. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1557/ |
PWC | https://paperswithcode.com/paper/temporal-information-annotation-crowd-vs |
Repo | |
Framework | |
Using Word Embeddings to Translate Named Entities
Title | Using Word Embeddings to Translate Named Entities |
Authors | Octavia-Maria Şulea, Sergiu Nisioi, Liviu P. Dinu |
Abstract | In this paper we investigate the usefulness of neural word embeddings in the process of translating Named Entities (NEs) from a resource-rich language to a language low on resources relevant to the task at hand, introducing a novel, yet simple way of obtaining bilingual word vectors. Inspired by observations in (Mikolov et al., 2013b), which show that training their word vector model on comparable corpora yields comparable vector space representations of those corpora, reducing the problem of translating words to finding a rotation matrix, and results in (Zou et al., 2013), which showed that bilingual word embeddings can improve Chinese Named Entity Recognition (NER) and English to Chinese phrase translation, we use the sentence-aligned English-French EuroParl corpora and show that word embeddings extracted from a merged corpus (corpus resulted from the merger of the two aligned corpora) can be used to NE translation. We extrapolate that word embeddings trained on merged parallel corpora are useful in Named Entity Recognition and Translation tasks for resource-poor languages. |
Tasks | Chinese Named Entity Recognition, Named Entity Recognition, Word Embeddings |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1536/ |
PWC | https://paperswithcode.com/paper/using-word-embeddings-to-translate-named |
Repo | |
Framework | |
DISAANA and D-SUMM: Large-scale Real Time NLP Systems for Analyzing Disaster Related Reports in Tweets
Title | DISAANA and D-SUMM: Large-scale Real Time NLP Systems for Analyzing Disaster Related Reports in Tweets |
Authors | Kentaro Torisawa |
Abstract | This talk presents two NLP systems that were developed for helping disaster victims and rescue workers in the aftermath of large-scale disasters. DISAANA provides answers to questions such as "What is in short supply in Tokyo?" and displays locations related to each answer on a map. D-SUMM automatically summarizes a large number of disaster related reports concerning a specified area and helps rescue workers to understand disaster situations from a macro perspective. Both systems are publicly available as Web services. In the aftermath of the 2016 Kumamoto Earthquake (M7.0), the Japanese government actually used DISAANA to analyze the situation. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3903/ |
PWC | https://paperswithcode.com/paper/disaana-and-d-summ-large-scale-real-time-nlp |
Repo | |
Framework | |
The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection
Title | The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection |
Authors | Ayah Zirikly, Bart Desmet, Mona Diab |
Abstract | This paper describes the GW/LT3 contribution to the 2016 VarDial shared task on the identification of similar languages (task 1) and Arabic dialects (task 2). For both tasks, we experimented with Logistic Regression and Neural Network classifiers in isolation. Additionally, we implemented a cascaded classifier that consists of coarse and fine-grained classifiers (task 1) and a classifier ensemble with majority voting for task 2. The submitted systems obtained state-of-the-art performance and ranked first for the evaluation on social media data (test sets B1 and B2 for task 1), with a maximum weighted F1 score of 91.94%. |
Tasks | Feature Engineering |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4804/ |
PWC | https://paperswithcode.com/paper/the-gwlt3-vardial-2016-shared-task-system-for |
Repo | |
Framework | |
Learning to Search for Recognizing Named Entities in Twitter
Title | Learning to Search for Recognizing Named Entities in Twitter |
Authors | Ioannis Partalas, Cédric Lopez, Nadia Derbas, Ruslan Kalitvianski |
Abstract | We presented in this work our participation in the 2nd Named Entity Recognition for Twitter shared task. The task has been cast as a sequence labeling one and we employed a learning to search approach in order to tackle it. We also leveraged LOD for extracting rich contextual features for the named-entities. Our submission achieved F-scores of 46.16 and 60.24 for the classification and the segmentation tasks and ranked 2nd and 3rd respectively. The post-analysis showed that LOD features improved substantially the performance of our system as they counter-balance the lack of context in tweets. The shared task gave us the opportunity to test the performance of NER systems in short and noisy textual data. The results of the participated systems shows that the task is far to be considered as a solved one and methods with stellar performance in normal texts need to be revised. |
Tasks | Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3923/ |
PWC | https://paperswithcode.com/paper/learning-to-search-for-recognizing-named |
Repo | |
Framework | |
Comparing Speech and Text Classification on ICNALE
Title | Comparing Speech and Text Classification on ICNALE |
Authors | Sergiu Nisioi |
Abstract | In this paper we explore and compare a speech and text classification approach on a corpus of native and non-native English speakers. We experiment on a subset of the International Corpus Network of Asian Learners of English containing the recorded speeches and the equivalent text transcriptions. Our results suggest a high correlation between the spoken and written classification results, showing that native accent is highly correlated with grammatical structures found in text. |
Tasks | Text Classification |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1542/ |
PWC | https://paperswithcode.com/paper/comparing-speech-and-text-classification-on |
Repo | |
Framework | |