Paper Group NANR 8
Mathematical Information Retrieval based on Type Embeddings and Query Expansion
Title | Mathematical Information Retrieval based on Type Embeddings and Query Expansion |
Authors | Yiannos Stathopoulos, Simone Teufel |
Abstract | We present an approach to mathematical information retrieval (MIR) that exploits a special kind of technical terminology, referred to as a mathematical type. In this paper, we present and evaluate a type detection mechanism and show its positive effect on the retrieval of research-level mathematics. Our best model, which performs query expansion with a type-aware embedding space, strongly outperforms standard IR models with state-of-the-art query expansion (vector space-based and language modelling-based), on a relatively new corpus of research-level queries. |
Tasks | Information Retrieval, Language Modelling |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1221/ |
PWC | https://paperswithcode.com/paper/mathematical-information-retrieval-based-on |
Repo | |
Framework | |
A critique of word similarity as a method for evaluating distributional semantic models
Title | A critique of word similarity as a method for evaluating distributional semantic models |
Authors | Miroslav Batchkarov, Thomas Kober, Jeremy Reffin, Julie Weeds, David Weir |
Abstract | |
Tasks | Document Classification, Natural Language Inference |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2502/ |
PWC | https://paperswithcode.com/paper/a-critique-of-word-similarity-as-a-method-for |
Repo | |
Framework | |
NTNUSentEval at SemEval-2016 Task 4: Combining General Classifiers for Fast Twitter Sentiment Analysis
Title | NTNUSentEval at SemEval-2016 Task 4: Combining General Classifiers for Fast Twitter Sentiment Analysis |
Authors | Brage Ekroll Jahren, Valerij Fredriksen, Björn Gambäck, Lars Bungum |
Abstract | |
Tasks | Sentiment Analysis, Twitter Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1014/ |
PWC | https://paperswithcode.com/paper/ntnusenteval-at-semeval-2016-task-4-combining |
Repo | |
Framework | |
UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation
Title | UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation |
Authors | Esteban Castillo, Ofelia Cervantes, Darnes Vilariño, David Báez |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1015/ |
PWC | https://paperswithcode.com/paper/udlap-at-semeval-2016-task-4-sentiment |
Repo | |
Framework | |
Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact
Title | Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact |
Authors | Ângela Costa, Rui Correia, Luísa Coheur |
Abstract | In this paper we describe a corpus of automatic translations annotated with both error type and quality. The 300 sentences that we have selected were generated by Google Translate, Systran and two in-house Machine Translation systems that use Moses technology. The errors present in the translations were annotated with an error taxonomy that divides errors into five main linguistic categories (Orthography, Lexis, Grammar, Semantics and Discourse), reflecting the language level where the error is located. After the error annotation process, we assessed the translation quality of each sentence using a four point comprehension scale from 1 to 5. Both tasks of error and quality annotation were performed by two different annotators, achieving good levels of inter-annotator agreement. The creation of this corpus allowed us to use it as training data for a translation quality classifier. We drew conclusions on error severity by observing the outputs of two machine learning classifiers: a decision tree and a regression model. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1044/ |
PWC | https://paperswithcode.com/paper/building-a-corpus-of-errors-and-quality-in |
Repo | |
Framework | |
Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities
Title | Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities |
Authors | Victoria Yaneva, Irina Temnikova, Ruslan Mitkov |
Abstract | This paper presents an approach for automatic evaluation of the readability of text simplification output for readers with cognitive disabilities. First, we present our work towards the development of the EasyRead corpus, which contains easy-to-read documents created especially for people with cognitive disabilities. We then compare the EasyRead corpus to the simplified output contained in the LocalNews corpus (Feng, 2009), the accessibility of which has been evaluated through reading comprehension experiments including 20 adults with mild intellectual disability. This comparison is made on the basis of 13 disability-specific linguistic features. The comparison reveals that there are no major differences between the two corpora, which shows that the EasyRead corpus is at a similar reading level to the user-evaluated texts. We also discuss the role of Simple Wikipedia (Zhu et al., 2010) as a widely-used accessibility benchmark, in light of our finding that it is significantly more complex than both the EasyRead and the LocalNews corpora. |
Tasks | Reading Comprehension, Text Simplification |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1045/ |
PWC | https://paperswithcode.com/paper/evaluating-the-readability-of-text |
Repo | |
Framework | |
A Proposal for a Part-of-Speech Tagset for the Albanian Language
Title | A Proposal for a Part-of-Speech Tagset for the Albanian Language |
Authors | Besim Kabashi, Thomas Proisl |
Abstract | Part-of-speech tagging is a basic step in Natural Language Processing that is often essential. Labeling the word forms of a text with fine-grained word-class information adds new value to it and can be a prerequisite for downstream processes like a dependency parser. Corpus linguists and lexicographers also benefit greatly from the improved search options that are available with tagged data. The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset. In this paper, we discuss those difficulties and present a proposal for a part-of-speech tagset that can adequately represent the underlying linguistic phenomena. |
Tasks | Part-Of-Speech Tagging |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1682/ |
PWC | https://paperswithcode.com/paper/a-proposal-for-a-part-of-speech-tagset-for |
Repo | |
Framework | |
A dictionary- and rule-based system for identification of bacteria and habitats in text
Title | A dictionary- and rule-based system for identification of bacteria and habitats in text |
Authors | Helen V Cook, Evangelos Pafilis, Lars Juhl Jensen |
Abstract | |
Tasks | Named Entity Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-3006/ |
PWC | https://paperswithcode.com/paper/a-dictionary-and-rule-based-system-for |
Repo | |
Framework | |
Specifying and Annotating Reduced Argument Span Via QA-SRL
Title | Specifying and Annotating Reduced Argument Span Via QA-SRL |
Authors | Gabriel Stanovsky, Ido Dagan, Meni Adler |
Abstract | |
Tasks | Knowledge Base Population, Reading Comprehension |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2077/ |
PWC | https://paperswithcode.com/paper/specifying-and-annotating-reduced-argument |
Repo | |
Framework | |
Entity-Supported Summarization of Biomedical Abstracts
Title | Entity-Supported Summarization of Biomedical Abstracts |
Authors | Frederik Schulze, Mariana Neves |
Abstract | The increasing amount of biomedical information that is available for researchers and clinicians makes it harder to quickly find the right information. Automatic summarization of multiple texts can provide summaries specific to the user's information needs. In this paper we look into the use of named-entity recognition for graph-based summarization. We extend the LexRank algorithm with information about named entities and present EntityRank, a multi-document graph-based summarization algorithm that is solely based on named entities. We evaluate our system on a dataset of 1009 human written summaries provided by BioASQ and on 1974 gene summaries, fetched from the Entrez Gene database. The results show that the addition of named-entity information increases the performance of graph-based summarizers and that EntityRank significantly outperforms the other methods with regard to the ROUGE measures. |
Tasks | Document Summarization, Multi-Document Summarization, Named Entity Recognition, Question Answering, Text Summarization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5105/ |
PWC | https://paperswithcode.com/paper/entity-supported-summarization-of-biomedical |
Repo | |
Framework | |
PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis
Title | PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis |
Authors | Mateusz Lango, Dariusz Brzezinski, Jerzy Stefanowski |
Abstract | |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis, Twitter Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1018/ |
PWC | https://paperswithcode.com/paper/put-at-semeval-2016-task-4-the-abc-of-twitter |
Repo | |
Framework | |
Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015
Title | Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015 |
Authors | Johann Poignant, Hervé Bredin, Claude Barras, Mickael Stefas, Pierrick Bruneau, Thomas Tamisier |
Abstract | In this paper, we claim that the CAMOMILE collaborative annotation platform (developed in the framework of the eponymous CHIST-ERA project) eases the organization of multimedia technology benchmarks, automating most of the campaign technical workflow and enabling collaborative (hence faster and cheaper) annotation of the evaluation data. This is demonstrated through the successful organization of a new multimedia task at MediaEval 2015, Multimodal Person Discovery in Broadcast TV. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1047/ |
PWC | https://paperswithcode.com/paper/benchmarking-multimedia-technologies-with-the |
Repo | |
Framework | |
Identifying Argument Components through TextRank
Title | Identifying Argument Components through TextRank |
Authors | Georgios Petasis, Vangelis Karkaletsis |
Abstract | |
Tasks | Argument Mining |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2811/ |
PWC | https://paperswithcode.com/paper/identifying-argument-components-through |
Repo | |
Framework | |
UofL at SemEval-2016 Task 4: Multi Domain word2vec for Twitter Sentiment Classification
Title | UofL at SemEval-2016 Task 4: Multi Domain word2vec for Twitter Sentiment Classification |
Authors | Omar Abdelwahab, Adel Elmaghraby |
Abstract | |
Tasks | Feature Engineering, Sentiment Analysis, Transfer Learning, Twitter Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1024/ |
PWC | https://paperswithcode.com/paper/uofl-at-semeval-2016-task-4-multi-domain |
Repo | |
Framework | |
A Corpus of Tables in Full-Text Biomedical Research Publications
Title | A Corpus of Tables in Full-Text Biomedical Research Publications |
Authors | Tatyana Shmanina, Ingrid Zukerman, Ai Lee Cheam, Thomas Bochynek, Lawrence Cavedon |
Abstract | The development of text mining techniques for biomedical research literature has received increased attention in recent times. However, most of these techniques focus on prose, while much important biomedical data reside in tables. In this paper, we present a corpus created to serve as a gold standard for the development and evaluation of techniques for the automatic extraction of information from biomedical tables. We describe the guidelines used for corpus annotation and the manner in which they were developed. The high inter-annotator agreement achieved on the corpus, and the generic nature of our annotation approach, suggest that the developed guidelines can serve as a general framework for table annotation in biomedical and other scientific domains. The annotated corpus and the guidelines are available at http://www.csse.monash.edu.au/research/umnl/data/index.shtml. |
Tasks | Entity Linking, Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5108/ |
PWC | https://paperswithcode.com/paper/a-corpus-of-tables-in-full-text-biomedical |
Repo | |
Framework | |