May 5, 2019

Paper Group NANR 8

Mathematical Information Retrieval based on Type Embeddings and Query Expansion


Title	Mathematical Information Retrieval based on Type Embeddings and Query Expansion
Authors	Yiannos Stathopoulos, Simone Teufel
Abstract	We present an approach to mathematical information retrieval (MIR) that exploits a special kind of technical terminology, referred to as a mathematical type. In this paper, we present and evaluate a type detection mechanism and show its positive effect on the retrieval of research-level mathematics. Our best model, which performs query expansion with a type-aware embedding space, strongly outperforms standard IR models with state-of-the-art query expansion (vector space-based and language modelling-based), on a relatively new corpus of research-level queries.
Tasks	Information Retrieval, Language Modelling
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1221/
PDF	https://www.aclweb.org/anthology/C16-1221
PWC	https://paperswithcode.com/paper/mathematical-information-retrieval-based-on
Repo
Framework

A critique of word similarity as a method for evaluating distributional semantic models


Title	A critique of word similarity as a method for evaluating distributional semantic models
Authors	Miroslav Batchkarov, Thomas Kober, Jeremy Reffin, Julie Weeds, David Weir
Abstract
Tasks	Document Classification, Natural Language Inference
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2502/
PDF	https://www.aclweb.org/anthology/W16-2502
PWC	https://paperswithcode.com/paper/a-critique-of-word-similarity-as-a-method-for
Repo
Framework

NTNUSentEval at SemEval-2016 Task 4: Combining General Classifiers for Fast Twitter Sentiment Analysis


Title	NTNUSentEval at SemEval-2016 Task 4: Combining General Classifiers for Fast Twitter Sentiment Analysis
Authors	Brage Ekroll Jahren, Valerij Fredriksen, Björn Gambäck, Lars Bungum
Abstract
Tasks	Sentiment Analysis, Twitter Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1014/
PDF	https://www.aclweb.org/anthology/S16-1014
PWC	https://paperswithcode.com/paper/ntnusenteval-at-semeval-2016-task-4-combining
Repo
Framework

UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation


Title	UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation
Authors	Esteban Castillo, Ofelia Cervantes, Darnes Vilariño, David Báez
Abstract
Tasks	Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1015/
PDF	https://www.aclweb.org/anthology/S16-1015
PWC	https://paperswithcode.com/paper/udlap-at-semeval-2016-task-4-sentiment
Repo
Framework

Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact


Title	Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact
Authors	Ângela Costa, Rui Correia, Luísa Coheur
Abstract	In this paper we describe a corpus of automatic translations annotated with both error type and quality. The 300 sentences that we have selected were generated by Google Translate, Systran and two in-house Machine Translation systems that use Moses technology. The errors present in the translations were annotated with an error taxonomy that divides errors into five main linguistic categories (Orthography, Lexis, Grammar, Semantics and Discourse), reflecting the language level where the error is located. After the error annotation process, we assessed the translation quality of each sentence using a four point comprehension scale from 1 to 5. Both tasks of error and quality annotation were performed by two different annotators, achieving good levels of inter-annotator agreement. The creation of this corpus allowed us to use it as training data for a translation quality classifier. We concluded on error severity by observing the outputs of two machine learning classifiers: a decision tree and a regression model.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1044/
PDF	https://www.aclweb.org/anthology/L16-1044
PWC	https://paperswithcode.com/paper/building-a-corpus-of-errors-and-quality-in
Repo
Framework

Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities


Title	Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities
Authors	Victoria Yaneva, Irina Temnikova, Ruslan Mitkov
Abstract	This paper presents an approach for automatic evaluation of the readability of text simplification output for readers with cognitive disabilities. First, we present our work towards the development of the EasyRead corpus, which contains easy-to-read documents created especially for people with cognitive disabilities. We then compare the EasyRead corpus to the simplified output contained in the LocalNews corpus (Feng, 2009), the accessibility of which has been evaluated through reading comprehension experiments including 20 adults with mild intellectual disability. This comparison is made on the basis of 13 disability-specific linguistic features. The comparison reveals that there are no major differences between the two corpora, which shows that the EasyRead corpus is at a similar reading level as the user-evaluated texts. We also discuss the role of Simple Wikipedia (Zhu et al., 2010) as a widely-used accessibility benchmark, in light of our finding that it is significantly more complex than both the EasyRead and the LocalNews corpora.
Tasks	Reading Comprehension, Text Simplification
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1045/
PDF	https://www.aclweb.org/anthology/L16-1045
PWC	https://paperswithcode.com/paper/evaluating-the-readability-of-text
Repo
Framework

A Proposal for a Part-of-Speech Tagset for the Albanian Language


Title	A Proposal for a Part-of-Speech Tagset for the Albanian Language
Authors	Besim Kabashi, Thomas Proisl
Abstract	Part-of-speech tagging is a basic step in Natural Language Processing that is often essential. Labeling the word forms of a text with fine-grained word-class information adds new value to it and can be a prerequisite for downstream processes like a dependency parser. Corpus linguists and lexicographers also benefit greatly from the improved search options that are available with tagged data. The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset. In this paper, we discuss those difficulties and present a proposal for a part-of-speech tagset that can adequately represent the underlying linguistic phenomena.
Tasks	Part-Of-Speech Tagging
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1682/
PDF	https://www.aclweb.org/anthology/L16-1682
PWC	https://paperswithcode.com/paper/a-proposal-for-a-part-of-speech-tagset-for
Repo
Framework

A dictionary- and rule-based system for identification of bacteria and habitats in text


Title	A dictionary- and rule-based system for identification of bacteria and habitats in text
Authors	Helen V Cook, Evangelos Pafilis, Lars Juhl Jensen
Abstract
Tasks	Named Entity Recognition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-3006/
PDF	https://www.aclweb.org/anthology/W16-3006
PWC	https://paperswithcode.com/paper/a-dictionary-and-rule-based-system-for
Repo
Framework

Specifying and Annotating Reduced Argument Span Via QA-SRL


Title	Specifying and Annotating Reduced Argument Span Via QA-SRL
Authors	Gabriel Stanovsky, Ido Dagan, Meni Adler
Abstract
Tasks	Knowledge Base Population, Reading Comprehension
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-2077/
PDF	https://www.aclweb.org/anthology/P16-2077
PWC	https://paperswithcode.com/paper/specifying-and-annotating-reduced-argument
Repo
Framework

Entity-Supported Summarization of Biomedical Abstracts


Title	Entity-Supported Summarization of Biomedical Abstracts
Authors	Frederik Schulze, Mariana Neves
Abstract	The increasing amount of biomedical information that is available for researchers and clinicians makes it harder to quickly find the right information. Automatic summarization of multiple texts can provide summaries specific to the user's information needs. In this paper we look into the use of named-entity recognition for graph-based summarization. We extend the LexRank algorithm with information about named entities and present EntityRank, a multi-document graph-based summarization algorithm that is solely based on named entities. We evaluate our system on a dataset of 1009 human written summaries provided by BioASQ and on 1974 gene summaries, fetched from the Entrez Gene database. The results show that the addition of named-entity information increases the performance of graph-based summarizers and that EntityRank significantly outperforms the other methods with regard to the ROUGE measures.
Tasks	Document Summarization, Multi-Document Summarization, Named Entity Recognition, Question Answering, Text Summarization
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5105/
PDF	https://www.aclweb.org/anthology/W16-5105
PWC	https://paperswithcode.com/paper/entity-supported-summarization-of-biomedical
Repo
Framework

PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis


Title	PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis
Authors	Mateusz Lango, Dariusz Brzezinski, Jerzy Stefanowski
Abstract
Tasks	Part-Of-Speech Tagging, Sentiment Analysis, Twitter Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1018/
PDF	https://www.aclweb.org/anthology/S16-1018
PWC	https://paperswithcode.com/paper/put-at-semeval-2016-task-4-the-abc-of-twitter
Repo
Framework

Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015


Title	Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015
Authors	Johann Poignant, Hervé Bredin, Claude Barras, Mickael Stefas, Pierrick Bruneau, Thomas Tamisier
Abstract	In this paper, we claim that the CAMOMILE collaborative annotation platform (developed in the framework of the eponymous CHIST-ERA project) eases the organization of multimedia technology benchmarks, automating most of the campaign technical workflow and enabling collaborative (hence faster and cheaper) annotation of the evaluation data. This is demonstrated through the successful organization of a new multimedia task at MediaEval 2015, Multimodal Person Discovery in Broadcast TV.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1047/
PDF	https://www.aclweb.org/anthology/L16-1047
PWC	https://paperswithcode.com/paper/benchmarking-multimedia-technologies-with-the
Repo
Framework

Identifying Argument Components through TextRank


Title	Identifying Argument Components through TextRank
Authors	Georgios Petasis, Vangelis Karkaletsis
Abstract
Tasks	Argument Mining
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2811/
PDF	https://www.aclweb.org/anthology/W16-2811
PWC	https://paperswithcode.com/paper/identifying-argument-components-through
Repo
Framework

UofL at SemEval-2016 Task 4: Multi Domain word2vec for Twitter Sentiment Classification


Title	UofL at SemEval-2016 Task 4: Multi Domain word2vec for Twitter Sentiment Classification
Authors	Omar Abdelwahab, Adel Elmaghraby
Abstract
Tasks	Feature Engineering, Sentiment Analysis, Transfer Learning, Twitter Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1024/
PDF	https://www.aclweb.org/anthology/S16-1024
PWC	https://paperswithcode.com/paper/uofl-at-semeval-2016-task-4-multi-domain
Repo
Framework

A Corpus of Tables in Full-Text Biomedical Research Publications


Title	A Corpus of Tables in Full-Text Biomedical Research Publications
Authors	Tatyana Shmanina, Ingrid Zukerman, Ai Lee Cheam, Thomas Bochynek, Lawrence Cavedon
Abstract	The development of text mining techniques for biomedical research literature has received increased attention in recent times. However, most of these techniques focus on prose, while much important biomedical data reside in tables. In this paper, we present a corpus created to serve as a gold standard for the development and evaluation of techniques for the automatic extraction of information from biomedical tables. We describe the guidelines used for corpus annotation and the manner in which they were developed. The high inter-annotator agreement achieved on the corpus, and the generic nature of our annotation approach, suggest that the developed guidelines can serve as a general framework for table annotation in biomedical and other scientific domains. The annotated corpus and the guidelines are available at http://www.csse.monash.edu.au/research/umnl/data/index.shtml.
Tasks	Entity Linking, Named Entity Recognition
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5108/
PDF	https://www.aclweb.org/anthology/W16-5108
PWC	https://paperswithcode.com/paper/a-corpus-of-tables-in-full-text-biomedical
Repo
Framework