May 5, 2019

Paper Group NANR 8

Mathematical Information Retrieval based on Type Embeddings and Query Expansion

Title Mathematical Information Retrieval based on Type Embeddings and Query Expansion
Authors Yiannos Stathopoulos, Simone Teufel
Abstract We present an approach to mathematical information retrieval (MIR) that exploits a special kind of technical terminology, referred to as a mathematical type. In this paper, we present and evaluate a type detection mechanism and show its positive effect on the retrieval of research-level mathematics. Our best model, which performs query expansion with a type-aware embedding space, strongly outperforms standard IR models with state-of-the-art query expansion (vector space-based and language modelling-based), on a relatively new corpus of research-level queries.
Tasks Information Retrieval, Language Modelling
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1221/
PDF https://www.aclweb.org/anthology/C16-1221
PWC https://paperswithcode.com/paper/mathematical-information-retrieval-based-on
Repo
Framework

A critique of word similarity as a method for evaluating distributional semantic models

Title A critique of word similarity as a method for evaluating distributional semantic models
Authors Miroslav Batchkarov, Thomas Kober, Jeremy Reffin, Julie Weeds, David Weir
Abstract
Tasks Document Classification, Natural Language Inference
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2502/
PDF https://www.aclweb.org/anthology/W16-2502
PWC https://paperswithcode.com/paper/a-critique-of-word-similarity-as-a-method-for
Repo
Framework

NTNUSentEval at SemEval-2016 Task 4: Combining General Classifiers for Fast Twitter Sentiment Analysis

Title NTNUSentEval at SemEval-2016 Task 4: Combining General Classifiers for Fast Twitter Sentiment Analysis
Authors Brage Ekroll Jahren, Valerij Fredriksen, Björn Gambäck, Lars Bungum
Abstract
Tasks Sentiment Analysis, Twitter Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1014/
PDF https://www.aclweb.org/anthology/S16-1014
PWC https://paperswithcode.com/paper/ntnusenteval-at-semeval-2016-task-4-combining
Repo
Framework

UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation

Title UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation
Authors Esteban Castillo, Ofelia Cervantes, Darnes Vilariño, David Báez
Abstract
Tasks Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1015/
PDF https://www.aclweb.org/anthology/S16-1015
PWC https://paperswithcode.com/paper/udlap-at-semeval-2016-task-4-sentiment
Repo
Framework

Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact

Title Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact
Authors Ângela Costa, Rui Correia, Luísa Coheur
Abstract In this paper we describe a corpus of automatic translations annotated with both error type and quality. The 300 sentences that we have selected were generated by Google Translate, Systran and two in-house Machine Translation systems that use Moses technology. The errors present in the translations were annotated with an error taxonomy that divides errors into five main linguistic categories (Orthography, Lexis, Grammar, Semantics and Discourse), reflecting the language level where the error is located. After the error annotation process, we assessed the translation quality of each sentence using a four point comprehension scale from 1 to 5. Both tasks of error and quality annotation were performed by two different annotators, achieving good levels of inter-annotator agreement. The creation of this corpus allowed us to use it as training data for a translation quality classifier. We concluded on error severity by observing the outputs of two machine learning classifiers: a decision tree and a regression model.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1044/
PDF https://www.aclweb.org/anthology/L16-1044
PWC https://paperswithcode.com/paper/building-a-corpus-of-errors-and-quality-in
Repo
Framework

Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities

Title Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities
Authors Victoria Yaneva, Irina Temnikova, Ruslan Mitkov
Abstract This paper presents an approach for automatic evaluation of the readability of text simplification output for readers with cognitive disabilities. First, we present our work towards the development of the EasyRead corpus, which contains easy-to-read documents created especially for people with cognitive disabilities. We then compare the EasyRead corpus to the simplified output contained in the LocalNews corpus (Feng, 2009), the accessibility of which has been evaluated through reading comprehension experiments including 20 adults with mild intellectual disability. This comparison is made on the basis of 13 disability-specific linguistic features. The comparison reveals that there are no major differences between the two corpora, which shows that the EasyRead corpus is at a similar reading level to the user-evaluated texts. We also discuss the role of Simple Wikipedia (Zhu et al., 2010) as a widely-used accessibility benchmark, in light of our finding that it is significantly more complex than both the EasyRead and the LocalNews corpora.
Tasks Reading Comprehension, Text Simplification
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1045/
PDF https://www.aclweb.org/anthology/L16-1045
PWC https://paperswithcode.com/paper/evaluating-the-readability-of-text
Repo
Framework

A Proposal for a Part-of-Speech Tagset for the Albanian Language

Title A Proposal for a Part-of-Speech Tagset for the Albanian Language
Authors Besim Kabashi, Thomas Proisl
Abstract Part-of-speech tagging is a basic step in Natural Language Processing that is often essential. Labeling the word forms of a text with fine-grained word-class information adds new value to it and can be a prerequisite for downstream processes like a dependency parser. Corpus linguists and lexicographers also benefit greatly from the improved search options that are available with tagged data. The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset. In this paper, we discuss those difficulties and present a proposal for a part-of-speech tagset that can adequately represent the underlying linguistic phenomena.
Tasks Part-Of-Speech Tagging
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1682/
PDF https://www.aclweb.org/anthology/L16-1682
PWC https://paperswithcode.com/paper/a-proposal-for-a-part-of-speech-tagset-for
Repo
Framework

A dictionary- and rule-based system for identification of bacteria and habitats in text

Title A dictionary- and rule-based system for identification of bacteria and habitats in text
Authors Helen V Cook, Evangelos Pafilis, Lars Juhl Jensen
Abstract
Tasks Named Entity Recognition
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-3006/
PDF https://www.aclweb.org/anthology/W16-3006
PWC https://paperswithcode.com/paper/a-dictionary-and-rule-based-system-for
Repo
Framework

Specifying and Annotating Reduced Argument Span Via QA-SRL

Title Specifying and Annotating Reduced Argument Span Via QA-SRL
Authors Gabriel Stanovsky, Ido Dagan, Meni Adler
Abstract
Tasks Knowledge Base Population, Reading Comprehension
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2077/
PDF https://www.aclweb.org/anthology/P16-2077
PWC https://paperswithcode.com/paper/specifying-and-annotating-reduced-argument
Repo
Framework

Entity-Supported Summarization of Biomedical Abstracts

Title Entity-Supported Summarization of Biomedical Abstracts
Authors Frederik Schulze, Mariana Neves
Abstract The increasing amount of biomedical information that is available for researchers and clinicians makes it harder to quickly find the right information. Automatic summarization of multiple texts can provide summaries specific to the user's information needs. In this paper we look into the use of named-entity recognition for graph-based summarization. We extend the LexRank algorithm with information about named entities and present EntityRank, a multi-document graph-based summarization algorithm that is solely based on named entities. We evaluate our system on a dataset of 1009 human written summaries provided by BioASQ and on 1974 gene summaries, fetched from the Entrez Gene database. The results show that the addition of named-entity information increases the performance of graph-based summarizers and that EntityRank significantly outperforms the other methods with regard to the ROUGE measures.
Tasks Document Summarization, Multi-Document Summarization, Named Entity Recognition, Question Answering, Text Summarization
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5105/
PDF https://www.aclweb.org/anthology/W16-5105
PWC https://paperswithcode.com/paper/entity-supported-summarization-of-biomedical
Repo
Framework

PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis

Title PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis
Authors Mateusz Lango, Dariusz Brzezinski, Jerzy Stefanowski
Abstract
Tasks Part-Of-Speech Tagging, Sentiment Analysis, Twitter Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1018/
PDF https://www.aclweb.org/anthology/S16-1018
PWC https://paperswithcode.com/paper/put-at-semeval-2016-task-4-the-abc-of-twitter
Repo
Framework

Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015

Title Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015
Authors Johann Poignant, Hervé Bredin, Claude Barras, Mickael Stefas, Pierrick Bruneau, Thomas Tamisier
Abstract In this paper, we claim that the CAMOMILE collaborative annotation platform (developed in the framework of the eponymous CHIST-ERA project) eases the organization of multimedia technology benchmarks, automating most of the campaign technical workflow and enabling collaborative (hence faster and cheaper) annotation of the evaluation data. This is demonstrated through the successful organization of a new multimedia task at MediaEval 2015, Multimodal Person Discovery in Broadcast TV.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1047/
PDF https://www.aclweb.org/anthology/L16-1047
PWC https://paperswithcode.com/paper/benchmarking-multimedia-technologies-with-the
Repo
Framework

Identifying Argument Components through TextRank

Title Identifying Argument Components through TextRank
Authors Georgios Petasis, Vangelis Karkaletsis
Abstract
Tasks Argument Mining
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2811/
PDF https://www.aclweb.org/anthology/W16-2811
PWC https://paperswithcode.com/paper/identifying-argument-components-through
Repo
Framework

UofL at SemEval-2016 Task 4: Multi Domain word2vec for Twitter Sentiment Classification

Title UofL at SemEval-2016 Task 4: Multi Domain word2vec for Twitter Sentiment Classification
Authors Omar Abdelwahab, Adel Elmaghraby
Abstract
Tasks Feature Engineering, Sentiment Analysis, Transfer Learning, Twitter Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1024/
PDF https://www.aclweb.org/anthology/S16-1024
PWC https://paperswithcode.com/paper/uofl-at-semeval-2016-task-4-multi-domain
Repo
Framework

A Corpus of Tables in Full-Text Biomedical Research Publications

Title A Corpus of Tables in Full-Text Biomedical Research Publications
Authors Tatyana Shmanina, Ingrid Zukerman, Ai Lee Cheam, Thomas Bochynek, Lawrence Cavedon
Abstract The development of text mining techniques for biomedical research literature has received increased attention in recent times. However, most of these techniques focus on prose, while much important biomedical data reside in tables. In this paper, we present a corpus created to serve as a gold standard for the development and evaluation of techniques for the automatic extraction of information from biomedical tables. We describe the guidelines used for corpus annotation and the manner in which they were developed. The high inter-annotator agreement achieved on the corpus, and the generic nature of our annotation approach, suggest that the developed guidelines can serve as a general framework for table annotation in biomedical and other scientific domains. The annotated corpus and the guidelines are available at http://www.csse.monash.edu.au/research/umnl/data/index.shtml.
Tasks Entity Linking, Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5108/
PDF https://www.aclweb.org/anthology/W16-5108
PWC https://paperswithcode.com/paper/a-corpus-of-tables-in-full-text-biomedical
Repo
Framework