July 26, 2019

Paper Group NANR 27

A Multi-strategy Query Processing Approach for Biomedical Question Answering: USTB_PRIR at BioASQ 2017 Task 5B

Title A Multi-strategy Query Processing Approach for Biomedical Question Answering: USTB_PRIR at BioASQ 2017 Task 5B
Authors Zan-Xia Jin, Bo-Wen Zhang, Fan Fang, Le-Le Zhang, Xu-Cheng Yin
Abstract This paper describes the participation of the USTB_PRIR team in the 2017 BioASQ Task 5B on question answering, including the document retrieval, snippet retrieval, and concept retrieval tasks. We introduce different multimodal query processing strategies to enrich query terms and assign different weights to them. Specifically, the sequential dependence model (SDM), pseudo-relevance feedback (PRF), the fielded sequential dependence model (FSDM) and the Divergence from Randomness model (DFRM) are respectively applied to different fields of PubMed articles, sentences extracted from relevant articles, and the five terminologies or ontologies (MeSH, GO, Jochem, Uniprot and DO) to achieve better search performance. Preliminary results show that our systems outperform others in the document and snippet retrieval tasks in the first two batches.
Tasks Information Retrieval, Question Answering
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2348/
PDF https://www.aclweb.org/anthology/W17-2348
PWC https://paperswithcode.com/paper/a-multi-strategy-query-processing-approach
Repo
Framework

SemEval-2017 Task 12: Clinical TempEval

Title SemEval-2017 Task 12: Clinical TempEval
Authors Steven Bethard, Guergana Savova, Martha Palmer, James Pustejovsky
Abstract Clinical TempEval 2017 aimed to answer the question: how well do systems trained on annotated timelines for one medical condition (colon cancer) perform in predicting timelines for another medical condition (brain cancer)? Nine sub-tasks were included, covering problems in time expression identification, event expression identification and temporal relation identification. Participant systems were evaluated on clinical and pathology notes from Mayo Clinic cancer patients, annotated with an extension of TimeML for the clinical domain. Eleven teams participated in the tasks, with the best systems achieving F1 scores above 0.55 for time expressions, above 0.70 for event expressions, and above 0.40 for temporal relations. Most tasks saw a drop of about 20 points relative to Clinical TempEval 2016, where systems were trained and evaluated on the same domain (colon cancer).
Tasks Domain Adaptation, Temporal Information Extraction
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2093/
PDF https://www.aclweb.org/anthology/S17-2093
PWC https://paperswithcode.com/paper/semeval-2017-task-12-clinical-tempeval
Repo
Framework
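
The F1 scores quoted in the Clinical TempEval abstract combine precision and recall into one number. The following is a minimal sketch of span-level F1, assuming exact-match scoring over (start, end) character offsets. The function name `span_f1` and the toy spans are hypothetical and are not taken from the official task scorer.

```python
def span_f1(gold, predicted):
    """Exact-match precision, recall and F1 over sets of (start, end) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: four gold time expressions, three system predictions.
gold_spans = [(0, 4), (10, 18), (25, 31), (40, 44)]
predicted_spans = [(0, 4), (10, 18), (26, 31)]
precision, recall, f1 = span_f1(gold_spans, predicted_spans)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Two of the three predictions match a gold span exactly, so precision is 2/3, recall is 2/4, and F1 is their harmonic mean, 4/7.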

Why Catalan-Spanish Neural Machine Translation? Analysis, comparison and combination with standard Rule and Phrase-based technologies

Title Why Catalan-Spanish Neural Machine Translation? Analysis, comparison and combination with standard Rule and Phrase-based technologies
Authors Marta R. Costa-jussà
Abstract Catalan and Spanish are two related languages, given that both derive from Latin. They share similarities at several linguistic levels, including morphology, syntax and semantics. This makes them particularly interesting for the MT task. Given the recent appearance and popularity of neural MT, this paper analyzes the performance of this new approach compared to the well-established rule-based and phrase-based MT systems. Experiments are reported on a large database of 180 million words. Results, in terms of standard automatic measures, show that neural MT clearly outperforms the rule-based and phrase-based MT systems on the in-domain test set, but it is worse on the out-of-domain test set. A naive system combination works especially well for the latter. In-domain manual analysis shows that neural MT tends to improve both adequacy and fluency, for example by generating more natural translations instead of literal ones, choosing the adequate target word when the source word has several translations, and improving gender agreement. However, out-of-domain manual analysis shows that neural MT is more affected by unknown words or contexts.
Tasks Machine Translation
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1207/
PDF https://www.aclweb.org/anthology/W17-1207
PWC https://paperswithcode.com/paper/why-catalan-spanish-neural-machine
Repo
Framework

Know-Center at SemEval-2017 Task 10: Sequence Classification with the CODE Annotator

Title Know-Center at SemEval-2017 Task 10: Sequence Classification with the CODE Annotator
Authors Roman Kern, Stefan Falk, Andi Rexha
Abstract This paper describes our participation in SemEval-2017 Task 10. We competed in Subtasks 1 and 2, which consist respectively of identifying all the key phrases in scientific publications and labeling them with one of three categories: Task, Process, and Material. These scientific publications are selected from the Computer Science, Material Sciences, and Physics domains. We followed a supervised approach for both subtasks, using a sequential classifier (CRF, Conditional Random Fields). To generate our solution we used a web-based application implemented in the EU-funded research project CODE. Our system achieved an F1 score of 0.39 for Subtask 1 and 0.28 for Subtask 2.
Tasks Information Retrieval, Reading Comprehension
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2167/
PDF https://www.aclweb.org/anthology/S17-2167
PWC https://paperswithcode.com/paper/know-center-at-semeval-2017-task-10-sequence
Repo
Framework

A Multi-aspect Analysis of Automatic Essay Scoring for Brazilian Portuguese

Title A Multi-aspect Analysis of Automatic Essay Scoring for Brazilian Portuguese
Authors Evelin Amorim, Adriano Veloso
Abstract Several methods for automatic essay scoring (AES) for the English language have been proposed. However, multi-aspect AES systems for other languages are unusual. Therefore, we propose a multi-aspect AES system and apply it to a dataset of Brazilian Portuguese essays, which human experts evaluated according to five aspects defined by the Brazilian Government for the National Exam for High School Students (ENEM). These aspects are skills that students must master, and each skill is assessed separately from the others. Besides predicting each aspect, we also performed a feature analysis for each aspect. The proposed AES system employs several features already used by AES systems for the English language. Our results show that predictions for some aspects performed well with the features we employed, while predictions for other aspects performed poorly. The detailed feature analysis we performed also reveals differences between the five aspects. Besides these contributions, the eight million enrollments every year for ENEM raise some challenging issues for future directions in our research.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-4010/
PDF https://www.aclweb.org/anthology/E17-4010
PWC https://paperswithcode.com/paper/a-multi-aspect-analysis-of-automatic-essay
Repo
Framework

Procedural Text Generation from an Execution Video

Title Procedural Text Generation from an Execution Video
Authors Atsushi Ushiku, Hayato Hashimoto, Atsushi Hashimoto, Shinsuke Mori
Abstract In recent years, there has been a surge of interest in automatically describing images or videos in a natural language. These descriptions are useful for image/video search, etc. In this paper, we focus on procedure execution videos, in which a human makes or repairs something, and propose a method for generating procedural texts from them. Since the available video/text pairs are limited in size, the direct application of end-to-end deep learning is not feasible. Thus we propose to train a Faster R-CNN network for object recognition and an LSTM for text generation, and to combine them at run time. We took pairs of recipes and cooking videos, generated a recipe from a video, and compared it with the original recipe. The experimental results showed that our method can produce a recipe as accurate as the state-of-the-art scene descriptions.
Tasks Object Recognition, Text Generation, Video Captioning
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1033/
PDF https://www.aclweb.org/anthology/I17-1033
PWC https://paperswithcode.com/paper/procedural-text-generation-from-an-execution
Repo
Framework

Increasing Return on Annotation Investment: The Automatic Construction of a Universal Dependency Treebank for Dutch

Title Increasing Return on Annotation Investment: The Automatic Construction of a Universal Dependency Treebank for Dutch
Authors Gosse Bouma, Gertjan van Noord
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0403/
PDF https://www.aclweb.org/anthology/W17-0403
PWC https://paperswithcode.com/paper/increasing-return-on-annotation-investment
Repo
Framework

Inducing Semantic Micro-Clusters from Deep Multi-View Representations of Novels

Title Inducing Semantic Micro-Clusters from Deep Multi-View Representations of Novels
Authors Lea Frermann, György Szarvas
Abstract Automatically understanding the plot of novels is important both for informing literary scholarship and applications such as summarization or recommendation. Various models have addressed this task, but their evaluation has remained largely intrinsic and qualitative. Here, we propose a principled and scalable framework leveraging expert-provided semantic tags (e.g., mystery, pirates) to evaluate plot representations in an extrinsic fashion, assessing their ability to produce locally coherent groupings of novels (micro-clusters) in model space. We present a deep recurrent autoencoder model that learns richly structured multi-view plot representations, and show that they i) yield better micro-clusters than less structured representations; and ii) are interpretable, and thus useful for further literary analysis or labeling of the emerging micro-clusters.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1200/
PDF https://www.aclweb.org/anthology/D17-1200
PWC https://paperswithcode.com/paper/inducing-semantic-micro-clusters-from-deep
Repo
Framework

Multilingual CALL Framework for Automatic Language Exercise Generation from Free Text

Title Multilingual CALL Framework for Automatic Language Exercise Generation from Free Text
Authors Naiara Perez, Montse Cuadros
Abstract This paper describes a web-based application to design and answer exercises for language learning. It is available in Basque, Spanish, English, and French. Based on open-source Natural Language Processing (NLP) technology such as word embedding models and word sense disambiguation, the application enables users to automatically and easily create, in real time, three types of exercises, namely Fill-in-the-Gaps, Multiple Choice, and Shuffled Sentences questionnaires. These are generated from texts of the users' own choice, so they can train their language skills with content of their particular interest.
Tasks Word Sense Disambiguation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-3013/
PDF https://www.aclweb.org/anthology/E17-3013
PWC https://paperswithcode.com/paper/multilingual-call-framework-for-automatic
Repo
Framework
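
Two of the exercise types named in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenization and a gap position chosen by hand. The application described in the paper relies on word embeddings and word sense disambiguation, which this sketch does not attempt, and the function names `fill_in_the_gap` and `shuffled_sentence` are hypothetical.

```python
import random


def fill_in_the_gap(tokens, gap_index, blank="____"):
    """Blank out one token; return the exercise tokens and the expected answer."""
    answer = tokens[gap_index]
    exercise = tokens[:gap_index] + [blank] + tokens[gap_index + 1:]
    return exercise, answer


def shuffled_sentence(tokens, seed=0):
    """Return a shuffled copy of the tokens; the learner restores the order."""
    rng = random.Random(seed)  # fixed seed so the exercise is reproducible
    shuffled = tokens[:]
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical input text supplied by a learner.
tokens = "The cat sleeps on the sofa".split()
exercise, answer = fill_in_the_gap(tokens, 2)
print(" ".join(exercise), "| answer:", answer)
print(" ".join(shuffled_sentence(tokens)))
```

A Multiple Choice exercise would additionally need distractors for the blanked word, which is where the embedding models mentioned in the abstract would plausibly come in.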

Fast, Sample-Efficient Algorithms for Structured Phase Retrieval

Title Fast, Sample-Efficient Algorithms for Structured Phase Retrieval
Authors Gauri Jagatap, Chinmay Hegde
Abstract We consider the problem of recovering a signal x in R^n from magnitude-only measurements y_i = |a_i^T x| for i = 1, 2, …, m. Also known as the phase retrieval problem, it is a fundamental challenge in nano-, bio- and astronomical imaging systems and in speech processing. The problem is ill-posed, and therefore additional assumptions on the signal and/or the measurements are necessary. In this paper, we first study the case where the underlying signal x is s-sparse. We develop a novel recovery algorithm that we call Compressive Phase Retrieval with Alternating Minimization, or CoPRAM. Our algorithm is simple and can be obtained via a natural combination of the classical alternating minimization approach for phase retrieval with the CoSaMP algorithm for sparse recovery. Despite its simplicity, we prove that our algorithm achieves a sample complexity of O(s^2 log n) with Gaussian samples, which matches the best known existing results. It also demonstrates linear convergence in theory and practice and requires no extra tuning parameters other than the signal sparsity level s. We then consider the case where the underlying signal x arises from structured sparsity models. We specifically examine the case of block-sparse signals with uniform block size b and block sparsity k = s/b. For this problem, we design a recovery algorithm that we call Block CoPRAM that further reduces the sample complexity to O(ks log n). For sufficiently large block lengths of b = Theta(s), this bound equates to O(s log n). To our knowledge, this constitutes the first end-to-end linearly convergent family of algorithms for phase retrieval where the Gaussian sample complexity has a sub-quadratic dependence on the sparsity level of the signal.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7077-fast-sample-efficient-algorithms-for-structured-phase-retrieval
PDF http://papers.nips.cc/paper/7077-fast-sample-efficient-algorithms-for-structured-phase-retrieval.pdf
PWC https://paperswithcode.com/paper/fast-sample-efficient-algorithms-for
Repo
Framework
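
The two sample-complexity bounds in the phase retrieval abstract above differ by exactly the block size: with k = s/b, the quantity ks log n equals (s^2 log n)/b. The following is a small worked check, with the constants hidden by the O(·) notation omitted. The function names and the values of n and s are hypothetical choices for illustration.

```python
import math


def copram_scaling(s, n):
    """s^2 * log n: scaling of the sparse-case bound, constants omitted."""
    return s ** 2 * math.log(n)


def block_copram_scaling(s, b, n):
    """k * s * log n with block sparsity k = s / b, constants omitted."""
    k = s / b
    return k * s * math.log(n)


n, s = 10_000, 64  # hypothetical signal length and sparsity
for b in (1, 8, 64):
    ratio = block_copram_scaling(s, b, n) / copram_scaling(s, n)
    print(f"b={b:>2}  k={s // b:>2}  ratio to s^2 log n = {ratio:.4f}")
```

With b = 1 the two expressions coincide, and with b = s the block bound reduces to s log n, which is the sub-quadratic regime the abstract refers to.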

Proceedings of the Student Research Workshop at the 15th Conference of the European Chapter of the Association for Computational Linguistics

Title Proceedings of the Student Research Workshop at the 15th Conference of the European Chapter of the Association for Computational Linguistics
Authors
Abstract
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-4000/
PDF https://www.aclweb.org/anthology/E17-4000
PWC https://paperswithcode.com/paper/proceedings-of-the-student-research-workshop-3
Repo
Framework

An RNN-based Binary Classifier for the Story Cloze Test

Title An RNN-based Binary Classifier for the Story Cloze Test
Authors Melissa Roemmele, Sosuke Kobayashi, Naoya Inoue, Andrew Gordon
Abstract The Story Cloze Test consists of choosing a sentence that best completes a story given two choices. In this paper we present a system that performs this task using a supervised binary classifier on top of a recurrent neural network to predict the probability that a given story ending is correct. The classifier is trained to distinguish correct story endings given in the training data from incorrect ones that we artificially generate. Our experiments evaluate different methods for generating these negative examples, as well as different embedding-based representations of the stories. Our best result obtains 67.2% accuracy on the test set, outperforming the existing top baseline of 58.5%.
Tasks Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-0911/
PDF https://www.aclweb.org/anthology/W17-0911
PWC https://paperswithcode.com/paper/an-rnn-based-binary-classifier-for-the-story
Repo
Framework
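
The training and prediction setup described in the Story Cloze abstract can be sketched as follows. The sketch assumes that negative endings are borrowed at random from other stories, which is only one possible generation method and is not confirmed as the authors' choice. `toy_score` is a hypothetical word-overlap stand-in for the RNN classifier's predicted probability, and the stories are invented.

```python
import random


def make_training_pairs(stories, seed=0):
    """Label each true ending 1 and an ending borrowed from another story 0."""
    rng = random.Random(seed)
    pairs = []
    for i, (context, ending) in enumerate(stories):
        pairs.append((context, ending, 1))
        other = rng.choice([j for j in range(len(stories)) if j != i])
        pairs.append((context, stories[other][1], 0))
    return pairs


def choose_ending(score, context, endings):
    """Pick the candidate ending that the scorer rates as most likely correct."""
    return max(endings, key=lambda ending: score(context, ending))


def toy_score(context, ending):
    """Hypothetical stand-in scorer: number of words shared with the context."""
    return len(set(context.lower().split()) & set(ending.lower().split()))


stories = [
    ("Tom was hungry.", "He ate a sandwich."),
    ("Ann lost her keys.", "She found them in her bag."),
    ("It started to rain.", "We opened our umbrellas."),
]
pairs = make_training_pairs(stories)
choice = choose_ending(
    toy_score, "Ann lost her keys.",
    ["He ate a sandwich.", "She found them in her bag."],
)
print(len(pairs), "training pairs; chosen ending:", choice)
```

At test time only `choose_ending` matters: the classifier scores both candidate endings and the higher-scoring one is returned.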

Effective shared representations with Multitask Learning for Community Question Answering

Title Effective shared representations with Multitask Learning for Community Question Answering
Authors Daniele Bonadiman, Antonio Uva, Alessandro Moschitti
Abstract An important asset of using Deep Neural Networks (DNNs) for text applications is their ability to automatically engineer features. Unfortunately, DNNs usually require a lot of training data, especially for highly semantic tasks such as community Question Answering (cQA). In this paper, we tackle the problem of data scarcity by learning the target DNN together with two auxiliary tasks in a multitask learning setting. We exploit the strong semantic connection between the selection of comments relevant to (i) new questions and (ii) forum questions. This enables a global representation for comments, new questions and previous questions. Experiments with our model on a SemEval challenge dataset for cQA show a 20% relative improvement over standard DNNs.
Tasks Community Question Answering, Document Ranking, Named Entity Recognition, Question Answering
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2115/
PDF https://www.aclweb.org/anthology/E17-2115
PWC https://paperswithcode.com/paper/effective-shared-representations-with
Repo
Framework

NTT Neural Machine Translation Systems at WAT 2017

Title NTT Neural Machine Translation Systems at WAT 2017
Authors Makoto Morishita, Jun Suzuki, Masaaki Nagata
Abstract This year, we participated in four translation subtasks at WAT 2017. Our model structure is quite simple, but we used it with well-tuned hyper-parameters, leading to a significant improvement compared to the previous state-of-the-art system. We also tried to make use of the unreliable part of the provided parallel corpus by back-translating it to create a synthetic corpus. Our submitted system achieved new state-of-the-art performance in terms of the BLEU score, as well as in human evaluation.
Tasks Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5706/
PDF https://www.aclweb.org/anthology/W17-5706
PWC https://paperswithcode.com/paper/ntt-neural-machine-translation-systems-at-wat
Repo
Framework

Online to Offline Conversions, Universality and Adaptive Minibatch Sizes

Title Online to Offline Conversions, Universality and Adaptive Minibatch Sizes
Authors Kfir Levy
Abstract We present an approach towards convex optimization that relies on a novel scheme which converts adaptive online algorithms into offline methods. In the offline optimization setting, our derived methods are shown to obtain favourable adaptive guarantees which depend on the harmonic sum of the queried gradients. We further show that our methods implicitly adapt to the objective's structure: in the smooth case fast convergence rates are ensured without any prior knowledge of the smoothness parameter, while still maintaining guarantees in the non-smooth setting. Our approach has a natural extension to the stochastic setting, resulting in a lazy version of SGD (stochastic GD), where minibatches are chosen adaptively depending on the magnitude of the gradients. This provides a principled approach towards choosing minibatch sizes.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6759-online-to-offline-conversions-universality-and-adaptive-minibatch-sizes
PDF http://papers.nips.cc/paper/6759-online-to-offline-conversions-universality-and-adaptive-minibatch-sizes.pdf
PWC https://paperswithcode.com/paper/online-to-offline-conversions-universality-1
Repo
Framework