July 26, 2019

Paper Group NANR 119

On orthogonality and learning RNNs with long term dependencies

Title On orthogonality and learning RNNs with long term dependencies
Authors Eugene Vorontsov, Chiheb Trabelsi, Samuel Kadoury, Chris Pal
Abstract It is well known that it is challenging to train deep neural networks and recurrent neural networks for tasks that exhibit long term dependencies. The vanishing or exploding gradient problem is a well known issue associated with these challenges. One approach to addressing vanishing and exploding gradients is to use either soft or hard constraints on weight matrices so as to encourage or enforce orthogonality. Orthogonal matrices preserve gradient norm during backpropagation and may therefore be a desirable property. This paper explores issues with optimization convergence, speed and gradient stability when encouraging or enforcing orthogonality. To perform this analysis, we propose a weight matrix factorization and parameterization strategy through which we can bound matrix norms and therein control the degree of expansivity induced during backpropagation. We find that hard constraints on orthogonality can negatively affect the speed of convergence and model performance.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=740
PDF http://proceedings.mlr.press/v70/vorontsov17a/vorontsov17a.pdf
PWC https://paperswithcode.com/paper/on-orthogonality-and-learning-rnns-with-long
Repo
Framework
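
The abstract's claim that orthogonal matrices preserve gradient norm can be checked numerically. The sketch below is an illustration only, not the paper's factorization or parameterization: it assumes a 2x2 rotation as the orthogonal matrix, and it assumes the common soft penalty, the squared Frobenius norm of W^T W - I, as the quantity a soft constraint would drive toward zero. All function names are hypothetical.

```python
import math

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

def orthogonality_penalty(W):
    """Squared Frobenius norm of W^T W - I (assumed form of a soft constraint)."""
    rows, cols = len(W), len(W[0])
    total = 0.0
    for i in range(cols):
        for j in range(cols):
            gram_ij = sum(W[k][i] * W[k][j] for k in range(rows))
            target = 1.0 if i == j else 0.0
            total += (gram_ij - target) ** 2
    return total

# A rotation is orthogonal; scaling it by 1.5 makes it expansive.
theta = 0.3
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
scaled = [[1.5 * x for x in row] for row in rotation]

g = [3.0, 4.0]  # stand-in for a backpropagated gradient, norm 5
print(norm(matvec(rotation, g)), orthogonality_penalty(rotation))
print(norm(matvec(scaled, g)), orthogonality_penalty(scaled))
```

The rotation leaves the norm of `g` unchanged and has a penalty of zero up to rounding, while the scaled matrix stretches `g` by a factor of 1.5 at every application, which is the expansivity that repeated multiplication in an RNN would compound.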

Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Title Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages
Authors
Abstract
Tasks
Published 2017-03-01
URL https://www.aclweb.org/anthology/W17-0100/
PDF https://www.aclweb.org/anthology/W17-0100
PWC https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-the-use-of
Repo
Framework

Document retrieval and question answering in medical documents. A large-scale corpus challenge.

Title Document retrieval and question answering in medical documents. A large-scale corpus challenge.
Authors Curea Eric
Abstract Whenever employed on large datasets, information retrieval works by isolating a subset of documents from the larger dataset and then proceeding with low-level processing of the text. This is usually carried out by means of adding index-terms to each document in the collection. In this paper we deal with automatic document classification and index-term detection applied on large-scale medical corpora. In our methodology we employ a linear classifier and we test our results on the BioASQ training corpus, which is a collection of 12 million MeSH-indexed medical abstracts. We cover term-indexing, result retrieval and result ranking based on distributed word representations.
Tasks Document Classification, Information Retrieval, Question Answering
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-8001/
PDF https://doi.org/10.26615/978-954-452-044-1_001
PWC https://paperswithcode.com/paper/document-retrieval-and-question-answering-in
Repo
Framework

Rule-based Machine translation from English to Finnish

Title Rule-based Machine translation from English to Finnish
Authors Arvi Hurskainen, Jörg Tiedemann
Abstract
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4731/
PDF https://www.aclweb.org/anthology/W17-4731
PWC https://paperswithcode.com/paper/rule-based-machine-translation-from-english
Repo
Framework

EUDAMU at SemEval-2017 Task 11: Action Ranking and Type Matching for End-User Development

Title EUDAMU at SemEval-2017 Task 11: Action Ranking and Type Matching for End-User Development
Authors Marek Kubis, Paweł Skórzewski, Tomasz Ziętkiewicz
Abstract The paper describes a system for end-user development using natural language. Our approach uses a ranking model to identify the actions to be executed followed by reference and parameter matching models to select parameter values that should be set for the given commands. We discuss the results of evaluation and possible improvements for future work.
Tasks Action Detection, Tokenization
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2175/
PDF https://www.aclweb.org/anthology/S17-2175
PWC https://paperswithcode.com/paper/eudamu-at-semeval-2017-task-11-action-ranking
Repo
Framework

The Effect of Translationese on Tuning for Statistical Machine Translation

Title The Effect of Translationese on Tuning for Statistical Machine Translation
Authors Sara Stymne
Abstract
Tasks Language Modelling, Machine Translation, Text Classification
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0230/
PDF https://www.aclweb.org/anthology/W17-0230
PWC https://paperswithcode.com/paper/the-effect-of-translationese-on-tuning-for
Repo
Framework

Evaluating Feature Extraction Methods for Knowledge-based Biomedical Word Sense Disambiguation

Title Evaluating Feature Extraction Methods for Knowledge-based Biomedical Word Sense Disambiguation
Authors Sam Henry, Clint Cuffy, Bridget McInnes
Abstract In this paper, we present an analysis of feature extraction methods via dimensionality reduction for the task of biomedical Word Sense Disambiguation (WSD). We modify the vector representations in the 2-MRD WSD algorithm, and evaluate four dimensionality reduction methods: Word Embeddings using Continuous Bag of Words and Skip Gram, Singular Value Decomposition (SVD), and Principal Component Analysis (PCA). We also evaluate the effects of vector size on the performance of each of these methods. Results are evaluated on five standard evaluation datasets (Abbrev.100, Abbrev.200, Abbrev.300, NLM-WSD, and MSH-WSD). We find that vector sizes of 100 are sufficient for all techniques except SVD, for which a vector size of 1500 is preferred. We also show that SVD performs on par with Word Embeddings for all but one dataset.
Tasks Dimensionality Reduction, Information Retrieval, Question Answering, Word Embeddings, Word Sense Disambiguation
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2334/
PDF https://www.aclweb.org/anthology/W17-2334
PWC https://paperswithcode.com/paper/evaluating-feature-extraction-methods-for
Repo
Framework

Chinese Answer Extraction Based on POS Tree and Genetic Algorithm

Title Chinese Answer Extraction Based on POS Tree and Genetic Algorithm
Authors Shuihua Li, Xiaoming Zhang, Zhoujun Li
Abstract Answer extraction is the most important part of a Chinese web-based question answering system. In order to enhance the robustness and adaptability of answer extraction to new domains and eliminate the influence of the incomplete and noisy search snippets, we propose two new answer extraction methods. We utilize text patterns to generate Part-of-Speech (POS) patterns. In addition, a method is proposed to construct a POS tree by using these POS patterns. The POS tree is useful to candidate answer extraction of web-based question answering. To retrieve an efficient POS tree, the similarities between questions are used to select the question-answer pairs whose questions are similar to the unanswered question. Then, the POS tree is improved based on these question-answer pairs. In order to rank these candidate answers, the weights of the leaf nodes of the POS tree are calculated using a heuristic method. Moreover, the Genetic Algorithm (GA) is used to train the weights. The experimental results of 10-fold cross-validation show that the weighted POS tree trained by GA can improve the accuracy of answer extraction.
Tasks Information Retrieval, Question Answering
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-6004/
PDF https://www.aclweb.org/anthology/W17-6004
PWC https://paperswithcode.com/paper/chinese-answer-extraction-based-on-pos-tree
Repo
Framework

Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications

Title Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications
Authors
Abstract
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1900/
PDF https://www.aclweb.org/anthology/W17-1900
PWC https://paperswithcode.com/paper/proceedings-of-the-1st-workshop-on-sense
Repo
Framework

Arabic Diacritization: Stats, Rules, and Hacks

Title Arabic Diacritization: Stats, Rules, and Hacks
Authors Kareem Darwish, Hamdy Mubarak, Ahmed Abdelali
Abstract In this paper, we present a new and fast state-of-the-art Arabic diacritizer that guesses the diacritics of words and then their case endings. We employ a Viterbi decoder at word-level with back-off to stem, morphological patterns, and transliteration and sequence labeling based diacritization of named entities. For case endings, we use Support Vector Machine (SVM) based ranking coupled with morphological patterns and linguistic rules to properly guess case endings. We achieve a low word level diacritization error of 3.29% and 12.77% without and with case endings respectively on a new multi-genre free of copyright test set. We are making the diacritizer available for free for research purposes.
Tasks Part-Of-Speech Tagging, Transliteration, Word Sense Disambiguation
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1302/
PDF https://www.aclweb.org/anthology/W17-1302
PWC https://paperswithcode.com/paper/arabic-diacritization-stats-rules-and-hacks
Repo
Framework
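
The abstract above mentions a word-level Viterbi decoder. As a rough sketch of the general technique only, the following is a generic Viterbi decoder over a toy hidden Markov model. It does not reproduce the paper's back-off to stems, morphological patterns, or named-entity handling, and the states, observations, and probabilities are all made up for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(obs, 0.0), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path

# Hypothetical toy model: two hidden labels, two surface symbols.
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

print(viterbi(["x", "y", "y"], states, start_p, trans_p, emit_p))
```

In a diacritizer the hidden states would presumably be candidate diacritized forms and the observations the undiacritized words, but that mapping is an assumption here and not something the abstract spells out.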

Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Title Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
Authors Ruslan Mitkov, Galia Angelova
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/papers/R17-1000/r17-1000
PDF https://www.aclweb.org/anthology/R17-1000
PWC https://paperswithcode.com/paper/proceedings-of-the-international-conference
Repo
Framework

JU CSE NLP @ SemEval 2017 Task 7: Employing Rules to Detect and Interpret English Puns

Title JU CSE NLP @ SemEval 2017 Task 7: Employing Rules to Detect and Interpret English Puns
Authors Aniket Pramanick, Dipankar Das
Abstract System description. Implementation of HMM and Cyclic Dependency Network.
Tasks Word Sense Disambiguation
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2073/
PDF https://www.aclweb.org/anthology/S17-2073
PWC https://paperswithcode.com/paper/ju-cse-nlp-semeval-2017-task-7-employing
Repo
Framework

Adapting Pre-trained Word Embeddings For Use In Medical Coding

Title Adapting Pre-trained Word Embeddings For Use In Medical Coding
Authors Kevin Patel, Divya Patel, Mansi Golakiya, Pushpak Bhattacharyya, Nilesh Birari
Abstract Word embeddings are a crucial component in modern NLP. Pre-trained embeddings released by different groups have been a major reason for their popularity. However, they are trained on generic corpora, which limits their direct use for domain specific tasks. In this paper, we propose a method to add task specific information to pre-trained word embeddings. Such information can improve their utility. We add information from medical coding data, as well as the first level from the hierarchy of ICD-10 medical code set to different pre-trained word embeddings. We adapt CBOW algorithm from the word2vec package for our purpose. We evaluated our approach on five different pre-trained word embeddings. Both the original word embeddings, and their modified versions (the ones with added information) were used for automated review of medical coding. The modified word embeddings give an improvement in f-score by 1% on the 5-fold evaluation on a private medical claims dataset. Our results show that adding extra information is possible and beneficial for the task at hand.
Tasks Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2338/
PDF https://www.aclweb.org/anthology/W17-2338
PWC https://paperswithcode.com/paper/adapting-pre-trained-word-embeddings-for-use
Repo
Framework

Random Permutation Online Isotonic Regression

Title Random Permutation Online Isotonic Regression
Authors Wojciech Kotlowski, Wouter M. Koolen, Alan Malek
Abstract We revisit isotonic regression on linear orders, the problem of fitting monotonic functions to best explain the data, in an online setting. It was previously shown that online isotonic regression is unlearnable in a fully adversarial model, which led to its study in the fixed design model. Here, we instead develop the more practical random permutation model. We show that the regret is bounded above by the excess leave-one-out loss for which we develop efficient algorithms and matching lower bounds. We also analyze the class of simple and popular forward algorithms and recommend where to look for algorithms for online isotonic regression on partial orders.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7006-random-permutation-online-isotonic-regression
PDF http://papers.nips.cc/paper/7006-random-permutation-online-isotonic-regression.pdf
PWC https://paperswithcode.com/paper/random-permutation-online-isotonic-regression
Repo
Framework
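
For readers unfamiliar with isotonic regression on a linear order, the sketch below shows the standard offline (batch) least-squares fit using the pool-adjacent-violators procedure. It is not the paper's online or random-permutation algorithm; it only illustrates what "fitting monotonic functions" means. The function name is hypothetical.

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit to y via pool-adjacent-violators."""
    blocks = []  # each block is [sum of values, count of values]
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    # Expand each block back to one fitted value per input point.
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

print(isotonic_fit([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]
```

The violating pair (3.0, 2.0) is pooled into its mean, which is the smallest change in squared error that restores monotonicity.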

Assessing the performance of Olelo, a real-time biomedical question answering application

Title Assessing the performance of Olelo, a real-time biomedical question answering application
Authors Mariana Neves, Fabian Eckert, Hendrik Folkerts, Matthias Uflacker
Abstract Question answering (QA) can support physicians and biomedical researchers to find answers to their questions in the scientific literature. Such systems process large collections of documents in real time and include many natural language processing (NLP) procedures. We recently developed Olelo, a QA system for biomedicine which includes various NLP components, such as question processing, document and passage retrieval, answer processing and multi-document summarization. In this work, we present an evaluation of our system on the fifth BioASQ challenge. We participated with the current state of the application and with an extension based on semantic role labeling that we are currently investigating. In addition to the BioASQ evaluation, we compared our system to other on-line biomedical QA systems in terms of the response time and the quality of the answers.
Tasks Document Summarization, Information Retrieval, Multi-Document Summarization, Question Answering, Semantic Role Labeling
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2344/
PDF https://www.aclweb.org/anthology/W17-2344
PWC https://paperswithcode.com/paper/assessing-the-performance-of-olelo-a-real
Repo
Framework