May 5, 2019

Paper Group NANR 33

Infusing NLU into Automatic Question Generation

Title Infusing NLU into Automatic Question Generation
Authors Karen Mazidi, Paul Tarau
Abstract
Tasks Question Generation, Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-6609/
PDF https://www.aclweb.org/anthology/W16-6609
PWC https://paperswithcode.com/paper/infusing-nlu-into-automatic-question
Repo
Framework

PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification

Title PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification
Authors Dominique Brunato, Andrea Cimino, Felice Dell'Orletta, Giulia Venturi
Abstract
Tasks Dependency Parsing, Domain Adaptation, Machine Translation, Natural Language Inference, Question Answering, Text Simplification
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1034/
PDF https://www.aclweb.org/anthology/D16-1034
PWC https://paperswithcode.com/paper/paccss-it-a-parallel-corpus-of-complex-simple
Repo
Framework

Sheffield Systems for the English-Romanian WMT Translation Task

Title Sheffield Systems for the English-Romanian WMT Translation Task
Authors Frédéric Blain, Xingyi Song, Lucia Specia
Abstract
Tasks Language Modelling, Machine Translation, Word Alignment
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2307/
PDF https://www.aclweb.org/anthology/W16-2307
PWC https://paperswithcode.com/paper/sheffield-systems-for-the-english-romanian
Repo
Framework

The Event and Implied Situation Ontology (ESO): Application and Evaluation

Title The Event and Implied Situation Ontology (ESO): Application and Evaluation
Authors Roxane Segers, Marco Rospocher, Piek Vossen, Egoitz Laparra, German Rigau, Anne-Lyse Minard
Abstract This paper presents the Event and Implied Situation Ontology (ESO), a manually constructed resource which formalizes the pre and post situations of events and the roles of the entities affected by an event. The ontology is built on top of existing resources such as WordNet, SUMO and FrameNet. The ontology is injected into the Predicate Matrix, a resource that integrates predicate and role information from, amongst others, FrameNet, VerbNet, PropBank, NomBank and WordNet. We illustrate how these resources are used on large document collections to detect information that otherwise would have remained implicit. The ontology is evaluated on two aspects: first, recall and precision based on a manually annotated corpus, and second, the quality of the knowledge inferred by the situation assertions in the ontology. Evaluation results on the quality of the system show that 50% of the events typed and enriched with ESO assertions are correct.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1233/
PDF https://www.aclweb.org/anthology/L16-1233
PWC https://paperswithcode.com/paper/the-event-and-implied-situation-ontology-eso
Repo
Framework

Improving Word Alignment of Rare Words with Word Embeddings

Title Improving Word Alignment of Rare Words with Word Embeddings
Authors Masoud Jalili Sabet, Heshaam Faili, Gholamreza Haffari
Abstract We address the problem of inducing word alignment for language pairs by developing an unsupervised model that can also be applied to other generative alignment models. We approach the task by: i) proposing a new alignment model based on IBM alignment model 1 that uses vector representations of words, and ii) examining the use of similar source words to overcome the problem of rare source words and improve the alignments. We apply our method to English-French corpora and run the experiments with different numbers of sentence pairs. Our results show competitive performance against the baseline and in some cases improve the results by up to 6.9% in terms of precision.
Tasks Machine Translation, Word Alignment, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1302/
PDF https://www.aclweb.org/anthology/C16-1302
PWC https://paperswithcode.com/paper/improving-word-alignment-of-rare-words-with
Repo
Framework

AfriBooms: An Online Treebank for Afrikaans

Title AfriBooms: An Online Treebank for Afrikaans
Authors Liesbeth Augustinus, Peter Dirix, Daniel van Niekerk, Ineke Schuurman, Vincent Vandeghinste, Frank Van Eynde, Gerhard van Huyssteen
Abstract Compared to well-resourced languages such as English and Dutch, natural language processing (NLP) tools for Afrikaans are still not abundant. In the context of the AfriBooms project, KU Leuven and the North-West University collaborated to develop a first, small treebank, a dependency parser, and an easy to use online linguistic search engine for Afrikaans for use by researchers and students in the humanities and social sciences. The search tool is based on a similar development for Dutch, i.e. GrETEL, a user-friendly search engine which allows users to query a treebank by means of a natural language example instead of a formal search instruction.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1107/
PDF https://www.aclweb.org/anthology/L16-1107
PWC https://paperswithcode.com/paper/afribooms-an-online-treebank-for-afrikaans
Repo
Framework

Generating Task-Pertinent sorted Error Lists for Speech Recognition

Title Generating Task-Pertinent sorted Error Lists for Speech Recognition
Authors Olivier Galibert, Mohamed Ameur Ben Jannet, Juliette Kahn, Sophie Rosset
Abstract Automatic Speech Recognition (ASR) is one of the most widely used components in spoken language processing applications. ASR errors are of varying importance with respect to the application, making error analysis key to improving speech processing applications. Knowing the most serious errors for the applicative case is critical to building better systems. In the context of ASR used as a first step towards Named Entity Recognition (NER) in speech, error seriousness is usually determined by frequency, due to the use of the WER as the metric to evaluate the ASR output, despite the emergence of more relevant measures in the literature. We propose to use a different evaluation metric from the literature in order to classify ASR errors according to their seriousness for NER. Our results show that the importance of ASR errors is ranked differently depending on the evaluation metric used. A more detailed analysis shows that the estimation of the error impact given by the ATENE metric is better suited to the NER task than the estimation based only on the most used frequency metric, WER.
Tasks Named Entity Recognition, Speech Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1297/
PDF https://www.aclweb.org/anthology/L16-1297
PWC https://paperswithcode.com/paper/generating-task-pertinent-sorted-error-lists
Repo
Framework

Online Pricing with Strategic and Patient Buyers

Title Online Pricing with Strategic and Patient Buyers
Authors Michal Feldman, Tomer Koren, Roi Livni, Yishay Mansour, Aviv Zohar
Abstract We consider a seller with an unlimited supply of a single good, who is faced with a stream of $T$ buyers. Each buyer has a window of time in which she would like to purchase, and would buy at the lowest price in that window, provided that this price is lower than her private value (and otherwise, would not buy at all). In this setting, we give an algorithm that attains $O(T^{2/3})$ regret over any sequence of $T$ buyers with respect to the best fixed price in hindsight, and prove that no algorithm can perform better in the worst case.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6415-online-pricing-with-strategic-and-patient-buyers
PDF http://papers.nips.cc/paper/6415-online-pricing-with-strategic-and-patient-buyers.pdf
PWC https://paperswithcode.com/paper/online-pricing-with-strategic-and-patient
Repo
Framework
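
The buyer behaviour described in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the `Buyer` fields, the `revenue` and `best_fixed_price` helpers, and the finite grid of candidate prices are all hypothetical names introduced here, and each buyer is assumed to see every price in her window.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Buyer:
    arrival: int   # first day (0-indexed) of the purchase window (hypothetical field)
    patience: int  # number of days in the window (hypothetical field)
    value: float   # private value


def revenue(prices: Sequence[float], buyers: List[Buyer]) -> float:
    """Seller revenue when each buyer pays the lowest price in her window,
    provided that price is strictly lower than her private value."""
    total = 0.0
    for b in buyers:
        window = prices[b.arrival : b.arrival + b.patience]
        if not window:
            continue  # the window falls outside the price sequence
        lowest = min(window)
        if lowest < b.value:
            total += lowest
    return total


def best_fixed_price(candidates: Sequence[float], buyers: List[Buyer], horizon: int) -> float:
    """Best single price in hindsight, searched over an assumed finite grid."""
    return max(candidates, key=lambda p: revenue([p] * horizon, buyers))


prices = [0.8, 0.5, 0.9]
buyers = [Buyer(0, 2, 0.6), Buyer(2, 1, 0.7), Buyer(0, 3, 1.0)]
p_star = best_fixed_price([0.25, 0.5, 0.75], buyers, horizon=len(prices))
regret = revenue([p_star] * len(prices), buyers) - revenue(prices, buyers)
```

Here `regret` is the gap between the best fixed price in hindsight and the posted price sequence, which is the quantity the abstract bounds by $O(T^{2/3})$.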

Towards proper name generation: a corpus analysis

Title Towards proper name generation: a corpus analysis
Authors Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer
Abstract
Tasks Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-6636/
PDF https://www.aclweb.org/anthology/W16-6636
PWC https://paperswithcode.com/paper/towards-proper-name-generation-a-corpus
Repo
Framework

Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks

Title Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks
Authors Bo Zheng, Wanxiang Che, Jiang Guo, Ting Liu
Abstract Grammatical error diagnosis is an important task in natural language processing. This paper introduces our Chinese Grammatical Error Diagnosis (CGED) system in the NLP-TEA-3 shared task for CGED. The CGED system can diagnose four types of grammatical errors: redundant words (R), missing words (M), bad word selection (S) and disordered words (W). We treat the CGED task as a sequence labeling task and describe three models: a CRF-based model, an LSTM-based model and an ensemble model using stacking. We also show in detail how we build and train the models. Evaluation is carried out at three levels: detection level, identification level and position level. On the CGED-HSK dataset of the NLP-TEA-3 shared task, our system achieves the best F1-scores at all three levels and also the best recall at the last two levels.
Tasks Information Retrieval, Language Modelling, Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4907/
PDF https://www.aclweb.org/anthology/W16-4907
PWC https://paperswithcode.com/paper/chinese-grammatical-error-diagnosis-with-long
Repo
Framework

A Redundancy-Aware Sentence Regression Framework for Extractive Summarization

Title A Redundancy-Aware Sentence Regression Framework for Extractive Summarization
Authors Pengjie Ren, Furu Wei, Zhumin Chen, Jun Ma, Ming Zhou
Abstract Existing sentence regression methods for extractive summarization usually model sentence importance and redundancy in two separate processes. They first evaluate the importance f(s) of each sentence s and then select sentences to generate a summary based on both the importance scores and redundancy among sentences. In this paper, we propose to model importance and redundancy simultaneously by directly evaluating the relative importance f(s|S) of a sentence s given a set of selected sentences S. Specifically, we present a new framework to conduct regression with respect to the relative gain of s given S calculated by the ROUGE metric. Besides the single sentence features, additional features derived from the sentence relations are incorporated. Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that the proposed method outperforms state-of-the-art extractive summarization approaches.
Tasks Document Summarization, Multi-Document Summarization
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1004/
PDF https://www.aclweb.org/anthology/C16-1004
PWC https://paperswithcode.com/paper/a-redundancy-aware-sentence-regression
Repo
Framework
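
The idea of scoring a sentence s relative to the already selected set S, as described in the abstract above, can be sketched as a greedy loop. This is a minimal illustration under stated assumptions and not the authors' system: unigram recall against a reference stands in for ROUGE, the gain is computed directly from the reference (the regression target) rather than predicted by a learned model from features, and all function names are hypothetical.

```python
from typing import List


def unigram_recall(selected: List[str], reference: str) -> float:
    """Fraction of distinct reference words covered by the selected sentences
    (a simplified stand-in for ROUGE, assumed for this sketch)."""
    ref = set(reference.lower().split())
    if not ref:
        return 0.0
    covered = set()
    for sentence in selected:
        covered |= set(sentence.lower().split())
    return len(covered & ref) / len(ref)


def relative_gain(s: str, S: List[str], reference: str) -> float:
    """Gain of adding sentence s to the selected set S; redundant sentences get 0."""
    return unigram_recall(S + [s], reference) - unigram_recall(S, reference)


def greedy_select(sentences: List[str], reference: str, k: int) -> List[str]:
    """Repeatedly add the sentence with the largest relative gain, up to k sentences."""
    S: List[str] = []
    remaining = list(sentences)
    while remaining and len(S) < k:
        best = max(remaining, key=lambda s: relative_gain(s, S, reference))
        if relative_gain(best, S, reference) <= 0:
            break  # nothing left adds new coverage
        S.append(best)
        remaining.remove(best)
    return S


sentences = ["the cat sat", "the cat sat down", "dogs bark loudly"]
reference = "the cat sat down while dogs bark"
summary = greedy_select(sentences, reference, k=2)
```

Because the gain is conditioned on S, "the cat sat" scores zero once "the cat sat down" has been selected, so importance and redundancy are handled by one score rather than two separate steps.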

Towards a Multi-dimensional Taxonomy of Stories in Dialogue

Title Towards a Multi-dimensional Taxonomy of Stories in Dialogue
Authors Kathryn J. Collins, David Traum
Abstract In this paper, we present a taxonomy of stories told in dialogue. We based our scheme on prior work analyzing narrative structure and method of telling, relation to storyteller identity, as well as some categories particular to dialogue, such as how the story gets introduced. Our taxonomy currently has 5 major dimensions, with most having sub-dimensions - each dimension has an associated set of dimension-specific labels. We adapted an annotation tool for this taxonomy and have annotated portions of two different dialogue corpora, Switchboard and the Distress Analysis Interview Corpus. We present examples of some of the tags and concepts with stories from Switchboard, and some initial statistics of frequencies of the tags.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1018/
PDF https://www.aclweb.org/anthology/L16-1018
PWC https://paperswithcode.com/paper/towards-a-multi-dimensional-taxonomy-of
Repo
Framework

Multi-armed Bandits: Competing with Optimal Sequences

Title Multi-armed Bandits: Competing with Optimal Sequences
Authors Zohar S. Karnin, Oren Anava
Abstract We consider the sequential decision making problem in the adversarial setting, where regret is measured with respect to the optimal sequence of actions and the feedback adheres to the bandit setting. It is well known that obtaining sublinear regret in this setting is impossible in general, which raises the question of when we can do better than linear regret. Previous works show that when the environment is guaranteed to vary slowly and, furthermore, we are given prior knowledge regarding its variation (i.e., a limit on the amount of change suffered by the environment), then this task is feasible. The caveat, however, is that such prior knowledge is not likely to be available in practice, which makes the obtained regret bounds somewhat irrelevant. Our main result is a regret guarantee that scales with the variation parameter of the environment, without requiring any prior knowledge about it whatsoever. By that, we also resolve an open problem posed by [Gur, Zeevi and Besbes, NIPS '14]. A key component in our result is a statistical test for identifying non-stationarity in a sequence of independent random variables. This test either identifies non-stationarity or upper-bounds the absolute deviation of the corresponding sequence of mean values in terms of its total variation. This test is interesting in its own right and has the potential to be useful in additional settings.
Tasks Decision Making, Multi-Armed Bandits
Published 2016-12-01
URL http://papers.nips.cc/paper/6341-multi-armed-bandits-competing-with-optimal-sequences
PDF http://papers.nips.cc/paper/6341-multi-armed-bandits-competing-with-optimal-sequences.pdf
PWC https://paperswithcode.com/paper/multi-armed-bandits-competing-with-optimal
Repo
Framework

The DialogBank

Title The DialogBank
Authors Harry Bunt, Volha Petukhova, Andrei Malchanau, Kars Wijnhoven, Alex Fang
Abstract This paper presents the DialogBank, a new language resource consisting of dialogues with gold standard annotations according to the ISO 24617-2 standard. Some of these dialogues have been taken from existing corpora and have been re-annotated according to the ISO standard; others have been annotated directly according to the standard. The ISO 24617-2 annotations have been designed according to the ISO principles for semantic annotation, as formulated in ISO 24617-6. The DialogBank makes use of three alternative representation formats, which are shown to be interoperable.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1503/
PDF https://www.aclweb.org/anthology/L16-1503
PWC https://paperswithcode.com/paper/the-dialogbank
Repo
Framework

Question Answering with Knowledge Base, Web and Beyond

Title Question Answering with Knowledge Base, Web and Beyond
Authors Wen-tau Yih, Hao Ma
Abstract
Tasks Question Answering, Text Matching
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-4003/
PDF https://www.aclweb.org/anthology/N16-4003
PWC https://paperswithcode.com/paper/question-answering-with-knowledge-base-web
Repo
Framework