Paper Group NANR 33
Infusing NLU into Automatic Question Generation
Title | Infusing NLU into Automatic Question Generation |
Authors | Karen Mazidi, Paul Tarau |
Abstract | |
Tasks | Question Generation, Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-6609/ |
https://www.aclweb.org/anthology/W16-6609 | |
PWC | https://paperswithcode.com/paper/infusing-nlu-into-automatic-question |
Repo | |
Framework | |
PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification
Title | PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification |
Authors | Dominique Brunato, Andrea Cimino, Felice Dell'Orletta, Giulia Venturi |
Abstract | |
Tasks | Dependency Parsing, Domain Adaptation, Machine Translation, Natural Language Inference, Question Answering, Text Simplification |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1034/ |
https://www.aclweb.org/anthology/D16-1034 | |
PWC | https://paperswithcode.com/paper/paccss-it-a-parallel-corpus-of-complex-simple |
Repo | |
Framework | |
Sheffield Systems for the English-Romanian WMT Translation Task
Title | Sheffield Systems for the English-Romanian WMT Translation Task |
Authors | Frédéric Blain, Xingyi Song, Lucia Specia |
Abstract | |
Tasks | Language Modelling, Machine Translation, Word Alignment |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2307/ |
https://www.aclweb.org/anthology/W16-2307 | |
PWC | https://paperswithcode.com/paper/sheffield-systems-for-the-english-romanian |
Repo | |
Framework | |
The Event and Implied Situation Ontology (ESO): Application and Evaluation
Title | The Event and Implied Situation Ontology (ESO): Application and Evaluation |
Authors | Roxane Segers, Marco Rospocher, Piek Vossen, Egoitz Laparra, German Rigau, Anne-Lyse Minard |
Abstract | This paper presents the Event and Implied Situation Ontology (ESO), a manually constructed resource that formalizes the pre- and post-situations of events and the roles of the entities affected by an event. The ontology is built on top of existing resources such as WordNet, SUMO and FrameNet. The ontology is injected into the Predicate Matrix, a resource that integrates predicate and role information from, among others, FrameNet, VerbNet, PropBank, NomBank and WordNet. We illustrate how these resources are used on large document collections to detect information that would otherwise have remained implicit. The ontology is evaluated on two aspects: first, recall and precision based on a manually annotated corpus, and second, the quality of the knowledge inferred by the situation assertions in the ontology. Evaluation results on the quality of the system show that 50% of the events typed and enriched with ESO assertions are correct. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1233/ |
https://www.aclweb.org/anthology/L16-1233 | |
PWC | https://paperswithcode.com/paper/the-event-and-implied-situation-ontology-eso |
Repo | |
Framework | |
Improving Word Alignment of Rare Words with Word Embeddings
Title | Improving Word Alignment of Rare Words with Word Embeddings |
Authors | Masoud Jalili Sabet, Heshaam Faili, Gholamreza Haffari |
Abstract | We address the problem of inducing word alignment for language pairs by developing an unsupervised model that can also be applied to other generative alignment models. We approach the task by: i) proposing a new alignment model based on IBM alignment model 1 that uses vector representations of words, and ii) examining the use of similar source words to overcome the problem of rare source words and improve the alignments. We apply our method to English-French corpora and run the experiments with different numbers of sentence pairs. Our results show competitive performance against the baseline and in some cases improve the results by up to 6.9% in terms of precision. |
Tasks | Machine Translation, Word Alignment, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1302/ |
https://www.aclweb.org/anthology/C16-1302 | |
PWC | https://paperswithcode.com/paper/improving-word-alignment-of-rare-words-with |
Repo | |
Framework | |
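For orientation, the abstract above builds on IBM alignment model 1. The following is a minimal sketch of plain IBM Model 1 trained with EM, not the authors' embedding-based extension. The names `ibm_model1` and `align` are hypothetical, and the sketch assumes tokenized sentence pairs and omits the NULL source word.

```python
from typing import Dict, List, Tuple

Pair = Tuple[List[str], List[str]]  # (source tokens, target tokens)

def ibm_model1(pairs: List[Pair], iterations: int = 10) -> Dict[Tuple[str, str], float]:
    """Estimate t[(f, e)] = P(target word f | source word e) with EM."""
    tgt_vocab = {f for _, fs in pairs for f in fs}
    # Uniform initialisation over every co-occurring (target, source) pair.
    t = {(f, e): 1.0 / len(tgt_vocab) for es, fs in pairs for f in fs for e in es}
    for _ in range(iterations):
        count: Dict[Tuple[str, str], float] = {}
        total: Dict[str, float] = {}
        # E-step: collect expected alignment counts.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / norm
                    count[(f, e)] = count.get((f, e), 0.0) + delta
                    total[e] = total.get(e, 0.0) + delta
        # M-step: renormalise the counts per source word.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t

def align(es: List[str], fs: List[str], t: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Link each target word to its most probable source word."""
    return [(f, max(es, key=lambda e: t.get((f, e), 0.0))) for f in fs]
```

Calling `ibm_model1` on a handful of sentence pairs and then `align` on one of them yields one link per target word; rare source words are exactly where these count-based estimates become unreliable, which is the problem the paper targets.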
AfriBooms: An Online Treebank for Afrikaans
Title | AfriBooms: An Online Treebank for Afrikaans |
Authors | Liesbeth Augustinus, Peter Dirix, Daniel van Niekerk, Ineke Schuurman, Vincent Vandeghinste, Frank Van Eynde, Gerhard van Huyssteen |
Abstract | Compared to well-resourced languages such as English and Dutch, natural language processing (NLP) tools for Afrikaans are still not abundant. In the context of the AfriBooms project, KU Leuven and the North-West University collaborated to develop a first, small treebank, a dependency parser, and an easy-to-use online linguistic search engine for Afrikaans for use by researchers and students in the humanities and social sciences. The search tool is based on a similar development for Dutch, i.e. GrETEL, a user-friendly search engine which allows users to query a treebank by means of a natural language example instead of a formal search instruction. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1107/ |
https://www.aclweb.org/anthology/L16-1107 | |
PWC | https://paperswithcode.com/paper/afribooms-an-online-treebank-for-afrikaans |
Repo | |
Framework | |
Generating Task-Pertinent sorted Error Lists for Speech Recognition
Title | Generating Task-Pertinent sorted Error Lists for Speech Recognition |
Authors | Olivier Galibert, Mohamed Ameur Ben Jannet, Juliette Kahn, Sophie Rosset |
Abstract | Automatic speech recognition (ASR) is one of the most widely used components in spoken language processing applications. ASR errors are of varying importance with respect to the application, making error analysis key to improving speech processing applications. Knowing the most serious errors for the application at hand is critical to building better systems. When ASR is used as a first step towards Named Entity Recognition (NER) in speech, error seriousness is usually determined by frequency, because WER is the metric used to evaluate the ASR output, despite the emergence of more relevant measures in the literature. We propose to use a different evaluation metric from the literature in order to classify ASR errors according to their seriousness for NER. Our results show that the importance of ASR errors is ranked differently depending on the evaluation metric used. A more detailed analysis shows that the estimate of error impact given by the ATENE metric is better suited to the NER task than the estimate based only on WER, the most widely used frequency-based metric. |
Tasks | Named Entity Recognition, Speech Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1297/ |
https://www.aclweb.org/anthology/L16-1297 | |
PWC | https://paperswithcode.com/paper/generating-task-pertinent-sorted-error-lists |
Repo | |
Framework | |
Online Pricing with Strategic and Patient Buyers
Title | Online Pricing with Strategic and Patient Buyers |
Authors | Michal Feldman, Tomer Koren, Roi Livni, Yishay Mansour, Aviv Zohar |
Abstract | We consider a seller with an unlimited supply of a single good, who is faced with a stream of $T$ buyers. Each buyer has a window of time in which she would like to purchase, and would buy at the lowest price in that window, provided that this price is lower than her private value (and otherwise, would not buy at all). In this setting, we give an algorithm that attains $O(T^{2/3})$ regret over any sequence of $T$ buyers with respect to the best fixed price in hindsight, and prove that no algorithm can perform better in the worst case. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6415-online-pricing-with-strategic-and-patient-buyers |
http://papers.nips.cc/paper/6415-online-pricing-with-strategic-and-patient-buyers.pdf | |
PWC | https://paperswithcode.com/paper/online-pricing-with-strategic-and-patient |
Repo | |
Framework | |
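The buyer model in the abstract above can be made concrete with a small simulation. This is a minimal sketch, not the paper's algorithm: it only computes the seller's revenue and the regret against the best fixed price. The `Buyer` tuple layout, the function names, and the finite grid of candidate prices are assumptions made for illustration, as is the strict "lower than" comparison, which follows the abstract's wording.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: (arrival day, patience in extra days, private value).
Buyer = Tuple[int, int, float]

def revenue(prices: Sequence[float], buyers: Sequence[Buyer]) -> float:
    """Each buyer pays the lowest price in her window if it is below her value."""
    total = 0.0
    for arrival, patience, value in buyers:
        window = prices[arrival : arrival + patience + 1]
        if not window:
            continue  # buyer arrives after the last posted price
        lowest = min(window)
        if lowest < value:
            total += lowest
    return total

def regret(prices: Sequence[float], buyers: Sequence[Buyer],
           candidates: Sequence[float]) -> float:
    """Revenue of the best fixed candidate price minus revenue of `prices`."""
    horizon = len(prices)
    best_fixed = max(revenue([p] * horizon, buyers) for p in candidates)
    return best_fixed - revenue(prices, buyers)
```

Because patient buyers wait for the lowest price in their window, a temporary price cut also lowers what earlier arrivals pay, which is what separates this setting from pricing against buyers who decide immediately.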
Towards proper name generation: a corpus analysis
Title | Towards proper name generation: a corpus analysis |
Authors | Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer |
Abstract | |
Tasks | Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-6636/ |
https://www.aclweb.org/anthology/W16-6636 | |
PWC | https://paperswithcode.com/paper/towards-proper-name-generation-a-corpus |
Repo | |
Framework | |
Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks
Title | Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks |
Authors | Bo Zheng, Wanxiang Che, Jiang Guo, Ting Liu |
Abstract | Grammatical error diagnosis is an important task in natural language processing. This paper introduces our Chinese Grammatical Error Diagnosis (CGED) system in the NLP-TEA-3 shared task for CGED. The CGED system can diagnose four types of grammatical errors: redundant words (R), missing words (M), bad word selection (S) and disordered words (W). We treat the CGED task as a sequence labeling task and describe three models: a CRF-based model, an LSTM-based model and an ensemble model using stacking. We also show in detail how we build and train the models. Evaluation covers three levels: detection, identification and position. On the CGED-HSK dataset of the NLP-TEA-3 shared task, our system achieves the best F1 scores at all three levels and also the best recall at the last two levels. |
Tasks | Information Retrieval, Language Modelling, Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4907/ |
https://www.aclweb.org/anthology/W16-4907 | |
PWC | https://paperswithcode.com/paper/chinese-grammatical-error-diagnosis-with-long |
Repo | |
Framework | |
A Redundancy-Aware Sentence Regression Framework for Extractive Summarization
Title | A Redundancy-Aware Sentence Regression Framework for Extractive Summarization |
Authors | Pengjie Ren, Furu Wei, Zhumin Chen, Jun Ma, Ming Zhou |
Abstract | Existing sentence regression methods for extractive summarization usually model sentence importance and redundancy in two separate processes. They first evaluate the importance $f(s)$ of each sentence s and then select sentences to generate a summary based on both the importance scores and redundancy among sentences. In this paper, we propose to model importance and redundancy simultaneously by directly evaluating the relative importance $f(s \mid S)$ of a sentence s given a set of selected sentences S. Specifically, we present a new framework to conduct regression with respect to the relative gain of s given S calculated by the ROUGE metric. Besides the single sentence features, additional features derived from the sentence relations are incorporated. Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that the proposed method outperforms state-of-the-art extractive summarization approaches. |
Tasks | Document Summarization, Multi-Document Summarization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1004/ |
https://www.aclweb.org/anthology/C16-1004 | |
PWC | https://paperswithcode.com/paper/a-redundancy-aware-sentence-regression |
Repo | |
Framework | |
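The idea of scoring a sentence s relative to an already selected set S, as described in the abstract above, can be illustrated with a greedy selector. This is a minimal sketch under strong assumptions: the gain is computed directly against a reference summary with a toy unigram-recall score standing in for ROUGE, whereas the paper trains a regression model to predict that gain from features. All function names here are hypothetical.

```python
from typing import List, Set

def unigram_recall(summary: Set[str], reference: Set[str]) -> float:
    """Toy stand-in for ROUGE: share of reference word types that are covered."""
    if not reference:
        return 0.0
    return len(summary & reference) / len(reference)

def relative_gain(sentence: str, selected: List[str], reference: str) -> float:
    """Score improvement from adding `sentence` to the already selected set."""
    ref = set(reference.lower().split())
    covered = set(" ".join(selected).lower().split())
    extended = covered | set(sentence.lower().split())
    return unigram_recall(extended, ref) - unigram_recall(covered, ref)

def greedy_select(sentences: List[str], reference: str, budget: int) -> List[str]:
    """Repeatedly add the sentence with the largest positive relative gain."""
    selected: List[str] = []
    remaining = list(sentences)
    while remaining and len(selected) < budget:
        best = max(remaining, key=lambda s: relative_gain(s, selected, reference))
        if relative_gain(best, selected, reference) <= 0:
            break  # every remaining sentence is redundant
        selected.append(best)
        remaining.remove(best)
    return selected
```

A sentence that scores well on its own but repeats words already covered by the selected set receives a gain of zero, so importance and redundancy are handled by a single score rather than in two separate steps.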
Towards a Multi-dimensional Taxonomy of Stories in Dialogue
Title | Towards a Multi-dimensional Taxonomy of Stories in Dialogue |
Authors | Kathryn J. Collins, David Traum |
Abstract | In this paper, we present a taxonomy of stories told in dialogue. We based our scheme on prior work analyzing narrative structure and method of telling, relation to storyteller identity, as well as some categories particular to dialogue, such as how the story gets introduced. Our taxonomy currently has 5 major dimensions, most of which have sub-dimensions; each dimension has an associated set of dimension-specific labels. We adapted an annotation tool for this taxonomy and have annotated portions of two different dialogue corpora, Switchboard and the Distress Analysis Interview Corpus. We present examples of some of the tags and concepts with stories from Switchboard, and some initial statistics on the frequencies of the tags. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1018/ |
https://www.aclweb.org/anthology/L16-1018 | |
PWC | https://paperswithcode.com/paper/towards-a-multi-dimensional-taxonomy-of |
Repo | |
Framework | |
Multi-armed Bandits: Competing with Optimal Sequences
Title | Multi-armed Bandits: Competing with Optimal Sequences |
Authors | Zohar S. Karnin, Oren Anava |
Abstract | We consider the sequential decision-making problem in the adversarial setting, where regret is measured with respect to the optimal sequence of actions and the feedback adheres to the bandit setting. It is well known that obtaining sublinear regret in this setting is impossible in general, which raises the question: when can we do better than linear regret? Previous works show that when the environment is guaranteed to vary slowly and furthermore we are given prior knowledge regarding its variation (i.e., a limit on the amount of change suffered by the environment), then this task is feasible. The caveat, however, is that such prior knowledge is unlikely to be available in practice, which makes the obtained regret bounds somewhat irrelevant. Our main result is a regret guarantee that scales with the variation parameter of the environment, without requiring any prior knowledge about it whatsoever. With this, we also resolve an open problem posed by [Gur, Zeevi and Besbes, NIPS '14]. A key component of our result is a statistical test for identifying non-stationarity in a sequence of independent random variables. This test either identifies non-stationarity or upper-bounds the absolute deviation of the corresponding sequence of mean values in terms of its total variation. This test is interesting in its own right and may prove useful in additional settings. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6341-multi-armed-bandits-competing-with-optimal-sequences |
http://papers.nips.cc/paper/6341-multi-armed-bandits-competing-with-optimal-sequences.pdf | |
PWC | https://paperswithcode.com/paper/multi-armed-bandits-competing-with-optimal |
Repo | |
Framework | |
The DialogBank
Title | The DialogBank |
Authors | Harry Bunt, Volha Petukhova, Andrei Malchanau, Kars Wijnhoven, Alex Fang |
Abstract | This paper presents the DialogBank, a new language resource consisting of dialogues with gold standard annotations according to the ISO 24617-2 standard. Some of these dialogues have been taken from existing corpora and have been re-annotated according to the ISO standard; others have been annotated directly according to the standard. The ISO 24617-2 annotations have been designed according to the ISO principles for semantic annotation, as formulated in ISO 24617-6. The DialogBank makes use of three alternative representation formats, which are shown to be interoperable. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1503/ |
https://www.aclweb.org/anthology/L16-1503 | |
PWC | https://paperswithcode.com/paper/the-dialogbank |
Repo | |
Framework | |
Question Answering with Knowledge Base, Web and Beyond
Title | Question Answering with Knowledge Base, Web and Beyond |
Authors | Wen-tau Yih, Hao Ma |
Abstract | |
Tasks | Question Answering, Text Matching |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-4003/ |
https://www.aclweb.org/anthology/N16-4003 | |
PWC | https://paperswithcode.com/paper/question-answering-with-knowledge-base-web |
Repo | |
Framework | |