May 5, 2019

1818 words 9 mins read

Paper Group NANR 130

Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters. TTS for Low Resource Languages: A Bangla Synthesizer. Optimal Learning for Multi-pass Stochastic Gradient Methods. A Large Rated Lexicon with French Medical Words. Fast and Flexible Monotonic Functions with Ensembles of Lattices. R …

Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters

Title Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters
Authors Chiaki Miyazaki, Toru Hirano, Ryuichiro Higashinaka, Yoshihiro Matsuo
Abstract
Tasks Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3641/
PDF https://www.aclweb.org/anthology/W16-3641
PWC https://paperswithcode.com/paper/towards-an-entertaining-natural-language
Repo
Framework

TTS for Low Resource Languages: A Bangla Synthesizer

Title TTS for Low Resource Languages: A Bangla Synthesizer
Authors Alexander Gutkin, Linne Ha, Martin Jansche, Knot Pipatsrisawat, Richard Sproat
Abstract We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh. This work is part of an ongoing effort to address the needs of under-resourced languages. We propose a process for streamlining the bootstrapping of TTS systems for under-resourced languages. First, we use crowdsourcing to collect data from multiple ordinary speakers, each speaker recording a small number of sentences. Second, we leverage an existing text normalization system for a related language (Hindi) to bootstrap a linguistic front-end for Bangla. Third, we employ statistical techniques to construct multi-speaker acoustic models using Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) and Hidden Markov Model (HMM) approaches. We then describe experiments showing that the resulting TTS voices score well in terms of their perceived quality as measured by Mean Opinion Score (MOS) evaluations.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1317/
PDF https://www.aclweb.org/anthology/L16-1317
PWC https://paperswithcode.com/paper/tts-for-low-resource-languages-a-bangla
Repo
Framework
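
The abstract reports perceived quality as Mean Opinion Score (MOS). MOS is simply the arithmetic mean of listener ratings on a 1-5 scale; the sketch below uses made-up ratings for illustration, not the paper's evaluation data.

```python
# Minimal illustration of a Mean Opinion Score (MOS) computation.
# The ratings below are hypothetical, NOT from the paper; MOS is the
# arithmetic mean of listener ratings on a 1 (bad) to 5 (excellent) scale.
from statistics import mean, stdev

# Each synthesized utterance gets a list of listener ratings.
ratings_per_utterance = {
    "utt_001": [4, 5, 4, 3, 4],
    "utt_002": [3, 4, 4, 4, 5],
    "utt_003": [5, 4, 5, 4, 4],
}

all_ratings = [r for utt in ratings_per_utterance.values() for r in utt]
print(f"System MOS: {mean(all_ratings):.2f} (std {stdev(all_ratings):.2f})")
```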

Optimal Learning for Multi-pass Stochastic Gradient Methods

Title Optimal Learning for Multi-pass Stochastic Gradient Methods
Authors Junhong Lin, Lorenzo Rosasco
Abstract We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. In particular, we consider the square loss and show that for a universal step-size choice, the number of passes acts as a regularization parameter, and optimal finite sample bounds can be achieved by early-stopping. Moreover, we show that larger step-sizes are allowed when considering mini-batches. Our analysis is based on a unifying approach, encompassing both batch and stochastic gradient methods as special cases.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6213-optimal-learning-for-multi-pass-stochastic-gradient-methods
PDF http://papers.nips.cc/paper/6213-optimal-learning-for-multi-pass-stochastic-gradient-methods.pdf
PWC https://paperswithcode.com/paper/optimal-learning-for-multi-pass-stochastic
Repo
Framework
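
A minimal NumPy sketch of the setting the abstract describes: mini-batch SGD on the square loss, where the number of passes over the data is chosen by early stopping on held-out data. The step size, batch size, and synthetic data are illustrative assumptions, not the paper's analysis or protocol.

```python
# Sketch: multi-pass mini-batch SGD for least-squares regression, with the
# number of passes (epochs) acting as a regularization parameter via early
# stopping on a validation split. All hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

# Hold out a validation split for early stopping.
X_tr, y_tr, X_val, y_val = X[:400], y[:400], X[400:], y[400:]

w = np.zeros(d)
batch_size, step = 20, 0.01
best_w, best_val = w.copy(), np.inf

for epoch in range(200):                          # passes over the data
    perm = rng.permutation(len(y_tr))
    for start in range(0, len(y_tr), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X_tr[idx], y_tr[idx]
        grad = Xb.T @ (Xb @ w - yb) / len(idx)    # gradient of the square loss
        w -= step * grad
    val_err = np.mean((X_val @ w - y_val) ** 2)
    if val_err < best_val:                        # keep the best early-stopped iterate
        best_val, best_w = val_err, w.copy()

print(f"best validation MSE: {best_val:.3f}")
```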

A Large Rated Lexicon with French Medical Words

Title A Large Rated Lexicon with French Medical Words
Authors Natalia Grabar, Thierry Hamon
Abstract Patients are often exposed to medical terms, such as anosognosia, myelodysplastic, or hepatojejunostomy, that can be semantically complex and hardly understandable by non-experts in medicine. Hence, it is important to assess which words are potentially non-understandable and require further explanation. The purpose of our work is to build a specific lexicon in which words are rated according to whether they are understandable or non-understandable. We work with French medical words as provided by an international medical terminology. The terms are segmented into single words, and each word is then manually processed by three annotators. The objective is to assign each word to one of three categories: I can understand, I am not sure, I cannot understand. The annotators have no medical training, nor do they present specific medical problems; they are meant to represent an average patient. The inter-annotator agreement is then computed, and the content of the categories is analyzed. Possible applications in which this lexicon can be helpful are proposed and discussed. The rated lexicon is freely available for research purposes. It is accessible online at http://natalia.grabar.perso.sfr.fr/rated-lexicon.html
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1420/
PDF https://www.aclweb.org/anthology/L16-1420
PWC https://paperswithcode.com/paper/a-large-rated-lexicon-with-french-medical
Repo
Framework
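
The abstract mentions computing inter-annotator agreement for three annotators assigning words to three categories. A standard choice for this setting is Fleiss' kappa; the sketch below computes it for a hypothetical handful of words (the counts are not from the published lexicon).

```python
# Sketch: Fleiss' kappa for three annotators assigning each word to one of
# three categories ("I can understand", "I am not sure", "I cannot understand").
# The tiny annotation table is hypothetical, not the published lexicon.
import numpy as np

# counts[i, j] = number of annotators (out of 3) who put word i in category j
counts = np.array([
    [3, 0, 0],
    [0, 1, 2],
    [2, 1, 0],
    [0, 0, 3],
    [1, 2, 0],
])

def fleiss_kappa(counts):
    n_items = counts.shape[0]
    n_raters = counts.sum(axis=1)[0]                 # same number of raters per item
    p_j = counts.sum(axis=0) / (n_items * n_raters)  # category proportions
    P_i = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_bar, P_e = P_i.mean(), np.sum(p_j ** 2)
    return (P_bar - P_e) / (1 - P_e)

print(f"Fleiss' kappa: {fleiss_kappa(counts):.3f}")
```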

Fast and Flexible Monotonic Functions with Ensembles of Lattices

Title Fast and Flexible Monotonic Functions with Ensembles of Lattices
Authors Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta
Abstract For many machine learning problems, there are some inputs that are known to be positively (or negatively) related to the output, and in such cases training the model to respect that monotonic relationship can provide regularization, and makes the model more interpretable. However, flexible monotonic functions are computationally challenging to learn beyond a few features. We break through this barrier by learning ensembles of monotonic calibrated interpolated look-up tables (lattices). A key contribution is an automated algorithm for selecting feature subsets for the ensemble base models. We demonstrate that compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size and faster evaluation.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices
PDF http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices.pdf
PWC https://paperswithcode.com/paper/fast-and-flexible-monotonic-functions-with
Repo
Framework
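
A toy illustration of the core building block the abstract refers to: an interpolated look-up table whose output is monotonic by construction, because the table values are a cumulative sum of non-negative increments. This 1-D sketch is only illustrative; the paper learns multi-dimensional calibrated lattices and ensembles of them.

```python
# Simplified illustration of a monotonic interpolated look-up table:
# keypoint outputs are a cumulative sum of non-negative increments, so the
# interpolated function is non-decreasing by construction. A 1-D toy only;
# the paper's models are multi-dimensional calibrated lattice ensembles.
import numpy as np

keypoints = np.linspace(0.0, 1.0, 6)                 # fixed input keypoints in [0, 1]
raw_params = np.array([0.2, -1.0, 0.5, 0.3, -0.4])   # unconstrained parameters

# Non-negative increments guarantee monotonicity of the table values.
increments = np.log1p(np.exp(raw_params))            # softplus >= 0
values = np.concatenate([[0.0], np.cumsum(increments)])

def lattice_1d(x):
    """Piecewise-linear interpolation of the monotone look-up table."""
    return np.interp(x, keypoints, values)

xs = np.array([0.05, 0.3, 0.31, 0.9])
print(lattice_1d(xs))                                # non-decreasing in the input
```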

RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking

Title RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking
Authors Ahmed Magooda, Amr Gomaa, Ashraf Mahgoub, Hany Ahmed, Mohsen Rashwan, Hazem Raafat, Eslam Kamal, Ahmad Al Sallab
Abstract
Tasks Information Retrieval
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1127/
PDF https://www.aclweb.org/anthology/S16-1127
PWC https://paperswithcode.com/paper/rdi_team-at-semeval-2016-task-3-rdi
Repo
Framework

Steps Toward Automatic Understanding of the Function of Affective Language in Support Groups

Title Steps Toward Automatic Understanding of the Function of Affective Language in Support Groups
Authors Amit Navindgi, Caroline Brun, Cécile Boulard Masson, Scott Nowson
Abstract
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6205/
PDF https://www.aclweb.org/anthology/W16-6205
PWC https://paperswithcode.com/paper/steps-toward-automatic-understanding-of-the
Repo
Framework

CLIP@UMD at SemEval-2016 Task 8: Parser for Abstract Meaning Representation using Learning to Search

Title CLIP@UMD at SemEval-2016 Task 8: Parser for Abstract Meaning Representation using Learning to Search
Authors Sudha Rao, Yogarshi Vyas, Hal Daumé III, Philip Resnik
Abstract
Tasks Amr Parsing, Coreference Resolution, Dependency Parsing, Named Entity Recognition, Part-Of-Speech Tagging, Structured Prediction
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1184/
PDF https://www.aclweb.org/anthology/S16-1184
PWC https://paperswithcode.com/paper/clipumd-at-semeval-2016-task-8-parser-for
Repo
Framework

Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes

Title Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes
Authors Gordon Christie, Ankit Laddha, Aishwarya Agrawal, Stanislaw Antol, Yash Goyal, Kevin Kochersberger, Dhruv Batra
Abstract
Tasks Common Sense Reasoning, Prepositional Phrase Attachment, Semantic Segmentation
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1156/
PDF https://www.aclweb.org/anthology/D16-1156
PWC https://paperswithcode.com/paper/resolving-language-and-vision-ambiguities-1
Repo
Framework

Content selection as semantic-based ontology exploration

Title Content selection as semantic-based ontology exploration
Authors Laura Perez-Beltrachini, Claire Gardent, Anselme Revuz, Saptarashmi Bandyopadhyay
Abstract
Tasks Common Sense Reasoning, Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-3508/
PDF https://www.aclweb.org/anthology/W16-3508
PWC https://paperswithcode.com/paper/content-selection-as-semantic-based-ontology
Repo
Framework

Robust k-means: a Theoretical Revisit

Title Robust k-means: a Theoretical Revisit
Authors Alexandros Georgogiannis
Abstract Over the last years, many variations of the quadratic k-means clustering procedure have been proposed, all aiming to robustify the performance of the algorithm in the presence of outliers. In general terms, two main approaches have been developed: one based on penalized regularization methods, and one based on trimming functions. In this work, we present a theoretical analysis of the robustness and consistency properties of a variant of the classical quadratic k-means algorithm, the robust k-means, which borrows ideas from outlier detection in regression. We show that two outliers in a dataset are enough to break down this clustering procedure. However, if we focus on “well-structured” datasets, then robust k-means can recover the underlying cluster structure in spite of the outliers. Finally, we show that, with slight modifications, the most general non-asymptotic results for consistency of quadratic k-means remain valid for this robust variant.
Tasks Outlier Detection
Published 2016-12-01
URL http://papers.nips.cc/paper/6126-robust-k-means-a-theoretical-revisit
PDF http://papers.nips.cc/paper/6126-robust-k-means-a-theoretical-revisit.pdf
PWC https://paperswithcode.com/paper/robust-k-means-a-theoretical-revisit
Repo
Framework
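
A small sketch of the trimming idea the abstract names as one of the two robustification strategies: at each iteration, the points farthest from their assigned centers are excluded from the center update. This is a generic trimmed-k-means illustration, not the paper's exact estimator or analysis.

```python
# Sketch of a trimming-based robustification of quadratic k-means: at each
# iteration the fraction of points farthest from their assigned centers is
# excluded from the center update. Illustrative only, not the paper's method.
import numpy as np

def trimmed_kmeans(X, k, trim_frac=0.1, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    keep = len(X) - int(trim_frac * len(X))
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        closest = d[np.arange(len(X)), labels]
        kept = np.argsort(closest)[:keep]            # drop the worst-fitting points
        for j in range(k):
            members = kept[labels[kept] == j]
            if len(members) > 0:
                centers[j] = X[members].mean(axis=0)
    return centers, labels

# Two well-separated clusters plus two gross outliers.
X = np.vstack([np.random.default_rng(1).normal(0, 0.3, (50, 2)),
               np.random.default_rng(2).normal(5, 0.3, (50, 2)),
               np.array([[100.0, 100.0], [-80.0, 90.0]])])
centers, _ = trimmed_kmeans(X, k=2)
print(centers)
```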

TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns

Title TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns
Authors Kathrin Eichler, Feiyu Xu, Hans Uszkoreit, Leonhard Hennig, Sebastian Krause
Abstract The task of relation extraction is to recognize and extract relations between entities or concepts in texts. Dependency parse trees have become a popular source for discovering extraction patterns, which encode the grammatical relations among the phrases that jointly express relation instances. State-of-the-art weakly supervised approaches to relation extraction typically extract thousands of unique patterns only potentially expressing the target relation. Among these patterns, some are semantically equivalent, but differ in their morphological, lexical-semantic or syntactic form. Some express a relation that entails the target relation. We propose a new approach to structuring extraction patterns by utilizing entailment graphs, hierarchical structures representing entailment relations, and present a novel resource of gold-standard entailment graphs based on a set of patterns automatically acquired using distant supervision. We describe the methodology used for creating the dataset and present statistics of the resource as well as an analysis of inference types underlying the entailment decisions.
Tasks Natural Language Inference, Relation Extraction
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1537/
PDF https://www.aclweb.org/anthology/L16-1537
PWC https://paperswithcode.com/paper/teg-rep-a-corpus-of-textual-entailment-graphs
Repo
Framework
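
A minimal sketch of an entailment graph as a data structure: a directed edge from pattern A to pattern B means that A entails B, and the transitive closure gives everything a pattern entails. The patterns below are invented placeholders, not entries from the TEG-REP resource.

```python
# Sketch: entailment graph over extraction patterns. An edge A -> B means
# "pattern A entails pattern B". The patterns are made-up placeholders.
import networkx as nx

G = nx.DiGraph()
G.add_edge("X was born in Y", "X comes from Y")      # birthplace entails origin
G.add_edge("X, a native of Y,", "X comes from Y")
G.add_edge("X comes from Y", "X is associated with Y")

# Everything entailed (directly or transitively) by a given pattern.
pattern = "X was born in Y"
print(sorted(nx.descendants(G, pattern)))
```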

Evaluating Ensemble Based Pre-annotation on Named Entity Corpus Construction in English and Chinese

Title Evaluating Ensemble Based Pre-annotation on Named Entity Corpus Construction in English and Chinese
Authors Tingming Lu, Man Zhu, Zhiqiang Gao, Yaocheng Gui
Abstract Annotated corpora are crucial language resources, and pre-annotation is a common way to reduce the cost of corpus construction. The ensemble-based pre-annotation approach combines multiple existing named entity taggers and categorizes annotations into normal annotations with high confidence and candidate annotations with low confidence, to reduce human annotation time. In this paper, we manually annotate three English datasets under various pre-annotation conditions, report the effects of ensemble-based pre-annotation, and analyze the experimental results. In order to verify the effectiveness of ensemble-based pre-annotation in other languages, such as Chinese, three Chinese datasets are also tested. The experimental results show that the ensemble-based pre-annotation approach significantly reduces the number of annotations that human annotators have to add, and outperforms the baseline approaches in reduction of human annotation time without loss in annotation performance (in terms of F1-measure), on both English and Chinese datasets.
Tasks Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5208/
PDF https://www.aclweb.org/anthology/W16-5208
PWC https://paperswithcode.com/paper/evaluating-ensemble-based-pre-annotation-on
Repo
Framework
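
A small sketch of the ensemble pre-annotation idea described in the abstract: token-level outputs of several existing taggers are combined, unanimous tags become high-confidence (normal) pre-annotations, and disagreements become low-confidence candidates for the human annotator. The taggers and tags here are hypothetical.

```python
# Sketch: ensemble-based pre-annotation. Tokens on which all taggers agree are
# pre-filled as "normal" annotations; disagreements become "candidate"
# annotations for the human annotator. Tagger outputs below are hypothetical.
from collections import Counter

tokens = ["Barack", "Obama", "visited", "Shanghai", "Disneyland"]
tagger_outputs = [
    ["B-PER", "I-PER", "O", "B-LOC", "B-LOC"],   # tagger 1
    ["B-PER", "I-PER", "O", "B-LOC", "B-ORG"],   # tagger 2
    ["B-PER", "I-PER", "O", "B-LOC", "O"],       # tagger 3
]

normal, candidates = [], []
for i, token in enumerate(tokens):
    votes = Counter(t[i] for t in tagger_outputs)
    tag, count = votes.most_common(1)[0]
    if count == len(tagger_outputs):             # unanimous -> pre-fill
        normal.append((token, tag))
    else:                                        # disagreement -> ask the human
        candidates.append((token, dict(votes)))

print("normal:", normal)
print("candidates:", candidates)
```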

Book Reviews: Elements of Formal Semantics: An Introduction to the Mathematical Theory of Meaning in Natural Language by Yoad Winter

Title Book Reviews: Elements of Formal Semantics: An Introduction to the Mathematical Theory of Meaning in Natural Language by Yoad Winter
Authors Michael Yoshitaka Erlewine
Abstract
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/J16-4012/
PDF https://www.aclweb.org/anthology/J16-4012
PWC https://paperswithcode.com/paper/book-reviews-elements-of-formal-semantics-an
Repo
Framework

Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View

Title Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View
Authors Achim Stein
Abstract The treatment of medieval texts is a particular challenge for parsers. I compare how two dependency parsers, one graph-based, the other transition-based, perform on Old French, facing some typical problems of medieval texts: graphical variation, relatively free word order, and syntactic variation of several parameters over a diachronic period of about 300 years. Both parsers were trained and evaluated on the “Syntactic Reference Corpus of Medieval French” (SRCMF), a manually annotated dependency treebank. I discuss the relation between types of parsers and types of language, as well as the differences of the analyses from a linguistic point of view.
Tasks Dependency Parsing
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1112/
PDF https://www.aclweb.org/anthology/L16-1112
PWC https://paperswithcode.com/paper/old-french-dependency-parsing-results-of-two
Repo
Framework