Paper Group NANR 130
Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters
Title | Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters |
Authors | Chiaki Miyazaki, Toru Hirano, Ryuichiro Higashinaka, Yoshihiro Matsuo |
Abstract | |
Tasks | Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3641/ |
https://www.aclweb.org/anthology/W16-3641 | |
PWC | https://paperswithcode.com/paper/towards-an-entertaining-natural-language |
Repo | |
Framework | |
TTS for Low Resource Languages: A Bangla Synthesizer
Title | TTS for Low Resource Languages: A Bangla Synthesizer |
Authors | Alexander Gutkin, Linne Ha, Martin Jansche, Knot Pipatsrisawat, Richard Sproat |
Abstract | We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh. This work is part of an ongoing effort to address the needs of under-resourced languages. We propose a process for streamlining the bootstrapping of TTS systems for under-resourced languages. First, we use crowdsourcing to collect the data from multiple ordinary speakers, each speaker recording a small number of sentences. Second, we leverage an existing text normalization system for a related language (Hindi) to bootstrap a linguistic front-end for Bangla. Third, we employ statistical techniques to construct multi-speaker acoustic models using Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) and Hidden Markov Model (HMM) approaches. We then describe our experiments, which show that the resulting TTS voices score well in terms of their perceived quality as measured by Mean Opinion Score (MOS) evaluations. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1317/ |
https://www.aclweb.org/anthology/L16-1317 | |
PWC | https://paperswithcode.com/paper/tts-for-low-resource-languages-a-bangla |
Repo | |
Framework | |
Optimal Learning for Multi-pass Stochastic Gradient Methods
Title | Optimal Learning for Multi-pass Stochastic Gradient Methods |
Authors | Junhong Lin, Lorenzo Rosasco |
Abstract | We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. In particular, we consider the square loss and show that for a universal step-size choice, the number of passes acts as a regularization parameter, and optimal finite sample bounds can be achieved by early-stopping. Moreover, we show that larger step-sizes are allowed when considering mini-batches. Our analysis is based on a unifying approach, encompassing both batch and stochastic gradient methods as special cases. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6213-optimal-learning-for-multi-pass-stochastic-gradient-methods |
http://papers.nips.cc/paper/6213-optimal-learning-for-multi-pass-stochastic-gradient-methods.pdf | |
PWC | https://paperswithcode.com/paper/optimal-learning-for-multi-pass-stochastic |
Repo | |
Framework | |
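The abstract above treats the number of passes as a regularization parameter chosen by early stopping. A minimal sketch of that idea follows, assuming a plain linear least-squares model and a fixed step size; it is not the paper's algorithm or its step-size choice, and the names `multipass_sgd`, `mean_squared_error`, and `early_stopping_pass` are hypothetical.

```python
import random


def multipass_sgd(X, y, step_size, n_passes, batch_size, seed=0):
    """Mini-batch SGD on the square loss; returns the weights after each pass."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    history = []
    for _ in range(n_passes):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random order of the data for every pass
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            grad = [0.0] * d
            for i in batch:
                residual = sum(wj * xj for wj, xj in zip(w, X[i])) - y[i]
                for j in range(d):
                    grad[j] += residual * X[i][j] / len(batch)
            w = [wj - step_size * gj for wj, gj in zip(w, grad)]
        history.append(list(w))
    return history


def mean_squared_error(w, X, y):
    """Average squared prediction error of the linear model w on (X, y)."""
    total = 0.0
    for xi, yi in zip(X, y):
        total += (sum(wj * xj for wj, xj in zip(w, xi)) - yi) ** 2
    return total / len(X)


def early_stopping_pass(history, X_val, y_val):
    """Pick the pass count (1-based) with the lowest held-out error."""
    errors = [mean_squared_error(w, X_val, y_val) for w in history]
    best = min(range(len(errors)), key=errors.__getitem__)
    return best + 1, errors
```

In practice the held-out set passed to `early_stopping_pass` would be separate from the training data; the selected pass count then plays the role that an explicit penalty would otherwise play.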
A Large Rated Lexicon with French Medical Words
Title | A Large Rated Lexicon with French Medical Words |
Authors | Natalia Grabar, Thierry Hamon |
Abstract | Patients are often exposed to medical terms, such as anosognosia, myelodysplastic, or hepatojejunostomy, that can be semantically complex and hard for non-experts in medicine to understand. Hence, it is important to assess which words are potentially non-understandable and require further explanation. The purpose of our work is to build a specific lexicon in which the words are rated according to whether they are understandable or non-understandable. We propose to work with medical words in French as provided by an international medical terminology. The terms are segmented into single words and then each word is manually processed by three annotators. The objective is to assign each word to one of three categories: I can understand, I am not sure, I cannot understand. The annotators do not have medical training, nor do they present specific medical problems. They are supposed to represent an average patient. The inter-annotator agreement is then computed. The content of the categories is analyzed. Possible applications in which this lexicon can be helpful are proposed and discussed. The rated lexicon is freely available for research purposes. It is accessible online at http://natalia.grabar.perso.sfr.fr/rated-lexicon.html |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1420/ |
https://www.aclweb.org/anthology/L16-1420 | |
PWC | https://paperswithcode.com/paper/a-large-rated-lexicon-with-french-medical |
Repo | |
Framework | |
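The abstract above mentions computing inter-annotator agreement for three annotators over three categories but does not name the measure. As an illustration only, the sketch below uses Fleiss' kappa, a standard choice for a fixed number of raters; the function name `fleiss_kappa` and the toy ratings are hypothetical.

```python
from collections import Counter

CATEGORIES = ("I can understand", "I am not sure", "I cannot understand")


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    Undefined (division by zero) when every rating falls in a single category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    # Mean observed agreement: share of agreeing rater pairs per item.
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # Agreement expected by chance from the overall category proportions.
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_items * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)
```

A value of 1 indicates perfect agreement, and values near 0 indicate agreement no better than chance.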
Fast and Flexible Monotonic Functions with Ensembles of Lattices
Title | Fast and Flexible Monotonic Functions with Ensembles of Lattices |
Authors | Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta |
Abstract | For many machine learning problems, there are some inputs that are known to be positively (or negatively) related to the output, and in such cases training the model to respect that monotonic relationship can provide regularization and make the model more interpretable. However, flexible monotonic functions are computationally challenging to learn beyond a few features. We break through this barrier by learning ensembles of monotonic calibrated interpolated look-up tables (lattices). A key contribution is an automated algorithm for selecting feature subsets for the ensemble base models. We demonstrate that compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size and faster evaluation. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices |
http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-flexible-monotonic-functions-with |
Repo | |
Framework | |
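To make the idea of an interpolated look-up table concrete, here is a minimal sketch of a 2x2 lattice evaluated by bilinear interpolation, together with the vertex condition that guarantees monotonicity along one input. It omits the calibration, training, and feature-subset selection described in the abstract; the function names, the feature pairs, and all parameter values are hypothetical.

```python
def lattice_2x2(theta, x0, x1):
    """Bilinear interpolation of the look-up table theta[i][j] at (x0, x1) in [0, 1]^2."""
    return (
        (1 - x0) * (1 - x1) * theta[0][0]
        + (1 - x0) * x1 * theta[0][1]
        + x0 * (1 - x1) * theta[1][0]
        + x0 * x1 * theta[1][1]
    )


def is_monotonic(theta, axis):
    """True if the interpolated function is nondecreasing along the given axis.

    For a 2x2 lattice it suffices to compare adjacent vertices along that axis.
    """
    if axis == 0:
        return all(theta[1][j] >= theta[0][j] for j in range(2))
    return all(theta[i][1] >= theta[i][0] for i in range(2))


def ensemble_predict(lattices, x):
    """Average several small lattices, each defined on a pair of features.

    `lattices` is a list of ((feature_a, feature_b), theta) entries.
    """
    total = sum(lattice_2x2(theta, x[a], x[b]) for (a, b), theta in lattices)
    return total / len(lattices)
```

Because an average of functions that are each nondecreasing in a feature is itself nondecreasing in that feature, checking `is_monotonic` on every base lattice is enough to guarantee the constraint for the ensemble in this sketch.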
RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking
Title | RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking |
Authors | Ahmed Magooda, Amr Gomaa, Ashraf Mahgoub, Hany Ahmed, Mohsen Rashwan, Hazem Raafat, Eslam Kamal, Ahmad Al Sallab |
Abstract | |
Tasks | Information Retrieval |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1127/ |
https://www.aclweb.org/anthology/S16-1127 | |
PWC | https://paperswithcode.com/paper/rdi_team-at-semeval-2016-task-3-rdi |
Repo | |
Framework | |
Steps Toward Automatic Understanding of the Function of Affective Language in Support Groups
Title | Steps Toward Automatic Understanding of the Function of Affective Language in Support Groups |
Authors | Amit Navindgi, Caroline Brun, Cécile Boulard Masson, Scott Nowson |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-6205/ |
https://www.aclweb.org/anthology/W16-6205 | |
PWC | https://paperswithcode.com/paper/steps-toward-automatic-understanding-of-the |
Repo | |
Framework | |
CLIP@UMD at SemEval-2016 Task 8: Parser for Abstract Meaning Representation using Learning to Search
Title | CLIP@UMD at SemEval-2016 Task 8: Parser for Abstract Meaning Representation using Learning to Search |
Authors | Sudha Rao, Yogarshi Vyas, Hal Daumé III, Philip Resnik |
Abstract | |
Tasks | AMR Parsing, Coreference Resolution, Dependency Parsing, Named Entity Recognition, Part-Of-Speech Tagging, Structured Prediction |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1184/ |
https://www.aclweb.org/anthology/S16-1184 | |
PWC | https://paperswithcode.com/paper/clipumd-at-semeval-2016-task-8-parser-for |
Repo | |
Framework | |
Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes
Title | Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes |
Authors | Gordon Christie, Ankit Laddha, Aishwarya Agrawal, Stanislaw Antol, Yash Goyal, Kevin Kochersberger, Dhruv Batra |
Abstract | |
Tasks | Common Sense Reasoning, Prepositional Phrase Attachment, Semantic Segmentation |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1156/ |
https://www.aclweb.org/anthology/D16-1156 | |
PWC | https://paperswithcode.com/paper/resolving-language-and-vision-ambiguities-1 |
Repo | |
Framework | |
Content selection as semantic-based ontology exploration
Title | Content selection as semantic-based ontology exploration |
Authors | Laura Perez-Beltrachini, Claire Gardent, Anselme Revuz, Saptarashmi Bandyopadhyay |
Abstract | |
Tasks | Common Sense Reasoning, Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3508/ |
https://www.aclweb.org/anthology/W16-3508 | |
PWC | https://paperswithcode.com/paper/content-selection-as-semantic-based-ontology |
Repo | |
Framework | |
Robust k-means: a Theoretical Revisit
Title | Robust k-means: a Theoretical Revisit |
Authors | Alexandros Georgogiannis |
Abstract | In recent years, many variations of the quadratic k-means clustering procedure have been proposed, all aiming to robustify the performance of the algorithm in the presence of outliers. In general terms, two main approaches have been developed: one based on penalized regularization methods, and one based on trimming functions. In this work, we present a theoretical analysis of the robustness and consistency properties of a variant of the classical quadratic k-means algorithm, the robust k-means, which borrows ideas from outlier detection in regression. We show that two outliers in a dataset are enough to break down this clustering procedure. However, if we focus on “well-structured” datasets, then robust k-means can recover the underlying cluster structure in spite of the outliers. Finally, we show that, with slight modifications, the most general non-asymptotic results for consistency of quadratic k-means remain valid for this robust variant. |
Tasks | Outlier Detection |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6126-robust-k-means-a-theoretical-revisit |
http://papers.nips.cc/paper/6126-robust-k-means-a-theoretical-revisit.pdf | |
PWC | https://paperswithcode.com/paper/robust-k-means-a-theoretical-revisit |
Repo | |
Framework | |
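The abstract above names two families of robust k-means variants: penalized regularization and trimming. The sketch below illustrates only the trimming idea on one-dimensional data, and it is not the robust k-means variant analyzed in the paper. The function name `trimmed_kmeans_1d`, the fixed initial centers, and the fixed iteration count are assumptions made for illustration.

```python
def trimmed_kmeans_1d(points, centers, trim_count, n_iter=20):
    """Lloyd-style k-means on 1-D data that ignores the `trim_count` farthest points.

    Returns the final centers and the points trimmed in the last iteration.
    """
    centers = list(centers)
    outliers = []
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        assigned = [
            (min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2), p)
            for p in points
        ]
        # Sort by distance to the assigned center and trim the farthest points.
        assigned.sort(key=lambda kp: (kp[1] - centers[kp[0]]) ** 2)
        n_kept = len(points) - trim_count
        kept = assigned[:n_kept]
        outliers = [p for _, p in assigned[n_kept:]]
        # Recompute each center from the retained points only.
        for k in range(len(centers)):
            members = [p for label, p in kept if label == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, outliers
```

With `trim_count=0` this reduces to ordinary k-means, which makes it easy to compare how a single extreme point shifts the centers with and without trimming.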
TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns
Title | TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns |
Authors | Kathrin Eichler, Feiyu Xu, Hans Uszkoreit, Leonhard Hennig, Sebastian Krause |
Abstract | The task of relation extraction is to recognize and extract relations between entities or concepts in texts. Dependency parse trees have become a popular source for discovering extraction patterns, which encode the grammatical relations among the phrases that jointly express relation instances. State-of-the-art weakly supervised approaches to relation extraction typically extract thousands of unique patterns only potentially expressing the target relation. Among these patterns, some are semantically equivalent, but differ in their morphological, lexical-semantic or syntactic form. Some express a relation that entails the target relation. We propose a new approach to structuring extraction patterns by utilizing entailment graphs, hierarchical structures representing entailment relations, and present a novel resource of gold-standard entailment graphs based on a set of patterns automatically acquired using distant supervision. We describe the methodology used for creating the dataset and present statistics of the resource as well as an analysis of inference types underlying the entailment decisions. |
Tasks | Natural Language Inference, Relation Extraction |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1537/ |
https://www.aclweb.org/anthology/L16-1537 | |
PWC | https://paperswithcode.com/paper/teg-rep-a-corpus-of-textual-entailment-graphs |
Repo | |
Framework | |
Evaluating Ensemble Based Pre-annotation on Named Entity Corpus Construction in English and Chinese
Title | Evaluating Ensemble Based Pre-annotation on Named Entity Corpus Construction in English and Chinese |
Authors | Tingming Lu, Man Zhu, Zhiqiang Gao, Yaocheng Gui |
Abstract | Annotated corpora are crucial language resources, and pre-annotation is a common way to reduce the cost of corpus construction. The ensemble-based pre-annotation approach combines multiple existing named entity taggers and categorizes annotations into normal annotations with high confidence and candidate annotations with low confidence, to reduce the human annotation time. In this paper, we manually annotate three English datasets under various pre-annotation conditions, report the effects of ensemble-based pre-annotation, and analyze the experimental results. In order to verify the effectiveness of ensemble-based pre-annotation in other languages, such as Chinese, three Chinese datasets are also tested. The experimental results show that the ensemble-based pre-annotation approach significantly reduces the number of annotations which human annotators have to add, and outperforms the baseline approaches in reduction of human annotation time without loss in annotation performance (in terms of F1-measure), on both English and Chinese datasets. |
Tasks | Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5208/ |
https://www.aclweb.org/anthology/W16-5208 | |
PWC | https://paperswithcode.com/paper/evaluating-ensemble-based-pre-annotation-on |
Repo | |
Framework | |
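The abstract above describes combining several named entity taggers and splitting their output into high-confidence "normal" annotations and low-confidence "candidate" annotations, without stating the exact rule. The sketch below assumes a simple vote threshold as the confidence criterion; that rule, the function name `ensemble_preannotate`, and the `(start, end, label)` span format are all hypothetical.

```python
from collections import Counter


def ensemble_preannotate(tagger_outputs, min_votes):
    """Split entity spans into normal and candidate annotations by vote count.

    `tagger_outputs` holds one list of (start, end, label) spans per tagger.
    Spans proposed by at least `min_votes` taggers are treated as normal
    (high confidence); all other proposed spans are candidates.
    """
    votes = Counter()
    for spans in tagger_outputs:
        votes.update(set(spans))  # each tagger votes at most once per span
    normal = sorted(s for s, v in votes.items() if v >= min_votes)
    candidate = sorted(s for s, v in votes.items() if v < min_votes)
    return normal, candidate
```

A human annotator would then only need to confirm or reject the candidate spans and add any entities that every tagger missed.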
Book Reviews: Elements of Formal Semantics: An Introduction to the Mathematical Theory of Meaning in Natural Language by Yoad Winter
Title | Book Reviews: Elements of Formal Semantics: An Introduction to the Mathematical Theory of Meaning in Natural Language by Yoad Winter |
Authors | Michael Yoshitaka Erlewine |
Abstract | |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/J16-4012/ |
https://www.aclweb.org/anthology/J16-4012 | |
PWC | https://paperswithcode.com/paper/book-reviews-elements-of-formal-semantics-an |
Repo | |
Framework | |
Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View
Title | Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View |
Authors | Achim Stein |
Abstract | The treatment of medieval texts is a particular challenge for parsers. I compare how two dependency parsers, one graph-based, the other transition-based, perform on Old French, facing some typical problems of medieval texts: graphical variation, relatively free word order, and syntactic variation of several parameters over a diachronic period of about 300 years. Both parsers were trained and evaluated on the "Syntactic Reference Corpus of Medieval French" (SRCMF), a manually annotated dependency treebank. I discuss the relation between types of parsers and types of language, as well as the differences of the analyses from a linguistic point of view. |
Tasks | Dependency Parsing |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1112/ |
https://www.aclweb.org/anthology/L16-1112 | |
PWC | https://paperswithcode.com/paper/old-french-dependency-parsing-results-of-two |
Repo | |
Framework | |