May 5, 2019


Paper Group NANR 130


Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters


Title	Towards an Entertaining Natural Language Generation System: Linguistic Peculiarities of Japanese Fictional Characters
Authors	Chiaki Miyazaki, Toru Hirano, Ryuichiro Higashinaka, Yoshihiro Matsuo
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3641/
PDF	https://www.aclweb.org/anthology/W16-3641
PWC	https://paperswithcode.com/paper/towards-an-entertaining-natural-language
Repo
Framework

TTS for Low Resource Languages: A Bangla Synthesizer


Title	TTS for Low Resource Languages: A Bangla Synthesizer
Authors	Alexander Gutkin, Linne Ha, Martin Jansche, Knot Pipatsrisawat, Richard Sproat
Abstract	We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh. This work is part of an ongoing effort to address the needs of under-resourced languages. We propose a process for streamlining the bootstrapping of TTS systems for under-resourced languages. First, we use crowdsourcing to collect the data from multiple ordinary speakers, each speaker recording a small number of sentences. Second, we leverage an existing text normalization system for a related language (Hindi) to bootstrap a linguistic front-end for Bangla. Third, we employ statistical techniques to construct multi-speaker acoustic models using Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) and Hidden Markov Model (HMM) approaches. We then describe our experiments that show that the resulting TTS voices score well in terms of their perceived quality as measured by Mean Opinion Score (MOS) evaluations.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1317/
PDF	https://www.aclweb.org/anthology/L16-1317
PWC	https://paperswithcode.com/paper/tts-for-low-resource-languages-a-bangla
Repo
Framework

Optimal Learning for Multi-pass Stochastic Gradient Methods


Title	Optimal Learning for Multi-pass Stochastic Gradient Methods
Authors	Junhong Lin, Lorenzo Rosasco
Abstract	We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. In particular, we consider the square loss and show that for a universal step-size choice, the number of passes acts as a regularization parameter, and optimal finite sample bounds can be achieved by early-stopping. Moreover, we show that larger step-sizes are allowed when considering mini-batches. Our analysis is based on a unifying approach, encompassing both batch and stochastic gradient methods as special cases.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6213-optimal-learning-for-multi-pass-stochastic-gradient-methods
PDF	http://papers.nips.cc/paper/6213-optimal-learning-for-multi-pass-stochastic-gradient-methods.pdf
PWC	https://paperswithcode.com/paper/optimal-learning-for-multi-pass-stochastic
Repo
Framework
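
The abstract above treats the number of passes over the data as a regularization parameter that is set by early stopping. The following is a minimal sketch of that idea for the square loss, not the authors' procedure: the function name `multipass_sgd`, the shuffled-sweep definition of a pass, the held-out validation split used as the stopping rule, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def multipass_sgd(X, y, X_val, y_val, step=0.01, batch=8, max_passes=50, seed=0):
    """Mini-batch SGD on the square loss; keeps the iterate with the lowest validation error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    best_w, best_pass = w.copy(), 0
    best_err = float(np.mean((X_val @ w - y_val) ** 2))
    val_errs = []
    for p in range(1, max_passes + 1):
        perm = rng.permutation(n)  # assumption: one pass = one shuffled sweep over the data
        for start in range(0, n, batch):
            idx = perm[start:start + batch]
            # gradient of 0.5 * (x.w - y)^2 averaged over the mini-batch
            grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
            w = w - step * grad
        err = float(np.mean((X_val @ w - y_val) ** 2))
        val_errs.append(err)
        if err < best_err:  # early stopping: remember the best number of passes
            best_w, best_err, best_pass = w.copy(), err, p
    return best_w, best_pass, val_errs

# Hypothetical synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(1)
w_true = np.ones(5)
X = rng.normal(size=(100, 5))
y = X @ w_true + 0.1 * rng.normal(size=100)
X_val = rng.normal(size=(50, 5))
y_val = X_val @ w_true + 0.1 * rng.normal(size=50)

w_hat, n_passes, errs = multipass_sgd(X, y, X_val, y_val)
```

Here `n_passes` plays the role of the regularization parameter: stopping earlier keeps the iterate closer to its zero initialization.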

A Large Rated Lexicon with French Medical Words


Title	A Large Rated Lexicon with French Medical Words
Authors	Natalia Grabar, Thierry Hamon
Abstract	Patients are often exposed to medical terms, such as anosognosia, myelodysplastic, or hepatojejunostomy, that can be semantically complex and hard for non-experts in medicine to understand. Hence, it is important to assess which words are potentially non-understandable and require further explanation. The purpose of our work is to build a specific lexicon in which the words are rated according to whether they are understandable or non-understandable. We propose to work with medical words in French such as those provided by an international medical terminology. The terms are segmented into single words and then each word is manually processed by three annotators. The objective is to assign each word to one of three categories: I can understand, I am not sure, I cannot understand. The annotators do not have medical training, nor do they present specific medical problems. They are supposed to represent an average patient. The inter-annotator agreement is then computed. The content of the categories is analyzed. Possible applications in which this lexicon can be helpful are proposed and discussed. The rated lexicon is freely available for research purposes. It is accessible online at http://natalia.grabar.perso.sfr.fr/rated-lexicon.html
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1420/
PDF	https://www.aclweb.org/anthology/L16-1420
PWC	https://paperswithcode.com/paper/a-large-rated-lexicon-with-french-medical
Repo
Framework
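
The abstract above mentions computing inter-annotator agreement for three annotators and three categories, but it does not say which measure was used. As an illustration only, the sketch below computes Fleiss' kappa, a standard choice for a fixed number of raters; the choice of measure, the function name `fleiss_kappa`, and the toy counts are assumptions.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa from an (items x categories) matrix of rating counts.

    Each row must sum to the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    # per-item agreement: fraction of rater pairs that agree
    p_item = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_item.mean()
    # chance agreement from the overall category proportions
    p_cat = counts.sum(axis=0) / (n_items * n_raters)
    p_exp = np.sum(p_cat ** 2)
    return float((p_bar - p_exp) / (1.0 - p_exp))

# Toy example: columns are (I can understand, I am not sure, I cannot understand).
toy_counts = [[3, 0, 0],
              [0, 3, 0],
              [1, 1, 1]]
kappa = fleiss_kappa(toy_counts)
```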

Fast and Flexible Monotonic Functions with Ensembles of Lattices


Title	Fast and Flexible Monotonic Functions with Ensembles of Lattices
Authors	Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta
Abstract	For many machine learning problems, there are some inputs that are known to be positively (or negatively) related to the output, and in such cases training the model to respect that monotonic relationship can provide regularization and make the model more interpretable. However, flexible monotonic functions are computationally challenging to learn beyond a few features. We break through this barrier by learning ensembles of monotonic calibrated interpolated look-up tables (lattices). A key contribution is an automated algorithm for selecting feature subsets for the ensemble base models. We demonstrate that compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size and faster evaluation.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices
PDF	http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices.pdf
PWC	https://paperswithcode.com/paper/fast-and-flexible-monotonic-functions-with
Repo
Framework
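
To make the "interpolated look-up table" idea from the abstract above concrete, here is a minimal sketch of a 2x2 lattice evaluated by bilinear interpolation, plus an average over two such lattices on different feature subsets. It is not the paper's method: calibration, training, and the feature-subset selection algorithm are omitted, and the names `lattice_2x2`, `monotone_in_first`, `ensemble`, and all parameter values are assumptions.

```python
import numpy as np

def lattice_2x2(theta, x):
    """Bilinear interpolation of a 2x2 look-up table theta at x in [0, 1]^2."""
    x1, x2 = x
    return ((1 - x1) * (1 - x2) * theta[0, 0] + (1 - x1) * x2 * theta[0, 1]
            + x1 * (1 - x2) * theta[1, 0] + x1 * x2 * theta[1, 1])

def monotone_in_first(theta):
    """For a single bilinear cell, the function is non-decreasing in x1
    exactly when each vertex value does not decrease along the first axis."""
    return bool(np.all(theta[1, :] >= theta[0, :]))

def ensemble(thetas, subsets, x):
    """Average of lattices, each reading its own pair of features from x."""
    return float(np.mean([lattice_2x2(t, x[list(s)]) for t, s in zip(thetas, subsets)]))

# Hypothetical parameters: both lattices use feature 0 as their first axis.
theta_a = np.array([[0.0, 0.2], [0.5, 1.0]])
theta_b = np.array([[0.1, 0.3], [0.4, 0.3]])
subsets = [(0, 1), (0, 2)]

# Sweep feature 0 while holding the other features fixed.
values = [ensemble([theta_a, theta_b], subsets, np.array([v, 0.3, 0.7]))
          for v in np.linspace(0.0, 1.0, 11)]
```

Because every lattice that reads feature 0 satisfies the vertex constraint, the averaged output is non-decreasing in that feature for any fixed values of the others.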

RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking


Title	RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking
Authors	Ahmed Magooda, Amr Gomaa, Ashraf Mahgoub, Hany Ahmed, Mohsen Rashwan, Hazem Raafat, Eslam Kamal, Ahmad Al Sallab
Abstract
Tasks	Information Retrieval
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1127/
PDF	https://www.aclweb.org/anthology/S16-1127
PWC	https://paperswithcode.com/paper/rdi_team-at-semeval-2016-task-3-rdi
Repo
Framework

Steps Toward Automatic Understanding of the Function of Affective Language in Support Groups


Title	Steps Toward Automatic Understanding of the Function of Affective Language in Support Groups
Authors	Amit Navindgi, Caroline Brun, Cécile Boulard Masson, Scott Nowson
Abstract
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-6205/
PDF	https://www.aclweb.org/anthology/W16-6205
PWC	https://paperswithcode.com/paper/steps-toward-automatic-understanding-of-the
Repo
Framework

CLIP@UMD at SemEval-2016 Task 8: Parser for Abstract Meaning Representation using Learning to Search


Title	CLIP@UMD at SemEval-2016 Task 8: Parser for Abstract Meaning Representation using Learning to Search
Authors	Sudha Rao, Yogarshi Vyas, Hal Daumé III, Philip Resnik
Abstract
Tasks	AMR Parsing, Coreference Resolution, Dependency Parsing, Named Entity Recognition, Part-Of-Speech Tagging, Structured Prediction
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1184/
PDF	https://www.aclweb.org/anthology/S16-1184
PWC	https://paperswithcode.com/paper/clipumd-at-semeval-2016-task-8-parser-for
Repo
Framework

Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes


Title	Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes
Authors	Gordon Christie, Ankit Laddha, Aishwarya Agrawal, Stanislaw Antol, Yash Goyal, Kevin Kochersberger, Dhruv Batra
Abstract
Tasks	Common Sense Reasoning, Prepositional Phrase Attachment, Semantic Segmentation
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1156/
PDF	https://www.aclweb.org/anthology/D16-1156
PWC	https://paperswithcode.com/paper/resolving-language-and-vision-ambiguities-1
Repo
Framework

Content selection as semantic-based ontology exploration


Title	Content selection as semantic-based ontology exploration
Authors	Laura Perez-Beltrachini, Claire Gardent, Anselme Revuz, Saptarashmi Bandyopadhyay
Abstract
Tasks	Common Sense Reasoning, Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3508/
PDF	https://www.aclweb.org/anthology/W16-3508
PWC	https://paperswithcode.com/paper/content-selection-as-semantic-based-ontology
Repo
Framework

Robust k-means: a Theoretical Revisit


Title	Robust k-means: a Theoretical Revisit
Authors	Alexandros Georgogiannis
Abstract	Over the last years, many variations of the quadratic k-means clustering procedure have been proposed, all aiming to robustify the performance of the algorithm in the presence of outliers. In general terms, two main approaches have been developed: one based on penalized regularization methods, and one based on trimming functions. In this work, we present a theoretical analysis of the robustness and consistency properties of a variant of the classical quadratic k-means algorithm, the robust k-means, which borrows ideas from outlier detection in regression. We show that two outliers in a dataset are enough to break down this clustering procedure. However, if we focus on “well-structured” datasets, then robust k-means can recover the underlying cluster structure in spite of the outliers. Finally, we show that, with slight modifications, the most general non-asymptotic results for consistency of quadratic k-means remain valid for this robust variant.
Tasks	Outlier Detection
Published	2016-12-01
URL	http://papers.nips.cc/paper/6126-robust-k-means-a-theoretical-revisit
PDF	http://papers.nips.cc/paper/6126-robust-k-means-a-theoretical-revisit.pdf
PWC	https://paperswithcode.com/paper/robust-k-means-a-theoretical-revisit
Repo
Framework
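
The abstract above describes a k-means variant that borrows ideas from outlier detection in regression. One common penalized formulation, assumed here and not necessarily the exact variant the paper analyzes, gives every point its own outlier vector and shrinks it with a group soft-threshold. The sketch below alternates assignment, outlier estimation, and center updates; the function name `robust_kmeans`, the `init` parameter, the threshold `lam`, and the toy data are all hypothetical.

```python
import numpy as np

def robust_kmeans(X, init, lam, n_iter=50):
    """Alternating minimization of sum_i ||x_i - c_label(i) - o_i||^2 with a
    group soft-threshold on each outlier vector o_i (an assumed penalty)."""
    X = np.asarray(X, dtype=float)
    centers = np.array(init, dtype=float)
    k = len(centers)
    for _ in range(n_iter):
        # 1. assign every point to its nearest center
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # 2. outlier vectors: residuals shorter than lam shrink to exactly zero
        R = X - centers[labels]
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        O = R * np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
        # 3. centers: mean of the outlier-corrected points in each cluster
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = (X[mask] - O[mask]).mean(axis=0)
    return centers, labels, O

# Hypothetical data: two tight clusters plus one distant outlier (last row).
rng = np.random.default_rng(0)
A = 0.1 * rng.normal(size=(20, 2))
B = 5.0 + 0.1 * rng.normal(size=(20, 2))
X = np.vstack([A, B, [[50.0, 50.0]]])

centers, labels, O = robust_kmeans(X, init=[[0.0, 0.0], [5.0, 5.0]], lam=1.0)
```

In this toy run the outlier's contribution to its center is capped at a displacement of length `lam`, so the center near (5, 5) barely moves, while a plain mean would be dragged toward (50, 50).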

TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns


Title	TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns
Authors	Kathrin Eichler, Feiyu Xu, Hans Uszkoreit, Leonhard Hennig, Sebastian Krause
Abstract	The task of relation extraction is to recognize and extract relations between entities or concepts in texts. Dependency parse trees have become a popular source for discovering extraction patterns, which encode the grammatical relations among the phrases that jointly express relation instances. State-of-the-art weakly supervised approaches to relation extraction typically extract thousands of unique patterns only potentially expressing the target relation. Among these patterns, some are semantically equivalent, but differ in their morphological, lexical-semantic or syntactic form. Some express a relation that entails the target relation. We propose a new approach to structuring extraction patterns by utilizing entailment graphs, hierarchical structures representing entailment relations, and present a novel resource of gold-standard entailment graphs based on a set of patterns automatically acquired using distant supervision. We describe the methodology used for creating the dataset and present statistics of the resource as well as an analysis of inference types underlying the entailment decisions.
Tasks	Natural Language Inference, Relation Extraction
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1537/
PDF	https://www.aclweb.org/anthology/L16-1537
PWC	https://paperswithcode.com/paper/teg-rep-a-corpus-of-textual-entailment-graphs
Repo
Framework

Evaluating Ensemble Based Pre-annotation on Named Entity Corpus Construction in English and Chinese


Title	Evaluating Ensemble Based Pre-annotation on Named Entity Corpus Construction in English and Chinese
Authors	Tingming Lu, Man Zhu, Zhiqiang Gao, Yaocheng Gui
Abstract	Annotated corpora are crucial language resources, and pre-annotation is a usual way to reduce the cost of corpus construction. The ensemble based pre-annotation approach combines multiple existing named entity taggers and categorizes annotations into normal annotations with high confidence and candidate annotations with low confidence, to reduce the human annotation time. In this paper, we manually annotate three English datasets under various pre-annotation conditions, report the effects of ensemble based pre-annotation, and analyze the experimental results. In order to verify the effectiveness of ensemble based pre-annotation in other languages, such as Chinese, three Chinese datasets are also tested. The experimental results show that the ensemble based pre-annotation approach significantly reduces the number of annotations which human annotators have to add, and outperforms the baseline approaches in reduction of human annotation time without loss in annotation performance (in terms of F1-measure), on both English and Chinese datasets.
Tasks	Named Entity Recognition
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5208/
PDF	https://www.aclweb.org/anthology/W16-5208
PWC	https://paperswithcode.com/paper/evaluating-ensemble-based-pre-annotation-on
Repo
Framework
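
The abstract above describes combining several named entity taggers and splitting their output into high-confidence "normal" annotations and low-confidence "candidate" annotations. The paper's actual confidence rule is not given here, so the sketch below uses simple vote counting as a stand-in; the function name `pre_annotate`, the `min_votes` threshold, and the span tuples are assumptions.

```python
from collections import Counter

def pre_annotate(tagger_outputs, min_votes):
    """Split spans proposed by several taggers into normal and candidate sets.

    tagger_outputs: one iterable of (start, end, label) tuples per tagger.
    A span is 'normal' when at least min_votes taggers propose it exactly.
    """
    votes = Counter()
    for spans in tagger_outputs:
        votes.update(set(spans))  # each tagger votes at most once per span
    normal = sorted(s for s, v in votes.items() if v >= min_votes)
    candidate = sorted(s for s, v in votes.items() if v < min_votes)
    return normal, candidate

# Hypothetical outputs of three taggers on one sentence (token offsets).
outputs = [
    [(0, 2, "PER"), (5, 6, "LOC")],
    [(0, 2, "PER"), (5, 6, "ORG")],
    [(0, 2, "PER")],
]
normal, candidate = pre_annotate(outputs, min_votes=2)
```

Under this rule the annotator would only need to confirm or reject the disputed `(5, 6)` span rather than add it from scratch.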

Book Reviews: Elements of Formal Semantics: An Introduction to the Mathematical Theory of Meaning in Natural Language by Yoad Winter


Title	Book Reviews: Elements of Formal Semantics: An Introduction to the Mathematical Theory of Meaning in Natural Language by Yoad Winter
Authors	Michael Yoshitaka Erlewine
Abstract
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/J16-4012/
PDF	https://www.aclweb.org/anthology/J16-4012
PWC	https://paperswithcode.com/paper/book-reviews-elements-of-formal-semantics-an
Repo
Framework

Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View


Title	Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View
Authors	Achim Stein
Abstract	The treatment of medieval texts is a particular challenge for parsers. I compare how two dependency parsers, one graph-based, the other transition-based, perform on Old French, facing some typical problems of medieval texts: graphical variation, relatively free word order, and syntactic variation of several parameters over a diachronic period of about 300 years. Both parsers were trained and evaluated on the “Syntactic Reference Corpus of Medieval French” (SRCMF), a manually annotated dependency treebank. I discuss the relation between types of parsers and types of language, as well as the differences of the analyses from a linguistic point of view.
Tasks	Dependency Parsing
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1112/
PDF	https://www.aclweb.org/anthology/L16-1112
PWC	https://paperswithcode.com/paper/old-french-dependency-parsing-results-of-two
Repo
Framework