May 5, 2019

Paper Group NANR 29

ILLC-UvA Adaptation System (Scorpio) at WMT’16 IT-DOMAIN Task

Title ILLC-UvA Adaptation System (Scorpio) at WMT’16 IT-DOMAIN Task
Authors Hoang Cuong, Stella Frank, Khalil Sima'an
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2330/
PDF https://www.aclweb.org/anthology/W16-2330
PWC https://paperswithcode.com/paper/illc-uva-adaptation-system-scorpio-at-wmt16
Repo
Framework

CEPLEXicon ― A Lexicon of Child European Portuguese

Title CEPLEXicon ― A Lexicon of Child European Portuguese
Authors Ana Lúcia Santos, Maria João Freitas, Aida Cardoso
Abstract CEPLEXicon (version 1.1) is a child lexicon resulting from the automatic tagging of two child corpora: the corpus Santos (Santos, 2006; Santos et al. 2014) and the corpus Child ― Adult Interaction (Freitas et al. 2012), which integrates information from the corpus Freitas (Freitas, 1997). This lexicon includes spontaneous speech produced by seven children (1;02.00 to 3;11.12) during approximately 86h of child-adult interaction. The automatic tagging comprised the lemmatization and morphosyntactic classification of the speech produced by the seven children included in the two child corpora; the lexicon contains information pertaining to lemmas and syntactic categories as well as absolute number of occurrences and frequencies in three age intervals: < 2 years; ≥ 2 years and < 3 years; ≥ 3 years. The information included in this lexicon and the format in which it is presented enable research in different areas and allow researchers to obtain measures of lexical growth. CEPLEXicon is available through the ELRA catalogue.
Tasks Lemmatization
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1216/
PDF https://www.aclweb.org/anthology/L16-1216
PWC https://paperswithcode.com/paper/ceplexicon-a-lexicon-of-child-european
Repo
Framework
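
The age intervals described in the abstract above can be illustrated with a minimal sketch. It assumes ages are written as `years;months.days` (as in "1;02.00") and that tokens arrive as `(lemma, category, age)` tuples; the function names, the interval labels, and the toy tokens are hypothetical and are not taken from CEPLEXicon itself.

```python
from collections import Counter, defaultdict


def parse_age(age):
    """Parse an age string such as '1;02.00' into (years, months, days)."""
    years, rest = age.split(";")
    months, days = rest.split(".")
    return int(years), int(months), int(days)


def age_interval(age):
    """Map an age string to one of the three intervals named in the abstract."""
    years, _, _ = parse_age(age)
    if years < 2:
        return "<2"
    if years < 3:
        return ">=2,<3"
    return ">=3"


def counts_by_interval(tokens):
    """Count (lemma, category) pairs per age interval.

    tokens: iterable of (lemma, category, age) tuples (assumed input shape).
    """
    table = defaultdict(Counter)
    for lemma, category, age in tokens:
        table[age_interval(age)][(lemma, category)] += 1
    return table


# Toy tokens invented for illustration only; not real corpus data.
toy_tokens = [
    ("água", "N", "1;02.00"),
    ("água", "N", "1;10.15"),
    ("querer", "V", "2;03.07"),
    ("água", "N", "3;11.12"),
]
table = counts_by_interval(toy_tokens)
```

Dividing each count by the total number of tokens in its interval would give per-interval relative frequencies, one plausible reading of the "frequencies" the abstract mentions.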

YODA System for WMT16 Shared Task: Bilingual Document Alignment

Title YODA System for WMT16 Shared Task: Bilingual Document Alignment
Authors Aswarth Abhilash Dara, Yiu-Chang Lin
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2366/
PDF https://www.aclweb.org/anthology/W16-2366
PWC https://paperswithcode.com/paper/yoda-system-for-wmt16-shared-task-bilingual
Repo
Framework

Referential Translation Machines for Predicting Translation Performance

Title Referential Translation Machines for Predicting Translation Performance
Authors Ergun Biçici
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2382/
PDF https://www.aclweb.org/anthology/W16-2382
PWC https://paperswithcode.com/paper/referential-translation-machines-for-1
Repo
Framework

Extracting Weighted Language Lexicons from Wikipedia

Title Extracting Weighted Language Lexicons from Wikipedia
Authors Gregory Grefenstette
Abstract Language models are used in applications as diverse as speech recognition, optical character recognition and information retrieval. They are used to predict word appearance, and to weight the importance of words in these applications. One basic element of language models is the list of words in a language. Another is the unigram frequency of each word. But this basic information is not available for most languages in the world. Since the multilingual Wikipedia project encourages the production of encyclopedic-like articles in many world languages, we can find there an ever-growing source of text from which to extract these two language modelling elements: word list and frequency. Here we present a simple technique for converting this Wikipedia text into lexicons of weighted unigrams for the more than 280 languages currently present in Wikipedia. The lexicons produced, and the source code for producing them in a Linux-based system, are here made available for free on the Web.
Tasks Information Retrieval, Language Modelling, Optical Character Recognition, Speech Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1217/
PDF https://www.aclweb.org/anthology/L16-1217
PWC https://paperswithcode.com/paper/extracting-weighted-language-lexicons-from
Repo
Framework
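
As a rough illustration of the two elements the abstract above names (word list and unigram frequency), the following sketch turns plain text into a lexicon of weighted unigrams. It is not the author's released code: the function name `weighted_unigrams`, the lowercase `\w+` tokenization, and the use of relative frequency as the weight are all assumptions made for this example.

```python
import re
from collections import Counter


def weighted_unigrams(text):
    """Return {word: relative frequency} for a plain-text string.

    Tokenization is a simplifying assumption: lowercase the text and take
    runs of Unicode word characters as tokens.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Toy input standing in for extracted article text.
sample = "the cat sat on the mat"
lexicon = weighted_unigrams(sample)
```

Here `lexicon["the"]` is 2 occurrences out of 6 tokens. A real pipeline would also need to strip wiki markup and handle languages that do not separate words with spaces, which this sketch does not attempt.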

Structured Generative Models of Continuous Features for Word Sense Induction

Title Structured Generative Models of Continuous Features for Word Sense Induction
Authors Alexandros Komninos, Suresh Manandhar
Abstract We propose a structured generative latent variable model that integrates information from multiple contextual representations for Word Sense Induction. Our approach jointly models global lexical, local lexical and dependency syntactic context. Each context type is associated with a latent variable and the three types of variables share a hierarchical structure. We use skip-gram based word and dependency context embeddings to construct all three types of representations, reducing the total number of parameters to be estimated and enabling better generalization. We describe an EM algorithm to efficiently estimate model parameters and use the Integrated Complete Likelihood criterion to automatically estimate the number of senses. Our model achieves state-of-the-art results on the SemEval-2010 and SemEval-2013 Word Sense Induction datasets.
Tasks Word Embeddings, Word Sense Disambiguation, Word Sense Induction
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1337/
PDF https://www.aclweb.org/anthology/C16-1337
PWC https://paperswithcode.com/paper/structured-generative-models-of-continuous
Repo
Framework

Unraveling the English-Bengali Code-Mixing Phenomenon

Title Unraveling the English-Bengali Code-Mixing Phenomenon
Authors Arunavha Chanda, Dipankar Das, Chandan Mazumdar
Abstract
Tasks
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-5810/
PDF https://www.aclweb.org/anthology/W16-5810
PWC https://paperswithcode.com/paper/unraveling-the-english-bengali-code-mixing
Repo
Framework

Context-Dependent Sense Embedding

Title Context-Dependent Sense Embedding
Authors Lin Qiu, Kewei Tu, Yong Yu
Abstract
Tasks Word Embeddings, Word Sense Disambiguation, Word Sense Induction
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1018/
PDF https://www.aclweb.org/anthology/D16-1018
PWC https://paperswithcode.com/paper/context-dependent-sense-embedding
Repo
Framework

A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety

Title A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety
Authors Mathieu Chollet, Torsten Wörtwein, Louis-Philippe Morency, Stefan Scherer
Abstract The ability to efficiently speak in public is an essential asset for many professions and is used in everyday life. As such, tools enabling the improvement of public speaking performance and the assessment and mitigation of anxiety related to public speaking would be very useful. Multimodal interaction technologies, such as computer vision and embodied conversational agents, have recently been investigated for the training and assessment of interpersonal skills. One central requirement for these technologies is multimodal corpora for training machine learning models. This paper addresses the need of these technologies by presenting and sharing a multimodal corpus of public speaking presentations. These presentations were collected in an experimental study investigating the potential of interactive virtual audiences for public speaking training. This corpus includes audio-visual data and automatically extracted features, measures of public speaking anxiety and personality, annotations of participants' behaviors and expert ratings of behavioral aspects and overall performance of the presenters. We hope this corpus will help other research teams in developing tools for supporting public speaking training.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1078/
PDF https://www.aclweb.org/anthology/L16-1078
PWC https://paperswithcode.com/paper/a-multimodal-corpus-for-the-assessment-of
Repo
Framework

Wiktionnaire’s Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary

Title Wiktionnaire’s Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary
Authors Nabil Hathout, Franck Sajous
Abstract GLAWI is a free, large-scale and versatile Machine-Readable Dictionary (MRD) that has been extracted from the French language edition of Wiktionary, called Wiktionnaire. In (Sajous and Hathout, 2015), we introduced GLAWI, gave the rationale behind the creation of this lexicographic resource and described the extraction process, focusing on the conversion and standardization of the heterogeneous data provided by this collaborative dictionary. In the current article, we describe the content of GLAWI and illustrate how it is structured. We also suggest various applications, ranging from linguistic studies and NLP applications to psycholinguistic experimentation. They all can take advantage of the diversity of the lexical knowledge available in GLAWI. Besides this diversity and extensive lexical coverage, GLAWI is also remarkable because it is the only free lexical resource of contemporary French that contains definitions. This unique material opens the way to the renewal of MRD-based methods, notably the automated extraction and acquisition of semantic relations.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1218/
PDF https://www.aclweb.org/anthology/L16-1218
PWC https://paperswithcode.com/paper/wiktionnaires-wikicode-glawified-a-workable
Repo
Framework

Transition-Based Parsing for Deep Dependency Structures

Title Transition-Based Parsing for Deep Dependency Structures
Authors Xun Zhang, Yantao Du, Weiwei Sun, Xiaojun Wan
Abstract
Tasks
Published 2016-09-01
URL https://www.aclweb.org/anthology/J16-3001/
PDF https://www.aclweb.org/anthology/J16-3001
PWC https://paperswithcode.com/paper/transition-based-parsing-for-deep-dependency
Repo
Framework

Proceedings of the Sixth Named Entity Workshop

Title Proceedings of the Sixth Named Entity Workshop
Authors
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2700/
PDF https://www.aclweb.org/anthology/W16-2700
PWC https://paperswithcode.com/paper/proceedings-of-the-sixth-named-entity
Repo
Framework

On Bias-free Crawling and Representative Web Corpora

Title On Bias-free Crawling and Representative Web Corpora
Authors Roland Schäfer
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2612/
PDF https://www.aclweb.org/anthology/W16-2612
PWC https://paperswithcode.com/paper/on-bias-free-crawling-and-representative-web
Repo
Framework

Genre classification for a corpus of academic webpages

Title Genre classification for a corpus of academic webpages
Authors Erika Dalan, Serge Sharoff
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2611/
PDF https://www.aclweb.org/anthology/W16-2611
PWC https://paperswithcode.com/paper/genre-classification-for-a-corpus-of-academic
Repo
Framework

Experimental Study of Vowels in Nagamese, Ao and Lotha: Languages of Nagaland

Title Experimental Study of Vowels in Nagamese, Ao and Lotha: Languages of Nagaland
Authors Joyanta Basu, Tulika Basu, Soma Khan, Madhab Pal, Rajib Roy, Tapan Kumar Basu
Abstract
Tasks Language Identification, Speech Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6339/
PDF https://www.aclweb.org/anthology/W16-6339
PWC https://paperswithcode.com/paper/experimental-study-of-vowels-in-nagamese-ao
Repo
Framework