Paper Group NANR 29
ILLC-UvA Adaptation System (Scorpio) at WMT’16 IT-DOMAIN Task
Title | ILLC-UvA Adaptation System (Scorpio) at WMT’16 IT-DOMAIN Task |
Authors | Hoang Cuong, Stella Frank, Khalil Sima'an |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2330/ |
PWC | https://paperswithcode.com/paper/illc-uva-adaptation-system-scorpio-at-wmt16 |
Repo | |
Framework | |
CEPLEXicon ― A Lexicon of Child European Portuguese
Title | CEPLEXicon ― A Lexicon of Child European Portuguese |
Authors | Ana Lúcia Santos, Maria João Freitas, Aida Cardoso |
Abstract | CEPLEXicon (version 1.1) is a child lexicon resulting from the automatic tagging of two child corpora: the corpus Santos (Santos, 2006; Santos et al. 2014) and the corpus Child ― Adult Interaction (Freitas et al. 2012), which integrates information from the corpus Freitas (Freitas, 1997). This lexicon includes spontaneous speech produced by seven children (1;02.00 to 3;11.12) during approximately 86h of child-adult interaction. The automatic tagging comprised the lemmatization and morphosyntactic classification of the speech produced by the seven children included in the two child corpora; the lexicon contains information pertaining to lemmas and syntactic categories as well as absolute number of occurrences and frequencies in three age intervals: < 2 years; ≥ 2 years and < 3 years; ≥ 3 years. The information included in this lexicon and the format in which it is presented enable research in different areas and allow researchers to obtain measures of lexical growth. CEPLEXicon is available through the ELRA catalogue. |
Tasks | Lemmatization |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1216/ |
PWC | https://paperswithcode.com/paper/ceplexicon-a-lexicon-of-child-european |
Repo | |
Framework | |
YODA System for WMT16 Shared Task: Bilingual Document Alignment
Title | YODA System for WMT16 Shared Task: Bilingual Document Alignment |
Authors | Aswarth Abhilash Dara, Yiu-Chang Lin |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2366/ |
PWC | https://paperswithcode.com/paper/yoda-system-for-wmt16-shared-task-bilingual |
Repo | |
Framework | |
Referential Translation Machines for Predicting Translation Performance
Title | Referential Translation Machines for Predicting Translation Performance |
Authors | Ergun Biçici |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2382/ |
PWC | https://paperswithcode.com/paper/referential-translation-machines-for-1 |
Repo | |
Framework | |
Extracting Weighted Language Lexicons from Wikipedia
Title | Extracting Weighted Language Lexicons from Wikipedia |
Authors | Gregory Grefenstette |
Abstract | Language models are used in applications as diverse as speech recognition, optical character recognition and information retrieval. They are used to predict word appearance, and to weight the importance of words in these applications. One basic element of language models is the list of words in a language. Another is the unigram frequency of each word. But this basic information is not available for most languages in the world. Since the multilingual Wikipedia project encourages the production of encyclopedic-like articles in many world languages, we can find there an ever-growing source of text from which to extract these two language modelling elements: word list and frequency. Here we present a simple technique for converting this Wikipedia text into lexicons of weighted unigrams for the more than 280 languages currently present in Wikipedia. The lexicons produced, and the source code for producing them in a Linux-based system, are here made available for free on the Web. |
Tasks | Information Retrieval, Language Modelling, Optical Character Recognition, Speech Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1217/ |
PWC | https://paperswithcode.com/paper/extracting-weighted-language-lexicons-from |
Repo | |
Framework | |
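The idea in the abstract above, turning raw text into a lexicon of weighted unigrams, can be illustrated with a minimal sketch. This is not the author's released code: the function name `weighted_unigrams`, the letters-only tokenization, the lowercasing, and the use of relative frequency as the weight are all assumptions made for illustration.

```python
import re
from collections import Counter


def weighted_unigrams(text):
    """Map each word in `text` to its relative frequency (hypothetical helper)."""
    # Assumption: a "word" is a maximal run of Unicode letters, lowercased.
    # Real Wikipedia text would need markup stripping and language-specific
    # tokenization, which this sketch does not attempt.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


sample = "The cat saw the dog. The dog ran."
lexicon = weighted_unigrams(sample)

# Print the lexicon with the most frequent words first, ties broken alphabetically.
for word, weight in sorted(lexicon.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{word}\t{weight:.3f}")
```

Relative frequency is only one possible weighting; raw counts or log-scaled frequencies would fit the same structure.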
Structured Generative Models of Continuous Features for Word Sense Induction
Title | Structured Generative Models of Continuous Features for Word Sense Induction |
Authors | Alexandros Komninos, Suresh Manandhar |
Abstract | We propose a structured generative latent variable model that integrates information from multiple contextual representations for Word Sense Induction. Our approach jointly models global lexical, local lexical and dependency syntactic context. Each context type is associated with a latent variable and the three types of variables share a hierarchical structure. We use skip-gram based word and dependency context embeddings to construct all three types of representations, reducing the total number of parameters to be estimated and enabling better generalization. We describe an EM algorithm to efficiently estimate model parameters and use the Integrated Complete Likelihood criterion to automatically estimate the number of senses. Our model achieves state-of-the-art results on the SemEval-2010 and SemEval-2013 Word Sense Induction datasets. |
Tasks | Word Embeddings, Word Sense Disambiguation, Word Sense Induction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1337/ |
PWC | https://paperswithcode.com/paper/structured-generative-models-of-continuous |
Repo | |
Framework | |
Unraveling the English-Bengali Code-Mixing Phenomenon
Title | Unraveling the English-Bengali Code-Mixing Phenomenon |
Authors | Arunavha Chanda, Dipankar Das, Chandan Mazumdar |
Abstract | |
Tasks | |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-5810/ |
PWC | https://paperswithcode.com/paper/unraveling-the-english-bengali-code-mixing |
Repo | |
Framework | |
Context-Dependent Sense Embedding
Title | Context-Dependent Sense Embedding |
Authors | Lin Qiu, Kewei Tu, Yong Yu |
Abstract | |
Tasks | Word Embeddings, Word Sense Disambiguation, Word Sense Induction |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1018/ |
PWC | https://paperswithcode.com/paper/context-dependent-sense-embedding |
Repo | |
Framework | |
A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety
Title | A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety |
Authors | Mathieu Chollet, Torsten Wörtwein, Louis-Philippe Morency, Stefan Scherer |
Abstract | The ability to efficiently speak in public is an essential asset for many professions and is used in everyday life. As such, tools enabling the improvement of public speaking performance and the assessment and mitigation of anxiety related to public speaking would be very useful. Multimodal interaction technologies, such as computer vision and embodied conversational agents, have recently been investigated for the training and assessment of interpersonal skills. One central requirement for these technologies is multimodal corpora for training machine learning models. This paper addresses the need of these technologies by presenting and sharing a multimodal corpus of public speaking presentations. These presentations were collected in an experimental study investigating the potential of interactive virtual audiences for public speaking training. This corpus includes audio-visual data and automatically extracted features, measures of public speaking anxiety and personality, annotations of participants' behaviors and expert ratings of behavioral aspects and overall performance of the presenters. We hope this corpus will help other research teams in developing tools for supporting public speaking training. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1078/ |
PWC | https://paperswithcode.com/paper/a-multimodal-corpus-for-the-assessment-of |
Repo | |
Framework | |
Wiktionnaire’s Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary
Title | Wiktionnaire’s Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary |
Authors | Nabil Hathout, Franck Sajous |
Abstract | GLAWI is a free, large-scale and versatile Machine-Readable Dictionary (MRD) that has been extracted from the French language edition of Wiktionary, called Wiktionnaire. In (Sajous and Hathout, 2015), we introduced GLAWI, gave the rationale behind the creation of this lexicographic resource and described the extraction process, focusing on the conversion and standardization of the heterogeneous data provided by this collaborative dictionary. In the current article, we describe the content of GLAWI and illustrate how it is structured. We also suggest various applications, ranging from linguistic studies and NLP applications to psycholinguistic experimentation. They all can take advantage of the diversity of the lexical knowledge available in GLAWI. Besides this diversity and extensive lexical coverage, GLAWI is also remarkable because it is the only free lexical resource of contemporary French that contains definitions. This unique material opens the way to the renewal of MRD-based methods, notably the automated extraction and acquisition of semantic relations. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1218/ |
PWC | https://paperswithcode.com/paper/wiktionnaires-wikicode-glawified-a-workable |
Repo | |
Framework | |
Transition-Based Parsing for Deep Dependency Structures
Title | Transition-Based Parsing for Deep Dependency Structures |
Authors | Xun Zhang, Yantao Du, Weiwei Sun, Xiaojun Wan |
Abstract | |
Tasks | |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/J16-3001/ |
PWC | https://paperswithcode.com/paper/transition-based-parsing-for-deep-dependency |
Repo | |
Framework | |
Proceedings of the Sixth Named Entity Workshop
Title | Proceedings of the Sixth Named Entity Workshop |
Authors | |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2700/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-sixth-named-entity |
Repo | |
Framework | |
On Bias-free Crawling and Representative Web Corpora
Title | On Bias-free Crawling and Representative Web Corpora |
Authors | Roland Schäfer |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2612/ |
PWC | https://paperswithcode.com/paper/on-bias-free-crawling-and-representative-web |
Repo | |
Framework | |
Genre classification for a corpus of academic webpages
Title | Genre classification for a corpus of academic webpages |
Authors | Erika Dalan, Serge Sharoff |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2611/ |
PWC | https://paperswithcode.com/paper/genre-classification-for-a-corpus-of-academic |
Repo | |
Framework | |
Experimental Study of Vowels in Nagamese, Ao and Lotha: Languages of Nagaland
Title | Experimental Study of Vowels in Nagamese, Ao and Lotha: Languages of Nagaland |
Authors | Joyanta Basu, Tulika Basu, Soma Khan, Madhab Pal, Rajib Roy, Tapan Kumar Basu |
Abstract | |
Tasks | Language Identification, Speech Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-6339/ |
PWC | https://paperswithcode.com/paper/experimental-study-of-vowels-in-nagamese-ao |
Repo | |
Framework | |