May 4, 2019


Paper Group NANR 165



Joint Word Segmentation and Phonetic Category Induction

Title Joint Word Segmentation and Phonetic Category Induction
Authors Micha Elsner, Stephanie Antetomaso, Naomi Feldman
Abstract
Tasks Language Acquisition, Speech Recognition
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2010/
PDF https://www.aclweb.org/anthology/P16-2010
PWC https://paperswithcode.com/paper/joint-word-segmentation-and-phonetic-category
Repo
Framework

Learning Succinct Models: Pipelined Compression with L1-Regularization, Hashing, Elias-Fano Indices, and Quantization

Title Learning Succinct Models: Pipelined Compression with L1-Regularization, Hashing, Elias-Fano Indices, and Quantization
Authors Hajime Senuma, Akiko Aizawa
Abstract The recent proliferation of smart devices necessitates methods to learn small-sized models. This paper demonstrates that if there are $m$ features in total but only $n = o(\sqrt{m})$ features are required to distinguish examples, with $\Omega(\log m)$ training examples and reasonable settings, it is possible to obtain a good model in a succinct representation using $n \log_2 \frac{m}{n} + o(m)$ bits, by using a pipeline of existing compression methods: L1-regularized logistic regression, feature hashing, Elias–Fano indices, and randomized quantization. An experiment shows that a noun phrase chunking task for which an existing library requires 27 megabytes can be compressed to less than 13 kilobytes without notable loss of accuracy.
Tasks Chunking, Quantization
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1261/
PDF https://www.aclweb.org/anthology/C16-1261
PWC https://paperswithcode.com/paper/learning-succinct-models-pipelined
Repo
Framework
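
The abstract above names Elias–Fano indices as one stage of the compression pipeline. As a rough illustration of that stage only, here is a minimal sketch of the textbook Elias–Fano encoding for a sorted list of surviving feature indices. It is not the authors' implementation: the function names (`elias_fano_encode`, `elias_fano_decode`, `size_in_bits`) and the sample indices are hypothetical, and the bits are kept in ordinary Python lists for readability rather than packed into machine words.

```python
def elias_fano_encode(values, universe):
    """Encode a sorted list of ints in [0, universe) (hypothetical helper).

    Each value is split into `low_bits` low-order bits, stored verbatim,
    and the remaining high-order part, stored in unary in a bit vector.
    """
    n = len(values)
    if n == 0:
        return 0, [], []
    # floor(log2(universe / n)), clamped at 0 when universe < n
    low_bits = max(0, (universe // n).bit_length() - 1)
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]
    # The i-th value sets bit number (high part + i) in the upper vector.
    upper = [0] * (n + (universe >> low_bits) + 1)
    for i, v in enumerate(values):
        upper[(v >> low_bits) + i] = 1
    return low_bits, lows, upper


def elias_fano_decode(low_bits, lows, upper):
    """Recover the original sorted list from the two components."""
    values = []
    i = 0  # number of set bits seen so far
    for pos, bit in enumerate(upper):
        if bit:
            high = pos - i
            values.append((high << low_bits) | lows[i])
            i += 1
    return values


def size_in_bits(low_bits, lows, upper):
    """Bits a packed representation would need (lows plus upper vector)."""
    return low_bits * len(lows) + len(upper)


# Hypothetical example: 5 surviving feature indices out of m = 1024.
indices = [3, 17, 42, 100, 999]
encoded = elias_fano_encode(indices, 1024)
restored = elias_fano_decode(*encoded)
```

With n indices drawn from a universe of size m, this layout needs roughly n·log2(m/n) + 2n bits, which is where the n·log2(m/n) term in a bound like the one quoted in the abstract typically comes from. Real implementations also add a small select structure over the upper bit vector for fast random access, which this sketch omits.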

BulPhonC: Bulgarian Speech Corpus for the Development of ASR Technology

Title BulPhonC: Bulgarian Speech Corpus for the Development of ASR Technology
Authors Neli Hateva, Petar Mitankin, Stoyan Mihov
Abstract In this paper we introduce a Bulgarian speech database, which was created for the purpose of ASR technology development. The paper describes the design and the content of the speech database. We present also an empirical evaluation of the performance of a LVCSR system for Bulgarian trained on the BulPhonC data. The resource is available free for scientific usage.
Tasks Large Vocabulary Continuous Speech Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1123/
PDF https://www.aclweb.org/anthology/L16-1123
PWC https://paperswithcode.com/paper/bulphonc-bulgarian-speech-corpus-for-the
Repo
Framework

SemEval-2016 Task 14: Semantic Taxonomy Enrichment

Title SemEval-2016 Task 14: Semantic Taxonomy Enrichment
Authors David Jurgens, Mohammad Taher Pilehvar
Abstract
Tasks Information Retrieval, Semantic Textual Similarity, Sentiment Analysis, Word Sense Disambiguation
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1169/
PDF https://www.aclweb.org/anthology/S16-1169
PWC https://paperswithcode.com/paper/semeval-2016-task-14-semantic-taxonomy
Repo
Framework

A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System

Title A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System
Authors Ajda Gokcen, Evan Jaffe, Johnsey Erdmann, Michael White, Douglas Danforth
Abstract We present a corpus of virtual patient dialogues to which we have added manually annotated gold standard word alignments. Since each question asked by a medical student in the dialogues is mapped to a canonical, anticipated version of the question, the corpus implicitly defines a large set of paraphrase (and non-paraphrase) pairs. We also present a novel process for selecting the most useful data to annotate with word alignments and for ensuring consistent paraphrase status decisions. In support of this process, we have enhanced the earlier Edinburgh alignment tool (Cohn et al., 2008) and revised and extended the Edinburgh guidelines, in particular adding guidance intended to ensure that the word alignments are consistent with the overall paraphrase status decision. The finished corpus and the enhanced alignment tool are made freely available.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1506/
PDF https://www.aclweb.org/anthology/L16-1506
PWC https://paperswithcode.com/paper/a-corpus-of-word-aligned-asked-and
Repo
Framework

Analyzing Framing through the Casts of Characters in the News

Title Analyzing Framing through the Casts of Characters in the News
Authors Dallas Card, Justin Gross, Amber Boydstun, Noah A. Smith
Abstract
Tasks Model Selection, Topic Models
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1148/
PDF https://www.aclweb.org/anthology/D16-1148
PWC https://paperswithcode.com/paper/analyzing-framing-through-the-casts-of
Repo
Framework

Interlocking Phrases in Phrase-based Statistical Machine Translation

Title Interlocking Phrases in Phrase-based Statistical Machine Translation
Authors Ye Kyaw Thu, Andrew Finch, Eiichiro Sumita
Abstract
Tasks Language Modelling, Machine Translation
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1124/
PDF https://www.aclweb.org/anthology/N16-1124
PWC https://paperswithcode.com/paper/interlocking-phrases-in-phrase-based
Repo
Framework

CogALex-V Shared Task: Mach5 – A traditional DSM approach to semantic relatedness

Title CogALex-V Shared Task: Mach5 – A traditional DSM approach to semantic relatedness
Authors Stefan Evert
Abstract This contribution provides a strong baseline result for the CogALex-V shared task using a traditional "count"-type DSM (placed in rank 2 out of 7 in subtask 1 and rank 3 out of 6 in subtask 2). Parameter tuning experiments reveal some surprising effects and suggest that the use of random word pairs as negative examples may be problematic, guiding the parameter optimization in an undesirable direction.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5312/
PDF https://www.aclweb.org/anthology/W16-5312
PWC https://paperswithcode.com/paper/cogalex-v-shared-task-mach5-a-a-traditional
Repo
Framework

How Naked is the Naked Truth? A Multilingual Lexicon of Nominal Compound Compositionality

Title How Naked is the Naked Truth? A Multilingual Lexicon of Nominal Compound Compositionality
Authors Carlos Ramisch, Silvio Cordeiro, Leonardo Zilio, Marco Idiart, Aline Villavicencio
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2026/
PDF https://www.aclweb.org/anthology/P16-2026
PWC https://paperswithcode.com/paper/how-naked-is-the-naked-truth-a-multilingual
Repo
Framework

Detecting Mild Cognitive Impairment by Exploiting Linguistic Information from Transcripts

Title Detecting Mild Cognitive Impairment by Exploiting Linguistic Information from Transcripts
Authors Veronika Vincze, Gábor Gosztolya, László Tóth, Ildikó Hoffmann, Gréta Szatlóczki, Zoltán Bánréti, Magdolna Pákáski, János Kálmán
Abstract
Tasks Lexical Analysis, Speech Recognition
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2030/
PDF https://www.aclweb.org/anthology/P16-2030
PWC https://paperswithcode.com/paper/detecting-mild-cognitive-impairment-by
Repo
Framework

ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data

Title ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data
Authors Jaime Huerta-Cepas, François Serra, Peer Bork
Abstract The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common tasks in comparative genomics and phylogenetics. The new features include (i) building gene-based and supermatrix-based phylogenies using a single command, (ii) testing and visualizing evolutionary models, (iii) calculating distances between trees of different size or including duplications, and (iv) providing seamless integration with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org
Tasks
Published 2016-02-26
URL https://academic.oup.com/mbe/article/33/6/1635/2579822
PDF https://academic.oup.com/mbe/article-pdf/33/6/1635/7953632/msw046.pdf
PWC https://paperswithcode.com/paper/ete-3-reconstruction-analysis-and
Repo
Framework

Do Enterprises Have Emotions?

Title Do Enterprises Have Emotions?
Authors Sven Buechel, Udo Hahn, Jan Goldenstein, Sebastian G. M. Händschke, Peter Walgenbach
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0423/
PDF https://www.aclweb.org/anthology/W16-0423
PWC https://paperswithcode.com/paper/do-enterprises-have-emotions
Repo
Framework

Deep Learning for Predicting Human Strategic Behavior

Title Deep Learning for Predicting Human Strategic Behavior
Authors Jason S. Hartford, James R. Wright, Kevin Leyton-Brown
Abstract Predicting the behavior of human participants in strategic settings is an important problem in many domains. Most existing work either assumes that participants are perfectly rational, or attempts to directly model each participant’s cognitive processes based on insights from cognitive psychology and experimental economics. In this work, we present an alternative, a deep learning approach that automatically performs cognitive modeling without relying on such expert knowledge. We introduce a novel architecture that allows a single network to generalize across different input and output dimensions by using matrix units rather than scalar units, and show that its performance significantly outperforms that of the previous state of the art, which relies on expert-constructed features.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6509-deep-learning-for-predicting-human-strategic-behavior
PDF http://papers.nips.cc/paper/6509-deep-learning-for-predicting-human-strategic-behavior.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-predicting-human-strategic
Repo
Framework

ENIAM: Categorial Syntactic-Semantic Parser for Polish

Title ENIAM: Categorial Syntactic-Semantic Parser for Polish
Authors Wojciech Jaworski, Jakub Kozakoszczak
Abstract This paper presents ENIAM, the first syntactic and semantic parser that generates semantic representations for sentences in Polish. The parser processes non-annotated data and performs tokenization, lemmatization, dependency recognition, word sense annotation, thematic role annotation, partial disambiguation and computes the semantic representation.
Tasks Information Retrieval, Lemmatization, Natural Language Inference, Question Answering, Tokenization
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2051/
PDF https://www.aclweb.org/anthology/C16-2051
PWC https://paperswithcode.com/paper/eniam-categorial-syntactic-semantic-parser
Repo
Framework

LAMB: A Good Shepherd of Morphologically Rich Languages

Title LAMB: A Good Shepherd of Morphologically Rich Languages
Authors Sebastian Ebert, Thomas Müller, Hinrich Schütze
Abstract
Tasks Lemmatization
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1071/
PDF https://www.aclweb.org/anthology/D16-1071
PWC https://paperswithcode.com/paper/lamb-a-good-shepherd-of-morphologically-rich
Repo
Framework