Paper Group NANR 165
Joint Word Segmentation and Phonetic Category Induction
Title | Joint Word Segmentation and Phonetic Category Induction |
Authors | Micha Elsner, Stephanie Antetomaso, Naomi Feldman |
Abstract | |
Tasks | Language Acquisition, Speech Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2010/ |
https://www.aclweb.org/anthology/P16-2010 | |
PWC | https://paperswithcode.com/paper/joint-word-segmentation-and-phonetic-category |
Repo | |
Framework | |
Learning Succinct Models: Pipelined Compression with L1-Regularization, Hashing, Elias-Fano Indices, and Quantization
Title | Learning Succinct Models: Pipelined Compression with L1-Regularization, Hashing, Elias-Fano Indices, and Quantization |
Authors | Hajime Senuma, Akiko Aizawa |
Abstract | The recent proliferation of smart devices necessitates methods to learn small-sized models. This paper demonstrates that if there are $m$ features in total but only $n = o(\sqrt{m})$ features are required to distinguish examples, with $\Omega(\log m)$ training examples and reasonable settings, it is possible to obtain a good model in a *succinct* representation using $n \log_2 \frac{m}{n} + o(m)$ bits, by using a pipeline of existing compression methods: L1-regularized logistic regression, feature hashing, Elias–Fano indices, and randomized quantization. An experiment shows that a noun phrase chunking task for which an existing library requires 27 megabytes can be compressed to less than 13 *kilo*bytes without notable loss of accuracy. |
Tasks | Chunking, Quantization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1261/ |
https://www.aclweb.org/anthology/C16-1261 | |
PWC | https://paperswithcode.com/paper/learning-succinct-models-pipelined |
Repo | |
Framework | |
BulPhonC: Bulgarian Speech Corpus for the Development of ASR Technology
Title | BulPhonC: Bulgarian Speech Corpus for the Development of ASR Technology |
Authors | Neli Hateva, Petar Mitankin, Stoyan Mihov |
Abstract | In this paper we introduce a Bulgarian speech database, which was created for the purpose of ASR technology development. The paper describes the design and the content of the speech database. We also present an empirical evaluation of the performance of an LVCSR system for Bulgarian trained on the BulPhonC data. The resource is available free for scientific usage. |
Tasks | Large Vocabulary Continuous Speech Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1123/ |
https://www.aclweb.org/anthology/L16-1123 | |
PWC | https://paperswithcode.com/paper/bulphonc-bulgarian-speech-corpus-for-the |
Repo | |
Framework | |
SemEval-2016 Task 14: Semantic Taxonomy Enrichment
Title | SemEval-2016 Task 14: Semantic Taxonomy Enrichment |
Authors | David Jurgens, Mohammad Taher Pilehvar |
Abstract | |
Tasks | Information Retrieval, Semantic Textual Similarity, Sentiment Analysis, Word Sense Disambiguation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1169/ |
https://www.aclweb.org/anthology/S16-1169 | |
PWC | https://paperswithcode.com/paper/semeval-2016-task-14-semantic-taxonomy |
Repo | |
Framework | |
A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System
Title | A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System |
Authors | Ajda Gokcen, Evan Jaffe, Johnsey Erdmann, Michael White, Douglas Danforth |
Abstract | We present a corpus of virtual patient dialogues to which we have added manually annotated gold standard word alignments. Since each question asked by a medical student in the dialogues is mapped to a canonical, anticipated version of the question, the corpus implicitly defines a large set of paraphrase (and non-paraphrase) pairs. We also present a novel process for selecting the most useful data to annotate with word alignments and for ensuring consistent paraphrase status decisions. In support of this process, we have enhanced the earlier Edinburgh alignment tool (Cohn et al., 2008) and revised and extended the Edinburgh guidelines, in particular adding guidance intended to ensure that the word alignments are consistent with the overall paraphrase status decision. The finished corpus and the enhanced alignment tool are made freely available. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1506/ |
https://www.aclweb.org/anthology/L16-1506 | |
PWC | https://paperswithcode.com/paper/a-corpus-of-word-aligned-asked-and |
Repo | |
Framework | |
Analyzing Framing through the Casts of Characters in the News
Title | Analyzing Framing through the Casts of Characters in the News |
Authors | Dallas Card, Justin Gross, Amber Boydstun, Noah A. Smith |
Abstract | |
Tasks | Model Selection, Topic Models |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1148/ |
https://www.aclweb.org/anthology/D16-1148 | |
PWC | https://paperswithcode.com/paper/analyzing-framing-through-the-casts-of |
Repo | |
Framework | |
Interlocking Phrases in Phrase-based Statistical Machine Translation
Title | Interlocking Phrases in Phrase-based Statistical Machine Translation |
Authors | Ye Kyaw Thu, Andrew Finch, Eiichiro Sumita |
Abstract | |
Tasks | Language Modelling, Machine Translation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1124/ |
https://www.aclweb.org/anthology/N16-1124 | |
PWC | https://paperswithcode.com/paper/interlocking-phrases-in-phrase-based |
Repo | |
Framework | |
CogALex-V Shared Task: Mach5 – A traditional DSM approach to semantic relatedness
Title | CogALex-V Shared Task: Mach5 – A traditional DSM approach to semantic relatedness |
Authors | Stefan Evert |
Abstract | This contribution provides a strong baseline result for the CogALex-V shared task using a traditional “count”-type DSM (placed in rank 2 out of 7 in subtask 1 and rank 3 out of 6 in subtask 2). Parameter tuning experiments reveal some surprising effects and suggest that the use of random word pairs as negative examples may be problematic, guiding the parameter optimization in an undesirable direction. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5312/ |
https://www.aclweb.org/anthology/W16-5312 | |
PWC | https://paperswithcode.com/paper/cogalex-v-shared-task-mach5-a-a-traditional |
Repo | |
Framework | |
How Naked is the Naked Truth? A Multilingual Lexicon of Nominal Compound Compositionality
Title | How Naked is the Naked Truth? A Multilingual Lexicon of Nominal Compound Compositionality |
Authors | Carlos Ramisch, Silvio Cordeiro, Leonardo Zilio, Marco Idiart, Aline Villavicencio |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2026/ |
https://www.aclweb.org/anthology/P16-2026 | |
PWC | https://paperswithcode.com/paper/how-naked-is-the-naked-truth-a-multilingual |
Repo | |
Framework | |
Detecting Mild Cognitive Impairment by Exploiting Linguistic Information from Transcripts
Title | Detecting Mild Cognitive Impairment by Exploiting Linguistic Information from Transcripts |
Authors | Veronika Vincze, Gábor Gosztolya, László Tóth, Ildikó Hoffmann, Gréta Szatlóczki, Zoltán Bánréti, Magdolna Pákáski, János Kálmán |
Abstract | |
Tasks | Lexical Analysis, Speech Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2030/ |
https://www.aclweb.org/anthology/P16-2030 | |
PWC | https://paperswithcode.com/paper/detecting-mild-cognitive-impairment-by |
Repo | |
Framework | |
ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data
Title | ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data |
Authors | Jaime Huerta-Cepas, François Serra, Peer Bork |
Abstract | The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common tasks in comparative genomics and phylogenetics. The new features include (i) building gene-based and supermatrix-based phylogenies using a single command, (ii) testing and visualizing evolutionary models, (iii) calculating distances between trees of different size or including duplications, and (iv) providing seamless integration with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org |
Tasks | |
Published | 2016-02-26 |
URL | https://academic.oup.com/mbe/article/33/6/1635/2579822 |
https://academic.oup.com/mbe/article-pdf/33/6/1635/7953632/msw046.pdf | |
PWC | https://paperswithcode.com/paper/ete-3-reconstruction-analysis-and |
Repo | |
Framework | |
Do Enterprises Have Emotions?
Title | Do Enterprises Have Emotions? |
Authors | Sven Buechel, Udo Hahn, Jan Goldenstein, Sebastian G. M. Händschke, Peter Walgenbach |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0423/ |
https://www.aclweb.org/anthology/W16-0423 | |
PWC | https://paperswithcode.com/paper/do-enterprises-have-emotions |
Repo | |
Framework | |
Deep Learning for Predicting Human Strategic Behavior
Title | Deep Learning for Predicting Human Strategic Behavior |
Authors | Jason S. Hartford, James R. Wright, Kevin Leyton-Brown |
Abstract | Predicting the behavior of human participants in strategic settings is an important problem in many domains. Most existing work either assumes that participants are perfectly rational, or attempts to directly model each participant’s cognitive processes based on insights from cognitive psychology and experimental economics. In this work, we present an alternative, a deep learning approach that automatically performs cognitive modeling without relying on such expert knowledge. We introduce a novel architecture that allows a single network to generalize across different input and output dimensions by using matrix units rather than scalar units, and show that its performance significantly outperforms that of the previous state of the art, which relies on expert-constructed features. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6509-deep-learning-for-predicting-human-strategic-behavior |
http://papers.nips.cc/paper/6509-deep-learning-for-predicting-human-strategic-behavior.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-predicting-human-strategic |
Repo | |
Framework | |
ENIAM: Categorial Syntactic-Semantic Parser for Polish
Title | ENIAM: Categorial Syntactic-Semantic Parser for Polish |
Authors | Wojciech Jaworski, Jakub Kozakoszczak |
Abstract | This paper presents ENIAM, the first syntactic and semantic parser that generates semantic representations for sentences in Polish. The parser processes non-annotated data and performs tokenization, lemmatization, dependency recognition, word sense annotation, thematic role annotation, partial disambiguation and computes the semantic representation. |
Tasks | Information Retrieval, Lemmatization, Natural Language Inference, Question Answering, Tokenization |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2051/ |
https://www.aclweb.org/anthology/C16-2051 | |
PWC | https://paperswithcode.com/paper/eniam-categorial-syntactic-semantic-parser |
Repo | |
Framework | |
LAMB: A Good Shepherd of Morphologically Rich Languages
Title | LAMB: A Good Shepherd of Morphologically Rich Languages |
Authors | Sebastian Ebert, Thomas Müller, Hinrich Schütze |
Abstract | |
Tasks | Lemmatization |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1071/ |
https://www.aclweb.org/anthology/D16-1071 | |
PWC | https://paperswithcode.com/paper/lamb-a-good-shepherd-of-morphologically-rich |
Repo | |
Framework | |