Paper Group NANR 87
Subtask Mining from Search Query Logs for How-Knowledge Acceleration. Online Information Retrieval for Language Learning. Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists. A Multilinear Approach to the Unsupervised Learning of Morphology. Purely sequence-trained neural networks for ASR based on lattice-free …
Subtask Mining from Search Query Logs for How-Knowledge Acceleration
Title | Subtask Mining from Search Query Logs for How-Knowledge Acceleration |
Authors | Chung-Lun Kuo, Hsin-Hsi Chen |
Abstract | How-knowledge is indispensable in daily life, but has relatively less quantity and poorer quality than what-knowledge in publicly available knowledge bases. This paper first extracts task-subtask pairs from wikiHow, then mines linguistic patterns from search query logs, and finally applies the mined patterns to extract subtasks to complete given how-to tasks. To evaluate the proposed methodology, we group tasks and the corresponding recommended subtasks into pairs, and evaluate the results automatically and manually. The automatic evaluation shows an accuracy of 0.4494. We also classify the mined patterns based on prepositions and find that prepositions like "on", "to", and "with" perform better. The results can be used to accelerate how-knowledge base construction. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1198/ |
PWC | https://paperswithcode.com/paper/subtask-mining-from-search-query-logs-for-how |
Repo | |
Framework | |
Online Information Retrieval for Language Learning
Title | Online Information Retrieval for Language Learning |
Authors | Maria Chinkina, Madeeswaran Kannan, Detmar Meurers |
Abstract | |
Tasks | Information Retrieval, Language Acquisition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4002/ |
PWC | https://paperswithcode.com/paper/online-information-retrieval-for-language |
Repo | |
Framework | |
Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists
Title | Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists |
Authors | Estíbaliz Iglesias-Franjo, Jesús Vilares |
Abstract | |
Tasks | Information Retrieval |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2103/ |
PWC | https://paperswithcode.com/paper/searching-four-millenia-old-digitized |
Repo | |
Framework | |
A Multilinear Approach to the Unsupervised Learning of Morphology
Title | A Multilinear Approach to the Unsupervised Learning of Morphology |
Authors | Anthony Meyer, Markus Dickinson |
Abstract | |
Tasks | Transliteration |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2020/ |
PWC | https://paperswithcode.com/paper/a-multilinear-approach-to-the-unsupervised |
Repo | |
Framework | |
Purely sequence-trained neural networks for ASR based on lattice-free MMI
Title | Purely sequence-trained neural networks for ASR based on lattice-free MMI |
Authors | Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahrmani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur |
Abstract | In this paper we describe a method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training. We use the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI. To make its computation feasible we use a phone n-gram language model, in place of the word language model. To further reduce its space and time complexity we compute the objective function using neural network outputs at one third the standard frame rate. These changes enable us to perform the computation for the forward-backward algorithm on GPUs. Further, the reduced output frame rate also provides a significant speed-up during decoding. We present results on 5 different LVCSR tasks with training data ranging from 100 to 2100 hours. Models trained with LF-MMI provide a relative word error rate reduction of ∼11.5% over those trained with the cross-entropy objective function, and ∼8% over those trained with cross-entropy and sMBR objective functions. A further reduction of ∼2.5%, relative, can be obtained by fine-tuning these models with the word-lattice based sMBR objective function. |
Tasks | Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition |
Published | 2016-09-08 |
URL | https://www.danielpovey.com/files/2016_interspeech_mmi.pdf |
PWC | https://paperswithcode.com/paper/purely-sequence-trained-neural-networks-for |
Repo | |
Framework | |
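
The abstract above quotes relative word error rate (WER) reductions and a network output rate of one third the standard frame rate. The sketch below is only an illustration of what those two quantities mean; the function names and the WER values are hypothetical and are not taken from the paper or from any toolkit.

```python
import math


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, expressed as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def output_frames(num_input_frames: int, subsampling_factor: int = 3) -> int:
    """Number of network outputs when one output is produced per
    `subsampling_factor` input frames (a partial final group is rounded up)."""
    return math.ceil(num_input_frames / subsampling_factor)


# Hypothetical numbers chosen for illustration, not results from the paper.
baseline = 20.0  # WER (%) of an assumed cross-entropy system
improved = 17.7  # WER (%) of an assumed LF-MMI system
print(f"relative reduction: {relative_reduction(baseline, improved):.1%}")
print(f"outputs for 1000 input frames: {output_frames(1000)}")
```

Note that a relative reduction is measured against the baseline WER, so the same relative figure corresponds to a smaller absolute gain when the baseline is already low.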
Research on attention memory networks as a model for learning natural language inference
Title | Research on attention memory networks as a model for learning natural language inference |
Authors | Zhuang Liu, Degen Huang, Jing Zhang, Kaiyu Huang |
Abstract | |
Tasks | Natural Language Inference, Question Answering, Sentence Pair Modeling, Structured Prediction, Text Summarization |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-5902/ |
PWC | https://paperswithcode.com/paper/research-on-attention-memory-networks-as-a |
Repo | |
Framework | |
new/s/leak – Information Extraction and Visualization for Investigative Data Journalists
Title | new/s/leak – Information Extraction and Visualization for Investigative Data Journalists |
Authors | Seid Muhie Yimam, Heiner Ulrich, Tatiana von Landesberger, Marcel Rosenbach, Michaela Regneri, Alexander Panchenko, Franziska Lehmann, Uli Fahrer, Chris Biemann, Kathrin Ballweg |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4028/ |
PWC | https://paperswithcode.com/paper/newsleak-a-information-extraction-and |
Repo | |
Framework | |
Privacy Issues in Online Machine Translation Services - European Perspective
Title | Privacy Issues in Online Machine Translation Services - European Perspective |
Authors | Pawel Kamocki, Jim O'Regan |
Abstract | In order to develop its full potential, global communication needs linguistic support systems such as Machine Translation (MT). In the past decade, free online MT tools have become available to the general public, and the quality of their output is increasing. However, the use of such tools may entail various legal implications, especially as far as processing of personal data is concerned. This is even more evident if we take into account that their business model is largely based on providing translation in exchange for data, which can subsequently be used to improve the translation model, but also for commercial purposes. The purpose of this paper is to examine how free online MT tools fit in the European data protection framework, harmonised by the EU Data Protection Directive. The perspectives of both the user and the MT service provider are taken into account. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1706/ |
PWC | https://paperswithcode.com/paper/privacy-issues-in-online-machine-translation |
Repo | |
Framework | |
Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks
Title | Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks |
Authors | Ines Rehbein, Merel Scholman, Vera Demberg |
Abstract | In discourse relation annotation, there is currently a variety of different frameworks being used, and most of them have been developed and employed mostly on written data. This raises a number of questions regarding interoperability of discourse relation annotation schemes, as well as regarding differences in discourse annotation for written vs. spoken domains. In this paper, we describe our experiments on annotating two spoken domains from the SPICE Ireland corpus (telephone conversations and broadcast interviews) according to two different discourse annotation schemes, PDTB 3.0 and CCR. We show that annotations in the two schemes can largely be mapped to one another, and discuss differences in operationalisations of discourse relation schemes which present a challenge to automatic mapping. We also observe systematic differences in the prevalence of implicit discourse relations in spoken data compared to written texts, and find that there are also differences in the types of causal relations between the domains. Finally, we find that PDTB 3.0 addresses many shortcomings of PDTB 2.0 with respect to the annotation of spoken discourse, and suggest further extensions. The new corpus has roughly the size of the CoNLL 2015 Shared Task test set, and we hence hope that it will be a valuable resource for the evaluation of automatic discourse relation labellers. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1165/ |
PWC | https://paperswithcode.com/paper/annotating-discourse-relations-in-spoken |
Repo | |
Framework | |
Linguistica 5: Unsupervised Learning of Linguistic Structure
Title | Linguistica 5: Unsupervised Learning of Linguistic Structure |
Authors | Jackson Lee, John Goldsmith |
Abstract | |
Tasks | Language Acquisition |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-3005/ |
PWC | https://paperswithcode.com/paper/linguistica-5-unsupervised-learning-of |
Repo | |
Framework | |
Instant Feedback for Increasing the Presence of Solutions in Peer Reviews
Title | Instant Feedback for Increasing the Presence of Solutions in Peer Reviews |
Authors | Huy Nguyen, Wenting Xiong, Diane Litman |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-3002/ |
PWC | https://paperswithcode.com/paper/instant-feedback-for-increasing-the-presence |
Repo | |
Framework | |
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
Title | Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations |
Authors | |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-3000/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2016-conference-of-the-2 |
Repo | |
Framework | |
An incremental model of syntactic bootstrapping
Title | An incremental model of syntactic bootstrapping |
Authors | Christos Christodoulopoulos, Dan Roth, Cynthia Fisher |
Abstract | |
Tasks | Language Acquisition, Semantic Role Labeling |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1906/ |
PWC | https://paperswithcode.com/paper/an-incremental-model-of-syntactic |
Repo | |
Framework | |
Domain Adaptation for Named Entity Recognition Using CRFs
Title | Domain Adaptation for Named Entity Recognition Using CRFs |
Authors | Tian Tian, Marco Dinarelli, Isabelle Tellier, Pedro Dias Cardoso |
Abstract | In this paper we explain how we created a labelled corpus in English for a Named Entity Recognition (NER) task from multi-source and multi-domain data, for an industrial partner. We explain the specificities of this corpus with examples and describe some baseline experiments. We present some results of domain adaptation on this corpus using a labelled Twitter corpus (Ritter et al., 2011). We tested a semi-supervised method from (Garcia-Fernandez et al., 2014) combined with a supervised domain adaptation approach proposed in (Raymond and Fayolle, 2010) for machine learning experiments with CRFs (Conditional Random Fields). We use the same technique to improve the NER results on the Twitter corpus (Ritter et al., 2011). Our contributions thus consist in an industrial corpus creation and NER performance improvements. |
Tasks | Domain Adaptation, Named Entity Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1089/ |
PWC | https://paperswithcode.com/paper/domain-adaptation-for-named-entity-1 |
Repo | |
Framework | |
Identifying Cross-Cultural Differences in Word Usage
Title | Identifying Cross-Cultural Differences in Word Usage |
Authors | Aparna Garimella, Rada Mihalcea, James Pennebaker |
Abstract | Personal writings have inspired researchers in the fields of linguistics and psychology to study the relationship between language and culture to better understand the psychology of people across different cultures. In this paper, we explore this relation by developing cross-cultural word models to identify words with cultural bias, i.e., words that are used in significantly different ways by speakers from different cultures. Focusing specifically on two cultures: United States and Australia, we identify a set of words with significant usage differences, and further investigate these words through feature analysis and topic modeling, shedding light on the attributes of language that contribute to these differences. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1065/ |
PWC | https://paperswithcode.com/paper/identifying-cross-cultural-differences-in |
Repo | |
Framework | |