May 5, 2019

Paper Group NANR 87

Subtask Mining from Search Query Logs for How-Knowledge Acceleration. Online Information Retrieval for Language Learning. Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists. A Multilinear Approach to the Unsupervised Learning of Morphology. Purely sequence-trained neural networks for ASR based on lattice-free …

Subtask Mining from Search Query Logs for How-Knowledge Acceleration

Title Subtask Mining from Search Query Logs for How-Knowledge Acceleration
Authors Chung-Lun Kuo, Hsin-Hsi Chen
Abstract How-knowledge is indispensable in daily life, but has relatively less quantity and poorer quality than what-knowledge in publicly available knowledge bases. This paper first extracts task-subtask pairs from wikiHow, then mines linguistic patterns from search query logs, and finally applies the mined patterns to extract subtasks to complete given how-to tasks. To evaluate the proposed methodology, we group tasks and the corresponding recommended subtasks into pairs, and evaluate the results automatically and manually. The automatic evaluation shows the accuracy of 0.4494. We also classify the mined patterns based on prepositions and find that the prepositions like "on", "to", and "with" have the better performance. The results can be used to accelerate how-knowledge base construction.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1198/
PDF https://www.aclweb.org/anthology/L16-1198
PWC https://paperswithcode.com/paper/subtask-mining-from-search-query-logs-for-how
Repo
Framework
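
The abstract above describes applying preposition-based patterns mined from query logs to extract subtasks for a given how-to task. A minimal sketch of that idea follows. It is not the authors' implementation: the pattern shape ("task, preposition, subtask"), the function name `extract_subtasks`, and the sample queries are all assumptions made for illustration. Only the three prepositions come from the abstract.

```python
import re

# Prepositions the abstract reports as performing best. The pattern shape
# "<task> <preposition> <subtask>" is an assumption, not the paper's definition.
PREPOSITIONS = ("on", "to", "with")


def extract_subtasks(task, queries):
    """Return (preposition, candidate subtask) pairs for queries that start with `task`."""
    pattern = re.compile(
        r"^" + re.escape(task) + r"\s+(" + "|".join(PREPOSITIONS) + r")\s+(.+)$",
        re.IGNORECASE,
    )
    found = []
    for query in queries:
        match = pattern.match(query.strip())
        if match:
            found.append((match.group(1).lower(), match.group(2).strip()))
    return found


# Made-up queries, for illustration only.
queries = [
    "make pizza with a cast iron pan",
    "make pizza on a grill",
    "pizza delivery near me",
]
candidates = extract_subtasks("make pizza", queries)
```

Here `candidates` holds the two queries that match a pattern; the third query is ignored because it does not begin with the task phrase. A real system would presumably learn its patterns from the logs rather than hard-code them.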

Online Information Retrieval for Language Learning

Title Online Information Retrieval for Language Learning
Authors Maria Chinkina, Madeeswaran Kannan, Detmar Meurers
Abstract
Tasks Information Retrieval, Language Acquisition
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-4002/
PDF https://www.aclweb.org/anthology/P16-4002
PWC https://paperswithcode.com/paper/online-information-retrieval-for-language
Repo
Framework

Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists

Title Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists
Authors Estíbaliz Iglesias-Franjo, Jesús Vilares
Abstract
Tasks Information Retrieval
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2103/
PDF https://www.aclweb.org/anthology/W16-2103
PWC https://paperswithcode.com/paper/searching-four-millenia-old-digitized
Repo
Framework

A Multilinear Approach to the Unsupervised Learning of Morphology

Title A Multilinear Approach to the Unsupervised Learning of Morphology
Authors Anthony Meyer, Markus Dickinson
Abstract
Tasks Transliteration
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2020/
PDF https://www.aclweb.org/anthology/W16-2020
PWC https://paperswithcode.com/paper/a-multilinear-approach-to-the-unsupervised
Repo
Framework

Purely sequence-trained neural networks for ASR based on lattice-free MMI

Title Purely sequence-trained neural networks for ASR based on lattice-free MMI
Authors Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahrmani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur
Abstract In this paper we describe a method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training. We use the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI. To make its computation feasible we use a phone n-gram language model, in place of the word language model. To further reduce its space and time complexity we compute the objective function using neural network outputs at one third the standard frame rate. These changes enable us to perform the computation for the forward-backward algorithm on GPUs. Further, the reduced output frame-rate also provides a significant speed-up during decoding. We present results on 5 different LVCSR tasks with training data ranging from 100 to 2100 hours. Models trained with LF-MMI provide a relative word error rate reduction of ∼11.5%, over those trained with cross-entropy objective function, and ∼8%, over those trained with cross-entropy and sMBR objective functions. A further reduction of ∼2.5%, relative, can be obtained by fine tuning these models with the word-lattice based sMBR objective function.
Tasks Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition
Published 2016-09-08
URL https://www.danielpovey.com/files/2016_interspeech_mmi.pdf
PDF https://www.danielpovey.com/files/2016_interspeech_mmi.pdf
PWC https://paperswithcode.com/paper/purely-sequence-trained-neural-networks-for
Repo
Framework
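
The abstract above reports its gains as relative word error rate (WER) reductions and mentions computing network outputs at one third of the standard frame rate. The sketch below works through both pieces of arithmetic. The WER values and frame counts are hypothetical numbers chosen for illustration, not results from the paper, and the ceiling-division rule in `output_frames` is an assumption about how a partial final window would be handled.

```python
def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction: the fraction of the baseline error that was removed."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def output_frames(num_input_frames, subsampling_factor=3):
    """Number of network outputs when one output covers `subsampling_factor` input frames."""
    # Ceiling division, so a partial final window still yields one output (an assumption).
    return -(-num_input_frames // subsampling_factor)


# Hypothetical example: a baseline at 20.0% WER improved to 17.7% WER
# is a relative reduction of about 11.5%, although the absolute drop is 2.3 points.
example = relative_reduction(20.0, 17.7)

# Hypothetical example: 300 input frames evaluated at one third the frame rate.
reduced = output_frames(300)
```

The distinction matters when reading the abstract: an 11.5% relative reduction is much smaller in absolute terms than an 11.5-point drop in WER.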

Research on attention memory networks as a model for learning natural language inference

Title Research on attention memory networks as a model for learning natural language inference
Authors Zhuang Liu, Degen Huang, Jing Zhang, Kaiyu Huang
Abstract
Tasks Natural Language Inference, Question Answering, Sentence Pair Modeling, Structured Prediction, Text Summarization
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-5902/
PDF https://www.aclweb.org/anthology/W16-5902
PWC https://paperswithcode.com/paper/research-on-attention-memory-networks-as-a
Repo
Framework

new/s/leak – Information Extraction and Visualization for Investigative Data Journalists

Title new/s/leak – Information Extraction and Visualization for Investigative Data Journalists
Authors Seid Muhie Yimam, Heiner Ulrich, Tatiana von Landesberger, Marcel Rosenbach, Michaela Regneri, Alexander Panchenko, Franziska Lehmann, Uli Fahrer, Chris Biemann, Kathrin Ballweg
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-4028/
PDF https://www.aclweb.org/anthology/P16-4028
PWC https://paperswithcode.com/paper/newsleak-a-information-extraction-and
Repo
Framework

Privacy Issues in Online Machine Translation Services - European Perspective

Title Privacy Issues in Online Machine Translation Services - European Perspective
Authors Pawel Kamocki, Jim O'Regan
Abstract In order to develop its full potential, global communication needs linguistic support systems such as Machine Translation (MT). In the past decade, free online MT tools have become available to the general public, and the quality of their output is increasing. However, the use of such tools may entail various legal implications, especially as far as processing of personal data is concerned. This is even more evident if we take into account that their business model is largely based on providing translation in exchange for data, which can subsequently be used to improve the translation model, but also for commercial purposes. The purpose of this paper is to examine how free online MT tools fit in the European data protection framework, harmonised by the EU Data Protection Directive. The perspectives of both the user and the MT service provider are taken into account.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1706/
PDF https://www.aclweb.org/anthology/L16-1706
PWC https://paperswithcode.com/paper/privacy-issues-in-online-machine-translation
Repo
Framework

Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks

Title Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks
Authors Ines Rehbein, Merel Scholman, Vera Demberg
Abstract In discourse relation annotation, there is currently a variety of different frameworks being used, and most of them have been developed and employed mostly on written data. This raises a number of questions regarding interoperability of discourse relation annotation schemes, as well as regarding differences in discourse annotation for written vs. spoken domains. In this paper, we describe our experiments on annotating two spoken domains from the SPICE Ireland corpus (telephone conversations and broadcast interviews) according to two different discourse annotation schemes, PDTB 3.0 and CCR. We show that annotations in the two schemes can largely be mapped onto one another, and discuss differences in operationalisations of discourse relation schemes which present a challenge to automatic mapping. We also observe systematic differences in the prevalence of implicit discourse relations in spoken data compared to written texts, and find that there are also differences in the types of causal relations between the domains. Finally, we find that PDTB 3.0 addresses many shortcomings of PDTB 2.0 wrt. the annotation of spoken discourse, and suggest further extensions. The new corpus has roughly the size of the CoNLL 2015 Shared Task test set, and we hence hope that it will be a valuable resource for the evaluation of automatic discourse relation labellers.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1165/
PDF https://www.aclweb.org/anthology/L16-1165
PWC https://paperswithcode.com/paper/annotating-discourse-relations-in-spoken
Repo
Framework

Linguistica 5: Unsupervised Learning of Linguistic Structure

Title Linguistica 5: Unsupervised Learning of Linguistic Structure
Authors Jackson Lee, John Goldsmith
Abstract
Tasks Language Acquisition
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-3005/
PDF https://www.aclweb.org/anthology/N16-3005
PWC https://paperswithcode.com/paper/linguistica-5-unsupervised-learning-of
Repo
Framework

Instant Feedback for Increasing the Presence of Solutions in Peer Reviews

Title Instant Feedback for Increasing the Presence of Solutions in Peer Reviews
Authors Huy Nguyen, Wenting Xiong, Diane Litman
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-3002/
PDF https://www.aclweb.org/anthology/N16-3002
PWC https://paperswithcode.com/paper/instant-feedback-for-increasing-the-presence
Repo
Framework

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

Title Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
Authors
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-3000/
PDF https://www.aclweb.org/anthology/N16-3000
PWC https://paperswithcode.com/paper/proceedings-of-the-2016-conference-of-the-2
Repo
Framework

An incremental model of syntactic bootstrapping

Title An incremental model of syntactic bootstrapping
Authors Christos Christodoulopoulos, Dan Roth, Cynthia Fisher
Abstract
Tasks Language Acquisition, Semantic Role Labeling
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1906/
PDF https://www.aclweb.org/anthology/W16-1906
PWC https://paperswithcode.com/paper/an-incremental-model-of-syntactic
Repo
Framework

Domain Adaptation for Named Entity Recognition Using CRFs

Title Domain Adaptation for Named Entity Recognition Using CRFs
Authors Tian Tian, Marco Dinarelli, Isabelle Tellier, Pedro Dias Cardoso
Abstract In this paper we explain how we created a labelled corpus in English for a Named Entity Recognition (NER) task from multi-source and multi-domain data, for an industrial partner. We explain the specificities of this corpus with examples and describe some baseline experiments. We present some results of domain adaptation on this corpus using a labelled Twitter corpus (Ritter et al., 2011). We tested a semi-supervised method from (Garcia-Fernandez et al., 2014) combined with a supervised domain adaptation approach proposed in (Raymond and Fayolle, 2010) for machine learning experiments with CRFs (Conditional Random Fields). We use the same technique to improve the NER results on the Twitter corpus (Ritter et al., 2011). Our contributions thus consist in an industrial corpus creation and NER performance improvements.
Tasks Domain Adaptation, Named Entity Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1089/
PDF https://www.aclweb.org/anthology/L16-1089
PWC https://paperswithcode.com/paper/domain-adaptation-for-named-entity-1
Repo
Framework

Identifying Cross-Cultural Differences in Word Usage

Title Identifying Cross-Cultural Differences in Word Usage
Authors Aparna Garimella, Rada Mihalcea, James Pennebaker
Abstract Personal writings have inspired researchers in the fields of linguistics and psychology to study the relationship between language and culture to better understand the psychology of people across different cultures. In this paper, we explore this relation by developing cross-cultural word models to identify words with cultural bias – i.e., words that are used in significantly different ways by speakers from different cultures. Focusing specifically on two cultures: United States and Australia, we identify a set of words with significant usage differences, and further investigate these words through feature analysis and topic modeling, shedding light on the attributes of language that contribute to these differences.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1065/
PDF https://www.aclweb.org/anthology/C16-1065
PWC https://paperswithcode.com/paper/identifying-cross-cultural-differences-in
Repo
Framework