May 5, 2019

Paper Group NANR 87

Subtask Mining from Search Query Logs for How-Knowledge Acceleration


Title	Subtask Mining from Search Query Logs for How-Knowledge Acceleration
Authors	Chung-Lun Kuo, Hsin-Hsi Chen
Abstract	How-knowledge is indispensable in daily life, but has relatively less quantity and poorer quality than what-knowledge in publicly available knowledge bases. This paper first extracts task-subtask pairs from wikiHow, then mines linguistic patterns from search query logs, and finally applies the mined patterns to extract subtasks to complete given how-to tasks. To evaluate the proposed methodology, we group tasks and the corresponding recommended subtasks into pairs, and evaluate the results automatically and manually. The automatic evaluation shows the accuracy of 0.4494. We also classify the mined patterns based on prepositions and find that the prepositions like "on", "to", and "with" have the better performance. The results can be used to accelerate how-knowledge base construction.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1198/
PDF	https://www.aclweb.org/anthology/L16-1198
PWC	https://paperswithcode.com/paper/subtask-mining-from-search-query-logs-for-how
Repo
Framework

Online Information Retrieval for Language Learning


Title	Online Information Retrieval for Language Learning
Authors	Maria Chinkina, Madeeswaran Kannan, Detmar Meurers
Abstract
Tasks	Information Retrieval, Language Acquisition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-4002/
PDF	https://www.aclweb.org/anthology/P16-4002
PWC	https://paperswithcode.com/paper/online-information-retrieval-for-language
Repo
Framework

Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists


Title	Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists
Authors	Estíbaliz Iglesias-Franjo, Jesús Vilares
Abstract
Tasks	Information Retrieval
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2103/
PDF	https://www.aclweb.org/anthology/W16-2103
PWC	https://paperswithcode.com/paper/searching-four-millenia-old-digitized
Repo
Framework

A Multilinear Approach to the Unsupervised Learning of Morphology


Title	A Multilinear Approach to the Unsupervised Learning of Morphology
Authors	Anthony Meyer, Markus Dickinson
Abstract
Tasks	Transliteration
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2020/
PDF	https://www.aclweb.org/anthology/W16-2020
PWC	https://paperswithcode.com/paper/a-multilinear-approach-to-the-unsupervised
Repo
Framework

Purely sequence-trained neural networks for ASR based on lattice-free MMI


Title	Purely sequence-trained neural networks for ASR based on lattice-free MMI
Authors	Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahrmani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur
Abstract	In this paper we describe a method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training. We use the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI. To make its computation feasible we use a phone n-gram language model in place of the word language model. To further reduce its space and time complexity we compute the objective function using neural network outputs at one third the standard frame rate. These changes enable us to perform the computation for the forward-backward algorithm on GPUs. Further, the reduced output frame rate also provides a significant speed-up during decoding. We present results on 5 different LVCSR tasks with training data ranging from 100 to 2100 hours. Models trained with LF-MMI provide a relative word error rate reduction of ∼11.5% over those trained with the cross-entropy objective function, and ∼8% over those trained with the cross-entropy and sMBR objective functions. A further relative reduction of ∼2.5% can be obtained by fine-tuning these models with the word-lattice-based sMBR objective function.
Tasks	Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition
Published	2016-09-08
URL	https://www.danielpovey.com/files/2016_interspeech_mmi.pdf
PDF	https://www.danielpovey.com/files/2016_interspeech_mmi.pdf
PWC	https://paperswithcode.com/paper/purely-sequence-trained-neural-networks-for
Repo
Framework

Research on attention memory networks as a model for learning natural language inference


Title	Research on attention memory networks as a model for learning natural language inference
Authors	Zhuang Liu, Degen Huang, Jing Zhang, Kaiyu Huang
Abstract
Tasks	Natural Language Inference, Question Answering, Sentence Pair Modeling, Structured Prediction, Text Summarization
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-5902/
PDF	https://www.aclweb.org/anthology/W16-5902
PWC	https://paperswithcode.com/paper/research-on-attention-memory-networks-as-a
Repo
Framework

new/s/leak – Information Extraction and Visualization for Investigative Data Journalists


Title	new/s/leak – Information Extraction and Visualization for Investigative Data Journalists
Authors	Seid Muhie Yimam, Heiner Ulrich, Tatiana von Landesberger, Marcel Rosenbach, Michaela Regneri, Alexander Panchenko, Franziska Lehmann, Uli Fahrer, Chris Biemann, Kathrin Ballweg
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-4028/
PDF	https://www.aclweb.org/anthology/P16-4028
PWC	https://paperswithcode.com/paper/newsleak-a-information-extraction-and
Repo
Framework

Privacy Issues in Online Machine Translation Services - European Perspective


Title	Privacy Issues in Online Machine Translation Services - European Perspective
Authors	Pawel Kamocki, Jim O'Regan
Abstract	In order to develop its full potential, global communication needs linguistic support systems such as Machine Translation (MT). In the past decade, free online MT tools have become available to the general public, and the quality of their output is increasing. However, the use of such tools may entail various legal implications, especially as far as processing of personal data is concerned. This is even more evident if we take into account that their business model is largely based on providing translation in exchange for data, which can subsequently be used to improve the translation model, but also for commercial purposes. The purpose of this paper is to examine how free online MT tools fit in the European data protection framework, harmonised by the EU Data Protection Directive. The perspectives of both the user and the MT service provider are taken into account.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1706/
PDF	https://www.aclweb.org/anthology/L16-1706
PWC	https://paperswithcode.com/paper/privacy-issues-in-online-machine-translation
Repo
Framework

Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks


Title	Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks
Authors	Ines Rehbein, Merel Scholman, Vera Demberg
Abstract	In discourse relation annotation, there is currently a variety of different frameworks being used, and most of them have been developed and employed mostly on written data. This raises a number of questions regarding interoperability of discourse relation annotation schemes, as well as regarding differences in discourse annotation for written vs. spoken domains. In this paper, we describe our experiments on annotating two spoken domains from the SPICE Ireland corpus (telephone conversations and broadcast interviews) according to two different discourse annotation schemes, PDTB 3.0 and CCR. We show that annotations in the two schemes can largely be mapped onto one another, and discuss differences in operationalisations of discourse relation schemes which present a challenge to automatic mapping. We also observe systematic differences in the prevalence of implicit discourse relations in spoken data compared to written texts, and find that there are also differences in the types of causal relations between the domains. Finally, we find that PDTB 3.0 addresses many shortcomings of PDTB 2.0 wrt. the annotation of spoken discourse, and suggest further extensions. The new corpus has roughly the size of the CoNLL 2015 Shared Task test set, and we hence hope that it will be a valuable resource for the evaluation of automatic discourse relation labellers.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1165/
PDF	https://www.aclweb.org/anthology/L16-1165
PWC	https://paperswithcode.com/paper/annotating-discourse-relations-in-spoken
Repo
Framework

Linguistica 5: Unsupervised Learning of Linguistic Structure


Title	Linguistica 5: Unsupervised Learning of Linguistic Structure
Authors	Jackson Lee, John Goldsmith
Abstract
Tasks	Language Acquisition
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-3005/
PDF	https://www.aclweb.org/anthology/N16-3005
PWC	https://paperswithcode.com/paper/linguistica-5-unsupervised-learning-of
Repo
Framework

Instant Feedback for Increasing the Presence of Solutions in Peer Reviews


Title	Instant Feedback for Increasing the Presence of Solutions in Peer Reviews
Authors	Huy Nguyen, Wenting Xiong, Diane Litman
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-3002/
PDF	https://www.aclweb.org/anthology/N16-3002
PWC	https://paperswithcode.com/paper/instant-feedback-for-increasing-the-presence
Repo
Framework

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations


Title	Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
Authors
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-3000/
PDF	https://www.aclweb.org/anthology/N16-3000
PWC	https://paperswithcode.com/paper/proceedings-of-the-2016-conference-of-the-2
Repo
Framework

An incremental model of syntactic bootstrapping


Title	An incremental model of syntactic bootstrapping
Authors	Christos Christodoulopoulos, Dan Roth, Cynthia Fisher
Abstract
Tasks	Language Acquisition, Semantic Role Labeling
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1906/
PDF	https://www.aclweb.org/anthology/W16-1906
PWC	https://paperswithcode.com/paper/an-incremental-model-of-syntactic
Repo
Framework

Domain Adaptation for Named Entity Recognition Using CRFs


Title	Domain Adaptation for Named Entity Recognition Using CRFs
Authors	Tian Tian, Marco Dinarelli, Isabelle Tellier, Pedro Dias Cardoso
Abstract	In this paper we explain how we created a labelled corpus in English for a Named Entity Recognition (NER) task from multi-source and multi-domain data, for an industrial partner. We explain the specificities of this corpus with examples and describe some baseline experiments. We present some results of domain adaptation on this corpus using a labelled Twitter corpus (Ritter et al., 2011). We tested a semi-supervised method from (Garcia-Fernandez et al., 2014) combined with a supervised domain adaptation approach proposed in (Raymond and Fayolle, 2010) for machine learning experiments with CRFs (Conditional Random Fields). We use the same technique to improve the NER results on the Twitter corpus (Ritter et al., 2011). Our contributions thus consist in an industrial corpus creation and NER performance improvements.
Tasks	Domain Adaptation, Named Entity Recognition
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1089/
PDF	https://www.aclweb.org/anthology/L16-1089
PWC	https://paperswithcode.com/paper/domain-adaptation-for-named-entity-1
Repo
Framework

Identifying Cross-Cultural Differences in Word Usage


Title	Identifying Cross-Cultural Differences in Word Usage
Authors	Aparna Garimella, Rada Mihalcea, James Pennebaker
Abstract	Personal writings have inspired researchers in the fields of linguistics and psychology to study the relationship between language and culture to better understand the psychology of people across different cultures. In this paper, we explore this relation by developing cross-cultural word models to identify words with cultural bias – i.e., words that are used in significantly different ways by speakers from different cultures. Focusing specifically on two cultures: United States and Australia, we identify a set of words with significant usage differences, and further investigate these words through feature analysis and topic modeling, shedding light on the attributes of language that contribute to these differences.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1065/
PDF	https://www.aclweb.org/anthology/C16-1065
PWC	https://paperswithcode.com/paper/identifying-cross-cultural-differences-in
Repo
Framework