July 26, 2019

2023 words 10 mins read

Paper Group NANR 58


chrF++: words helping character n-grams. Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems. Risk Bounds for Transferring Representations With and Without Fine-Tuning. Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts. C …

chrF++: words helping character n-grams

Title chrF++: words helping character n-grams
Authors Maja Popović
Abstract
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4770/
PDF https://www.aclweb.org/anthology/W17-4770
PWC https://paperswithcode.com/paper/chrf-words-helping-character-n-grams
Repo
Framework

Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems

Title Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems
Authors Louisa Pragst, Koichiro Yoshino, Wolfgang Minker, Satoshi Nakamura, Stefan Ultes
Abstract In a dialogue system, the dialogue manager selects one of several system actions and thereby determines the system's behaviour. Defining all possible system actions in a dialogue system by hand is tedious work. While efforts have been made to automatically generate such system actions, those approaches are mostly focused on providing functional system behaviour. Adapting the system behaviour to the user becomes a difficult task due to the limited number of system actions available. We aim to increase the adaptability of a dialogue system by automatically generating variants of system actions. In this work, we introduce an approach to automatically generate action variants for elaborateness and indirectness. Our proposed algorithm extracts RDF triplets from a knowledge base and rates their relevance to the original system action to find suitable content. We show that the results of our algorithm are mostly perceived similarly to human-generated elaborateness and indirectness and can be used to adapt a conversation to the current user and situation. We also discuss where the results of our algorithm are still lacking and how this could be improved: taking into account the conversation topic as well as the culture of the user is likely to have a beneficial effect on the user's perception.
Tasks Spoken Dialogue Systems
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1092/
PDF https://www.aclweb.org/anthology/I17-1092
PWC https://paperswithcode.com/paper/acquisition-and-assessment-of-semantic
Repo
Framework
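To make the content-acquisition step above concrete, here is a minimal sketch of pulling candidate RDF triplets from a toy knowledge base and rating their relevance to a system action. The knowledge base contents, the token-overlap relevance score and the threshold are invented stand-ins, not the paper's actual data or rating method.

```python
# Hypothetical (subject, predicate, object) triplets; contents are invented.
KNOWLEDGE_BASE = [
    ("cinema_astoria", "shows", "science_fiction_films"),
    ("cinema_astoria", "located_in", "city_centre"),
    ("city_centre", "reachable_by", "tram_line_1"),
]

def relevance(triplet, system_action_tokens):
    """Fraction of triplet tokens that also occur in the system action (toy scorer)."""
    triplet_tokens = set()
    for element in triplet:
        triplet_tokens.update(element.lower().split("_"))
    overlap = triplet_tokens & system_action_tokens
    return len(overlap) / len(triplet_tokens)

def candidate_content(system_action, threshold=0.3):
    """Return triplets rated relevant enough to elaborate on the system action."""
    tokens = set(system_action.lower().split())
    scored = [(relevance(t, tokens), t) for t in KNOWLEDGE_BASE]
    return [t for score, t in sorted(scored, reverse=True) if score >= threshold]

print(candidate_content("The cinema Astoria shows science fiction films tonight."))
```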

Risk Bounds for Transferring Representations With and Without Fine-Tuning

Title Risk Bounds for Transferring Representations With and Without Fine-Tuning
Authors Daniel McNamara, Maria-Florina Balcan
Abstract A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments.
Tasks Word Embeddings
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=869
PDF http://proceedings.mlr.press/v70/mcnamara17a/mcnamara17a.pdf
PWC https://paperswithcode.com/paper/risk-bounds-for-transferring-representations
Repo
Framework
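For reference, these are standard textbook forms of the two bound families the abstract refers to, a VC-dimension bound for a fixed (transferred) representation and a McAllester-style PAC-Bayes bound for the fine-tuned case. Constants vary across sources, and these are illustrative statements, not the paper's exact theorems.

```latex
% VC-style bound: with probability at least 1 - \delta over an i.i.d. sample of
% size m, for every hypothesis h in a class of VC dimension d,
\[
  R(h) \;\le\; \hat{R}(h) \;+\; \sqrt{\frac{8}{m}\left(d \ln\frac{2em}{d} + \ln\frac{4}{\delta}\right)}.
\]

% McAllester-style PAC-Bayes bound: with probability at least 1 - \delta, for every
% posterior Q over hypotheses and a prior P fixed before seeing the data,
\[
  R(Q) \;\le\; \hat{R}(Q) \;+\; \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{m}}{\delta}}{2(m-1)}}.
\]
```

In the transfer setting the prior P would be built from the source-task representation, so the KL term rewards target solutions that stay close to what was transferred.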

Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts

Title Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts
Authors Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya
Abstract The differences in the frequencies of some parts of speech (POS), particularly function words, and in lexical diversity between male and female speech have been pointed out in a number of papers. Classifiers using exclusively context-independent parameters have proved to be highly effective. However, there are still issues to be addressed, as a lot of studies are performed only for English and the genre and topic of the texts are sometimes neglected. The aim of this paper is to investigate the association between context-independent parameters of Russian written texts and the gender of their authors and to design predictive regression models. A number of correlations were found. The obtained data are in good agreement with the results obtained for other languages. A model based on the five parameters with the highest correlation coefficients was designed.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4909/
PDF https://www.aclweb.org/anthology/W17-4909
PWC https://paperswithcode.com/paper/differences-in-type-token-ratio-and-part-of
Repo
Framework
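As a toy illustration of the two kinds of context-independent features the study relies on, the snippet below computes type-token ratio and relative POS frequencies from a pre-tagged text. The example tokens and tags are invented; a real setup would use an actual Russian corpus and tagger.

```python
from collections import Counter

# Input is a pre-tagged text: a list of (token, POS) pairs (tags invented here).
tagged_text = [
    ("она", "PRON"), ("быстро", "ADV"), ("читает", "VERB"),
    ("интересную", "ADJ"), ("книгу", "NOUN"), ("и", "CONJ"),
    ("книгу", "NOUN"), ("любит", "VERB"),
]

tokens = [token.lower() for token, _ in tagged_text]
type_token_ratio = len(set(tokens)) / len(tokens)     # distinct types / running tokens

pos_counts = Counter(pos for _, pos in tagged_text)
pos_frequencies = {pos: count / len(tagged_text) for pos, count in pos_counts.items()}

print(f"TTR = {type_token_ratio:.2f}")   # 7 types / 8 tokens = 0.88
print(pos_frequencies)                    # e.g. NOUN: 0.25, VERB: 0.25, ...
```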

Cross-Lingual Word Embeddings for Low-Resource Language Modeling

Title Cross-Lingual Word Embeddings for Low-Resource Language Modeling
Authors Oliver Adams, Adam Makarucha, Graham Neubig, Steven Bird, Trevor Cohn
Abstract Most languages have no established writing system and minimal written records. However, textual data is essential for natural language processing, and particularly important for training language models to support speech recognition. Even in cases where text data is missing, there are some languages for which bilingual lexicons are available, since creating lexicons is a fundamental task of documentary linguistics. We investigate the use of such lexicons to improve language models when textual training data is limited to as few as a thousand sentences. The method involves learning cross-lingual word embeddings as a preliminary step in training monolingual language models. Results across a number of languages show that language models are improved by this pre-training. Application to Yongning Na, a threatened language, highlights challenges in deploying the approach in real low-resource environments.
Tasks Language Modelling, Speech Recognition, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1088/
PDF https://www.aclweb.org/anthology/E17-1088
PWC https://paperswithcode.com/paper/cross-lingual-word-embeddings-for-low
Repo
Framework
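A minimal sketch of the pre-training idea described above: initialise the embedding layer of a monolingual language model with cross-lingual word vectors learned beforehand, then train as usual. The vocabulary size, dimensions and the `pretrained_vectors` tensor are placeholders (a real run would load vectors induced from a bilingual lexicon).

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 5000, 100, 256
pretrained_vectors = torch.randn(vocab_size, embed_dim)  # stand-in for real cross-lingual vectors

class LSTMLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        # freeze=False: the embeddings are only an initialisation and keep training
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.output = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        hidden_states, _ = self.lstm(self.embedding(token_ids))
        return self.output(hidden_states)  # next-token logits at every position

model = LSTMLanguageModel()
batch = torch.randint(0, vocab_size, (4, 20))          # 4 sentences of 20 token ids
loss = nn.CrossEntropyLoss()(model(batch)[:, :-1].reshape(-1, vocab_size),
                             batch[:, 1:].reshape(-1))
```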

A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST

Title A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST
Authors Amir Zeldes
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3603/
PDF https://www.aclweb.org/anthology/W17-3603
PWC https://paperswithcode.com/paper/a-distributional-view-of-discourse
Repo
Framework

Towards Quantum Language Models

Title Towards Quantum Language Models
Authors Ivano Basile, Fabio Tamburini
Abstract This paper presents a new approach for building language models using Quantum Probability Theory, a Quantum Language Model (QLM). It mainly shows that, relying on this probability calculus, it is possible to build stochastic models able to benefit from quantum correlations due to interference and entanglement. We extensively tested our approach, showing its superior performance, both in terms of model perplexity and when inserted into an automatic speech recognition evaluation setting, compared with state-of-the-art language modelling techniques.
Tasks Decision Making, Information Retrieval, Language Modelling, Machine Translation, Part-Of-Speech Tagging, Speech Recognition
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1196/
PDF https://www.aclweb.org/anthology/D17-1196
PWC https://paperswithcode.com/paper/towards-quantum-language-models
Repo
Framework
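The following snippet illustrates only the quantum probability calculus the abstract builds on (the Born/trace rule for a density matrix), not the paper's actual language model; the two-dimensional state and the mixture weights are made up.

```python
import numpy as np

def pure_state(vector):
    """Rank-1 projector |v><v| for a (normalised) state vector."""
    v = np.asarray(vector, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

# A mixed state: 70% of the mass on one "sense", 30% on another, non-orthogonal one.
rho = 0.7 * pure_state([1, 0]) + 0.3 * pure_state([1, 1])

# Probability of the event spanned by the first basis vector: trace(rho @ P).
projector = pure_state([1, 0])
probability = np.trace(rho @ projector)
print(round(float(probability), 3))  # 0.85 = 0.7 * 1.0 + 0.3 * 0.5
```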

Is writing style predictive of scientific fraud?

Title Is writing style predictive of scientific fraud?
Authors Chloé Braud, Anders Søgaard
Abstract The problem of detecting scientific fraud using machine learning was recently introduced, with initial, positive results from a model taking into account various general indicators. The results seem to suggest that writing style is predictive of scientific fraud. We revisit these initial experiments and show that the leave-one-out testing procedure they used likely leads to a slight over-estimate of the predictability, but also that simple models can outperform their proposed model by some margin. We go on to explore more abstract linguistic features, such as linguistic complexity and discourse structure, only to obtain negative results. Upon analyzing our models, we do see some interesting patterns, though: scientific fraud, for example, contains less comparison, as well as different types of hedging and ways of presenting logical reasoning.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4905/
PDF https://www.aclweb.org/anthology/W17-4905
PWC https://paperswithcode.com/paper/is-writing-style-predictive-of-scientific
Repo
Framework
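To make the evaluation-protocol point concrete, here is a small sketch comparing accuracy estimates from leave-one-out testing and from stratified k-fold cross-validation on the same data. The features and labels are random placeholders, so the printed numbers mean nothing by themselves; the point is only the mechanics of the two protocols.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))      # 60 documents, 20 stylometric features (synthetic)
y = rng.integers(0, 2, size=60)    # fraudulent vs. non-fraudulent labels (synthetic)

clf = LogisticRegression(max_iter=1000)
loo_acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
kfold_acc = cross_val_score(clf, X, y, cv=StratifiedKFold(n_splits=5)).mean()
print(f"leave-one-out: {loo_acc:.3f}  5-fold: {kfold_acc:.3f}")
```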

Tackling Biomedical Text Summarization: OAQA at BioASQ 5B

Title Tackling Biomedical Text Summarization: OAQA at BioASQ 5B
Authors Khyathi Chandu, Aakanksha Naik, Aditya Chandrasekar, Zi Yang, Niloy Gupta, Eric Nyberg
Abstract In this paper, we describe our participation in phase B of task 5b of the fifth edition of the annual BioASQ challenge, which includes answering factoid, list, yes-no and summary questions from biomedical data. We describe our techniques with an emphasis on ideal answer generation, where the goal is to produce a relevant, precise, non-redundant, query-oriented summary from multiple relevant documents. We make use of extractive summarization techniques to address this task and experiment with different biomedical ontologies and various algorithms including agglomerative clustering, Maximal Marginal Relevance (MMR) and sentence compression. We propose a novel word-embedding-based tf-idf similarity metric and a soft positional constraint which improve our system performance. We evaluate our techniques on test batch 4 from the fourth edition of the challenge. Our best system achieves a ROUGE-2 score of 0.6534 and a ROUGE-SU4 score of 0.6536.
Tasks Question Answering, Sentence Compression, Text Summarization
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2307/
PDF https://www.aclweb.org/anthology/W17-2307
PWC https://paperswithcode.com/paper/tackling-biomedical-text-summarization-oaqa
Repo
Framework
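Below is an illustrative Maximal Marginal Relevance (MMR) sentence selector of the kind the abstract mentions. Plain tf-idf cosine similarity stands in for the paper's word-embedding-based tf-idf metric, and the example question, sentences, `k` and `lam` values are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mmr_summary(query, sentences, k=2, lam=0.7):
    """Greedy MMR: balance relevance to the query against redundancy with picks so far."""
    vec = TfidfVectorizer().fit([query] + sentences)
    q, S = vec.transform([query]), vec.transform(sentences)
    rel = cosine_similarity(S, q).ravel()      # relevance to the question
    sim = cosine_similarity(S)                 # sentence-sentence redundancy
    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_score = None, float("-inf")
        for i in range(len(sentences)):
            if i in selected:
                continue
            redundancy = max(sim[i][j] for j in selected) if selected else 0.0
            score = lam * rel[i] - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return [sentences[i] for i in selected]

print(mmr_summary("What causes malaria?",
                  ["Malaria is caused by Plasmodium parasites.",
                   "Plasmodium parasites cause malaria in humans.",
                   "Mosquito nets reduce transmission."]))
```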

Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets

Title Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets
Authors Pranav Goel, Devang Kulshreshtha, Prayas Jain, Kaushal Kumar Shukla
Abstract The paper describes the best-performing system for EmoInt - a shared task to predict the intensity of emotions in tweets. Intensity is a real-valued score between 0 and 1. The emotions are classified as anger, fear, joy and sadness. We apply three different deep neural network based models, which approach the problem from essentially different directions. Our final performance, quantified by an average Pearson correlation score of 74.7 and an average Spearman correlation score of 73.5, is obtained using an ensemble of the three models. We outperform the baseline model of the shared task by 9.9% and 9.4% in Pearson and Spearman correlation scores respectively.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5207/
PDF https://www.aclweb.org/anthology/W17-5207
PWC https://paperswithcode.com/paper/prayas-at-emoint-2017-an-ensemble-of-deep
Repo
Framework
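A toy version of the ensembling and scoring step described above: average the intensity predictions of three models and evaluate with Pearson and Spearman correlation against gold intensities. All numbers and the model labels in the comments are made up.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

gold = np.array([0.10, 0.45, 0.80, 0.30, 0.95])     # gold emotion intensities (invented)
model_predictions = np.array([
    [0.15, 0.40, 0.75, 0.35, 0.90],   # model 1 (e.g. an LSTM-based system)
    [0.05, 0.50, 0.85, 0.25, 0.85],   # model 2 (e.g. a CNN-based system)
    [0.20, 0.42, 0.70, 0.40, 0.99],   # model 3 (e.g. a feed-forward system)
])

ensemble = model_predictions.mean(axis=0)            # simple averaging ensemble
print("Pearson:", round(pearsonr(gold, ensemble)[0], 3))
print("Spearman:", round(spearmanr(gold, ensemble)[0], 3))
```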

Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study

Title Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study
Authors Nan Wang
Abstract
Tasks
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-3023/
PDF https://www.aclweb.org/anthology/P17-3023
PWC https://paperswithcode.com/paper/negotiation-of-antibiotic-treatment-in
Repo
Framework

Active Sentiment Domain Adaptation

Title Active Sentiment Domain Adaptation
Authors Fangzhao Wu, Yongfeng Huang, Jun Yan
Abstract Domain adaptation is an important technique for handling the domain dependence problem in sentiment analysis. Existing methods usually rely on sentiment classifiers trained in source domains. However, their performance may decline heavily if the distributions of sentiment features in the source and target domains differ significantly. In this paper, we propose an active sentiment domain adaptation approach to handle this problem. Instead of source domain sentiment classifiers, our approach adapts general-purpose sentiment lexicons to the target domain with the help of a small number of labeled samples, which are selected and annotated in an active learning mode, as well as the domain-specific sentiment similarities among words mined from unlabeled samples of the target domain. A unified model is proposed to fuse the different types of sentiment information and train a sentiment classifier for the target domain. Extensive experiments on benchmark datasets show that our approach can train an accurate sentiment classifier with fewer labeled samples.
Tasks Active Learning, Domain Adaptation, Sentiment Analysis, Transfer Learning
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1156/
PDF https://www.aclweb.org/anthology/P17-1156
PWC https://paperswithcode.com/paper/active-sentiment-domain-adaptation
Repo
Framework
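A minimal sketch of the active-learning component described in the abstract, using plain uncertainty sampling as a stand-in for the paper's actual selection strategy: ask for labels on the target-domain samples the current classifier is least sure about. The features, labels and batch size are random placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_labeled = rng.normal(size=(20, 50))      # small seed of labeled target-domain samples
y_labeled = rng.integers(0, 2, size=20)    # positive / negative sentiment labels
X_unlabeled = rng.normal(size=(200, 50))   # pool of unlabeled target-domain samples

clf = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
positive_prob = clf.predict_proba(X_unlabeled)[:, 1]
uncertainty = -np.abs(positive_prob - 0.5)     # highest when the probability is near 0.5
query_indices = np.argsort(uncertainty)[-5:]   # 5 samples to send for annotation next
print(query_indices)
```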

Improving End-to-End Memory Networks with Unified Weight Tying

Title Improving End-to-End Memory Networks with Unified Weight Tying
Authors Fei Liu, Trevor Cohn, Timothy Baldwin
Abstract
Tasks Image Classification, Speech Recognition
Published 2017-12-01
URL https://www.aclweb.org/anthology/U17-1002/
PDF https://www.aclweb.org/anthology/U17-1002
PWC https://paperswithcode.com/paper/improving-end-to-end-memory-networks-with
Repo
Framework

Proceedings of the 15th International Conference on Parsing Technologies

Title Proceedings of the 15th International Conference on Parsing Technologies
Authors
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6300/
PDF https://www.aclweb.org/anthology/W17-6300
PWC https://paperswithcode.com/paper/proceedings-of-the-15th-international-1
Repo
Framework

Speech segmentation with a neural encoder model of working memory

Title Speech segmentation with a neural encoder model of working memory
Authors Micha Elsner, Cory Shain
Abstract We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input. Cognitive biases toward phonological and syntactic predictability in speech are rooted in the limitations of human memory (Baddeley et al., 1998); compressed representations are easier to acquire and retain in memory. To model the biases introduced by these memory limitations, our system uses an LSTM-based encoder-decoder with a small number of hidden units, then searches for a segmentation that minimizes autoencoding loss. Linguistically meaningful segments (e.g. words) should share regular patterns of features that facilitate decoder performance in comparison to random segmentations, and we show that our learner discovers these patterns when trained on either phoneme sequences or raw acoustics. To our knowledge, ours is the first fully unsupervised system to be able to segment both symbolic and acoustic representations of speech.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1112/
PDF https://www.aclweb.org/anthology/D17-1112
PWC https://paperswithcode.com/paper/speech-segmentation-with-a-neural-encoder
Repo
Framework
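The sketch below shows only the scoring idea from the abstract: a deliberately small LSTM autoencoder reconstructs each candidate segment, and a segmentation is scored by its total reconstruction loss (lower is better). The phoneme inventory, layer sizes and example boundaries are placeholders, and the search over segmentations that the real system performs is omitted.

```python
import torch
import torch.nn as nn

n_phonemes, embed_dim, hidden_dim = 40, 16, 8   # small hidden size = limited "memory"

class SegmentAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(n_phonemes, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_phonemes)

    def reconstruction_loss(self, segment_ids):
        x = self.embed(segment_ids.unsqueeze(0))           # (1, len, embed_dim)
        _, (h, _) = self.encoder(x)                         # compress segment to one vector
        repeated = h.transpose(0, 1).repeat(1, segment_ids.numel(), 1)
        decoded, _ = self.decoder(repeated)                 # reconstruct from the code
        logits = self.out(decoded).squeeze(0)
        return nn.functional.cross_entropy(logits, segment_ids)

def segmentation_score(model, utterance_ids, boundaries):
    """Total autoencoding loss for a candidate segmentation (list of cut points)."""
    cuts = [0] + boundaries + [len(utterance_ids)]
    segments = [utterance_ids[a:b] for a, b in zip(cuts, cuts[1:])]
    return sum(model.reconstruction_loss(seg) for seg in segments)

model = SegmentAutoencoder()
utterance = torch.randint(0, n_phonemes, (12,))             # 12 phoneme ids (synthetic)
print(segmentation_score(model, utterance, boundaries=[4, 8]))
```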