Paper Group NANR 58
chrF++: words helping character n-grams. Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems. Risk Bounds for Transferring Representations With and Without Fine-Tuning. Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts. C …
chrF++: words helping character n-grams
Title | chrF++: words helping character n-grams |
Authors | Maja Popović |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4770/ |
PWC | https://paperswithcode.com/paper/chrf-words-helping-character-n-grams |
Repo | |
Framework | |
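The chrF family scores MT output with character n-gram F-scores; chrF++ additionally averages in word unigram and bigram F-scores. A minimal sketch of the character n-gram part (function names are mine, not the official implementation, and the word n-gram component is omitted):

```python
from collections import Counter

def char_ngrams(text, n):
    """Character n-grams of a string, spaces removed (as in chrF)."""
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf_score(hypothesis, reference, max_n=6, beta=2.0):
    """Average character n-gram F-beta score; beta=2 weights recall higher."""
    f_scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string shorter than n
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            f_scores.append(0.0)
        else:
            f_scores.append((1 + beta ** 2) * prec * rec / (beta ** 2 * prec + rec))
    return sum(f_scores) / len(f_scores) if f_scores else 0.0
```

Identical strings score 1.0; disjoint strings score 0.0. The released metric applies the same F-beta formula to word unigrams and bigrams and averages those in as well.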
Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems
Title | Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems |
Authors | Louisa Pragst, Koichiro Yoshino, Wolfgang Minker, Satoshi Nakamura, Stefan Ultes |
Abstract | In a dialogue system, the dialogue manager selects one of several system actions and thereby determines the system's behaviour. Defining all possible system actions in a dialogue system by hand is tedious work. While efforts have been made to automatically generate such system actions, those approaches are mostly focused on providing functional system behaviour. Adapting the system behaviour to the user becomes a difficult task due to the limited number of system actions available. We aim to increase the adaptability of a dialogue system by automatically generating variants of system actions. In this work, we introduce an approach to automatically generate action variants for elaborateness and indirectness. Our proposed algorithm extracts RDF triplets from a knowledge base and rates their relevance to the original system action to find suitable content. We show that the results of our algorithm are mostly perceived similarly to human-generated elaborateness and indirectness and can be used to adapt a conversation to the current user and situation. We also discuss where the results of our algorithm are still lacking and how this could be improved: taking into account the conversation topic as well as the culture of the user is likely to have a beneficial effect on the user's perception. |
Tasks | Spoken Dialogue Systems |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1092/ |
PWC | https://paperswithcode.com/paper/acquisition-and-assessment-of-semantic |
Repo | |
Framework | |
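The pipeline in the abstract — extract RDF triples, then rate their relevance to the original system action — can be caricatured with token-overlap scoring. This is a stand-in of my own, not the paper's rating method:

```python
def triple_relevance(triple, action_text):
    """Score a (subject, predicate, object) triple by the fraction of its
    tokens that also appear in the original system action text."""
    action_tokens = set(action_text.lower().split())
    triple_tokens = set(" ".join(triple).lower().replace("_", " ").split())
    if not triple_tokens:
        return 0.0
    return len(action_tokens & triple_tokens) / len(triple_tokens)
```

A triple scoring highly against the action would be a candidate for an elaborateness variant (added detail) or an indirectness variant (related content replacing a direct statement).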
Risk Bounds for Transferring Representations With and Without Fine-Tuning
Title | Risk Bounds for Transferring Representations With and Without Fine-Tuning |
Authors | Daniel McNamara, Maria-Florina Balcan |
Abstract | A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments. |
Tasks | Word Embeddings |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=869 |
PDF | http://proceedings.mlr.press/v70/mcnamara17a/mcnamara17a.pdf |
PWC | https://paperswithcode.com/paper/risk-bounds-for-transferring-representations |
Repo | |
Framework | |
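The PAC-Bayes setting the abstract refers to can be illustrated with a bound of this flavor (a generic McAllester-style statement, not the paper's exact theorem): with probability at least 1 − δ over an i.i.d. sample of size m, simultaneously for all posteriors ρ,

```latex
\mathbb{E}_{h\sim\rho}\big[R(h)\big]
  \;\le\;
\mathbb{E}_{h\sim\rho}\big[\hat{R}(h)\big]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
```

Here π is the prior constructed from the source-task representation and ρ the posterior after fine-tuning on target data. The relevance to transfer is that a source-derived prior already close to a good target posterior makes the KL term, and hence the risk bound, small.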
Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts
Title | Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts |
Authors | Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya |
Abstract | The differences in the frequencies of some parts of speech (POS), particularly function words, and in lexical diversity between male and female speech have been pointed out in a number of papers. Classifiers using exclusively context-independent parameters have proved to be highly effective. However, there are still issues to be addressed, as many studies are performed for English only and the genre and topic of texts are sometimes neglected. The aim of this paper is to investigate the association between context-independent parameters of Russian written texts and the gender of their authors and to design predictive regression models. A number of correlations were found. The obtained data are in good agreement with the results obtained for other languages. A model based on the 5 parameters with the highest correlation coefficients was designed. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4909/ |
PWC | https://paperswithcode.com/paper/differences-in-type-token-ratio-and-part-of |
Repo | |
Framework | |
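Type-token ratio, the lexical-diversity measure named in the title, is the number of distinct word types divided by the number of word tokens. A one-function sketch (tokenizing on `\w+` is my simplification; the paper's preprocessing for Russian may differ):

```python
import re

def type_token_ratio(text):
    """TTR = distinct word types / total word tokens."""
    tokens = re.findall(r"\w+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0
```

Note that raw TTR falls as texts get longer, so gender comparisons of the kind studied here are only meaningful on texts of comparable length.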
Cross-Lingual Word Embeddings for Low-Resource Language Modeling
Title | Cross-Lingual Word Embeddings for Low-Resource Language Modeling |
Authors | Oliver Adams, Adam Makarucha, Graham Neubig, Steven Bird, Trevor Cohn |
Abstract | Most languages have no established writing system and minimal written records. However, textual data is essential for natural language processing, and particularly important for training language models to support speech recognition. Even in cases where text data is missing, there are some languages for which bilingual lexicons are available, since creating lexicons is a fundamental task of documentary linguistics. We investigate the use of such lexicons to improve language models when textual training data is limited to as few as a thousand sentences. The method involves learning cross-lingual word embeddings as a preliminary step in training monolingual language models. Results across a number of languages show that language models are improved by this pre-training. Application to Yongning Na, a threatened language, highlights challenges in deploying the approach in real low-resource environments. |
Tasks | Language Modelling, Speech Recognition, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1088/ |
PWC | https://paperswithcode.com/paper/cross-lingual-word-embeddings-for-low |
Repo | |
Framework | |
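The method's core step — seeding a monolingual language model with cross-lingually pre-trained embeddings — amounts to initializing the embedding matrix from the pretrained vectors where available. A sketch with names and a random-fallback scheme of my own, not the paper's code:

```python
import random

def init_embeddings(vocab, pretrained, dim, seed=0):
    """Build an embedding matrix for a low-resource LM: copy a
    cross-lingually pre-trained vector when the word has one,
    otherwise fall back to a small random vector."""
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        if word in pretrained:
            matrix.append(list(pretrained[word]))
        else:
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return matrix
```

The LM is then trained as usual on the thousand-odd available sentences; the pre-trained rows carry over distributional information learned via the bilingual lexicon.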
A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST
Title | A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST |
Authors | Amir Zeldes |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3603/ |
PWC | https://paperswithcode.com/paper/a-distributional-view-of-discourse |
Repo | |
Framework | |
Towards Quantum Language Models
Title | Towards Quantum Language Models |
Authors | Ivano Basile, Fabio Tamburini |
Abstract | This paper presents a new approach to building language models using quantum probability theory, a Quantum Language Model (QLM). It shows that, relying on this probability calculus, it is possible to build stochastic models able to benefit from quantum correlations due to interference and entanglement. We extensively tested our approach, showing its superior performance both in terms of model perplexity and when inserted into an automatic speech recognition evaluation setting, compared with state-of-the-art language modelling techniques. |
Tasks | Decision Making, Information Retrieval, Language Modelling, Machine Translation, Part-Of-Speech Tagging, Speech Recognition |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1196/ |
PWC | https://paperswithcode.com/paper/towards-quantum-language-models |
Repo | |
Framework | |
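The first evaluation measure in the abstract is model perplexity. As a reference point, perplexity is simply the exponentiated average negative log-likelihood of the test tokens — a generic definition, nothing quantum-specific:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability that the
    model assigned to each observed token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)
```

A uniform model over a 4-word vocabulary assigns every token probability 0.25 and therefore has perplexity exactly 4.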
Is writing style predictive of scientific fraud?
Title | Is writing style predictive of scientific fraud? |
Authors | Chloé Braud, Anders Søgaard |
Abstract | The problem of detecting scientific fraud using machine learning was recently introduced, with initial, positive results from a model taking into account various general indicators. The results seem to suggest that writing style is predictive of scientific fraud. We revisit these initial experiments, and show that the leave-one-out testing procedure they used likely leads to a slight over-estimate of the predictability, but also that simple models can outperform their proposed model by some margin. We go on to explore more abstract linguistic features, such as linguistic complexity and discourse structure, only to obtain negative results. Upon analyzing our models, we do see some interesting patterns, though: scientific fraud, for example, contains less comparison, as well as different types of hedging and ways of presenting logical reasoning. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4905/ |
PWC | https://paperswithcode.com/paper/is-writing-style-predictive-of-scientific |
Repo | |
Framework | |
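The fragility of leave-one-out testing is easy to demonstrate with a toy majority-class baseline: on a balanced binary dataset, removing the held-out item always tips the training majority toward the opposite class, so the baseline scores 0%. This is the mirror image of the optimistic bias the authors discuss, but the same mechanism — the evaluation item is not independent of the training fold:

```python
def leave_one_out_accuracy(labels):
    """LOO evaluation of a majority-class 'classifier': each held-out
    item is predicted as the majority label of the remaining items."""
    correct = 0
    for i, y in enumerate(labels):
        rest = labels[:i] + labels[i + 1:]
        pred = max(set(rest), key=rest.count)
        correct += (pred == y)
    return correct / len(labels)
```

A baseline that trivially scores 50% under a held-out split scores 0% here, so LOO estimates on small corpora should be treated with caution in both directions.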
Tackling Biomedical Text Summarization: OAQA at BioASQ 5B
Title | Tackling Biomedical Text Summarization: OAQA at BioASQ 5B |
Authors | Khyathi Chandu, Aakanksha Naik, Aditya Chandrasekar, Zi Yang, Niloy Gupta, Eric Nyberg |
Abstract | In this paper, we describe our participation in phase B of task 5b of the fifth edition of the annual BioASQ challenge, which includes answering factoid, list, yes-no and summary questions from biomedical data. We describe our techniques with an emphasis on ideal answer generation, where the goal is to produce a relevant, precise, non-redundant, query-oriented summary from multiple relevant documents. We make use of extractive summarization techniques to address this task and experiment with different biomedical ontologies and various algorithms including agglomerative clustering, Maximum Marginal Relevance (MMR) and sentence compression. We propose a novel word embedding based tf-idf similarity metric and a soft positional constraint which improve our system performance. We evaluate our techniques on test batch 4 from the fourth edition of the challenge. Our best system achieves a ROUGE-2 score of 0.6534 and ROUGE-SU4 score of 0.6536. |
Tasks | Question Answering, Sentence Compression, Text Summarization |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2307/ |
PWC | https://paperswithcode.com/paper/tackling-biomedical-text-summarization-oaqa |
Repo | |
Framework | |
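Of the algorithms listed, Maximal Marginal Relevance has a compact greedy form: repeatedly pick the sentence that best trades query relevance against redundancy with sentences already selected. A sketch with caller-supplied similarity functions (the paper pairs MMR with an embedding-based tf-idf similarity, abstracted away here):

```python
def mmr_select(candidates, sim_query, sim_pair, k, lam=0.7):
    """Greedy MMR: score(i) = lam * relevance(i) - (1 - lam) * redundancy(i),
    where redundancy is the max similarity to any already-selected item."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max((sim_pair(i, j) for j in selected), default=0.0)
            return lam * sim_query(i) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]
```

With `lam` closer to 0 the penalty for redundancy dominates, which is what makes MMR useful for the non-redundant ideal answers the task asks for.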
Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets
Title | Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets |
Authors | Pranav Goel, Devang Kulshreshtha, Prayas Jain, Kaushal Kumar Shukla |
Abstract | The paper describes the best performing system for EmoInt - a shared task to predict the intensity of emotions in tweets. Intensity is a real-valued score between 0 and 1. The emotions are classified as anger, fear, joy and sadness. We apply three different deep neural network based models, which approach the problem from essentially different directions. Our final performance, quantified by an average Pearson correlation score of 74.7 and an average Spearman correlation score of 73.5, is obtained using an ensemble of the three models. We outperform the baseline model of the shared task by 9.9% and 9.4% in Pearson and Spearman correlation scores respectively. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5207/ |
PWC | https://paperswithcode.com/paper/prayas-at-emoint-2017-an-ensemble-of-deep |
Repo | |
Framework | |
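The reported numbers combine the three models' real-valued predictions and are measured with Pearson/Spearman correlation. A minimal pure-Python version of both pieces; the plain averaging here is an assumption — the paper's ensembling may be weighted:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ensemble_predict(model_outputs):
    """Average the intensity predictions of several models, item-wise."""
    return [sum(vals) / len(vals) for vals in zip(*model_outputs)]
```

Spearman correlation is the same formula applied to the ranks of the values rather than the values themselves.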
Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study
Title | Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study |
Authors | Nan Wang |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-3023/ |
PWC | https://paperswithcode.com/paper/negotiation-of-antibiotic-treatment-in |
Repo | |
Framework | |
Active Sentiment Domain Adaptation
Title | Active Sentiment Domain Adaptation |
Authors | Fangzhao Wu, Yongfeng Huang, Jun Yan |
Abstract | Domain adaptation is an important technology for handling the domain dependence problem in the sentiment analysis field. Existing methods usually rely on sentiment classifiers trained in source domains. However, their performance may decline heavily if the distributions of sentiment features in the source and target domains differ significantly. In this paper, we propose an active sentiment domain adaptation approach to handle this problem. Instead of using source domain sentiment classifiers, our approach adapts general-purpose sentiment lexicons to the target domain with the help of a small number of labeled samples, which are selected and annotated in an active learning mode, as well as the domain-specific sentiment similarities among words mined from unlabeled samples of the target domain. A unified model is proposed to fuse the different types of sentiment information and train a sentiment classifier for the target domain. Extensive experiments on benchmark datasets show that our approach can train an accurate sentiment classifier with fewer labeled samples. |
Tasks | Active Learning, Domain Adaptation, Sentiment Analysis, Transfer Learning |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1156/ |
PWC | https://paperswithcode.com/paper/active-sentiment-domain-adaptation |
Repo | |
Framework | |
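The "selected in an active learning mode" step can be caricatured as uncertainty sampling over lexicon scores: annotate the samples whose current sentiment score sits closest to the decision boundary. This is my simplification; the paper's selection criterion is more elaborate:

```python
def select_for_annotation(scores, budget):
    """Uncertainty sampling: return the indices of the `budget` samples
    whose lexicon-based sentiment score is closest to 0 (the boundary)."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    return ranked[:budget]
```

Samples with scores near ±1 are already confidently classified, so annotation effort is spent where the adapted lexicon is least sure.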
Improving End-to-End Memory Networks with Unified Weight Tying
Title | Improving End-to-End Memory Networks with Unified Weight Tying |
Authors | Fei Liu, Trevor Cohn, Timothy Baldwin |
Abstract | |
Tasks | Image Classification, Speech Recognition |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/U17-1002/ |
PWC | https://paperswithcode.com/paper/improving-end-to-end-memory-networks-with |
Repo | |
Framework | |
Proceedings of the 15th International Conference on Parsing Technologies
Title | Proceedings of the 15th International Conference on Parsing Technologies |
Authors | |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6300/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-15th-international-1 |
Repo | |
Framework | |
Speech segmentation with a neural encoder model of working memory
Title | Speech segmentation with a neural encoder model of working memory |
Authors | Micha Elsner, Cory Shain |
Abstract | We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input. Cognitive biases toward phonological and syntactic predictability in speech are rooted in the limitations of human memory (Baddeley et al., 1998); compressed representations are easier to acquire and retain in memory. To model the biases introduced by these memory limitations, our system uses an LSTM-based encoder-decoder with a small number of hidden units, then searches for a segmentation that minimizes autoencoding loss. Linguistically meaningful segments (e.g. words) should share regular patterns of features that facilitate decoder performance in comparison to random segmentations, and we show that our learner discovers these patterns when trained on either phoneme sequences or raw acoustics. To our knowledge, ours is the first fully unsupervised system to be able to segment both symbolic and acoustic representations of speech. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1112/ |
PWC | https://paperswithcode.com/paper/speech-segmentation-with-a-neural-encoder |
Repo | |
Framework | |
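The search the abstract describes — choose the segmentation that minimizes autoencoding loss — can be shown with an exhaustive toy search, swapping the LSTM reconstruction loss for a caller-supplied function. This brute-force form is only feasible for very short inputs (2^(n-1) segmentations), whereas the paper searches approximately:

```python
from itertools import combinations

def best_segmentation(sequence, loss):
    """Try every segmentation of `sequence` and return the list of
    segments whose summed per-segment loss is minimal."""
    n = len(sequence)
    best, best_loss = None, float("inf")
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)
            segs = [sequence[a:b] for a, b in zip(bounds, bounds[1:])]
            total = sum(loss(s) for s in segs)
            if total < best_loss:
                best, best_loss = segs, total
    return best
```

With a loss that is low for easily reconstructed (word-like) segments, the minimizer recovers word boundaries, mirroring the paper's intuition that compressible segments are linguistically meaningful.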