Paper Group NANR 58
chrF++: words helping character n-grams. Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems. Risk Bounds for Transferring Representations With and Without Fine-Tuning. Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts. C …
chrF++: words helping character n-grams
Title | chrF++: words helping character n-grams |
Authors | Maja Popović |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4770/ |
PWC | https://paperswithcode.com/paper/chrf-words-helping-character-n-grams |
Repo | |
Framework | |
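The chrF family scores MT output with character n-gram F-scores; chrF++ additionally averages in word unigram and bigram F-scores. A minimal sketch of the character n-gram part (function names are mine, not the official implementation, and the word n-gram component is omitted):

```python
from collections import Counter

def char_ngrams(text, n):
    """Character n-grams of a string, spaces removed (as in chrF)."""
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf_score(hypothesis, reference, max_n=6, beta=2.0):
    """Average character n-gram F-beta score; beta=2 weights recall higher."""
    f_scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string shorter than n
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            f_scores.append(0.0)
        else:
            f_scores.append((1 + beta ** 2) * prec * rec / (beta ** 2 * prec + rec))
    return sum(f_scores) / len(f_scores) if f_scores else 0.0
```

Identical strings score 1.0; disjoint strings score 0.0. The released metric applies the same F-beta formula to word unigrams and bigrams and averages those in as well.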
Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems
Title | Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems |
Authors | Louisa Pragst, Koichiro Yoshino, Wolfgang Minker, Satoshi Nakamura, Stefan Ultes |
Abstract | In a dialogue system, the dialogue manager selects one of several system actions and thereby determines the system's behaviour. Defining all possible system actions in a dialogue system by hand is tedious work. While efforts have been made to automatically generate such system actions, those approaches are mostly focused on providing functional system behaviour. Adapting the system behaviour to the user becomes a difficult task due to the limited number of system actions available. We aim to increase the adaptability of a dialogue system by automatically generating variants of system actions. In this work, we introduce an approach to automatically generate action variants for elaborateness and indirectness. Our proposed algorithm extracts RDF triplets from a knowledge base and rates their relevance to the original system action to find suitable content. We show that the results of our algorithm are mostly perceived similarly to human-generated elaborateness and indirectness and can be used to adapt a conversation to the current user and situation. We also discuss where the results of our algorithm are still lacking and how this could be improved: taking into account the conversation topic as well as the culture of the user is likely to have a beneficial effect on the user's perception. |
Tasks | Spoken Dialogue Systems |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1092/ |
PWC | https://paperswithcode.com/paper/acquisition-and-assessment-of-semantic |
Repo | |
Framework | |
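The pipeline in the abstract — extract RDF triples, then rate their relevance to the original system action — can be caricatured with token-overlap scoring. This is a stand-in of my own, not the paper's rating method:

```python
def triple_relevance(triple, action_text):
    """Score a (subject, predicate, object) triple by the fraction of its
    tokens that also appear in the original system action text."""
    action_tokens = set(action_text.lower().split())
    triple_tokens = set(" ".join(triple).lower().replace("_", " ").split())
    if not triple_tokens:
        return 0.0
    return len(action_tokens & triple_tokens) / len(triple_tokens)
```

A triple scoring highly against the action would be a candidate for an elaborateness variant (added detail) or an indirectness variant (related content replacing a direct statement).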
Risk Bounds for Transferring Representations With and Without Fine-Tuning
Title | Risk Bounds for Transferring Representations With and Without Fine-Tuning |
Authors | Daniel McNamara, Maria-Florina Balcan |
Abstract | A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments. |
Tasks | Word Embeddings |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=869 |
PDF | http://proceedings.mlr.press/v70/mcnamara17a/mcnamara17a.pdf |
PWC | https://paperswithcode.com/paper/risk-bounds-for-transferring-representations |
Repo | |
Framework | |
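The PAC-Bayes setting the abstract refers to can be illustrated with a bound of this flavor (a generic McAllester-style statement, not the paper's exact theorem): with probability at least 1 − δ over an i.i.d. sample of size m, simultaneously for all posteriors ρ,

```latex
\mathbb{E}_{h\sim\rho}\big[R(h)\big]
  \;\le\;
\mathbb{E}_{h\sim\rho}\big[\hat{R}(h)\big]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
```

Here π is the prior constructed from the source-task representation and ρ the posterior after fine-tuning on target data. The relevance to transfer is that a source-derived prior already close to a good target posterior makes the KL term, and hence the risk bound, small.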
Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts
Title | Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts |
Authors | Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya |
Abstract | The differences in the frequencies of some parts of speech (POS), particularly function words, and in lexical diversity between male and female speech have been pointed out in a number of papers. Classifiers using exclusively context-independent parameters have proved to be highly effective. However, there are still issues to be addressed, as many studies are performed for English only and the genre and topic of texts are sometimes neglected. The aim of this paper is to investigate the association between context-independent parameters of Russian written texts and the gender of their authors and to design predictive regression models. A number of correlations were found. The obtained data are in good agreement with the results obtained for other languages. A model based on the 5 parameters with the highest correlation coefficients was designed. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4909/ |
PWC | https://paperswithcode.com/paper/differences-in-type-token-ratio-and-part-of |
Repo | |
Framework | |
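Type-token ratio, the lexical-diversity measure named in the title, is the number of distinct word types divided by the number of word tokens. A one-function sketch (tokenizing on `\w+` is my simplification; the paper's preprocessing for Russian may differ):

```python
import re

def type_token_ratio(text):
    """TTR = distinct word types / total word tokens."""
    tokens = re.findall(r"\w+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0
```

Note that raw TTR falls as texts get longer, so gender comparisons of the kind studied here are only meaningful on texts of comparable length.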
Cross-Lingual Word Embeddings for Low-Resource Language Modeling
Title | Cross-Lingual Word Embeddings for Low-Resource Language Modeling |
Authors | Oliver Adams, Adam Makarucha, Graham Neubig, Steven Bird, Trevor Cohn |
Abstract | Most languages have no established writing system and minimal written records. However, textual data is essential for natural language processing, and particularly important for training language models to support speech recognition. Even in cases where text data is missing, there are some languages for which bilingual lexicons are available, since creating lexicons is a fundamental task of documentary linguistics. We investigate the use of such lexicons to improve language models when textual training data is limited to as few as a thousand sentences. The method involves learning cross-lingual word embeddings as a preliminary step in training monolingual language models. Results across a number of languages show that language models are improved by this pre-training. Application to Yongning Na, a threatened language, highlights challenges in deploying the approach in real low-resource environments. |
Tasks | Language Modelling, Speech Recognition, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1088/ |
PWC | https://paperswithcode.com/paper/cross-lingual-word-embeddings-for-low |
Repo | |
Framework | |
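The method's core step — seeding a monolingual language model with cross-lingually pre-trained embeddings — amounts to initializing the embedding matrix from the pretrained vectors where available. A sketch with names and a random-fallback scheme of my own, not the paper's code:

```python
import random

def init_embeddings(vocab, pretrained, dim, seed=0):
    """Build an embedding matrix for a low-resource LM: copy a
    cross-lingually pre-trained vector when the word has one,
    otherwise fall back to a small random vector."""
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        if word in pretrained:
            matrix.append(list(pretrained[word]))
        else:
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return matrix
```

The LM is then trained as usual on the thousand-odd available sentences; the pre-trained rows carry over distributional information learned via the bilingual lexicon.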
A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST
Title | A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST |
Authors | Amir Zeldes |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3603/ |
PWC | https://paperswithcode.com/paper/a-distributional-view-of-discourse |
Repo | |
Framework | |
Towards Quantum Language Models
Title | Towards Quantum Language Models |
Authors | Ivano Basile, Fabio Tamburini |
Abstract | This paper presents a new approach to building language models using quantum probability theory, a Quantum Language Model (QLM). It shows that, relying on this probability calculus, it is possible to build stochastic models able to benefit from quantum correlations due to interference and entanglement. We extensively tested our approach, showing its superior performance both in terms of model perplexity and when inserted into an automatic speech recognition evaluation setting, compared with state-of-the-art language modelling techniques. |
Tasks | Decision Making, Information Retrieval, Language Modelling, Machine Translation, Part-Of-Speech Tagging, Speech Recognition |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1196/ |
PWC | https://paperswithcode.com/paper/towards-quantum-language-models |
Repo | |
Framework | |
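The first evaluation measure in the abstract is model perplexity. As a reference point, perplexity is simply the exponentiated average negative log-likelihood of the test tokens — a generic definition, nothing quantum-specific:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability that the
    model assigned to each observed token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)
```

A uniform model over a 4-word vocabulary assigns every token probability 0.25 and therefore has perplexity exactly 4.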
Is writing style predictive of scientific fraud?
Title | Is writing style predictive of scientific fraud? |
Authors | Chloé Braud, Anders Søgaard |
Abstract | The problem of detecting scientific fraud using machine learning was recently introduced, with initial, positive results from a model taking into account various general indicators. The results seem to suggest that writing style is predictive of scientific fraud. We revisit these initial experiments, and show that the leave-one-out testing procedure they used likely leads to a slight over-estimate of the predictability, but also that simple models can outperform their proposed model by some margin. We go on to explore more abstract linguistic features, such as linguistic complexity and discourse structure, only to obtain negative results. Upon analyzing our models, we do see some interesting patterns, though: scientific fraud, for example, contains less comparison, as well as different types of hedging and ways of presenting logical reasoning. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4905/ |
PWC | https://paperswithcode.com/paper/is-writing-style-predictive-of-scientific |
Repo | |
Framework | |
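The fragility of leave-one-out testing is easy to demonstrate with a toy majority-class baseline: on a balanced binary dataset, removing the held-out item always tips the training majority toward the opposite class, so the baseline scores 0%. This is the mirror image of the optimistic bias the authors discuss, but the same mechanism — the evaluation item is not independent of the training fold:

```python
def leave_one_out_accuracy(labels):
    """LOO evaluation of a majority-class 'classifier': each held-out
    item is predicted as the majority label of the remaining items."""
    correct = 0
    for i, y in enumerate(labels):
        rest = labels[:i] + labels[i + 1:]
        pred = max(set(rest), key=rest.count)
        correct += (pred == y)
    return correct / len(labels)
```

A baseline that trivially scores 50% under a held-out split scores 0% here, so LOO estimates on small corpora should be treated with caution in both directions.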
Tackling Biomedical Text Summarization: OAQA at BioASQ 5B
Title | Tackling Biomedical Text Summarization: OAQA at BioASQ 5B |
Authors | Khyathi Chandu, Aakanksha Naik, Aditya Chandrasekar, Zi Yang, Niloy Gupta, Eric Nyberg |
Abstract | In this paper, we describe our participation in phase B of task 5b of the fifth edition of the annual BioASQ challenge, which includes answering factoid, list, yes-no and summary questions from biomedical data. We describe our techniques with an emphasis on ideal answer generation, where the goal is to produce a relevant, precise, non-redundant, query-oriented summary from multiple relevant documents. We make use of extractive summarization techniques to address this task and experiment with different biomedical ontologies and various algorithms including agglomerative clustering, Maximum Marginal Relevance (MMR) and sentence compression. We propose a novel word embedding based tf-idf similarity metric and a soft positional constraint which improve our system performance. We evaluate our techniques on test batch 4 from the fourth edition of the challenge. Our best system achieves a ROUGE-2 score of 0.6534 and ROUGE-SU4 score of 0.6536. |
Tasks | Question Answering, Sentence Compression, Text Summarization |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2307/ |
PWC | https://paperswithcode.com/paper/tackling-biomedical-text-summarization-oaqa |
Repo | |
Framework | |
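Of the algorithms listed, Maximal Marginal Relevance has a compact greedy form: repeatedly pick the sentence that best trades query relevance against redundancy with sentences already selected. A sketch with caller-supplied similarity functions (the paper pairs MMR with an embedding-based tf-idf similarity, abstracted away here):

```python
def mmr_select(candidates, sim_query, sim_pair, k, lam=0.7):
    """Greedy MMR: score(i) = lam * relevance(i) - (1 - lam) * redundancy(i),
    where redundancy is the max similarity to any already-selected item."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max((sim_pair(i, j) for j in selected), default=0.0)
            return lam * sim_query(i) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]
```

With `lam` closer to 0 the penalty for redundancy dominates, which is what makes MMR useful for the non-redundant ideal answers the task asks for.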
Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets
Title | Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets |
Authors | Pranav Goel, Devang Kulshreshtha, Prayas Jain, Kaushal Kumar Shukla |
Abstract | The paper describes the best performing system for EmoInt - a shared task to predict the intensity of emotions in tweets. Intensity is a real-valued score between 0 and 1. The emotions are classified as anger, fear, joy and sadness. We apply three different deep neural network based models, which approach the problem from essentially different directions. Our final performance, quantified by an average Pearson correlation score of 74.7 and an average Spearman correlation score of 73.5, is obtained using an ensemble of the three models. We outperform the baseline model of the shared task by 9.9% and 9.4% in Pearson and Spearman correlation scores respectively. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5207/ |
PWC | https://paperswithcode.com/paper/prayas-at-emoint-2017-an-ensemble-of-deep |
Repo | |
Framework | |
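The reported numbers combine the three models' real-valued predictions and are measured with Pearson/Spearman correlation. A minimal pure-Python version of both pieces; the plain averaging here is an assumption — the paper's ensembling may be weighted:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ensemble_predict(model_outputs):
    """Average the intensity predictions of several models, item-wise."""
    return [sum(vals) / len(vals) for vals in zip(*model_outputs)]
```

Spearman correlation is the same formula applied to the ranks of the values rather than the values themselves.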
Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study
Title | Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study |
Authors | Nan Wang |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-3023/ |
PWC | https://paperswithcode.com/paper/negotiation-of-antibiotic-treatment-in |
Repo | |
Framework | |
Active Sentiment Domain Adaptation
Title | Active Sentiment Domain Adaptation |
Authors | Fangzhao Wu, Yongfeng Huang, Jun Yan |
Abstract | Domain adaptation is an important technology for handling the domain dependence problem in the sentiment analysis field. Existing methods usually rely on sentiment classifiers trained in source domains. However, their performance may decline heavily if the distributions of sentiment features in the source and target domains differ significantly. In this paper, we propose an active sentiment domain adaptation approach to handle this problem. Instead of using source domain sentiment classifiers, our approach adapts general-purpose sentiment lexicons to the target domain with the help of a small number of labeled samples, which are selected and annotated in an active learning mode, as well as the domain-specific sentiment similarities among words mined from unlabeled samples of the target domain. A unified model is proposed to fuse the different types of sentiment information and train a sentiment classifier for the target domain. Extensive experiments on benchmark datasets show that our approach can train an accurate sentiment classifier with fewer labeled samples. |
Tasks | Active Learning, Domain Adaptation, Sentiment Analysis, Transfer Learning |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1156/ |
PWC | https://paperswithcode.com/paper/active-sentiment-domain-adaptation |
Repo | |
Framework | |
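The "selected in an active learning mode" step can be caricatured as uncertainty sampling over lexicon scores: annotate the samples whose current sentiment score sits closest to the decision boundary. This is my simplification; the paper's selection criterion is more elaborate:

```python
def select_for_annotation(scores, budget):
    """Uncertainty sampling: return the indices of the `budget` samples
    whose lexicon-based sentiment score is closest to 0 (the boundary)."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    return ranked[:budget]
```

Samples with scores near ±1 are already confidently classified, so annotation effort is spent where the adapted lexicon is least sure.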
Improving End-to-End Memory Networks with Unified Weight Tying
Title | Improving End-to-End Memory Networks with Unified Weight Tying |
Authors | Fei Liu, Trevor Cohn, Timothy Baldwin |
Abstract | |
Tasks | Image Classification, Speech Recognition |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/U17-1002/ |
PWC | https://paperswithcode.com/paper/improving-end-to-end-memory-networks-with |
Repo | |
Framework | |
Proceedings of the 15th International Conference on Parsing Technologies
Title | Proceedings of the 15th International Conference on Parsing Technologies |
Authors | |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6300/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-15th-international-1 |
Repo | |
Framework | |
Speech segmentation with a neural encoder model of working memory
Title | Speech segmentation with a neural encoder model of working memory |
Authors | Micha Elsner, Cory Shain |
Abstract | We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input. Cognitive biases toward phonological and syntactic predictability in speech are rooted in the limitations of human memory (Baddeley et al., 1998); compressed representations are easier to acquire and retain in memory. To model the biases introduced by these memory limitations, our system uses an LSTM-based encoder-decoder with a small number of hidden units, then searches for a segmentation that minimizes autoencoding loss. Linguistically meaningful segments (e.g. words) should share regular patterns of features that facilitate decoder performance in comparison to random segmentations, and we show that our learner discovers these patterns when trained on either phoneme sequences or raw acoustics. To our knowledge, ours is the first fully unsupervised system to be able to segment both symbolic and acoustic representations of speech. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1112/ |
PWC | https://paperswithcode.com/paper/speech-segmentation-with-a-neural-encoder |
Repo | |
Framework | |
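The search the abstract describes — choose the segmentation that minimizes autoencoding loss — can be shown with an exhaustive toy search, swapping the LSTM reconstruction loss for a caller-supplied function. This brute-force form is only feasible for very short inputs (2^(n-1) segmentations), whereas the paper searches approximately:

```python
from itertools import combinations

def best_segmentation(sequence, loss):
    """Try every segmentation of `sequence` and return the list of
    segments whose summed per-segment loss is minimal."""
    n = len(sequence)
    best, best_loss = None, float("inf")
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)
            segs = [sequence[a:b] for a, b in zip(bounds, bounds[1:])]
            total = sum(loss(s) for s in segs)
            if total < best_loss:
                best, best_loss = segs, total
    return best
```

With a loss that is low for easily reconstructed (word-like) segments, the minimizer recovers word boundaries, mirroring the paper's intuition that compressible segments are linguistically meaningful.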