Paper Group NANR 176
Diversifying Neural Conversation Model with Maximal Marginal Relevance. A Statistical Framework for Product Description Generation. Unsupervised Pretraining for Sequence to Sequence Learning. Using Analytic Scoring Rubrics in the Automatic Assessment of College-Level Summary Writing Tasks in L2. Grouping business news stories based on salience of n …
Diversifying Neural Conversation Model with Maximal Marginal Relevance
Title | Diversifying Neural Conversation Model with Maximal Marginal Relevance |
Authors | Yiping Song, Zhiliang Tian, Dongyan Zhao, Ming Zhang, Rui Yan |
Abstract | Neural conversation systems, typically using sequence-to-sequence (seq2seq) models, have recently shown promising progress. However, traditional seq2seq models suffer from a severe weakness: during beam search decoding, they tend to rank universal replies at the top of the candidate list, resulting in a lack of diversity among candidate replies. Maximum Marginal Relevance (MMR) is a ranking algorithm that has been widely used for subset selection. In this paper, we propose the MMR-BS decoding method, which incorporates MMR into the beam search (BS) process of seq2seq. The MMR-BS method improves the diversity of generated replies without sacrificing their high relevance to the user-issued query. Experiments show that our proposed model achieves the best performance among the compared methods. |
Tasks | Document Summarization, Information Retrieval, Text Categorization |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2029/ |
https://www.aclweb.org/anthology/I17-2029 | |
PWC | https://paperswithcode.com/paper/diversifying-neural-conversation-model-with |
Repo | |
Framework | |
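The abstract above builds on MMR as a ranking algorithm for subset selection. A minimal sketch of the generic greedy MMR procedure follows; it is not the paper's MMR-BS decoder, and the function names, the `lam` trade-off parameter, and the Jaccard token similarity are all assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (a stand-in similarity, assumed here)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def mmr_select(
    candidates: Sequence[str],
    relevance: Callable[[str], float],
    similarity: Callable[[str, str], float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily pick k items, trading relevance against redundancy.

    Each step picks the candidate maximising
    lam * relevance(c) - (1 - lam) * max similarity to anything already picked.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(c: str) -> float:
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance(c) - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy relevance scores (made up for illustration, not from the paper).
toy_scores = {
    "i do not know": 0.9,
    "i do not know .": 0.85,
    "try the noodle bar": 0.6,
}
diverse = mmr_select(list(toy_scores), toy_scores.get, jaccard, k=2, lam=0.5)
by_relevance = mmr_select(list(toy_scores), toy_scores.get, jaccard, k=2, lam=1.0)
```

With `lam=1.0` the selection reduces to plain relevance ranking, so the two near-duplicate replies are both kept; lowering `lam` penalises the second near-duplicate in favour of the less similar reply.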
A Statistical Framework for Product Description Generation
Title | A Statistical Framework for Product Description Generation |
Authors | Jinpeng Wang, Yutai Hou, Jing Liu, Yunbo Cao, Chin-Yew Lin |
Abstract | We present in this paper a statistical framework that generates accurate and fluent product descriptions from product attributes. Specifically, after extracting templates and learning writing knowledge from attribute-description parallel data, we use the learned knowledge to decide what to say and how to say it for product description generation. To evaluate accuracy and fluency for the generated descriptions, in addition to BLEU and Recall, we propose to measure what to say (in terms of attribute coverage) and to measure how to say it (by attribute-specified generation) separately. Experimental results show that our framework is effective. |
Tasks | Data-to-Text Generation, Text Generation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2032/ |
https://www.aclweb.org/anthology/I17-2032 | |
PWC | https://paperswithcode.com/paper/a-statistical-framework-for-product |
Repo | |
Framework | |
Unsupervised Pretraining for Sequence to Sequence Learning
Title | Unsupervised Pretraining for Sequence to Sequence Learning |
Authors | Prajit Ramachandran, Peter Liu, Quoc Le |
Abstract | This work presents a general unsupervised learning method to improve the accuracy of sequence to sequence (seq2seq) models. In our method, the weights of the encoder and decoder of a seq2seq model are initialized with the pretrained weights of two language models and then fine-tuned with labeled data. We apply this method to challenging benchmarks in machine translation and abstractive summarization and find that it significantly improves the subsequent supervised models. Our main result is that pretraining improves the generalization of seq2seq models. We achieve state-of-the-art results on the WMT English→German task, surpassing a range of methods using both phrase-based machine translation and neural machine translation. Our method achieves a significant improvement of 1.3 BLEU from the previous best models on both WMT'14 and WMT'15 English→German. We also conduct human evaluations on abstractive summarization and find that our method outperforms a purely supervised learning baseline in a statistically significant manner. |
Tasks | Abstractive Text Summarization, Language Modelling, Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1039/ |
https://www.aclweb.org/anthology/D17-1039 | |
PWC | https://paperswithcode.com/paper/unsupervised-pretraining-for-sequence-to-1 |
Repo | |
Framework | |
Using Analytic Scoring Rubrics in the Automatic Assessment of College-Level Summary Writing Tasks in L2
Title | Using Analytic Scoring Rubrics in the Automatic Assessment of College-Level Summary Writing Tasks in L2 |
Authors | Tamara Sladoljev-Agejev, Jan Šnajder |
Abstract | Assessing summaries is a demanding, yet useful task which provides valuable information on language competence, especially for second language learners. We consider automated scoring of a college-level summary writing task in English as a second language (EL2). We adopt the Reading-for-Understanding (RU) cognitive framework, extended with the Reading-to-Write (RW) element, and use analytic scoring with six rubrics covering content and writing quality. We show that regression models with reference-based and linguistic features considerably outperform the baselines across all the rubrics. Moreover, we find interesting correlations between summary features and analytic rubrics, revealing the links between the RU and RW constructs. |
Tasks | Reading Comprehension |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2031/ |
https://www.aclweb.org/anthology/I17-2031 | |
PWC | https://paperswithcode.com/paper/using-analytic-scoring-rubrics-in-the |
Repo | |
Framework | |
Grouping business news stories based on salience of named entities
Title | Grouping business news stories based on salience of named entities |
Authors | Llorenç Escoter, Lidia Pivovarova, Mian Du, Anisia Katinskaia, Roman Yangarber |
Abstract | In news aggregation systems focused on broad news domains, certain stories may appear in multiple articles. Depending on the relative importance of the story, the number of versions can reach dozens or hundreds within a day. The text in these versions may be nearly identical or quite different. Linking multiple versions of a story into a single group brings several important benefits to the end-user: reducing the cognitive load on the reader, as well as signaling the relative importance of the story. We present a grouping algorithm, and explore several vector-based representations of input documents: from a baseline using keywords, to a method using salience, a measure of importance of named entities in the text. We demonstrate that features beyond keywords yield substantial improvements, verified on a manually-annotated corpus of business news stories. |
Tasks | Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1103/ |
https://www.aclweb.org/anthology/E17-1103 | |
PWC | https://paperswithcode.com/paper/grouping-business-news-stories-based-on |
Repo | |
Framework | |
SSAS: Semantic Similarity for Abstractive Summarization
Title | SSAS: Semantic Similarity for Abstractive Summarization |
Authors | Raghuram Vadapalli, Litton J Kurisinkel, Manish Gupta, Vasudeva Varma |
Abstract | Ideally a metric evaluating an abstract system summary should represent the extent to which the system-generated summary approximates the semantic inference conceived by the reader using a human-written reference summary. Most of the previous approaches relied upon word or syntactic sub-sequence overlap to evaluate system-generated summaries. Such metrics cannot evaluate the summary at semantic inference level. Through this work we introduce the metric of Semantic Similarity for Abstractive Summarization (SSAS), which leverages natural language inference and paraphrasing techniques to frame a novel approach to evaluate system summaries at semantic inference level. SSAS is based upon a weighted composition of quantities representing the level of agreement, contradiction, independence, paraphrasing, and optionally ROUGE score between a system-generated and a human-written summary. |
Tasks | Abstractive Text Summarization, Natural Language Inference, Semantic Similarity, Semantic Textual Similarity |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2034/ |
https://www.aclweb.org/anthology/I17-2034 | |
PWC | https://paperswithcode.com/paper/ssas-semantic-similarity-for-abstractive |
Repo | |
Framework | |
Hyperspherical Query Likelihood Models with Word Embeddings
Title | Hyperspherical Query Likelihood Models with Word Embeddings |
Authors | Ryo Masumura, Taichi Asami, Hirokazu Masataki, Kugatsu Sadamitsu, Kyosuke Nishida, Ryuichiro Higashinaka |
Abstract | This paper presents an initial study on hyperspherical query likelihood models (QLMs) for information retrieval (IR). Our motivation is to naturally utilize pre-trained word embeddings for probabilistic IR. To this end, the key idea is to directly leverage the word embeddings as random variables for directional probabilistic models based on von Mises-Fisher distributions, which are closely related to cosine distances. The proposed method enables us to theoretically take semantic similarities between document and target queries into consideration without introducing heuristic expansion techniques. In addition, this paper reveals relationships between hyperspherical QLMs and conventional QLMs. Experiments show document retrieval evaluation results in which a hyperspherical QLM is compared to conventional QLMs and document distance metrics using word or document embeddings. |
Tasks | Information Retrieval, Language Modelling, Semantic Textual Similarity, Word Embeddings |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2036/ |
https://www.aclweb.org/anthology/I17-2036 | |
PWC | https://paperswithcode.com/paper/hyperspherical-query-likelihood-models-with |
Repo | |
Framework | |
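The link between von Mises-Fisher distributions and cosine distance mentioned in the abstract above can be made explicit with the standard density on the unit hypersphere. The symbols below (dimension $d$, mean direction $\mu$, concentration $\kappa$) are the textbook ones and are an assumption here; the paper's own notation and model may differ.

```latex
% von Mises-Fisher density for a unit vector x in R^d,
% with mean direction mu (||mu|| = 1) and concentration kappa >= 0
f_d(x;\mu,\kappa) = C_d(\kappa)\,\exp\!\left(\kappa\,\mu^{\top}x\right),
\qquad
C_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)}

% For unit vectors, mu^T x equals the cosine similarity of mu and x, so
\log f_d(x;\mu,\kappa) = \kappa\,\cos(\mu, x) + \log C_d(\kappa)
```

Here $I_{\nu}$ is the modified Bessel function of the first kind. Because the log-density is an affine function of the cosine similarity, ranking by likelihood under a fixed $\mu$ and $\kappa$ orders unit vectors the same way as ranking by cosine similarity to $\mu$.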
Dual Constrained Question Embeddings with Relational Knowledge Bases for Simple Question Answering
Title | Dual Constrained Question Embeddings with Relational Knowledge Bases for Simple Question Answering |
Authors | Kaustubh Kulkarni, Riku Togashi, Hideyuki Maeda, Sumio Fujita |
Abstract | Embedding-based approaches have been shown to be effective for solving simple Question Answering (QA) problems in recent works. The major drawback of current approaches is that they look only at the similarity (constraint) between a question and a (head, relation) pair. Due to the absence of the tail (answer) in the questions, these models often require paraphrase datasets to obtain adequate embeddings. In this paper, we propose a dual constraint model which exploits the embeddings obtained by the Trans* family of algorithms to solve the simple QA problem without using any additional resources such as paraphrase datasets. The results obtained prove that the embeddings learned using dual constraints are better than those with single constraint models having similar architecture. |
Tasks | Question Answering |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2037/ |
https://www.aclweb.org/anthology/I17-2037 | |
PWC | https://paperswithcode.com/paper/dual-constrained-question-embeddings-with |
Repo | |
Framework | |
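For readers unfamiliar with the Trans* family mentioned in the abstract above, a minimal sketch of TransE-style scoring may help: a (head, relation, tail) triple is plausible when head + relation lies close to tail in the embedding space. This illustrates only that scoring idea, not the paper's dual constraint model; the function names and the toy vectors are assumptions.

```python
import math
from typing import Dict, List, Sequence


def transe_score(
    head: Sequence[float], relation: Sequence[float], tail: Sequence[float]
) -> float:
    """Negative L2 distance between head + relation and tail (higher is better)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(
    head: Sequence[float],
    relation: Sequence[float],
    tails: Dict[str, Sequence[float]],
) -> List[str]:
    """Order candidate tail entities from most to least plausible."""
    return sorted(
        tails,
        key=lambda name: transe_score(head, relation, tails[name]),
        reverse=True,
    )


# Toy 2-d embeddings (made up for illustration).
head_vec = [1.0, 0.0]
relation_vec = [0.0, 1.0]
candidate_tails = {"near": [1.0, 1.0], "far": [3.0, 3.0]}
ranking = rank_tails(head_vec, relation_vec, candidate_tails)
```

In simple QA, the tail plays the role of the answer, so a ranking like this would be applied once a question has been mapped to a head entity and a relation.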
Projecting Multiword Expression Resources on a Polish Treebank
Title | Projecting Multiword Expression Resources on a Polish Treebank |
Authors | Agata Savary, Jakub Waszczuk |
Abstract | Multiword expressions (MWEs) are linguistic objects containing two or more words and showing idiosyncratic behavior at different levels. Treebanks with annotated MWEs enable studies of such properties, as well as training and evaluation of MWE-aware parsers. However, few treebanks contain full-fledged MWE annotations. We show how this gap can be bridged in Polish by projecting 3 MWE resources on a constituency treebank. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1404/ |
https://www.aclweb.org/anthology/W17-1404 | |
PWC | https://paperswithcode.com/paper/projecting-multiword-expression-resources-on |
Repo | |
Framework | |
Identifying Empathetic Messages in Online Health Communities
Title | Identifying Empathetic Messages in Online Health Communities |
Authors | Hamed Khanpour, Cornelia Caragea, Prakhar Biyani |
Abstract | Empathy captures one's ability to correlate with and understand others' emotional states and experiences. Messages with empathetic content are considered as one of the main advantages for joining online health communities due to their potential to improve people's moods. Unfortunately, to this date, no computational studies exist that automatically identify empathetic messages in online health communities. We propose a combination of Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks, and show that the proposed model outperforms each individual model (CNN and LSTM) as well as several baselines. |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2042/ |
https://www.aclweb.org/anthology/I17-2042 | |
PWC | https://paperswithcode.com/paper/identifying-empathetic-messages-in-online |
Repo | |
Framework | |
Improving Neural Text Normalization with Data Augmentation at Character- and Morphological Levels
Title | Improving Neural Text Normalization with Data Augmentation at Character- and Morphological Levels |
Authors | Itsumi Saito, Jun Suzuki, Kyosuke Nishida, Kugatsu Sadamitsu, Satoshi Kobashikawa, Ryo Masumura, Yuji Matsumoto, Junji Tomita |
Abstract | In this study, we investigated the effectiveness of augmented data for encoder-decoder-based neural normalization models. Attention-based encoder-decoder models are highly effective in many natural language generation tasks, such as machine translation or machine summarization. In general, we have to prepare a large amount of training data to train an encoder-decoder model. Unlike machine translation, there is little training data for text-normalization tasks. In this paper, we propose two methods for generating augmented data. The experimental results with Japanese dialect normalization indicate that our methods are effective for an encoder-decoder model and achieve a higher BLEU score than the baselines. We also investigated the oracle performance and revealed that there is sufficient room for improving an encoder-decoder model. |
Tasks | Data Augmentation, Machine Translation, Text Summarization |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2044/ |
https://www.aclweb.org/anthology/I17-2044 | |
PWC | https://paperswithcode.com/paper/improving-neural-text-normalization-with-data |
Repo | |
Framework | |
A Bambara Tonalization System for Word Sense Disambiguation Using Differential Coding, Segmentation and Edit Operation Filtering
Title | A Bambara Tonalization System for Word Sense Disambiguation Using Differential Coding, Segmentation and Edit Operation Filtering |
Authors | Luigi Yu-Cheng Liu, Damien Nouvel |
Abstract | In many languages such as Bambara or Arabic, tone markers (diacritics) may be written but are actually often omitted. NLP applications are confronted with ambiguities and subsequent difficulties when processing texts. To circumvent this problem, tonalization may be used, as a word sense disambiguation task, relying on context to add diacritics that partially disambiguate words as well as senses. In this paper, we describe our implementation of a Bambara tonalizer that adds tone markers using machine learning (CRFs). To make our tool efficient, we used differential coding, word segmentation and edit operation filtering. We describe our approach that allows tractable machine learning and improves accuracy: our model may be learned within minutes on a 358K-word corpus and reaches 92.3% accuracy. |
Tasks | Word Sense Disambiguation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1070/ |
https://www.aclweb.org/anthology/I17-1070 | |
PWC | https://paperswithcode.com/paper/a-bambara-tonalization-system-for-word-sense |
Repo | |
Framework | |
AutoExtend: Combining Word Embeddings with Semantic Resources
Title | AutoExtend: Combining Word Embeddings with Semantic Resources |
Authors | Sascha Rothe, Hinrich Schütze |
Abstract | We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings that incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The obtained embeddings live in the same vector space as the input word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet, GermaNet, and Freebase as semantic resources. AutoExtend achieves state-of-the-art performance on Word-in-Context Similarity and Word Sense Disambiguation tasks. |
Tasks | Learning Word Embeddings, Sentiment Analysis, Word Embeddings, Word Sense Disambiguation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/J17-3004/ |
https://www.aclweb.org/anthology/J17-3004 | |
PWC | https://paperswithcode.com/paper/autoextend-combining-word-embeddings-with |
Repo | |
Framework | |
Tensor Belief Propagation
Title | Tensor Belief Propagation |
Authors | Andrew Wrigley, Wee Sun Lee, Nan Ye |
Abstract | We propose a new approximate inference algorithm for graphical models, tensor belief propagation, based on approximating the messages passed in the junction tree algorithm. Our algorithm represents the potential functions of the graphical model and all messages on the junction tree compactly as mixtures of rank-1 tensors. Using this representation, we show how to perform the operations required for inference on the junction tree efficiently: marginalisation can be computed quickly due to the factored form of rank-1 tensors while multiplication can be approximated using sampling. Our analysis gives sufficient conditions for the algorithm to perform well, including for the case of high-treewidth graphs, for which exact inference is intractable. We compare our algorithm experimentally with several approximate inference algorithms and show that it performs well. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=651 |
http://proceedings.mlr.press/v70/wrigley17a/wrigley17a.pdf | |
PWC | https://paperswithcode.com/paper/tensor-belief-propagation |
Repo | |
Framework | |
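The abstract above notes that marginalisation is cheap for mixtures of rank-1 tensors because of their factored form. A minimal sketch of that one observation follows (it does not implement tensor belief propagation or its sampling-based multiplication); the `Rank1` representation and function names are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

# One mixture component: (weight, one factor vector per variable).
# The component's value at index (i, j, ...) is weight * f0[i] * f1[j] * ...
Rank1 = Tuple[float, List[List[float]]]


def evaluate(mixture: Sequence[Rank1], index: Sequence[int]) -> float:
    """Value of the mixture of rank-1 tensors at a full index."""
    total = 0.0
    for weight, factors in mixture:
        term = weight
        for factor, i in zip(factors, index):
            term *= factor[i]
        total += term
    return total


def marginalise(mixture: Sequence[Rank1], axis: int) -> List[Rank1]:
    """Sum out one variable without ever building the full tensor.

    Summing a rank-1 term over one variable just multiplies its weight by
    the sum of that variable's factor and drops the factor.
    """
    result: List[Rank1] = []
    for weight, factors in mixture:
        new_weight = weight * sum(factors[axis])
        new_factors = factors[:axis] + factors[axis + 1:]
        result.append((new_weight, new_factors))
    return result


# Toy two-variable mixture with two components (values made up).
toy_mixture: List[Rank1] = [
    (1.0, [[1.0, 2.0], [3.0, 4.0]]),
    (0.5, [[2.0, 0.0], [1.0, 1.0]]),
]
summed_out = marginalise(toy_mixture, axis=1)
```

The cost is linear in the number of components and the size of the summed-out factor, whereas summing over a dense tensor grows with the product of all variable cardinalities.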
A Web Interface for Diachronic Semantic Search in Spanish
Title | A Web Interface for Diachronic Semantic Search in Spanish |
Authors | Pablo Gamallo, Iván Rodríguez-Torres, Marcos Garcia |
Abstract | This article describes a semantic system which is based on distributional models obtained from a chronologically structured language resource, namely Google Books Syntactic Ngrams. The models were created using dependency-based contexts and a strategy for reducing the vector space, which consists in selecting the more informative and relevant word contexts. The system allows linguists to analyze meaning change of Spanish words in the written language across time. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-3012/ |
https://www.aclweb.org/anthology/E17-3012 | |
PWC | https://paperswithcode.com/paper/a-web-interface-for-diachronic-semantic |
Repo | |
Framework | |