July 26, 2019


Paper Group NANR 187



Argumentation Quality Assessment: Theory vs. Practice

Title Argumentation Quality Assessment: Theory vs. Practice
Authors Henning Wachsmuth, Nona Naderi, Ivan Habernal, Yufang Hou, Graeme Hirst, Iryna Gurevych, Benno Stein
Abstract Argumentation quality is viewed differently in argumentation theory and in practical assessment approaches. This paper studies to what extent the views match empirically. We find that most observations on quality phrased spontaneously are in fact adequately represented by theory. Even more, relative comparisons of arguments in practice correlate with absolute quality ratings based on theory. Our results clarify how the two views can learn from each other.
Tasks Argument Mining
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2039/
PDF https://www.aclweb.org/anthology/P17-2039
PWC https://paperswithcode.com/paper/argumentation-quality-assessment-theory-vs
Repo
Framework

TL;DR: Mining Reddit to Learn Automatic Summarization

Title TL;DR: Mining Reddit to Learn Automatic Summarization
Authors Michael Völske, Martin Potthast, Shahbaz Syed, Benno Stein
Abstract Recent advances in automatic text summarization have used deep neural networks to generate high-quality abstractive summaries, but the performance of these models strongly depends on large amounts of suitable training data. We propose a new method for mining social media for author-provided summaries, taking advantage of the common practice of appending a "TL;DR" to long posts. A case study using a large Reddit crawl yields the Webis-TLDR-17 dataset, complementing existing corpora primarily from the news genre. Our technique is likely applicable to other social media sites and general web crawls.
Tasks Abstractive Text Summarization, Document Summarization, Text Summarization
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4508/
PDF https://www.aclweb.org/anthology/W17-4508
PWC https://paperswithcode.com/paper/tldr-mining-reddit-to-learn-automatic
Repo
Framework
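
The mining idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the regular expression, the `split_tldr` helper, and the word-count thresholds are all assumptions chosen for illustration, and the sketch assumes the summary follows the marker.

```python
import re

# Hypothetical marker pattern: matches "TL;DR", "tldr", "tl dr", "TL; DR",
# optionally followed by a colon or dash.
TLDR = re.compile(r"\btl\s*;?\s*dr\b\s*[:\-]?\s*", re.IGNORECASE)


def split_tldr(post, min_content_words=5, min_summary_words=2):
    """Split a post into (content, summary) around a single TL;DR marker.

    Returns None when the post has no marker, more than one marker,
    or when either side is too short to be a useful training pair.
    """
    matches = list(TLDR.finditer(post))
    if len(matches) != 1:
        return None
    marker = matches[0]
    content = post[:marker.start()].strip()
    summary = post[marker.end():].strip()
    if len(content.split()) < min_content_words:
        return None
    if len(summary.split()) < min_summary_words:
        return None
    return content, summary


if __name__ == "__main__":
    example = ("I spent all weekend debugging a flaky test in our build. "
               "TL;DR: the clock was wrong.")
    print(split_tldr(example))
```

Rejecting posts with several markers keeps the sketch simple; a real crawl would likely need additional filtering (for example, for bot-generated posts), which is not shown here.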

Unsupervised Event Clustering and Aggregation from Newswire and Web Articles

Title Unsupervised Event Clustering and Aggregation from Newswire and Web Articles
Authors Swen Ribeiro, Olivier Ferret, Xavier Tannier
Abstract In this paper, we present an unsupervised pipeline approach for clustering news articles based on identified event instances in their content. We leverage press agency newswire and monolingual word alignment techniques to build meaningful and linguistically varied clusters of articles from the web in the perspective of a broader event type detection task. We validate our approach on a manually annotated corpus of Web articles.
Tasks Document Summarization, Word Alignment
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4211/
PDF https://www.aclweb.org/anthology/W17-4211
PWC https://paperswithcode.com/paper/unsupervised-event-clustering-and-aggregation
Repo
Framework

Revisiting the Centroid-based Method: A Strong Baseline for Multi-Document Summarization

Title Revisiting the Centroid-based Method: A Strong Baseline for Multi-Document Summarization
Authors Demian Gholipour Ghalandari
Abstract The centroid-based model for extractive document summarization is a simple and fast baseline that ranks sentences based on their similarity to a centroid vector. In this paper, we apply this ranking to possible summaries instead of sentences and use a simple greedy algorithm to find the best summary. Furthermore, we show possibilities to scale up to larger input document collections by selecting a small number of sentences from each document prior to constructing the summary. Experiments were done on the DUC2004 dataset for multi-document summarization. We observe a higher performance over the original model, on par with more complex state-of-the-art methods.
Tasks Document Summarization, Extractive Document Summarization, Multi-Document Summarization
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4511/
PDF https://www.aclweb.org/anthology/W17-4511
PWC https://paperswithcode.com/paper/revisiting-the-centroid-based-method-a-strong-1
Repo
Framework
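
The greedy selection described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: raw word counts stand in for the weighted vectors a real system would use, and the function names and the `max_words` budget are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased whitespace-token counts (an assumed stand-in for TF-IDF)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_centroid_summary(sentences, max_words):
    """Greedily grow a summary whose combined vector is closest to the centroid.

    The score is computed for the candidate summary as a whole,
    not for each sentence in isolation.
    """
    centroid = Counter()
    for sentence in sentences:
        centroid.update(bow(sentence))

    chosen, summary_vec, used = [], Counter(), 0
    remaining = list(range(len(sentences)))
    while remaining:
        best, best_score = None, cosine(summary_vec, centroid)
        for i in remaining:
            length = len(sentences[i].split())
            if used + length > max_words:
                continue  # would exceed the length budget
            score = cosine(summary_vec + bow(sentences[i]), centroid)
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break  # no candidate fits or improves the summary
        chosen.append(best)
        summary_vec += bow(sentences[best])
        used += len(sentences[best].split())
        remaining.remove(best)
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "the cat sat",
            "stock prices fell sharply today"]
    print(greedy_centroid_summary(docs, max_words=6))
```

Scoring the whole candidate summary means a sentence that repeats already-covered words adds little, so redundancy is penalised without a separate check.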

Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation

Title Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation
Authors Laura Mascarell
Abstract Currently under review for EMNLP 2017. The phrase-based Statistical Machine Translation (SMT) approach deals with sentences in isolation, making it difficult to consider discourse context in translation. This poses a challenge for ambiguous words that need discourse knowledge to be correctly translated. We propose a method that benefits from the semantic similarity in lexical chains to improve SMT output by integrating it in a document-level decoder. We focus on word embeddings to deal with the lexical chains, contrary to the traditional approach that uses lexical resources. Experimental results on German-to-English show that our method produces correct translations in up to 88% of the changes, improving the translation in 36%-48% of them over the baseline.
Tasks Document Summarization, Information Retrieval, Machine Translation, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4813/
PDF https://www.aclweb.org/anthology/W17-4813
PWC https://paperswithcode.com/paper/lexical-chains-meet-word-embeddings-in
Repo
Framework

Graph-Based Approach to Recognizing CST Relations in Polish Texts

Title Graph-Based Approach to Recognizing CST Relations in Polish Texts
Authors Paweł Kędzia, Maciej Piasecki, Arkadiusz Janz
Abstract This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. In the proposed approach, a graph-based representation is constructed for sentences. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated from the graphs, and the similarity values are input to classifiers trained by Logistic Model Tree. Several different graph configurations, as well as graph similarity methods, were analysed for this task. The approach was evaluated on a large open corpus annotated manually with 17 types of selected CST relations. The configuration of experiments was similar to those known from SEMEVAL, and we obtained very promising results.
Tasks Document Summarization, Graph Similarity, Information Retrieval, Multi-Document Summarization, Natural Language Inference
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1048/
PDF https://doi.org/10.26615/978-954-452-049-6_048
PWC https://paperswithcode.com/paper/graph-based-approach-to-recognizing-cst
Repo
Framework

List-only Entity Linking

Title List-only Entity Linking
Authors Ying Lin, Chin-Yew Lin, Heng Ji
Abstract Traditional Entity Linking (EL) technologies rely on rich structures and properties in the target knowledge base (KB). However, in many applications, the KB may be as simple and sparse as lists of names of the same type (e.g., lists of products). We call this the List-only Entity Linking problem. Fortunately, some mentions may have more cues for linking, which can be used as seed mentions to bridge other mentions and the uninformative entities. In this work, we select the most linkable mentions as seed mentions and disambiguate other mentions by comparing them with the seed mentions rather than directly with the entities. Our experiments on linking mentions to seven automatically mined lists show promising results and demonstrate the effectiveness of our approach.
Tasks Entity Linking
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2085/
PDF https://www.aclweb.org/anthology/P17-2085
PWC https://paperswithcode.com/paper/list-only-entity-linking
Repo
Framework

Summarizing World Speak : A Preliminary Graph Based Approach

Title Summarizing World Speak : A Preliminary Graph Based Approach
Authors Nikhil Londhe, Rohini Srihari
Abstract Social media platforms play a crucial role in piecing together global news stories via their corresponding online discussions. Thus, in this work, we introduce the problem of automatically summarizing massively multilingual microblog text streams. We discuss the challenges involved in both generating summaries as well as evaluating them. We introduce a simple word graph based approach that utilizes node neighborhoods to identify keyphrases and thus in turn, pick summary candidates. We also demonstrate the effectiveness of our method in generating precise summaries as compared to other popular techniques.
Tasks Document Summarization, Multi-Document Summarization
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1060/
PDF https://doi.org/10.26615/978-954-452-049-6_060
PWC https://paperswithcode.com/paper/summarizing-world-speak-a-preliminary-graph
Repo
Framework

Between Reading Time and Information Structure

Title Between Reading Time and Information Structure
Authors Masayuki Asahara
Abstract
Tasks Document Summarization, Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1006/
PDF https://www.aclweb.org/anthology/Y17-1006
PWC https://paperswithcode.com/paper/between-reading-time-and-information
Repo
Framework

Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory

Title Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory
Authors Yu-Lun Hsieh, Yung-Chun Chang, Nai-Wen Chang, Wen-Lian Hsu
Abstract In this paper, we propose a recurrent neural network model for identifying protein-protein interactions in biomedical literature. Experiments on the two largest public benchmark datasets, AIMed and BioInfer, demonstrate that our approach significantly surpasses state-of-the-art methods with relative improvements of 10% and 18%, respectively. Cross-corpus evaluation also demonstrates that the proposed model remains robust despite using different training data. These results suggest that RNNs can effectively capture semantic relationships among proteins and generalize over different corpora, without any feature engineering.
Tasks Feature Engineering
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2041/
PDF https://www.aclweb.org/anthology/I17-2041
PWC https://paperswithcode.com/paper/identifying-protein-protein-interactions-in
Repo
Framework

A Corpus Analysis of Social Connections and Social Isolation in Adolescents Suffering from Depressive Disorders

Title A Corpus Analysis of Social Connections and Social Isolation in Adolescents Suffering from Depressive Disorders
Authors Jia-Wen Guo, Danielle L Mowery, Djin Lai, Katherine Sward, Mike Conway
Abstract Social connection and social isolation are associated with depressive symptoms, particularly in adolescents and young adults, but how these concepts are documented in clinical notes is unknown. This pilot study aimed to identify the topics relevant to social connection and isolation by analyzing 145 clinical notes from patients with depression diagnosis. We found that providers, including physicians, nurses, social workers, and psychologists, document descriptions of both social connection and social isolation.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-3103/
PDF https://www.aclweb.org/anthology/W17-3103
PWC https://paperswithcode.com/paper/a-corpus-analysis-of-social-connections-and
Repo
Framework

Sentiment Lexicon Construction with Representation Learning Based on Hierarchical Sentiment Supervision

Title Sentiment Lexicon Construction with Representation Learning Based on Hierarchical Sentiment Supervision
Authors Leyi Wang, Rui Xia
Abstract A sentiment lexicon is an important tool for identifying the sentiment polarity of words and texts. How to automatically construct sentiment lexicons has become a research topic in the field of sentiment analysis and opinion mining. Recently there have been some attempts to employ representation learning algorithms to construct a sentiment lexicon with sentiment-aware word embeddings. However, these methods were normally trained under document-level sentiment supervision. In this paper, we develop a neural architecture to train a sentiment-aware word embedding by integrating the sentiment supervision at both document and word levels, to enhance the quality of the word embedding as well as the sentiment lexicon. Experiments on the SemEval 2013-2016 datasets indicate that the sentiment lexicon generated by our approach achieves state-of-the-art performance in both supervised and unsupervised sentiment classification, in comparison with several strong sentiment lexicon construction methods.
Tasks Opinion Mining, Representation Learning, Sentiment Analysis, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1052/
PDF https://www.aclweb.org/anthology/D17-1052
PWC https://paperswithcode.com/paper/sentiment-lexicon-construction-with
Repo
Framework

A study on irony within the context of 7x1-PT corpus

Title A study on irony within the context of 7x1-PT corpus
Authors Silvia Moraes, Rackel Machado, Matheus Redecker, Rafael Cadaval, Felipe Meneguzzi
Abstract
Tasks Opinion Mining, Sentiment Analysis
Published 2017-10-01
URL https://www.aclweb.org/anthology/W17-6604/
PDF https://www.aclweb.org/anthology/W17-6604
PWC https://paperswithcode.com/paper/a-study-on-irony-within-the-context-of-7x1-pt
Repo
Framework

Toward Stance Classification Based on Claim Microstructures

Title Toward Stance Classification Based on Claim Microstructures
Authors Filip Boltužić, Jan Šnajder
Abstract Claims are the building blocks of arguments and the reasons underpinning opinions, thus analyzing claims is important for both argumentation mining and opinion mining. We propose a framework for representing claims as microstructures, which express the beliefs, judgments, and policies about the relations between domain-specific concepts. In a proof-of-concept study, we manually build microstructures for over 800 claims extracted from an online debate. We test the so-obtained microstructures on the task of claim stance classification, achieving considerable improvements over text-based baselines.
Tasks Argument Mining, Fine-Grained Opinion Analysis, Opinion Mining
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5210/
PDF https://www.aclweb.org/anthology/W17-5210
PWC https://paperswithcode.com/paper/toward-stance-classification-based-on-claim
Repo
Framework

Supervised and unsupervised approaches to measuring usage similarity

Title Supervised and unsupervised approaches to measuring usage similarity
Authors Milton King, Paul Cook
Abstract Usage similarity (USim) is an approach to determining word meaning in context that does not rely on a sense inventory. Instead, pairs of usages of a target lemma are rated on a scale. In this paper we propose unsupervised approaches to USim based on embeddings for words, contexts, and sentences, and achieve state-of-the-art results over two USim datasets. We further consider supervised approaches to USim, and find that although they outperform unsupervised approaches, they are unable to generalize to lemmas that are unseen in the training data.
Tasks Word Sense Disambiguation, Word Sense Induction
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1906/
PDF https://www.aclweb.org/anthology/W17-1906
PWC https://paperswithcode.com/paper/supervised-and-unsupervised-approaches-to
Repo
Framework