July 26, 2019


Paper Group NANR 187


Argumentation Quality Assessment: Theory vs. Practice


Title	Argumentation Quality Assessment: Theory vs. Practice
Authors	Henning Wachsmuth, Nona Naderi, Ivan Habernal, Yufang Hou, Graeme Hirst, Iryna Gurevych, Benno Stein
Abstract	Argumentation quality is viewed differently in argumentation theory and in practical assessment approaches. This paper studies to what extent the views match empirically. We find that most observations on quality phrased spontaneously are in fact adequately represented by theory. Even more, relative comparisons of arguments in practice correlate with absolute quality ratings based on theory. Our results clarify how the two views can learn from each other.
Tasks	Argument Mining
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2039/
PDF	https://www.aclweb.org/anthology/P17-2039
PWC	https://paperswithcode.com/paper/argumentation-quality-assessment-theory-vs
Repo
Framework

TL;DR: Mining Reddit to Learn Automatic Summarization


Title	TL;DR: Mining Reddit to Learn Automatic Summarization
Authors	Michael Völske, Martin Potthast, Shahbaz Syed, Benno Stein
Abstract	Recent advances in automatic text summarization have used deep neural networks to generate high-quality abstractive summaries, but the performance of these models strongly depends on large amounts of suitable training data. We propose a new method for mining social media for author-provided summaries, taking advantage of the common practice of appending a "TL;DR" to long posts. A case study using a large Reddit crawl yields the Webis-TLDR-17 dataset, complementing existing corpora primarily from the news genre. Our technique is likely applicable to other social media sites and general web crawls.
Tasks	Abstractive Text Summarization, Document Summarization, Text Summarization
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4508/
PDF	https://www.aclweb.org/anthology/W17-4508
PWC	https://paperswithcode.com/paper/tldr-mining-reddit-to-learn-automatic
Repo
Framework
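
The abstract above describes mining posts for author-provided summaries appended after a "TL;DR" marker. The following is a minimal sketch of that idea, not the authors' pipeline: the regular expression, the function name `split_tldr`, and the rule of splitting at the last marker are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (an assumption, not taken from the paper): matches
# "TL;DR", "tl;dr", or "TLDR", optionally followed by ":" or "-".
TLDR_PATTERN = re.compile(r"\btl\s*;?\s*dr\b\s*[:\-]?\s*", re.IGNORECASE)


def split_tldr(post):
    """Split a post into (content, summary) at its last TL;DR marker.

    Returns None when there is no marker, or when either side of the
    marker is empty, since such posts yield no usable training pair.
    """
    matches = list(TLDR_PATTERN.finditer(post))
    if not matches:
        return None
    last = matches[-1]
    content = post[:last.start()].strip()
    summary = post[last.end():].strip()
    if not content or not summary:
        return None
    return content, summary
```

A real crawl would likely need further filtering (for example by length or by bot accounts); the abstract does not specify those steps, so they are left out here.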

Unsupervised Event Clustering and Aggregation from Newswire and Web Articles


Title	Unsupervised Event Clustering and Aggregation from Newswire and Web Articles
Authors	Swen Ribeiro, Olivier Ferret, Xavier Tannier
Abstract	In this paper, we present an unsupervised pipeline approach for clustering news articles based on identified event instances in their content. We leverage press agency newswire and monolingual word alignment techniques to build meaningful and linguistically varied clusters of articles from the web in the perspective of a broader event type detection task. We validate our approach on a manually annotated corpus of Web articles.
Tasks	Document Summarization, Word Alignment
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4211/
PDF	https://www.aclweb.org/anthology/W17-4211
PWC	https://paperswithcode.com/paper/unsupervised-event-clustering-and-aggregation
Repo
Framework

Revisiting the Centroid-based Method: A Strong Baseline for Multi-Document Summarization


Title	Revisiting the Centroid-based Method: A Strong Baseline for Multi-Document Summarization
Authors	Demian Gholipour Ghalandari
Abstract	The centroid-based model for extractive document summarization is a simple and fast baseline that ranks sentences based on their similarity to a centroid vector. In this paper, we apply this ranking to possible summaries instead of sentences and use a simple greedy algorithm to find the best summary. Furthermore, we show possibilities to scale up to larger input document collections by selecting a small number of sentences from each document prior to constructing the summary. Experiments were done on the DUC2004 dataset for multi-document summarization. We observe a higher performance over the original model, on par with more complex state-of-the-art methods.
Tasks	Document Summarization, Extractive Document Summarization, Multi-Document Summarization
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4511/
PDF	https://www.aclweb.org/anthology/W17-4511
PWC	https://paperswithcode.com/paper/revisiting-the-centroid-based-method-a-strong-1
Repo
Framework
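
The abstract above contrasts ranking individual sentences by similarity to a centroid vector with greedily building a summary whose combined vector is closest to the centroid. The sketch below illustrates both variants under simplifying assumptions: it uses raw term counts as the vector representation and a sentence-count budget, neither of which is stated in the abstract, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words term counts (an assumed, simplified representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(sentences):
    """Sentence-level variant: rank sentences by similarity to the centroid."""
    centroid = sum((bow(s) for s in sentences), Counter())
    return sorted(sentences, key=lambda s: cosine(bow(s), centroid), reverse=True)


def greedy_summary(sentences, max_sentences=2):
    """Summary-level variant: greedily add the sentence that most improves
    the similarity between the whole summary and the centroid."""
    centroid = sum((bow(s) for s in sentences), Counter())
    chosen = []
    summary_vec = Counter()
    while len(chosen) < max_sentences:
        best = None
        best_score = cosine(summary_vec, centroid)
        for s in sentences:
            if s in chosen:
                continue
            score = cosine(summary_vec + bow(s), centroid)
            if score > best_score:
                best, best_score = s, score
        if best is None:  # no remaining sentence improves the summary
            break
        chosen.append(best)
        summary_vec += bow(best)
    return chosen
```

Scoring the summary as a whole, rather than each sentence on its own, penalizes a candidate that only repeats terms the summary already covers heavily.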

Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation


Title	Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation
Authors	Laura Mascarell
Abstract	Currently under review for EMNLP 2017. The phrase-based Statistical Machine Translation (SMT) approach deals with sentences in isolation, making it difficult to consider discourse context in translation. This poses a challenge for ambiguous words that need discourse knowledge to be correctly translated. We propose a method that benefits from the semantic similarity in lexical chains to improve SMT output by integrating it in a document-level decoder. We focus on word embeddings to deal with the lexical chains, contrary to the traditional approach that uses lexical resources. Experimental results on German-to-English show that our method produces correct translations in up to 88% of the changes, improving the translation in 36%-48% of them over the baseline.
Tasks	Document Summarization, Information Retrieval, Machine Translation, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4813/
PDF	https://www.aclweb.org/anthology/W17-4813
PWC	https://paperswithcode.com/paper/lexical-chains-meet-word-embeddings-in
Repo
Framework

Graph-Based Approach to Recognizing CST Relations in Polish Texts


Title	Graph-Based Approach to Recognizing CST Relations in Polish Texts
Authors	Paweł Kędzia, Maciej Piasecki, Arkadiusz Janz
Abstract	This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. In the proposed approach, a graph-based representation is constructed for sentences. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated from the graphs, and the similarity values are input to classifiers trained by Logistic Model Tree. Several different configurations of graphs, as well as graph similarity methods, were analysed for this task. The approach was evaluated on a large open corpus annotated manually with 17 types of selected CST relations. The configuration of experiments was similar to those known from SEMEVAL and we obtained very promising results.
Tasks	Document Summarization, Graph Similarity, Information Retrieval, Multi-Document Summarization, Natural Language Inference
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1048/
PDF	https://doi.org/10.26615/978-954-452-049-6_048
PWC	https://paperswithcode.com/paper/graph-based-approach-to-recognizing-cst
Repo
Framework

List-only Entity Linking


Title	List-only Entity Linking
Authors	Ying Lin, Chin-Yew Lin, Heng Ji
Abstract	Traditional Entity Linking (EL) technologies rely on rich structures and properties in the target knowledge base (KB). However, in many applications, the KB may be as simple and sparse as lists of names of the same type (e.g., lists of products). We call this the List-only Entity Linking problem. Fortunately, some mentions may have more cues for linking, which can be used as seed mentions to bridge other mentions and the uninformative entities. In this work, we select the most linkable mentions as seed mentions and disambiguate other mentions by comparing them with the seed mentions rather than directly with the entities. Our experiments on linking mentions to seven automatically mined lists show promising results and demonstrate the effectiveness of our approach.
Tasks	Entity Linking
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2085/
PDF	https://www.aclweb.org/anthology/P17-2085
PWC	https://paperswithcode.com/paper/list-only-entity-linking
Repo
Framework

Summarizing World Speak : A Preliminary Graph Based Approach


Title	Summarizing World Speak : A Preliminary Graph Based Approach
Authors	Nikhil Londhe, Rohini Srihari
Abstract	Social media platforms play a crucial role in piecing together global news stories via their corresponding online discussions. Thus, in this work, we introduce the problem of automatically summarizing massively multilingual microblog text streams. We discuss the challenges involved in both generating summaries as well as evaluating them. We introduce a simple word graph based approach that utilizes node neighborhoods to identify keyphrases and thus in turn, pick summary candidates. We also demonstrate the effectiveness of our method in generating precise summaries as compared to other popular techniques.
Tasks	Document Summarization, Multi-Document Summarization
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1060/
PDF	https://doi.org/10.26615/978-954-452-049-6_060
PWC	https://paperswithcode.com/paper/summarizing-world-speak-a-preliminary-graph
Repo
Framework

Between Reading Time and Information Structure


Title	Between Reading Time and Information Structure
Authors	Masayuki Asahara
Abstract
Tasks	Document Summarization, Machine Translation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1006/
PDF	https://www.aclweb.org/anthology/Y17-1006
PWC	https://paperswithcode.com/paper/between-reading-time-and-information
Repo
Framework

Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory


Title	Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory
Authors	Yu-Lun Hsieh, Yung-Chun Chang, Nai-Wen Chang, Wen-Lian Hsu
Abstract	In this paper, we propose a recurrent neural network model for identifying protein-protein interactions in biomedical literature. Experiments on the two largest public benchmark datasets, AIMed and BioInfer, demonstrate that our approach significantly surpasses state-of-the-art methods with relative improvements of 10% and 18%, respectively. Cross-corpus evaluation also demonstrates that the proposed model remains robust despite using different training data. These results suggest that RNNs can effectively capture semantic relationships among proteins and generalize over different corpora, without any feature engineering.
Tasks	Feature Engineering
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-2041/
PDF	https://www.aclweb.org/anthology/I17-2041
PWC	https://paperswithcode.com/paper/identifying-protein-protein-interactions-in
Repo
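Framework

A Corpus Analysis of Social Connections and Social Isolation in Adolescents Suffering from Depressive Disorders
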

Title	A Corpus Analysis of Social Connections and Social Isolation in Adolescents Suffering from Depressive Disorders
Authors	Jia-Wen Guo, Danielle L Mowery, Djin Lai, Katherine Sward, Mike Conway
Abstract	Social connection and social isolation are associated with depressive symptoms, particularly in adolescents and young adults, but how these concepts are documented in clinical notes is unknown. This pilot study aimed to identify the topics relevant to social connection and isolation by analyzing 145 clinical notes from patients with depression diagnosis. We found that providers, including physicians, nurses, social workers, and psychologists, document descriptions of both social connection and social isolation.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-3103/
PDF	https://www.aclweb.org/anthology/W17-3103
PWC	https://paperswithcode.com/paper/a-corpus-analysis-of-social-connections-and
Repo
Framework

Sentiment Lexicon Construction with Representation Learning Based on Hierarchical Sentiment Supervision


Title	Sentiment Lexicon Construction with Representation Learning Based on Hierarchical Sentiment Supervision
Authors	Leyi Wang, Rui Xia
Abstract	Sentiment lexicon is an important tool for identifying the sentiment polarity of words and texts. How to automatically construct sentiment lexicons has become a research topic in the field of sentiment analysis and opinion mining. Recently there were some attempts to employ representation learning algorithms to construct a sentiment lexicon with sentiment-aware word embedding. However, these methods were normally trained under document-level sentiment supervision. In this paper, we develop a neural architecture to train a sentiment-aware word embedding by integrating the sentiment supervision at both document and word levels, to enhance the quality of word embedding as well as the sentiment lexicon. Experiments on the SemEval 2013-2016 datasets indicate that the sentiment lexicon generated by our approach achieves the state-of-the-art performance in both supervised and unsupervised sentiment classification, in comparison with several strong sentiment lexicon construction methods.
Tasks	Opinion Mining, Representation Learning, Sentiment Analysis, Word Embeddings
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1052/
PDF	https://www.aclweb.org/anthology/D17-1052
PWC	https://paperswithcode.com/paper/sentiment-lexicon-construction-with
Repo
Framework

A study on irony within the context of 7x1-PT corpus


Title	A study on irony within the context of 7x1-PT corpus
Authors	Silvia Moraes, Rackel Machado, Matheus Redecker, Rafael Cadaval, Felipe Meneguzzi
Abstract
Tasks	Opinion Mining, Sentiment Analysis
Published	2017-10-01
URL	https://www.aclweb.org/anthology/W17-6604/
PDF	https://www.aclweb.org/anthology/W17-6604
PWC	https://paperswithcode.com/paper/a-study-on-irony-within-the-context-of-7x1-pt
Repo
Framework

Toward Stance Classification Based on Claim Microstructures


Title	Toward Stance Classification Based on Claim Microstructures
Authors	Filip Boltužić, Jan Šnajder
Abstract	Claims are the building blocks of arguments and the reasons underpinning opinions, thus analyzing claims is important for both argumentation mining and opinion mining. We propose a framework for representing claims as microstructures, which express the beliefs, judgments, and policies about the relations between domain-specific concepts. In a proof-of-concept study, we manually build microstructures for over 800 claims extracted from an online debate. We test the so-obtained microstructures on the task of claim stance classification, achieving considerable improvements over text-based baselines.
Tasks	Argument Mining, Fine-Grained Opinion Analysis, Opinion Mining
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-5210/
PDF	https://www.aclweb.org/anthology/W17-5210
PWC	https://paperswithcode.com/paper/toward-stance-classification-based-on-claim
Repo
Framework

Supervised and unsupervised approaches to measuring usage similarity


Title	Supervised and unsupervised approaches to measuring usage similarity
Authors	Milton King, Paul Cook
Abstract	Usage similarity (USim) is an approach to determining word meaning in context that does not rely on a sense inventory. Instead, pairs of usages of a target lemma are rated on a scale. In this paper we propose unsupervised approaches to USim based on embeddings for words, contexts, and sentences, and achieve state-of-the-art results over two USim datasets. We further consider supervised approaches to USim, and find that although they outperform unsupervised approaches, they are unable to generalize to lemmas that are unseen in the training data.
Tasks	Word Sense Disambiguation, Word Sense Induction
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1906/
PDF	https://www.aclweb.org/anthology/W17-1906
PWC	https://paperswithcode.com/paper/supervised-and-unsupervised-approaches-to
Repo
Framework
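
The usage similarity entry above mentions unsupervised approaches based on embeddings for words, contexts, and sentences. One simple way to picture this is to average the word vectors of each usage context and compare the averages by cosine similarity. The sketch below does only that; the toy two-dimensional vectors, the averaging scheme, and the function names are assumptions for illustration and are not taken from the paper.

```python
import math

# Toy 2-dimensional vectors, made up for illustration only.
TOY_EMBEDDINGS = {
    "river": [0.9, 0.1],
    "water": [0.8, 0.2],
    "money": [0.1, 0.9],
    "loan": [0.2, 0.8],
}


def average_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def usage_similarity(context_a, context_b, embeddings):
    """Cosine similarity between the averaged vectors of two usage contexts."""
    a = average_vector(context_a, embeddings)
    b = average_vector(context_b, embeddings)
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

For two usages of a lemma such as "bank", the context words around each usage would be passed in as token lists; contexts that share a sense should then score higher than contexts that do not.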