Paper Group NANR 127
Unsupervised corpus–wide claim detection. Structured Generation of Technical Reading Lists. Unit Segmentation of Argumentative Texts. Sense Contextualization in a Dependency-Based Compositional Distributional Model. Fermi at SemEval-2017 Task 7: Detection and Interpretation of Homographic puns in English Language. Elliptic Constructions: Spotting …
Unsupervised corpus–wide claim detection
Title | Unsupervised corpus–wide claim detection |
Authors | Ran Levy, Shai Gretz, Benjamin Sznajder, Shay Hummel, Ranit Aharonov, Noam Slonim |
Abstract | Automatic claim detection is a fundamental argument mining task that aims to automatically mine claims regarding a topic of consideration. Previous works on mining argumentative content have assumed that a set of relevant documents is given in advance. Here, we present a first corpus–wide claim detection framework, that can be directly applied to massive corpora. Using simple and intuitive empirical observations, we derive a claim sentence query by which we are able to directly retrieve sentences in which the prior probability to include topic-relevant claims is greatly enhanced. Next, we employ simple heuristics to rank the sentences, leading to an unsupervised corpus–wide claim detection system, with precision that outperforms previously reported results on the task of claim detection given relevant documents and labeled data. |
Tasks | Argument Mining, Decision Making |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5110/ |
PWC | https://paperswithcode.com/paper/unsupervised-corpusawide-claim-detection |
Repo | |
Framework | |
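The retrieve-then-rank idea described in the abstract above can be illustrated with a small sketch. The abstract does not give the actual claim sentence query or the ranking heuristics, so the pattern used here (the word "that" followed later by the topic term) and the cue-word scoring are hypothetical stand-ins, not the authors' method.

```python
import re

# Hypothetical cue words; the abstract does not list the real ranking heuristics.
CUE_WORDS = {"should", "must", "harmful", "beneficial", "causes", "leads"}


def matches_query(sentence: str, topic: str) -> bool:
    """Illustrative query: 'that' appears, followed later by the topic term."""
    pattern = r"\bthat\b.*\b" + re.escape(topic.lower()) + r"\b"
    return re.search(pattern, sentence.lower()) is not None


def score(sentence: str) -> int:
    """Illustrative ranking heuristic: count cue words in the sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sum(token in CUE_WORDS for token in tokens)


def detect_claims(sentences, topic, top_k=2):
    """Retrieve sentences matching the query, then rank them by score."""
    candidates = [s for s in sentences if matches_query(s, topic)]
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Toy corpus, invented for illustration only.
corpus = [
    "Critics argue that gambling is harmful and should be banned.",
    "The casino opened in 1990.",
    "He said that gambling causes debt.",
    "Gambling revenue rose last year.",
]
ranked = detect_claims(corpus, "gambling")
```

Retrieval runs first and needs no labeled data; ranking then only has to order a small candidate set, which is why simple heuristics can be enough in this kind of pipeline.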
Structured Generation of Technical Reading Lists
Title | Structured Generation of Technical Reading Lists |
Authors | Jonathan Gordon, Stephen Aguilar, Emily Sheng, Gully Burns |
Abstract | Learners need to find suitable documents to read and prioritize them in an appropriate order. We present a method of automatically generating reading lists, selecting documents based on their pedagogical value to the learner and ordering them using the structure of concepts in the domain. Resulting reading lists related to computational linguistics were evaluated by advanced learners and judged to be near the quality of those generated by domain experts. We provide an open-source implementation of our method to enable future work on reading list generation. |
Tasks | Information Retrieval, Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5029/ |
PWC | https://paperswithcode.com/paper/structured-generation-of-technical-reading |
Repo | |
Framework | |
Unit Segmentation of Argumentative Texts
Title | Unit Segmentation of Argumentative Texts |
Authors | Yamen Ajjour, Wei-Fan Chen, Johannes Kiesel, Henning Wachsmuth, Benno Stein |
Abstract | The segmentation of an argumentative text into argument units and their non-argumentative counterparts is the first step in identifying the argumentative structure of the text. Despite its importance for argument mining, unit segmentation has been approached only sporadically so far. This paper studies the major parameters of unit segmentation systematically. We explore the effectiveness of various features, when capturing words separately, along with their neighbors, or even along with the entire text. Each such context is reflected by one machine learning model that we evaluate within and across three domains of texts. Among the models, our new deep learning approach capturing the entire text turns out best within all domains, with an F-score of up to 88.54. While structural features generalize best across domains, the domain transfer remains hard, which points to major challenges of unit segmentation. |
Tasks | Argument Mining |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5115/ |
PWC | https://paperswithcode.com/paper/unit-segmentation-of-argumentative-texts |
Repo | |
Framework | |
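Unit segmentation is commonly framed as token-level labeling. As a minimal sketch, assuming a BIO-style scheme (B = begins an argument unit, I = inside one, O = non-argumentative), which the abstract above does not specify, predicted labels can be decoded into unit spans as follows. The function name and the example sentence are illustrative only.

```python
from typing import List, Tuple


def decode_units(labels: List[str]) -> List[Tuple[int, int]]:
    """Turn BIO labels into (start, end) token spans, end exclusive."""
    spans = []
    start = None
    for i, label in enumerate(labels):
        if label == "B":
            # A new unit begins; close any unit that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif label == "O":
            # Non-argumentative token; close any open unit.
            if start is not None:
                spans.append((start, i))
            start = None
        elif label == "I" and start is None:
            # Stray "I" without a preceding "B": treat it as a unit start.
            start = i
    if start is not None:
        spans.append((start, len(labels)))
    return spans


# Invented example: two argument units separated by a connective.
tokens = ["Honestly", ",", "taxes", "should", "rise", "because",
          "schools", "need", "funds", "."]
labels = ["O", "O", "B", "I", "I", "O", "B", "I", "I", "O"]
units = [" ".join(tokens[s:e]) for s, e in decode_units(labels)]
```

Here `units` holds the two argument units, and every token outside the returned spans is treated as non-argumentative.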
Sense Contextualization in a Dependency-Based Compositional Distributional Model
Title | Sense Contextualization in a Dependency-Based Compositional Distributional Model |
Authors | Pablo Gamallo |
Abstract | Little attention has been paid to distributional compositional methods which employ syntactically structured vector models. As word vectors belonging to different syntactic categories have incompatible syntactic distributions, no trivial compositional operation can be applied to combine them into a new compositional vector. In this article, we generalize the method described by Erk and Padó (2009) by proposing a dependency-based framework that contextualizes not only lemmas but also selectional preferences. The main contribution of the article is to expand their model to a fully compositional framework in which syntactic dependencies are put at the core of semantic composition. We claim that semantic composition is mainly driven by syntactic dependencies. Each syntactic dependency generates two new compositional vectors representing the contextualized sense of the two related lemmas. The sequential application of the compositional operations associated to the dependencies results in as many contextualized vectors as lemmas the composite expression contains. At the end of the semantic process, we do not obtain a single compositional vector representing the semantic denotation of the whole composite expression, but one contextualized vector for each lemma of the whole expression. Our method avoids the troublesome high-order tensor representations by defining lemmas and selectional restrictions as first-order tensors (i.e. standard vectors). A corpus-based experiment is performed to both evaluate the quality of the compositional vectors built with our strategy, and to compare them to other approaches on distributional compositional semantics. The experiments show that our dependency-based compositional method performs as well as (or even better than) the state-of-the-art. |
Tasks | Representation Learning, Semantic Composition, Word Sense Disambiguation |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2601/ |
PWC | https://paperswithcode.com/paper/sense-contextualization-in-a-dependency-based |
Repo | |
Framework | |
Fermi at SemEval-2017 Task 7: Detection and Interpretation of Homographic puns in English Language
Title | Fermi at SemEval-2017 Task 7: Detection and Interpretation of Homographic puns in English Language |
Authors | Vijayasaradhi Indurthi, Subba Reddy Oota |
Abstract | This paper describes our system for detection and interpretation of English puns. We participated in 2 subtasks related to homographic puns and achieved comparable results for these tasks. Through the paper we provide a detailed description of the approach, as well as the results obtained in the task. Our models achieved an F1-score of 77.65% for Subtask 1 and 52.15% for Subtask 2. |
Tasks | Feature Engineering, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2079/ |
PWC | https://paperswithcode.com/paper/fermi-at-semeval-2017-task-7-detection-and |
Repo | |
Framework | |
Elliptic Constructions: Spotting Patterns in UD Treebanks
Title | Elliptic Constructions: Spotting Patterns in UD Treebanks |
Authors | Kira Droganova, Daniel Zeman |
Abstract | |
Tasks | |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0406/ |
PWC | https://paperswithcode.com/paper/elliptic-constructions-spotting-patterns-in |
Repo | |
Framework | |
Functions of Silences towards Information Flow in Spoken Conversation
Title | Functions of Silences towards Information Flow in Spoken Conversation |
Authors | Shammur Absar Chowdhury, Evgeny Stepanov, Morena Danieli, Giuseppe Riccardi |
Abstract | Silence is an integral part of the most frequent turn-taking phenomena in spoken conversations. Silence is sized and placed within the conversation flow and it is coordinated by the speakers along with the other speech acts. The objective of this analytical study is twofold: to explore the functions of silence with duration of one second and above, towards information flow in a dyadic conversation utilizing the sequences of dialog acts present in the turns surrounding the silence itself; and to design a feature space useful for clustering the silences using a hierarchical concept formation algorithm. The resulting clusters are manually grouped into functional categories based on their similarities. It is observed that the silence plays an important role in response preparation, also can indicate speakers' hesitation or indecisiveness. It is also observed that sometimes long silences can be used deliberately to get a forced response from another speaker thus making silence a multi-functional and an important catalyst towards information flow. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4601/ |
PWC | https://paperswithcode.com/paper/functions-of-silences-towards-information |
Repo | |
Framework | |
A Multidimensional Lexicon for Interpersonal Stancetaking
Title | A Multidimensional Lexicon for Interpersonal Stancetaking |
Authors | Umashanthi Pavalanathan, Jim Fitzpatrick, Scott Kiesling, Jacob Eisenstein |
Abstract | The sociolinguistic construct of stancetaking describes the activities through which discourse participants create and signal relationships to their interlocutors, to the topic of discussion, and to the talk itself. Stancetaking underlies a wide range of interactional phenomena, relating to formality, politeness, affect, and subjectivity. We present a computational approach to stancetaking, in which we build a theoretically-motivated lexicon of stance markers, and then use multidimensional analysis to identify a set of underlying stance dimensions. We validate these dimensions intrinsically and extrinsically, showing that they are internally coherent, match pre-registered hypotheses, and correlate with social phenomena. |
Tasks | Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1082/ |
PWC | https://paperswithcode.com/paper/a-multidimensional-lexicon-for-interpersonal |
Repo | |
Framework | |
Inflection Generation for Spanish Verbs using Supervised Learning
Title | Inflection Generation for Spanish Verbs using Supervised Learning |
Authors | Cristina Barros, Dimitra Gkatzia, Elena Lloret |
Abstract | We present a novel supervised approach to inflection generation for verbs in Spanish. Our system takes as input the verb's lemma form and the desired features such as person, number, tense, and is able to predict the appropriate grammatical conjugation. Even though our approach learns from fewer examples compared to previous work, it is able to deal with all the Spanish moods (indicative, subjunctive and imperative) in contrast to previous work which only focuses on indicative and subjunctive moods. We show that in an intrinsic evaluation, our system achieves 99% accuracy, outperforming (although not significantly) two competitive state-of-art systems. The successful results obtained clearly indicate that our approach could be integrated into wider approaches related to text generation in Spanish. |
Tasks | Machine Translation, Morphological Inflection, Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4120/ |
PWC | https://paperswithcode.com/paper/inflection-generation-for-spanish-verbs-using |
Repo | |
Framework | |
Generating flexible proper name references in text: Data, models and evaluation
Title | Generating flexible proper name references in text: Data, models and evaluation |
Authors | Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben |
Abstract | This study introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the discourse context and variation. The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts. We evaluate the versions of our model from the perspective of how human writers produce proper names, and also how human readers process them. The corpus and the model are publicly available. |
Tasks | Text Generation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1062/ |
PWC | https://paperswithcode.com/paper/generating-flexible-proper-name-references-in |
Repo | |
Framework | |
Language-independent Gender Prediction on Twitter
Title | Language-independent Gender Prediction on Twitter |
Authors | Nikola Ljubešić, Darja Fišer, Tomaž Erjavec |
Abstract | In this paper we present a set of experiments and analyses on predicting the gender of Twitter users based on language-independent features extracted either from the text or the metadata of users' tweets. We perform our experiments on the TwiSty dataset containing manual gender annotations for users speaking six different languages. Our classification results show that, while the prediction model based on language-independent features performs worse than the bag-of-words model when training and testing on the same language, it regularly outperforms the bag-of-words model when applied to different languages, showing very stable results across various languages. Finally we perform a comparative analysis of feature effect sizes across the six languages and show that differences in our features correspond to cultural distances. |
Tasks | Gender Prediction |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2901/ |
PWC | https://paperswithcode.com/paper/language-independent-gender-prediction-on |
Repo | |
Framework | |
Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation
Title | Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation |
Authors | Ye Jiang, Xingyi Song, Jackie Harrison, Shaun Quegan, Diana Maynard |
Abstract | News media typically present biased accounts of news stories, and different publications present different angles on the same event. In this research, we investigate how different publications differ in their approach to stories about climate change, by examining the sentiment and topics presented. To understand these attitudes, we find sentiment targets by combining Latent Dirichlet Allocation (LDA) with SentiWordNet, a general sentiment lexicon. Using LDA, we generate topics containing keywords which represent the sentiment targets, and then annotate the data using SentiWordNet before regrouping the articles based on topic similarity. Preliminary analysis identifies clearly different attitudes on the same issue presented in different news sources. Ongoing work is investigating how systematic these attitudes are between different publications, and how these may change over time. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4205/ |
PWC | https://paperswithcode.com/paper/comparing-attitudes-to-climate-change-in-the |
Repo | |
Framework | |
Language-based Construction of Explorable News Graphs for Journalists
Title | Language-based Construction of Explorable News Graphs for Journalists |
Authors | Rémi Bois, Guillaume Gravier, Eric Jamet, Emmanuel Morin, Pascale Sébillot, Maxime Robert |
Abstract | Faced with ever-growing news archives, media professionals are in need of advanced tools to explore the information surrounding specific events. This problem is most commonly answered by browsing news datasets, going from article to article and viewing unaltered original content. In this article, we introduce an efficient way to generate links between news items, allowing such browsing through an easily explorable graph, and enrich this graph by automatically typing links in order to inform the user on the nature of the relation between two news pieces. User evaluations are conducted on real world data with journalists in order to assess for the interest of both the graph representation and link typing in a press reviewing task, showing the system to be of significant help for their work. |
Tasks | Entity Extraction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4206/ |
PWC | https://paperswithcode.com/paper/language-based-construction-of-explorable |
Repo | |
Framework | |
Neural Machine Translation with Source Dependency Representation
Title | Neural Machine Translation with Source Dependency Representation |
Authors | Kehai Chen, Rui Wang, Masao Utiyama, Lemao Liu, Akihiro Tamura, Eiichiro Sumita, Tiejun Zhao |
Abstract | Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating representations of a source word and its dependency label together. In this paper, we propose a novel NMT with source dependency representation to improve translation performance of NMT, especially for long sentences. Empirical results on the NIST Chinese-to-English translation task show that our method achieves 1.6 BLEU improvements on average over a strong NMT system. |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1304/ |
PWC | https://paperswithcode.com/paper/neural-machine-translation-with-source |
Repo | |
Framework | |
Incongruent Headlines: Yet Another Way to Mislead Your Readers
Title | Incongruent Headlines: Yet Another Way to Mislead Your Readers |
Authors | Sophie Chesney, Maria Liakata, Massimo Poesio, Matthew Purver |
Abstract | This paper discusses the problem of incongruent headlines: those which do not accurately represent the information contained in the article with which they occur. We emphasise that this phenomenon should be considered separately from recognised problematic headline types such as clickbait and sensationalism, arguing that existing natural language processing (NLP) methods applied to these related concepts are not appropriate for the automatic detection of headline incongruence, as an analysis beyond stylistic traits is necessary. We therefore suggest a number of alternative methodologies that may be appropriate to the task at hand as a foundation for future work in this area. In addition, we provide an analysis of existing data sets which are related to this work, and motivate the need for a novel data set in this domain. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4210/ |
PWC | https://paperswithcode.com/paper/incongruent-headlines-yet-another-way-to |
Repo | |
Framework | |