October 15, 2019


Paper Group NANR 160

A Rich Annotation Scheme for Mental Events. Recursive Neural Structural Correspondence Network for Cross-domain Aspect and Opinion Co-Extraction. Cross-Document Narrative Alignment of Environmental News: A Position Paper on the Challenge of Using Event Chains to Proxy Narrative Features. Improved Evaluation Framework for Complex Plagiarism Detectio …

A Rich Annotation Scheme for Mental Events


Title	A Rich Annotation Scheme for Mental Events
Authors	William Croft, Pavlína Pešková, Michael Regan, Sook-kyung Lee
Abstract	We present a rich annotation scheme for the structure of mental events. Mental events are those in which the verb describes a mental state or process, usually oriented towards an external situation. While physical events have been described in detail and there are numerous studies of their semantic analysis and annotation, mental events are less thoroughly studied. The annotation scheme proposed here is based on decompositional analyses in the semantic and typological linguistic literature. The scheme was applied to the news corpus from the 2016 Events workshop, and error analysis of the test annotation provides suggestions for refinement and clarification of the annotation scheme.
Tasks
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4302/
PDF	https://www.aclweb.org/anthology/W18-4302
PWC	https://paperswithcode.com/paper/a-rich-annotation-scheme-for-mental-events
Repo
Framework

Recursive Neural Structural Correspondence Network for Cross-domain Aspect and Opinion Co-Extraction


Title	Recursive Neural Structural Correspondence Network for Cross-domain Aspect and Opinion Co-Extraction
Authors	Wenya Wang, Sinno Jialin Pan
Abstract	Fine-grained opinion analysis aims to extract aspect and opinion terms from each sentence for opinion summarization. Supervised learning methods have proven to be effective for this task. However, in many domains, the lack of labeled data hinders the learning of a precise extraction model. In this case, unsupervised domain adaptation methods are desired to transfer knowledge from the source domain to any unlabeled target domain. In this paper, we develop a novel recursive neural network that could reduce domain shift effectively in word level through syntactic relations. We treat these relations as invariant "pivot information" across domains to build structural correspondences and generate an auxiliary task to predict the relation between any two adjacent words in the dependency tree. In the end, we demonstrate state-of-the-art results on three benchmark datasets.
Tasks	Domain Adaptation, Extract Aspect, Fine-Grained Opinion Analysis, Sentiment Analysis, Unsupervised Domain Adaptation
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-1202/
PDF	https://www.aclweb.org/anthology/P18-1202
PWC	https://paperswithcode.com/paper/recursive-neural-structural-correspondence
Repo
Framework

Cross-Document Narrative Alignment of Environmental News: A Position Paper on the Challenge of Using Event Chains to Proxy Narrative Features


Title	Cross-Document Narrative Alignment of Environmental News: A Position Paper on the Challenge of Using Event Chains to Proxy Narrative Features
Authors	Ben Miller
Abstract	Cross-document event chain co-referencing in corpora of news articles would achieve increased precision and generalizability from a method that consistently recognizes narrative, discursive, and phenomenological features such as tense, mood, tone, canonicity and breach, person, hermeneutic composability, speed, and time. Current models that capture primarily linguistic data such as entities, times, and relations or causal relationships may only incidentally capture narrative framing features of events. That limits efforts at narrative and event chain segmentation, among other predicate tasks for narrative search and narrative-based reasoning. It further limits research on audience engagement with journalism about complex subjects. This position paper explores the above proposition with respect to narrative theory and ongoing research on segmenting event chains into narrative units. Our own work in progress approaches this task using event segmentation, word embeddings, and variable length pattern matching in a corpus of 2,000 articles describing environmental events. Our position is that narrative features may or may not be implicitly captured by current methods explicitly focused on events as linguistic phenomena, that they are not explicitly captured, and that further research is required.
Tasks	Word Embeddings
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4303/
PDF	https://www.aclweb.org/anthology/W18-4303
PWC	https://paperswithcode.com/paper/cross-document-narrative-alignment-of
Repo
Framework

Improved Evaluation Framework for Complex Plagiarism Detection


Title	Improved Evaluation Framework for Complex Plagiarism Detection
Authors	Anton Belyy, Marina Dubova, Dmitry Nekrasov
Abstract	Plagiarism is a major issue in science and education. Complex plagiarism, such as plagiarism of ideas, is hard to detect, and therefore it is especially important to track improvement of methods correctly. In this paper, we study the performance of plagdet, the main measure for plagiarism detection, on manually paraphrased datasets (such as PAN Summary). We reveal its fallibility under certain conditions and propose an evaluation framework with normalization of inner terms, which is resilient to the dataset imbalance. We conclude with the experimental justification of the proposed measure. The implementation of the new framework is made publicly available as a GitHub repository.
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-2026/
PDF	https://www.aclweb.org/anthology/P18-2026
PWC	https://paperswithcode.com/paper/improved-evaluation-framework-for-complex
Repo
Framework
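
As background for the abstract above: plagdet, as it is usually defined in the PAN plagiarism detection evaluations, divides the F1 score of precision and recall by log2(1 + granularity), where granularity is the average number of detections reported per detected case. The sketch below assumes that standard definition. It is not the normalized framework the authors propose, and the function name and sample values are hypothetical.

```python
import math


def plagdet(precision: float, recall: float, granularity: float) -> float:
    """Standard PAN-style plagdet (assumed definition): F1 / log2(1 + granularity)."""
    if granularity < 1.0:
        # Each detected case has at least one detection, so granularity >= 1.
        raise ValueError("granularity must be at least 1")
    if precision + recall == 0.0:
        # F1 is taken to be 0 when both precision and recall are 0.
        return 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return f1 / math.log2(1.0 + granularity)


if __name__ == "__main__":
    # Hypothetical scores for two detectors; only the granularity differs.
    print(plagdet(0.8, 0.6, 1.0))
    print(plagdet(0.8, 0.6, 2.0))
```

With granularity 1 the denominator is log2(2) = 1, so plagdet equals F1; any fragmentation of detections (granularity above 1) lowers the score.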

Identifying the Discourse Function of News Article Paragraphs


Title	Identifying the Discourse Function of News Article Paragraphs
Authors	W. Victor Yarlott, Cristina Cornelio, Tian Gao, Mark Finlayson
Abstract	Discourse structure is a key aspect of all forms of text, providing valuable information both to humans and machines. We applied the hierarchical theory of news discourse developed by van Dijk to examine how paragraphs operate as units of discourse structure within news articles, which we refer to here as document-level discourse. This document-level discourse provides a characterization of the content of each paragraph that describes its relation to the events presented in the article (such as main events, backgrounds, and consequences) as well as to other components of the story (such as commentary and evaluation). The purpose of a news discourse section is of great utility to story understanding as it affects both the importance and temporal order of items introduced in the text; therefore, if we know the news discourse purpose for different sections, we should be able to better rank events for their importance and better construct timelines. We test two hypotheses: first, that people can reliably annotate news articles with van Dijk's theory; second, that we can reliably predict these labels using machine learning. We show that people have a high degree of agreement with each other when annotating the theory (F1 > 0.8, Cohen's kappa > 0.6), demonstrating that it can be both learned and reliably applied by human annotators. Additionally, we demonstrate first steps toward machine learning of the theory, achieving a performance of F1 = 0.54, which is 65% of human performance. Moreover, we have generated a gold-standard, adjudicated corpus of 50 documents for document-level discourse annotation based on the ACE Phase 2 corpus.
Tasks
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4304/
PDF	https://www.aclweb.org/anthology/W18-4304
PWC	https://paperswithcode.com/paper/identifying-the-discourse-function-of-news
Repo
Framework
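
The abstract above reports inter-annotator agreement as Cohen's kappa. As a reminder of what that statistic measures, the sketch below computes kappa = (p_o - p_e) / (1 - p_e) for two annotators, where p_o is the observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies. The paragraph labels in the example are hypothetical and are not taken from the paper's corpus; the function is an illustration, not the authors' evaluation code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used a single identical label; treated as full agreement
        # here by convention (an assumption, since kappa is undefined in this case).
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical paragraph labels from two annotators.
    annotator_1 = ["main", "main", "background", "consequence"]
    annotator_2 = ["main", "background", "background", "consequence"]
    print(cohens_kappa(annotator_1, annotator_2))
```

In this toy example the annotators agree on 3 of 4 paragraphs (p_o = 0.75) while chance agreement is 5/16, giving kappa = 7/11, about 0.64.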

An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines


Title	An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines
Authors	Shi Yuan, Bei Yu
Abstract	This study evaluates the performance of four information extraction tools (extractors) on identifying health claims in health news headlines. A health claim is defined as a triplet: IV (what is being manipulated), DV (what is being measured) and their relation. Tools that can identify health claims provide the foundation for evaluating the accuracy of these claims against authoritative resources. The evaluation result shows that 26% of headlines do not include health claims, and all extractors face difficulty separating them from the rest. For those with health claims, OPENIE-5.0 performed the best with F-measure at 0.6 level for extracting "IV-relation-DV". However, the characteristic linguistic structures in health news headlines, such as incomplete sentences and non-verb relations, pose a particular challenge to existing tools.
Tasks	Relation Extraction
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4305/
PDF	https://www.aclweb.org/anthology/W18-4305
PWC	https://paperswithcode.com/paper/an-evaluation-of-information-extraction-tools
Repo
Framework

Efficient Large-Scale Neural Domain Classification with Personalized Attention


Title	Efficient Large-Scale Neural Domain Classification with Personalized Attention
Authors	Young-Bum Kim, Dongchan Kim, Anjishnu Kumar, Ruhi Sarikaya
Abstract	In this paper, we explore the task of mapping spoken language utterances to one of thousands of natural language understanding domains in intelligent personal digital assistants (IPDAs). This scenario is observed in mainstream IPDAs in industry that allow third parties to develop thousands of new domains to augment built-in first party domains to rapidly increase domain coverage and overall IPDA capabilities. We propose a scalable neural model architecture with a shared encoder, a novel attention mechanism that incorporates personalization information and domain-specific classifiers that solves the problem efficiently. Our architecture is designed to efficiently accommodate incremental domain additions achieving two orders of magnitude speed up compared to full model retraining. We consider the practical constraints of real-time production systems, and design to minimize memory footprint and runtime latency. We demonstrate that incorporating personalization significantly improves domain classification accuracy in a setting with thousands of overlapping domains.
Tasks	Spoken Language Understanding
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-1206/
PDF	https://www.aclweb.org/anthology/P18-1206
PWC	https://paperswithcode.com/paper/efficient-large-scale-neural-domain
Repo
Framework

Extending the gold standard for a lexical substitution task: is it worth it?


Title	Extending the gold standard for a lexical substitution task: is it worth it?
Authors	Ludovic Tanguy, Cécile Fabre, Laura Rivière
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1310/
PDF	https://www.aclweb.org/anthology/L18-1310
PWC	https://paperswithcode.com/paper/extending-the-gold-standard-for-a-lexical
Repo
Framework

Cost-Sensitive Active Learning for Dialogue State Tracking


Title	Cost-Sensitive Active Learning for Dialogue State Tracking
Authors	Kaige Xie, Cheng Chang, Liliang Ren, Lu Chen, Kai Yu
Abstract	Dialogue state tracking (DST), when formulated as a supervised learning problem, relies on labelled data. Since dialogue state annotation usually requires labelling all turns of a single dialogue and utilizing context information, it is very expensive to annotate all available unlabelled data. In this paper, a novel cost-sensitive active learning framework is proposed based on a set of new dialogue-level query strategies. This is the first attempt to apply active learning for dialogue state tracking. Experiments on DSTC2 show that active learning with mixed data query strategies can effectively achieve the same DST performance with significantly less data annotation compared to traditional training approaches.
Tasks	Active Learning, Dialogue State Tracking
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-5022/
PDF	https://www.aclweb.org/anthology/W18-5022
PWC	https://paperswithcode.com/paper/cost-sensitive-active-learning-for-dialogue
Repo
Framework

Learning to Flip the Bias of News Headlines


Title	Learning to Flip the Bias of News Headlines
Authors	Wei-Fan Chen, Henning Wachsmuth, Khalid Al-Khatib, Benno Stein
Abstract	This paper introduces the task of "flipping" the bias of news articles: Given an article with a political bias (left or right), generate an article with the same topic but opposite bias. To study this task, we create a corpus with bias-labeled articles from all-sides.com. As a first step, we analyze the corpus and discuss intrinsic characteristics of bias. They point to the main challenges of bias flipping, which in turn lead to a specific setting in the generation process. The paper in hand narrows down the general bias flipping task to focus on bias flipping for news article headlines. A manual annotation of headlines from each side reveals that they are self-informative in general and often convey bias. We apply an autoencoder incorporating information from an article's content to learn how to automatically flip the bias. From 200 generated headlines, 73 are classified as understandable by annotators, and 83 maintain the topic while having opposite bias. Insights from our analysis shed light on how to solve the main challenges of bias flipping.
Tasks	Text Generation
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6509/
PDF	https://www.aclweb.org/anthology/W18-6509
PWC	https://paperswithcode.com/paper/learning-to-flip-the-bias-of-news-headlines
Repo
Framework

Investigating the Influence of Bilingual MWU on Trainee Translation Quality


Title	Investigating the Influence of Bilingual MWU on Trainee Translation Quality
Authors	Yu Yuan, Serge Sharoff
Abstract
Tasks	Machine Translation, Word Alignment
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1312/
PDF	https://www.aclweb.org/anthology/L18-1312
PWC	https://paperswithcode.com/paper/investigating-the-influence-of-bilingual-mwu
Repo
Framework

HairNet: Single-View Hair Reconstruction using Convolutional Neural Networks


Title	HairNet: Single-View Hair Reconstruction using Convolutional Neural Networks
Authors	Yi Zhou, Liwen Hu, Jun Xing, Weikai Chen, Han-Wei Kung, Xin Tong, Hao Li
Abstract	We introduce a deep learning-based method to generate full 3D hair geometry from an unconstrained image. Our method can recover local strand details and has real-time performance. State-of-the-art hair modeling techniques rely on large hairstyle collections for nearest neighbor retrieval and then perform ad-hoc refinement. Our deep learning approach, in contrast, is highly efficient in storage and can run 1000 times faster while generating hair with 30K strands. The convolutional neural network takes the 2D orientation field of a hair image as input and generates strand features that are evenly distributed on the parameterized 2D scalp. We introduce a collision loss to synthesize more plausible hairstyles, and the visibility of each strand is also used as a weight term to improve the reconstruction accuracy. The encoder-decoder architecture of our network naturally provides a compact and continuous representation for hairstyles, which allows us to interpolate naturally between hairstyles. We use a large set of rendered synthetic hair models to train our network. Our method scales to real images because an intermediate 2D orientation field, automatically calculated from the real image, factors out the difference between synthetic and real hairs. We demonstrate the effectiveness and robustness of our method on a wide range of challenging real Internet pictures, and show reconstructed hair sequences from videos.
Tasks
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Yi_Zhou_Single-view_Hair_Reconstruction_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Yi_Zhou_Single-view_Hair_Reconstruction_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/hairnet-single-view-hair-reconstruction-using
Repo
Framework

Semantic Match Consistency for Long-Term Visual Localization


Title	Semantic Match Consistency for Long-Term Visual Localization
Authors	Carl Toft, Erik Stenborg, Lars Hammarstrand, Lucas Brynte, Marc Pollefeys, Torsten Sattler, Fredrik Kahl
Abstract	Robust and accurate visual localization across large appearance variations due to changes in time of day, seasons, or changes of the environment is a challenging problem which is of importance to application areas such as navigation of autonomous robots. Traditional feature-based methods often struggle in these conditions due to the significant number of erroneous matches between the image and the 3D model. In this paper, we present a method for scoring the individual correspondences by exploiting semantic information about the query image and the scene. In this way, erroneous correspondences tend to get a low semantic consistency score, whereas correct correspondences tend to get a high score. By incorporating this information in a standard localization pipeline, we show that the localization performance can be significantly improved compared to the state-of-the-art, as evaluated on two challenging long-term localization benchmarks.
Tasks	Visual Localization
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Carl_Toft_Semantic_Match_Consistency_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Carl_Toft_Semantic_Match_Consistency_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/semantic-match-consistency-for-long-term
Repo
Framework

Learning to Define Terms in the Software Domain


Title	Learning to Define Terms in the Software Domain
Authors	Vidhisha Balachandran, Dheeraj Rajagopal, Rose Catherine Kanjirathinkal, William Cohen
Abstract	One way to test a person's knowledge of a domain is to ask them to define domain-specific terms. Here, we investigate the task of automatically generating definitions of technical terms by reading text from the technical domain. Specifically, we learn definitions of software entities from a large corpus built from the user forum Stack Overflow. To model definitions, we train a language model and incorporate additional domain-specific information like word co-occurrence and ontological category information. Our approach improves previous baselines by 2 BLEU points for the definition generation task. Our experiments also show the additional challenges associated with the task and the shortcomings of language-model based architectures for definition generation.
Tasks	Knowledge Base Population, Language Modelling, Relationship Extraction (Distant Supervised)
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6122/
PDF	https://www.aclweb.org/anthology/W18-6122
PWC	https://paperswithcode.com/paper/learning-to-define-terms-in-the-software
Repo
Framework

Cross-Corpus Training with TreeLSTM for the Extraction of Biomedical Relationships from Text


Title	Cross-Corpus Training with TreeLSTM for the Extraction of Biomedical Relationships from Text
Authors	Joël Legrand, Yannick Toussaint, Chedy Raïssi, Adrien Coulet
Abstract	A bottleneck problem in machine learning-based relationship extraction (RE) algorithms, and particularly in deep learning-based ones, is the availability of training data in the form of annotated corpora. For specific domains, such as biomedicine, the long time and high expertise required for the development of manually annotated corpora explain why most of the existing ones are relatively small (i.e., hundreds of sentences). Besides, larger corpora focusing on general or domain-specific relationships (such as citizenship or drug-drug interactions) have been developed. In this paper, we study how large annotated corpora developed for alternative tasks may improve performance on biomedicine-related tasks, for which few annotated resources are available. We experiment with two deep learning-based models to extract relationships from biomedical texts with high performance. The first one combines locally extracted features using a Convolutional Neural Network (CNN) model, while the second exploits the syntactic structure of sentences using a Recursive Neural Network (RNN) architecture. Our experiments show that, contrary to the former, the latter benefits from a cross-corpus learning strategy to improve the performance of relationship extraction tasks. Indeed, our approach leads to the best published performance for two biomedical RE tasks, and to state-of-the-art results for two other biomedical RE tasks, for which few annotated resources are available (fewer than 400 manually annotated sentences). This may be particularly impactful in specialized domains in which training resources are scarce, because they would benefit from the training data of other domains for which large annotated corpora do exist.
Tasks	Relationship Extraction (Distant Supervised)
Published	2018-01-01
URL	https://openreview.net/forum?id=S1LXVnxRb
PDF	https://openreview.net/pdf?id=S1LXVnxRb
PWC	https://paperswithcode.com/paper/cross-corpus-training-with-treelstm-for-the
Repo
Framework