July 26, 2019


Paper Group NANR 173


Domain-Adaptable Hybrid Generation of RDF Entity Descriptions


Title	Domain-Adaptable Hybrid Generation of RDF Entity Descriptions
Authors	Or Biran, Kathleen McKeown
Abstract	RDF ontologies provide structured data on entities in many domains and continue to grow in size and diversity. While they can be useful as a starting point for generating descriptions of entities, they often miss important information about an entity that cannot be captured as simple relations. In addition, generic approaches to generation from RDF cannot capture the unique style and content of specific domains. We describe a framework for hybrid generation of entity descriptions, which combines generation from RDF data with text extracted from a corpus, and extracts unique aspects of the domain from the corpus to create domain-specific generation systems. We show that each component of our approach significantly increases the satisfaction of readers with the text across multiple applications and domains.
Tasks	Domain Adaptation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1031/
PDF	https://www.aclweb.org/anthology/I17-1031
PWC	https://paperswithcode.com/paper/domain-adaptable-hybrid-generation-of-rdf
Repo
Framework

Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling


Title	Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling
Authors	Riyaz A. Bhat, Irshad Bhat, Dipti Sharma
Abstract	We investigate the problem of parsing conversational data of morphologically-rich languages such as Hindi where argument scrambling occurs frequently. We evaluate a state-of-the-art non-linear transition-based parsing system on a new dataset containing 506 dependency trees for sentences from Bollywood (Hindi) movie scripts and Twitter posts of Hindi monolingual speakers. We show that a dependency parser trained on a newswire treebank is strongly biased towards the canonical structures and degrades when applied to conversational data. Inspired by Transformational Generative Grammar (Chomsky, 1965), we mitigate the sampling bias by generating all theoretically possible alternative word orders of a clause from the existing (kernel) structures in the treebank. Training our parser on canonical and transformed structures improves performance on conversational data by around 9% LAS over the baseline newswire parser.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-6309/
PDF	https://www.aclweb.org/anthology/W17-6309
PWC	https://paperswithcode.com/paper/leveraging-newswire-treebanks-for-parsing-1
Repo
Framework
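
The abstract's idea of generating every alternative word order of a clause can be illustrated with a toy sketch. This is not the authors' implementation: the paper transforms dependency trees from a treebank, whereas the hypothetical `scrambled_orders` function below only permutes flat argument chunks and assumes the verb stays clause-final. The example clause is invented for illustration.

```python
from itertools import permutations


def scrambled_orders(arguments, verb):
    """Return every ordering of the arguments, keeping the verb clause-final.

    Assumption for illustration only: arguments are flat chunks and the
    verb does not move. The paper works on full dependency structures.
    """
    return [list(order) + [verb] for order in permutations(arguments)]


# Toy clause (hypothetical, not taken from the paper's data):
# "Ram-ne Sita-ko kitaab dii" ~ "Ram gave Sita a book"
orders = scrambled_orders(["Ram-ne", "Sita-ko", "kitaab"], "dii")
for order in orders:
    print(" ".join(order))
```

With three arguments this yields six orders, the first being the canonical one. A parser trained on all of them sees non-canonical orders that a newswire treebank rarely contains.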

Comparing Rule-based and SMT-based Spelling Normalisation for English Historical Texts


Title	Comparing Rule-based and SMT-based Spelling Normalisation for English Historical Texts
Authors	Gerold Schneider, Eva Pettersson, Michael Percillier
Abstract
Tasks	Machine Translation
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0508/
PDF	https://www.aclweb.org/anthology/W17-0508
PWC	https://paperswithcode.com/paper/comparing-rule-based-and-smt-based-spelling
Repo
Framework

Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)


Title	Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)
Authors
Abstract
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1200/
PDF	https://www.aclweb.org/anthology/W17-1200
PWC	https://paperswithcode.com/paper/proceedings-of-the-fourth-workshop-on-nlp-for-1
Repo
Framework

The Expxorcist: Nonparametric Graphical Models Via Conditional Exponential Densities


Title	The Expxorcist: Nonparametric Graphical Models Via Conditional Exponential Densities
Authors	Arun Suggala, Mladen Kolar, Pradeep K. Ravikumar
Abstract	Non-parametric multivariate density estimation faces strong statistical and computational bottlenecks, and the more practical approaches impose near-parametric assumptions on the form of the density functions. In this paper, we leverage recent developments to propose a class of non-parametric models which have very attractive computational and statistical properties. Our approach relies on the simple function space assumption that the conditional distribution of each variable conditioned on the other variables has a non-parametric exponential family form.
Tasks	Density Estimation
Published	2017-12-01
URL	http://papers.nips.cc/paper/7031-the-expxorcist-nonparametric-graphical-models-via-conditional-exponential-densities
PDF	http://papers.nips.cc/paper/7031-the-expxorcist-nonparametric-graphical-models-via-conditional-exponential-densities.pdf
PWC	https://paperswithcode.com/paper/the-expxorcist-nonparametric-graphical-models
Repo
Framework

Using Explicit Discourse Connectives in Translation for Implicit Discourse Relation Classification


Title	Using Explicit Discourse Connectives in Translation for Implicit Discourse Relation Classification
Authors	Wei Shi, Frances Yung, Raphael Rubino, Vera Demberg
Abstract	Implicit discourse relation recognition is an extremely challenging task due to the lack of indicative connectives. Various neural network architectures have been proposed for this task recently, but most of them suffer from the shortage of labeled data. In this paper, we address this problem by procuring additional training data from parallel corpora: When humans translate a text, they sometimes add connectives (a process known as explicitation). We automatically back-translate it into an English connective and use it to infer a label with high confidence. We show that a training set several times larger than the original training set can be generated this way. With the extra labeled instances, we show that even a simple bidirectional Long Short-Term Memory Network can outperform the current state-of-the-art.
Tasks	Implicit Discourse Relation Classification, Machine Translation, Question Answering, Relation Classification
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1049/
PDF	https://www.aclweb.org/anthology/I17-1049
PWC	https://paperswithcode.com/paper/using-explicit-discourse-connectives-in
Repo
Framework

LIMSI Submission for WMT’17 Shared Task on Bandit Learning


Title	LIMSI Submission for WMT’17 Shared Task on Bandit Learning
Authors	Guillaume Wisniewski
Abstract
Tasks	Machine Translation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4779/
PDF	https://www.aclweb.org/anthology/W17-4779
PWC	https://paperswithcode.com/paper/limsi-submission-for-wmt17-shared-task-on
Repo
Framework

Sentence Modeling with Deep Neural Architecture using Lexicon and Character Attention Mechanism for Sentiment Classification


Title	Sentence Modeling with Deep Neural Architecture using Lexicon and Character Attention Mechanism for Sentiment Classification
Authors	Huy Thanh Nguyen, Minh Le Nguyen
Abstract	Tweet-level sentiment classification in Twitter social networking has many challenges: exploiting syntax, semantic, sentiment, and context in tweets. To address these problems, we propose a novel approach to sentiment analysis that uses lexicon features for building lexicon embeddings (LexW2Vs) and generates character attention vectors (CharAVs) by using a Deep Convolutional Neural Network (DeepCNN). Our approach integrates LexW2Vs and CharAVs with continuous word embeddings (ContinuousW2Vs) and dependency-based word embeddings (DependencyW2Vs) simultaneously in order to increase information for each word into a Bidirectional Contextual Gated Recurrent Neural Network (Bi-CGRNN). We evaluate our model on two Twitter sentiment classification datasets. Experimental results show that our model can improve the classification accuracy of sentence-level sentiment analysis in Twitter social networking.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1054/
PDF	https://www.aclweb.org/anthology/I17-1054
PWC	https://paperswithcode.com/paper/sentence-modeling-with-deep-neural
Repo
Framework

Boundary-based MWE segmentation with text partitioning


Title	Boundary-based MWE segmentation with text partitioning
Authors	Jake Williams
Abstract	This submission describes the development of a fine-grained, text-chunking algorithm for the task of comprehensive MWE segmentation. This task notably focuses on the identification of colloquial and idiomatic language. The submission also includes a thorough model evaluation in the context of two recent shared tasks, spanning 19 different languages and many text domains, including noisy, user-generated text. Evaluations exhibit the presented model as the best overall for purposes of MWE segmentation, and open-source software is released with the submission (although links are withheld for purposes of anonymity). Additionally, the authors acknowledge the existence of a pre-print document on arxiv.org, which should be avoided to maintain anonymity in review.
Tasks	Chunking, Information Retrieval, Machine Translation, Tokenization
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4401/
PDF	https://www.aclweb.org/anthology/W17-4401
PWC	https://paperswithcode.com/paper/boundary-based-mwe-segmentation-with-text
Repo
Framework

Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks


Title	Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks
Authors	Shikib Mehri, Giuseppe Carenini
Abstract	Thread disentanglement is a precursor to any high-level analysis of multiparticipant chats. Existing research approaches the problem by calculating the likelihood of two messages belonging in the same thread. Our approach leverages a newly annotated dataset to identify reply relationships. Furthermore, we explore the usage of an RNN, along with large quantities of unlabeled data, to learn semantic relationships between messages. Our proposed pipeline, which utilizes a reply classifier and an RNN to generate a set of disentangled threads, is novel and performs well against previous work.
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1062/
PDF	https://www.aclweb.org/anthology/I17-1062
PWC	https://paperswithcode.com/paper/chat-disentanglement-identifying-semantic
Repo
Framework

The SUMMA Platform Prototype


Title	The SUMMA Platform Prototype
Authors	Renars Liepins, Ulrich Germann, Guntis Barzdins, Alexandra Birch, Steve Renals, Susanne Weber, Peggy van der Kreeft, Hervé Bourlard, João Prieto, Ondřej Klejch, Peter Bell, Alexandros Lazaridis, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Sebastião Miranda, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, Jeff Mitchell
Abstract	We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring. The platform contains a rich suite of low-level and high-level natural language processing technologies: automatic speech recognition of broadcast media, machine translation, automated tagging and classification of named entities, semantic parsing to detect relationships between entities, and automatic construction / augmentation of factual knowledge bases. Implemented on the Docker platform, it can easily be deployed, customised, and scaled to large volumes of incoming media streams.
Tasks	Machine Translation, Semantic Parsing, Speech Recognition
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-3029/
PDF	https://www.aclweb.org/anthology/E17-3029
PWC	https://paperswithcode.com/paper/the-summa-platform-prototype
Repo
Framework

A System for Identifying and Exploring Text Repetition in Large Historical Document Corpora


Title	A System for Identifying and Exploring Text Repetition in Large Historical Document Corpora
Authors	Aleksi Vesanto, Filip Ginter, Hannu Salmi, Asko Nivala, Tapio Salakoski
Abstract
Tasks	Optical Character Recognition
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0249/
PDF	https://www.aclweb.org/anthology/W17-0249
PWC	https://paperswithcode.com/paper/a-system-for-identifying-and-exploring-text
Repo
Framework

Cascading Multiway Attentions for Document-level Sentiment Classification


Title	Cascading Multiway Attentions for Document-level Sentiment Classification
Authors	Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng Wang, Xu Sun
Abstract	Document-level sentiment classification aims to assign the user reviews a sentiment polarity. Previous methods either just utilized the document content without consideration of user and product information, or did not comprehensively consider what roles the three kinds of information play in text modeling. In this paper, to reasonably use all the information, we present the idea that user, product and their combination can all influence the generation of attentions to words and sentences, when judging the sentiment of a document. With this idea, we propose a cascading multiway attention (CMA) model, where multiple ways of using user and product information are cascaded to influence the generation of attentions on the word and sentence layers. Then, sentences and documents are well modeled by multiple representation vectors, which provide rich information for sentiment classification. Experiments on IMDB and Yelp datasets demonstrate the effectiveness of our model.
Tasks	Product Recommendation, Sentiment Analysis
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1064/
PDF	https://www.aclweb.org/anthology/I17-1064
PWC	https://paperswithcode.com/paper/cascading-multiway-attentions-for-document
Repo
Framework

Measuring Semantic Relations between Human Activities


Title	Measuring Semantic Relations between Human Activities
Authors	Steven Wilson, Rada Mihalcea
Abstract	The things people do in their daily lives can provide valuable insights into their personality, values, and interests. Unstructured text data on social media platforms are rich in behavioral content, and automated systems can be deployed to learn about human activity on a broad scale if these systems are able to reason about the content of interest. In order to aid in the evaluation of such systems, we introduce a new phrase-level semantic textual similarity dataset comprised of human activity phrases, providing a testbed for automated systems that analyze relationships between phrasal descriptions of people's actions. Our set of 1,000 pairs of activities is annotated by human judges across four relational dimensions including similarity, relatedness, motivational alignment, and perceived actor congruence. We evaluate a set of strong baselines for the task of generating scores that correlate highly with human ratings, and we introduce several new approaches to the phrase-level similarity task in the domain of human activities.
Tasks	Semantic Textual Similarity
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1067/
PDF	https://www.aclweb.org/anthology/I17-1067
PWC	https://paperswithcode.com/paper/measuring-semantic-relations-between-human
Repo
Framework

Open-Domain Neural Dialogue Systems


Title	Open-Domain Neural Dialogue Systems
Authors	Yun-Nung Chen, Jianfeng Gao
Abstract	In the past decade, spoken dialogue systems have been the most prominent component in today's personal assistants. A lot of devices have incorporated dialogue system modules, which allow users to speak naturally in order to finish tasks more efficiently. The traditional conversational systems have rather complex and/or modular pipelines. The advance of deep learning technologies has recently risen the applications of neural models to dialogue modeling. Nevertheless, applying deep learning technologies for building robust and scalable dialogue systems is still a challenging task and an open research area as it requires deeper understanding of the classic pipelines as well as detailed knowledge on the benchmark of the models of the prior work and the recent state-of-the-art work. Therefore, this tutorial is designed to focus on an overview of the dialogue system development while describing most recent research for building task-oriented and chit-chat dialogue systems, and summarizing the challenges. We target the audience of students and practitioners who have some deep learning background, who want to get more familiar with conversational dialogue systems.
Tasks	Dialogue Management, Dialogue State Tracking, Intent Classification, Spoken Dialogue Systems, Task-Oriented Dialogue Systems, Text Generation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-5003/
PDF	https://www.aclweb.org/anthology/I17-5003
PWC	https://paperswithcode.com/paper/open-domain-neural-dialogue-systems
Repo
Framework