October 15, 2019


Paper Group NANR 137


E2E NLG Challenge Submission: Towards Controllable Generation of Diverse Natural Language


Title	E2E NLG Challenge Submission: Towards Controllable Generation of Diverse Natural Language
Authors	Henry Elder, Sebastian Gehrmann, Alexander O'Connor, Qun Liu
Abstract	In natural language generation (NLG), the task is to generate utterances from a more abstract input, such as structured data. An added challenge is to generate utterances that contain an accurate representation of the input, while reflecting the fluency and variety of human-generated text. In this paper, we report experiments with NLG models that can be used in task oriented dialogue systems. We explore the use of additional input to the model to encourage diversity and control of outputs. While our submission does not rank highly using automated metrics, qualitative investigation of generated utterances suggests the use of additional information in neural network NLG systems to be a promising research direction.
Tasks	Machine Translation, Task-Oriented Dialogue Systems, Text Generation
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6556/
PDF	https://www.aclweb.org/anthology/W18-6556
PWC	https://paperswithcode.com/paper/e2e-nlg-challenge-submission-towards
Repo
Framework

An Attribute Enhanced Domain Adaptive Model for Cold-Start Spam Review Detection


Title	An Attribute Enhanced Domain Adaptive Model for Cold-Start Spam Review Detection
Authors	Zhenni You, Tieyun Qian, Bing Liu
Abstract	Spam detection has long been a research topic in both academia and industry due to its wide applications. Previous studies have mainly focused on extracting linguistic or behavioral features to distinguish spam from legitimate reviews. Such features are either ineffective or take a long time to collect, and are thus hard to apply to cold-start spam review detection tasks. A recent advance leveraged a neural network to encode the textual and behavioral features for the cold-start problem. However, the abundant attribute information is largely neglected by the existing framework. In this paper, we propose a novel deep learning architecture for incorporating entities and their inherent attributes from various domains into a unified framework. Specifically, our model encodes not only the entities of reviewer, item, and review, but also their attributes such as location, date, and price range. Furthermore, we present a domain classifier to adapt the knowledge from one domain to the other. With the abundant attributes in existing entities and knowledge in other domains, we successfully solve the problem of data scarcity in the cold-start settings. Experimental results on two Yelp datasets prove that our proposed framework significantly outperforms the state-of-the-art methods.
Tasks
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1160/
PDF	https://www.aclweb.org/anthology/C18-1160
PWC	https://paperswithcode.com/paper/an-attribute-enhanced-domain-adaptive-model
Repo
Framework

The Role of Syntax During Pronoun Resolution: Evidence from fMRI


Title	The Role of Syntax During Pronoun Resolution: Evidence from fMRI
Authors	Jixing Li, Murielle Fabre, Wen-Ming Luh, John Hale
Abstract	The current study examined the role of syntactic structure during pronoun resolution. We correlated complexity measures derived by the syntax-sensitive Hobbs algorithm and a neural network model for pronoun resolution with brain activity of participants listening to an audiobook during fMRI recording. Compared to the neural network model, the Hobbs algorithm is associated with larger clusters of brain activation in a network including the left Broca's area.
Tasks	Coreference Resolution
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2808/
PDF	https://www.aclweb.org/anthology/W18-2808
PWC	https://paperswithcode.com/paper/the-role-of-syntax-during-pronoun-resolution
Repo
Framework

Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP


Title	Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP
Authors
Abstract
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2900/
PDF	https://www.aclweb.org/anthology/W18-2900
PWC	https://paperswithcode.com/paper/proceedings-of-the-workshop-on-the-relevance
Repo
Framework

Predicting Perceived Age: Both Language Ability and Appearance are Important


Title	Predicting Perceived Age: Both Language Ability and Appearance are Important
Authors	Sarah Plane, Ariel Marvasti, Tyler Egan, Casey Kennington
Abstract	When interacting with robots in a situated spoken dialogue setting, human dialogue partners tend to assign anthropomorphic and social characteristics to those robots. In this paper, we explore the age and educational level that human dialogue partners assign to three different robotic systems, including an un-embodied spoken dialogue system. We found that how a robot speaks is as important to human perceptions as the way the robot looks. Using the data from our experiment, we derived prosodic, emotional, and linguistic features from the participants to train and evaluate a classifier that predicts perceived intelligence, age, and education level.
Tasks	Language Acquisition
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-5014/
PDF	https://www.aclweb.org/anthology/W18-5014
PWC	https://paperswithcode.com/paper/predicting-perceived-age-both-language
Repo
Framework

A Pseudo Label based Dataless Naive Bayes Algorithm for Text Classification with Seed Words


Title	A Pseudo Label based Dataless Naive Bayes Algorithm for Text Classification with Seed Words
Authors	Ximing Li, Bo Yang
Abstract	Traditional supervised text classifiers require a large number of manually labeled documents, which are often expensive to obtain. Recently, dataless text classification has attracted more attention, since it only requires very few seed words of categories that are much cheaper. In this paper, we develop a pseudo-label based dataless Naive Bayes (PL-DNB) classifier with seed words. We initialize pseudo-labels for each document using seed word occurrences, and employ the expectation maximization algorithm to train PL-DNB in a semi-supervised manner. The pseudo-labels are iteratively updated using a mixture of seed word occurrences and estimations of label posteriors. To avoid noisy pseudo-labels, we also consider the information of nearest neighboring documents in the pseudo-label update step, i.e., preserving local neighborhood structure of documents. We empirically show that PL-DNB outperforms traditional dataless text classification algorithms with seed words. Especially, PL-DNB performs well on the imbalanced dataset.
Tasks	Text Classification
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1162/
PDF	https://www.aclweb.org/anthology/C18-1162
PWC	https://paperswithcode.com/paper/a-pseudo-label-based-dataless-naive-bayes
Repo
Framework
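
The abstract above describes initializing pseudo-labels from seed-word occurrences and then training Naive Bayes with expectation maximization. The following is a minimal sketch of that general idea only, not the authors' PL-DNB: it omits the nearest-neighbor term and the mixing of seed-word evidence into later updates, and every function name, the smoothing constant alpha, and the toy data in the usage note are assumptions made for illustration.

```python
import math
from collections import Counter


def init_pseudo_labels(docs, seed_words):
    """Soft label per document from seed-word counts (uniform if none occur)."""
    k = len(seed_words)
    labels = []
    for doc in docs:
        counts = [sum(doc.count(w) for w in seeds) for seeds in seed_words]
        total = sum(counts)
        labels.append([c / total for c in counts] if total else [1.0 / k] * k)
    return labels


def m_step(docs, labels, vocab, alpha=1.0):
    """Re-estimate smoothed class priors and per-class word probabilities."""
    k = len(labels[0])
    prior = [(sum(l[c] for l in labels) + alpha) / (len(docs) + alpha * k)
             for c in range(k)]
    word_probs = []
    for c in range(k):
        weighted = Counter()
        for doc, l in zip(docs, labels):
            for w in doc:
                weighted[w] += l[c]  # soft count weighted by the pseudo-label
        denom = sum(weighted.values()) + alpha * len(vocab)
        word_probs.append({w: (weighted[w] + alpha) / denom for w in vocab})
    return prior, word_probs


def e_step(docs, prior, word_probs):
    """Replace pseudo-labels with normalized class posteriors."""
    labels = []
    for doc in docs:
        logp = [math.log(p) + sum(math.log(wp[w]) for w in doc)
                for p, wp in zip(prior, word_probs)]
        top = max(logp)
        unnorm = [math.exp(x - top) for x in logp]  # subtract max for stability
        z = sum(unnorm)
        labels.append([u / z for u in unnorm])
    return labels


def train(docs, seed_words, iterations=5):
    """Alternate M and E steps, starting from seed-word pseudo-labels."""
    vocab = sorted({w for doc in docs for w in doc})
    labels = init_pseudo_labels(docs, seed_words)
    for _ in range(iterations):
        prior, word_probs = m_step(docs, labels, vocab)
        labels = e_step(docs, prior, word_probs)
    return labels
```

Called as train(docs, seed_words) with tokenized documents and one seed-word list per category, the sketch returns one soft label vector per document. Documents that contain no seed word start from a uniform label and are pulled toward a class through the words they share with seeded documents.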

PROMT Systems for WMT 2018 Shared Translation Task


Title	PROMT Systems for WMT 2018 Shared Translation Task
Authors	Alexander Molchanov
Abstract	This paper describes the PROMT submissions for the WMT 2018 Shared News Translation Task. This year we participated only in the English-Russian language pair. We built two primary neural networks-based systems: 1) a pure Marian-based neural system and 2) a hybrid system which incorporates OpenNMT-based neural post-editing component into our RBMT engine. We also submitted pure rule-based translation (RBMT) for contrast. We show competitive results with both primary submissions which significantly outperform the RBMT baseline.
Tasks	Machine Translation
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-6420/
PDF	https://www.aclweb.org/anthology/W18-6420
PWC	https://paperswithcode.com/paper/promt-systems-for-wmt-2018-shared-translation
Repo
Framework

Visual Question Answering Dataset for Bilingual Image Understanding: A Study of Cross-Lingual Transfer Using Attention Maps


Title	Visual Question Answering Dataset for Bilingual Image Understanding: A Study of Cross-Lingual Transfer Using Attention Maps
Authors	Nobuyuki Shimizu, Na Rong, Takashi Miyazaki
Abstract	Visual question answering (VQA) is a challenging task that requires a computer system to understand both a question and an image. While there is much research on VQA in English, there is a lack of datasets for other languages, and English annotation is not directly applicable in those languages. To deal with this, we have created a Japanese VQA dataset by using crowdsourced annotation with images from the Visual Genome dataset. This is the first such dataset in Japanese. As another contribution, we propose a cross-lingual method for making use of English annotation to improve a Japanese VQA system. The proposed method is based on a popular VQA method that uses an attention mechanism. We use attention maps generated from English questions to help improve the Japanese VQA task. The proposed method experimentally performed better than simply using a monolingual corpus, which demonstrates the effectiveness of using attention maps to transfer cross-lingual information.
Tasks	Cross-Lingual Transfer, Image Captioning, Question Answering, Visual Question Answering
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1163/
PDF	https://www.aclweb.org/anthology/C18-1163
PWC	https://paperswithcode.com/paper/visual-question-answering-dataset-for
Repo
Framework

Scalable Hyperparameter Transfer Learning


Title	Scalable Hyperparameter Transfer Learning
Authors	Valerio Perrone, Rodolphe Jenatton, Matthias W. Seeger, Cedric Archambeau
Abstract	Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization, such as hyperparameter optimization. Typically, BO relies on conventional Gaussian process (GP) regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, GP-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs. We propose a multi-task adaptive Bayesian linear regression model for transfer learning in BO, whose complexity is linear in the function evaluations: one Bayesian linear regression model is associated to each black-box function optimization problem (or task), while transfer learning is achieved by coupling the models through a shared deep neural net. Experiments show that the neural net learns a representation suitable for warm-starting the black-box optimization problems and that BO runs can be accelerated when the target black-box function (e.g., validation loss) is learned together with other related signals (e.g., training loss). The proposed method was found to be at least one order of magnitude faster than methods recently published in the literature.
Tasks	Hyperparameter Optimization, Transfer Learning
Published	2018-12-01
URL	http://papers.nips.cc/paper/7917-scalable-hyperparameter-transfer-learning
PDF	http://papers.nips.cc/paper/7917-scalable-hyperparameter-transfer-learning.pdf
PWC	https://paperswithcode.com/paper/scalable-hyperparameter-transfer-learning
Repo
Framework
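
The abstract above contrasts GP regression, which is cubic in the number of evaluations, with Bayesian linear regression, which is linear in them. The sketch below illustrates only that scaling argument with the standard posterior mean (alpha I + beta Phi^T Phi)^{-1} beta Phi^T y: the data are visited in a single pass, and the only solve is over a d by d system. It is not the paper's model; in particular the features here are a fixed, hand-written basis rather than the output of a shared neural net, and the names blr_posterior_mean, solve, alpha, and beta are assumptions for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def blr_posterior_mean(features, targets, alpha=1e-6, beta=1.0):
    """Posterior mean of the weights under a N(0, I / alpha) prior.

    alpha is the prior precision and beta the noise precision.
    """
    d = len(features[0])
    precision = [[alpha if i == j else 0.0 for j in range(d)] for i in range(d)]
    moment = [0.0] * d
    # Single pass over the evaluations: O(n * d^2), linear in n.
    for phi, y in zip(features, targets):
        for i in range(d):
            moment[i] += beta * phi[i] * y
            for j in range(d):
                precision[i][j] += beta * phi[i] * phi[j]
    # The only solve is over a d x d system, independent of n.
    return solve(precision, moment)
```

With features [1, x] for x = 0..4 and targets 2 + 3x, the returned mean is close to [2, 3]. Doubling the number of evaluations doubles the accumulation work but leaves the size of the linear solve unchanged.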

Policy and Value Transfer in Lifelong Reinforcement Learning


Title	Policy and Value Transfer in Lifelong Reinforcement Learning
Authors	David Abel, Yuu Jinnai, Sophie Yue Guo, George Konidaris, Michael Littman
Abstract	We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution. First, we identify the initial policy that optimizes expected performance over the distribution of tasks for increasingly complex classes of policy and task distributions. We empirically demonstrate the relative performance of each policy class’ optimal element in a variety of simple task distributions. We then consider value-function initialization methods that preserve PAC guarantees while simultaneously minimizing the learning required in two learning algorithms, yielding MaxQInit, a practical new method for value-function-based transfer. We show that MaxQInit performs well in simple lifelong RL experiments.
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2271
PDF	http://proceedings.mlr.press/v80/abel18b/abel18b.pdf
PWC	https://paperswithcode.com/paper/policy-and-value-transfer-in-lifelong
Repo
Framework
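
The abstract above names MaxQInit as a value-function initialization method but does not spell out the rule. The sketch below assumes, going by the method's name, that each state-action value for a new task is initialized to the maximum of the values learned for that pair on earlier tasks, with an optimistic default for pairs never seen before. The function name, the dictionary representation of Q-tables, and the default value are all hypothetical.

```python
def max_q_init(previous_q_tables, state_actions, default=1.0):
    """Initialize Q-values for a new task from Q-tables of earlier tasks.

    previous_q_tables: list of dicts mapping (state, action) to a value.
    state_actions: iterable of (state, action) pairs in the new task.
    default: optimistic value for pairs absent from every earlier table
             (assumed here to be an upper bound on attainable value).
    """
    init = {}
    for pair in state_actions:
        seen = [q[pair] for q in previous_q_tables if pair in q]
        init[pair] = max(seen) if seen else default
    return init
```

Under this assumption the initial value for a pair is never lower than any value learned for it on the earlier tasks provided, so the initialization stays optimistic with respect to those tasks while being tighter than a uniform upper bound.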

MCDTB: A Macro-level Chinese Discourse TreeBank


Title	MCDTB: A Macro-level Chinese Discourse TreeBank
Authors	Feng Jiang, Sheng Xu, Xiaomin Chu, Peifeng Li, Qiaoming Zhu, Guodong Zhou
Abstract	In view of the differences between the annotations of micro and macro discourse relationships, this paper describes the relevant experiments on the construction of the Macro Chinese Discourse Treebank (MCDTB), a higher-level Chinese discourse corpus. Following RST (Rhetorical Structure Theory), we annotate the macro discourse information, including discourse structure, nuclearity and relationship, and the additional discourse information, including topic sentences, lead and abstract, to make the macro discourse annotation more objective and accurate. Finally, we annotated 720 articles with a Kappa value greater than 0.6. Preliminary experiments on this corpus verify the computability of MCDTB.
Tasks	Reading Comprehension
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1296/
PDF	https://www.aclweb.org/anthology/C18-1296
PWC	https://paperswithcode.com/paper/mcdtb-a-macro-level-chinese-discourse
Repo
Framework

Template-based multilingual football reports generation using Wikidata as a knowledge base


Title	Template-based multilingual football reports generation using Wikidata as a knowledge base
Authors	Lorenzo Gatti, Chris van der Lee, Mariët Theune
Abstract	This paper presents a new version of a football reports generation system called PASS. The original version generated Dutch text and relied on a limited hand-crafted knowledge base. We describe how, in a short amount of time, we extended PASS to produce English texts, exploiting machine translation and Wikidata as a large-scale source of multilingual knowledge.
Tasks	Machine Translation, Text Generation
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6523/
PDF	https://www.aclweb.org/anthology/W18-6523
PWC	https://paperswithcode.com/paper/template-based-multilingual-football-reports
Repo
Framework

NICT Self-Training Approach to Neural Machine Translation at NMT-2018


Title	NICT Self-Training Approach to Neural Machine Translation at NMT-2018
Authors	Kenji Imamura, Eiichiro Sumita
Abstract	This paper describes the NICT neural machine translation system submitted at the NMT-2018 shared task. A characteristic of our approach is the introduction of self-training. Since our self-training does not change the model structure, it does not influence the efficiency of translation, such as the translation speed. The experimental results showed that the translation quality improved not only in the sequence-to-sequence (seq-to-seq) models but also in the transformer models.
Tasks	Machine Translation
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2713/
PDF	https://www.aclweb.org/anthology/W18-2713
PWC	https://paperswithcode.com/paper/nict-self-training-approach-to-neural-machine
Repo
Framework

Automatic Glossing in a Low-Resource Setting for Language Documentation


Title	Automatic Glossing in a Low-Resource Setting for Language Documentation
Authors	Sarah Moeller, Mans Hulden
Abstract	Morphological analysis of morphologically rich and low-resource languages is important to both descriptive linguistics and natural language processing. Field documentary efforts usually procure analyzed data in cooperation with native speakers who are capable of providing some level of linguistic information. Manually annotating such data is very expensive and the traditional process is arguably too slow in the face of language endangerment and loss. We report on a case study of learning to automatically gloss a Nakh-Daghestanian language, Lezgi, from a very small amount of seed data. We compare a conditional random field based sequence labeler and a neural encoder-decoder model and show that a nearly 0.9 F1-score on labeled accuracy of morphemes can be achieved with 3,000 words of transcribed oral text. Errors are mostly limited to morphemes with high allomorphy. These results are potentially useful for developing rapid annotation and fieldwork tools to support documentation of morphologically rich, endangered languages.
Tasks	Morphological Analysis
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4809/
PDF	https://www.aclweb.org/anthology/W18-4809
PWC	https://paperswithcode.com/paper/automatic-glossing-in-a-low-resource-setting
Repo
Framework

Multi-glance Reading Model for Text Understanding


Title	Multi-glance Reading Model for Text Understanding
Authors	Pengcheng Zhu, Yujiu Yang, Wenqiang Gao, Yi Liu
Abstract	In recent years, a variety of recurrent neural networks have been proposed, e.g., LSTM. However, existing models read the text only once and so cannot describe the repeated reading that occurs in reading comprehension. In fact, when reading or analyzing a text, we may read it several times rather than once if we couldn't understand it well. So, how can this kind of reading behavior be modeled? To address the issue, we propose a multi-glance mechanism (MGM) for modeling the habit of reading behavior. In the proposed framework, the actual reading process can be fully simulated, and then the obtained information can be consistent with the task. Based on the multi-glance mechanism, we design two types of recurrent neural network models for repeated reading: Glance Cell Model (GCM) and Glance Gate Model (GGM). Visualization analysis of the GCM and the GGM demonstrates the effectiveness of multi-glance mechanisms. Experimental results on the large-scale datasets show that the proposed methods can achieve better performance.
Tasks	Document Classification, Machine Translation, Reading Comprehension, Sentiment Analysis
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2804/
PDF	https://www.aclweb.org/anthology/W18-2804
PWC	https://paperswithcode.com/paper/multi-glance-reading-model-for-text
Repo
Framework