October 16, 2019


Paper Group NAWR 23

Improving Language Understanding by Generative Pre-Training. Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields. Learning semantic similarity in a continuous space. A Korean Knowledge Extraction System for Enriching a KBox. Automatic Opinion Question Generation. A Neural Layered Model for Nested Named Entity Recognition. …

Improving Language Understanding by Generative Pre-Training


Title	Improving Language Understanding by Generative Pre-Training
Authors	Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
Abstract	Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
Tasks	Document Classification, Language Modelling, Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity
Published	2018-06-11
URL	https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
PDF	https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
PWC	https://paperswithcode.com/paper/improving-language-understanding-by
Repo	https://github.com/openai/finetune-transformer-lm
Framework	tf
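
The "task-aware input transformations" mentioned in the abstract can be pictured as flattening each structured input into a single token sequence before it reaches the pre-trained model. The sketch below is only an illustration under assumptions: the marker strings `<s>`, `$`, and `<e>` and all function names are hypothetical placeholders, not identifiers taken from the linked repository.

```python
# Hypothetical marker tokens (assumed for illustration only).
START, DELIM, EXTRACT = "<s>", "$", "<e>"


def pair_input(first, second):
    """Join two token lists into one sequence with a delimiter in between."""
    return [START, *first, DELIM, *second, EXTRACT]


def entailment_input(premise, hypothesis):
    """Entailment: premise and hypothesis form a single ordered sequence."""
    return pair_input(premise, hypothesis)


def similarity_inputs(sentence_a, sentence_b):
    """Similarity has no inherent ordering, so both orderings are produced."""
    return [pair_input(sentence_a, sentence_b), pair_input(sentence_b, sentence_a)]


def multiple_choice_inputs(context, answers):
    """Multiple choice: one sequence per candidate answer, sharing the context."""
    return [pair_input(context, answer) for answer in answers]


if __name__ == "__main__":
    print(entailment_input(["it", "rains"], ["it", "is", "wet"]))
    print(multiple_choice_inputs(["who", "won", "?"], [["alice"], ["bob"]]))
```

Because every task is reduced to plain token sequences, the same model can be reused across tasks with only a small task-specific output layer, which is the "minimal changes to the model architecture" the abstract refers to.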

Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields


Title	Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields
Authors	Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A.
Abstract	Molecular dynamics (MD) simulations employing classical force fields constitute the cornerstone of contemporary atomistic modeling in chemistry, biology, and materials science. However, the predictive power of these simulations is only as good as the underlying interatomic potential. Classical potentials often fail to faithfully capture key quantum effects in molecules and materials. Here we enable the direct construction of flexible molecular force fields from high-level ab initio calculations by incorporating spatial and temporal physical symmetries into a gradient-domain machine learning (sGDML) model in an automatic data-driven way. The developed sGDML approach faithfully reproduces global force fields at quantum-chemical CCSD(T) level of accuracy and allows converged molecular dynamics simulations with fully quantized electrons and nuclei. We present MD simulations for flexible molecules with up to a few dozen atoms and provide insights into the dynamical behavior of these molecules. Our approach provides the key missing ingredient for achieving spectroscopic accuracy in molecular simulations.
Tasks	MD17 dataset
Published	2018-08-24
URL	https://www.nature.com/articles/s41467-018-06169-2
PDF	https://www.nature.com/articles/s41467-018-06169-2.pdf
PWC	https://paperswithcode.com/paper/towards-exact-molecular-dynamics-simulations-1
Repo	https://github.com/stefanch/sGDML
Framework	pytorch

Learning semantic similarity in a continuous space


Title	Learning semantic similarity in a continuous space
Authors	Michel Deudon
Abstract	We address the problem of learning semantic representation of questions to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover’s Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to “travel” to match the intent of another question. We first learn to repeat, reformulate questions to infer intents as normal distributions with a deep generative model [2] (variational auto encoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework. Among known models that can read sentences individually, our proposed framework achieves competitive results on Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from Information Theory.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2018-12-01
URL	http://papers.nips.cc/paper/7377-learning-semantic-similarity-in-a-continuous-space
PDF	http://papers.nips.cc/paper/7377-learning-semantic-similarity-in-a-continuous-space.pdf
PWC	https://paperswithcode.com/paper/learning-semantic-similarity-in-a-continuous
Repo	https://github.com/MichelDeudon/variational-siamese-network
Framework	tf
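
For two normal distributions with diagonal covariances, the Wasserstein-2 distance named in the abstract has a simple closed form: the squared distance is the squared difference of the means plus the squared difference of the standard deviations. The sketch below assumes diagonal covariances (an assumption made here for illustration; the abstract does not state the covariance structure), and the function name is hypothetical.

```python
import math


def wasserstein2_diag(mu_p, sigma_p, mu_q, sigma_q):
    """2-Wasserstein distance between N(mu_p, diag(sigma_p^2)) and N(mu_q, diag(sigma_q^2)).

    All arguments are equal-length sequences of floats; sigma_* hold
    standard deviations, not variances.
    """
    if not (len(mu_p) == len(sigma_p) == len(mu_q) == len(sigma_q)):
        raise ValueError("all inputs must have the same length")
    # Squared Euclidean distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_p, mu_q))
    # Squared Euclidean distance between the standard deviations.
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_p, sigma_q))
    return math.sqrt(mean_term + std_term)


if __name__ == "__main__":
    # Two hypothetical "intent" distributions in a 2-dimensional latent space.
    print(wasserstein2_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0]))
```

The distance is zero only when both the means and the standard deviations coincide, so two questions are treated as identical in intent only if their latent distributions match exactly.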

A Korean Knowledge Extraction System for Enriching a KBox


Title	A Korean Knowledge Extraction System for Enriching a KBox
Authors	Sangha Nam, Eun-kyung Kim, Jiho Kim, Yoosung Jung, Kijong Han, Key-Sun Choi
Abstract	The increased demand for structured knowledge has created considerable interest in knowledge extraction from natural language sentences. This study presents a new Korean knowledge extraction system and web interface for enriching a KBox knowledge base that expands based on the Korean DBpedia. The aim is to create an endpoint where knowledge can be extracted and added to KBox anytime and anywhere.
Tasks	Entity Linking, Relation Extraction
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-2005/
PDF	https://www.aclweb.org/anthology/C18-2005
PWC	https://paperswithcode.com/paper/a-korean-knowledge-extraction-system-for
Repo	https://github.com/machinereading/wisekb-demo
Framework	none

Automatic Opinion Question Generation


Title	Automatic Opinion Question Generation
Authors	Yllias Chali, Tina Baghaee
Abstract	We study the problem of opinion question generation from sentences with the help of community-based question answering systems. For this purpose, we use a sequence-to-sequence attentional model, and we adopt a coverage mechanism to prevent sentences from repeating themselves. Experimental results on the Amazon question/answer dataset show an improvement over state-of-the-art question generation systems in automatic evaluation metrics as well as human evaluations.
Tasks	Community Question Answering, Question Answering, Question Generation, Reading Comprehension, Text Generation
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6518/
PDF	https://www.aclweb.org/anthology/W18-6518
PWC	https://paperswithcode.com/paper/automatic-opinion-question-generation
Repo	https://github.com/Tina-19/Question-Generation
Framework	pytorch

A Neural Layered Model for Nested Named Entity Recognition


Title	A Neural Layered Model for Nested Named Entity Recognition
Authors	Meizhi Ju, Makoto Miwa, Sophia Ananiadou
Abstract	Entity mentions embedded in longer entity mentions are referred to as nested entities. Most named entity recognition (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. Each flat NER layer is based on the state-of-the-art flat NER model that captures sequential context representation with bidirectional Long Short-Term Memory (LSTM) layer and feeds it to the cascaded CRF layer. Our model merges the output of the LSTM layer in the current flat NER layer to build new representation for detected entities and subsequently feeds them into the next flat NER layer. This allows our model to extract outer entities by taking full advantage of information encoded in their corresponding inner entities, in an inside-to-outside way. Our model dynamically stacks the flat NER layers until no outer entities are extracted. Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on nested NER, achieving 74.7% and 72.2% on GENIA and ACE2005 datasets, respectively, in terms of F-score.
Tasks	Entity Linking, Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition, Relation Extraction
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-1131/
PDF	https://www.aclweb.org/anthology/N18-1131
PWC	https://paperswithcode.com/paper/a-neural-layered-model-for-nested-named
Repo	https://github.com/meizhiju/layered-bilstm-crf
Framework	none

Adversarial Feature Adaptation for Cross-lingual Relation Classification


Title	Adversarial Feature Adaptation for Cross-lingual Relation Classification
Authors	Bowei Zou, Zengzhuang Xu, Yu Hong, Guodong Zhou
Abstract	Relation Classification aims to classify the semantic relationship between two marked entities in a given sentence. It plays a vital role in a variety of natural language processing applications. Most existing methods focus on exploiting mono-lingual data, e.g., in English, due to the lack of annotated data in other languages. In this paper, we come up with a feature adaptation approach for cross-lingual relation classification, which employs a generative adversarial network (GAN) to transfer feature representations from one language with rich annotated data to another language with scarce annotated data. Such a feature adaptation approach enables feature imitation via the competition between a relation classification network and a rival discriminator. Experimental results on the ACE 2005 multilingual training corpus, treating English as the source language and Chinese the target, demonstrate the effectiveness of our proposed approach, yielding an improvement of 5.7% over the state-of-the-art.
Tasks	Domain Adaptation, Knowledge Base Population, Question Answering, Relation Classification, Representation Learning
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1037/
PDF	https://www.aclweb.org/anthology/C18-1037
PWC	https://paperswithcode.com/paper/adversarial-feature-adaptation-for-cross
Repo	https://github.com/zoubowei/feature_adaptation4RC
Framework	pytorch

Training for Diversity in Image Paragraph Captioning


Title	Training for Diversity in Image Paragraph Captioning
Authors	Luke Melas-Kyriazi, Alexander Rush, George Han
Abstract	Image paragraph captioning models aim to produce detailed descriptions of a source image. These models use similar techniques as standard image captioning models, but they have encountered issues in text generation, notably a lack of diversity between sentences, that have limited their effectiveness. In this work, we consider applying sequence-level training for this task. We find that standard self-critical training produces poor results, but when combined with an integrated penalty on trigram repetition produces much more diverse paragraphs. This simple training approach improves on the best result on the Visual Genome paragraph captioning dataset from 16.9 to 30.6 CIDEr, with gains on METEOR and BLEU as well, without requiring any architectural changes.
Tasks	Image Captioning, Image Paragraph Captioning, Machine Translation, Object Detection, Policy Gradient Methods, Text Generation
Published	2018-10-01
URL	https://www.aclweb.org/anthology/D18-1084/
PDF	https://www.aclweb.org/anthology/D18-1084
PWC	https://paperswithcode.com/paper/training-for-diversity-in-image-paragraph
Repo	https://github.com/lukemelas/image-paragraph-captioning
Framework	pytorch
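
The trigram repetition that the abstract penalizes can be measured with a few lines of counting code. The sketch below only counts repeated trigrams in an already generated paragraph; it is not the training-time penalty from the paper, and the function name and whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def repeated_trigram_count(tokens):
    """Count trigram occurrences beyond the first for each distinct trigram."""
    # zip stops at the shortest input, so sequences under 3 tokens yield nothing.
    trigrams = zip(tokens, tokens[1:], tokens[2:])
    counts = Counter(trigrams)
    return sum(count - 1 for count in counts.values() if count > 1)


if __name__ == "__main__":
    paragraph = "a man is standing . a man is sitting .".split()
    print(repeated_trigram_count(paragraph))
```

A paragraph whose sentences all open with the same words scores high on this count, which is the lack of diversity between sentences that the abstract describes.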

Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory


Title	Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory
Authors	Tuomo Hiippala, Serafina Orekhova
Abstract
Tasks	Question Answering, Visual Question Answering
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1303/
PDF	https://www.aclweb.org/anthology/L18-1303
PWC	https://paperswithcode.com/paper/enhancing-the-ai2-diagrams-dataset-using
Repo	https://github.com/DigitalGeographyLab/AI2D-RST
Framework	none

Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection


Title	Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection
Authors	Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel
Abstract	Grammatical error correction, like other machine learning tasks, greatly benefits from large quantities of high quality training data, which is typically expensive to produce. While writing a program to automatically generate realistic grammatical errors would be difficult, one could learn the distribution of naturally-occurring errors and attempt to introduce them into other datasets. Initial work on inducing errors in this way using statistical machine translation has shown promise; we investigate cheaply constructing synthetic samples, given a small corpus of human-annotated data, using an off-the-rack attentive sequence-to-sequence model and a straight-forward post-processing procedure. Our approach yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection, and a previously introduced model to gain further improvements of over 5% F0.5 score. When attempting to determine if a given sentence is synthetic, a human annotator at best achieves 39.39 F1 score, indicating that our model generates mostly human-like instances.
Tasks	Grammatical Error Correction, Grammatical Error Detection, Machine Translation
Published	2018-10-01
URL	https://www.aclweb.org/anthology/D18-1541/
PDF	https://www.aclweb.org/anthology/D18-1541
PWC	https://paperswithcode.com/paper/wronging-a-right-generating-better-errors-to
Repo	https://github.com/skasewa/wronging
Framework	tf
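
The F0.5 and F1 scores quoted in the abstract are both instances of the general F-beta measure, where beta below 1 weights precision more heavily than recall. A minimal sketch of that standard formula follows; the function name and the zero-division convention are choices made here, not taken from the linked repository.

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        # Convention assumed here: the score is 0 when both inputs are 0.
        return 0.0
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    # With beta = 0.5, high precision is rewarded more than high recall.
    print(f_beta(1.0, 0.5))
    print(f_beta(0.5, 1.0))
    # beta = 1 gives the usual harmonic mean (F1).
    print(f_beta(0.5, 0.5, beta=1.0))
```

Error detection is commonly scored with F0.5 because flagging a correct word as an error is usually considered more costly than missing an error.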

Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU


Title	Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Authors	Normunds Gruzitis, Lauma Pretkalnina, Baiba Saulite, Laura Rituma, Gunta Nespore-Berzkalne, Arturs Znotins, Peteris Paikens
Abstract
Tasks	Abstractive Text Summarization, Coreference Resolution, Entity Linking, Knowledge Base Population, Named Entity Recognition, Semantic Parsing, Semantic Role Labeling, Text Summarization
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1714/
PDF	https://www.aclweb.org/anthology/L18-1714
PWC	https://paperswithcode.com/paper/creation-of-a-balanced-state-of-the-art
Repo	https://github.com/LUMII-AILab/FullStack
Framework	none

Automatic Post-Editing of Machine Translation: A Neural Programmer-Interpreter Approach


Title	Automatic Post-Editing of Machine Translation: A Neural Programmer-Interpreter Approach
Authors	Thuy-Trang Vu, Gholamreza Haffari
Abstract	Automated Post-Editing (PE) is the task of automatically correcting common and repetitive errors found in machine translation (MT) output. In this paper, we present a neural programmer-interpreter approach to this task, resembling the way that humans perform post-editing using discrete edit operations, which we refer to as programs. Our model outperforms previous neural models for inducing PE programs on the WMT17 APE task for German-English by up to +1 BLEU score and -0.7 TER scores.
Tasks	Automatic Post-Editing, Machine Translation
Published	2018-10-01
URL	https://www.aclweb.org/anthology/D18-1341/
PDF	https://www.aclweb.org/anthology/D18-1341
PWC	https://paperswithcode.com/paper/automatic-post-editing-of-machine-translation
Repo	https://github.com/trangvu/ape-npi
Framework	tf

SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension


Title	SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
Authors	Taeuk Kim, Jihun Choi, Sang-goo Lee
Abstract	We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018. It is a simple neural network consisting of three parts, collectively judging whether the logic built on a set of given sentences (a claim, reason, and warrant) is plausible or not. The model utilizes contextualized word vectors pre-trained on large machine translation (MT) datasets as a form of transfer learning, which can help to mitigate the lack of training data. Quantitative analysis shows that simply leveraging LSTMs trained on MT datasets outperforms several baselines and non-transferred models, achieving accuracies of about 70% on the development set and about 60% on the test set.
Tasks	Machine Translation, Transfer Learning
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1182/
PDF	https://www.aclweb.org/anthology/S18-1182
PWC	https://paperswithcode.com/paper/snu_ids-at-semeval-2018-task-12-sentence-1
Repo	https://github.com/galsang/SemEval2018-task12
Framework	pytorch

GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task


Title	GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task
Authors	HongSeok Choi, Hyunju Lee
Abstract	This paper describes our GIST team system that participated in SemEval-2018 Argument Reasoning Comprehension task (Task 12). Here, we address two challenging factors: unstated common senses and two lexically close warrants that lead to contradicting claims. A key idea for our system is full use of transfer learning from the Natural Language Inference (NLI) task to this task. We used Enhanced Sequential Inference Model (ESIM) to learn the NLI dataset. We describe how to use ESIM for transfer learning to choose the correct warrant through a proposed system. We show comparable results through ablation experiments. Our system ranked 1st among 22 systems, outperforming all the other systems by more than 10%.
Tasks	Common Sense Reasoning, Natural Language Inference, Transfer Learning
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1122/
PDF	https://www.aclweb.org/anthology/S18-1122
PWC	https://paperswithcode.com/paper/gist-at-semeval-2018-task-12-a-network
Repo	https://github.com/hongking9/SemEval-2018-task12
Framework	none

NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager


Title	NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager
Authors	Idris Yusupov, Yurii Kuratov
Abstract	We present bot#1337: a dialog system developed for the 1st NIPS Conversational Intelligence Challenge 2017 (ConvAI). The aim of the competition was to implement a bot capable of conversing with humans based on a given passage of text. To enable conversation, we implemented a set of skills for our bot, including chit-chat, topic detection, text summarization, question answering and question generation. The system has been trained in a supervised setting using a dialogue manager to select an appropriate skill for generating a response. The latter allows a developer to focus on the skill implementation rather than the finite state machine based dialog manager. The proposed system bot#1337 won the competition with an average dialogue quality score of 2.78 out of 5 given by human evaluators. Source code and trained models for the bot#1337 are available on GitHub.
Tasks	Goal-Oriented Dialog, Goal-Oriented Dialogue Systems, Machine Translation, Question Answering, Question Generation, Short-Text Conversation, Text Summarization
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1312/
PDF	https://www.aclweb.org/anthology/C18-1312
PWC	https://paperswithcode.com/paper/nips-conversational-intelligence-challenge
Repo	https://github.com/sld/convai-bot-1337
Framework	none