October 16, 2019

Paper Group NAWR 23

Improving Language Understanding by Generative Pre-Training. Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields. Learning semantic similarity in a continuous space. A Korean Knowledge Extraction System for Enriching a KBox. Automatic Opinion Question Generation. A Neural Layered Model for Nested Named Entity Recognition. …

Improving Language Understanding by Generative Pre-Training

Title Improving Language Understanding by Generative Pre-Training
Authors Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
Abstract Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
Tasks Document Classification, Language Modelling, Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity
Published 2018-06-11
URL https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
PDF https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
PWC https://paperswithcode.com/paper/improving-language-understanding-by
Repo https://github.com/openai/finetune-transformer-lm
Framework tf
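
The abstract above mentions task-aware input transformations but does not spell them out. The following is a minimal sketch of the general idea, assuming structured inputs are flattened into one token sequence so that a single pre-trained language model can consume them unchanged. The marker strings `<s>`, `$`, and `<e>` and both function names are hypothetical placeholders, not identifiers from the linked repository.

```python
# Hypothetical special tokens; the real vocabulary entries may differ.
START, DELIM, EXTRACT = "<s>", "$", "<e>"


def entailment_input(premise, hypothesis):
    """Flatten a (premise, hypothesis) pair into a single token sequence."""
    return [START, *premise, DELIM, *hypothesis, EXTRACT]


def multiple_choice_inputs(context, answers):
    """Build one sequence per candidate answer; each would be scored separately."""
    return [[START, *context, DELIM, *answer, EXTRACT] for answer in answers]


if __name__ == "__main__":
    print(entailment_input(["it", "rains"], ["it", "is", "wet"]))
    print(multiple_choice_inputs(["sky", "is"], [["blue"], ["green"]]))
```

Because only the input layout changes per task, the model architecture itself needs little more than a small output layer on top, which is the point the abstract makes about minimal architectural changes.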

Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields

Title Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields
Authors Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A.
Abstract Molecular dynamics (MD) simulations employing classical force fields constitute the cornerstone of contemporary atomistic modeling in chemistry, biology, and materials science. However, the predictive power of these simulations is only as good as the underlying interatomic potential. Classical potentials often fail to faithfully capture key quantum effects in molecules and materials. Here we enable the direct construction of flexible molecular force fields from high-level ab initio calculations by incorporating spatial and temporal physical symmetries into a gradient-domain machine learning (sGDML) model in an automatic data-driven way. The developed sGDML approach faithfully reproduces global force fields at quantum-chemical CCSD(T) level of accuracy and allows converged molecular dynamics simulations with fully quantized electrons and nuclei. We present MD simulations for flexible molecules with up to a few dozen atoms and provide insights into the dynamical behavior of these molecules. Our approach provides the key missing ingredient for achieving spectroscopic accuracy in molecular simulations.
Tasks MD17 dataset
Published 2018-08-24
URL https://www.nature.com/articles/s41467-018-06169-2
PDF https://www.nature.com/articles/s41467-018-06169-2.pdf
PWC https://paperswithcode.com/paper/towards-exact-molecular-dynamics-simulations-1
Repo https://github.com/stefanch/sGDML
Framework pytorch

Learning semantic similarity in a continuous space

Title Learning semantic similarity in a continuous space
Authors Michel Deudon
Abstract We address the problem of learning semantic representation of questions to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover’s Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to “travel” to match the intent of another question. We first learn to repeat, reformulate questions to infer intents as normal distributions with a deep generative model [2] (variational auto encoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework. Among known models that can read sentences individually, our proposed framework achieves competitive results on Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from Information Theory.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2018-12-01
URL http://papers.nips.cc/paper/7377-learning-semantic-similarity-in-a-continuous-space
PDF http://papers.nips.cc/paper/7377-learning-semantic-similarity-in-a-continuous-space.pdf
PWC https://paperswithcode.com/paper/learning-semantic-similarity-in-a-continuous
Repo https://github.com/MichelDeudon/variational-siamese-network
Framework tf
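
The abstract above measures similarity as a Wasserstein-2 distance between intents represented as normal distributions. As a hedged illustration, the sketch below computes that distance in the special case of diagonal covariances, where it reduces to the Euclidean distance between the stacked means and standard deviations. The diagonal-covariance assumption and the function name `wasserstein2_diag` are ours for illustration and are not taken from the linked repository.

```python
import math


def wasserstein2_diag(mu1, sigma1, mu2, sigma2):
    """Wasserstein-2 distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)).

    For diagonal Gaussians: W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
    """
    if not (len(mu1) == len(sigma1) == len(mu2) == len(sigma2)):
        raise ValueError("all inputs must have the same dimension")
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma1, sigma2))
    return math.sqrt(mean_term + std_term)


if __name__ == "__main__":
    # Same spread, means 3 and 4 apart along each axis.
    print(wasserstein2_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0]))
```

Unlike a point-wise embedding distance, this value also grows when two questions have the same mean intent but different uncertainty.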

A Korean Knowledge Extraction System for Enriching a KBox

Title A Korean Knowledge Extraction System for Enriching a KBox
Authors Sangha Nam, Eun-kyung Kim, Jiho Kim, Yoosung Jung, Kijong Han, Key-Sun Choi
Abstract The increased demand for structured knowledge has created considerable interest in knowledge extraction from natural language sentences. This study presents a new Korean knowledge extraction system and web interface for enriching a KBox knowledge base that expands based on the Korean DBpedia. The aim is to create an endpoint where knowledge can be extracted and added to KBox anytime and anywhere.
Tasks Entity Linking, Relation Extraction
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-2005/
PDF https://www.aclweb.org/anthology/C18-2005
PWC https://paperswithcode.com/paper/a-korean-knowledge-extraction-system-for
Repo https://github.com/machinereading/wisekb-demo
Framework none

Automatic Opinion Question Generation

Title Automatic Opinion Question Generation
Authors Yllias Chali, Tina Baghaee
Abstract We study the problem of opinion question generation from sentences with the help of community-based question answering systems. For this purpose, we use a sequence-to-sequence attentional model, and we adopt a coverage mechanism to prevent sentences from repeating themselves. Experimental results on the Amazon question/answer dataset show an improvement in automatic evaluation metrics as well as human evaluations over the state-of-the-art question generation systems.
Tasks Community Question Answering, Question Answering, Question Generation, Reading Comprehension, Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6518/
PDF https://www.aclweb.org/anthology/W18-6518
PWC https://paperswithcode.com/paper/automatic-opinion-question-generation
Repo https://github.com/Tina-19/Question-Generation
Framework pytorch

A Neural Layered Model for Nested Named Entity Recognition

Title A Neural Layered Model for Nested Named Entity Recognition
Authors Meizhi Ju, Makoto Miwa, Sophia Ananiadou
Abstract Entity mentions embedded in longer entity mentions are referred to as nested entities. Most named entity recognition (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. Each flat NER layer is based on the state-of-the-art flat NER model that captures sequential context representation with bidirectional Long Short-Term Memory (LSTM) layer and feeds it to the cascaded CRF layer. Our model merges the output of the LSTM layer in the current flat NER layer to build new representation for detected entities and subsequently feeds them into the next flat NER layer. This allows our model to extract outer entities by taking full advantage of information encoded in their corresponding inner entities, in an inside-to-outside way. Our model dynamically stacks the flat NER layers until no outer entities are extracted. Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on nested NER, achieving 74.7% and 72.2% on GENIA and ACE2005 datasets, respectively, in terms of F-score.
Tasks Entity Linking, Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition, Relation Extraction
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1131/
PDF https://www.aclweb.org/anthology/N18-1131
PWC https://paperswithcode.com/paper/a-neural-layered-model-for-nested-named
Repo https://github.com/meizhiju/layered-bilstm-crf
Framework none
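
The inside-to-outside stacking described in the abstract above can be sketched as a loop that reruns a flat tagger until it finds nothing new. This is only a control-flow illustration: the rule-based `toy_tagger` is a hypothetical stand-in for the BiLSTM-CRF layer, and merging an entity into its label string replaces the paper's merging of LSTM outputs. None of the names come from the linked repository.

```python
def toy_tagger(reprs):
    """Hypothetical flat tagger: greedy match of unit sequences against fixed rules.

    Returns non-overlapping (start, end, label) spans over the input units.
    """
    rules = {("IL-2",): "PROTEIN", ("PROTEIN", "gene"): "DNA"}
    spans, i = [], 0
    while i < len(reprs):
        for n in (2, 1):  # prefer the longer match
            key = tuple(reprs[i:i + n])
            if len(key) == n and key in rules:
                spans.append((i, i + n, rules[key]))
                i += n
                break
        else:
            i += 1
    return spans


def stack_flat_layers(tokens, flat_tagger, max_layers=10):
    """Apply flat layers repeatedly; each layer sees earlier entities as single units."""
    # Each unit is (representation, token_start, token_end).
    units = [(tok, i, i + 1) for i, tok in enumerate(tokens)]
    found = []
    for _ in range(max_layers):  # guard against rules that never converge
        spans = flat_tagger([u[0] for u in units])
        if not spans:
            break  # no outer entities extracted: stop stacking
        merged, cursor = [], 0
        for start, end, label in spans:
            merged.extend(units[cursor:start])
            tok_start, tok_end = units[start][1], units[end - 1][2]
            found.append((tok_start, tok_end, label))
            merged.append((label, tok_start, tok_end))  # entity becomes one unit
            cursor = end
        merged.extend(units[cursor:])
        units = merged
    return found


if __name__ == "__main__":
    print(stack_flat_layers(["the", "IL-2", "gene"], toy_tagger))
```

In this toy run the first layer tags the inner mention and the second layer uses that result to tag the enclosing one, with spans reported as half-open token offsets.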

Adversarial Feature Adaptation for Cross-lingual Relation Classification

Title Adversarial Feature Adaptation for Cross-lingual Relation Classification
Authors Bowei Zou, Zengzhuang Xu, Yu Hong, Guodong Zhou
Abstract Relation Classification aims to classify the semantic relationship between two marked entities in a given sentence. It plays a vital role in a variety of natural language processing applications. Most existing methods focus on exploiting mono-lingual data, e.g., in English, due to the lack of annotated data in other languages. In this paper, we come up with a feature adaptation approach for cross-lingual relation classification, which employs a generative adversarial network (GAN) to transfer feature representations from one language with rich annotated data to another language with scarce annotated data. Such a feature adaptation approach enables feature imitation via the competition between a relation classification network and a rival discriminator. Experimental results on the ACE 2005 multilingual training corpus, treating English as the source language and Chinese the target, demonstrate the effectiveness of our proposed approach, yielding an improvement of 5.7% over the state-of-the-art.
Tasks Domain Adaptation, Knowledge Base Population, Question Answering, Relation Classification, Representation Learning
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1037/
PDF https://www.aclweb.org/anthology/C18-1037
PWC https://paperswithcode.com/paper/adversarial-feature-adaptation-for-cross
Repo https://github.com/zoubowei/feature_adaptation4RC
Framework pytorch

Training for Diversity in Image Paragraph Captioning

Title Training for Diversity in Image Paragraph Captioning
Authors Luke Melas-Kyriazi, Alexander Rush, George Han
Abstract Image paragraph captioning models aim to produce detailed descriptions of a source image. These models use similar techniques as standard image captioning models, but they have encountered issues in text generation, notably a lack of diversity between sentences, that have limited their effectiveness. In this work, we consider applying sequence-level training for this task. We find that standard self-critical training produces poor results, but when combined with an integrated penalty on trigram repetition produces much more diverse paragraphs. This simple training approach improves on the best result on the Visual Genome paragraph captioning dataset from 16.9 to 30.6 CIDEr, with gains on METEOR and BLEU as well, without requiring any architectural changes.
Tasks Image Captioning, Image Paragraph Captioning, Machine Translation, Object Detection, Policy Gradient Methods, Text Generation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1084/
PDF https://www.aclweb.org/anthology/D18-1084
PWC https://paperswithcode.com/paper/training-for-diversity-in-image-paragraph
Repo https://github.com/lukemelas/image-paragraph-captioning
Framework pytorch
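
The abstract above credits a penalty on trigram repetition for the gain in diversity. A minimal sketch of such a penalty follows, assuming it is subtracted from a candidate token's score in proportion to how often the resulting trigram has already been generated. The function name and the weight `alpha=2.0` are hypothetical and are not taken from the linked repository.

```python
from collections import Counter


def repeated_trigram_penalty(tokens, candidate, alpha=2.0):
    """Penalty for appending `candidate` to `tokens`.

    Equals alpha times the number of earlier occurrences of the trigram
    formed by the last two generated tokens and the candidate.
    """
    if len(tokens) < 2:
        return 0.0  # no trigram can be formed yet
    seen = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return alpha * seen[(tokens[-2], tokens[-1], candidate)]


if __name__ == "__main__":
    prefix = "a man is standing . a man".split()
    for word in ("is", "walks"):
        print(word, repeated_trigram_penalty(prefix, word))
```

Here "is" would recreate the trigram "a man is" and is penalized, while "walks" is not, which nudges the decoder away from restating an earlier sentence.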

Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory

Title Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory
Authors Tuomo Hiippala, Serafina Orekhova
Abstract
Tasks Question Answering, Visual Question Answering
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1303/
PDF https://www.aclweb.org/anthology/L18-1303
PWC https://paperswithcode.com/paper/enhancing-the-ai2-diagrams-dataset-using
Repo https://github.com/DigitalGeographyLab/AI2D-RST
Framework none

Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection

Title Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection
Authors Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel
Abstract Grammatical error correction, like other machine learning tasks, greatly benefits from large quantities of high quality training data, which is typically expensive to produce. While writing a program to automatically generate realistic grammatical errors would be difficult, one could learn the distribution of naturally-occurring errors and attempt to introduce them into other datasets. Initial work on inducing errors in this way using statistical machine translation has shown promise; we investigate cheaply constructing synthetic samples, given a small corpus of human-annotated data, using an off-the-rack attentive sequence-to-sequence model and a straight-forward post-processing procedure. Our approach yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection, and a previously introduced model to gain further improvements of over 5% F0.5 score. When attempting to determine if a given sentence is synthetic, a human annotator at best achieves 39.39 F1 score, indicating that our model generates mostly human-like instances.
Tasks Grammatical Error Correction, Grammatical Error Detection, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1541/
PDF https://www.aclweb.org/anthology/D18-1541
PWC https://paperswithcode.com/paper/wronging-a-right-generating-better-errors-to
Repo https://github.com/skasewa/wronging
Framework tf

Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU

Title Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Authors Normunds Gruzitis, Lauma Pretkalnina, Baiba Saulite, Laura Rituma, Gunta Nespore-Berzkalne, Arturs Znotins, Peteris Paikens
Abstract
Tasks Abstractive Text Summarization, Coreference Resolution, Entity Linking, Knowledge Base Population, Named Entity Recognition, Semantic Parsing, Semantic Role Labeling, Text Summarization
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1714/
PDF https://www.aclweb.org/anthology/L18-1714
PWC https://paperswithcode.com/paper/creation-of-a-balanced-state-of-the-art
Repo https://github.com/LUMII-AILab/FullStack
Framework none

Automatic Post-Editing of Machine Translation: A Neural Programmer-Interpreter Approach

Title Automatic Post-Editing of Machine Translation: A Neural Programmer-Interpreter Approach
Authors Thuy-Trang Vu, Gholamreza Haffari
Abstract Automated Post-Editing (PE) is the task of automatically correcting common and repetitive errors found in machine translation (MT) output. In this paper, we present a neural programmer-interpreter approach to this task, resembling the way that humans perform post-editing using discrete edit operations, which we refer to as programs. Our model outperforms previous neural models for inducing PE programs on the WMT17 APE task for German-English by up to +1 BLEU score and -0.7 TER scores.
Tasks Automatic Post-Editing, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1341/
PDF https://www.aclweb.org/anthology/D18-1341
PWC https://paperswithcode.com/paper/automatic-post-editing-of-machine-translation
Repo https://github.com/trangvu/ape-npi
Framework tf

SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension

Title SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
Authors Taeuk Kim, Jihun Choi, Sang-goo Lee
Abstract We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018. It is a simple neural network consisting of three parts, collectively judging whether the logic built on a set of given sentences (a claim, reason, and warrant) is plausible or not. The model utilizes contextualized word vectors pre-trained on large machine translation (MT) datasets as a form of transfer learning, which can help to mitigate the lack of training data. Quantitative analysis shows that simply leveraging LSTMs trained on MT datasets outperforms several baselines and non-transferred models, achieving accuracies of about 70% on the development set and about 60% on the test set.
Tasks Machine Translation, Transfer Learning
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1182/
PDF https://www.aclweb.org/anthology/S18-1182
PWC https://paperswithcode.com/paper/snu_ids-at-semeval-2018-task-12-sentence-1
Repo https://github.com/galsang/SemEval2018-task12
Framework pytorch

GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task

Title GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task
Authors HongSeok Choi, Hyunju Lee
Abstract This paper describes our GIST team system that participated in SemEval-2018 Argument Reasoning Comprehension task (Task 12). Here, we address two challenging factors: unstated common senses and two lexically close warrants that lead to contradicting claims. A key idea for our system is full use of transfer learning from the Natural Language Inference (NLI) task to this task. We used Enhanced Sequential Inference Model (ESIM) to learn the NLI dataset. We describe how to use ESIM for transfer learning to choose the correct warrant through a proposed system. We show comparable results through ablation experiments. Our system ranked 1st among 22 systems, outperforming all the other systems by more than 10%.
Tasks Common Sense Reasoning, Natural Language Inference, Transfer Learning
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1122/
PDF https://www.aclweb.org/anthology/S18-1122
PWC https://paperswithcode.com/paper/gist-at-semeval-2018-task-12-a-network
Repo https://github.com/hongking9/SemEval-2018-task12
Framework none

NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager

Title NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager
Authors Idris Yusupov, Yurii Kuratov
Abstract We present bot#1337: a dialog system developed for the 1st NIPS Conversational Intelligence Challenge 2017 (ConvAI). The aim of the competition was to implement a bot capable of conversing with humans based on a given passage of text. To enable conversation, we implemented a set of skills for our bot, including chit-chat, topic detection, text summarization, question answering and question generation. The system has been trained in a supervised setting using a dialogue manager to select an appropriate skill for generating a response. The latter allows a developer to focus on the skill implementation rather than the finite state machine based dialog manager. The proposed system bot#1337 won the competition with an average dialogue quality score of 2.78 out of 5 given by human evaluators. Source code and trained models for the bot#1337 are available on GitHub.
Tasks Goal-Oriented Dialog, Goal-Oriented Dialogue Systems, Machine Translation, Question Answering, Question Generation, Short-Text Conversation, Text Summarization
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1312/
PDF https://www.aclweb.org/anthology/C18-1312
PWC https://paperswithcode.com/paper/nips-conversational-intelligence-challenge
Repo https://github.com/sld/convai-bot-1337
Framework none