Paper Group NAWR 23
Improving Language Understanding by Generative Pre-Training. Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields. Learning semantic similarity in a continuous space. A Korean Knowledge Extraction System for Enriching a KBox. Automatic Opinion Question Generation. A Neural Layered Model for Nested Named Entity Recognition. …
Improving Language Understanding by Generative Pre-Training
Title | Improving Language Understanding by Generative Pre-Training |
Authors | Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever |
Abstract | Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI). |
Tasks | Document Classification, Language Modelling, Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity |
Published | 2018-06-11 |
URL | https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf |
PWC | https://paperswithcode.com/paper/improving-language-understanding-by |
Repo | https://github.com/openai/finetune-transformer-lm |
Framework | tf |
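The abstract mentions task-aware input transformations without spelling them out. As a minimal sketch, assuming a structured input such as a premise/hypothesis pair is flattened into one token sequence with start, delimiter, and extract markers, the idea might look like the following. The marker strings and function names are hypothetical and are not taken from the listing above.

```python
# Hypothetical marker tokens; the abstract does not specify their form.
START, DELIM, EXTRACT = "<s>", "$", "<e>"

def entailment_input(premise, hypothesis):
    """Pack a premise/hypothesis pair into a single token sequence."""
    return [START, *premise, DELIM, *hypothesis, EXTRACT]

def similarity_inputs(a, b):
    """Similarity has no inherent ordering, so emit both orderings (an assumption)."""
    return [entailment_input(a, b), entailment_input(b, a)]

tokens = entailment_input(["a", "man", "sleeps"], ["a", "person", "rests"])
```

Because the model only ever sees a flat sequence, the same pre-trained network can be reused across tasks, which is one way to read the abstract's "minimal changes to the model architecture".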
Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields
Title | Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields |
Authors | Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A. |
Abstract | Molecular dynamics (MD) simulations employing classical force fields constitute the cornerstone of contemporary atomistic modeling in chemistry, biology, and materials science. However, the predictive power of these simulations is only as good as the underlying interatomic potential. Classical potentials often fail to faithfully capture key quantum effects in molecules and materials. Here we enable the direct construction of flexible molecular force fields from high-level ab initio calculations by incorporating spatial and temporal physical symmetries into a gradient-domain machine learning (sGDML) model in an automatic data-driven way. The developed sGDML approach faithfully reproduces global force fields at quantum-chemical CCSD(T) level of accuracy and allows converged molecular dynamics simulations with fully quantized electrons and nuclei. We present MD simulations for flexible molecules with up to a few dozen atoms and provide insights into the dynamical behavior of these molecules. Our approach provides the key missing ingredient for achieving spectroscopic accuracy in molecular simulations. |
Tasks | MD17 dataset |
Published | 2018-08-24 |
URL | https://www.nature.com/articles/s41467-018-06169-2 |
https://www.nature.com/articles/s41467-018-06169-2.pdf | |
PWC | https://paperswithcode.com/paper/towards-exact-molecular-dynamics-simulations-1 |
Repo | https://github.com/stefanch/sGDML |
Framework | pytorch |
Learning semantic similarity in a continuous space
Title | Learning semantic similarity in a continuous space |
Authors | Michel Deudon |
Abstract | We address the problem of learning semantic representation of questions to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover’s Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to “travel” to match the intent of another question. We first learn to repeat, reformulate questions to infer intents as normal distributions with a deep generative model [2] (variational auto encoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework. Among known models that can read sentences individually, our proposed framework achieves competitive results on Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from Information Theory. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7377-learning-semantic-similarity-in-a-continuous-space |
http://papers.nips.cc/paper/7377-learning-semantic-similarity-in-a-continuous-space.pdf | |
PWC | https://paperswithcode.com/paper/learning-semantic-similarity-in-a-continuous |
Repo | https://github.com/MichelDeudon/variational-siamese-network |
Framework | tf |
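The abstract measures similarity as a Wasserstein-2 distance between normal distributions. As a minimal sketch, assuming the two distributions have diagonal covariances (an assumption made here, not stated in the abstract), the distance has a closed form: the squared distance is the squared difference of the means plus the squared difference of the standard deviations. The function name is hypothetical.

```python
import math

def wasserstein2_diag(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)).

    Assumes diagonal covariances, where sigma1 and sigma2 hold standard deviations.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma1, sigma2))
    return math.sqrt(mean_term + std_term)

# Two distributions with equal spread whose means are 5 units apart.
d = wasserstein2_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
```

Unlike the KL divergence, this quantity is symmetric and stays finite as the variances shrink, which may be part of why a transport distance suits a similarity metric.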
A Korean Knowledge Extraction System for Enriching a KBox
Title | A Korean Knowledge Extraction System for Enriching a KBox |
Authors | Sangha Nam, Eun-kyung Kim, Jiho Kim, Yoosung Jung, Kijong Han, Key-Sun Choi |
Abstract | The increased demand for structured knowledge has created considerable interest in knowledge extraction from natural language sentences. This study presents a new Korean knowledge extraction system and web interface for enriching a KBox knowledge base that expands based on the Korean DBpedia. The aim is to create an endpoint where knowledge can be extracted and added to KBox anytime and anywhere. |
Tasks | Entity Linking, Relation Extraction |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-2005/ |
https://www.aclweb.org/anthology/C18-2005 | |
PWC | https://paperswithcode.com/paper/a-korean-knowledge-extraction-system-for |
Repo | https://github.com/machinereading/wisekb-demo |
Framework | none |
Automatic Opinion Question Generation
Title | Automatic Opinion Question Generation |
Authors | Yllias Chali, Tina Baghaee |
Abstract | We study the problem of opinion question generation from sentences with the help of community-based question answering systems. For this purpose, we use a sequence-to-sequence attentional model, and we adopt a coverage mechanism to prevent sentences from repeating themselves. Experimental results on the Amazon question/answer dataset show an improvement in automatic evaluation metrics as well as human evaluations over state-of-the-art question generation systems. |
Tasks | Community Question Answering, Question Answering, Question Generation, Reading Comprehension, Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6518/ |
https://www.aclweb.org/anthology/W18-6518 | |
PWC | https://paperswithcode.com/paper/automatic-opinion-question-generation |
Repo | https://github.com/Tina-19/Question-Generation |
Framework | pytorch |
A Neural Layered Model for Nested Named Entity Recognition
Title | A Neural Layered Model for Nested Named Entity Recognition |
Authors | Meizhi Ju, Makoto Miwa, Sophia Ananiadou |
Abstract | Entity mentions embedded in longer entity mentions are referred to as nested entities. Most named entity recognition (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. Each flat NER layer is based on the state-of-the-art flat NER model that captures sequential context representation with bidirectional Long Short-Term Memory (LSTM) layer and feeds it to the cascaded CRF layer. Our model merges the output of the LSTM layer in the current flat NER layer to build new representation for detected entities and subsequently feeds them into the next flat NER layer. This allows our model to extract outer entities by taking full advantage of information encoded in their corresponding inner entities, in an inside-to-outside way. Our model dynamically stacks the flat NER layers until no outer entities are extracted. Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on nested NER, achieving 74.7% and 72.2% on GENIA and ACE2005 datasets, respectively, in terms of F-score. |
Tasks | Entity Linking, Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition, Relation Extraction |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1131/ |
https://www.aclweb.org/anthology/N18-1131 | |
PWC | https://paperswithcode.com/paper/a-neural-layered-model-for-nested-named |
Repo | https://github.com/meizhiju/layered-bilstm-crf |
Framework | none |
Adversarial Feature Adaptation for Cross-lingual Relation Classification
Title | Adversarial Feature Adaptation for Cross-lingual Relation Classification |
Authors | Bowei Zou, Zengzhuang Xu, Yu Hong, Guodong Zhou |
Abstract | Relation Classification aims to classify the semantic relationship between two marked entities in a given sentence. It plays a vital role in a variety of natural language processing applications. Most existing methods focus on exploiting mono-lingual data, e.g., in English, due to the lack of annotated data in other languages. In this paper, we come up with a feature adaptation approach for cross-lingual relation classification, which employs a generative adversarial network (GAN) to transfer feature representations from one language with rich annotated data to another language with scarce annotated data. Such a feature adaptation approach enables feature imitation via the competition between a relation classification network and a rival discriminator. Experimental results on the ACE 2005 multilingual training corpus, treating English as the source language and Chinese the target, demonstrate the effectiveness of our proposed approach, yielding an improvement of 5.7% over the state-of-the-art. |
Tasks | Domain Adaptation, Knowledge Base Population, Question Answering, Relation Classification, Representation Learning |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1037/ |
https://www.aclweb.org/anthology/C18-1037 | |
PWC | https://paperswithcode.com/paper/adversarial-feature-adaptation-for-cross |
Repo | https://github.com/zoubowei/feature_adaptation4RC |
Framework | pytorch |
Training for Diversity in Image Paragraph Captioning
Title | Training for Diversity in Image Paragraph Captioning |
Authors | Luke Melas-Kyriazi, Alexander Rush, George Han |
Abstract | Image paragraph captioning models aim to produce detailed descriptions of a source image. These models use similar techniques as standard image captioning models, but they have encountered issues in text generation, notably a lack of diversity between sentences, that have limited their effectiveness. In this work, we consider applying sequence-level training for this task. We find that standard self-critical training produces poor results, but when combined with an integrated penalty on trigram repetition produces much more diverse paragraphs. This simple training approach improves on the best result on the Visual Genome paragraph captioning dataset from 16.9 to 30.6 CIDEr, with gains on METEOR and BLEU as well, without requiring any architectural changes. |
Tasks | Image Captioning, Image Paragraph Captioning, Machine Translation, Object Detection, Policy Gradient Methods, Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1084/ |
https://www.aclweb.org/anthology/D18-1084 | |
PWC | https://paperswithcode.com/paper/training-for-diversity-in-image-paragraph |
Repo | https://github.com/lukemelas/image-paragraph-captioning |
Framework | pytorch |
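The abstract credits the gain in diversity to a penalty on trigram repetition but does not say how it enters training. As a minimal sketch, assuming the penalty is simply subtracted from a sequence-level reward (the paper's actual integration may differ), repeated trigrams can be counted as follows. The function names and the `alpha` weight are hypothetical.

```python
from collections import Counter

def repeated_trigrams(tokens):
    """Count trigram occurrences beyond the first for each distinct trigram."""
    counts = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return sum(c - 1 for c in counts.values() if c > 1)

def penalized_reward(reward, tokens, alpha=1.0):
    """Subtract a repetition penalty from a sequence-level reward (assumed form)."""
    return reward - alpha * repeated_trigrams(tokens)

sample = "a b c a b c".split()
n_repeats = repeated_trigrams(sample)          # the trigram (a, b, c) occurs twice
score = penalized_reward(5.0, sample, alpha=2.0)
```

A paragraph that reuses the same three-word phrases scores lower under this reward, which nudges a sequence-level learner toward more varied sentences.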
Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory
Title | Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory |
Authors | Tuomo Hiippala, Serafina Orekhova |
Abstract | |
Tasks | Question Answering, Visual Question Answering |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1303/ |
https://www.aclweb.org/anthology/L18-1303 | |
PWC | https://paperswithcode.com/paper/enhancing-the-ai2-diagrams-dataset-using |
Repo | https://github.com/DigitalGeographyLab/AI2D-RST |
Framework | none |
Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection
Title | Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection |
Authors | Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel |
Abstract | Grammatical error correction, like other machine learning tasks, greatly benefits from large quantities of high quality training data, which is typically expensive to produce. While writing a program to automatically generate realistic grammatical errors would be difficult, one could learn the distribution of naturally-occurring errors and attempt to introduce them into other datasets. Initial work on inducing errors in this way using statistical machine translation has shown promise; we investigate cheaply constructing synthetic samples, given a small corpus of human-annotated data, using an off-the-rack attentive sequence-to-sequence model and a straight-forward post-processing procedure. Our approach yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection, and a previously introduced model to gain further improvements of over 5% F0.5 score. When attempting to determine if a given sentence is synthetic, a human annotator at best achieves 39.39 F1 score, indicating that our model generates mostly human-like instances. |
Tasks | Grammatical Error Correction, Grammatical Error Detection, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1541/ |
https://www.aclweb.org/anthology/D18-1541 | |
PWC | https://paperswithcode.com/paper/wronging-a-right-generating-better-errors-to |
Repo | https://github.com/skasewa/wronging |
Framework | tf |
Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Title | Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU |
Authors | Normunds Gruzitis, Lauma Pretkalnina, Baiba Saulite, Laura Rituma, Gunta Nespore-Berzkalne, Arturs Znotins, Peteris Paikens |
Abstract | |
Tasks | Abstractive Text Summarization, Coreference Resolution, Entity Linking, Knowledge Base Population, Named Entity Recognition, Semantic Parsing, Semantic Role Labeling, Text Summarization |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1714/ |
https://www.aclweb.org/anthology/L18-1714 | |
PWC | https://paperswithcode.com/paper/creation-of-a-balanced-state-of-the-art |
Repo | https://github.com/LUMII-AILab/FullStack |
Framework | none |
Automatic Post-Editing of Machine Translation: A Neural Programmer-Interpreter Approach
Title | Automatic Post-Editing of Machine Translation: A Neural Programmer-Interpreter Approach |
Authors | Thuy-Trang Vu, Gholamreza Haffari |
Abstract | Automated Post-Editing (PE) is the task of automatically correcting common and repetitive errors found in machine translation (MT) output. In this paper, we present a neural programmer-interpreter approach to this task, resembling the way that humans perform post-editing using discrete edit operations, which we refer to as programs. Our model outperforms previous neural models for inducing PE programs on the WMT17 APE task for German-English by up to +1 BLEU score and -0.7 TER scores. |
Tasks | Automatic Post-Editing, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1341/ |
https://www.aclweb.org/anthology/D18-1341 | |
PWC | https://paperswithcode.com/paper/automatic-post-editing-of-machine-translation |
Repo | https://github.com/trangvu/ape-npi |
Framework | tf |
SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
Title | SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension |
Authors | Taeuk Kim, Jihun Choi, Sang-goo Lee |
Abstract | We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018. It is a simple neural network consisting of three parts, collectively judging whether the logic built on a set of given sentences (a claim, reason, and warrant) is plausible or not. The model utilizes contextualized word vectors pre-trained on large machine translation (MT) datasets as a form of transfer learning, which can help to mitigate the lack of training data. Quantitative analysis shows that simply leveraging LSTMs trained on MT datasets outperforms several baselines and non-transferred models, achieving accuracies of about 70% on the development set and about 60% on the test set. |
Tasks | Machine Translation, Transfer Learning |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1182/ |
https://www.aclweb.org/anthology/S18-1182 | |
PWC | https://paperswithcode.com/paper/snu_ids-at-semeval-2018-task-12-sentence-1 |
Repo | https://github.com/galsang/SemEval2018-task12 |
Framework | pytorch |
GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task
Title | GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task |
Authors | HongSeok Choi, Hyunju Lee |
Abstract | This paper describes our GIST team system that participated in the SemEval-2018 Argument Reasoning Comprehension task (Task 12). Here, we address two challenging factors: unstated common sense and two lexically close warrants that lead to contradicting claims. A key idea for our system is full use of transfer learning from the Natural Language Inference (NLI) task to this task. We used the Enhanced Sequential Inference Model (ESIM) to learn the NLI dataset. We describe how to use ESIM for transfer learning to choose the correct warrant through a proposed system. We show comparable results through ablation experiments. Our system ranked 1st among 22 systems, outperforming all other systems by more than 10%. |
Tasks | Common Sense Reasoning, Natural Language Inference, Transfer Learning |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1122/ |
https://www.aclweb.org/anthology/S18-1122 | |
PWC | https://paperswithcode.com/paper/gist-at-semeval-2018-task-12-a-network |
Repo | https://github.com/hongking9/SemEval-2018-task12 |
Framework | none |
NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager
Title | NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager |
Authors | Idris Yusupov, Yurii Kuratov |
Abstract | We present bot#1337: a dialog system developed for the 1st NIPS Conversational Intelligence Challenge 2017 (ConvAI). The aim of the competition was to implement a bot capable of conversing with humans based on a given passage of text. To enable conversation, we implemented a set of skills for our bot, including chit-chat, topic detection, text summarization, question answering and question generation. The system has been trained in a supervised setting using a dialogue manager to select an appropriate skill for generating a response. The latter allows a developer to focus on the skill implementation rather than the finite state machine based dialog manager. The proposed system bot#1337 won the competition with an average dialogue quality score of 2.78 out of 5 given by human evaluators. Source code and trained models for the bot#1337 are available on GitHub. |
Tasks | Goal-Oriented Dialog, Goal-Oriented Dialogue Systems, Machine Translation, Question Answering, Question Generation, Short-Text Conversation, Text Summarization |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1312/ |
https://www.aclweb.org/anthology/C18-1312 | |
PWC | https://paperswithcode.com/paper/nips-conversational-intelligence-challenge |
Repo | https://github.com/sld/convai-bot-1337 |
Framework | none |