January 24, 2020

Paper Group NANR 214

What just happened? Evaluating retrofitted distributional word vectors

Title What just happened? Evaluating retrofitted distributional word vectors
Authors Dmetri Hayes
Abstract Recent work has attempted to enhance vector space representations using information from structured semantic resources. This process, dubbed retrofitting (Faruqui et al., 2015), has yielded improvements in word similarity performance. Research has largely focused on the retrofitting algorithm, or on the kind of structured semantic resources used, but little research has explored why some resources perform better than others. We conducted a fine-grained analysis of the original retrofitting process, and found that the utility of different lexical resources for retrofitting depends on two factors: the coverage of the resource and the evaluation metric. Our assessment suggests that the common practice of using correlation measures to evaluate increases in performance against full word similarity benchmarks 1) obscures the benefits offered by smaller resources, and 2) overlooks incremental gains in word similarity performance. We propose root-mean-square error (RMSE) as an alternative evaluation metric, and demonstrate that correlation measures and RMSE sometimes yield opposite conclusions concerning the efficacy of retrofitting. This point is illustrated by word vectors retrofitted with novel treatments of the FrameNet data (Fillmore and Baker, 2010).
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1111/
PDF https://www.aclweb.org/anthology/N19-1111
PWC https://paperswithcode.com/paper/what-just-happened-evaluating-retrofitted
Repo
Framework
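As a concrete illustration of the evaluation issue this abstract raises, the sketch below computes both a rank correlation and an RMSE for word vectors against a word-similarity benchmark. The vectors, benchmark pairs, and the rescaling to the benchmark's score range are placeholder assumptions, not the paper's setup.

```python
# Hedged sketch: correlation vs. RMSE for word-similarity evaluation.
# `vectors` and `benchmark` are placeholders, not the paper's data.
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(vectors, benchmark, max_score=10.0):
    """vectors: dict word -> np.ndarray; benchmark: iterable of (w1, w2, human_score)."""
    preds, golds = [], []
    for w1, w2, score in benchmark:
        if w1 in vectors and w2 in vectors:   # resource/benchmark coverage matters
            preds.append(cosine(vectors[w1], vectors[w2]))
            golds.append(score)
    preds, golds = np.array(preds), np.array(golds)
    rho = spearmanr(preds, golds).correlation                   # relative (rank) agreement
    rmse = np.sqrt(np.mean((preds * max_score - golds) ** 2))   # absolute error after rescaling
    return rho, rmse
```

Because correlation only rewards relative orderings while RMSE penalizes absolute deviations, the two can move in opposite directions after retrofitting, which is the paper's point.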

Trigger Word Detection and Thematic Role Identification via BERT and Multitask Learning

Title Trigger Word Detection and Thematic Role Identification via BERT and Multitask Learning
Authors Dongfang Li, Ying Xiong, Baotian Hu, Hanyang Du, Buzhou Tang, Qingcai Chen
Abstract The prediction of the relationships between diseases, genes, and their mutations is an important knowledge extraction task that can potentially help drug discovery. In this paper, we present our approaches for trigger word detection (task 1) and the identification of its thematic role (task 2) in the AGAC track of the BioNLP Open Shared Task 2019. Task 1 can be regarded as traditional named entity recognition (NER), covering molecular phenomena related to gene mutation. Task 2 can be regarded as relation extraction, which captures the thematic roles between entities. For both tasks, we exploit a pre-trained biomedical language representation model (i.e., BERT) in an information extraction pipeline for collecting mutation-disease knowledge from PubMed. We also design a fine-tuning technique and extra features using multi-task learning. The experimental results show that our proposed approaches achieve 0.60 (1st place) and 0.25 (2nd place) on task 1 and task 2, respectively, in terms of the $F_1$ metric.
Tasks Drug Discovery, Multi-Task Learning, Relation Extraction
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5711/
PDF https://www.aclweb.org/anthology/D19-5711
PWC https://paperswithcode.com/paper/trigger-word-detection-and-thematic-role
Repo
Framework
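For readers who want a feel for the setup described in the abstract, here is a minimal shared-encoder sketch with one token-level head for trigger detection and one sequence-level head for thematic roles. The checkpoint name, label counts, and the use of the [CLS] vector for the relation head are assumptions, not the authors' exact design.

```python
# Sketch of a shared BERT encoder with two task heads (trigger tagging + role
# classification), assuming a generic checkpoint; not the authors' exact model.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertMultiTask(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_trigger_tags=10, num_roles=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.trigger_head = nn.Linear(hidden, num_trigger_tags)  # task 1: token-level tagging
        self.role_head = nn.Linear(hidden, num_roles)            # task 2: relation label

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        tokens = out.last_hidden_state                 # (batch, seq, hidden)
        trigger_logits = self.trigger_head(tokens)     # per-token trigger tags
        role_logits = self.role_head(tokens[:, 0])     # [CLS] as a simple pair representation
        return trigger_logits, role_logits
```

Both heads share the encoder, so the two losses can simply be summed during fine-tuning.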

MetaInit: Initializing learning by learning to initialize

Title MetaInit: Initializing learning by learning to initialize
Authors Yann N. Dauphin, Samuel Schoenholz
Abstract Deep learning models frequently trade handcrafted features for deep features learned with much less human intervention using gradient descent. While this paradigm has been enormously successful, deep networks are often difficult to train and performance can depend crucially on the initial choice of parameters. In this work, we introduce an algorithm called MetaInit as a step towards automating the search for good initializations using meta-learning. Our approach is based on a hypothesis that good initializations make gradient descent easier by starting in regions that look locally linear with minimal second-order effects. We formalize this notion via a quantity that we call the gradient quotient, which can be computed with any architecture or dataset. MetaInit minimizes this quantity efficiently by using gradient descent to tune the norms of the initial weight matrices. We conduct experiments on plain and residual networks and show that the algorithm can automatically recover from a class of bad initializations. MetaInit allows us to train networks and achieve performance competitive with the state-of-the-art without batch normalization or residual connections. In particular, we find that this approach outperforms normalization for networks without skip connections on CIFAR-10 and can scale to ResNet-50 models on ImageNet.
Tasks Meta-Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/9427-metainit-initializing-learning-by-learning-to-initialize
PDF http://papers.nips.cc/paper/9427-metainit-initializing-learning-by-learning-to-initialize.pdf
PWC https://paperswithcode.com/paper/metainit-initializing-learning-by-learning-to
Repo
Framework
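Reading the abstract, the gradient quotient measures how much gradients change after a single descent step (a proxy for second-order effects). The sketch below estimates such a quantity for a list of parameters; the exact definition and the norm-tuning loop follow the paper, so treat this as an approximation with a hypothetical `loss_fn` closure.

```python
# Rough sketch of a gradient-quotient-style score: compare gradients before and after
# one descent step; lower values suggest a more locally linear region around the init.
# This approximates the idea in the abstract, not the paper's exact formula.
import torch

def gradient_change_score(loss_fn, params, lr=1e-2, eps=1e-8):
    """loss_fn maps a list of parameter tensors (requires_grad=True) to a scalar loss."""
    grads = torch.autograd.grad(loss_fn(params), params)
    stepped = [(p - lr * g).detach().requires_grad_(True) for p, g in zip(params, grads)]
    grads2 = torch.autograd.grad(loss_fn(stepped), stepped)
    total = sum(((g2 - g1).abs() / (g1.abs() + eps)).sum() for g1, g2 in zip(grads, grads2))
    return total / sum(p.numel() for p in params)
```

Per the abstract, MetaInit would then lower such a score by running gradient descent only on the norms of the initial weight matrices.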

Vocabulary Pyramid Network: Multi-Pass Encoding and Decoding with Multi-Level Vocabularies for Response Generation

Title Vocabulary Pyramid Network: Multi-Pass Encoding and Decoding with Multi-Level Vocabularies for Response Generation
Authors Cao Liu, Shizhu He, Kang Liu, Jun Zhao
Abstract We study the task of response generation. Conventional methods employ a fixed vocabulary and one-pass decoding, which not only makes them prone to safe, generic responses but also leaves the first generated raw sequence without further refinement. To tackle these two problems, we present a Vocabulary Pyramid Network (VPN) that incorporates multi-pass encoding and decoding with multi-level vocabularies into response generation. Specifically, the dialogue input and output are represented by multi-level vocabularies obtained from hierarchical clustering of raw words. Multi-pass encoding and decoding are then conducted over these multi-level vocabularies. Since VPN is able to leverage rich encoding and decoding information across vocabulary levels, it has the potential to generate better responses. Experiments on English Twitter and Chinese Weibo datasets demonstrate that VPN remarkably outperforms strong baselines.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1367/
PDF https://www.aclweb.org/anthology/P19-1367
PWC https://paperswithcode.com/paper/vocabulary-pyramid-network-multi-pass
Repo
Framework
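To make the multi-level vocabulary idea concrete, the sketch below derives coarse-to-fine vocabularies by clustering pre-trained word embeddings. KMeans stands in for the paper's hierarchical clustering of raw words, and the level sizes are made up.

```python
# Sketch: build coarse-to-fine vocabulary levels by clustering word embeddings.
# KMeans is a stand-in for the paper's hierarchical clustering; sizes are placeholders.
from sklearn.cluster import KMeans

def build_vocabulary_levels(words, embeddings, level_sizes=(100, 1000, 10000)):
    """words: list of strings; embeddings: (len(words), dim) array. Returns one
    word -> cluster-id mapping per level, from coarsest to finest."""
    levels = []
    for k in level_sizes:
        k = min(k, len(words))                # cannot have more clusters than words
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
        levels.append({w: int(c) for w, c in zip(words, labels)})
    return levels
```

Encoding and decoding can then operate first over the small, coarse vocabularies and refine the result over the larger ones.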

X-Section: Cross-Section Prediction for Enhanced RGB-D Fusion

Title X-Section: Cross-Section Prediction for Enhanced RGB-D Fusion
Authors Andrea Nicastro, Ronald Clark, Stefan Leutenegger
Abstract Detailed 3D reconstruction is an important challenge with applications to robotics and augmented and virtual reality, and it has seen impressive progress throughout the past years. Advancements were driven by the availability of depth (RGB-D) cameras and increased compute power, e.g. in the form of GPUs, but also by the inclusion of machine learning in the process. Here, we propose X-Section, an RGB-D 3D reconstruction approach that leverages deep learning to make object-level predictions about thicknesses that can be readily integrated into a volumetric multi-view fusion process, for which we propose an extension to the popular KinectFusion approach. In essence, our method makes it possible to complete shapes in general indoor scenes behind what is sensed by the RGB-D camera, which may be crucial, e.g., for robotic manipulation tasks or efficient scene exploration. Predicting object thicknesses rather than volumes allows us to work with comparably high spatial resolution without exploding memory and training data requirements for the employed Convolutional Neural Networks. In a series of qualitative and quantitative evaluations, we demonstrate that we accurately predict object thickness and reconstruct general 3D scenes containing multiple objects.
Tasks 3D Reconstruction
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Nicastro_X-Section_Cross-Section_Prediction_for_Enhanced_RGB-D_Fusion_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Nicastro_X-Section_Cross-Section_Prediction_for_Enhanced_RGB-D_Fusion_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/x-section-cross-section-prediction-for-1
Repo
Framework
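The following sketch illustrates how a per-pixel thickness prediction could plug into a KinectFusion-style truncated signed distance update along a camera ray. The sign conventions and the handling of space behind the object are assumptions for illustration, not the authors' exact fusion rule.

```python
# Illustrative thickness-aware TSDF value along one camera ray: voxels between the
# observed depth and depth + predicted thickness are treated as object interior.
import numpy as np

def tsdf_value(voxel_depth, observed_depth, thickness, trunc=0.05):
    """Signed distance along the ray, truncated to [-trunc, trunc]."""
    if voxel_depth < observed_depth:                       # free space in front of the surface
        sdf = observed_depth - voxel_depth
    elif voxel_depth < observed_depth + thickness:         # predicted interior of the object
        sdf = -min(voxel_depth - observed_depth,
                   observed_depth + thickness - voxel_depth)
    else:                                                  # behind the predicted back surface
        sdf = voxel_depth - (observed_depth + thickness)
    return float(np.clip(sdf, -trunc, trunc))
```

Standard KinectFusion only knows the front surface; the thickness term is what lets the fusion carve out the object's back side.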

Better Exploiting Latent Variables in Text Modeling

Title Better Exploiting Latent Variables in Text Modeling
Authors Canasai Kruengkrai
Abstract We show that sampling latent variables multiple times at a gradient step helps in improving a variational autoencoder and propose a simple and effective method to better exploit these latent variables through hidden state averaging. Consistent gains in performance on two different datasets, Penn Treebank and Yahoo, indicate the generalizability of our method.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1553/
PDF https://www.aclweb.org/anthology/P19-1553
PWC https://paperswithcode.com/paper/better-exploiting-latent-variables-in-text
Repo
Framework
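A minimal sketch of the two ideas named in the abstract, multiple latent samples per gradient step and hidden state averaging, assuming a Gaussian posterior and a hypothetical `decoder_init` projection from latent to hidden space.

```python
# Sketch: draw K reparameterized latent samples and average the decoder states they
# induce. `decoder_init` (latent -> hidden projection) and K are placeholder choices.
import torch

def multi_sample_hidden(mu, logvar, decoder_init, k=5):
    states = []
    for _ in range(k):
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        states.append(decoder_init(z))                           # one hidden state per sample
    return torch.stack(states, dim=0).mean(dim=0)                # hidden state averaging
```

The averaged state then initializes the decoder, while the KL term of the VAE objective is computed from `mu` and `logvar` as usual.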

Proposed Taxonomy for Gender Bias in Text; A Filtering Methodology for the Gender Generalization Subtype

Title Proposed Taxonomy for Gender Bias in Text; A Filtering Methodology for the Gender Generalization Subtype
Authors Yasmeen Hitti, Eunbee Jang, Ines Moreno, Carolyne Pelletier
Abstract The purpose of this paper is to present an empirical study on gender bias in text. Current research in this field focuses on detecting and correcting gender bias in existing machine learning models rather than approaching the issue at the dataset level. The underlying motivation is to create a dataset that could enable machines to learn to differentiate biased writing from non-biased writing. A taxonomy is proposed for the structural and contextual gender biases that can manifest themselves in text. A methodology is proposed to extract one type of structural gender bias, Gender Generalization. We explore the IMDB movie review dataset and 9 different corpora from Project Gutenberg. After filtering out irrelevant sentences, the remaining pool of candidate sentences is sent for human validation. A total of 6123 judgments were made on 1627 sentences, and after a quality check on randomly selected sentences we obtain an accuracy of 75%. Out of the 1627 sentences, 808 were labeled as Gender Generalizations. The inter-rater reliability among labelers was 61.14%.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3802/
PDF https://www.aclweb.org/anthology/W19-3802
PWC https://paperswithcode.com/paper/proposed-taxonomy-for-gender-bias-in-text-a
Repo
Framework
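Purely as a hypothetical illustration of the filtering step mentioned in the abstract (the paper's actual rules are more elaborate), a pre-filter might keep only sentences that mention a gendered term before sending candidates for human validation.

```python
# Hypothetical rule-based pre-filter for Gender Generalization candidates; not the
# authors' methodology, just an illustration of filtering before human labeling.
import re

GENDERED_TERMS = re.compile(
    r"\b(he|she|his|her|him|man|men|woman|women|boy|boys|girl|girls)\b", re.IGNORECASE)

def candidate_sentences(sentences):
    """Drop sentences with no gendered mention; the rest go to human validation."""
    return [s for s in sentences if GENDERED_TERMS.search(s)]
```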

Deep Dominance - How to Properly Compare Deep Neural Models

Title Deep Dominance - How to Properly Compare Deep Neural Models
Authors Rotem Dror, Segev Shlomov, Roi Reichart
Abstract Comparing Deep Neural Network (DNN) models based on their performance on unseen data is crucial for the progress of the NLP field. However, these models have a large number of hyper-parameters and, being non-convex, their convergence point depends on the random values chosen at initialization and during training. Proper DNN comparison hence requires a comparison between their empirical score distributions on unseen data, rather than between single evaluation scores as is standard for simpler, convex models. In this paper, we propose to adapt to this problem a recently proposed test for the Almost Stochastic Dominance relation between two distributions. We define the criteria for a high-quality comparison method between DNNs and show, both theoretically and through analysis of extensive experimental results with leading DNN models for sequence tagging tasks, that the proposed test meets all criteria while previously proposed methods fail to do so. We hope the test we propose here will set a new working practice in the NLP community.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1266/
PDF https://www.aclweb.org/anthology/P19-1266
PWC https://paperswithcode.com/paper/deep-dominance-how-to-properly-compare-deep
Repo
Framework
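As a rough illustration of comparing score distributions rather than single scores, the sketch below estimates how often one model's empirical CDF violates stochastic dominance over another's. The paper's Almost Stochastic Dominance test uses a proper estimator and significance level; this is only a simplified proxy.

```python
# Simplified proxy: fraction of the score range where model A's empirical CDF lies
# above model B's, i.e. where "A stochastically dominates B" is violated.
import numpy as np

def dominance_violation_ratio(scores_a, scores_b, grid_size=1000):
    """scores_a, scores_b: 1-D arrays of evaluation scores (e.g. across random seeds).
    Returns 0.0 when A cleanly dominates B (A's scores tend to be higher)."""
    lo = min(scores_a.min(), scores_b.min())
    hi = max(scores_a.max(), scores_b.max())
    xs = np.linspace(lo, hi, grid_size)
    cdf_a = np.searchsorted(np.sort(scores_a), xs, side="right") / len(scores_a)
    cdf_b = np.searchsorted(np.sort(scores_b), xs, side="right") / len(scores_b)
    return float(np.mean(cdf_a > cdf_b))
```

In practice the inputs would be test-set scores from many training runs of each model, which is exactly the score-distribution view the paper argues for.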

Leveraging Rule-Based Machine Translation Knowledge for Under-Resourced Neural Machine Translation Models

Title Leveraging Rule-Based Machine Translation Knowledge for Under-Resourced Neural Machine Translation Models
Authors Daniel Torregrosa, Nivranshu Pasricha, Maraim Masoud, Bharathi Raja Chakravarthi, Juan Alonso, Noe Casas, Mihael Arcan
Abstract
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-6725/
PDF https://www.aclweb.org/anthology/W19-6725
PWC https://paperswithcode.com/paper/leveraging-rule-based-machine-translation
Repo
Framework

Towards a Compositional Analysis of German Light Verb Constructions (LVCs) Combining Lexicalized Tree Adjoining Grammar (LTAG) with Frame Semantics

Title Towards a Compositional Analysis of German Light Verb Constructions (LVCs) Combining Lexicalized Tree Adjoining Grammar (LTAG) with Frame Semantics
Authors Jens Fleischhauer, Thomas Gamerschlag, Laura Kallmeyer, Simon Petitjean
Abstract Complex predicates formed of a semantically 'light' verbal head and a noun or verb which contributes the major part of the meaning are frequently referred to as 'light verb constructions' (LVCs). In the paper, we present a case study of LVCs with the German posture verb stehen 'stand'. In our account, we model the syntactic as well as semantic composition of such LVCs by combining Lexicalized Tree Adjoining Grammar (LTAG) with frames. Starting from the analysis of the literal uses of posture verbs, we show how the meaning components of the literal uses are systematically exploited in the interpretation of stehen-LVCs. The paper constitutes an important step towards a compositional and computational analysis of LVCs. We show that LTAG allows us to separate constructional from lexical meaning components and that frames enable elegant generalizations over event types and related constraints.
Tasks Semantic Composition
Published 2019-05-01
URL https://www.aclweb.org/anthology/W19-0407/
PDF https://www.aclweb.org/anthology/W19-0407
PWC https://paperswithcode.com/paper/towards-a-compositional-analysis-of-german
Repo
Framework

No Word is an Island—A Transformation Weighting Model for Semantic Composition

Title No Word is an Island—A Transformation Weighting Model for Semantic Composition
Authors Corina Dima, Daniël de Kok, Neele Witte, Erhard Hinrichs
Abstract Composition models of distributional semantics are used to construct phrase representations from the representations of their words. Composition models typically sit at one of two ends of a spectrum: they either have a small number of parameters but compose all phrases in the same way, or they perform word-specific compositions at the cost of a far larger number of parameters. In this paper we propose transformation weighting (TransWeight), a composition model that consistently outperforms existing models on nominal compounds, adjective-noun phrases, and adverb-adjective phrases in English, German, and Dutch. TransWeight drastically reduces the number of parameters needed compared with the best model in the literature by composing similar words in the same way.
Tasks Semantic Composition
Published 2019-03-01
URL https://www.aclweb.org/anthology/Q19-1025/
PDF https://www.aclweb.org/anthology/Q19-1025
PWC https://paperswithcode.com/paper/no-word-is-an-island-a-transformation-1
Repo
Framework
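A rough sketch of a transformation-weighting composition layer in the spirit of the abstract: several shared transformations are applied to a word pair, and a learned weighting combines the results into a single phrase vector. The shapes, nonlinearity, and combination step are assumptions rather than the paper's exact parameterization.

```python
# Sketch of a transformation-weighting composition layer (phrase = f(word1, word2)).
# Shapes and the weighting step are assumptions, not the paper's exact model.
import torch
import torch.nn as nn

class TransWeightSketch(nn.Module):
    def __init__(self, dim, num_transforms=20):
        super().__init__()
        self.transforms = nn.Linear(2 * dim, num_transforms * dim)  # t affine maps, batched
        self.weighting = nn.Linear(num_transforms * dim, dim)       # combine the t candidates

    def forward(self, u, v):
        pair = torch.cat([u, v], dim=-1)                 # (batch, 2*dim) word pair
        transformed = torch.tanh(self.transforms(pair))  # (batch, t*dim) transformed variants
        return self.weighting(transformed)               # (batch, dim) phrase vector
```

Because the transformations are shared across the whole vocabulary, similar words end up composed in the same way, which is where the parameter savings come from.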

Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations

Title Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations
Authors Wonhee Lee, Joonil Na, Gunhee Kim
Abstract In spite of recent enormous success of deep convolutional networks in object detection, they require a large amount of bounding box annotations, which are often time-consuming and error-prone to obtain. To make better use of given limited labels, we propose a novel object detection approach that takes advantage of both multi-task learning (MTL) and self-supervised learning (SSL). We propose a set of auxiliary tasks that help improve the accuracy of object detection. They create their own labels by recycling the bounding box labels (i.e. annotations of the main task) in an SSL manner, and are jointly trained with the object detection model in an MTL way. Our approach is integrable with any region proposal based detection models. We empirically validate that our approach effectively improves detection performance on various architectures and datasets. We test two state-of-the-art region proposal object detectors, including Faster R-CNN and R-FCN, with three CNN backbones of ResNet-101, Inception-ResNet-v2, and MobileNet on two benchmark datasets of PASCAL VOC and COCO.
Tasks Multi-Task Learning, Object Detection
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Lee_Multi-Task_Self-Supervised_Object_Detection_via_Recycling_of_Bounding_Box_Annotations_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Lee_Multi-Task_Self-Supervised_Object_Detection_via_Recycling_of_Bounding_Box_Annotations_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/multi-task-self-supervised-object-detection
Repo
Framework
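As one hypothetical example of "recycling" bounding-box annotations into a self-supervised auxiliary signal, the ground-truth boxes of an image can be collapsed into an image-level multi-label target trained jointly with the detector. The specific auxiliary task and loss weight below are illustrative, not necessarily one of the paper's.

```python
# Illustrative recycling of box annotations into an auxiliary image-level target,
# trained jointly with the main detection loss (multi-task learning).
import torch

def recycle_boxes_to_multilabel(box_classes, num_classes):
    """box_classes: list of ground-truth class ids for one image's boxes."""
    target = torch.zeros(num_classes)
    if box_classes:
        target[torch.tensor(box_classes, dtype=torch.long)] = 1.0
    return target  # target for a BCE-trained auxiliary classification head

def joint_loss(detection_loss, aux_loss, aux_weight=0.1):
    return detection_loss + aux_weight * aux_loss
```

No extra annotation effort is needed: the auxiliary labels are derived entirely from the boxes the detector already uses.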

Lijunyi at SemEval-2019 Task 9: An attention-based LSTM and ensemble of different models for suggestion mining from online reviews and forums

Title Lijunyi at SemEval-2019 Task 9: An attention-based LSTM and ensemble of different models for suggestion mining from online reviews and forums
Authors Junyi Li
Abstract In this paper, we describe a suggestion mining system that participated in SemEval 2019 Task 9, SubTask A - Suggestion Mining from Online Reviews and Forums. The task provides sentences from online reviews and forums that are to be classified into suggestion and non-suggestion classes. For this task, we combine an attention mechanism with an LSTM model, which is the final system we submitted. The final submission achieves 14th place in Task 9, SubTask A, with an accuracy of 0.6776. After the challenge, we train a series of neural network models such as convolutional neural networks (CNN), TextCNN, long short-term memory (LSTM) and C-LSTM. Finally, we ensemble the predictions of these models and obtain a better result.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2212/
PDF https://www.aclweb.org/anthology/S19-2212
PWC https://paperswithcode.com/paper/lijunyi-at-semeval-2019-task-9-an-attention
Repo
Framework
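A minimal sketch of an attention-based LSTM sentence classifier of the kind described above (suggestion vs. non-suggestion). The hyperparameters and pooling choice are placeholders, not the submitted system.

```python
# Sketch: bidirectional LSTM with additive attention pooling for binary classification.
import torch
import torch.nn as nn

class AttentionLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)           # scores each time step
        self.out = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))        # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over time steps
        context = (weights * h).sum(dim=1)             # weighted sum of LSTM states
        return self.out(context)
```

An ensemble, as the abstract describes, would simply average the predicted probabilities of this model with those of the CNN-family models.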

Real Life Application of a Question Answering System Using BERT Language Model

Title Real Life Application of a Question Answering System Using BERT Language Model
Authors Francesca Alloatti, Luigi Di Caro, Gianpiero Sportelli
Abstract It is often hard to apply the newest advances in research to real-life scenarios. Such scenarios usually require the resolution of a specific task in a restricted domain, with only small amounts of data to begin with. In this study we apply one of the newest innovations in deep learning to a text classification task. We created a question answering system in Italian that provides information about a specific subject, e-invoicing and digital billing. Italy recently introduced new legislation about e-invoicing, and people have legitimate doubts about it; therefore, a large share of professionals could benefit from this tool.
Tasks Language Modelling, Question Answering, Text Classification
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-5930/
PDF https://www.aclweb.org/anthology/W19-5930
PWC https://paperswithcode.com/paper/real-life-application-of-a-question-answering
Repo
Framework
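The abstract frames the QA system as text classification: an incoming question is mapped to one of a fixed set of predefined answers. Below is a minimal sketch of that framing with a generic multilingual BERT checkpoint; the checkpoint, label count, and label set are placeholders, not the authors' model.

```python
# Sketch: FAQ-style question answering as sequence classification over answer ids.
# The checkpoint and num_labels are placeholders; the head must be fine-tuned on
# (question, answer-id) pairs before it is useful.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-multilingual-cased"   # placeholder checkpoint, not the authors' model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=50)

def answer_id(question):
    inputs = tokenizer(question, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))    # index into a predefined list of answers
```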

University of Tartu’s Multilingual Multi-domain WMT19 News Translation Shared Task Submission

Title University of Tartu’s Multilingual Multi-domain WMT19 News Translation Shared Task Submission
Authors Andre Tättar, Elizaveta Korotkova, Mark Fishel
Abstract This paper describes the University of Tartu's submission to the news translation shared task of WMT19, where the core idea was to train a single multilingual system to cover several language pairs of the shared task and submit its results. We only used the constrained data from the shared task. We describe our approach and its results and discuss the technical issues we faced.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5342/
PDF https://www.aclweb.org/anthology/W19-5342
PWC https://paperswithcode.com/paper/university-of-tartus-multilingual-multi
Repo
Framework