Paper Group NANR 51
Diverse Machine Translation with a Single Multinomial Latent Variable
Title | Diverse Machine Translation with a Single Multinomial Latent Variable |
Authors | Tianxiao Shen, Myle Ott, Michael Auli, Marc’Aurelio Ranzato |
Abstract | There are many ways to translate a sentence into another language. Explicit modeling of such uncertainty may enable better model fitting to the data and it may enable users to express a preference for how to translate a piece of content. Latent variable models are a natural way to represent uncertainty. Prior work investigated the use of multivariate continuous and discrete latent variables, but their interpretation and use for generating a diverse set of hypotheses have been elusive. In this work, we drastically simplify the model, using just a single multinomial latent variable. The resulting mixture of experts model can be trained efficiently via hard-EM and can generate a diverse set of hypotheses by parallel greedy decoding. We perform extensive experiments on three WMT benchmark datasets that have multiple human references, and we show that our model provides a better trade-off between quality and diversity of generations compared to all baseline methods. (Code to reproduce this work is available at: anonymized URL.) |
Tasks | Latent Variable Models, Machine Translation |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJgnmhA5KQ |
https://openreview.net/pdf?id=BJgnmhA5KQ | |
PWC | https://paperswithcode.com/paper/diverse-machine-translation-with-a-single |
Repo | |
Framework | |
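
The abstract above mentions training a mixture of experts via hard-EM. As a rough illustration of that idea only, the sketch below alternates a hard assignment step (each example goes to its lowest-loss expert) with an update step restricted to the assigned examples. The function names, the scalar "experts", the squared-error loss, and the toy data are all assumptions made for illustration; they are not taken from the paper, whose experts are translation models.

```python
def hard_em_step(experts, data, loss_fn, update_fn):
    """One hard-EM iteration over a list of experts (hypothetical helper)."""
    # E-step (hard): assign each example to the single expert with the lowest loss.
    assignments = [
        min(range(len(experts)), key=lambda z: loss_fn(experts[z], x))
        for x in data
    ]
    # M-step: update each expert using only the examples assigned to it.
    new_experts = []
    for z, expert in enumerate(experts):
        assigned = [x for x, a in zip(data, assignments) if a == z]
        new_experts.append(update_fn(expert, assigned) if assigned else expert)
    return new_experts, assignments


def squared_error(expert, x):
    # Toy loss: the "expert" is just a scalar prediction.
    return (expert - x) ** 2


def refit_to_mean(expert, assigned):
    # Toy update: move the expert to the mean of its assigned examples.
    return sum(assigned) / len(assigned)


# Toy data with two clusters and two deliberately rough initial experts.
data = [0.0, 0.2, 0.1, 5.0, 5.2, 4.9]
experts, assignments = hard_em_step([1.0, 4.0], data, squared_error, refit_to_mean)
```

At inference time, a model of this kind can produce one hypothesis per expert, which is one way to read the "parallel greedy decoding" mentioned in the abstract.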
FindHer: a Filter to Find Women Experts
Title | FindHer: a Filter to Find Women Experts |
Authors | Gabriela Ferraro, Zoe Piper, Rebecca Hinton |
Abstract | |
Tasks | |
Published | 2019-04-01 |
URL | https://www.aclweb.org/anthology/U19-1019/ |
https://www.aclweb.org/anthology/U19-1019 | |
PWC | https://paperswithcode.com/paper/findher-a-filter-to-find-women-experts |
Repo | |
Framework | |
Distillation-Based Training for Multi-Exit Architectures
Title | Distillation-Based Training for Multi-Exit Architectures |
Authors | Mary Phuong, Christoph H. Lampert |
Abstract | Multi-exit architectures, in which a stack of processing layers is interleaved with early output layers, allow the processing of a test example to stop early and thus save computation time and/or energy. In this work, we propose a new training procedure for multi-exit architectures based on the principle of knowledge distillation. The method encourages early exits to mimic later, more accurate exits, by matching their probability outputs. Experiments on CIFAR100 and ImageNet show that distillation-based training significantly improves the accuracy of early exits while maintaining state-of-the-art accuracy for late ones. The method is particularly beneficial when training data is limited and also allows a straightforward extension to semi-supervised learning, i.e., making use of unlabeled data at training time. Moreover, it takes only a few lines to implement and imposes almost no computational overhead at training time, and none at all at test time. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Phuong_Distillation-Based_Training_for_Multi-Exit_Architectures_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Phuong_Distillation-Based_Training_for_Multi-Exit_Architectures_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/distillation-based-training-for-multi-exit |
Repo | |
Framework | |
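
The abstract above describes early exits being trained to match the probability outputs of later exits. The following is a minimal sketch of one such loss, assuming the final exit acts as the teacher for all earlier exits. The temperature, the mixing weight `alpha`, and every function name here are illustrative assumptions; the paper's exact formulation is not given in this listing.

```python
import math


def softmax(logits, temperature=1.0):
    # Numerically stable softmax with an optional temperature.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    # Cross-entropy H(target, predicted); zero-probability targets contribute nothing.
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def multi_exit_distillation_loss(exit_logits, label, temperature=2.0, alpha=0.5):
    """exit_logits: list of logit vectors ordered from earliest to final exit."""
    num_classes = len(exit_logits[0])
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    # Assumption: the final exit is the teacher and is treated as a constant target.
    teacher = softmax(exit_logits[-1], temperature)
    total = 0.0
    for k, logits in enumerate(exit_logits):
        loss = cross_entropy(one_hot, softmax(logits))  # ordinary label loss
        if k < len(exit_logits) - 1:
            # Early exits additionally match the teacher's softened probabilities.
            student = softmax(logits, temperature)
            loss = (1 - alpha) * loss + alpha * cross_entropy(teacher, student)
        total += loss
    return total / len(exit_logits)
```

In an autodiff framework, the teacher probabilities would typically be detached from the graph so that gradients flow only into the early exits through the distillation term; that detail is also an assumption here.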
Characterizing the Response Space of Questions: a Corpus Study for English and Polish
Title | Characterizing the Response Space of Questions: a Corpus Study for English and Polish |
Authors | Jonathan Ginzburg, Zulipiye Yusupujiang, Chuyuan Li, Kexin Ren, Paweł Łupkowski |
Abstract | The main aim of this paper is to provide a characterization of the response space for questions using a taxonomy grounded in a dialogical formal semantics. As a starting point we take the typology for responses in the form of questions provided in (Lupkowski and Ginzburg, 2016). This work develops a wide coverage taxonomy for question/question sequences observable in corpora including the BNC, CHILDES, and BEE, as well as formal modelling of all the postulated classes. Our aim is to extend this work to cover all responses to questions. We present the extended typology of responses to questions based on corpus studies of BNC, BEE and Maptask, which include 506, 262, and 467 question/response pairs respectively. We compare the data for English with data from Polish using the Spokes corpus (205 question/response pairs). We discuss annotation reliability and disagreement analysis. We sketch how each class can be formalized using a dialogical semantics appropriate for dialogue management. |
Tasks | Dialogue Management |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-5937/ |
https://www.aclweb.org/anthology/W19-5937 | |
PWC | https://paperswithcode.com/paper/characterizing-the-response-space-of |
Repo | |
Framework | |
Elliptical Perturbations for Differential Privacy
Title | Elliptical Perturbations for Differential Privacy |
Authors | Matthew Reimherr, Jordan Awan |
Abstract | We study elliptical distributions in locally convex vector spaces, and determine conditions when they can or cannot be used to satisfy differential privacy (DP). A requisite condition for a sanitized statistical summary to satisfy DP is that the corresponding privacy mechanism must induce equivalent probability measures for all possible input databases. We show that elliptical distributions with the same dispersion operator, $C$, are equivalent if the difference of their means lies in the Cameron-Martin space of $C$. In the case of releasing finite-dimensional summaries using elliptical perturbations, we show that the privacy parameter $\epsilon$ can be computed in terms of a one-dimensional maximization problem. We apply this result to consider multivariate Laplace, $t$, Gaussian, and $K$-norm noise. Surprisingly, we show that the multivariate Laplace noise does not achieve $\epsilon$-DP in any dimension greater than one. Finally, we show that when the dimension of the space is infinite, no elliptical distribution can be used to give $\epsilon$-DP; only $(\epsilon,\delta)$-DP is possible. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9208-elliptical-perturbations-for-differential-privacy |
http://papers.nips.cc/paper/9208-elliptical-perturbations-for-differential-privacy.pdf | |
PWC | https://paperswithcode.com/paper/elliptical-perturbations-for-differential |
Repo | |
Framework | |
A Platform Agnostic Dual-Strand Hate Speech Detector
Title | A Platform Agnostic Dual-Strand Hate Speech Detector |
Authors | Johannes Skjeggestad Meyer, Björn Gambäck |
Abstract | Hate speech detectors must be applicable across a multitude of services and platforms, and there is hence a need for detection approaches that do not depend on any information specific to a given platform. For instance, the information stored about the text's author may differ between services, and so using such data would reduce a system's general applicability. The paper thus focuses on using exclusively text-based input in the detection, in an optimised architecture combining Convolutional Neural Networks and Long Short-Term Memory-networks. The hate speech detector merges two strands with character n-grams and word embeddings to produce the final classification, and is shown to outperform comparable previous approaches. |
Tasks | Word Embeddings |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3516/ |
https://www.aclweb.org/anthology/W19-3516 | |
PWC | https://paperswithcode.com/paper/a-platform-agnostic-dual-strand-hate-speech |
Repo | |
Framework | |
MSIT_SRIB at MEDIQA 2019: Knowledge Directed Multi-task Framework for Natural Language Inference in Clinical Domain.
Title | MSIT_SRIB at MEDIQA 2019: Knowledge Directed Multi-task Framework for Natural Language Inference in Clinical Domain. |
Authors | Sahil Chopra, Ankita Gupta, Anupama Kaushik |
Abstract | In this paper, we present the Biomedical Multi-Task Deep Neural Network (Bio-MTDNN) for the NLI task of the MediQA 2019 challenge. Bio-MTDNN utilizes a “transfer learning” based paradigm where not only the source and target domains are different but also the source and target tasks are varied, although related. Further, Bio-MTDNN integrates knowledge from external sources such as clinical databases (UMLS), enhancing its performance on the clinical domain. Our proposed method outperformed the official baseline and other prior models (such as ESIM and Infersent on dev set) by a considerable margin, as evident from our experimental results. |
Tasks | Natural Language Inference, Transfer Learning |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5052/ |
https://www.aclweb.org/anthology/W19-5052 | |
PWC | https://paperswithcode.com/paper/msit_srib-at-mediqa-2019-knowledge-directed |
Repo | |
Framework | |
Ontological attention ensembles for capturing semantic concepts in ICD code prediction from clinical text
Title | Ontological attention ensembles for capturing semantic concepts in ICD code prediction from clinical text |
Authors | Matus Falis, Maciej Pajak, Aneta Lisowska, Patrick Schrempf, Lucas Deckers, Shadia Mikhael, Sotirios Tsaftaris, Alison O'Neil |
Abstract | We present a semantically interpretable system for automated ICD coding of clinical text documents. Our contribution is an ontological attention mechanism which matches the structure of the ICD ontology, in which shared attention vectors are learned at each level of the hierarchy, and combined into label-dependent ensembles. Analysis of the attention heads shows that shared concepts are learned by the lowest common denominator node. This allows child nodes to focus on the differentiating concepts, leading to efficient learning and memory usage. Visualisation of the multi-level attention on the original text allows explanation of the code predictions according to the semantics of the ICD ontology. On the MIMIC-III dataset we achieve a 2.7% absolute (11% relative) improvement from 0.218 to 0.245 macro-F1 score compared to the previous state of the art across 3,912 codes. Finally, we analyse the labelling inconsistencies arising from different coding practices which limit performance on this task. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6220/ |
https://www.aclweb.org/anthology/D19-6220 | |
PWC | https://paperswithcode.com/paper/ontological-attention-ensembles-for-capturing |
Repo | |
Framework | |
NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection
Title | NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection |
Authors | Lukas Lange, Heike Adel, Jannik Strötgen |
Abstract | Named entity recognition has been extensively studied on English news texts. However, the transfer to other domains and languages is still a challenging problem. In this paper, we describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019. Aiming at pharmacological entity detection in Spanish texts, the task provides a non-standard domain and language setting. However, we propose an architecture that requires neither language nor domain expertise. We treat the task as a sequence labeling task and experiment with attention-based embedding selection and the training on automatically annotated data to further improve our system's performance. Our system achieves promising results, especially by combining the different techniques, and reaches up to 88.6% F1 in the competition. |
Tasks | Named Entity Recognition |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5705/ |
https://www.aclweb.org/anthology/D19-5705 | |
PWC | https://paperswithcode.com/paper/nlnde-enhancing-neural-sequence-taggers-with |
Repo | |
Framework | |
Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering
Title | Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering |
Authors | Asma Ben Abacha, Chaitanya Shivade, Dina Demner-Fushman |
Abstract | This paper presents the MEDIQA 2019 shared task organized at the ACL-BioNLP workshop. The shared task is motivated by a need to develop relevant methods, techniques and gold standards for inference and entailment in the medical domain, and their application to improve domain specific information retrieval and question answering systems. MEDIQA 2019 includes three tasks: Natural Language Inference (NLI), Recognizing Question Entailment (RQE), and Question Answering (QA) in the medical domain. 72 teams participated in the challenge, achieving an accuracy of 98% in the NLI task, 74.9% in the RQE task, and 78.3% in the QA task. In this paper, we describe the tasks, the datasets, and the participants' approaches and results. We hope that this shared task will attract further research efforts in textual inference, question entailment, and question answering in the medical domain. |
Tasks | Information Retrieval, Natural Language Inference, Question Answering |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5039/ |
https://www.aclweb.org/anthology/W19-5039 | |
PWC | https://paperswithcode.com/paper/overview-of-the-mediqa-2019-shared-task-on |
Repo | |
Framework | |
R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network
Title | R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network |
Authors | Bin Zhu, Chong-Wah Ngo, Jingjing Chen, Yanbin Hao |
Abstract | Representing procedure text such as recipes for cross-modal retrieval is inherently a difficult problem, not to mention generating images from recipes for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of generating images from procedure text for the retrieval problem. The motivation for using a GAN is twofold: learning compatible cross-modal features in an adversarial way, and explaining search results by showing the images generated from recipes. The novelty of R2GAN comes from its architecture design: specifically, a GAN with one generator and dual discriminators is used, which makes the generation of images from recipes a feasible idea. Furthermore, empowered by the generated images, a two-level ranking loss in both embedding and image spaces is considered. These add-ons not only result in excellent retrieval performance, but also generate close-to-realistic food images useful for explaining the ranking of recipes. On the recipe1M dataset, R2GAN demonstrates high scalability to data size, outperforms all the existing approaches, and generates images that are intuitive for humans when interpreting the search results. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Zhu_R2GAN_Cross-Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhu_R2GAN_Cross-Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/r2gan-cross-modal-recipe-retrieval-with |
Repo | |
Framework | |
Diffeomorphic registration with intensity transformation and missing data: Application to 3D digital pathology of Alzheimer’s disease
Title | Diffeomorphic registration with intensity transformation and missing data: Application to 3D digital pathology of Alzheimer’s disease |
Authors | Daniel Tward, Timothy Brown, Yusuke Kageyama, Jaymin Patel, Zhipeng Hou, Susumu Mori, Marilyn Albert, Juan Troncoso, Michael Miller |
Abstract | This paper examines the problem of diffeomorphic image mapping in the presence of differing image intensity profiles and missing data. Our motivation comes from the problem of aligning 3D brain MRI with 100 micron isotropic resolution, to histology sections with 1 micron in plane resolution. Multiple stains, as well as damaged, folded, or missing tissue are common in this situation. We overcome these challenges by introducing two new concepts. Cross modality image matching is achieved by jointly estimating polynomial transformations of the atlas intensity, together with pose and deformation parameters. Missing data is accommodated via a multiple atlas selection procedure where several atlases may be of homogeneous intensity and correspond to “background” or “artifact”. The two concepts are combined within an Expectation Maximization algorithm, where atlas selection posteriors and deformation parameters are updated iteratively, and polynomial coefficients are computed in closed form. We show results for 3D reconstruction of digital pathology and MRI in standard atlas coordinates. In conjunction with convolutional neural networks, we quantify the 3D density distribution of tauopathy throughout the medial temporal lobe of an Alzheimer’s disease postmortem specimen. Author summary: Our work in Alzheimer’s disease (AD) is attempting to connect histopathology at autopsy and longitudinal clinical magnetic resonance imaging (MRI), combining the strengths of each modality in a common coordinate system. We are bridging this gap by using post mortem high resolution MRI to reconstruct digital pathology in 3D. This image registration problem is challenging because it combines images from different modalities in the presence of missing tissue and artifacts. We overcome this challenge by developing a new registration technique that simultaneously classifies each pixel as “good data” / “missing tissue” / “artifact”, learns a contrast transformation between modalities, and computes deformation parameters. We name this technique “(D)eformable (R)egistration and (I)ntensity (T)ransformation with (M)issing (D)ata”, pronounced as “Dr. It, M.D.”. In conjunction with convolutional neural networks, we use this technique to map the three dimensional distribution of tau tangles in the medial temporal lobe of an AD postmortem specimen. |
Tasks | 3D Reconstruction, Deformable Medical Image Registration, Diffeomorphic Medical Image Registration, Image Registration |
Published | 2019-01-04 |
URL | https://doi.org/10.1101/494005 |
https://www.biorxiv.org/content/biorxiv/early/2019/01/04/494005.full-text.pdf | |
PWC | https://paperswithcode.com/paper/diffeomorphic-registration-with-intensity |
Repo | |
Framework | |
A Comparison of Context-sensitive Models for Lexical Substitution
Title | A Comparison of Context-sensitive Models for Lexical Substitution |
Authors | Aina Garí Soler, Anne Cocos, Marianna Apidianaki, Chris Callison-Burch |
Abstract | Word embedding representations provide good estimates of word meaning and give state-of-the-art performance in semantic tasks. Embedding approaches differ as to whether and how they account for the context surrounding a word. We present a comparison of different word and context representations on the task of proposing substitutes for a target word in context (lexical substitution). We also experiment with tuning contextualized word embeddings on a dataset of sense-specific instances for each target word. We show that powerful contextualized word representations, which give high performance in several semantics-related tasks, deal less well with the subtle in-context similarity relationships needed for substitution. This is better handled by models trained with this objective in mind, where the inter-dependence between word and context representations is explicitly modeled during training. |
Tasks | Word Embeddings |
Published | 2019-05-01 |
URL | https://www.aclweb.org/anthology/W19-0423/ |
https://www.aclweb.org/anthology/W19-0423 | |
PWC | https://paperswithcode.com/paper/a-comparison-of-context-sensitive-models-for |
Repo | |
Framework | |
Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics
Title | Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics |
Authors | David Schlangen |
Abstract | Propelling, and propelled by, the “deep learning revolution”, recent years have seen the introduction of ever larger corpora of images annotated with natural language expressions. We survey some of these corpora, taking a perspective that reverses the usual directionality, as it were, by viewing the images as semantic annotation of the natural language expressions. We discuss datasets that can be derived from the corpora, and tasks of potential interest for computational semanticists that can be defined on those. In this, we make use of relations provided by the corpora (namely, the link between expression and image, and that between two expressions linked to the same image) and relations that we can add (similarity relations between expressions, or between images). Specifically, we show that in this way we can create data that can be used to learn and evaluate lexical and compositional grounded semantics, and we show that the “linked to same image” relation tracks a semantic implication relation that is recognisable to annotators even in the absence of the linking image as evidence. Finally, as an example of possible benefits of this approach, we show that an exemplar-model-based approach to implication beats a (simple) distributional space-based one on some derived datasets, while lending itself to explainability. |
Tasks | |
Published | 2019-05-01 |
URL | https://www.aclweb.org/anthology/W19-0424/ |
https://www.aclweb.org/anthology/W19-0424 | |
PWC | https://paperswithcode.com/paper/natural-language-semantics-with-pictures-some-1 |
Repo | |
Framework | |
Moral Stance Recognition and Polarity Classification from Twitter and Elicited Text
Title | Moral Stance Recognition and Polarity Classification from Twitter and Elicited Text |
Authors | Wesley Santos, Ivandré Paraboni |
Abstract | We introduce a labelled corpus of stances about moral issues for the Brazilian Portuguese language, and present reference results for both the stance recognition and polarity classification tasks. The corpus is built from Twitter and further expanded with data elicited through crowd sourcing and labelled by their own authors. Put together, the corpus and reference results are expected to be taken as a baseline for further studies in the field of stance recognition and polarity classification from text. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1123/ |
https://www.aclweb.org/anthology/R19-1123 | |
PWC | https://paperswithcode.com/paper/moral-stance-recognition-and-polarity |
Repo | |
Framework | |