January 25, 2020

Paper Group NANR 96

The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task

Title The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task
Authors Javier Iranzo-Sánchez, Gonçal Garcés Díaz-Munío, Jorge Civera, Alfons Juan
Abstract This paper describes the participation of the MLLP research group of the Universitat Politècnica de València in the WMT 2019 News Translation Shared Task. In this edition, we have submitted systems for the German ↔ English and German ↔ French language pairs, participating in both directions of each pair. Our submitted systems, based on the Transformer architecture, make ample use of data filtering, synthetic data and domain adaptation through fine-tuning.
Tasks Domain Adaptation, Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5320/
PDF https://www.aclweb.org/anthology/W19-5320
PWC https://paperswithcode.com/paper/the-mllp-upv-supervised-machine-translation
Repo
Framework

Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation

Title Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation
Authors Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, Anderson Nascimento
Abstract Classification of personal text messages has many useful applications in surveillance, e-commerce, and mental health care, to name a few. Giving applications access to personal texts can easily lead to (un)intentional privacy violations. We propose the first privacy-preserving solution for text classification that is provably secure. Our method, which is based on Secure Multiparty Computation (SMC), encompasses both feature extraction from texts, and subsequent classification with logistic regression and tree ensembles. We prove that when using our secure text classification method, the application does not learn anything about the text, and the author of the text does not learn anything about the text classification model used by the application beyond what is given by the classification result itself. We perform end-to-end experiments with an application for detecting hate speech against women and immigrants, demonstrating excellent runtime results without loss of accuracy.
Tasks Text Classification
Published 2019-12-01
URL http://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation
PDF http://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation.pdf
PWC https://paperswithcode.com/paper/privacy-preserving-classification-of-personal-1
Repo
Framework

CUNI Submission for Low-Resource Languages in WMT News 2019

Title CUNI Submission for Low-Resource Languages in WMT News 2019
Authors Tom Kocmi, Ondřej Bojar
Abstract This paper describes the CUNI submission to the WMT 2019 News Translation Shared Task for the low-resource languages: Gujarati-English and Kazakh-English. We participated in both language pairs in both translation directions. Our system combines transfer learning from a different high-resource language pair followed by training on backtranslated monolingual data. Thanks to the simultaneous training in both directions, we can iterate the backtranslation process. We are using the Transformer model in a constrained submission.
Tasks Transfer Learning
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5322/
PDF https://www.aclweb.org/anthology/W19-5322
PWC https://paperswithcode.com/paper/cuni-submission-for-low-resource-languages-in
Repo
Framework

Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese

Title Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese
Authors Marco Antonio Sobrevilla Cabezudo, Thiago Pardo
Abstract Abstract Meaning Representation (AMR) is a recent and prominent semantic representation with good acceptance and several applications in the Natural Language Processing area. For English, there is a large annotated corpus (with approximately 39K sentences) that supports the research with the representation. However, to the best of our knowledge, there is only one restricted corpus for Portuguese, which contains 1,527 sentences. In this context, this paper presents an effort to build a general purpose AMR-annotated corpus for Brazilian Portuguese by translating and adapting AMR English guidelines. Our results show that such approach is feasible, but there are some challenging phenomena to solve. More than this, efforts are necessary to increase the coverage of the corresponding lexical resource that supports the annotation.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4028/
PDF https://www.aclweb.org/anthology/W19-4028
PWC https://paperswithcode.com/paper/towards-a-general-abstract-meaning
Repo
Framework

Talking about what is not there: Generating indefinite referring expressions in Minecraft

Title Talking about what is not there: Generating indefinite referring expressions in Minecraft
Authors Arne Köhn, Alexander Koller
Abstract When generating technical instructions, it is often necessary to describe an object that does not exist yet. For example, an NLG system which explains how to build a house needs to generate sentences like "build *a wall of height five to your left*" and "now build a wall on the other side." Generating (indefinite) referring expressions to objects that do not exist yet is fundamentally different from generating the usual definite referring expressions, because the new object must be distinguished from an infinite set of possible alternatives. We formalize this problem and present an algorithm for generating such expressions, in the context of generating building instructions within the Minecraft video game.
Tasks
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-8601/
PDF https://www.aclweb.org/anthology/W19-8601
PWC https://paperswithcode.com/paper/talking-about-what-is-not-there-generating
Repo
Framework

The NiuTrans Machine Translation Systems for WMT19

Title The NiuTrans Machine Translation Systems for WMT19
Authors Bei Li, Yinqiao Li, Chen Xu, Ye Lin, Jiqiang Liu, Hui Liu, Ziyang Wang, Yuhao Zhang, Nuo Xu, Zeyang Wang, Kai Feng, Hexuan Chen, Tengbo Liu, Yanyang Li, Qiang Wang, Tong Xiao, Jingbo Zhu
Abstract This paper describes the NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5325/
PDF https://www.aclweb.org/anthology/W19-5325
PWC https://paperswithcode.com/paper/the-niutrans-machine-translation-systems-for
Repo
Framework

Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation

Title Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation
Authors Patrick Littell, Chi-kiu Lo, Samuel Larkin, Darlene Stewart
Abstract We describe the neural machine translation (NMT) system developed at the National Research Council of Canada (NRC) for the Kazakh-English news translation task of the Fourth Conference on Machine Translation (WMT19). Our submission is a multi-source NMT taking both the original Kazakh sentence and its Russian translation as input for translating into English.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5326/
PDF https://www.aclweb.org/anthology/W19-5326
PWC https://paperswithcode.com/paper/multi-source-transformer-for-kazakh-russian
Repo
Framework

Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence

Title Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence
Authors Soham Parikh, Elizabeth Conrad, Oshin Agarwal, Iain Marshall, Byron Wallace, Ani Nenkova
Abstract Standard paradigms for search do not work well in the medical context. Typical information needs, such as retrieving a full list of medical interventions for a given condition, or finding the reported efficacy of a particular treatment with respect to a specific outcome of interest, cannot be straightforwardly posed in typical text-box search. Instead, we propose faceted-search in which a user specifies a condition and then can browse treatments and outcomes that have been evaluated. Choosing from these, they can access randomized control trials (RCTs) describing individual studies. Realizing such a view of the medical evidence requires information extraction techniques to identify the population, interventions, and outcome measures in an RCT. Patients, health practitioners, and biomedical librarians all stand to benefit from such innovation in search of medical evidence. We present an initial prototype of such an interface applied to pre-registered clinical studies. We also discuss pilot studies into the applicability of information extraction methods to allow for similar access to all published trial results.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-2606/
PDF https://www.aclweb.org/anthology/W19-2606
PWC https://paperswithcode.com/paper/browsing-health-information-extraction-to
Repo
Framework

Enthymemetic Conditionals

Title Enthymemetic Conditionals
Authors Eimear Maguire
Abstract To model conditionals in a way that reflects their acceptability, we must include some means of making judgements about whether antecedent and consequent are meaningfully related or not. Enthymemes are non-logical arguments which do not hold up by themselves, but are acceptable through their relation to a topos, an already-known general principle or pattern for reasoning. This paper uses enthymemes and topoi as a way to model the world-knowledge behind these judgements. In doing so, it provides a reformalisation (in TTR) of enthymemes and topoi as networks rather than functions, and information state update rules for conditionals.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-1018/
PDF https://www.aclweb.org/anthology/S19-1018
PWC https://paperswithcode.com/paper/enthymemetic-conditionals
Repo
Framework

Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits

Title Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits
Authors Sivan Sabato
Abstract We study epsilon-best-arm identification, in a setting where during the exploration phase, the cost of each arm pull is proportional to the expected future reward of that arm. We term this setting Pay-Per-Reward. We provide an algorithm for this setting, that with a high probability returns an epsilon-best arm, while incurring a cost that depends only linearly on the total expected reward of all arms, and does not depend at all on the number of arms. Under mild assumptions, the algorithm can be applied also to problems with infinitely many arms.
Tasks Multi-Armed Bandits
Published 2019-12-01
URL http://papers.nips.cc/paper/8554-epsilon-best-arm-identification-in-pay-per-reward-multi-armed-bandits
PDF http://papers.nips.cc/paper/8554-epsilon-best-arm-identification-in-pay-per-reward-multi-armed-bandits.pdf
PWC https://paperswithcode.com/paper/epsilon-best-arm-identification-in-pay-per
Repo
Framework
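
The two definitions the abstract above relies on can be illustrated with a small sketch. It is not the paper's algorithm: it only shows what "epsilon-best" means and how an exploration cost proportional to expected reward adds up. The function names and the `cost_per_unit_reward` constant are hypothetical, and the expected rewards are assumed known here purely for illustration.

```python
def epsilon_best_arms(expected_rewards, epsilon):
    """Return indices of arms whose expected reward is within epsilon of the best."""
    best = max(expected_rewards)
    return [i for i, mu in enumerate(expected_rewards) if mu >= best - epsilon]


def exploration_cost(pull_counts, expected_rewards, cost_per_unit_reward=1.0):
    """Expected exploration cost when each pull costs an amount
    proportional to the expected reward of the pulled arm (assumed model)."""
    return cost_per_unit_reward * sum(
        n * mu for n, mu in zip(pull_counts, expected_rewards)
    )


if __name__ == "__main__":
    rewards = [0.9, 0.85, 0.5]
    print(epsilon_best_arms(rewards, 0.1))
    print(exploration_cost([10, 20, 5], [0.5, 0.25, 1.0]))
```

Under this cost model, pulling low-reward arms is cheap, which is consistent with the abstract's statement that the total cost depends on the total expected reward rather than on the number of arms.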

Enhancing Unsupervised Generative Dependency Parser with Contextual Information

Title Enhancing Unsupervised Generative Dependency Parser with Contextual Information
Authors Wenjuan Han, Yong Jiang, Kewei Tu
Abstract Most of the unsupervised dependency parsers are based on probabilistic generative models that learn the joint distribution of the given sentence and its parse. Probabilistic generative models usually explicitly decompose the desired dependency tree into factorized grammar rules, which lack the global features of the entire sentence. In this paper, we propose a novel probabilistic model called discriminative neural dependency model with valence (D-NDMV) that generates a sentence and its parse from a continuous latent representation, which encodes global contextual information of the generated sentence. We propose two approaches to model the latent representation: the first deterministically summarizes the representation from the sentence and the second probabilistically models the representation conditioned on the sentence. Our approach can be regarded as a new type of autoencoder model for unsupervised dependency parsing that combines the benefits of both generative and discriminative techniques. In particular, our approach breaks the context-free independence assumption in previous generative approaches and therefore becomes more expressive. Our extensive experimental results on seventeen datasets from various sources show that our approach achieves competitive accuracy compared with both generative and discriminative state-of-the-art unsupervised dependency parsers.
Tasks Constituency Grammar Induction, Dependency Grammar Induction
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1526/
PDF https://www.aclweb.org/anthology/P19-1526
PWC https://paperswithcode.com/paper/enhancing-unsupervised-generative-dependency
Repo
Framework

Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization

Title Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization
Authors Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber
Abstract Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings. However, orthogonal mapping only works on language pairs whose embeddings are naturally isomorphic. For non-isomorphic pairs, our method (Iterative Normalization) transforms monolingual embeddings to make orthogonal alignment easier by simultaneously enforcing that (1) individual word vectors are unit length, and (2) each language's average vector is zero. Iterative Normalization consistently improves word translation accuracy of three CLWE methods, with the largest improvement observed on English-Japanese (from 2% to 44% test accuracy).
Tasks Word Embeddings
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1307/
PDF https://www.aclweb.org/anthology/P19-1307
PWC https://paperswithcode.com/paper/are-girls-neko-or-shojo-cross-lingual-1
Repo
Framework
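
The two conditions named in the abstract above (unit-length word vectors and a zero average vector) can be approximated by alternating the two projections. What follows is a minimal pure-Python sketch under that reading, not the authors' released code; the function name and the `rounds` parameter are assumptions made for illustration.

```python
import math


def iterative_normalization(vectors, rounds=5):
    """Alternate unit-length scaling and mean-centering for a fixed number of rounds.

    Each round ends with mean-centering, so the returned vectors have a zero
    average exactly (up to floating-point error) and lengths that are only
    approximately 1 unless the procedure has converged.
    """
    vecs = [list(v) for v in vectors]  # copy so the input is not modified
    dim = len(vecs[0])
    for _ in range(rounds):
        # Step 1: scale every vector to unit Euclidean length.
        for v in vecs:
            norm = math.sqrt(sum(x * x for x in v))
            if norm > 0.0:
                for i in range(dim):
                    v[i] /= norm
        # Step 2: subtract the average vector so the collection has zero mean.
        mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        for v in vecs:
            for i in range(dim):
                v[i] -= mean[i]
    return vecs


if __name__ == "__main__":
    toy_embeddings = [[3.0, 0.0], [0.0, 2.0], [-5.0, 0.0], [0.0, -1.0]]
    for vec in iterative_normalization(toy_embeddings):
        print(vec, math.hypot(vec[0], vec[1]))
```

In this sketch the same transformation would be applied separately to each language's embedding matrix before learning the orthogonal mapping; how many rounds are needed in practice is not stated in the abstract.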

Tilde’s Machine Translation Systems for WMT 2019

Title Tilde’s Machine Translation Systems for WMT 2019
Authors Marcis Pinnis, Rihards Krišlauks, Matīss Rikters
Abstract The paper describes the development process of Tilde's NMT systems for the WMT 2019 shared task on news translation. We trained systems for the English-Lithuanian and Lithuanian-English translation directions in constrained and unconstrained tracks. We build upon the best methods of the previous year's competition and combine them with recent advancements in the field. We also present a new method to ensure source domain adherence in back-translated data. Our systems achieved a shared first place in human evaluation.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5335/
PDF https://www.aclweb.org/anthology/W19-5335
PWC https://paperswithcode.com/paper/tildes-machine-translation-systems-for-wmt-2
Repo
Framework

The Titans at SemEval-2019 Task 6: Offensive Language Identification, Categorization and Target Identification

Title The Titans at SemEval-2019 Task 6: Offensive Language Identification, Categorization and Target Identification
Authors Avishek Garain, Arpan Basu
Abstract This system paper is a description of the system submitted to "SemEval-2019 Task 6", where we had to detect offensive language in Twitter. There were two specific target audiences, immigrants and women. The language of the tweets was English. We were required to first detect whether a tweet contains offensive content, and then we had to find out whether the tweet was targeted against some individual, group or other entity. Finally we were required to classify the targeted audience.
Tasks Language Identification
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2133/
PDF https://www.aclweb.org/anthology/S19-2133
PWC https://paperswithcode.com/paper/the-titans-at-semeval-2019-task-6-offensive
Repo
Framework

Pāli Sandhi – A computational approach

Title Pāli Sandhi – A computational approach
Authors Swati Basapur, Shivani V, Sivaja Nair
Abstract
Tasks
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-7513/
PDF https://www.aclweb.org/anthology/W19-7513
PWC https://paperswithcode.com/paper/pali-sandhi-a-computational-approach
Repo
Framework