Paper Group NANR 96
The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task. Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation. CUNI Submission for Low-Resource Languages in WMT News 2019. Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese. Talking about what is no …
The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task
Title | The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task |
Authors | Javier Iranzo-Sánchez, Gonçal Garcés Díaz-Munío, Jorge Civera, Alfons Juan |
Abstract | This paper describes the participation of the MLLP research group of the Universitat Politècnica de València in the WMT 2019 News Translation Shared Task. In this edition, we have submitted systems for the German ↔ English and German ↔ French language pairs, participating in both directions of each pair. Our submitted systems, based on the Transformer architecture, make ample use of data filtering, synthetic data and domain adaptation through fine-tuning. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5320/ |
https://www.aclweb.org/anthology/W19-5320 | |
PWC | https://paperswithcode.com/paper/the-mllp-upv-supervised-machine-translation |
Repo | |
Framework | |
Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation
Title | Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation |
Authors | Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, Anderson Nascimento |
Abstract | Classification of personal text messages has many useful applications in surveillance, e-commerce, and mental health care, to name a few. Giving applications access to personal texts can easily lead to (un)intentional privacy violations. We propose the first privacy-preserving solution for text classification that is provably secure. Our method, which is based on Secure Multiparty Computation (SMC), encompasses both feature extraction from texts, and subsequent classification with logistic regression and tree ensembles. We prove that when using our secure text classification method, the application does not learn anything about the text, and the author of the text does not learn anything about the text classification model used by the application beyond what is given by the classification result itself. We perform end-to-end experiments with an application for detecting hate speech against women and immigrants, demonstrating excellent runtime results without loss of accuracy. |
Tasks | Text Classification |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation |
http://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation.pdf | |
PWC | https://paperswithcode.com/paper/privacy-preserving-classification-of-personal-1 |
Repo | |
Framework | |
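The abstract above does not say which SMC primitives the authors use. As a rough illustration of the general idea only, the sketch below shows additive secret sharing, a common SMC building block, where a value is split into random shares that reveal nothing individually. The modulus, function names, and two-party default are assumptions made for this example, not details from the paper.

```python
import secrets

# Assumed modulus for this illustration; real protocols choose it to fit their arithmetic.
MODULUS = 2**61 - 1


def share(value, parties=2):
    """Split an integer into additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    # The last share is chosen so that all shares add up to the secret.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine all shares to recover the secret."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Add two shared values; each party only adds its own pair of shares."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]
```

For instance, `reconstruct(add_shared(share(5), share(7)))` recovers 12 without any single party seeing 5 or 7. Secure feature extraction and classification as described in the abstract would need considerably more machinery (for example, secure multiplication and comparison), which this sketch does not attempt.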
CUNI Submission for Low-Resource Languages in WMT News 2019
Title | CUNI Submission for Low-Resource Languages in WMT News 2019 |
Authors | Tom Kocmi, Ondřej Bojar |
Abstract | This paper describes the CUNI submission to the WMT 2019 News Translation Shared Task for the low-resource languages: Gujarati-English and Kazakh-English. We participated in both language pairs in both translation directions. Our system combines transfer learning from a different high-resource language pair followed by training on backtranslated monolingual data. Thanks to the simultaneous training in both directions, we can iterate the backtranslation process. We are using the Transformer model in a constrained submission. |
Tasks | Transfer Learning |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5322/ |
https://www.aclweb.org/anthology/W19-5322 | |
PWC | https://paperswithcode.com/paper/cuni-submission-for-low-resource-languages-in |
Repo | |
Framework | |
Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese
Title | Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese |
Authors | Marco Antonio Sobrevilla Cabezudo, Thiago Pardo |
Abstract | Abstract Meaning Representation (AMR) is a recent and prominent semantic representation with good acceptance and several applications in the Natural Language Processing area. For English, there is a large annotated corpus (with approximately 39K sentences) that supports the research with the representation. However, to the best of our knowledge, there is only one restricted corpus for Portuguese, which contains 1,527 sentences. In this context, this paper presents an effort to build a general purpose AMR-annotated corpus for Brazilian Portuguese by translating and adapting AMR English guidelines. Our results show that such approach is feasible, but there are some challenging phenomena to solve. More than this, efforts are necessary to increase the coverage of the corresponding lexical resource that supports the annotation. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4028/ |
https://www.aclweb.org/anthology/W19-4028 | |
PWC | https://paperswithcode.com/paper/towards-a-general-abstract-meaning |
Repo | |
Framework | |
Talking about what is not there: Generating indefinite referring expressions in Minecraft
Title | Talking about what is not there: Generating indefinite referring expressions in Minecraft |
Authors | Arne Köhn, Alexander Koller |
Abstract | When generating technical instructions, it is often necessary to describe an object that does not exist yet. For example, an NLG system which explains how to build a house needs to generate sentences like "build *a wall of height five to your left*" and "now build a wall on the other side." Generating (indefinite) referring expressions to objects that do not exist yet is fundamentally different from generating the usual definite referring expressions, because the new object must be distinguished from an infinite set of possible alternatives. We formalize this problem and present an algorithm for generating such expressions, in the context of generating building instructions within the Minecraft video game. |
Tasks | |
Published | 2019-10-01 |
URL | https://www.aclweb.org/anthology/W19-8601/ |
https://www.aclweb.org/anthology/W19-8601 | |
PWC | https://paperswithcode.com/paper/talking-about-what-is-not-there-generating |
Repo | |
Framework | |
The NiuTrans Machine Translation Systems for WMT19
Title | The NiuTrans Machine Translation Systems for WMT19 |
Authors | Bei Li, Yinqiao Li, Chen Xu, Ye Lin, Jiqiang Liu, Hui Liu, Ziyang Wang, Yuhao Zhang, Nuo Xu, Zeyang Wang, Kai Feng, Hexuan Chen, Tengbo Liu, Yanyang Li, Qiang Wang, Tong Xiao, Jingbo Zhu |
Abstract | This paper describes the NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5325/ |
https://www.aclweb.org/anthology/W19-5325 | |
PWC | https://paperswithcode.com/paper/the-niutrans-machine-translation-systems-for |
Repo | |
Framework | |
Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation
Title | Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation |
Authors | Patrick Littell, Chi-kiu Lo, Samuel Larkin, Darlene Stewart |
Abstract | We describe the neural machine translation (NMT) system developed at the National Research Council of Canada (NRC) for the Kazakh-English news translation task of the Fourth Conference on Machine Translation (WMT19). Our submission is a multi-source NMT taking both the original Kazakh sentence and its Russian translation as input for translating into English. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5326/ |
https://www.aclweb.org/anthology/W19-5326 | |
PWC | https://paperswithcode.com/paper/multi-source-transformer-for-kazakh-russian |
Repo | |
Framework | |
Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence
Title | Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence |
Authors | Soham Parikh, Elizabeth Conrad, Oshin Agarwal, Iain Marshall, Byron Wallace, Ani Nenkova |
Abstract | Standard paradigms for search do not work well in the medical context. Typical information needs, such as retrieving a full list of medical interventions for a given condition, or finding the reported efficacy of a particular treatment with respect to a specific outcome of interest cannot be straightforwardly posed in typical text-box search. Instead, we propose faceted-search in which a user specifies a condition and then can browse treatments and outcomes that have been evaluated. Choosing from these, they can access randomized control trials (RCTs) describing individual studies. Realizing such a view of the medical evidence requires information extraction techniques to identify the population, interventions, and outcome measures in an RCT. Patients, health practitioners, and biomedical librarians all stand to benefit from such innovation in search of medical evidence. We present an initial prototype of such an interface applied to pre-registered clinical studies. We also discuss pilot studies into the applicability of information extraction methods to allow for similar access to all published trial results. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-2606/ |
https://www.aclweb.org/anthology/W19-2606 | |
PWC | https://paperswithcode.com/paper/browsing-health-information-extraction-to |
Repo | |
Framework | |
Enthymemetic Conditionals
Title | Enthymemetic Conditionals |
Authors | Eimear Maguire |
Abstract | To model conditionals in a way that reflects their acceptability, we must include some means of making judgements about whether antecedent and consequent are meaningfully related or not. Enthymemes are non-logical arguments which do not hold up by themselves, but are acceptable through their relation to a topos, an already-known general principle or pattern for reasoning. This paper uses enthymemes and topoi as a way to model the world-knowledge behind these judgements. In doing so, it provides a reformalisation (in TTR) of enthymemes and topoi as networks rather than functions, and information state update rules for conditionals. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-1018/ |
https://www.aclweb.org/anthology/S19-1018 | |
PWC | https://paperswithcode.com/paper/enthymemetic-conditionals |
Repo | |
Framework | |
Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits
Title | Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits |
Authors | Sivan Sabato |
Abstract | We study epsilon-best-arm identification, in a setting where during the exploration phase, the cost of each arm pull is proportional to the expected future reward of that arm. We term this setting Pay-Per-Reward. We provide an algorithm for this setting, that with a high probability returns an epsilon-best arm, while incurring a cost that depends only linearly on the total expected reward of all arms, and does not depend at all on the number of arms. Under mild assumptions, the algorithm can be applied also to problems with infinitely many arms. |
Tasks | Multi-Armed Bandits |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8554-epsilon-best-arm-identification-in-pay-per-reward-multi-armed-bandits |
http://papers.nips.cc/paper/8554-epsilon-best-arm-identification-in-pay-per-reward-multi-armed-bandits.pdf | |
PWC | https://paperswithcode.com/paper/epsilon-best-arm-identification-in-pay-per |
Repo | |
Framework | |
Enhancing Unsupervised Generative Dependency Parser with Contextual Information
Title | Enhancing Unsupervised Generative Dependency Parser with Contextual Information |
Authors | Wenjuan Han, Yong Jiang, Kewei Tu |
Abstract | Most of the unsupervised dependency parsers are based on probabilistic generative models that learn the joint distribution of the given sentence and its parse. Probabilistic generative models usually explicitly decompose the desired dependency tree into factorized grammar rules, which lack the global features of the entire sentence. In this paper, we propose a novel probabilistic model called discriminative neural dependency model with valence (D-NDMV) that generates a sentence and its parse from a continuous latent representation, which encodes global contextual information of the generated sentence. We propose two approaches to model the latent representation: the first deterministically summarizes the representation from the sentence and the second probabilistically models the representation conditioned on the sentence. Our approach can be regarded as a new type of autoencoder model for unsupervised dependency parsing that combines the benefits of both generative and discriminative techniques. In particular, our approach breaks the context-free independence assumption in previous generative approaches and therefore becomes more expressive. Our extensive experimental results on seventeen datasets from various sources show that our approach achieves competitive accuracy compared with both generative and discriminative state-of-the-art unsupervised dependency parsers. |
Tasks | Constituency Grammar Induction, Dependency Grammar Induction |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1526/ |
https://www.aclweb.org/anthology/P19-1526 | |
PWC | https://paperswithcode.com/paper/enhancing-unsupervised-generative-dependency |
Repo | |
Framework | |
Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization
Title | Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization |
Authors | Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber |
Abstract | Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings. However, orthogonal mapping only works on language pairs whose embeddings are naturally isomorphic. For non-isomorphic pairs, our method (Iterative Normalization) transforms monolingual embeddings to make orthogonal alignment easier by simultaneously enforcing that (1) individual word vectors are unit length, and (2) each language’s average vector is zero. Iterative Normalization consistently improves word translation accuracy of three CLWE methods, with the largest improvement observed on English-Japanese (from 2% to 44% test accuracy). |
Tasks | Word Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1307/ |
https://www.aclweb.org/anthology/P19-1307 | |
PWC | https://paperswithcode.com/paper/are-girls-neko-or-shojo-cross-lingual-1 |
Repo | |
Framework | |
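The two conditions named in the Iterative Normalization abstract (unit-length word vectors and a zero average vector) can be sketched as an alternating loop. This is a minimal illustration under assumptions, not the authors' implementation: the function name, the list-of-lists input format, and the default iteration count are all hypothetical.

```python
import math


def iterative_normalization(vectors, iterations=5):
    """Alternate unit-length scaling and mean-centering on word vectors.

    `vectors` is a list of equal-length lists of floats, one per word.
    The iteration count is an assumed default, not a value from the paper.
    """
    rows = [list(v) for v in vectors]  # copy so the input is not modified
    dim = len(rows[0])
    for _ in range(iterations):
        # Step 1: scale every word vector to unit Euclidean length.
        for row in rows:
            norm = math.sqrt(sum(x * x for x in row))
            if norm > 0.0:
                for j in range(dim):
                    row[j] /= norm
        # Step 2: subtract the mean vector so the average becomes zero.
        mean = [sum(row[j] for row in rows) / len(rows) for j in range(dim)]
        for row in rows:
            for j in range(dim):
                row[j] -= mean[j]
    return rows
```

Because centering is the last step of each pass, the returned vectors have a zero mean, while their lengths are only approximately one; repeating the two steps is what lets both conditions hold together as closely as possible. In the setting the abstract describes, this would be applied to each language's embeddings separately before learning the orthogonal mapping.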
Tilde’s Machine Translation Systems for WMT 2019
Title | Tilde’s Machine Translation Systems for WMT 2019 |
Authors | Marcis Pinnis, Rihards Krišlauks, Matīss Rikters |
Abstract | The paper describes the development process of Tilde’s NMT systems for the WMT 2019 shared task on news translation. We trained systems for the English-Lithuanian and Lithuanian-English translation directions in constrained and unconstrained tracks. We build upon the best methods of the previous year’s competition and combine them with recent advancements in the field. We also present a new method to ensure source domain adherence in back-translated data. Our systems achieved a shared first place in human evaluation. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5335/ |
https://www.aclweb.org/anthology/W19-5335 | |
PWC | https://paperswithcode.com/paper/tildes-machine-translation-systems-for-wmt-2 |
Repo | |
Framework | |
The Titans at SemEval-2019 Task 6: Offensive Language Identification, Categorization and Target Identification
Title | The Titans at SemEval-2019 Task 6: Offensive Language Identification, Categorization and Target Identification |
Authors | Avishek Garain, Arpan Basu |
Abstract | This system paper is a description of the system submitted to "SemEval-2019 Task 6", where we had to detect offensive language in Twitter. There were two specific target audiences, immigrants and women. The language of the tweets was English. We were required to first detect whether a tweet contains offensive content, and then we had to find out whether the tweet was targeted against some individual, group or other entity. Finally we were required to classify the targeted audience. |
Tasks | Language Identification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2133/ |
https://www.aclweb.org/anthology/S19-2133 | |
PWC | https://paperswithcode.com/paper/the-titans-at-semeval-2019-task-6-offensive |
Repo | |
Framework | |
Pāli Sandhi – A computational approach
Title | Pāli Sandhi – A computational approach |
Authors | Swati Basapur, Shivani V, Sivaja Nair |
Abstract | |
Tasks | |
Published | 2019-10-01 |
URL | https://www.aclweb.org/anthology/W19-7513/ |
https://www.aclweb.org/anthology/W19-7513 | |
PWC | https://paperswithcode.com/paper/pali-sandhi-a-computational-approach |
Repo | |
Framework | |