January 25, 2020


Paper Group NANR 96


The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task


Title	The MLLP-UPV Supervised Machine Translation Systems for WMT19 News Translation Task
Authors	Javier Iranzo-Sánchez, Gonçal Garcés Díaz-Munío, Jorge Civera, Alfons Juan
Abstract	This paper describes the participation of the MLLP research group of the Universitat Politècnica de València in the WMT 2019 News Translation Shared Task. In this edition, we have submitted systems for the German ↔ English and German ↔ French language pairs, participating in both directions of each pair. Our submitted systems, based on the Transformer architecture, make ample use of data filtering, synthetic data and domain adaptation through fine-tuning.
Tasks	Domain Adaptation, Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5320/
PDF	https://www.aclweb.org/anthology/W19-5320
PWC	https://paperswithcode.com/paper/the-mllp-upv-supervised-machine-translation
Repo
Framework

Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation


Title	Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation
Authors	Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, Anderson Nascimento
Abstract	Classification of personal text messages has many useful applications in surveillance, e-commerce, and mental health care, to name a few. Giving applications access to personal texts can easily lead to (un)intentional privacy violations. We propose the first privacy-preserving solution for text classification that is provably secure. Our method, which is based on Secure Multiparty Computation (SMC), encompasses both feature extraction from texts, and subsequent classification with logistic regression and tree ensembles. We prove that when using our secure text classification method, the application does not learn anything about the text, and the author of the text does not learn anything about the text classification model used by the application beyond what is given by the classification result itself. We perform end-to-end experiments with an application for detecting hate speech against women and immigrants, demonstrating excellent runtime results without loss of accuracy.
Tasks	Text Classification
Published	2019-12-01
URL	http://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation
PDF	http://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation.pdf
PWC	https://paperswithcode.com/paper/privacy-preserving-classification-of-personal-1
Repo
Framework

CUNI Submission for Low-Resource Languages in WMT News 2019


Title	CUNI Submission for Low-Resource Languages in WMT News 2019
Authors	Tom Kocmi, Ondřej Bojar
Abstract	This paper describes the CUNI submission to the WMT 2019 News Translation Shared Task for the low-resource languages: Gujarati-English and Kazakh-English. We participated in both language pairs in both translation directions. Our system combines transfer learning from a different high-resource language pair followed by training on backtranslated monolingual data. Thanks to the simultaneous training in both directions, we can iterate the backtranslation process. We are using the Transformer model in a constrained submission.
Tasks	Transfer Learning
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5322/
PDF	https://www.aclweb.org/anthology/W19-5322
PWC	https://paperswithcode.com/paper/cuni-submission-for-low-resource-languages-in
Repo
Framework

Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese


Title	Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese
Authors	Marco Antonio Sobrevilla Cabezudo, Thiago Pardo
Abstract	Abstract Meaning Representation (AMR) is a recent and prominent semantic representation with good acceptance and several applications in the Natural Language Processing area. For English, there is a large annotated corpus (with approximately 39K sentences) that supports the research with the representation. However, to the best of our knowledge, there is only one restricted corpus for Portuguese, which contains 1,527 sentences. In this context, this paper presents an effort to build a general purpose AMR-annotated corpus for Brazilian Portuguese by translating and adapting AMR English guidelines. Our results show that such an approach is feasible, but there are some challenging phenomena to solve. More than this, efforts are necessary to increase the coverage of the corresponding lexical resource that supports the annotation.
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4028/
PDF	https://www.aclweb.org/anthology/W19-4028
PWC	https://paperswithcode.com/paper/towards-a-general-abstract-meaning
Repo
Framework

Talking about what is not there: Generating indefinite referring expressions in Minecraft


Title	Talking about what is not there: Generating indefinite referring expressions in Minecraft
Authors	Arne Köhn, Alexander Koller
Abstract	When generating technical instructions, it is often necessary to describe an object that does not exist yet. For example, an NLG system which explains how to build a house needs to generate sentences like "build a wall of height five to your left" and "now build a wall on the other side." Generating (indefinite) referring expressions to objects that do not exist yet is fundamentally different from generating the usual definite referring expressions, because the new object must be distinguished from an infinite set of possible alternatives. We formalize this problem and present an algorithm for generating such expressions, in the context of generating building instructions within the Minecraft video game.
Tasks
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-8601/
PDF	https://www.aclweb.org/anthology/W19-8601
PWC	https://paperswithcode.com/paper/talking-about-what-is-not-there-generating
Repo
Framework

The NiuTrans Machine Translation Systems for WMT19


Title	The NiuTrans Machine Translation Systems for WMT19
Authors	Bei Li, Yinqiao Li, Chen Xu, Ye Lin, Jiqiang Liu, Hui Liu, Ziyang Wang, Yuhao Zhang, Nuo Xu, Zeyang Wang, Kai Feng, Hexuan Chen, Tengbo Liu, Yanyang Li, Qiang Wang, Tong Xiao, Jingbo Zhu
Abstract	This paper describes the NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions.
Tasks	Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5325/
PDF	https://www.aclweb.org/anthology/W19-5325
PWC	https://paperswithcode.com/paper/the-niutrans-machine-translation-systems-for
Repo
Framework

Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation


Title	Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation
Authors	Patrick Littell, Chi-kiu Lo, Samuel Larkin, Darlene Stewart
Abstract	We describe the neural machine translation (NMT) system developed at the National Research Council of Canada (NRC) for the Kazakh-English news translation task of the Fourth Conference on Machine Translation (WMT19). Our submission is a multi-source NMT taking both the original Kazakh sentence and its Russian translation as input for translating into English.
Tasks	Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5326/
PDF	https://www.aclweb.org/anthology/W19-5326
PWC	https://paperswithcode.com/paper/multi-source-transformer-for-kazakh-russian
Repo
Framework

Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence


Title	Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence
Authors	Soham Parikh, Elizabeth Conrad, Oshin Agarwal, Iain Marshall, Byron Wallace, Ani Nenkova
Abstract	Standard paradigms for search do not work well in the medical context. Typical information needs, such as retrieving a full list of medical interventions for a given condition, or finding the reported efficacy of a particular treatment with respect to a specific outcome of interest cannot be straightforwardly posed in typical text-box search. Instead, we propose faceted-search in which a user specifies a condition and then can browse treatments and outcomes that have been evaluated. Choosing from these, they can access randomized control trials (RCTs) describing individual studies. Realizing such a view of the medical evidence requires information extraction techniques to identify the population, interventions, and outcome measures in an RCT. Patients, health practitioners, and biomedical librarians all stand to benefit from such innovation in search of medical evidence. We present an initial prototype of such an interface applied to pre-registered clinical studies. We also discuss pilot studies into the applicability of information extraction methods to allow for similar access to all published trial results.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/W19-2606/
PDF	https://www.aclweb.org/anthology/W19-2606
PWC	https://paperswithcode.com/paper/browsing-health-information-extraction-to
Repo
Framework

Enthymemetic Conditionals


Title	Enthymemetic Conditionals
Authors	Eimear Maguire
Abstract	To model conditionals in a way that reflects their acceptability, we must include some means of making judgements about whether antecedent and consequent are meaningfully related or not. Enthymemes are non-logical arguments which do not hold up by themselves, but are acceptable through their relation to a topos, an already-known general principle or pattern for reasoning. This paper uses enthymemes and topoi as a way to model the world-knowledge behind these judgements. In doing so, it provides a reformalisation (in TTR) of enthymemes and topoi as networks rather than functions, and information state update rules for conditionals.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-1018/
PDF	https://www.aclweb.org/anthology/S19-1018
PWC	https://paperswithcode.com/paper/enthymemetic-conditionals
Repo
Framework

Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits


Title	Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits
Authors	Sivan Sabato
Abstract	We study epsilon-best-arm identification, in a setting where during the exploration phase, the cost of each arm pull is proportional to the expected future reward of that arm. We term this setting Pay-Per-Reward. We provide an algorithm for this setting, that with a high probability returns an epsilon-best arm, while incurring a cost that depends only linearly on the total expected reward of all arms, and does not depend at all on the number of arms. Under mild assumptions, the algorithm can be applied also to problems with infinitely many arms.
Tasks	Multi-Armed Bandits
Published	2019-12-01
URL	http://papers.nips.cc/paper/8554-epsilon-best-arm-identification-in-pay-per-reward-multi-armed-bandits
PDF	http://papers.nips.cc/paper/8554-epsilon-best-arm-identification-in-pay-per-reward-multi-armed-bandits.pdf
PWC	https://paperswithcode.com/paper/epsilon-best-arm-identification-in-pay-per
Repo
Framework
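
To make the terms in the abstract above concrete, here is a minimal sketch. It assumes the usual definition of an epsilon-best arm (expected reward within epsilon of the best arm's) and takes the pay-per-reward cost of a pull to be the arm's expected reward times a constant. The function names and the `cost_per_unit_reward` parameter are hypothetical, and this is not the paper's algorithm, only the quantities it reasons about.

```python
def epsilon_best_arms(means, epsilon):
    """Return indices of arms whose expected reward is within epsilon of the best.

    `means` holds the (normally unknown) expected reward of each arm.
    """
    best = max(means)
    return [i for i, m in enumerate(means) if m >= best - epsilon]


def exploration_cost(means, pulls, cost_per_unit_reward=1.0):
    """Expected pay-per-reward cost: each pull of arm i costs a constant times means[i].

    `pulls[i]` is how many times arm i was pulled during exploration.
    The proportionality constant is an assumption for illustration.
    """
    return cost_per_unit_reward * sum(m * n for m, n in zip(means, pulls))
```

Under this cost model, pulling low-reward arms is cheap, which is consistent with the abstract's claim that the total cost can depend on the total expected reward rather than on the number of arms.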

Enhancing Unsupervised Generative Dependency Parser with Contextual Information


Title	Enhancing Unsupervised Generative Dependency Parser with Contextual Information
Authors	Wenjuan Han, Yong Jiang, Kewei Tu
Abstract	Most of the unsupervised dependency parsers are based on probabilistic generative models that learn the joint distribution of the given sentence and its parse. Probabilistic generative models usually explicitly decompose the desired dependency tree into factorized grammar rules, which lack the global features of the entire sentence. In this paper, we propose a novel probabilistic model called discriminative neural dependency model with valence (D-NDMV) that generates a sentence and its parse from a continuous latent representation, which encodes global contextual information of the generated sentence. We propose two approaches to model the latent representation: the first deterministically summarizes the representation from the sentence and the second probabilistically models the representation conditioned on the sentence. Our approach can be regarded as a new type of autoencoder model for unsupervised dependency parsing that combines the benefits of both generative and discriminative techniques. In particular, our approach breaks the context-free independence assumption in previous generative approaches and therefore becomes more expressive. Our extensive experimental results on seventeen datasets from various sources show that our approach achieves competitive accuracy compared with both generative and discriminative state-of-the-art unsupervised dependency parsers.
Tasks	Constituency Grammar Induction, Dependency Grammar Induction
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1526/
PDF	https://www.aclweb.org/anthology/P19-1526
PWC	https://paperswithcode.com/paper/enhancing-unsupervised-generative-dependency
Repo
Framework

Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization


Title	Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization
Authors	Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber
Abstract	Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings. However, orthogonal mapping only works on language pairs whose embeddings are naturally isomorphic. For non-isomorphic pairs, our method (Iterative Normalization) transforms monolingual embeddings to make orthogonal alignment easier by simultaneously enforcing that (1) individual word vectors are unit length, and (2) each language's average vector is zero. Iterative Normalization consistently improves word translation accuracy of three CLWE methods, with the largest improvement observed on English-Japanese (from 2% to 44% test accuracy).
Tasks	Word Embeddings
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1307/
PDF	https://www.aclweb.org/anthology/P19-1307
PWC	https://paperswithcode.com/paper/are-girls-neko-or-shojo-cross-lingual-1
Repo
Framework
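
The two constraints named in the Iterative Normalization abstract above (unit-length word vectors, zero mean vector per language) can be sketched as an alternating procedure. This is an illustrative sketch only: the function name, the order of the two steps, and the fixed iteration count are assumptions, not details taken from the paper.

```python
import math


def iterative_normalization(vectors, n_iters=5):
    """Alternate unit-length scaling and mean-centering of one language's word vectors.

    `vectors` is a list of equal-length lists of floats; a new list is returned.
    `n_iters` is a hypothetical fixed budget; a convergence check could replace it.
    """
    vecs = [list(v) for v in vectors]  # copy so the input is not modified
    dim = len(vecs[0])
    for _ in range(n_iters):
        # Step 1: scale every vector to unit length (all-zero vectors are left alone).
        for v in vecs:
            norm = math.sqrt(sum(x * x for x in v))
            if norm > 0:
                for j in range(dim):
                    v[j] /= norm
        # Step 2: subtract the average vector so the mean becomes zero.
        mean = [sum(v[j] for v in vecs) / len(vecs) for j in range(dim)]
        for v in vecs:
            for j in range(dim):
                v[j] -= mean[j]
    return vecs
```

Each step can undo part of the other (centering changes lengths, rescaling shifts the mean), which is why the two are repeated rather than applied once. Because centering runs last in this sketch, the returned vectors have zero mean and lengths that are only approximately one.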

Tilde’s Machine Translation Systems for WMT 2019


Title	Tilde’s Machine Translation Systems for WMT 2019
Authors	Marcis Pinnis, Rihards Krišlauks, Matīss Rikters
Abstract	The paper describes the development process of Tilde's NMT systems for the WMT 2019 shared task on news translation. We trained systems for the English-Lithuanian and Lithuanian-English translation directions in constrained and unconstrained tracks. We build upon the best methods of the previous year's competition and combine them with recent advancements in the field. We also present a new method to ensure source domain adherence in back-translated data. Our systems achieved a shared first place in human evaluation.
Tasks	Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5335/
PDF	https://www.aclweb.org/anthology/W19-5335
PWC	https://paperswithcode.com/paper/tildes-machine-translation-systems-for-wmt-2
Repo
Framework

The Titans at SemEval-2019 Task 6: Offensive Language Identification, Categorization and Target Identification


Title	The Titans at SemEval-2019 Task 6: Offensive Language Identification, Categorization and Target Identification
Authors	Avishek Garain, Arpan Basu
Abstract	This system paper is a description of the system submitted to "SemEval-2019 Task 6", where we had to detect offensive language in Twitter. There were two specific target audiences, immigrants and women. The language of the tweets was English. We were required to first detect whether a tweet contains offensive content, and then we had to find out whether the tweet was targeted against some individual, group or other entity. Finally we were required to classify the targeted audience.
Tasks	Language Identification
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2133/
PDF	https://www.aclweb.org/anthology/S19-2133
PWC	https://paperswithcode.com/paper/the-titans-at-semeval-2019-task-6-offensive
Repo
Framework

Pāli Sandhi – A computational approach


Title	Pāli Sandhi – A computational approach
Authors	Swati Basapur, Shivani V, Sivaja Nair
Abstract
Tasks
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-7513/
PDF	https://www.aclweb.org/anthology/W19-7513
PWC	https://paperswithcode.com/paper/pali-sandhi-a-computational-approach
Repo
Framework