January 24, 2020

Paper Group NANR 159

Naive Regularizers for Low-Resource Neural Machine Translation


Title	Naive Regularizers for Low-Resource Neural Machine Translation
Authors	Meriem Beloucif, Ana Valeria Gonzalez, Marcel Bollmann, Anders Søgaard
Abstract	Neural machine translation models have little inductive bias, which can be a disadvantage in low-resource scenarios. Neural models have to be trained on large amounts of data and have been shown to perform poorly when only limited data is available. We show that using naive regularization methods, based on sentence length, punctuation and word frequencies, to penalize translations that are very different from the input sentences, consistently improves the translation quality across multiple low-resource languages. We experiment with 12 language pairs, varying the training data size between 17k to 230k sentence pairs. Our best regularizer achieves an average increase of 1.5 BLEU score and 1.0 TER score across all the language pairs. For example, we achieve a BLEU score of 26.70 on the IWSLT15 English–Vietnamese translation task simply by using relative differences in punctuation as a regularizer.
Tasks	Low-Resource Neural Machine Translation, Machine Translation
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1013/
PDF	https://www.aclweb.org/anthology/R19-1013
PWC	https://paperswithcode.com/paper/naive-regularizers-for-low-resource-neural
Repo
Framework
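
The abstract above mentions using relative differences in punctuation between source and translation as a regularizer. The following is a minimal sketch of such a score, not the authors' implementation: the function name `punctuation_penalty` and the exact formula (absolute count difference divided by the larger count) are assumptions made for illustration, and only ASCII punctuation is counted.

```python
import string


def punctuation_penalty(source: str, translation: str) -> float:
    """Relative difference in punctuation counts (hypothetical formula).

    Returns 0.0 when both sentences contain the same number of ASCII
    punctuation marks and 1.0 when only one of them contains any.
    """
    src_count = sum(ch in string.punctuation for ch in source)
    hyp_count = sum(ch in string.punctuation for ch in translation)
    larger = max(src_count, hyp_count)
    if larger == 0:
        # Neither sentence has punctuation, so there is nothing to penalize.
        return 0.0
    return abs(src_count - hyp_count) / larger
```

In a training setup, a term like this would typically be scaled by a weight and added to the translation loss; how the paper combines it with the loss is not stated in the abstract.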

A Multi-Task Learning Framework for Extracting Bacteria Biotope Information


Title	A Multi-Task Learning Framework for Extracting Bacteria Biotope Information
Authors	Qi Zhang, Chao Liu, Ying Chi, Xuansong Xie, Xiansheng Hua
Abstract	This paper presents a novel transfer multi-task learning method for Bacteria Biotope rel+ner task at BioNLP-OST 2019. To alleviate the data deficiency problem in domain-specific information extraction, we use BERT (Bidirectional Encoder Representations from Transformers) and pre-train it using mask language models and next sentence prediction on both general corpus and medical corpus like PubMed. In fine-tuning stage, we fine-tune the relation extraction layer and mention recognition layer designed by us on the top of BERT to extract mentions and relations simultaneously. The evaluation results show that our method achieves the best performance on all metrics (including slot error rate, precision and recall) in the Bacteria Biotope rel+ner subtask.
Tasks	Multi-Task Learning, Relation Extraction
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5716/
PDF	https://www.aclweb.org/anthology/D19-5716
PWC	https://paperswithcode.com/paper/a-multi-task-learning-framework-for-2
Repo
Framework

Corpus of usage examples: What is it good for?


Title	Corpus of usage examples: What is it good for?
Authors	Timofey Arkhangelskiy
Abstract
Tasks
Published	2019-02-01
URL	https://www.aclweb.org/anthology/W19-6008/
PDF	https://www.aclweb.org/anthology/W19-6008
PWC	https://paperswithcode.com/paper/corpus-of-usage-examples-what-is-it-good-for
Repo
Framework

Solving Vision Problems via Filtering


Title	Solving Vision Problems via Filtering
Authors	Sean I. Young, Aous T. Naman, Bernd Girod, David Taubman
Abstract	We propose a new, filtering approach for solving a large number of regularized inverse problems commonly found in computer vision. Traditionally, such problems are solved by finding the solution to the system of equations that expresses the first-order optimality conditions of the problem. This can be slow if the system of equations is dense due to the use of nonlocal regularization, necessitating iterative solvers such as successive over-relaxation or conjugate gradients. In this paper, we show that similar solutions can be obtained more easily via filtering, obviating the need to solve a potentially dense system of equations using slow iterative methods. Our filtered solutions are very similar to the true ones, but often up to 10 times faster to compute.
Tasks
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Young_Solving_Vision_Problems_via_Filtering_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Young_Solving_Vision_Problems_via_Filtering_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/solving-vision-problems-via-filtering
Repo
Framework

A Modular Tool for Automatic Summarization


Title	A Modular Tool for Automatic Summarization
Authors	Valentin Nyzam, Aurélien Bossard
Abstract	This paper introduces the first fine-grained modular tool for automatic summarization. Open source and written in Java, it is designed to be as straightforward as possible for end-users. Its modular architecture is meant to ease its maintenance and the development and integration of new modules. We hope that it will ease the work of researchers in automatic summarization by providing a reliable baseline for future works as well as an easy way to evaluate methods on different corpora.
Tasks
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-3030/
PDF	https://www.aclweb.org/anthology/P19-3030
PWC	https://paperswithcode.com/paper/a-modular-tool-for-automatic-summarization
Repo
Framework

Comparing Pipelined and Integrated Approaches to Dialectal Arabic Neural Machine Translation


Title	Comparing Pipelined and Integrated Approaches to Dialectal Arabic Neural Machine Translation
Authors	Pamela Shapiro, Kevin Duh
Abstract	When translating diglossic languages such as Arabic, situations may arise where we would like to translate a text but do not know which dialect it is. A traditional approach to this problem is to design dialect identification systems and dialect-specific machine translation systems. However, under the recent paradigm of neural machine translation, shared multi-dialectal systems have become a natural alternative. Here we explore under which conditions it is beneficial to perform dialect identification for Arabic neural machine translation versus using a general system for all dialects.
Tasks	Machine Translation
Published	2019-06-01
URL	https://www.aclweb.org/anthology/W19-1424/
PDF	https://www.aclweb.org/anthology/W19-1424
PWC	https://paperswithcode.com/paper/comparing-pipelined-and-integrated-approaches
Repo
Framework

Variational Bayesian Phylogenetic Inference


Title	Variational Bayesian Phylogenetic Inference
Authors	Cheng Zhang, Frederick A. Matsen IV
Abstract	Bayesian phylogenetic inference is currently done via Markov chain Monte Carlo with simple mechanisms for proposing new states, which hinders exploration efficiency and often requires long runs to deliver accurate posterior estimates. In this paper we present an alternative approach: a variational framework for Bayesian phylogenetic analysis. We approximate the true posterior using an expressive graphical model for tree distributions, called a subsplit Bayesian network, together with appropriate branch length distributions. We train the variational approximation via stochastic gradient ascent and adopt multi-sample based gradient estimators for different latent variables separately to handle the composite latent space of phylogenetic models. We show that our structured variational approximations are flexible enough to provide comparable posterior estimation to MCMC, while requiring less computation due to a more efficient tree exploration mechanism enabled by variational inference. Moreover, the variational approximations can be readily used for further statistical analysis such as marginal likelihood estimation for model comparison via importance sampling. Experiments on both synthetic data and real data Bayesian phylogenetic inference problems demonstrate the effectiveness and efficiency of our methods.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=SJVmjjR9FX
PDF	https://openreview.net/pdf?id=SJVmjjR9FX
PWC	https://paperswithcode.com/paper/variational-bayesian-phylogenetic-inference
Repo
Framework
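
The abstract above notes that a fitted variational approximation can be reused for marginal likelihood estimation via importance sampling. The sketch below shows only the generic estimator on a toy one-dimensional Gaussian model, not a phylogenetic one: the model (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1)), the proposal N(x/2, 0.8²) standing in for a variational approximation, and all function names are assumptions chosen so the result can be checked against the exact marginal N(x; 0, 2).

```python
import math
import random


def normal_pdf(value: float, mean: float, std: float) -> float:
    """Density of a univariate normal distribution."""
    scaled = (value - mean) / std
    return math.exp(-0.5 * scaled * scaled) / (std * math.sqrt(2.0 * math.pi))


def is_marginal_likelihood(x: float, n_samples: int = 20000, seed: int = 0) -> float:
    """Importance-sampling estimate of p(x) for the toy model.

    Draws z from the proposal q and averages p(z) * p(x | z) / q(z).
    """
    rng = random.Random(seed)
    q_mean, q_std = x / 2.0, 0.8  # hypothetical stand-in for a fitted approximation
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(q_mean, q_std)
        joint = normal_pdf(z, 0.0, 1.0) * normal_pdf(x, z, 1.0)
        total += joint / normal_pdf(z, q_mean, q_std)
    return total / n_samples


def exact_marginal_likelihood(x: float) -> float:
    """Closed-form marginal for the toy model: x ~ N(0, 2)."""
    return normal_pdf(x, 0.0, math.sqrt(2.0))
```

Comparing `is_marginal_likelihood(1.0)` with `exact_marginal_likelihood(1.0)` gives a quick sanity check; the proposal is deliberately a little wider than the true posterior so the importance weights stay bounded.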

Empirically Characterizing Overparameterization Impact on Convergence


Title	Empirically Characterizing Overparameterization Impact on Convergence
Authors	Newsha Ardalani, Joel Hestness, Gregory Diamos
Abstract	A long-held conventional wisdom states that larger models train more slowly when using gradient descent. This work challenges this widely-held belief, showing that larger models can potentially train faster despite the increasing computational requirements of each training step. In particular, we study the effect of network structure (depth and width) on halting time and show that larger models—wider models in particular—take fewer training steps to converge. We design simple experiments to quantitatively characterize the effect of overparametrization on weight space traversal. Results show that halting time improves when growing model’s width for three different applications, and the improvement comes from each factor: The distance from initialized weights to converged weights shrinks with a power-law-like relationship, the average step size grows with a power-law-like relationship, and gradient vectors become more aligned with each other during traversal.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=S1lPShAqFm
PDF	https://openreview.net/pdf?id=S1lPShAqFm
PWC	https://paperswithcode.com/paper/empirically-characterizing
Repo
Framework

Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation


Title	Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation
Authors	Xinyan Zhao, Deahan Yu, V.G. Vinod Vydiswaran
Abstract	Identifying mentions of medical concepts in social media is challenging because of high variability in free text. In this paper, we propose a novel neural network architecture, the Collocated LSTM with Attentive Pooling and Aggregated representation (CLAPA), that integrates a bidirectional LSTM model with attention and pooling strategy and utilizes the collocation information from training data to improve the representation of medical concepts. The collocation and aggregation layers improve the model performance on the task of identifying mentions of adverse drug events (ADE) in tweets. Using the dataset made available as part of the workshop shared task, we show that careful selection of neighborhood contexts can help uncover useful local information and improve the overall medical concept representation.
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-3209/
PDF	https://www.aclweb.org/anthology/W19-3209
PWC	https://paperswithcode.com/paper/identifying-adverse-drug-events-mentions-in
Repo
Framework

NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System


Title	NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System
Authors	Amit Kumar, Anil Kumar Singh
Abstract	This paper describes the Machine Translation system for Tamil-English Indic Task organized at WAT 2019. We use Transformer-based architecture for Neural Machine Translation.
Tasks	Machine Translation
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5222/
PDF	https://www.aclweb.org/anthology/D19-5222
PWC	https://paperswithcode.com/paper/nlprl-at-wat2019-transformer-based-tamil
Repo
Framework

Combining Discourse Markers and Cross-lingual Embeddings for Synonym–Antonym Classification


Title	Combining Discourse Markers and Cross-lingual Embeddings for Synonym–Antonym Classification
Authors	Michael Roth, Shyam Upadhyay
Abstract	It is well-known that distributional semantic approaches have difficulty in distinguishing between synonyms and antonyms (Grefenstette, 1992; Padó and Lapata, 2003). Recent work has shown that supervision available in English for this task (e.g., lexical resources) can be transferred to other languages via cross-lingual word embeddings. However, this kind of transfer misses monolingual distributional information available in a target language, such as contrast relations that are indicative of antonymy (e.g. hot … while … cold). In this work, we improve the transfer by exploiting monolingual information, expressed in the form of co-occurrences with discourse markers that convey contrast. Our approach makes use of less than a dozen markers, which can easily be obtained for many languages. Compared to a baseline using only cross-lingual embeddings, we show absolute improvements of 4–10% F1-score in Vietnamese and Hindi.
Tasks	Word Embeddings
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1390/
PDF	https://www.aclweb.org/anthology/N19-1390
PWC	https://paperswithcode.com/paper/combining-discourse-markers-and-cross-lingual
Repo
Framework

ZQM at SemEval-2019 Task9: A Single Layer CNN Based on Pre-trained Model for Suggestion Mining


Title	ZQM at SemEval-2019 Task9: A Single Layer CNN Based on Pre-trained Model for Suggestion Mining
Authors	Qimin Zhou, Zhengxin Zhang, Hao Wu, Linmao Wang
Abstract	This paper describes our system that competed at SemEval 2019 Task 9 - SubTask A: "Suggestion Mining from Online Reviews and Forums". Our system fuses the convolutional neural network and the latest BERT model to conduct suggestion mining. In our system, the input of convolutional neural network is the embedding vectors which are drawn from the pre-trained BERT model. And to enhance the effectiveness of the whole system, the pre-trained BERT model is fine-tuned by provided datasets before the procedure of embedding vectors extraction. Empirical results show the effectiveness of our model which obtained 9th position out of 34 teams with F1 score equals to 0.715.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2226/
PDF	https://www.aclweb.org/anthology/S19-2226
PWC	https://paperswithcode.com/paper/zqm-at-semeval-2019-task9-a-single-layer-cnn
Repo
Framework

One format to rule them all – The emtsv pipeline for Hungarian


Title	One format to rule them all – The emtsv pipeline for Hungarian
Authors	Balázs Indig, Bálint Sass, Eszter Simon, Iván Mittelholcz, Noémi Vadász, Márton Makrai
Abstract	We present a more efficient version of the e-magyar NLP pipeline for Hungarian called emtsv. It integrates Hungarian NLP tools in a framework whose individual modules can be developed or replaced independently and allows new ones to be added. The design also allows convenient investigation and manual correction of the data flow from one module to another. The improvements we publish include effective communication between the modules and support of the use of individual modules both in the chain and standing alone. Our goals are accomplished using extended tsv (tab separated values) files, a simple, uniform, generic and self-documenting input/output format. Our vision is maintaining the system for a long time and making it easier for external developers to fit their own modules into the system, thus sharing existing competencies in the field of processing Hungarian, a mid-resourced language. The source code is available under LGPL 3.0 license at https://github.com/dlt-rilmta/emtsv .
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4018/
PDF	https://www.aclweb.org/anthology/W19-4018
PWC	https://paperswithcode.com/paper/one-format-to-rule-them-all-the-emtsv
Repo
Framework

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy


Title	Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy
Authors	Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang
Abstract	Proximal policy optimization and trust region policy optimization (PPO and TRPO) with actor and critic parametrized by neural networks achieve significant empirical success in deep reinforcement learning. However, due to nonconvexity, the global convergence of PPO and TRPO remains less understood, which separates theory from practice. In this paper, we prove that a variant of PPO and TRPO equipped with overparametrized neural networks converges to the globally optimal policy at a sublinear rate. The key to our analysis is the global convergence of infinite-dimensional mirror descent under a notion of one-point monotonicity, where the gradient and iterate are instantiated by neural networks. In particular, the desirable representation power and optimization geometry induced by the overparametrization of such neural networks allow them to accurately approximate the infinite-dimensional gradient and iterate.
Tasks
Published	2019-12-01
URL	http://papers.nips.cc/paper/9242-neural-trust-regionproximal-policy-optimization-attains-globally-optimal-policy
PDF	http://papers.nips.cc/paper/9242-neural-trust-regionproximal-policy-optimization-attains-globally-optimal-policy.pdf
PWC	https://paperswithcode.com/paper/neural-trust-regionproximal-policy
Repo
Framework

Learning data-derived privacy preserving representations from information metrics


Title	Learning data-derived privacy preserving representations from information metrics
Authors	Martin Bertran, Natalia Martinez, Afroditi Papadaki, Qiang Qiu, Miguel Rodrigues, Guillermo Sapiro
Abstract	It is clear that users should own and control their data and privacy. Utility providers are also becoming more interested in guaranteeing data privacy. Therefore, users and providers can and should collaborate in privacy protecting challenges, and this paper addresses this new paradigm. We propose a framework where the user controls what characteristics of the data they want to share (utility) and what they want to keep private (secret), without necessarily asking the utility provider to change its existing machine learning algorithms. We first analyze the space of privacy-preserving representations and derive natural information-theoretic bounds on the utility-privacy trade-off when disclosing a sanitized version of the data X. We present explicit learning architectures to learn privacy-preserving representations that approach this bound in a data-driven fashion. We describe important use-case scenarios where the utility providers are willing to collaborate with the sanitization process. We study space-preserving transformations where the utility provider can use the same algorithm on original and sanitized data, a critical and novel attribute to help service providers accommodate varying privacy requirements with a single set of utility algorithms. We illustrate this framework through the implementation of three use cases; subject-within-subject, where we tackle the problem of having a face identity detector that works only on a consenting subset of users, an important application, for example, for mobile devices activated by face recognition; gender-and-subject, where we preserve facial verification while hiding the gender attribute for users who choose to do so; and emotion-and-gender, where we hide independent variables, as is the case of hiding gender while preserving emotion detection.
Tasks	Face Recognition
Published	2019-05-01
URL	https://openreview.net/forum?id=SJe2so0qF7
PDF	https://openreview.net/pdf?id=SJe2so0qF7
PWC	https://paperswithcode.com/paper/learning-data-derived-privacy-preserving
Repo
Framework