January 24, 2020

Paper Group NANR 159

Naive Regularizers for Low-Resource Neural Machine Translation. A Multi-Task Learning Framework for Extracting Bacteria Biotope Information. Corpus of usage examples: What is it good for?. Solving Vision Problems via Filtering. A Modular Tool for Automatic Summarization. Comparing Pipelined and Integrated Approaches to Dialectal Arabic Neural Machi …

Naive Regularizers for Low-Resource Neural Machine Translation

Title Naive Regularizers for Low-Resource Neural Machine Translation
Authors Meriem Beloucif, Ana Valeria Gonzalez, Marcel Bollmann, Anders Søgaard
Abstract Neural machine translation models have little inductive bias, which can be a disadvantage in low-resource scenarios. Neural models have to be trained on large amounts of data and have been shown to perform poorly when only limited data is available. We show that using naive regularization methods, based on sentence length, punctuation and word frequencies, to penalize translations that are very different from the input sentences, consistently improves the translation quality across multiple low-resource languages. We experiment with 12 language pairs, varying the training data size from 17k to 230k sentence pairs. Our best regularizer achieves an average increase of 1.5 BLEU score and 1.0 TER score across all the language pairs. For example, we achieve a BLEU score of 26.70 on the IWSLT15 English–Vietnamese translation task simply by using relative differences in punctuation as a regularizer.
Tasks Low-Resource Neural Machine Translation, Machine Translation
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1013/
PDF https://www.aclweb.org/anthology/R19-1013
PWC https://paperswithcode.com/paper/naive-regularizers-for-low-resource-neural
Repo
Framework
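
The abstract above mentions using relative differences in punctuation as a regularizer. A minimal sketch of one way such a penalty could be computed is shown below. The abstract does not give the formula, so the normalization, the function names, and the `weight` parameter are all assumptions made for illustration, not the authors' definition.

```python
import string


def punctuation_count(sentence: str) -> int:
    """Count ASCII punctuation characters in a sentence."""
    return sum(1 for ch in sentence if ch in string.punctuation)


def relative_punctuation_difference(source: str, hypothesis: str) -> float:
    """Relative difference in punctuation counts, in the range [0, 1].

    Assumed normalization: |src - hyp| / max(src, hyp), with 0.0 when
    neither sentence contains punctuation.
    """
    src = punctuation_count(source)
    hyp = punctuation_count(hypothesis)
    if src == 0 and hyp == 0:
        return 0.0
    return abs(src - hyp) / max(src, hyp)


def regularized_loss(nll: float, source: str, hypothesis: str,
                     weight: float = 0.1) -> float:
    """Add the weighted punctuation penalty to a translation loss.

    `nll` stands in for the model's usual training loss; combining the
    two terms by simple addition is a hypothetical choice.
    """
    return nll + weight * relative_punctuation_difference(source, hypothesis)
```

Under these assumptions, a hypothesis that keeps the comma and exclamation mark of "Hello, world!" incurs no penalty, while one that drops both incurs the maximum penalty of 1.0 before weighting.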

A Multi-Task Learning Framework for Extracting Bacteria Biotope Information

Title A Multi-Task Learning Framework for Extracting Bacteria Biotope Information
Authors Qi Zhang, Chao Liu, Ying Chi, Xuansong Xie, Xiansheng Hua
Abstract This paper presents a novel transfer multi-task learning method for the Bacteria Biotope rel+ner task at BioNLP-OST 2019. To alleviate the data deficiency problem in domain-specific information extraction, we use BERT (Bidirectional Encoder Representations from Transformers) and pre-train it using masked language models and next sentence prediction on both a general corpus and a medical corpus like PubMed. In the fine-tuning stage, we fine-tune the relation extraction layer and mention recognition layer designed by us on top of BERT to extract mentions and relations simultaneously. The evaluation results show that our method achieves the best performance on all metrics (including slot error rate, precision and recall) in the Bacteria Biotope rel+ner subtask.
Tasks Multi-Task Learning, Relation Extraction
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5716/
PDF https://www.aclweb.org/anthology/D19-5716
PWC https://paperswithcode.com/paper/a-multi-task-learning-framework-for-2
Repo
Framework

Corpus of usage examples: What is it good for?

Title Corpus of usage examples: What is it good for?
Authors Timofey Arkhangelskiy
Abstract
Tasks
Published 2019-02-01
URL https://www.aclweb.org/anthology/W19-6008/
PDF https://www.aclweb.org/anthology/W19-6008
PWC https://paperswithcode.com/paper/corpus-of-usage-examples-what-is-it-good-for
Repo
Framework

Solving Vision Problems via Filtering

Title Solving Vision Problems via Filtering
Authors Sean I. Young, Aous T. Naman, Bernd Girod, David Taubman
Abstract We propose a new, filtering approach for solving a large number of regularized inverse problems commonly found in computer vision. Traditionally, such problems are solved by finding the solution to the system of equations that expresses the first-order optimality conditions of the problem. This can be slow if the system of equations is dense due to the use of nonlocal regularization, necessitating iterative solvers such as successive over-relaxation or conjugate gradients. In this paper, we show that similar solutions can be obtained more easily via filtering, obviating the need to solve a potentially dense system of equations using slow iterative methods. Our filtered solutions are very similar to the true ones, but often up to 10 times faster to compute.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Young_Solving_Vision_Problems_via_Filtering_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Young_Solving_Vision_Problems_via_Filtering_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/solving-vision-problems-via-filtering
Repo
Framework

A Modular Tool for Automatic Summarization

Title A Modular Tool for Automatic Summarization
Authors Valentin Nyzam, Aurélien Bossard
Abstract This paper introduces the first fine-grained modular tool for automatic summarization. Open source and written in Java, it is designed to be as straightforward as possible for end-users. Its modular architecture is meant to ease its maintenance and the development and integration of new modules. We hope that it will ease the work of researchers in automatic summarization by providing a reliable baseline for future works as well as an easy way to evaluate methods on different corpora.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-3030/
PDF https://www.aclweb.org/anthology/P19-3030
PWC https://paperswithcode.com/paper/a-modular-tool-for-automatic-summarization
Repo
Framework

Comparing Pipelined and Integrated Approaches to Dialectal Arabic Neural Machine Translation

Title Comparing Pipelined and Integrated Approaches to Dialectal Arabic Neural Machine Translation
Authors Pamela Shapiro, Kevin Duh
Abstract When translating diglossic languages such as Arabic, situations may arise where we would like to translate a text but do not know which dialect it is. A traditional approach to this problem is to design dialect identification systems and dialect-specific machine translation systems. However, under the recent paradigm of neural machine translation, shared multi-dialectal systems have become a natural alternative. Here we explore under which conditions it is beneficial to perform dialect identification for Arabic neural machine translation versus using a general system for all dialects.
Tasks Machine Translation
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-1424/
PDF https://www.aclweb.org/anthology/W19-1424
PWC https://paperswithcode.com/paper/comparing-pipelined-and-integrated-approaches
Repo
Framework

Variational Bayesian Phylogenetic Inference

Title Variational Bayesian Phylogenetic Inference
Authors Cheng Zhang, Frederick A. Matsen IV
Abstract Bayesian phylogenetic inference is currently done via Markov chain Monte Carlo with simple mechanisms for proposing new states, which hinders exploration efficiency and often requires long runs to deliver accurate posterior estimates. In this paper we present an alternative approach: a variational framework for Bayesian phylogenetic analysis. We approximate the true posterior using an expressive graphical model for tree distributions, called a subsplit Bayesian network, together with appropriate branch length distributions. We train the variational approximation via stochastic gradient ascent and adopt multi-sample based gradient estimators for different latent variables separately to handle the composite latent space of phylogenetic models. We show that our structured variational approximations are flexible enough to provide comparable posterior estimation to MCMC, while requiring less computation due to a more efficient tree exploration mechanism enabled by variational inference. Moreover, the variational approximations can be readily used for further statistical analysis such as marginal likelihood estimation for model comparison via importance sampling. Experiments on both synthetic data and real data Bayesian phylogenetic inference problems demonstrate the effectiveness and efficiency of our methods.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SJVmjjR9FX
PDF https://openreview.net/pdf?id=SJVmjjR9FX
PWC https://paperswithcode.com/paper/variational-bayesian-phylogenetic-inference
Repo
Framework
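
The abstract above notes that a variational approximation can be reused for marginal likelihood estimation via importance sampling. The sketch below illustrates that general idea on a toy one-dimensional Gaussian model, not on phylogenetic trees. The model (z ~ N(0, 1), x | z ~ N(z, 1)), the Gaussian proposal q, and all function names are assumptions chosen so that the exact answer, x ~ N(0, 2), is known.

```python
import math
import random


def log_normal_pdf(x: float, mean: float, var: float) -> float:
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_marginal_estimate(x: float, q_mean: float, q_var: float,
                          num_samples: int = 1000, seed: int = 0) -> float:
    """Importance-sampling estimate of log p(x) using proposal q(z).

    Toy model (an assumption): z ~ N(0, 1) and x | z ~ N(z, 1).
    The estimate is log of the mean of p(x, z_i) / q(z_i), z_i ~ q.
    """
    rng = random.Random(seed)
    log_weights = []
    for _ in range(num_samples):
        z = rng.gauss(q_mean, math.sqrt(q_var))
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_weights.append(log_joint - log_normal_pdf(z, q_mean, q_var))
    # Log-mean-exp, shifted by the maximum for numerical stability.
    m = max(log_weights)
    return m + math.log(sum(math.exp(w - m) for w in log_weights) / num_samples)
```

In this toy model the exact posterior is N(x/2, 1/2). When q equals it, every importance weight equals p(x) and the estimate is exact up to rounding. A poorer q, such as the prior N(0, 1), still gives a consistent estimate but with higher variance.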

Empirically Characterizing Overparameterization Impact on Convergence

Title Empirically Characterizing Overparameterization Impact on Convergence
Authors Newsha Ardalani, Joel Hestness, Gregory Diamos
Abstract A long-held conventional wisdom states that larger models train more slowly when using gradient descent. This work challenges this widely-held belief, showing that larger models can potentially train faster despite the increasing computational requirements of each training step. In particular, we study the effect of network structure (depth and width) on halting time and show that larger models—wider models in particular—take fewer training steps to converge. We design simple experiments to quantitatively characterize the effect of overparametrization on weight space traversal. Results show that halting time improves when growing model’s width for three different applications, and the improvement comes from each factor: The distance from initialized weights to converged weights shrinks with a power-law-like relationship, the average step size grows with a power-law-like relationship, and gradient vectors become more aligned with each other during traversal.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=S1lPShAqFm
PDF https://openreview.net/pdf?id=S1lPShAqFm
PWC https://paperswithcode.com/paper/empirically-characterizing
Repo
Framework
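
The abstract above attributes faster convergence to three quantities: the distance from initial to converged weights, the average step size, and how well gradient vectors align during traversal. The sketch below shows one way to measure them from a recorded weight trajectory. The exact definitions are assumptions: successive weight differences stand in for gradients (reasonable for plain gradient descent, where each step is a scaled negative gradient), and alignment is taken as the mean cosine similarity of consecutive steps.

```python
import math
from typing import Dict, List, Sequence


def _norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def _sub(a: Sequence[float], b: Sequence[float]) -> List[float]:
    return [x - y for x, y in zip(a, b)]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def traversal_stats(trajectory: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Summarize a weight trajectory of at least three checkpoints.

    Returns the net distance from first to last weights, the average
    step length, and the mean cosine similarity between consecutive
    non-zero steps (a hypothetical proxy for gradient alignment).
    """
    if len(trajectory) < 3:
        raise ValueError("need at least three checkpoints")
    steps = [_sub(b, a) for a, b in zip(trajectory, trajectory[1:])]
    cosines = [
        _dot(s, t) / (_norm(s) * _norm(t))
        for s, t in zip(steps, steps[1:])
        if _norm(s) > 0 and _norm(t) > 0
    ]
    return {
        "net_distance": _norm(_sub(trajectory[-1], trajectory[0])),
        "avg_step_size": sum(_norm(s) for s in steps) / len(steps),
        "alignment": sum(cosines) / len(cosines) if cosines else 0.0,
    }
```

With these definitions, a shorter net distance, a larger average step, and an alignment closer to 1 all point toward fewer steps to reach the converged weights.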

Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation

Title Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation
Authors Xinyan Zhao, Deahan Yu, V.G. Vinod Vydiswaran
Abstract Identifying mentions of medical concepts in social media is challenging because of high variability in free text. In this paper, we propose a novel neural network architecture, the Collocated LSTM with Attentive Pooling and Aggregated representation (CLAPA), that integrates a bidirectional LSTM model with attention and pooling strategy and utilizes the collocation information from training data to improve the representation of medical concepts. The collocation and aggregation layers improve the model performance on the task of identifying mentions of adverse drug events (ADE) in tweets. Using the dataset made available as part of the workshop shared task, we show that careful selection of neighborhood contexts can help uncover useful local information and improve the overall medical concept representation.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3209/
PDF https://www.aclweb.org/anthology/W19-3209
PWC https://paperswithcode.com/paper/identifying-adverse-drug-events-mentions-in
Repo
Framework

NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System

Title NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System
Authors Amit Kumar, Anil Kumar Singh
Abstract This paper describes the Machine Translation system for the Tamil-English Indic Task organized at WAT 2019. We use a Transformer-based architecture for Neural Machine Translation.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5222/
PDF https://www.aclweb.org/anthology/D19-5222
PWC https://paperswithcode.com/paper/nlprl-at-wat2019-transformer-based-tamil
Repo
Framework

Combining Discourse Markers and Cross-lingual Embeddings for Synonym–Antonym Classification

Title Combining Discourse Markers and Cross-lingual Embeddings for Synonym–Antonym Classification
Authors Michael Roth, Shyam Upadhyay
Abstract It is well-known that distributional semantic approaches have difficulty in distinguishing between synonyms and antonyms (Grefenstette, 1992; Padó and Lapata, 2003). Recent work has shown that supervision available in English for this task (e.g., lexical resources) can be transferred to other languages via cross-lingual word embeddings. However, this kind of transfer misses monolingual distributional information available in a target language, such as contrast relations that are indicative of antonymy (e.g. hot … while … cold). In this work, we improve the transfer by exploiting monolingual information, expressed in the form of co-occurrences with discourse markers that convey contrast. Our approach makes use of less than a dozen markers, which can easily be obtained for many languages. Compared to a baseline using only cross-lingual embeddings, we show absolute improvements of 4–10% F1-score in Vietnamese and Hindi.
Tasks Word Embeddings
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1390/
PDF https://www.aclweb.org/anthology/N19-1390
PWC https://paperswithcode.com/paper/combining-discourse-markers-and-cross-lingual
Repo
Framework

ZQM at SemEval-2019 Task9: A Single Layer CNN Based on Pre-trained Model for Suggestion Mining

Title ZQM at SemEval-2019 Task9: A Single Layer CNN Based on Pre-trained Model for Suggestion Mining
Authors Qimin Zhou, Zhengxin Zhang, Hao Wu, Linmao Wang
Abstract This paper describes our system that competed at SemEval 2019 Task 9 - SubTask A: "Suggestion Mining from Online Reviews and Forums". Our system fuses a convolutional neural network and the latest BERT model to conduct suggestion mining. In our system, the input of the convolutional neural network is the embedding vectors drawn from the pre-trained BERT model. To enhance the effectiveness of the whole system, the pre-trained BERT model is fine-tuned on the provided datasets before the embedding vectors are extracted. Empirical results show the effectiveness of our model, which obtained 9th position out of 34 teams with an F1 score of 0.715.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2226/
PDF https://www.aclweb.org/anthology/S19-2226
PWC https://paperswithcode.com/paper/zqm-at-semeval-2019-task9-a-single-layer-cnn
Repo
Framework

One format to rule them all – The emtsv pipeline for Hungarian

Title One format to rule them all – The emtsv pipeline for Hungarian
Authors Balázs Indig, Bálint Sass, Eszter Simon, Iván Mittelholcz, Noémi Vadász, Márton Makrai
Abstract We present a more efficient version of the e-magyar NLP pipeline for Hungarian called emtsv. It integrates Hungarian NLP tools in a framework whose individual modules can be developed or replaced independently and allows new ones to be added. The design also allows convenient investigation and manual correction of the data flow from one module to another. The improvements we publish include effective communication between the modules and support of the use of individual modules both in the chain and standing alone. Our goals are accomplished using extended tsv (tab separated values) files, a simple, uniform, generic and self-documenting input/output format. Our vision is maintaining the system for a long time and making it easier for external developers to fit their own modules into the system, thus sharing existing competencies in the field of processing Hungarian, a mid-resourced language. The source code is available under LGPL 3.0 license at https://github.com/dlt-rilmta/emtsv .
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4018/
PDF https://www.aclweb.org/anthology/W19-4018
PWC https://paperswithcode.com/paper/one-format-to-rule-them-all-the-emtsv
Repo
Framework
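
The abstract above describes modules that communicate through extended TSV files with a self-documenting format, and that can run either chained or standing alone. The sketch below illustrates that pattern in general terms. It is not the emtsv API: the function name, the column names, and the convention of treating blank lines as sentence boundaries are assumptions made for illustration.

```python
from typing import Callable


def run_module(tsv_text: str, source_field: str, target_field: str,
               transform: Callable[[str], str]) -> str:
    """Apply one hypothetical pipeline module to header-carrying TSV text.

    The first line names the columns. The module looks up the column it
    needs by name, appends one new named column, and passes every other
    column through untouched, so modules can be chained in any order
    that satisfies their input requirements.
    """
    lines = tsv_text.splitlines()
    header = lines[0].split("\t")
    if source_field not in header:
        raise ValueError(f"required column missing: {source_field}")
    index = header.index(source_field)
    output = ["\t".join(header + [target_field])]
    for line in lines[1:]:
        if not line:
            # Assumed convention: a blank line separates sentences.
            output.append(line)
            continue
        cells = line.split("\t")
        output.append("\t".join(cells + [transform(cells[index])]))
    return "\n".join(output)
```

Because each module's output carries its own header, a second module can consume it directly, for example `run_module(run_module(text, "form", "lower", str.lower), "lower", "length", lambda s: str(len(s)))`. The intermediate TSV can also be inspected or corrected by hand between the two calls.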

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy

Title Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy
Authors Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang
Abstract Proximal policy optimization and trust region policy optimization (PPO and TRPO) with actor and critic parametrized by neural networks achieve significant empirical success in deep reinforcement learning. However, due to nonconvexity, the global convergence of PPO and TRPO remains less understood, which separates theory from practice. In this paper, we prove that a variant of PPO and TRPO equipped with overparametrized neural networks converges to the globally optimal policy at a sublinear rate. The key to our analysis is the global convergence of infinite-dimensional mirror descent under a notion of one-point monotonicity, where the gradient and iterate are instantiated by neural networks. In particular, the desirable representation power and optimization geometry induced by the overparametrization of such neural networks allow them to accurately approximate the infinite-dimensional gradient and iterate.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9242-neural-trust-regionproximal-policy-optimization-attains-globally-optimal-policy
PDF http://papers.nips.cc/paper/9242-neural-trust-regionproximal-policy-optimization-attains-globally-optimal-policy.pdf
PWC https://paperswithcode.com/paper/neural-trust-regionproximal-policy
Repo
Framework

Learning data-derived privacy preserving representations from information metrics

Title Learning data-derived privacy preserving representations from information metrics
Authors Martin Bertran, Natalia Martinez, Afroditi Papadaki, Qiang Qiu, Miguel Rodrigues, Guillermo Sapiro
Abstract It is clear that users should own and control their data and privacy. Utility providers are also becoming more interested in guaranteeing data privacy. Therefore, users and providers can and should collaborate in privacy protecting challenges, and this paper addresses this new paradigm. We propose a framework where the user controls what characteristics of the data they want to share (utility) and what they want to keep private (secret), without necessarily asking the utility provider to change its existing machine learning algorithms. We first analyze the space of privacy-preserving representations and derive natural information-theoretic bounds on the utility-privacy trade-off when disclosing a sanitized version of the data X. We present explicit learning architectures to learn privacy-preserving representations that approach this bound in a data-driven fashion. We describe important use-case scenarios where the utility providers are willing to collaborate with the sanitization process. We study space-preserving transformations where the utility provider can use the same algorithm on original and sanitized data, a critical and novel attribute to help service providers accommodate varying privacy requirements with a single set of utility algorithms. We illustrate this framework through the implementation of three use cases: subject-within-subject, where we tackle the problem of having a face identity detector that works only on a consenting subset of users, an important application, for example, for mobile devices activated by face recognition; gender-and-subject, where we preserve facial verification while hiding the gender attribute for users who choose to do so; and emotion-and-gender, where we hide independent variables, as is the case of hiding gender while preserving emotion detection.
Tasks Face Recognition
Published 2019-05-01
URL https://openreview.net/forum?id=SJe2so0qF7
PDF https://openreview.net/pdf?id=SJe2so0qF7
PWC https://paperswithcode.com/paper/learning-data-derived-privacy-preserving
Repo
Framework