January 24, 2020

2277 words 11 mins read

Paper Group NANR 128

English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019. ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets. Leveraging Hierarchical Category Knowledge for Data-Imbalanced Multi-Label Diagnostic Text Understanding. NTT Neural Machine Translation Systems at WAT 2019. Lexical Normalization …

English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019

Title English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019
Authors Rui Wang, Haipeng Sun, Kehai Chen, Chenchen Ding, Masao Utiyama, Eiichiro Sumita
Abstract This paper presents NICT's participation (team ID: NICT) in the 6th Workshop on Asian Translation (WAT-2019) shared translation task, specifically the Myanmar (Burmese)-English task in both translation directions. We built neural machine translation (NMT) systems for these tasks. Our NMT systems were trained with language model pretraining, and back-translation was also adopted. Our systems ranked third in English-to-Myanmar and second in Myanmar-to-English according to BLEU score.
Tasks Language Modelling, Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5209/
PDF https://www.aclweb.org/anthology/D19-5209
PWC https://paperswithcode.com/paper/english-myanmar-supervised-and-unsupervised
Repo
Framework
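
The abstract mentions back-translation on top of language-model pretraining. Below is a minimal sketch of how back-translation augments training data in general; this is the generic technique, not NICT's actual pipeline, and `reverse_model_translate` is a hypothetical stand-in for a trained target-to-source model.

```python
# Generic back-translation data augmentation, sketched in plain Python.
# A reverse (target-to-source) model translates monolingual target text,
# and the synthetic pairs are mixed with the genuine parallel data used
# to train the forward model.

def reverse_model_translate(sentence: str) -> str:
    """Stand-in for a trained target-to-source NMT model (hypothetical)."""
    return "<back-translated> " + sentence  # placeholder only

def back_translate(mono_target, parallel):
    """mono_target: target-side sentences; parallel: (src, tgt) pairs."""
    synthetic = [(reverse_model_translate(t), t) for t in mono_target]
    return parallel + synthetic  # the forward model trains on the union

corpus = back_translate(["example target sentence"], [("src", "tgt")])
print(corpus)
```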

ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets

Title ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets
Authors Skatje Myers, Martha Palmer
Abstract This paper proposes using a Bidirectional LSTM-CRF model in order to identify the tense and aspect of verbs. The information that this classifier outputs can be useful for ordering events and can provide a pre-processing step to improve the efficiency of annotating this type of information. This neural network architecture has been successfully employed for other sequential labeling tasks, and we show that it significantly outperforms the rule-based tool TMV-annotator on the PropBank I dataset.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3315/
PDF https://www.aclweb.org/anthology/W19-3315
PWC https://paperswithcode.com/paper/cleartac-verb-tense-aspect-and-form
Repo
Framework
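
A minimal sketch of a BiLSTM-CRF tagger of the kind named in the abstract, assuming PyTorch plus the third-party `pytorch-crf` package (`pip install pytorch-crf`); hyperparameters and the tag inventory are illustrative, not the authors' configuration.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party pytorch-crf package

class BiLSTMCRFTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)  # emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, tokens, tags, mask):
        """Negative log-likelihood of the gold tag sequences."""
        emissions = self.proj(self.lstm(self.embed(tokens))[0])
        return -self.crf(emissions, tags, mask=mask)

    def predict(self, tokens, mask):
        """Viterbi-decode the best tag sequence per sentence."""
        emissions = self.proj(self.lstm(self.embed(tokens))[0])
        return self.crf.decode(emissions, mask=mask)
```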

Leveraging Hierarchical Category Knowledge for Data-Imbalanced Multi-Label Diagnostic Text Understanding

Title Leveraging Hierarchical Category Knowledge for Data-Imbalanced Multi-Label Diagnostic Text Understanding
Authors Shang-Chi Tsai, Ting-Yun Chang, Yun-Nung Chen
Abstract Clinical notes are essential medical documents that record each patient's symptoms. Each record is typically annotated with medical diagnostic codes, which denote diagnoses and treatments. This paper focuses on predicting diagnostic codes from the present-illness descriptions in electronic health records by leveraging domain knowledge. We investigate various losses in a convolutional model that utilize the hierarchical category knowledge of diagnostic codes, allowing the model to share semantics across different labels under the same category. The proposed model not only incorporates this external domain knowledge but also addresses the issue of data imbalance. Experiments on the MIMIC-III benchmark show that the proposed methods effectively utilize category knowledge and provide informative cues that improve performance on the top-ranked diagnostic codes, surpassing the prior state of the art. The investigation and discussion demonstrate the potential of integrating domain knowledge into current machine learning models and suggest directions for future research.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6206/
PDF https://www.aclweb.org/anthology/D19-6206
PWC https://paperswithcode.com/paper/leveraging-hierarchical-category-knowledge
Repo
Framework
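
One plausible reading of the hierarchical losses described above is an auxiliary objective on parent categories, so that codes under the same category share supervision. The sketch below assumes PyTorch and an illustrative 0/1 code-to-category matrix; it is not necessarily one of the exact losses the paper investigates.

```python
import torch
import torch.nn.functional as F

def hierarchical_bce(logits, code_targets, parent_matrix, alpha=0.5):
    """logits, code_targets: (batch, num_codes).
    parent_matrix: (num_codes, num_parents) 0/1 matrix mapping each code
    to its category (illustrative; a real ICD hierarchy would define it)."""
    code_loss = F.binary_cross_entropy_with_logits(logits, code_targets)
    probs = torch.sigmoid(logits)
    # A category counts as "on" if any of its codes is on (max-pooling).
    parent_probs = (probs.unsqueeze(-1) * parent_matrix).amax(dim=1)
    parent_targets = (code_targets.unsqueeze(-1) * parent_matrix).amax(dim=1)
    parent_loss = F.binary_cross_entropy(
        parent_probs.clamp(1e-6, 1 - 1e-6), parent_targets)
    # Auxiliary category loss lets rare codes borrow signal from siblings.
    return code_loss + alpha * parent_loss
```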

NTT Neural Machine Translation Systems at WAT 2019

Title NTT Neural Machine Translation Systems at WAT 2019
Authors Makoto Morishita, Jun Suzuki, Masaaki Nagata
Abstract In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019. This year, we participated in two distinct types of subtasks, a scientific paper subtask and a timely disclosure subtask, where we only considered English-to-Japanese and Japanese-to-English translation directions. We submitted two systems (En-Ja and Ja-En) for the scientific paper subtask and two Ja-En systems (one for texts, one for items) for the timely disclosure subtask. Three of our four systems obtained the best human evaluation performances. We also confirmed that our new additional web-crawled parallel corpus improves the performance in unconstrained settings.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5211/
PDF https://www.aclweb.org/anthology/D19-5211
PWC https://paperswithcode.com/paper/ntt-neural-machine-translation-systems-at-wat-1
Repo
Framework

Lexical Normalization of User-Generated Medical Text

Title Lexical Normalization of User-Generated Medical Text
Authors Anne Dirkson, Suzan Verberne, Wessel Kraaij
Abstract In the medical domain, user-generated social media text is increasingly used as a valuable complementary knowledge source to scientific medical literature. The extraction of this knowledge is complicated by colloquial language use and misspellings. Yet, lexical normalization of such data has not been addressed properly. This paper presents an unsupervised, data-driven spelling correction module for medical social media. Our method outperforms state-of-the-art spelling correction methods and detects mistakes with an F0.5 score of 0.888. Additionally, we present a novel corpus for spelling mistake detection and correction on a medical patient forum.
Tasks Lexical Normalization, Spelling Correction
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3202/
PDF https://www.aclweb.org/anthology/W19-3202
PWC https://paperswithcode.com/paper/lexical-normalization-of-user-generated
Repo
Framework
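
As a hedged illustration of unsupervised, frequency-driven spelling correction: the sketch below is a generic Norvig-style corrector that ranks in-vocabulary candidates within one edit by corpus frequency, not the authors' module, which is more elaborate.

```python
from collections import Counter

def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """All strings one edit (delete/replace/insert/transpose) from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    replaces = [l + c + r[1:] for l, r in splits if r for c in alphabet]
    inserts = [l + c + r for l, r in splits for c in alphabet]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    return set(deletes + replaces + inserts + transposes)

def correct(word, vocab: Counter):
    """Return the most frequent in-vocabulary candidate, else the word."""
    if word in vocab:
        return word
    candidates = edits1(word) & vocab.keys()
    return max(candidates, key=vocab.get) if candidates else word

vocab = Counter("the patient reported nausea and fatigue".split())
print(correct("nausae", vocab))  # -> 'nausea'
```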

Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019

Title Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019
Authors Hideya Mino, Hitoshi Ito, Isao Goto, Ichiro Yamada, Hideki Tanaka, Takenobu Tokunaga
Abstract This paper describes NHK and NHK Engineering System (NHK-ES)'s submission to the newswire translation tasks of WAT 2019 in both directions, Japanese→English and English→Japanese. In addition to the JIJI Corpus that was officially provided by the task organizer, we developed a corpus of 0.22M sentence pairs by manually translating Japanese news sentences into English in a content-equivalent manner. The content-equivalent corpus was effective for improving translation quality, and our systems achieved the best human evaluation scores in the newswire translation tasks at WAT 2019.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5212/
PDF https://www.aclweb.org/anthology/D19-5212
PWC https://paperswithcode.com/paper/neural-machine-translation-system-using-a
Repo
Framework

Combining Translation Memory with Neural Machine Translation

Title Combining Translation Memory with Neural Machine Translation
Authors Akiko Eriguchi, Spencer Rarrick, Hitokazu Matsushita
Abstract In this paper, we report our submission systems (geoduck) to the Timely Disclosure task at the 6th Workshop on Asian Translation (WAT) (Nakazawa et al., 2019). Our system combines translation memory and Neural Machine Translation (NMT) models: the final translation output is taken from the translation memory when the similarity score of a test source sentence exceeds a predefined threshold, and from the NMT system otherwise. We observed that this combined approach significantly improves translation performance on the Timely Disclosure corpus compared to a standalone NMT system. We also conducted source-based direct assessment of the final output, and we discuss the comparison between human references and each system's output.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5214/
PDF https://www.aclweb.org/anthology/D19-5214
PWC https://paperswithcode.com/paper/combining-translation-memory-with-neural
Repo
Framework
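
The selection rule in the abstract is simple enough to sketch. The fuzzy-match measure below (difflib's sequence ratio) and the threshold value are assumptions for illustration; the paper does not commit to them here.

```python
import difflib

def translate(src, tm, nmt_translate, threshold=0.9):
    """tm: dict mapping stored source sentences to their translations.
    Uses the TM's stored translation when the best fuzzy-match score
    exceeds the threshold; otherwise falls back to the NMT system."""
    best_src, best_score = None, 0.0
    for stored_src in tm:
        score = difflib.SequenceMatcher(None, src, stored_src).ratio()
        if score > best_score:
            best_src, best_score = stored_src, score
    if best_score >= threshold:
        return tm[best_src]       # reuse the translation memory entry
    return nmt_translate(src)     # fall back to NMT

tm = {"net sales increased 5% year on year.": "売上高は前年比5%増加した。"}
print(translate("net sales increased 5% year on year.", tm,
                nmt_translate=lambda s: "<NMT output>"))
```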

Abstract Meaning Representation for Human-Robot Dialogue

Title Abstract Meaning Representation for Human-Robot Dialogue
Authors Claire N. Bonial, Lucia Donatelli, Jessica Ervin, Clare R. Voss
Abstract
Tasks
Published 2019-01-01
URL https://www.aclweb.org/anthology/W19-0124/
PDF https://www.aclweb.org/anthology/W19-0124
PWC https://paperswithcode.com/paper/abstract-meaning-representation-for-human
Repo
Framework

Dilated LSTM with attention for Classification of Suicide Notes

Title Dilated LSTM with attention for Classification of Suicide Notes
Authors Annika M. Schoene, George Lacey, Alexander P. Turner, Nina Dethlefs
Abstract In this paper we present a dilated LSTM with an attention mechanism for document-level classification of suicide notes, last statements and depressed notes. We achieve an accuracy of 87.34% compared to competitive baselines of 80.35% (Logistic Model Tree) and 82.27% (Bi-directional LSTM with Attention). Furthermore, we provide an analysis of both the grammatical and thematic content of suicide notes, last statements and depressed notes. We find that the use of personal pronouns, cognitive processes and references to loved ones are most important. Finally, we show through visualisations of attention weights that the dilated LSTM with attention is able to identify the same distinguishing features across documents as the linguistic analysis.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6217/
PDF https://www.aclweb.org/anthology/D19-6217
PWC https://paperswithcode.com/paper/dilated-lstm-with-attention-for
Repo
Framework
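
A rough sketch of an attention-pooled classifier in the spirit of the abstract, assuming PyTorch. Dilation (recurrence over every d-th timestep) is emulated here by running one LSTM over d interleaved subsequences; all sizes are illustrative, not the authors'.

```python
import torch
import torch.nn as nn

class DilatedLSTMAttnClassifier(nn.Module):
    def __init__(self, vocab, n_classes, emb=100, hidden=128, dilation=2):
        super().__init__()
        self.d = dilation
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)       # additive attention scores
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, tokens):                 # tokens: (B, T), T % d == 0
        B, T = tokens.shape
        x = self.embed(tokens)
        # Interleave: each of the d subsequences sees every d-th token, so
        # recurrent connections skip d timesteps in the original order.
        x = x.view(B, T // self.d, self.d, -1).transpose(1, 2)
        x = x.reshape(B * self.d, T // self.d, -1)
        h, _ = self.lstm(x)
        h = h.reshape(B, self.d, T // self.d, -1).transpose(1, 2)
        h = h.reshape(B, T, -1)                # back to original order
        w = torch.softmax(self.attn(h).squeeze(-1), dim=1)  # (B, T)
        doc = (w.unsqueeze(-1) * h).sum(dim=1)  # attention-weighted sum
        return self.out(doc)
```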

Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019

Title Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
Authors Kenji Imamura, Eiichiro Sumita
Abstract This paper describes the NICT-2 neural machine translation system at the 6th Workshop on Asian Translation. This system employs the standard Transformer model but features the following two characteristics. One is the long warm-up strategy, which warms up the learning rate at the start of training for longer than conventional approaches. The other is self-training based on multiple back-translations generated by sampling. We participated in three tasks (ASPEC.en-ja, ASPEC.ja-en, and TDDC.ja-en) using this system.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5217/
PDF https://www.aclweb.org/anthology/D19-5217
PWC https://paperswithcode.com/paper/long-warm-up-and-self-training-training
Repo
Framework
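
For reference, the standard Transformer ("Noam") learning-rate schedule makes the long warm-up idea concrete: warming up longer both delays and lowers the peak learning rate. The warmup value below is an assumption; the paper's exact settings are not reproduced here.

```python
def noam_lr(step, d_model=512, warmup_steps=16000):
    """Standard Transformer schedule: lr rises linearly for warmup_steps,
    then decays as step ** -0.5. A "long warm-up" simply sets
    warmup_steps well above the conventional ~4,000."""
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5,
                                 step * warmup_steps ** -1.5)

# A longer warm-up reaches a lower peak LR, and reaches it later:
for s in (1000, 16000, 100000):
    print(s, round(noam_lr(s), 6))
```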

Outlier Detection and Robust PCA Using a Convex Measure of Innovation

Title Outlier Detection and Robust PCA Using a Convex Measure of Innovation
Authors Mostafa Rahmani, Ping Li
Abstract This paper presents a provable and strong algorithm, termed Innovation Search (iSearch), for robust Principal Component Analysis (PCA) and outlier detection. An outlier is, by definition, a data point that does not participate in forming a low-dimensional structure with a large number of other data points; in other words, an outlier carries some innovation with respect to most of the other data points. iSearch ranks the data points by their values of innovation. A convex optimization problem is proposed whose optimal value is used as our measure of innovation. We derive analytical performance guarantees for the proposed robust PCA method under different models for the distribution of the outliers, including randomly distributed, clustered, and linearly dependent outliers. Moreover, it is shown that iSearch provably recovers the span of the inliers when the inliers lie in a union of subspaces. In the challenging scenarios in which the outliers are close to each other or close to the span of the inliers, iSearch is shown to outperform most of the existing methods.
Tasks Outlier Detection
Published 2019-12-01
URL http://papers.nips.cc/paper/9568-outlier-detection-and-robust-pca-using-a-convex-measure-of-innovation
PDF http://papers.nips.cc/paper/9568-outlier-detection-and-robust-pca-using-a-convex-measure-of-innovation.pdf
PWC https://paperswithcode.com/paper/outlier-detection-and-robust-pca-using-a
Repo
Framework
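
The abstract does not spell out the convex program, so the sketch below uses the common innovation-pursuit formulation (minimize the l1 norm of the data's correlations with a direction constrained to align with the scored point), solved with cvxpy; it may differ in details from the paper's exact problem.

```python
import numpy as np
import cvxpy as cp

def innovation_value(x, X):
    """x: (d,) point to score; X: (d, n) all data as columns.
    A small optimal value means the direction c can nearly null out the
    rest of the data while staying aligned with x, i.e. x is innovative
    (outlier-like); inliers in a shared subspace cannot achieve this."""
    c = cp.Variable(x.shape[0])
    prob = cp.Problem(cp.Minimize(cp.norm(X.T @ c, 1)), [x @ c == 1])
    prob.solve()
    return prob.value

# Toy demo: three points on a line plus one off-line outlier.
X = np.array([[1.0, 2.0, 3.0, 0.5],
              [1.0, 2.0, 3.0, -2.0]])
for i in range(X.shape[1]):
    print(i, round(innovation_value(X[:, i], X), 3))
# The off-line point (index 3) should get the smallest optimal value.
```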

Confirming the Non-compositionality of Idioms for Sentiment Analysis

Title Confirming the Non-compositionality of Idioms for Sentiment Analysis
Authors Alyssa Hwang, Christopher Hidey
Abstract An idiom is defined as a non-compositional multiword expression, one whose meaning cannot be deduced from the definitions of the component words. This definition does not explicitly define the compositionality of an idiom's sentiment; this paper aims to determine whether the sentiment of an idiom's component words is related to the sentiment of the idiom itself. We use the Dictionary of Affect in Language augmented by WordNet to give each idiom in the Sentiment Lexicon of IDiomatic Expressions (SLIDE) a component-wise sentiment score and compare it to the phrase-level sentiment label crowdsourced by the creators of SLIDE. We find that there is no discernible relation between these two measures of idiom sentiment. This supports the hypothesis that idioms are non-compositional for sentiment as well as for semantics and motivates further work on handling idioms in sentiment analysis.
Tasks Sentiment Analysis
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5114/
PDF https://www.aclweb.org/anthology/W19-5114
PWC https://paperswithcode.com/paper/confirming-the-non-compositionality-of-idioms
Repo
Framework
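
The comparison the abstract describes can be illustrated with a toy component-wise score: average per-word sentiment set against the phrase-level label. The tiny lexicon and labels below are made up for illustration; the paper uses the Dictionary of Affect in Language extended via WordNet and SLIDE's crowdsourced labels.

```python
# Toy per-word sentiment lexicon (illustrative values, not DAL scores).
word_sentiment = {"break": -0.4, "a": 0.0, "leg": -0.1,
                  "piece": 0.1, "of": 0.0, "cake": 0.6}

def component_score(idiom: str) -> float:
    """Mean sentiment of the idiom's component words (0.0 if unknown)."""
    words = idiom.lower().split()
    return sum(word_sentiment.get(w, 0.0) for w in words) / len(words)

# Illustrative phrase-level labels standing in for SLIDE annotations.
phrase_label = {"break a leg": "positive", "piece of cake": "positive"}

for idiom, label in phrase_label.items():
    print(f"{idiom!r}: component={component_score(idiom):+.2f}, "
          f"phrase label={label}")
# 'break a leg' scores negative component-wise yet is labeled positive:
# the kind of mismatch that signals non-compositional sentiment.
```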

Sarah’s Participation in WAT 2019

Title Sarah’s Participation in WAT 2019
Authors Raymond Hendy Susanto, Ohnmar Htun, Liling Tan
Abstract This paper describes our MT systems' participation in WAT 2019. We participated in the (i) Patent, (ii) Timely Disclosure, (iii) Newswire and (iv) Mixed-domain tasks. Our main focus is to explore how similar Transformer models perform on various tasks. We observed that for tasks with smaller datasets, our best model setups are shallower models with fewer attention heads. We also investigated practical issues in NMT that often appear in production settings, such as coping with multilinguality and simplifying the pre- and post-processing pipeline for deployment.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5219/
PDF https://www.aclweb.org/anthology/D19-5219
PWC https://paperswithcode.com/paper/sarahs-participation-in-wat-2019
Repo
Framework

Deep Global Generalized Gaussian Networks

Title Deep Global Generalized Gaussian Networks
Authors Qilong Wang, Peihua Li, Qinghua Hu, Pengfei Zhu, Wangmeng Zuo
Abstract Recently, global covariance pooling (GCP) has shown great promise in improving the classification performance of deep convolutional neural networks (CNNs). However, existing deep GCP networks compute covariance pooling of convolutional activations under the assumption that activations are sampled from Gaussian distributions, which may not hold in practice and fails to fully characterize the statistics of activations. To handle this issue, this paper proposes a novel deep global generalized Gaussian network (3G-Net), whose core is to estimate a global covariance of a generalized Gaussian for modeling the last convolutional activations. Compared with GCP in the Gaussian setting, our 3G-Net assumes the distribution of activations follows a generalized Gaussian, which can capture more precise characteristics of activations. However, there exists no analytic solution for parameter estimation of the generalized Gaussian, making our 3G-Net challenging to train. To this end, we first present a novel regularized maximum likelihood estimator for robustly estimating the covariance of a generalized Gaussian, which can be optimized by a modified iteratively re-weighted method. Then, to efficiently estimate the covariance of the generalized Gaussian under deep CNN architectures, we approximate this re-weighted method by developing an unrolled re-weighting module and a square-root covariance layer. In this way, 3G-Net can be flexibly trained in an end-to-end manner. Experiments are conducted on the large-scale ImageNet-1K and Places365 datasets, and the results demonstrate that our 3G-Net outperforms its counterparts while achieving performance very competitive with the state of the art.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Deep_Global_Generalized_Gaussian_Networks_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Deep_Global_Generalized_Gaussian_Networks_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/deep-global-generalized-gaussian-networks
Repo
Framework
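
As a hedged sketch of the iteratively re-weighted idea the abstract invokes: each step re-estimates the covariance with samples weighted by their current Mahalanobis radius. This is a generic NumPy illustration with an assumed shape parameter `beta`, not 3G-Net's unrolled module or exact estimator.

```python
import numpy as np

def reweighted_covariance(X, beta=0.8, n_iter=10, eps=1e-6):
    """X: (n, d) centered activations; returns a (d, d) covariance.
    beta < 1 down-weights samples with large Mahalanobis radius, in the
    spirit of a generalized-Gaussian (heavy-tailed) fit."""
    n, d = X.shape
    cov = X.T @ X / n                    # Gaussian MLE as initialization
    for _ in range(n_iter):
        inv = np.linalg.inv(cov + eps * np.eye(d))
        maha = np.einsum("nd,de,ne->n", X, inv, X)   # x_i^T cov^-1 x_i
        w = np.power(np.maximum(maha, eps), beta - 1.0)  # re-weighting
        cov = (X * w[:, None]).T @ X / n  # weighted covariance update
    return cov

X = np.random.default_rng(0).standard_normal((200, 8))
print(reweighted_covariance(X).shape)   # (8, 8)
```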

WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset

Title WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset
Authors Loitongbam Sanayai Meetei, Thoudam Doren Singh, Sivaji Bandyopadhyay
Abstract Multimodal translation is the task of translating from a source language to a target language with the help of a parallel text corpus paired with images that represent the contextual details of the text. In this paper, we carry out an extensive comparison to evaluate the benefits of using a multimodal approach for translating English text into a low-resource language, Hindi, as part of the WAT2019 shared task. We carried out English-to-Hindi translation in three separate settings on both the evaluation and challenge datasets: first using only the parallel text corpora, then through an image caption generation approach, and finally with the multimodal approach. Our experiments show a significant improvement with the multimodal approach over the other approaches.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5224/
PDF https://www.aclweb.org/anthology/D19-5224
PWC https://paperswithcode.com/paper/wat2019-english-hindi-translation-on-hindi
Repo
Framework