Paper Group NANR 128
English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019. ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets. Leveraging Hierarchical Category Knowledge for Data-Imbalanced Multi-Label Diagnostic Text Understanding. NTT Neural Machine Translation Systems at WAT 2019. Lexical Normalization …
English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019
Title | English-Myanmar Supervised and Unsupervised NMT: NICT’s Machine Translation Systems at WAT-2019 |
Authors | Rui Wang, Haipeng Sun, Kehai Chen, Chenchen Ding, Masao Utiyama, Eiichiro Sumita |
Abstract | This paper presents NICT's participation (team ID: NICT) in the 6th Workshop on Asian Translation (WAT-2019) shared translation task, specifically the Myanmar (Burmese)-English task in both translation directions. We built neural machine translation (NMT) systems for these tasks. Our NMT systems were trained with language model pretraining, and back-translation was also applied. Our NMT systems ranked third in English-to-Myanmar and second in Myanmar-to-English according to BLEU score. |
Tasks | Language Modelling, Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5209/ |
PWC | https://paperswithcode.com/paper/english-myanmar-supervised-and-unsupervised |
Repo | |
Framework | |
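The abstract above mentions back-translation on top of language-model pretraining. Below is a minimal, hedged sketch of the generic back-translation data-augmentation step; `reverse_translate` is a hypothetical stand-in for a trained target-to-source model, not a function from NICT's actual pipeline.

```python
# Generic back-translation sketch: synthesize (source, target) pairs from
# target-side monolingual text using a reverse-direction model, then mix the
# synthetic pairs with the genuine parallel data for retraining.

def back_translate(monolingual_target_sents, reverse_translate):
    """reverse_translate: hypothetical target->source translation function."""
    synthetic_pairs = []
    for tgt in monolingual_target_sents:
        synthetic_src = reverse_translate(tgt)        # machine-translated source side
        synthetic_pairs.append((synthetic_src, tgt))  # target side stays human-written
    return synthetic_pairs

# Toy usage with a placeholder reverse model.
genuine_pairs = [("source sentence", "target sentence")]
synthetic_pairs = back_translate(["monolingual target sentence"],
                                 reverse_translate=lambda s: "<back-translated source>")
training_data = genuine_pairs + synthetic_pairs
```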
ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets
Title | ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets |
Authors | Skatje Myers, Martha Palmer |
Abstract | This paper proposes using a Bidirectional LSTM-CRF model in order to identify the tense and aspect of verbs. The information that this classifier outputs can be useful for ordering events and can provide a pre-processing step to improve efficiency of annotating this type of information. This neural network architecture has been successfully employed for other sequential labeling tasks, and we show that it significantly outperforms the rule-based tool TMV-annotator on the Propbank I dataset. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3315/ |
PWC | https://paperswithcode.com/paper/cleartac-verb-tense-aspect-and-form |
Repo | |
Framework | |
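As a rough companion to the abstract above, here is a minimal PyTorch sketch of a bidirectional LSTM token tagger. The CRF output layer used by ClearTAC is omitted for brevity, so this is an assumption-laden simplification rather than the paper's model; all dimensions are placeholder values.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Per-token classifier (e.g. tense/aspect tags) over a sentence."""
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_tags)   # per-token tag scores

    def forward(self, token_ids):
        emb = self.embed(token_ids)          # (batch, seq, emb_dim)
        hidden, _ = self.lstm(emb)           # (batch, seq, 2*hidden_dim)
        return self.out(hidden)              # (batch, seq, num_tags)

model = BiLSTMTagger(vocab_size=10000, num_tags=12)
scores = model(torch.randint(0, 10000, (2, 7)))   # two toy sentences of 7 tokens
loss = nn.CrossEntropyLoss()(scores.reshape(-1, 12), torch.randint(0, 12, (14,)))
```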
Leveraging Hierarchical Category Knowledge for Data-Imbalanced Multi-Label Diagnostic Text Understanding
Title | Leveraging Hierarchical Category Knowledge for Data-Imbalanced Multi-Label Diagnostic Text Understanding |
Authors | Shang-Chi Tsai, Ting-Yun Chang, Yun-Nung Chen |
Abstract | Clinical notes are essential medical documents that record each patient's symptoms. Each record is typically annotated with medical diagnostic codes, which indicate diagnoses and treatments. This paper focuses on predicting diagnostic codes given the descriptive present illness in electronic health records by leveraging domain knowledge. We investigate various losses in a convolutional model to utilize hierarchical category knowledge of diagnostic codes, allowing the model to share semantics across different labels under the same category. The proposed model not only considers external domain knowledge but also addresses the issue of data imbalance. Experiments on the MIMIC3 benchmark show that the proposed methods effectively utilize category knowledge and provide informative cues that improve performance on the top-ranked diagnostic codes, outperforming the prior state of the art. The investigation and discussion demonstrate the potential of integrating domain knowledge into current machine learning based models and guide future research directions. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6206/ |
PWC | https://paperswithcode.com/paper/leveraging-hierarchical-category-knowledge |
Repo | |
Framework | |
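One hedged way to picture "sharing semantics across labels under the same category" is to combine a code-level multi-label loss with a loss on category-level predictions pooled from the codes. The sketch below is an illustrative assumption, not the exact losses proposed in the paper; the max-pooling and the weighting `alpha` are placeholder choices.

```python
import torch
import torch.nn.functional as F

def hierarchical_bce(code_logits, code_targets, code_to_category, num_categories, alpha=0.5):
    """code_logits: (batch, num_codes) floats; code_targets: (batch, num_codes) 0/1 floats;
    code_to_category: LongTensor (num_codes,) mapping each code to its parent category."""
    fine_loss = F.binary_cross_entropy_with_logits(code_logits, code_targets)

    # Aggregate code-level predictions/targets up to the parent category
    # (max-pooling: a category is "on" if any of its codes is on).
    batch, num_codes = code_logits.shape
    cat_logits = torch.full((batch, num_categories), -1e4)   # finite "very off" init
    cat_targets = torch.zeros(batch, num_categories)
    for c in range(num_codes):
        k = code_to_category[c].item()
        cat_logits[:, k] = torch.maximum(cat_logits[:, k], code_logits[:, c])
        cat_targets[:, k] = torch.maximum(cat_targets[:, k], code_targets[:, c])
    coarse_loss = F.binary_cross_entropy_with_logits(cat_logits, cat_targets)
    return alpha * fine_loss + (1 - alpha) * coarse_loss
```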
NTT Neural Machine Translation Systems at WAT 2019
Title | NTT Neural Machine Translation Systems at WAT 2019 |
Authors | Makoto Morishita, Jun Suzuki, Masaaki Nagata |
Abstract | In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019. This year, we participated in two distinct types of subtasks, a scientific paper subtask and a timely disclosure subtask, where we only considered English-to-Japanese and Japanese-to-English translation directions. We submitted two systems (En-Ja and Ja-En) for the scientific paper subtask and two systems for the timely disclosure subtask (both Ja-En, one for texts and one for items). Three of our four systems obtained the best human evaluation performances. We also confirmed that our new additional web-crawled parallel corpus improves the performance in unconstrained settings. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5211/ |
PWC | https://paperswithcode.com/paper/ntt-neural-machine-translation-systems-at-wat-1 |
Repo | |
Framework | |
Lexical Normalization of User-Generated Medical Text
Title | Lexical Normalization of User-Generated Medical Text |
Authors | Anne Dirkson, Suzan Verberne, Wessel Kraaij |
Abstract | In the medical domain, user-generated social media text is increasingly used as a valuable complementary knowledge source to scientific medical literature. The extraction of this knowledge is complicated by colloquial language use and misspellings. Yet, lexical normalization of such data has not been addressed properly. This paper presents an unsupervised, data-driven spelling correction module for medical social media. Our method outperforms state-of-the-art spelling correction and can detect mistakes with an F0.5 of 0.888. Additionally, we present a novel corpus for spelling mistake detection and correction on a medical patient forum. |
Tasks | Lexical Normalization, Spelling Correction |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3202/ |
PWC | https://paperswithcode.com/paper/lexical-normalization-of-user-generated |
Repo | |
Framework | |
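For orientation, here is an illustrative sketch of a frequency-plus-edit-distance spelling corrector together with the F0.5 score reported above. This is a generic baseline, not the paper's unsupervised module; the tiny corpus and the similarity threshold are placeholder assumptions.

```python
from difflib import SequenceMatcher
from collections import Counter

corpus_counts = Counter(["headache", "nausea", "fatigue", "dizzy", "dizziness"])

def correct(token, min_similarity=0.8):
    if token in corpus_counts:          # known in-vocabulary word: leave it alone
        return token
    candidates = [(SequenceMatcher(None, token, w).ratio(), corpus_counts[w], w)
                  for w in corpus_counts]
    sim, freq, best = max(candidates)   # prefer the most similar, then most frequent spelling
    return best if sim >= min_similarity else token

def f_beta(precision, recall, beta=0.5):
    """F0.5 puts more weight on precision than on recall."""
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(correct("headach"))             # -> "headache"
print(round(f_beta(0.9, 0.85), 3))    # example numbers, not the paper's results
```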
Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019
Title | Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019 |
Authors | Hideya Mino, Hitoshi Ito, Isao Goto, Ichiro Yamada, Hideki Tanaka, Takenobu Tokunaga |
Abstract | This paper describes NHK and NHK Engineering System (NHK-ES)'s submission to the newswire translation tasks of WAT 2019 in both directions, Japanese→English and English→Japanese. In addition to the JIJI Corpus that was officially provided by the task organizer, we developed a corpus of 0.22M sentence pairs by manually translating Japanese news sentences into English in a content-equivalent manner. The content-equivalent corpus was effective for improving translation quality, and our systems achieved the best human evaluation scores in the newswire translation tasks at WAT 2019. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5212/ |
PWC | https://paperswithcode.com/paper/neural-machine-translation-system-using-a |
Repo | |
Framework | |
Combining Translation Memory with Neural Machine Translation
Title | Combining Translation Memory with Neural Machine Translation |
Authors | Akiko Eriguchi, Spencer Rarrick, Hitokazu Matsushita |
Abstract | In this paper, we report our submission systems (geoduck) to the Timely Disclosure task at the 6th Workshop on Asian Translation (WAT) (Nakazawa et al., 2019). Our system employs a combined approach of translation memory and Neural Machine Translation (NMT) models, where the final translation output is taken from the translation memory when the similarity score of a test source sentence exceeds a predefined threshold, and from the NMT system otherwise. We observed that this combination approach significantly improves translation performance on the Timely Disclosure corpus compared to a standalone NMT system. We also conducted source-based direct assessment on the final output, and we discuss the comparison between human references and each system's output. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5214/ |
PWC | https://paperswithcode.com/paper/combining-translation-memory-with-neural |
Repo | |
Framework | |
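The selection rule described in the abstract above is easy to state in code: reuse a translation-memory hit when the fuzzy-match similarity of the test source sentence exceeds a threshold, and fall back to the NMT output otherwise. In this minimal sketch the similarity function and threshold are placeholder assumptions, not the ones used by the submitted system.

```python
from difflib import SequenceMatcher

def translate_with_tm(source, tm_pairs, nmt_translate, threshold=0.9):
    """tm_pairs: iterable of (tm_source, tm_target) human-translated pairs."""
    best_sim, best_target = 0.0, None
    for tm_source, tm_target in tm_pairs:
        sim = SequenceMatcher(None, source, tm_source).ratio()
        if sim > best_sim:
            best_sim, best_target = sim, tm_target
    if best_sim >= threshold:
        return best_target           # reuse the human translation from the TM
    return nmt_translate(source)     # otherwise trust the NMT system
```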
Abstract Meaning Representation for Human-Robot Dialogue
Title | Abstract Meaning Representation for Human-Robot Dialogue |
Authors | Claire N. Bonial, Lucia Donatelli, Jessica Ervin, Clare R. Voss |
Abstract | |
Tasks | |
Published | 2019-01-01 |
URL | https://www.aclweb.org/anthology/W19-0124/ |
PWC | https://paperswithcode.com/paper/abstract-meaning-representation-for-human |
Repo | |
Framework | |
Dilated LSTM with attention for Classification of Suicide Notes
Title | Dilated LSTM with attention for Classification of Suicide Notes |
Authors | Annika M. Schoene, George Lacey, Alexander P. Turner, Nina Dethlefs |
Abstract | In this paper we present a dilated LSTM with attention mechanism for document-level classification of suicide notes, last statements and depressed notes. We achieve an accuracy of 87.34% compared to competitive baselines of 80.35% (Logistic Model Tree) and 82.27% (Bi-directional LSTM with Attention). Furthermore, we provide an analysis of both the grammatical and thematic content of suicide notes, last statements and depressed notes. We find that the use of personal pronouns, cognitive processes and references to loved ones are most important. Finally, we show through visualisations of attention weights that the dilated LSTM with attention is able to identify the same distinguishing features across documents as the linguistic analysis. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6217/ |
PWC | https://paperswithcode.com/paper/dilated-lstm-with-attention-for |
Repo | |
Framework | |
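To make the architecture above concrete, here is a hedged PyTorch sketch of attention pooling over recurrent states for document-level classification, returning the attention weights so they can be visualised as in the paper. A plain LSTM stands in for the dilated LSTM, so treat this as an illustrative baseline rather than the authors' model.

```python
import torch
import torch.nn as nn

class AttentiveLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)      # scores each timestep
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        states, _ = self.lstm(self.embed(token_ids))         # (B, T, H)
        weights = torch.softmax(self.attn(states), dim=1)    # (B, T, 1)
        document = (weights * states).sum(dim=1)             # attention-weighted sum
        return self.out(document), weights.squeeze(-1)       # logits + inspectable weights

model = AttentiveLSTMClassifier(vocab_size=5000, num_classes=3)
logits, attn = model(torch.randint(0, 5000, (4, 50)))        # 4 toy documents, 50 tokens each
```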
Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
Title | Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019 |
Authors | Kenji Imamura, Eiichiro Sumita |
Abstract | This paper describes the NICT-2 neural machine translation system at the 6th Workshop on Asian Translation. This system employs the standard Transformer model but features the following two characteristics. One is the long warm-up strategy, which performs a longer warm-up of the learning rate at the start of training than conventional approaches. The other is that the system introduces self-training approaches based on multiple back-translations generated by sampling. We participated in three tasks (ASPEC.en-ja, ASPEC.ja-en, and TDDC.ja-en) using this system. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5217/ |
PWC | https://paperswithcode.com/paper/long-warm-up-and-self-training-training |
Repo | |
Framework | |
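The "long warm-up" strategy above only changes the warm-up length of the usual Transformer learning-rate schedule. A sketch of that standard schedule follows, with the longer warm-up expressed as a larger `warmup_steps` value; the concrete numbers used by NICT-2 are not given here and the 16000 below is an assumed illustration.

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """Inverse-sqrt schedule: linear warm-up followed by step**-0.5 decay."""
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

conventional_warmup = [transformer_lr(s, warmup_steps=4000) for s in range(1, 20001)]
long_warmup         = [transformer_lr(s, warmup_steps=16000) for s in range(1, 20001)]  # assumed value
```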
Outlier Detection and Robust PCA Using a Convex Measure of Innovation
Title | Outlier Detection and Robust PCA Using a Convex Measure of Innovation |
Authors | Mostafa Rahmani, Ping Li |
Abstract | This paper presents a provable and strong algorithm, termed Innovation Search (iSearch), for robust Principal Component Analysis (PCA) and outlier detection. An outlier, by definition, is a data point that does not participate in forming a low-dimensional structure with a large number of other data points. In other words, an outlier carries some innovation with respect to most of the other data points. iSearch ranks the data points based on their values of innovation. A convex optimization problem is proposed whose optimal value is used as our measure of innovation. We derive analytical performance guarantees for the proposed robust PCA method under different models for the distribution of the outliers, including randomly distributed outliers, clustered outliers, and linearly dependent outliers. Moreover, it is shown that iSearch provably recovers the span of the inliers when the inliers lie in a union of subspaces. In the challenging scenarios in which the outliers are close to each other or close to the span of the inliers, iSearch is shown to outperform most of the existing methods. |
Tasks | Outlier Detection |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9568-outlier-detection-and-robust-pca-using-a-convex-measure-of-innovation |
PDF | http://papers.nips.cc/paper/9568-outlier-detection-and-robust-pca-using-a-convex-measure-of-innovation.pdf |
PWC | https://paperswithcode.com/paper/outlier-detection-and-robust-pca-using-a |
Repo | |
Framework | |
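The abstract above defines an outlier as a point carrying "innovation" relative to the rest of the data. As a rough illustration of ranking points by such a score, the sketch below uses a simple leave-one-out subspace-residual proxy; it is not the convex optimization measure proposed for iSearch, and the rank parameter is an assumption.

```python
import numpy as np

def innovation_scores(X, rank=2):
    """X: (n_points, dim). Higher score = less well explained by the other points."""
    n = X.shape[0]
    scores = np.empty(n)
    for i in range(n):
        others = np.delete(X, i, axis=0)
        # Low-rank basis of the remaining points via truncated SVD.
        _, _, vt = np.linalg.svd(others - others.mean(0), full_matrices=False)
        basis = vt[:rank]                               # (rank, dim)
        centred = X[i] - others.mean(0)
        residual = centred - basis.T @ (basis @ centred)
        scores[i] = np.linalg.norm(residual)
    return scores   # sort descending to put likely outliers first
```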
Confirming the Non-compositionality of Idioms for Sentiment Analysis
Title | Confirming the Non-compositionality of Idioms for Sentiment Analysis |
Authors | Alyssa Hwang, Christopher Hidey |
Abstract | An idiom is defined as a non-compositional multiword expression, one whose meaning cannot be deduced from the definitions of the component words. This definition does not explicitly define the compositionality of an idiom's sentiment; this paper aims to determine whether the sentiment of the component words of an idiom is related to the sentiment of that idiom. We use the Dictionary of Affect in Language augmented by WordNet to give each idiom in the Sentiment Lexicon of IDiomatic Expressions (SLIDE) a component-wise sentiment score and compare it to the phrase-level sentiment label crowdsourced by the creators of SLIDE. We find that there is no discernible relation between these two measures of idiom sentiment. This supports the hypothesis that idioms are non-compositional for sentiment as well as semantics and motivates further work in handling idioms for sentiment analysis. |
Tasks | Sentiment Analysis |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5114/ |
PWC | https://paperswithcode.com/paper/confirming-the-non-compositionality-of-idioms |
Repo | |
Framework | |
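A small sketch of the component-wise sentiment score described above: average the word-level sentiment of an idiom's tokens and compare it with the phrase-level label. The tiny lexicon and its values are placeholders standing in for the Dictionary of Affect in Language augmented with WordNet.

```python
# Placeholder word-level sentiment lexicon (illustrative values only).
word_sentiment = {"break": -0.4, "a": 0.0, "leg": -0.1, "piece": 0.1, "of": 0.0, "cake": 0.6}

def component_sentiment(idiom):
    """Mean word-level sentiment of the idiom's tokens (unknown words count as neutral)."""
    scores = [word_sentiment.get(tok, 0.0) for tok in idiom.lower().split()]
    return sum(scores) / len(scores)

# "Break a leg" is positive as a phrase but mildly negative word-by-word; this is
# the kind of mismatch the paper measures at scale against SLIDE's phrase labels.
print(component_sentiment("break a leg"))     # about -0.17
print(component_sentiment("piece of cake"))   # about  0.23
```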
Sarah’s Participation in WAT 2019
Title | Sarah’s Participation in WAT 2019 |
Authors | Raymond Hendy Susanto, Ohnmar Htun, Liling Tan |
Abstract | This paper describes our MT systems' participation in WAT 2019. We participated in the (i) Patent, (ii) Timely Disclosure, (iii) Newswire and (iv) Mixed-domain tasks. Our main focus is to explore how similar Transformer models perform on various tasks. We observed that for tasks with smaller datasets, our best model setups are shallower models with fewer attention heads. We investigated practical issues in NMT that often appear in production settings, such as coping with multilinguality and simplifying the pre- and post-processing pipeline in deployment. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5219/ |
PWC | https://paperswithcode.com/paper/sarahs-participation-in-wat-2019 |
Repo | |
Framework | |
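For the finding above that smaller datasets favoured shallower Transformers with fewer attention heads, here is an illustrative hyper-parameter contrast. The concrete values are assumptions for exposition, not the submitted configurations.

```python
# Assumed, illustrative configurations (not the actual submitted systems).
transformer_base  = dict(encoder_layers=6, decoder_layers=6, attention_heads=8, embed_dim=512)
transformer_small = dict(encoder_layers=3, decoder_layers=3, attention_heads=4, embed_dim=512)
```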
Deep Global Generalized Gaussian Networks
Title | Deep Global Generalized Gaussian Networks |
Authors | Qilong Wang, Peihua Li, Qinghua Hu, Pengfei Zhu, Wangmeng Zuo |
Abstract | Recently, global covariance pooling (GCP) has shown great advances in improving the classification performance of deep convolutional neural networks (CNNs). However, existing deep GCP networks compute covariance pooling of convolutional activations under the assumption that the activations are sampled from Gaussian distributions, which may not hold in practice and fails to fully characterize the statistics of activations. To handle this issue, this paper proposes a novel deep global generalized Gaussian network (3G-Net), whose core is to estimate a global covariance of a generalized Gaussian for modeling the last convolutional activations. Compared with GCP in the Gaussian setting, our 3G-Net assumes the distribution of activations follows a generalized Gaussian, which can capture more precise characteristics of activations. However, there exists no analytic solution for parameter estimation of the generalized Gaussian, making our 3G-Net challenging to train. To this end, we first present a novel regularized maximum likelihood estimator for robustly estimating the covariance of a generalized Gaussian, which can be optimized by a modified iterative re-weighted method. Then, to efficiently estimate the covariance of a generalized Gaussian under deep CNN architectures, we approximate this re-weighted method by developing an unrolling re-weighted module and a square root covariance layer. In this way, 3G-Net can be flexibly trained in an end-to-end manner. Experiments are conducted on the large-scale ImageNet-1K and Places365 datasets, and the results demonstrate that our 3G-Net outperforms its counterparts while achieving very competitive performance relative to the state of the art. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Deep_Global_Generalized_Gaussian_Networks_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Deep_Global_Generalized_Gaussian_Networks_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/deep-global-generalized-gaussian-networks |
Repo | |
Framework | |
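For context, below is a hedged PyTorch sketch of global covariance pooling with square-root normalisation, i.e. the plain Gaussian special case that 3G-Net generalises. The generalized-Gaussian re-weighted estimator and the unrolled module from the paper are not reproduced here; the eigendecomposition-based square root is a simple stand-in for the paper's square root covariance layer.

```python
import torch

def sqrt_covariance_pool(features, eps=1e-5):
    """features: (batch, channels, H, W) convolutional activations."""
    b, c, h, w = features.shape
    x = features.reshape(b, c, h * w)
    x = x - x.mean(dim=2, keepdim=True)
    cov = x @ x.transpose(1, 2) / (h * w - 1)              # (batch, c, c) channel covariance
    cov = cov + eps * torch.eye(c, device=features.device)  # numerical stabilisation
    eigvals, eigvecs = torch.linalg.eigh(cov)                # symmetric PSD matrix
    sqrt_cov = eigvecs @ torch.diag_embed(eigvals.clamp(min=0).sqrt()) @ eigvecs.transpose(1, 2)
    return sqrt_cov.flatten(1)                               # pooled representation per image

pooled = sqrt_covariance_pool(torch.randn(2, 64, 7, 7))      # -> shape (2, 64*64)
```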
WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset
Title | WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset |
Authors | Loitongbam Sanayai Meetei, Thoudam Doren Singh, Sivaji Bandyopadhyay |
Abstract | Multimodal translation is the task of translating from a source language into a target language with the help of a parallel text corpus paired with images that represent the contextual details of the text. In this paper, we carried out an extensive comparison to evaluate the benefits of using a multimodal approach for translating English text into a low-resource language, Hindi, as part of the WAT2019 shared task. We carried out the translation from English to Hindi in three separate settings, on both the evaluation and challenge datasets: first using only the parallel text corpora, then through an image caption generation approach, and finally with the multimodal approach. Our experiments show a significant improvement with the multimodal approach over the other approaches. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5224/ |
PWC | https://paperswithcode.com/paper/wat2019-english-hindi-translation-on-hindi |
Repo | |
Framework | |
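As a generic illustration of the multimodal setting above, the sketch below projects image features and prepends them to the text encoder states so the decoder can attend to both. This is a common fusion baseline, not the paper's system; the dimensions are placeholder assumptions.

```python
import torch
import torch.nn as nn

class ImagePrefixFusion(nn.Module):
    def __init__(self, image_dim=2048, model_dim=512):
        super().__init__()
        self.project = nn.Linear(image_dim, model_dim)

    def forward(self, encoder_states, image_features):
        """encoder_states: (B, T, model_dim); image_features: (B, image_dim)."""
        visual_token = self.project(image_features).unsqueeze(1)   # (B, 1, model_dim)
        return torch.cat([visual_token, encoder_states], dim=1)    # decoder attends to both

fusion = ImagePrefixFusion()
fused = fusion(torch.randn(2, 10, 512), torch.randn(2, 2048))      # -> (2, 11, 512)
```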