January 25, 2020

2692 words 13 mins read

Paper Group NANR 95

Context-Reinforced Semantic Segmentation. Learning Backpropagation-Free Deep Architectures with Kernels. DBMS-KU Interpolation for WMT19 News Translation Task. Automatic diacritization of Tunisian dialect text using Recurrent Neural Network. Hierarchical Deep Learning for Arabic Dialect Identification. Energy-Based Modelling for Dialogue State Trac …

Context-Reinforced Semantic Segmentation

Title Context-Reinforced Semantic Segmentation
Authors Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng
Abstract Recent efforts have shown the importance of context in deep convolutional neural network based semantic segmentation. Among others, the predicted segmentation map (p-map) itself, which encodes rich high-level semantic cues (e.g. objects and layout), can be regarded as a promising source of context. In this paper, we propose a dedicated module, Context Net, to better explore the context information in p-maps. Without introducing any new supervision, we formulate the context learning problem as a Markov Decision Process and optimize it using reinforcement learning, during which the p-map and Context Net are treated as environment and agent, respectively. Through adequate exploration, the Context Net selects the information that has long-term benefit for segmentation inference. By incorporating the Context Net into a baseline segmentation scheme, we then propose a Context-reinforced Semantic Segmentation network (CiSS-Net), which is fully end-to-end trainable. Experimental results show that the learned context brings a 3.9% absolute improvement in mIoU over the baseline segmentation method, and that CiSS-Net achieves state-of-the-art segmentation performance on ADE20K, PASCAL-Context and Cityscapes.
Tasks Semantic Segmentation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Zhou_Context-Reinforced_Semantic_Segmentation_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhou_Context-Reinforced_Semantic_Segmentation_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/context-reinforced-semantic-segmentation
Repo
Framework
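
A quick PyTorch sketch of the p-map-as-context idea may help: a small "Context Net" reads the predicted segmentation map, and its output is fused back into the segmentation head. The channel sizes, fusion by concatenation, and module names are illustrative assumptions; the paper's reinforcement-learning optimization (p-map as environment, Context Net as agent) is not reproduced here.

```python
# Hypothetical sketch of the context-refinement idea, not the authors' code.
import torch
import torch.nn as nn

class ContextNet(nn.Module):
    """Agent: reads the p-map and emits a context feature map."""
    def __init__(self, num_classes, ctx_channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, ctx_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ctx_channels, ctx_channels, 3, padding=1),
        )

    def forward(self, p_map):
        return self.net(p_map)

class ContextReinforcedHead(nn.Module):
    """Baseline features + learned context -> refined segmentation logits."""
    def __init__(self, feat_channels, num_classes, ctx_channels=64):
        super().__init__()
        self.context_net = ContextNet(num_classes, ctx_channels)
        self.classifier = nn.Conv2d(feat_channels + ctx_channels, num_classes, 1)

    def forward(self, features, p_map_logits):
        ctx = self.context_net(p_map_logits.softmax(dim=1))  # context from p-map
        return self.classifier(torch.cat([features, ctx], dim=1))

head = ContextReinforcedHead(feat_channels=256, num_classes=19)
refined = head(torch.randn(1, 256, 32, 32), torch.randn(1, 19, 32, 32))
print(refined.shape)  # torch.Size([1, 19, 32, 32])
```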

Learning Backpropagation-Free Deep Architectures with Kernels

Title Learning Backpropagation-Free Deep Architectures with Kernels
Authors Shiyu Duan, Shujian Yu, Yunmei Chen, Jose Principe
Abstract One can substitute each neuron in any neural network with a kernel machine and obtain a counterpart powered by kernel machines. The new network inherits the expressive power and architecture of the original but works in a more intuitive way, since each node enjoys a simple interpretation as a hyperplane (in a reproducing kernel Hilbert space). Further, using the kernel multilayer perceptron as an example, we prove that in classification, an optimal representation that minimizes the risk of the network can be characterized for each hidden layer. This result removes the need for backpropagation in learning the model and can be generalized to any feedforward kernel network. Moreover, unlike backpropagation, which turns models into black boxes, the optimal hidden representation enjoys an intuitive geometric interpretation, making the dynamics of learning in a deep kernel network simple to understand. Empirical results are provided to validate our theory.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=H1GLm2R9Km
PDF https://openreview.net/pdf?id=H1GLm2R9Km
PWC https://paperswithcode.com/paper/learning-backpropagation-free-deep-1
Repo
Framework
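
The layer-wise, backpropagation-free training the abstract describes can be illustrated with a toy two-layer "kernel network": each hidden "neuron" is a kernel machine trained greedily, and its outputs become the next layer's inputs. Using the class label as each layer's training target is a simplification of the paper's derived optimal hidden representation.

```python
# Hedged sketch: a two-layer kernel network trained layer by layer,
# with no backpropagation. Each "neuron" is an SVM trained on a
# bootstrap sample; this is an illustration, not the paper's method.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

# Layer 1: several kernel "neurons", each fit on a bootstrap resample.
rng = np.random.default_rng(0)
hidden = []
for _ in range(5):
    idx = rng.choice(len(X), size=len(X), replace=True)
    hidden.append(SVC(kernel="rbf", gamma=2.0).fit(X[idx], y[idx]))

# Hidden representation = stacked decision values of the layer-1 units.
H = np.column_stack([unit.decision_function(X) for unit in hidden])

# Layer 2 (output): one more kernel machine on the learned representation.
out = SVC(kernel="rbf").fit(H, y)
print("train accuracy:", out.score(H, y))
```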

DBMS-KU Interpolation for WMT19 News Translation Task

Title DBMS-KU Interpolation for WMT19 News Translation Task
Authors Sari Dewi Budiwati, Al Hafiz Akbar Maulana Siagian, Tirana Noor Fatyanosa, Masayoshi Aritsugi
Abstract This paper presents the participation of the DBMS-KU Interpolation system in the WMT19 shared task for the Kazakh-English language pair. We examine the use of an interpolation method with different language model orders. Our Interpolation system combines a direct translation with Russian as a pivot language. We use 3-gram and 5-gram language model orders to perform the language translation in this work. To reduce noise in the pivot translation process, we prune the phrase tables of source-pivot and pivot-target. Our experimental results show that our Interpolation system outperforms the Baseline in terms of BLEU-cased score by +0.5 and +0.1 points in Kazakh-English and English-Kazakh, respectively. In particular, using the 5-gram language model order in our system obtained a better BLEU-cased score than utilizing the 3-gram one. Interestingly, we found that employing the Interpolation system could reduce the perplexity score of English-Kazakh when using the 3-gram language model order.
Tasks Language Modelling
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5309/
PDF https://www.aclweb.org/anthology/W19-5309
PWC https://paperswithcode.com/paper/dbms-ku-interpolation-for-wmt19-news
Repo
Framework
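
The interpolation at the heart of the system name is just a convex mixture of language-model probabilities. A minimal sketch, with invented probabilities and mixture weight:

```python
# Toy illustration of linear language-model interpolation; the values
# and lambda are invented, the paper's models are trained on WMT19 data.
def interpolate(p_3gram: float, p_5gram: float, lam: float = 0.5) -> float:
    """P(w | h) = lam * P_3(w | h) + (1 - lam) * P_5(w | h)."""
    return lam * p_3gram + (1.0 - lam) * p_5gram

# Example: the two models disagree on the next word's probability.
print(interpolate(p_3gram=0.12, p_5gram=0.30, lam=0.4))  # 0.228
```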

Automatic diacritization of Tunisian dialect text using Recurrent Neural Network

Title Automatic diacritization of Tunisian dialect text using Recurrent Neural Network
Authors Abir Masmoudi, Mariem Ellouze, Lamia Hadrich belguith
Abstract The absence of diacritical marks in Arabic texts generally leads to morphological, syntactic and semantic ambiguities. This is all the more blatant for under-resourced languages, such as the Tunisian dialect, which suffer from the unavailability of basic tools and linguistic resources: a sufficient amount of corpora, multilingual dictionaries, and morphological and syntactic analyzers. Processing such a language thus faces greater challenges due to the lack of these resources. The automatic diacritization of MSA text is one of the various complex problems that can be solved by deep neural networks today. Since the Tunisian dialect is an under-resourced variant of MSA and there is a lot of resemblance between the two languages, we investigate a recurrent neural network (RNN) for this dialect diacritization problem. This model is compared to our previous CRF and SMT models (CITATION) on the same dialect corpus. We show experimentally that our model achieves better outcomes (DER of 10.72%) than the CRF (DER of 20.25%) and SMT (DER of 33.15%) models.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1085/
PDF https://www.aclweb.org/anthology/R19-1085
PWC https://paperswithcode.com/paper/automatic-diacritization-of-tunisian-dialect
Repo
Framework
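
A character-level recurrent tagger of the kind the paper investigates can be sketched in a few lines of PyTorch: each input character receives a diacritic label. Vocabulary size, label count, and dimensions below are illustrative assumptions.

```python
# Minimal BiLSTM diacritization sketch; sizes are placeholders.
import torch
import torch.nn as nn

class DiacritizerRNN(nn.Module):
    def __init__(self, n_chars=60, n_diacritics=9, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_diacritics)  # one label per char

    def forward(self, char_ids):               # (batch, seq_len)
        h, _ = self.rnn(self.embed(char_ids))  # (batch, seq_len, 2*hidden)
        return self.out(h)                     # per-character diacritic logits

model = DiacritizerRNN()
logits = model(torch.randint(0, 60, (2, 16)))  # dummy batch of 2 sentences
print(logits.shape)                            # torch.Size([2, 16, 9])
```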

Hierarchical Deep Learning for Arabic Dialect Identification

Title Hierarchical Deep Learning for Arabic Dialect Identification
Authors Gael de Francony, Victor Guichard, Praveen Joshi, Haithem Afli, Abdessalam Bouchekif
Abstract In this paper, we present two approaches for Arabic fine-grained dialect identification. The first approach is based on recurrent neural networks (BLSTM, BGRU) using hierarchical classification. The main idea is to separate the classification of a sentence from a given text into two stages: we start with a coarser level of classification (8 classes) and then move to the finer-grained classification (26 classes). The second approach is a voting system based on Naive Bayes and Random Forest. Our system achieves an F1 score of 63.02% on the subtask evaluation dataset.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4631/
PDF https://www.aclweb.org/anthology/W19-4631
PWC https://paperswithcode.com/paper/hierarchical-deep-learning-for-arabic-dialect
Repo
Framework
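
The two-stage scheme can be sketched as a simple routing function: a coarse model picks one of the 8 dialect groups, then a group-specific model picks among that group's fine-grained dialects. The models and their scikit-learn-style predict interface are placeholders, not the paper's BLSTM/BGRU architecture.

```python
# Hypothetical routing sketch for hierarchical dialect identification.
def classify_hierarchical(sentence, coarse_model, fine_models):
    """coarse_model: 8-way classifier; fine_models: dict group -> classifier."""
    group = coarse_model.predict([sentence])[0]          # stage 1: 8 classes
    dialect = fine_models[group].predict([sentence])[0]  # stage 2: within group
    return group, dialect
```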

Energy-Based Modelling for Dialogue State Tracking

Title Energy-Based Modelling for Dialogue State Tracking
Authors Anh Duong Trinh, Robert Ross, John Kelleher
Abstract The uncertainties of language and the complexity of dialogue contexts make accurate dialogue state tracking one of the more challenging aspects of dialogue processing. To improve state tracking quality, we argue that relationships between different aspects of dialogue state must be taken into account, as they can often guide a more accurate interpretation process. To this end, we present an energy-based approach to dialogue state tracking as a structured classification task. The novelty of our approach lies in the use of an energy network on top of a deep learning architecture to explore correlations between network variables, including input features and output labels. We demonstrate that the energy-based approach improves the performance of a deep learning dialogue state tracker towards state-of-the-art results without the need for many of the other steps required by current state-of-the-art methods.
Tasks Dialogue State Tracking
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4109/
PDF https://www.aclweb.org/anthology/W19-4109
PWC https://paperswithcode.com/paper/energy-based-modelling-for-dialogue-state
Repo
Framework
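
In energy-based terms, the tracker scores (dialogue features, candidate state) pairs and picks the lowest-energy candidate, which is where correlations between labels can enter the score. A minimal PyTorch sketch, with assumed shapes and an assumed upstream feature extractor:

```python
# Hedged sketch of energy-based inference; shapes are placeholders.
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Assigns a scalar energy E(x, y) to a features/candidate-state pair."""
    def __init__(self, feat_dim=256, label_dim=32):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim + label_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, features, label_vec):
        return self.score(torch.cat([features, label_vec], dim=-1))

energy = EnergyNet()
feats = torch.randn(1, 256)                        # turn representation
candidates = torch.randn(5, 32)                    # 5 candidate state encodings
scores = energy(feats.expand(5, -1), candidates)   # (5, 1) energies
print("selected candidate:", scores.argmin().item())  # inference = min energy
```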

Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces

Title Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces
Authors Chuan Guo, Ali Mousavi, Xiang Wu, Daniel N. Holtmann-Rice, Satyen Kale, Sashank Reddi, Sanjiv Kumar
Abstract In extreme classification settings, embedding-based neural network models are currently not competitive with sparse linear and tree-based methods in terms of accuracy. Most prior works attribute this poor performance to the low-dimensional bottleneck in embedding-based methods. In this paper, we demonstrate that theoretically there is no limitation to using low-dimensional embedding-based methods, and provide experimental evidence that overfitting is the root cause of their poor performance. These findings motivate us to investigate novel data augmentation and regularization techniques to mitigate overfitting. To this end, we propose GLaS, a new regularizer for embedding-based neural network approaches. It is a natural generalization of the graph Laplacian and spread-out regularizers, and empirically it addresses the drawbacks of each regularizer alone when applied to the extreme classification setup. With the proposed techniques, we attain or improve upon the state-of-the-art on most widely tested public extreme classification datasets with hundreds of thousands of labels.
Tasks Data Augmentation
Published 2019-12-01
URL http://papers.nips.cc/paper/8740-breaking-the-glass-ceiling-for-embedding-based-classifiers-for-large-output-spaces
PDF http://papers.nips.cc/paper/8740-breaking-the-glass-ceiling-for-embedding-based-classifiers-for-large-output-spaces.pdf
PWC https://paperswithcode.com/paper/breaking-the-glass-ceiling-for-embedding
Repo
Framework
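
A sketch of the flavor of such a regularizer (not necessarily the paper's exact formulation): pull the Gram matrix of the label embeddings toward a label co-occurrence target, which attracts co-occurring labels (graph-Laplacian-like) while pushing unrelated labels toward orthogonality (spread-out-like).

```python
# Illustrative GLaS-like regularizer; the target matrix and normalization
# are assumptions, see the paper for the exact form.
import torch

def glas_like_regularizer(V, cooccur):
    """V: (L, d) label embeddings; cooccur: (L, L) normalized co-occurrence."""
    gram = V @ V.T                  # pairwise label-embedding similarities
    return ((gram - cooccur) ** 2).mean()

V = torch.randn(100, 64, requires_grad=True)
A = torch.eye(100)                  # toy co-occurrence: independent labels
loss = glas_like_regularizer(V, A)  # added to the classification loss
loss.backward()
```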

A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning

Title A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning
Authors Gonçalo M. Correia, André F. T. Martins
Abstract Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits. APE systems are usually trained by complementing human post-edited data with large, artificial data generated through back-translations, a time-consuming process often no easier than training an MT system from scratch. In this paper, we propose an alternative where we fine-tune pre-trained BERT models on both the encoder and decoder of an APE system, exploring several parameter-sharing strategies. By training on only 23K sentences for 3 hours on a single GPU, we obtain results that are competitive with systems trained on 5M artificial sentences. When we add this artificial data, our method obtains state-of-the-art results.
Tasks Automatic Post-Editing, Machine Translation, Transfer Learning
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1292/
PDF https://www.aclweb.org/anthology/P19-1292
PWC https://paperswithcode.com/paper/a-simple-and-effective-approach-to-automatic-1
Repo
Framework
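
The general recipe, warm-starting both halves of a sequence-to-sequence model from BERT with tied parameters, can be approximated with the Hugging Face EncoderDecoderModel API; the checkpoint name and input formatting below are assumptions, not the authors' exact setup.

```python
# Sketch of BERT-initialized APE with one parameter-sharing strategy.
from transformers import BertTokenizerFast, EncoderDecoderModel

ckpt = "bert-base-multilingual-cased"  # assumed checkpoint
tok = BertTokenizerFast.from_pretrained(ckpt)
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    ckpt, ckpt, tie_encoder_decoder=True  # share encoder/decoder weights
)
model.config.decoder_start_token_id = tok.cls_token_id
model.config.pad_token_id = tok.pad_token_id

# One APE training example: source + machine translation in, post-edit out.
src_mt = tok("source sentence [SEP] machine translation", return_tensors="pt")
pe = tok("human post-edited translation", return_tensors="pt")
out = model(input_ids=src_mt.input_ids, labels=pe.input_ids)
print(float(out.loss))  # fine-tune by minimizing this loss
```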

A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition

Title A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition
Authors Yang Li, Wenming Zheng, Lei Wang, Yuan Zong, Lei Qi, Zhen Cui, Tong Zhang, Tengfei Song
Abstract Neuroscience studies have revealed a discrepancy in emotion expression between the left and right hemispheres of the human brain. Inspired by this finding, in this paper we propose a novel bi-hemispheric discrepancy model (BiHDM) to learn the asymmetric differences between the two hemispheres for electroencephalograph (EEG) emotion recognition. Concretely, we first employ four directed recurrent neural networks (RNNs) based on two spatial orientations to traverse electrode signals on two separate brain regions, which enables the model to obtain deep representations of all the EEG electrodes’ signals while keeping the intrinsic spatial dependence. Then we design a pairwise subnetwork to capture the discrepancy information between the two hemispheres and extract higher-level features for final classification. Besides, in order to reduce the domain shift between training and testing data, we use a domain discriminator that adversarially induces the overall feature learning module to generate emotion-related but domain-invariant features, which can further promote EEG emotion recognition. We conduct experiments on three public EEG emotion datasets, and the experiments show that new state-of-the-art results can be achieved.
Tasks EEG, Emotion Recognition
Published 2019-05-11
URL https://arxiv.org/abs/1906.01704
PDF https://arxiv.org/pdf/1906.01704
PWC https://paperswithcode.com/paper/a-novel-bi-hemispheric-discrepancy-model-for
Repo
Framework
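
The pairwise-discrepancy idea can be reduced to a small sketch: encode left- and right-hemisphere electrode streams with separate RNNs and classify their difference. Electrode counts and dimensions are invented, and the four directional traversals plus the domain discriminator of the full BiHDM are omitted.

```python
# Simplified bi-hemispheric discrepancy sketch; not the full BiHDM.
import torch
import torch.nn as nn

class BiHemisphereNet(nn.Module):
    def __init__(self, feat_dim=5, hidden=32, n_classes=3):
        super().__init__()
        self.left = nn.GRU(feat_dim, hidden, batch_first=True)
        self.right = nn.GRU(feat_dim, hidden, batch_first=True)
        self.classify = nn.Linear(hidden, n_classes)

    def forward(self, left_elec, right_elec):  # (batch, n_pairs, feat_dim)
        hl, _ = self.left(left_elec)
        hr, _ = self.right(right_elec)
        discrepancy = hl - hr                  # asymmetry between hemispheres
        return self.classify(discrepancy.mean(dim=1))

net = BiHemisphereNet()
print(net(torch.randn(4, 31, 5), torch.randn(4, 31, 5)).shape)  # (4, 3)
```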

Evaluating the Consistency of Word Embeddings from Small Data

Title Evaluating the Consistency of Word Embeddings from Small Data
Authors Jelke Bloem, Antske Fokkens, Aurélie Herbelot
Abstract In this work, we address the evaluation of distributional semantic models trained on smaller, domain-specific texts, specifically philosophical text. In particular, we inspect the behaviour of models that use a pre-trained background space in learning. We propose a measure of consistency which can be used as an evaluation metric when no in-domain gold-standard data is available. This measure simply computes the ability of a model to learn similar embeddings from different parts of some homogeneous data. We show that in spite of being a simple evaluation, consistency actually depends on various combinations of factors, including the nature of the data itself, the model used to train the semantic space, and the frequency of the learnt terms, both in the background space and in the in-domain data of interest.
Tasks Word Embeddings
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1016/
PDF https://www.aclweb.org/anthology/R19-1016
PWC https://paperswithcode.com/paper/evaluating-the-consistency-of-word-embeddings
Repo
Framework
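
One way to operationalize consistency (an assumption about the exact metric): train embeddings on two halves of the same homogeneous corpus and measure how much a term's nearest-neighbour sets agree across the two models. A sketch using gensim:

```python
# Hedged consistency sketch; corpora and hyper-parameters are placeholders.
from gensim.models import Word2Vec

def neighbour_overlap(corpus_a, corpus_b, word, topn=10):
    """corpus_a/b: lists of token lists from two halves of the same data."""
    m1 = Word2Vec(corpus_a, vector_size=50, min_count=1, workers=1, seed=1)
    m2 = Word2Vec(corpus_b, vector_size=50, min_count=1, workers=1, seed=2)
    n1 = {w for w, _ in m1.wv.most_similar(word, topn=topn)}
    n2 = {w for w, _ in m2.wv.most_similar(word, topn=topn)}
    return len(n1 & n2) / topn  # 1.0 = perfectly consistent neighbourhoods
```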

Pay "Attention" to your Context when Classifying Abusive Language

Title Pay "Attention" to your Context when Classifying Abusive Language
Authors Tuhin Chakrabarty, Kilol Gupta, Smaranda Muresan
Abstract The goal of any social media platform is to facilitate healthy and meaningful interactions among its users. But more often than not, it has been found that it becomes an avenue for wanton attacks. We propose an experimental study that has three aims: 1) to provide us with a deeper understanding of current data sets that focus on different types of abusive language, which are sometimes overlapping (racism, sexism, hate speech, offensive language, and personal attacks); 2) to investigate what type of attention mechanism (contextual vs. self-attention) is better for abusive language detection using deep learning architectures; and 3) to investigate whether stacked architectures provide an advantage over simple architectures for this task.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3508/
PDF https://www.aclweb.org/anthology/W19-3508
PWC https://paperswithcode.com/paper/pay-attention-to-your-context-when
Repo
Framework
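
The contrast the paper draws between contextual attention and self-attention can be reduced to two pooling modules over token representations, sketched below with illustrative dimensions; the paper's full architectures are richer than this.

```python
# Two attention-pooling variants, as a hedged illustration.
import torch
import torch.nn as nn

class ContextualAttentionPool(nn.Module):
    """Tokens are scored against a single learned query vector."""
    def __init__(self, dim=128):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))

    def forward(self, h):                                # h: (batch, seq, dim)
        weights = torch.softmax(h @ self.query, dim=1)   # (batch, seq)
        return (weights.unsqueeze(-1) * h).sum(dim=1)    # sentence vector

class SelfAttentionPool(nn.Module):
    """Tokens score each other before pooling."""
    def __init__(self, dim=128):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, h):
        attended, _ = self.attn(h, h, h)  # tokens attend to each other
        return attended.mean(dim=1)

h = torch.randn(2, 20, 128)
print(ContextualAttentionPool()(h).shape, SelfAttentionPool()(h).shape)
```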

NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task

Title NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task
Authors Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita
Abstract In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions. We focused on leveraging multilingual transfer learning and back-translation for the extremely low-resource language pairs: Kazakh↔English and Gujarati↔English translation. For the Chinese↔English translation, we used the provided parallel data augmented with a large quantity of back-translated monolingual data to train state-of-the-art NMT systems. We then employed techniques that have been proven to be most effective, such as back-translation, fine-tuning, and model ensembling, to generate the primary submissions of Chinese↔English. For English→Finnish, our submission from WMT18 remains a strong baseline despite the increase in parallel corpora for this year's task.
Tasks Machine Translation, Transfer Learning
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5313/
PDF https://www.aclweb.org/anthology/W19-5313
PWC https://paperswithcode.com/paper/nicts-supervised-neural-machine-translation
Repo
Framework
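
Back-translation, the workhorse technique here, is a short recipe: translate target-side monolingual text with a reverse-direction model to synthesize source sentences, then mix the synthetic pairs into the genuine parallel training data. `reverse_model` and its `translate` method are hypothetical stand-ins for an NMT system.

```python
# Pseudocode-level sketch of the back-translation recipe.
def back_translate(monolingual_tgt, reverse_model):
    """Synthesize (source, target) pairs from target-side monolingual text."""
    synthetic_pairs = []
    for tgt_sentence in monolingual_tgt:
        synthetic_src = reverse_model.translate(tgt_sentence)  # tgt -> src
        synthetic_pairs.append((synthetic_src, tgt_sentence))
    return synthetic_pairs

# Training data = genuine parallel corpus + synthetic back-translated pairs:
# train_data = parallel_pairs + back_translate(mono_tgt, reverse_model)
```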

Kingsoft’s Neural Machine Translation System for WMT19

Title Kingsoft’s Neural Machine Translation System for WMT19
Authors Xinze Guo, Chang Liu, Xiaolong Li, Yiran Wang, Guoliang Li, Feng Wang, Zhitao Xu, Liuyi Yang, Li Ma, Changliang Li
Abstract This paper describes the Kingsoft AI Lab's submission to the WMT2019 news translation shared task. We participated in two language directions: English-Chinese and Chinese-English. For both language directions, we trained several variants of Transformer models using the provided parallel data enlarged with a large quantity of back-translated monolingual data. The best translation result was obtained with ensemble and reranking techniques. According to automatic metrics (BLEU), our Chinese-English and English-Chinese systems each reached the second highest score for their respective subtasks.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5317/
PDF https://www.aclweb.org/anthology/W19-5317
PWC https://paperswithcode.com/paper/kingsofts-neural-machine-translation-system
Repo
Framework
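
Ensembling at decoding time, one of the techniques named above, usually means averaging the next-token distributions of several independently trained models before beam search. A hedged sketch with placeholder models:

```python
# Illustrative decoding-time ensemble; `models` are hypothetical callables
# mapping a decoder input to next-token logits.
import torch

def ensemble_next_token_logprobs(models, decoder_input):
    probs = torch.stack([m(decoder_input).softmax(-1) for m in models])
    return probs.mean(dim=0).log()  # averaged distribution for beam search
```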

Label-Specific Document Representation for Multi-Label Text Classification

Title Label-Specific Document Representation for Multi-Label Text Classification
Authors Lin Xiao, Xin Huang, Boli Chen, Liping Jing
Abstract Multi-label text classification (MLTC) aims to tag the most relevant labels for a given document. In this paper, we propose a Label-Specific Attention Network (LSAN) to learn a label-specific document representation. LSAN takes advantage of label semantic information to determine the semantic connection between labels and the document for constructing the label-specific document representation. Meanwhile, a self-attention mechanism is adopted to identify the label-specific document representation from the document content information. In order to seamlessly integrate these two parts, an adaptive fusion strategy is proposed, which can effectively output a comprehensive label-specific document representation to build the multi-label text classifier. Extensive experimental results demonstrate that LSAN consistently outperforms state-of-the-art methods on four different datasets, especially in the prediction of low-frequency labels. The code and hyper-parameter settings are released to facilitate other researchers.
Tasks Multi-Label Text Classification, Text Classification
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1044/
PDF https://www.aclweb.org/anthology/D19-1044
PWC https://paperswithcode.com/paper/label-specific-document-representation-for
Repo
Framework
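
The core of label-specific attention can be sketched compactly: each label embedding attends over the word representations to build its own view of the document. The adaptive fusion with self-attention in the full LSAN is omitted; sizes are illustrative.

```python
# Hedged sketch of label-specific attention, not the full LSAN.
import torch
import torch.nn as nn

class LabelSpecificAttention(nn.Module):
    def __init__(self, n_labels=54, word_dim=256):
        super().__init__()
        self.label_emb = nn.Parameter(torch.randn(n_labels, word_dim))

    def forward(self, words):                  # words: (batch, seq, word_dim)
        scores = torch.einsum("ld,bsd->bls", self.label_emb, words)
        alpha = scores.softmax(dim=-1)         # per-label attention over words
        return torch.einsum("bls,bsd->bld", alpha, words)  # one doc vector per label

m = LabelSpecificAttention()
print(m(torch.randn(2, 30, 256)).shape)  # torch.Size([2, 54, 256])
```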

Hierarchical Transfer Learning for Multi-label Text Classification

Title Hierarchical Transfer Learning for Multi-label Text Classification
Authors Siddhartha Banerjee, Cem Akkaya, Francisco Perez-Sorrosal, Kostas Tsioutsiouliklis
Abstract Multi-Label Hierarchical Text Classification (MLHTC) is the task of categorizing documents into one or more topics organized in a hierarchical taxonomy. MLHTC can be formulated by combining multiple binary classification problems with an independent classifier for each category. We propose a novel transfer-learning-based strategy, HTrans, where binary classifiers at lower levels in the hierarchy are initialized using the parameters of the parent classifier and fine-tuned on the child category classification task. In HTrans, we use a Gated Recurrent Unit (GRU)-based deep learning architecture coupled with attention. Compared to binary classifiers trained from scratch, our HTrans approach results in significant improvements of 1% on micro-F1 and 3% on macro-F1 on the RCV1 dataset. Our experiments also show that binary classifiers trained from scratch are significantly better than single multi-label models.
Tasks Multi-Label Text Classification, Text Classification, Transfer Learning
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1633/
PDF https://www.aclweb.org/anthology/P19-1633
PWC https://paperswithcode.com/paper/hierarchical-transfer-learning-for-multi
Repo
Framework
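
The HTrans initialization trick is straightforward to sketch: a child category's classifier starts from a deep copy of its parent's trained model and is then fine-tuned on the child task. The model objects and taxonomy traversal are placeholders for the paper's GRU-with-attention setup.

```python
# Hedged sketch of parent-to-child parameter transfer.
import copy

def init_child_from_parent(parent_model):
    """Child classifier inherits all parent parameters before fine-tuning."""
    return copy.deepcopy(parent_model)

# Top-down over the taxonomy (pseudocode):
#   for node in taxonomy_breadth_first():
#       node.model = init_child_from_parent(node.parent.model)
#       fine_tune(node.model, node.training_data)
```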