Paper Group NANR 95
Context-Reinforced Semantic Segmentation
Title | Context-Reinforced Semantic Segmentation |
Authors | Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng |
Abstract | Recent efforts have shown the importance of context in deep convolutional neural network based semantic segmentation. Among others, the predicted segmentation map (p-map) itself, which encodes rich high-level semantic cues (e.g., objects and layout), can be regarded as a promising source of context. In this paper, we propose a dedicated module, Context Net, to better explore the context information in p-maps. Without introducing any new supervision, we formulate context learning as a Markov Decision Process and optimize it with reinforcement learning, during which the p-map and the Context Net are treated as the environment and the agent, respectively. Through adequate exploration, the Context Net selects the information that has long-term benefit for segmentation inference. By incorporating the Context Net into a baseline segmentation scheme, we then propose a Context-reinforced Semantic Segmentation network (CiSS-Net), which is fully end-to-end trainable. Experimental results show that the learned context brings a 3.9% absolute improvement in mIoU over the baseline segmentation method, and CiSS-Net achieves state-of-the-art segmentation performance on ADE20K, PASCAL-Context and Cityscapes. |
Tasks | Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Zhou_Context-Reinforced_Semantic_Segmentation_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhou_Context-Reinforced_Semantic_Segmentation_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/context-reinforced-semantic-segmentation |
Repo | |
Framework | |
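The abstract above formulates context learning as a Markov Decision Process optimized with reinforcement learning, with the p-map acting as the environment and the Context Net as the agent. Below is a minimal REINFORCE-style sketch of that loop, assuming a tiny convolutional agent, toy tensor shapes, and a constant stand-in reward; none of these reflect the paper's actual architecture or reward design.

```python
# A "context agent" reads the predicted segmentation map (p-map) and emits
# a per-pixel binary selection; REINFORCE updates the policy toward
# selections that earn higher reward (e.g. a drop in segmentation loss).
import torch
import torch.nn as nn

class ContextAgent(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.policy = nn.Sequential(
            nn.Conv2d(num_classes, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1), nn.Sigmoid())   # per-pixel keep probability

    def forward(self, p_map):
        return self.policy(p_map)

agent = ContextAgent(num_classes=19)
optimizer = torch.optim.Adam(agent.parameters(), lr=1e-4)

p_map = torch.softmax(torch.randn(2, 19, 64, 64), dim=1)   # stand-in p-map
keep_prob = agent(p_map)
action = torch.bernoulli(keep_prob)                        # sampled selection
log_prob = (action * torch.log(keep_prob + 1e-8)
            + (1 - action) * torch.log(1 - keep_prob + 1e-8)).mean()

reward = torch.tensor(0.1)    # hypothetical: loss improvement from context
loss = -reward * log_prob     # REINFORCE objective
optimizer.zero_grad()
loss.backward()
optimizer.step()
```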
Learning Backpropagation-Free Deep Architectures with Kernels
Title | Learning Backpropagation-Free Deep Architectures with Kernels |
Authors | Shiyu Duan, Shujian Yu, Yunmei Chen, Jose Principe |
Abstract | One can substitute each neuron in any neural network with a kernel machine and obtain a counterpart powered by kernel machines. The new network inherits the expressive power and architecture of the original but works in a more intuitive way, since each node enjoys a simple interpretation as a hyperplane (in a reproducing kernel Hilbert space). Further, using the kernel multilayer perceptron as an example, we prove that in classification an optimal representation that minimizes the risk of the network can be characterized for each hidden layer. This result removes the need for backpropagation in learning the model and can be generalized to any feedforward kernel network. Moreover, unlike backpropagation, which turns models into black boxes, the optimal hidden representation enjoys an intuitive geometric interpretation, making the dynamics of learning in a deep kernel network simple to understand. Empirical results are provided to validate our theory. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=H1GLm2R9Km |
PDF | https://openreview.net/pdf?id=H1GLm2R9Km |
PWC | https://paperswithcode.com/paper/learning-backpropagation-free-deep-1 |
Repo | |
Framework | |
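The abstract argues that characterizing an optimal representation per hidden layer removes the need for backpropagation. Below is a toy sketch of the resulting training flow: a two-layer kernel network fit greedily, with no gradients flowing between layers. Fitting each layer directly to the labels is a crude proxy for the paper's optimal-representation criterion, used here only to show the layer-by-layer procedure.

```python
# Each hidden unit is a kernel machine (RBF SVM); layers are trained one
# after another, so no backpropagation is ever needed.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

# Layer 1: several kernel machines on bootstrap subsets give hidden features.
rng = np.random.default_rng(0)
layer1 = []
for _ in range(4):
    idx = rng.choice(len(X), size=200, replace=False)
    layer1.append(SVC(kernel="rbf", probability=True).fit(X[idx], y[idx]))
H = np.column_stack([m.predict_proba(X)[:, 1] for m in layer1])

# Layer 2: one kernel machine on the hidden representation.
out = SVC(kernel="rbf").fit(H, y)
print("train accuracy:", out.score(H, y))
```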
DBMS-KU Interpolation for WMT19 News Translation Task
Title | DBMS-KU Interpolation for WMT19 News Translation Task |
Authors | Sari Dewi Budiwati, Al Hafiz Akbar Maulana Siagian, Tirana Noor Fatyanosa, Masayoshi Aritsugi |
Abstract | This paper presents the participation of the DBMS-KU Interpolation system in the WMT19 shared task, namely the Kazakh-English language pair. We examine the use of an interpolation method with different language model orders. Our interpolation system combines a direct translation with Russian as a pivot language. We use 3-gram and 5-gram language model orders to perform the translation in this work. To reduce noise in the pivot translation process, we prune the phrase tables of source-pivot and pivot-target. Our experimental results show that our interpolation system outperforms the baseline in terms of BLEU-cased score by +0.5 and +0.1 points in Kazakh-English and English-Kazakh, respectively. In particular, using the 5-gram language model order in our system obtains a better BLEU-cased score than using the 3-gram one. Interestingly, we also found that employing the interpolation system could reduce the perplexity score of English-Kazakh when using the 3-gram language model order. |
Tasks | Language Modelling |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5309/ |
PDF | https://www.aclweb.org/anthology/W19-5309 |
PWC | https://paperswithcode.com/paper/dbms-ku-interpolation-for-wmt19-news |
Repo | |
Framework | |
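The system combines models via interpolation; as a minimal illustration of that mechanism, the sketch below linearly interpolates scores from a 3-gram and a 5-gram language model. The weight `lam` and the probabilities are illustrative assumptions, not values from the paper.

```python
# Linear interpolation of two language-model probabilities, the mechanism
# behind combining models of different orders.
def interpolate(p_3gram: float, p_5gram: float, lam: float = 0.5) -> float:
    """Interpolated probability: lam * p5 + (1 - lam) * p3."""
    return lam * p_5gram + (1.0 - lam) * p_3gram

# A token the 5-gram model scores higher than the 3-gram model:
print(interpolate(p_3gram=0.02, p_5gram=0.08, lam=0.7))  # 0.062
```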
Automatic diacritization of Tunisian dialect text using Recurrent Neural Network
Title | Automatic diacritization of Tunisian dialect text using Recurrent Neural Network |
Authors | Abir Masmoudi, Mariem Ellouze, Lamia Hadrich belguith |
Abstract | The absence of diacritical marks in Arabic texts generally leads to morphological, syntactic and semantic ambiguities. This can be more blatant when one deals with under-resourced languages, such as the Tunisian dialect, which suffers from the unavailability of basic tools and linguistic resources, like a sufficient amount of corpora, multilingual dictionaries, and morphological and syntactic analyzers. Thus, processing this language faces greater challenges due to the lack of these resources. The automatic diacritization of MSA text is one of the various complex problems that can be solved by deep neural networks today. Since the Tunisian dialect is an under-resourced relative of MSA and the two languages show strong resemblance, we propose to investigate a recurrent neural network (RNN) for this dialect diacritization problem. This model is compared to our previous CRF and SMT models (CITATION) based on the same dialect corpus. We show experimentally that our model achieves better outcomes (DER of 10.72%) than the CRF (DER of 20.25%) and SMT (DER of 33.15%) models. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1085/ |
PDF | https://www.aclweb.org/anthology/R19-1085 |
PWC | https://paperswithcode.com/paper/automatic-diacritization-of-tunisian-dialect |
Repo | |
Framework | |
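The abstract frames diacritization as a sequence-labelling task for an RNN; below is a minimal character-level BiLSTM tagger in that spirit. Vocabulary size, label count, and the random toy input are illustrative assumptions, not the paper's configuration.

```python
# One diacritic label is predicted per input character by a BiLSTM tagger.
import torch
import torch.nn as nn

class DiacritizerRNN(nn.Module):
    def __init__(self, n_chars, n_diacritics, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_diacritics)

    def forward(self, char_ids):                 # (batch, seq_len)
        h, _ = self.rnn(self.embed(char_ids))    # (batch, seq_len, 2*hidden)
        return self.out(h)                       # diacritic logits per char

model = DiacritizerRNN(n_chars=60, n_diacritics=9)
chars = torch.randint(0, 60, (1, 12))            # a 12-character toy input
logits = model(chars)
print(logits.argmax(-1))                         # predicted diacritic ids
```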
Hierarchical Deep Learning for Arabic Dialect Identification
Title | Hierarchical Deep Learning for Arabic Dialect Identification |
Authors | Gael de Francony, Victor Guichard, Praveen Joshi, Haithem Afli, Abdessalam Bouchekif |
Abstract | In this paper, we present two approaches for Arabic fine-grained dialect identification. The first approach is based on recurrent neural networks (BLSTM, BGRU) using hierarchical classification. The main idea is to separate the classification of a sentence from a given text into two stages: we start with a higher level of classification (8 classes) and then perform the finer-grained classification (26 classes). The second approach is a voting system based on Naive Bayes and Random Forest. Our system achieves an F1 score of 63.02% on the subtask evaluation dataset. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4631/ |
PDF | https://www.aclweb.org/anthology/W19-4631 |
PWC | https://paperswithcode.com/paper/hierarchical-deep-learning-for-arabic-dialect |
Repo | |
Framework | |
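The first approach classifies in two stages, coarse (8 classes) then fine (26 classes). Below is a minimal routing sketch of that hierarchy, with TF-IDF features and logistic regression standing in for the paper's BLSTM/BGRU models, and a tiny made-up dataset.

```python
# A coarse classifier picks a dialect region, then a region-specific fine
# classifier picks the dialect within that region.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["shlonak sadeeq", "wayn rayih alhin",
         "shniya lahwal lyoum", "kifash dayer khoya"] * 5
coarse_labels = ["gulf", "gulf", "maghreb", "maghreb"] * 5   # 8 in the paper
fine_labels = ["kuwait", "saudi", "tunis", "morocco"] * 5    # 26 in the paper

vec = TfidfVectorizer().fit(texts)
X = vec.transform(texts)
coarse = LogisticRegression().fit(X, coarse_labels)

# One fine-grained classifier per coarse class, trained on its subset.
fine = {}
for c in set(coarse_labels):
    idx = [i for i, lab in enumerate(coarse_labels) if lab == c]
    fine[c] = LogisticRegression().fit(X[idx], [fine_labels[i] for i in idx])

def predict(text):
    x = vec.transform([text])
    region = coarse.predict(x)[0]
    return region, fine[region].predict(x)[0]

print(predict("shniya lahwal"))
```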
Energy-Based Modelling for Dialogue State Tracking
Title | Energy-Based Modelling for Dialogue State Tracking |
Authors | Anh Duong Trinh, Robert Ross, John Kelleher |
Abstract | The uncertainties of language and the complexity of dialogue contexts make accurate dialogue state tracking one of the more challenging aspects of dialogue processing. To improve state tracking quality, we argue that relationships between different aspects of dialogue state must be taken into account, as they can often guide a more accurate interpretation process. To this end, we present an energy-based approach to dialogue state tracking as a structured classification task. The novelty of our approach lies in the use of an energy network on top of a deep learning architecture to explore correlations between network variables, including input features and output labels. We demonstrate that the energy-based approach improves the performance of a deep learning dialogue state tracker towards state-of-the-art results without the need for many of the other steps required by current state-of-the-art methods. |
Tasks | Dialogue State Tracking |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4109/ |
PDF | https://www.aclweb.org/anthology/W19-4109 |
PWC | https://paperswithcode.com/paper/energy-based-modelling-for-dialogue-state |
Repo | |
Framework | |
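The core idea above is an energy network that scores structured state candidates on top of learned features. A minimal sketch of that inference step follows: score each (features, candidate state) pair and predict the lowest-energy candidate. Dimensions and the one-hot candidate set are illustrative assumptions, not the paper's setup.

```python
# A small energy network maps (dialogue features, candidate state) to a
# scalar energy; tracking picks the candidate with minimum energy.
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    def __init__(self, feat_dim, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, feats, state):
        return self.net(torch.cat([feats, state], dim=-1))  # scalar energy

energy = EnergyNet(feat_dim=32, state_dim=8)
feats = torch.randn(1, 32)                 # encoder output for the turn
candidates = torch.eye(8)                  # 8 one-hot candidate slot values
scores = torch.stack([energy(feats, c.unsqueeze(0)) for c in candidates])
print("predicted state:", scores.argmin().item())
```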
Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces
Title | Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces |
Authors | Chuan Guo, Ali Mousavi, Xiang Wu, Daniel N. Holtmann-Rice, Satyen Kale, Sashank Reddi, Sanjiv Kumar |
Abstract | In extreme classification settings, embedding-based neural network models are currently not competitive with sparse linear and tree-based methods in terms of accuracy. Most prior works attribute this poor performance to the low-dimensional bottleneck in embedding-based methods. In this paper, we demonstrate that theoretically there is no limitation to using low-dimensional embedding-based methods, and provide experimental evidence that overfitting is the root cause of the poor performance of embedding-based methods. These findings motivate us to investigate novel data augmentation and regularization techniques to mitigate overfitting. To this end, we propose GLaS, a new regularizer for embedding-based neural network approaches. It is a natural generalization from the graph Laplacian and spread-out regularizers, and empirically it addresses the drawback of each regularizer alone when applied to the extreme classification setup. With the proposed techniques, we attain or improve upon the state-of-the-art on most widely tested public extreme classification datasets with hundreds of thousands of labels. |
Tasks | Data Augmentation |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8740-breaking-the-glass-ceiling-for-embedding-based-classifiers-for-large-output-spaces |
PDF | http://papers.nips.cc/paper/8740-breaking-the-glass-ceiling-for-embedding-based-classifiers-for-large-output-spaces.pdf |
PWC | https://paperswithcode.com/paper/breaking-the-glass-ceiling-for-embedding |
Repo | |
Framework | |
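GLaS generalizes the graph-Laplacian and spread-out regularizers. Below is a sketch of that flavour of penalty: the Gram matrix of the label embeddings is pushed toward a normalized label co-occurrence matrix, attracting embeddings of co-occurring labels while spreading out unrelated ones. The normalization of the target matrix here is an assumption; the paper's exact form may differ.

```python
# A GLaS-style penalty: ||V V^T - A||^2 averaged over entries, where A is a
# normalized label co-occurrence matrix and V holds label embeddings.
import torch

def glas_penalty(label_emb, cooccur):
    """label_emb: (L, d) label embeddings; cooccur: (L, L) counts."""
    deg = cooccur.diagonal().clamp(min=1.0)
    target = cooccur / torch.sqrt(deg[:, None] * deg[None, :])  # normalize
    gram = label_emb @ label_emb.T
    return ((gram - target) ** 2).mean()

L, d = 100, 16
emb = torch.randn(L, d, requires_grad=True)
co = torch.randint(0, 5, (L, L)).float()
co = (co + co.T) / 2                    # symmetric co-occurrence counts
co.fill_diagonal_(10.0)                 # each label co-occurs with itself
loss = glas_penalty(emb, co)
loss.backward()                         # the penalty is differentiable
```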
A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning
Title | A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning |
Authors | Gonçalo M. Correia, André F. T. Martins |
Abstract | Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits. APE systems are usually trained by complementing human post-edited data with large, artificial data generated through back-translations, a time-consuming process often no easier than training an MT system from scratch. In this paper, we propose an alternative where we fine-tune pre-trained BERT models on both the encoder and decoder of an APE system, exploring several parameter sharing strategies. By training on a dataset of only 23K sentences for 3 hours on a single GPU, we obtain results that are competitive with systems that were trained on 5M artificial sentences. When we add this artificial data, our method obtains state-of-the-art results. |
Tasks | Automatic Post-Editing, Machine Translation, Transfer Learning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1292/ |
PDF | https://www.aclweb.org/anthology/P19-1292 |
PWC | https://paperswithcode.com/paper/a-simple-and-effective-approach-to-automatic-1 |
Repo | |
Framework | |
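The method fine-tunes pre-trained BERT weights in both the encoder and decoder with parameter sharing. Below is a generic Hugging Face sketch of one such configuration, not the authors' code: the checkpoint name and the `tie_encoder_decoder` sharing strategy are assumptions about one plausible setup among the several the paper explores.

```python
# Build an encoder-decoder from two BERT checkpoints with shared weights,
# then compute the fine-tuning loss on one (mt output, post-edit) pair.
from transformers import BertTokenizer, EncoderDecoderModel

tok = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-cased", "bert-base-multilingual-cased",
    tie_encoder_decoder=True)                 # share encoder/decoder weights
model.config.decoder_start_token_id = tok.cls_token_id
model.config.pad_token_id = tok.pad_token_id

src = tok("mt output to be post-edited", return_tensors="pt")
labels = tok("corrected sentence", return_tensors="pt").input_ids
loss = model(input_ids=src.input_ids,
             attention_mask=src.attention_mask,
             labels=labels).loss              # minimize this to fine-tune
print(loss)
```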
A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition
Title | A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition |
Authors | Yang Li, Wenming Zheng, Lei Wang, Yuan Zong, Lei Qi, Zhen Cui, Tong Zhang, Tengfei Song |
Abstract | Neuroscience studies have revealed a discrepancy in emotion expression between the left and right hemispheres of the human brain. Inspired by this finding, in this paper we propose a novel bi-hemispheric discrepancy model (BiHDM) to learn the asymmetric differences between the two hemispheres for electroencephalograph (EEG) emotion recognition. Concretely, we first employ four directed recurrent neural networks (RNNs) based on two spatial orientations to traverse electrode signals on two separate brain regions, which enables the model to obtain the deep representations of all the EEG electrodes' signals while keeping the intrinsic spatial dependence. We then design a pairwise subnetwork to capture the discrepancy information between the two hemispheres and extract higher-level features for final classification. Besides, in order to reduce the domain shift between training and testing data, we use a domain discriminator that adversarially induces the overall feature-learning module to generate emotion-related but domain-invariant features, which can further promote EEG emotion recognition. We conduct experiments on three public EEG emotion datasets, and the results show that new state-of-the-art performance can be achieved. |
Tasks | EEG, Emotion Recognition |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1906.01704 |
PDF | https://arxiv.org/pdf/1906.01704 |
PWC | https://paperswithcode.com/paper/a-novel-bi-hemispheric-discrepancy-model-for |
Repo | |
Framework | |
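The model traverses the two hemispheres with RNNs and feeds a pairwise discrepancy to the classifier. A stripped-down sketch of that structure follows; the electrode count, feature dimension, mirrored pairing, and plain subtraction are simplifying assumptions (the paper also explores other pairwise operations and adds a domain discriminator, omitted here).

```python
# Left- and right-hemisphere electrode sequences go through separate GRUs;
# the difference of mirrored-pair features drives the emotion classifier.
import torch
import torch.nn as nn

class BiHemisphereNet(nn.Module):
    def __init__(self, feat_dim=5, hidden=32, n_emotions=3):
        super().__init__()
        self.left_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.right_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.classify = nn.Linear(hidden, n_emotions)

    def forward(self, left, right):              # (batch, electrodes, feat)
        hl, _ = self.left_rnn(left)
        hr, _ = self.right_rnn(right)
        discrepancy = hl - hr                    # mirrored-pair difference
        return self.classify(discrepancy.mean(dim=1))

model = BiHemisphereNet()
left = torch.randn(4, 31, 5)                     # 31 electrodes per side
right = torch.randn(4, 31, 5)
print(model(left, right).shape)                  # (4, 3) emotion logits
```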
Evaluating the Consistency of Word Embeddings from Small Data
Title | Evaluating the Consistency of Word Embeddings from Small Data |
Authors | Jelke Bloem, Antske Fokkens, Aurélie Herbelot |
Abstract | In this work, we address the evaluation of distributional semantic models trained on smaller, domain-specific texts, in our case philosophical text. Specifically, we inspect the behaviour of models that use a pre-trained background space in learning. We propose a measure of consistency which can be used as an evaluation metric when no in-domain gold-standard data is available. This measure simply computes the ability of a model to learn similar embeddings from different parts of some homogeneous data. We show that in spite of being a simple evaluation, consistency actually depends on various combinations of factors, including the nature of the data itself, the model used to train the semantic space, and the frequency of the learnt terms, both in the background space and in the in-domain data of interest. |
Tasks | Word Embeddings |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1016/ |
PDF | https://www.aclweb.org/anthology/R19-1016 |
PWC | https://paperswithcode.com/paper/evaluating-the-consistency-of-word-embeddings |
Repo | |
Framework | |
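The proposed measure checks whether a model learns similar embeddings from different parts of homogeneous data. A minimal sketch follows, assuming nearest-neighbour overlap as the similarity score (the paper's exact formulation may differ): train the same model on two halves of a corpus and compare the learnt neighbourhoods.

```python
# Train word2vec on two halves of a toy "philosophical" corpus and measure
# how much the nearest-neighbour sets of a word agree across the halves.
from gensim.models import Word2Vec

corpus = [["the", "soul", "is", "immortal"],
          ["the", "body", "is", "mortal"],
          ["the", "soul", "governs", "the", "body"]] * 50
half_a, half_b = corpus[::2], corpus[1::2]

m_a = Word2Vec(half_a, vector_size=50, min_count=1, seed=1)
m_b = Word2Vec(half_b, vector_size=50, min_count=1, seed=1)

def consistency(word, k=3):
    nn_a = {w for w, _ in m_a.wv.most_similar(word, topn=k)}
    nn_b = {w for w, _ in m_b.wv.most_similar(word, topn=k)}
    return len(nn_a & nn_b) / k                 # neighbourhood overlap

print(consistency("soul"))
```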
Pay "Attention" to your Context when Classifying Abusive Language
Title | Pay "Attention" to your Context when Classifying Abusive Language |
Authors | Tuhin Chakrabarty, Kilol Gupta, Smaranda Muresan |
Abstract | The goal of any social media platform is to facilitate healthy and meaningful interactions among its users. But more often than not, it has been found that it becomes an avenue for wanton attacks. We propose an experimental study that has three aims: 1) to provide us with a deeper understanding of current data sets that focus on different types of abusive language, which are sometimes overlapping (racism, sexism, hate speech, offensive language, and personal attacks); 2) to investigate what type of attention mechanism (contextual vs. self-attention) is better for abusive language detection using deep learning architectures; and 3) to investigate whether stacked architectures provide an advantage over simple architectures for this task. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3508/ |
PDF | https://www.aclweb.org/anthology/W19-3508 |
PWC | https://paperswithcode.com/paper/pay-attention-to-your-context-when |
Repo | |
Framework | |
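The study's second aim compares contextual attention with self-attention for abuse detection. Below is a minimal self-attentive pooling head of the kind such comparisons build on: a learned score per token yields an attention-weighted sentence vector for the classifier. Dimensions and inputs are illustrative assumptions, not the paper's architecture.

```python
# Additive self-attention pooling: score each token, softmax over the
# sequence, and return the attention-weighted sum of token vectors.
import torch
import torch.nn as nn

class SelfAttentionPool(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, tokens):                   # (batch, seq, dim)
        weights = torch.softmax(self.score(tokens), dim=1)
        return (weights * tokens).sum(dim=1)     # (batch, dim)

pool = SelfAttentionPool(dim=100)
sent = torch.randn(2, 20, 100)                   # 20 token embeddings
doc_vec = pool(sent)                             # feed this to a classifier
print(doc_vec.shape)                             # torch.Size([2, 100])
```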
NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task
Title | NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task |
Authors | Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita |
Abstract | In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions. We focused on leveraging multilingual transfer learning and back-translation for the extremely low-resource language pairs: Kazakh↔English and Gujarati↔English translation. For the Chinese↔English translation, we used the provided parallel data augmented with a large quantity of back-translated monolingual data to train state-of-the-art NMT systems. We then employed techniques that have been proven to be most effective, such as back-translation, fine-tuning, and model ensembling, to generate the primary submissions of Chinese↔English. For English→Finnish, our submission from WMT18 remains a strong baseline despite the increase in parallel corpora for this year's task. |
Tasks | Machine Translation, Transfer Learning |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5313/ |
PDF | https://www.aclweb.org/anthology/W19-5313 |
PWC | https://paperswithcode.com/paper/nicts-supervised-neural-machine-translation |
Repo | |
Framework | |
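Back-translation is the central data-augmentation step in the abstract: a reverse-direction model translates target-language monolingual text back into the source language, and the synthetic pairs are added to the real parallel data. A sketch follows; `reverse_model.translate` is a hypothetical stand-in API, not a real library call.

```python
# Generate synthetic (source, target) pairs from target-side monolingual
# text using a reverse-direction translation model.
def back_translate(monolingual_tgt, reverse_model):
    synthetic_pairs = []
    for tgt_sentence in monolingual_tgt:
        # hypothetical API: translate target language back into the source
        synthetic_src = reverse_model.translate(tgt_sentence)
        synthetic_pairs.append((synthetic_src, tgt_sentence))
    return synthetic_pairs

# Usage sketch: train_data = real_parallel + back_translate(mono_en, en2kk)
```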
Kingsoft’s Neural Machine Translation System for WMT19
Title | Kingsoft’s Neural Machine Translation System for WMT19 |
Authors | Xinze Guo, Chang Liu, Xiaolong Li, Yiran Wang, Guoliang Li, Feng Wang, Zhitao Xu, Liuyi Yang, Li Ma, Changliang Li |
Abstract | This paper describes the Kingsoft AI Lab's submission to the WMT2019 news translation shared task. We participated in two language directions: English-Chinese and Chinese-English. For both language directions, we trained several variants of Transformer models using the provided parallel data enlarged with a large quantity of back-translated monolingual data. The best translation result was obtained with ensemble and reranking techniques. According to automatic metrics (BLEU) our Chinese-English system reached the second highest score, and our English-Chinese system reached the second highest score for this subtask. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5317/ |
PDF | https://www.aclweb.org/anthology/W19-5317 |
PWC | https://paperswithcode.com/paper/kingsofts-neural-machine-translation-system |
Repo | |
Framework | |
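The best result above comes from ensembling and reranking. A minimal sketch of that selection step follows: average each hypothesis's score across several models and keep the highest-scoring candidate. The scorer functions and candidates below are toy stand-ins, not the system's actual models.

```python
# Rerank candidate translations by their average score under an ensemble
# of scoring models (e.g. log-probabilities in a real system).
def rerank(candidates, scorers):
    """candidates: hypothesis strings; scorers: functions mapping a
    hypothesis to a model score."""
    def ensemble_score(hyp):
        return sum(score(hyp) for score in scorers) / len(scorers)
    return max(candidates, key=ensemble_score)

# Toy scorers only; real scorers would be NMT model log-probabilities.
best = rerank(["hyp a", "hypothesis b"], [len, lambda h: -h.count(" ")])
print(best)
```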
Label-Specific Document Representation for Multi-Label Text Classification
Title | Label-Specific Document Representation for Multi-Label Text Classification |
Authors | Lin Xiao, Xin Huang, Boli Chen, Liping Jing |
Abstract | Multi-label text classification (MLTC) aims to tag the most relevant labels for a given document. In this paper, we propose a Label-Specific Attention Network (LSAN) to learn a label-specific document representation. LSAN takes advantage of label semantic information to determine the semantic connection between labels and the document for constructing label-specific document representations. Meanwhile, the self-attention mechanism is adopted to identify the label-specific document representation from document content information. In order to seamlessly integrate the above two parts, an adaptive fusion strategy is proposed, which can effectively output the comprehensive label-specific document representation to build the multi-label text classifier. Extensive experimental results demonstrate that LSAN consistently outperforms the state-of-the-art methods on four different datasets, especially on the prediction of low-frequency labels. The code and hyper-parameter settings are released to facilitate other researchers. |
Tasks | Multi-Label Text Classification, Text Classification |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1044/ |
PDF | https://www.aclweb.org/anthology/D19-1044 |
PWC | https://paperswithcode.com/paper/label-specific-document-representation-for |
Repo | |
Framework | |
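The label-specific attention at the heart of LSAN lets each label embedding attend over the word representations to build its own document vector. A shape-level sketch follows; the self-attention branch and the adaptive fusion are omitted, and all dimensions are illustrative assumptions.

```python
# Each label embedding attends over word vectors, producing one document
# representation per label; per-label scores come from a dot product.
import torch

batch, seq, dim, n_labels = 2, 30, 64, 10
words = torch.randn(batch, seq, dim)             # word-level representations
label_emb = torch.randn(n_labels, dim)           # label semantic embeddings

attn = torch.softmax(words @ label_emb.T, dim=1)       # (batch, seq, labels)
label_specific_doc = attn.transpose(1, 2) @ words      # (batch, labels, dim)
logits = (label_specific_doc * label_emb).sum(-1)      # per-label score
print(logits.shape)                                    # torch.Size([2, 10])
```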
Hierarchical Transfer Learning for Multi-label Text Classification
Title | Hierarchical Transfer Learning for Multi-label Text Classification |
Authors | Siddhartha Banerjee, Cem Akkaya, Francisco Perez-Sorrosal, Kostas Tsioutsiouliklis |
Abstract | Multi-Label Hierarchical Text Classification (MLHTC) is the task of categorizing documents into one or more topics organized in a hierarchical taxonomy. MLHTC can be formulated by combining multiple binary classification problems with an independent classifier for each category. We propose a novel transfer-learning-based strategy, HTrans, where binary classifiers at lower levels in the hierarchy are initialized using the parameters of the parent classifier and fine-tuned on the child category classification task. In HTrans, we use a Gated Recurrent Unit (GRU)-based deep learning architecture coupled with attention. Compared to binary classifiers trained from scratch, our HTrans approach results in significant improvements of 1% on micro-F1 and 3% on macro-F1 on the RCV1 dataset. Our experiments also show that binary classifiers trained from scratch are significantly better than single multi-label models. |
Tasks | Multi-Label Text Classification, Text Classification, Transfer Learning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1633/ |
PDF | https://www.aclweb.org/anthology/P19-1633 |
PWC | https://paperswithcode.com/paper/hierarchical-transfer-learning-for-multi |
Repo | |
Framework | |
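HTrans initializes a child category's binary classifier from its parent's parameters before fine-tuning. A minimal sketch of that initialization follows; the bare GRU classifier stands in for the paper's GRU-with-attention architecture, and the reduced learning rate is an assumption about the fine-tuning setup.

```python
# The child classifier inherits the parent's trained parameters, then is
# fine-tuned on the child category's binary task.
import copy
import torch
import torch.nn as nn

class BinaryGRUClassifier(nn.Module):
    def __init__(self, vocab=5000, emb=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, ids):
        h, _ = self.gru(self.embed(ids))
        return self.out(h[:, -1])               # logit for "in category"

parent = BinaryGRUClassifier()
# ... train `parent` on the parent category's binary task ...

child = copy.deepcopy(parent)                   # HTrans: inherit parameters
optimizer = torch.optim.Adam(child.parameters(), lr=1e-4)  # then fine-tune
```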