October 16, 2019


Paper Group NAWR 15

HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization. Neural Sign Language Translation. Unsupervised Random Walk Sentence Embeddings: A Strong but Simple Baseline. MGAN: Training Generative Adversarial Nets with Multiple Generators. Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language …

HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization

Title HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization
Authors Kazuya Shimura, Jiyi Li, Fumiyo Fukumoto
Abstract We focus on the multi-label categorization task for short texts and explore the use of a hierarchical structure (HS) of categories. In contrast to existing work using a non-hierarchical flat model, the method leverages the hierarchical relations between the pre-defined categories to tackle the data sparsity problem. The lower the HS level, the lower the categorization performance, because the number of training examples per category at a lower level is much smaller than at an upper level. We propose an approach which can effectively utilize the data in the upper levels to contribute to categorization in the lower levels by applying a Convolutional Neural Network (CNN) with a fine-tuning technique. The results on two benchmark datasets show that the proposed method, Hierarchical Fine-Tuning based CNN (HFT-CNN), is competitive with the state-of-the-art CNN based methods.
Tasks Extreme Multi-Label Classification, Multi-Label Classification, Text Categorization
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1093/
PDF https://www.aclweb.org/anthology/D18-1093
PWC https://paperswithcode.com/paper/hft-cnn-learning-hierarchical-category
Repo https://github.com/ShimShim46/HFT-CNN
Framework none

Neural Sign Language Translation

Title Neural Sign Language Translation
Authors Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, Hermann Ney, Richard Bowden
Abstract Sign Language Recognition (SLR) has been an active research field for the last two decades. However, most research to date has considered SLR as a naive gesture recognition problem. SLR seeks to recognize a sequence of continuous signs but neglects the underlying rich grammatical and linguistic structures of sign language that differ from spoken language. In contrast, we introduce the Sign Language Translation (SLT) problem. Here, the objective is to generate spoken language translations from sign language videos, taking into account the different word orders and grammar. We formalize SLT in the framework of Neural Machine Translation (NMT) for both end-to-end and pretrained settings (using expert knowledge). This allows us to jointly learn the spatial representations, the underlying language model, and the mapping between sign and spoken language. To evaluate the performance of Neural SLT, we collected the first publicly available Continuous SLT dataset, RWTH-PHOENIX-Weather 2014T. It provides spoken language translations and gloss level annotations for German Sign Language videos of weather broadcasts. Our dataset contains over .95M frames with >67K signs from a sign vocabulary of >1K and >99K words from a German vocabulary of >2.8K. We report quantitative and qualitative results for various SLT setups to underpin future research in this newly established field. The upper bound for translation performance is calculated at 19.26 BLEU-4, while our end-to-end frame-level and gloss-level tokenization networks were able to achieve 9.58 and 18.13 respectively.
Tasks Gesture Recognition, Language Modelling, Machine Translation, Sign Language Recognition, Sign Language Translation, Tokenization
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Camgoz_Neural_Sign_Language_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Camgoz_Neural_Sign_Language_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/neural-sign-language-translation
Repo https://github.com/neccam/nslt
Framework tf

Unsupervised Random Walk Sentence Embeddings: A Strong but Simple Baseline

Title Unsupervised Random Walk Sentence Embeddings: A Strong but Simple Baseline
Authors Kawin Ethayarajh
Abstract Using a random walk model of text generation, Arora et al. (2017) proposed a strong baseline for computing sentence embeddings: take a weighted average of word embeddings and modify with SVD. This simple method even outperforms far more complex approaches such as LSTMs on textual similarity tasks. In this paper, we first show that word vector length has a confounding effect on the probability of a sentence being generated in Arora et al.'s model. We propose a random walk model that is robust to this confound, where the probability of word generation is inversely related to the angular distance between the word and sentence embeddings. Our approach beats Arora et al.'s by up to 44.4% on textual similarity tasks and is competitive with state-of-the-art methods. Unlike Arora et al.'s method, ours requires no hyperparameter tuning, which means it can be used when there is no labelled data.
Tasks Representation Learning, Sentence Embeddings, Text Generation, Transfer Learning, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3012/
PDF https://www.aclweb.org/anthology/W18-3012
PWC https://paperswithcode.com/paper/unsupervised-random-walk-sentence-embeddings
Repo https://github.com/kawine/usif
Framework none
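
The baseline that the abstract above summarizes (a weighted average of word embeddings, then an SVD-based correction) can be sketched roughly as follows. This is a minimal illustration of that baseline only, not of the paper's own random walk model. The function name, the smoothing constant `a`, and the `a / (a + p(w))` weighting are assumptions based on the usual description of the Arora et al. baseline; none of them appear in this listing.

```python
import numpy as np

def sif_embeddings(sentences, vectors, word_prob, a=1e-3):
    """Weighted-average sentence embeddings with the dominant direction removed.

    sentences: list of token lists; vectors: token -> 1-D array;
    word_prob: token -> unigram probability. All names are illustrative.
    """
    rows = []
    for tokens in sentences:
        # Rare words get weights near 1, frequent words get weights near 0.
        weights = np.array([a / (a + word_prob[t]) for t in tokens])
        stacked = np.stack([vectors[t] for t in tokens])
        rows.append(weights @ stacked / len(tokens))
    emb = np.stack(rows)
    # The first right singular vector is the direction shared by most sentences.
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    pc = vt[0]
    # Subtract each embedding's projection onto that direction.
    return emb - np.outer(emb @ pc, pc)
```

Note that `a` is exactly the kind of hyperparameter the abstract says the proposed method avoids.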

MGAN: Training Generative Adversarial Nets with Multiple Generators

Title MGAN: Training Generative Adversarial Nets with Multiple Generators
Authors Quan Hoang, Tu Dinh Nguyen, Trung Le, Dinh Phung
Abstract We propose a new approach to training Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators, instead of using a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapsing problem and delivering state-of-the-art results. A minimax formulation can be established among a classifier, a discriminator, and a set of generators, in a similar spirit to GAN. Generators create samples that are intended to come from the same distribution as the training data, whilst the discriminator determines whether samples are true data or generated by generators, and the classifier specifies which generator a sample comes from. The distinguishing feature is that internal samples are created from multiple generators, and then one of them is randomly selected as the final output, similar to the mechanism of a probabilistic mixture model. We term our method Mixture Generative Adversarial Nets (MGAN). We develop theoretical analysis to prove that, at the equilibrium, the Jensen-Shannon divergence (JSD) between the mixture of generators’ distributions and the empirical data distribution is minimal, whilst the JSD among generators’ distributions is maximal, hence effectively avoiding the mode collapsing problem. By utilizing parameter sharing, our proposed model adds minimal computational cost to the standard GAN, and thus can also efficiently scale to large-scale datasets. We conduct extensive experiments on synthetic 2D data and natural image databases (CIFAR-10, STL-10 and ImageNet) to demonstrate the superior performance of our MGAN in achieving state-of-the-art Inception scores over latest baselines, generating diverse and appealing recognizable objects at different resolutions, and specializing in capturing different types of objects by the generators.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=rkmu5b0a-
PDF https://openreview.net/pdf?id=rkmu5b0a-
PWC https://paperswithcode.com/paper/mgan-training-generative-adversarial-nets
Repo https://github.com/qhoangdl/MGAN
Framework tf
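
The sampling step described in the MGAN abstract (pick one of several generators at random, as in a probabilistic mixture model) might look like the sketch below. It covers sampling only, not the minimax training. The function name, the toy generators, and the explicit `mixing_weights` argument are hypothetical and are not taken from the linked repository.

```python
import random

def sample_from_mixture(generators, mixing_weights, noise_fn, rng):
    """Draw one sample from a mixture of generators.

    Returns (index, sample); the index is the label a classifier
    would be asked to recover, as the abstract describes.
    """
    # Choose which generator produces the final output.
    index = rng.choices(range(len(generators)), weights=mixing_weights, k=1)[0]
    # Feed fresh noise to the chosen generator only.
    z = noise_fn(rng)
    return index, generators[index](z)

# Toy usage: two "generators" that cover two separate modes of 1-D data.
toy_generators = [lambda z: z - 10.0, lambda z: z + 10.0]
toy_rng = random.Random(0)
toy_draws = [
    sample_from_mixture(toy_generators, [0.5, 0.5], lambda r: r.gauss(0.0, 1.0), toy_rng)
    for _ in range(200)
]
```

In the toy usage each generator specializes in one mode, which is the behaviour the paper's objective is meant to encourage.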

Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation

Title Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Authors Oliver Adams, Trevor Cohn, Graham Neubig, Hilaria Cruz, Steven Bird, Alexis Michaud
Abstract
Tasks Acoustic Modelling, Language Modelling, Speech Recognition
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1530/
PDF https://www.aclweb.org/anthology/L18-1530
PWC https://paperswithcode.com/paper/evaluation-phonemic-transcription-of-low
Repo https://github.com/oadams/persephone
Framework tf

Progressive Attention Guided Recurrent Network for Salient Object Detection

Title Progressive Attention Guided Recurrent Network for Salient Object Detection
Authors Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, Gang Wang
Abstract Effective convolutional features play an important role in saliency estimation, but how to learn powerful features for saliency is still a challenging task. FCN-based methods directly apply multi-level convolutional features without distinction, which leads to sub-optimal results due to the distraction from redundant details. In this paper, we propose a novel attention guided network which selectively integrates multi-level contextual information in a progressive manner. Attentive features generated by our network can alleviate distraction from the background and thus achieve better performance. On the other hand, it is observed that most existing algorithms conduct salient object detection by exploiting side-output features of the backbone feature extraction network. However, shallower layers of the backbone network lack the ability to obtain global semantic information, which limits effective feature learning. To address the problem, we introduce multi-path recurrent feedback to enhance our proposed progressive attention driven framework. Through multi-path recurrent connections, global semantic information from the top convolutional layer is transferred to shallower layers, which intrinsically refines the entire network. Experimental results on six benchmark datasets demonstrate that our algorithm performs favorably against the state-of-the-art approaches.
Tasks Object Detection, Saliency Prediction, Salient Object Detection
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Progressive_Attention_Guided_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Progressive_Attention_Guided_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/progressive-attention-guided-recurrent
Repo https://github.com/zhangxiaoning666/PAGR
Framework none

Wide Contextual Residual Network with Active Learning for Remote Sensing Image Classification

Title Wide Contextual Residual Network with Active Learning for Remote Sensing Image Classification
Authors Shengjie Liu, Haowen Luo, Ying Tu, Zhi He, Jun Li
Abstract In this paper, we propose a wide contextual residual network (WCRN) with active learning (AL) for remote sensing image (RSI) classification. Although ResNets have achieved great success in various applications (e.g. RSI classification), their performance is limited by the requirement of abundant labeled samples. As it is very difficult and expensive to obtain class labels in the real world, we integrate the proposed WCRN with AL to improve its generalization by using the most informative training samples. Specifically, we first design a wide contextual residual network for RSI classification. We then integrate it with AL to achieve good generalization with a limited number of training samples. Experimental results on the University of Pavia and Flevoland datasets demonstrate that the proposed WCRN with AL can significantly reduce the number of labeled samples needed.
Tasks Active Learning, Classification Of Hyperspectral Images, Hyperspectral Image Classification, Image Classification, Remote Sensing Image Classification
Published 2018-07-22
URL https://ieeexplore.ieee.org/document/8517855
PDF https://www.researchgate.net/publication/328991664_Wide_Contextual_Residual_Network_with_Active_Learning_for_Remote_Sensing_Image_Classification
PWC https://paperswithcode.com/paper/wide-contextual-residual-network-with-active
Repo https://github.com/codeRimoe/DL_for_RSIs
Framework tf

Training RNNs as Fast as CNNs

Title Training RNNs as Fast as CNNs
Authors Tao Lei, Yu Zhang, Yoav Artzi
Abstract Common recurrent neural network architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU) architecture, a recurrent unit that simplifies the computation and exposes more parallelism. In SRU, the majority of computation for each step is independent of the recurrence and can be easily parallelized. SRU is as fast as a convolutional layer and 5-10x faster than an optimized LSTM implementation. We study SRUs on a wide range of applications, including classification, question answering, language modeling, translation and speech recognition. Our experiments demonstrate the effectiveness of SRU and the trade-off it enables between speed and performance.
Tasks Language Modelling, Question Answering, Speech Recognition
Published 2018-01-01
URL https://openreview.net/forum?id=rJBiunlAW
PDF https://openreview.net/pdf?id=rJBiunlAW
PWC https://paperswithcode.com/paper/training-rnns-as-fast-as-cnns
Repo https://github.com/asappresearch/sru
Framework pytorch
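
The SRU abstract's claim that most of the per-step computation is independent of the recurrence can be illustrated with a small NumPy sketch. The gate equations below follow the formulation commonly given for this version of the paper and should be read as an assumption, since the abstract does not state them. The function and parameter names are illustrative, and this is not the repository's CUDA implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru_layer(x, W, W_f, b_f, W_r, b_r):
    """Run one SRU-style layer over x of shape (T, d); returns h of shape (T, d)."""
    # The matrix products do not depend on the recurrent state,
    # so they are computed for all T time steps in one batch.
    x_tilde = x @ W.T
    f = sigmoid(x @ W_f.T + b_f)   # forget gate
    r = sigmoid(x @ W_r.T + b_r)   # reset (highway) gate

    c = np.zeros(x.shape[1])
    h = np.empty_like(x)
    for t in range(x.shape[0]):
        # Only cheap elementwise operations remain sequential.
        c = f[t] * c + (1.0 - f[t]) * x_tilde[t]
        h[t] = r[t] * np.tanh(c) + (1.0 - r[t]) * x[t]
    return h
```

The loop body contains no matrix multiplication, which is where the parallelism described in the abstract comes from in this sketch.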

Extracting Commonsense Properties from Embeddings with Limited Human Guidance

Title Extracting Commonsense Properties from Embeddings with Limited Human Guidance
Authors Yiben Yang, Larry Birnbaum, Ji-Ping Wang, Doug Downey
Abstract Intelligent systems require common sense, but automatically extracting this knowledge from text can be difficult. We propose and assess methods for extracting one type of commonsense knowledge, object-property comparisons, from pre-trained embeddings. In experiments, we show that our approach exceeds the accuracy of previous work but requires substantially less hand-annotated knowledge. Further, we show that an active learning approach that synthesizes common-sense queries can boost accuracy.
Tasks Active Learning, Common Sense Reasoning, Zero-Shot Learning
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2102/
PDF https://www.aclweb.org/anthology/P18-2102
PWC https://paperswithcode.com/paper/extracting-commonsense-properties-from
Repo https://github.com/yangyiben/PCE
Framework pytorch

Deep Diffeomorphic Transformer Networks

Title Deep Diffeomorphic Transformer Networks
Authors Nicki Skafte Detlefsen, Oren Freifeld, Søren Hauberg
Abstract Spatial Transformer layers allow neural networks, at least in principle, to be invariant to large spatial transformations in image data. The model has, however, seen limited uptake as most practical implementations support only transformations that are too restricted, e.g. affine or homographic maps, and/or destructive maps, such as thin plate splines. We investigate the use of flexible diffeomorphic image transformations within such networks and demonstrate that significant performance gains can be attained over currently-used models. The learned transformations are found to be both simple and intuitive, thereby providing insights into individual problem domains. With the proposed framework, a standard convolutional neural network matches state-of-the-art results on face verification with only two extra lines of simple TensorFlow code.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Detlefsen_Deep_Diffeomorphic_Transformer_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Detlefsen_Deep_Diffeomorphic_Transformer_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/deep-diffeomorphic-transformer-networks
Repo https://github.com/SkafteNicki/ddtn
Framework tf

How Time Matters: Learning Time-Decay Attention for Contextual Spoken Language Understanding in Dialogues

Title How Time Matters: Learning Time-Decay Attention for Contextual Spoken Language Understanding in Dialogues
Authors Shang-Yu Su, Pei-Chieh Yuan, Yun-Nung Chen
Abstract Spoken language understanding (SLU) is an essential component in conversational systems. Most SLU components treat each utterance independently, and the following components then aggregate the multi-turn information in separate phases. In order to avoid error propagation and effectively utilize contexts, prior work leveraged history for contextual SLU. However, most previous models only paid attention to the related content in history utterances, ignoring their temporal information. In dialogues, it is intuitive that the most recent utterances are more important than the least recent ones; in other words, time-aware attention should decay over time. Therefore, this paper designs and investigates various types of time-decay attention on the sentence-level and speaker-level, and further proposes a flexible universal time-decay attention mechanism. The experiments on the benchmark Dialogue State Tracking Challenge (DSTC4) dataset show that the proposed time-decay attention mechanisms significantly improve the state-of-the-art model for contextual understanding performance.
Tasks Dialogue State Tracking, Image Captioning, Machine Translation, Slot Filling, Spoken Dialogue Systems, Spoken Language Understanding
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1194/
PDF https://www.aclweb.org/anthology/N18-1194
PWC https://paperswithcode.com/paper/how-time-matters-learning-time-decay
Repo https://github.com/MiuLab/Time-Decay-SLU
Framework tf
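
The idea in the abstract above, that attention over dialogue history should decay with distance from the current utterance, can be sketched as below. The inverse-power decay and both function names are illustrative assumptions. The paper studies several decay shapes and learns their parameters, which this sketch does not attempt.

```python
def time_decay_weights(distances, power=1.0):
    """Normalized attention weights that shrink as the utterance gets older.

    distances: positive distances (in turns) from the current utterance.
    """
    raw = [1.0 / (d ** power) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]

def summarize_history(utterance_vectors, distances, power=1.0):
    """Weighted sum of history utterance vectors using time-decay weights."""
    weights = time_decay_weights(distances, power)
    dim = len(utterance_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, utterance_vectors))
        for i in range(dim)
    ]
```

A larger `power` concentrates the weight on the most recent utterances, while `power=0` reduces to a plain average of the history.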

Deep Photo Enhancer: Unpaired Learning for Image Enhancement From Photographs With GANs

Title Deep Photo Enhancer: Unpaired Learning for Image Enhancement From Photographs With GANs
Authors Yu-Sheng Chen, Yu-Ching Wang, Man-Hsin Kao, Yung-Yu Chuang
Abstract This paper proposes an unpaired learning method for image enhancement. Given a set of photographs with the desired characteristics, the proposed method learns a photo enhancer which transforms an input image into an enhanced image with those characteristics. The method is based on the framework of two-way generative adversarial networks (GANs) with several improvements. First, we augment the U-Net with global features and show that it is more effective. The global U-Net acts as the generator in our GAN model. Second, we improve Wasserstein GAN (WGAN) with an adaptive weighting scheme. With this scheme, training converges faster and better, and is less sensitive to parameters than WGAN-GP. Finally, we propose to use individual batch normalization layers for generators in two-way GANs. It helps generators better adapt to their own input distributions. All together, they significantly improve the stability of GAN training for our application. Both quantitative and visual results show that the proposed method is effective for enhancing images.
Tasks Image Enhancement
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_Deep_Photo_Enhancer_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_Deep_Photo_Enhancer_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/deep-photo-enhancer-unpaired-learning-for
Repo https://github.com/nothinglo/Deep-Photo-Enhancer
Framework tf

Deep One-Class Classification

Title Deep One-Class Classification
Authors Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, Marius Kloft
Abstract Despite the great advances made by deep learning in many machine learning problems, there is a relative dearth of deep learning approaches for anomaly detection. Those approaches which do exist involve networks trained to perform a task other than anomaly detection, namely generative models or compression, which are in turn adapted for use in anomaly detection; they are not trained on an anomaly detection based objective. In this paper we introduce a new anomaly detection method, Deep Support Vector Data Description, which is trained on an anomaly detection based objective. The adaptation to the deep regime necessitates that our neural network and training procedure satisfy certain properties, which we demonstrate theoretically. We show the effectiveness of our method on MNIST and CIFAR-10 image benchmark datasets as well as on the detection of adversarial examples of GTSRB stop signs.
Tasks Anomaly Detection
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2483
PDF http://proceedings.mlr.press/v80/ruff18a/ruff18a.pdf
PWC https://paperswithcode.com/paper/deep-one-class-classification
Repo https://github.com/lukasruff/Deep-SVDD
Framework pytorch
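
An anomaly-detection-based objective of the kind the abstract refers to can be sketched as pulling network outputs toward a fixed center and scoring test points by their distance from it. This follows the commonly described one-class form of Deep SVDD and is an assumption here, since the abstract gives no formula. The feature-extracting network is left out, and `features`, `center`, `weights`, and `lam` are hypothetical names.

```python
import numpy as np

def one_class_objective(features, center, weights, lam=1e-3):
    """Mean squared distance of mapped points to the center, plus weight decay.

    features: (n, d) network outputs; center: (d,) fixed center;
    weights: list of parameter arrays; lam: decay strength (illustrative).
    """
    dist = np.sum((features - center) ** 2, axis=1)
    decay = 0.5 * lam * sum(np.sum(w ** 2) for w in weights)
    return dist.mean() + decay

def anomaly_scores(features, center):
    """Squared distance to the center; larger means more anomalous."""
    return np.sum((features - center) ** 2, axis=1)
```

In practice the features would come from a trained network, and a threshold on the score would separate normal points from anomalies.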

Active Learning for Non-Parametric Regression Using Purely Random Trees

Title Active Learning for Non-Parametric Regression Using Purely Random Trees
Authors Jack Goetz, Ambuj Tewari, Paul Zimmerman
Abstract Active learning is the task of using labelled data to select additional points to label, with the goal of fitting the most accurate model with a fixed budget of labelled points. In binary classification, active learning is known to produce faster rates than passive learning for a broad range of settings. However, in regression, restrictive structure and tailored methods were previously needed to obtain theoretically superior performance. In this paper we propose an intuitive tree-based active learning algorithm for non-parametric regression with provable improvement over random sampling. When implemented with Mondrian Trees, our algorithm is tuning-parameter-free, consistent and minimax optimal for Lipschitz functions.
Tasks Active Learning
Published 2018-12-01
URL http://papers.nips.cc/paper/7520-active-learning-for-non-parametric-regression-using-purely-random-trees
PDF http://papers.nips.cc/paper/7520-active-learning-for-non-parametric-regression-using-purely-random-trees.pdf
PWC https://paperswithcode.com/paper/active-learning-for-non-parametric-regression
Repo https://github.com/jackrgoetz/Mondrian_Tree_AL
Framework none

Graph Classification using Structural Attention

Title Graph Classification using Structural Attention
Authors John Boaz Lee, Ryan Rossi, Xiangnan Kong
Abstract Graph classification is a problem with practical applications in many different domains. To solve this problem, one usually calculates certain graph statistics (i.e., graph features) that help discriminate between graphs of different classes. When calculating such features, most existing approaches process the entire graph. In a graphlet-based approach, for instance, the entire graph is processed to get the total count of different graphlets or subgraphs. In many real-world applications, however, graphs can be noisy with discriminative patterns confined to certain regions in the graph only. In this work, we study the problem of attention-based graph classification. The use of attention allows us to focus on small but informative parts of the graph, avoiding noise in the rest of the graph. We present a novel RNN model, called the Graph Attention Model (GAM), that processes only a portion of the graph by adaptively selecting a sequence of “informative” nodes. Experimental results on multiple real-world datasets show that the proposed method is competitive against various well-known methods in graph classification even though our method is limited to only a portion of the graph.
Tasks Graph Classification
Published 2018-07-19
URL https://www.kdd.org/kdd2018/accepted-papers/view/graph-classification-using-structural-attention
PDF http://ryanrossi.com/pubs/KDD18-graph-attention-model.pdf
PWC https://paperswithcode.com/paper/graph-classification-using-structural
Repo https://github.com/benedekrozemberczki/GAM
Framework pytorch