January 26, 2020

Paper Group ANR 1397

SparseTrain: Leveraging Dynamic Sparsity in Training DNNs on General-Purpose SIMD Processors

Title SparseTrain: Leveraging Dynamic Sparsity in Training DNNs on General-Purpose SIMD Processors
Authors Zhangxiaowen Gong, Houxiang Ji, Christopher Fletcher, Christopher Hughes, Josep Torrellas
Abstract Our community has greatly improved the efficiency of deep learning applications, including by exploiting sparsity in inputs. Most of that work, though, is for inference, where weight sparsity is known statically, and/or for specialized hardware. We propose a scheme to leverage dynamic sparsity during training. In particular, we exploit zeros introduced by the ReLU activation function into both feature maps and their gradients. This is challenging because the sparsity degree is moderate and the locations of zeros change over time. We also rely purely on software. We identify zeros in a dense data representation without transforming the data and perform conventional vectorized computation. Variations of the scheme are applicable to all major components of training: forward propagation, backward propagation by inputs, and backward propagation by weights. Our method significantly outperforms a highly-optimized dense direct convolution on several popular deep neural networks. At realistic sparsity, we speed up the training of the non-initial convolutional layers in VGG16, ResNet-34, ResNet-50, and Fixup ResNet-50 by 2.19x, 1.37x, 1.31x, and 1.51x respectively on an Intel Skylake-X CPU.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.10175v1
PDF https://arxiv.org/pdf/1911.10175v1.pdf
PWC https://paperswithcode.com/paper/sparsetrainleveraging-dynamic-sparsity-in
Repo
Framework
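
The core idea above is to find ReLU zeros in place and skip the corresponding multiply-accumulates. The NumPy sketch below illustrates only that idea on a toy matrix product; it is not the paper's vectorized SIMD convolution kernels, and the shapes are arbitrary.

```python
import numpy as np

def sparse_aware_matmul(activations, weights):
    """Dense-format matmul that skips work wherever the ReLU output is zero.

    Toy illustration of the SparseTrain idea: zeros are detected in the dense
    representation itself, with no conversion to a sparse format.
    """
    out = np.zeros((activations.shape[0], weights.shape[1]))
    for i, row in enumerate(activations):
        nz = np.nonzero(row)[0]           # positions that survived ReLU
        if nz.size == 0:
            continue                      # whole row is zero: skip all work
        out[i] = row[nz] @ weights[nz]    # accumulate only non-zero contributions
    return out

x = np.maximum(np.random.randn(8, 64), 0.0)   # ReLU output: moderately sparse
w = np.random.randn(64, 32)
assert np.allclose(sparse_aware_matmul(x, w), x @ w)
```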

Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions

Title Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions
Authors Yunwen Lei, Ting Hu, Guiying Li, Ke Tang
Abstract Stochastic gradient descent (SGD) is a popular and efficient method with wide applications in training deep neural nets and other nonconvex models. While the behavior of SGD is well understood in the convex learning setting, the existing theoretical results for SGD applied to nonconvex objective functions are far from mature. For example, existing results require imposing a nontrivial assumption of uniform boundedness of gradients for all iterates encountered in the learning process, which is hard to verify in practical implementations. In this paper, we establish a rigorous theoretical foundation for SGD in nonconvex learning by showing that this boundedness assumption can be removed without affecting convergence rates. In particular, we establish sufficient conditions for almost sure convergence as well as optimal convergence rates for SGD applied to both general nonconvex objective functions and gradient-dominated objective functions. Linear convergence is further derived in the case of zero variance.
Tasks
Published 2019-02-03
URL https://arxiv.org/abs/1902.00908v3
PDF https://arxiv.org/pdf/1902.00908v3.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-descent-for-nonconvex
Repo
Framework
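
For reference, the objects the abstract refers to can be written in standard notation: the SGD recursion with step sizes and the gradient-dominance (Polyak-Lojasiewicz) condition that characterizes "gradient-dominated" objectives. The exact rates proved without the bounded-gradient assumption are in the linked paper.

```latex
% SGD update on a stochastic sample z_t with step size \eta_t
w_{t+1} = w_t - \eta_t \nabla f(w_t; z_t)

% Gradient-dominated (Polyak-Lojasiewicz) condition with parameter \mu > 0
\tfrac{1}{2}\,\|\nabla F(w)\|^2 \;\ge\; \mu \bigl(F(w) - F(w^\ast)\bigr) \quad \text{for all } w
```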

A Comparative Study on End-to-end Speech to Text Translation

Title A Comparative Study on End-to-end Speech to Text Translation
Authors Parnia Bahar, Tobias Bieschke, Hermann Ney
Abstract Recent advances in deep learning show that end-to-end speech-to-text translation models are a promising approach for direct speech translation. In this work, we provide an overview of different end-to-end architectures, as well as the usage of an auxiliary connectionist temporal classification (CTC) loss for better convergence. We also investigate pre-training variants, such as initializing different components of a model using pre-trained models, and their impact on the final performance, which gives gains of up to 4% in BLEU and 5% in TER. Our experiments are performed on 270h IWSLT TED-talks En->De and 100h LibriSpeech Audiobooks En->Fr. We also show improvements over the current end-to-end state-of-the-art systems on both tasks.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.08870v1
PDF https://arxiv.org/pdf/1911.08870v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-on-end-to-end-speech-to
Repo
Framework
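
The auxiliary CTC loss mentioned above is typically added to the attention decoder's cross-entropy objective. A minimal PyTorch sketch of that combination is below; the shapes, the 0.3 weight, and the random tensors standing in for encoder/decoder outputs are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: the encoder emits T frames of log-probabilities over the
# target vocabulary for CTC, and the attention decoder emits per-token logits.
vocab, T, N, U = 1000, 50, 4, 12
enc_log_probs = torch.randn(T, N, vocab).log_softmax(-1)   # (time, batch, vocab)
dec_logits = torch.randn(N, U, vocab)                      # (batch, tokens, vocab)
targets = torch.randint(1, vocab, (N, U))                  # 0 is reserved for blank

ce = nn.CrossEntropyLoss()(dec_logits.reshape(-1, vocab), targets.reshape(-1))
ctc = nn.CTCLoss(blank=0)(enc_log_probs, targets,
                          input_lengths=torch.full((N,), T),
                          target_lengths=torch.full((N,), U))
loss = ce + 0.3 * ctc    # 0.3 is an illustrative weight, not from the paper
```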

Detecting GAN generated Fake Images using Co-occurrence Matrices

Title Detecting GAN generated Fake Images using Co-occurrence Matrices
Authors Lakshmanan Nataraj, Tajuddin Manhar Mohammed, Shivkumar Chandrasekaran, Arjuna Flenner, Jawadul H. Bappy, Amit K. Roy-Chowdhury, B. S. Manjunath
Abstract The advent of Generative Adversarial Networks (GANs) has brought about completely novel ways of transforming and manipulating pixels in digital images. GAN based techniques such as Image-to-Image translations, DeepFakes, and other automated methods have become increasingly popular in creating fake images. In this paper, we propose a novel approach to detect GAN generated fake images using a combination of co-occurrence matrices and deep learning. We extract co-occurrence matrices on three color channels in the pixel domain and train a model using a deep convolutional neural network (CNN) framework. Experimental results on two diverse and challenging GAN datasets comprising more than 56,000 images based on unpaired image-to-image translations (cycleGAN [1]) and facial attributes/expressions (StarGAN [2]) show that our approach is promising and achieves more than 99% classification accuracy in both datasets. Further, our approach also generalizes well and achieves good results when trained on one dataset and tested on the other.
Tasks
Published 2019-03-15
URL https://arxiv.org/abs/1903.06836v2
PDF https://arxiv.org/pdf/1903.06836v2.pdf
PWC https://paperswithcode.com/paper/detecting-gan-generated-fake-images-using-co
Repo
Framework
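
A minimal sketch of the co-occurrence feature described above: one gray-level co-occurrence matrix per color channel for a single pixel offset, computed in NumPy. The offset, normalization, and the CNN that consumes the stacked matrices are left out; see the paper for the actual setup.

```python
import numpy as np

def cooccurrence(channel, offset=(0, 1), levels=256):
    """Co-occurrence matrix of one color channel for a fixed pixel offset."""
    dy, dx = offset
    h, w = channel.shape
    a = channel[:h - dy, :w - dx].ravel()
    b = channel[dy:, dx:].ravel()
    mat = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(mat, (a, b), 1)            # count co-occurring intensity pairs
    return mat

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
features = np.stack([cooccurrence(img[..., c]) for c in range(3)])  # (3, 256, 256)
# `features` would then be fed to a CNN classifier (real vs. GAN-generated).
```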

Modularized Textual Grounding for Counterfactual Resilience

Title Modularized Textual Grounding for Counterfactual Resilience
Authors Zhiyuan Fang, Shu Kong, Charless Fowlkes, Yezhou Yang
Abstract Computer Vision applications often require a textual grounding module with precision, interpretability, and resilience to counterfactual inputs/queries. To achieve high grounding precision, current textual grounding methods heavily rely on large-scale training data with manual annotations at the pixel level. Such annotations are expensive to obtain and thus severely narrow the model’s scope of real-world applications. Moreover, most of these methods sacrifice interpretability and generalizability, and they neglect the importance of being resilient to counterfactual inputs. To address these issues, we propose a visual grounding system which is 1) end-to-end trainable in a weakly supervised fashion with only image-level annotations, and 2) counterfactually resilient owing to the modular design. Specifically, we decompose textual descriptions into three levels: entity, semantic attribute, and color information, and perform compositional grounding progressively. We validate our model through a series of experiments and demonstrate its improvement over the state-of-the-art methods. In particular, our model’s performance not only surpasses other weakly/un-supervised methods and even approaches the strongly supervised ones, but is also interpretable for decision making and performs much better in the face of counterfactual classes than all the others.
Tasks Natural Language Visual Grounding, Phrase Grounding, Weakly-Supervised Object Localization
Published 2019-04-07
URL https://arxiv.org/abs/1904.03589v2
PDF https://arxiv.org/pdf/1904.03589v2.pdf
PWC https://paperswithcode.com/paper/modularized-textual-grounding-for
Repo
Framework
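
A toy sketch of the modular decomposition described above: separate scorers for the entity, attribute, and color levels produce per-region scores that are composed multiplicatively, and levels absent from a query are skipped. The random "modules", the feature dimension, and the multiplicative composition are illustrative assumptions rather than the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
regions = rng.random((10, 16))                    # hypothetical region features

# Toy per-level scorers standing in for the entity / attribute / color modules;
# in the paper each is a trained network, here they are random projections.
def make_module():
    w = rng.random(16)
    return lambda text, feats: feats @ w

modules = {"entity": make_module(), "attribute": make_module(), "color": make_module()}

def ground(query_levels, region_features):
    """Compose per-level region scores multiplicatively (illustrative choice)."""
    score = np.ones(region_features.shape[0])
    for level, text in query_levels.items():
        if text is not None:                      # a level may be absent from the query
            score *= modules[level](text, region_features)
    return int(score.argmax())                    # index of the best-matching region

best = ground({"entity": "person", "attribute": "tall", "color": None}, regions)
```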

Weakly-Supervised Degree of Eye-Closeness Estimation

Title Weakly-Supervised Degree of Eye-Closeness Estimation
Authors Eyasu Mequanint, Shuai Zhang, Bijan Forutanpour, Yingyong Qi, Ning Bi
Abstract Following recent technological advances there is a growing interest in building non-intrusive methods that help us communicate with computing devices. In this regard, accurate information from the eye is a promising input medium between a user and computing devices. In this paper we propose a method that captures the degree of eye closeness. Although many methods exist for detection of eyelid openness, they are inherently unable to perform satisfactorily in real-world applications. Detailed eye state estimation is more important, in extracting meaningful information, than estimating whether eyes are open or closed. However, learning a reliable eye state estimator requires accurate annotations, which are cost-prohibitive. In this work, we leverage synthetic face images which can be generated via computer graphics rendering techniques and automatically annotated with different levels of eye openness. These synthesized training images, however, have a domain shift from real-world data. To alleviate this issue, we propose a weakly-supervised method which utilizes the accurate annotations of the synthetic data set to learn the degree of eye openness accurately, and the weakly labeled (open or closed) real-world eye data set to control the domain shift. We introduce a data set of 1.3M synthetic face images with detailed eye openness and eye gaze information, and 21k real-world images with open/closed annotations. The dataset will be released online upon acceptance. Extensive experiments validate the effectiveness of the proposed approach.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.10845v1
PDF https://arxiv.org/pdf/1910.10845v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-degree-of-eye-closeness
Repo
Framework
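
One plausible reading of the training setup above is a joint objective: a regression loss on synthetic images with exact openness labels plus a weak open/closed loss on real images. The PyTorch sketch below shows that combination with a toy model; the architecture, loss weight, and random tensors are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Toy predictor of eye openness in [0, 1]; the real model is a CNN on face crops.
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 64), nn.ReLU(), nn.Linear(64, 1))

synth_x, synth_y = torch.randn(16, 1, 32, 32), torch.rand(16, 1)      # exact degree labels
real_x, real_y = torch.randn(16, 1, 32, 32), torch.randint(0, 2, (16, 1)).float()

pred_synth = torch.sigmoid(model(synth_x))
pred_real = torch.sigmoid(model(real_x))

# Regression on synthetic data + weak binary loss on real data; 0.5 is illustrative.
loss = nn.functional.mse_loss(pred_synth, synth_y) \
     + 0.5 * nn.functional.binary_cross_entropy(pred_real, real_y)
loss.backward()
```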

Pattern Spotting in Historical Documents Using Convolutional Models

Title Pattern Spotting in Historical Documents Using Convolutional Models
Authors Ignacio Úbeda, Jose M. Saavedra, Stéphane Nicolas, Caroline Petitjean, Laurent Heutte
Abstract Pattern spotting consists of searching in a collection of historical document images for occurrences of a graphical object using an image query. Contrary to object detection, no prior information or predefined class is given about the query, so training a model of the object is not feasible. In this paper, a convolutional neural network approach is proposed to tackle this problem. We use RetinaNet as a feature extractor to obtain multiscale embeddings of the document regions and of the queries. Experiments conducted on the DocExplore dataset show that our proposal is better at locating patterns and requires less storage for indexing images than the state-of-the-art system, but fails at retrieving some pages containing multiple instances of the query.
Tasks Object Detection
Published 2019-06-20
URL https://arxiv.org/abs/1906.08580v1
PDF https://arxiv.org/pdf/1906.08580v1.pdf
PWC https://paperswithcode.com/paper/pattern-spotting-in-historical-documents
Repo
Framework
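
Once region and query embeddings exist, spotting reduces to a nearest-neighbour search. The sketch below shows only that retrieval step with random placeholder embeddings; in the paper the embeddings come from RetinaNet feature maps.

```python
import numpy as np

def retrieve(query_emb, region_embs, top_k=5):
    """Rank candidate document regions by cosine similarity to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    r = region_embs / np.linalg.norm(region_embs, axis=1, keepdims=True)
    scores = r @ q
    return np.argsort(-scores)[:top_k]            # indices of the best regions

regions = np.random.randn(1000, 256)              # placeholder multiscale region embeddings
query = np.random.randn(256)
print(retrieve(query, regions))
```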

Adapting Image Super-Resolution State-of-the-arts and Learning Multi-model Ensemble for Video Super-Resolution

Title Adapting Image Super-Resolution State-of-the-arts and Learning Multi-model Ensemble for Video Super-Resolution
Authors Chao Li, Dongliang He, Xiao Liu, Yukang Ding, Shilei Wen
Abstract Recently, image super-resolution has been widely studied and achieved significant progress by leveraging the power of deep convolutional neural networks. However, there has been limited advancement in video super-resolution (VSR) due to the complex temporal patterns in videos. In this paper, we investigate how to adapt state-of-the-art methods of image super-resolution for video super-resolution. The proposed adapting method is straightforward. The information among successive frames is well exploited, while the overhead on the original image super-resolution method is negligible. Furthermore, we propose a learning-based method to ensemble the outputs from multiple super-resolution models. Our methods show superior performance and rank second in the NTIRE2019 Video Super-Resolution Challenge Track 1.
Tasks Image Super-Resolution, Super-Resolution, Video Super-Resolution
Published 2019-05-07
URL https://arxiv.org/abs/1905.02462v1
PDF https://arxiv.org/pdf/1905.02462v1.pdf
PWC https://paperswithcode.com/paper/adapting-image-super-resolution-state-of-the
Repo
Framework
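
A minimal sketch of a learned ensemble for this setting: a 1x1 convolution predicts per-pixel, per-model weights and the super-resolved outputs are fused by a weighted sum. The shapes, the number of models, and the fusion network are illustrative; the challenge entry's actual ensemble model is described in the paper.

```python
import torch
import torch.nn as nn

num_models = 3
outputs = torch.randn(num_models, 1, 3, 64, 64)            # (model, batch, C, H, W)
stacked = outputs.permute(1, 0, 2, 3, 4).reshape(1, num_models * 3, 64, 64)

fusion = nn.Conv2d(num_models * 3, num_models, kernel_size=1)  # toy fusion network
weights = torch.softmax(fusion(stacked), dim=1)                # (batch, model, H, W)

# Weighted per-pixel combination of the candidate super-resolved frames.
fused = (outputs.permute(1, 0, 2, 3, 4) * weights.unsqueeze(2)).sum(dim=1)  # (1, 3, 64, 64)
```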

weg2vec: Event embedding for temporal networks

Title weg2vec: Event embedding for temporal networks
Authors Maddalena Torricelli, Márton Karsai, Laetitia Gauvin
Abstract Network embedding techniques are powerful for capturing structural regularities in networks and identifying similarities between their local fabrics. However, conventional network embedding models are developed for static structures, commonly consider nodes only, and are seriously challenged when the network varies in time. Temporal networks may provide an advantage in the description of real systems, but they encode more complex information, which so far only a handful of methods can represent effectively. Here, we propose a new method of event embedding of temporal networks, called weg2vec, which builds on temporal and structural similarities of events to learn a low-dimensional representation of a temporal network. This projection successfully captures latent structures and similarities between events involving different nodes at different times and provides ways to predict the final outcome of spreading processes unfolding on the temporal structure.
Tasks Network Embedding
Published 2019-11-06
URL https://arxiv.org/abs/1911.02425v1
PDF https://arxiv.org/pdf/1911.02425v1.pdf
PWC https://paperswithcode.com/paper/weg2vec-event-embedding-for-temporal-networks
Repo
Framework
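
A heavily simplified stand-in for the idea above: score pairs of events as similar when they share nodes and are close in time, then factorize the resulting matrix to get low-dimensional event vectors. weg2vec itself builds a weighted neighbourhood co-occurrence that is fed to a skip-gram model, so the SVD step here is only an illustration.

```python
import numpy as np

# Each event is (time, set of involved nodes); values are made up for illustration.
events = [(0, {"a", "b"}), (1, {"b", "c"}), (2, {"c", "d"}), (5, {"a", "d"})]

n = len(events)
co = np.zeros((n, n))
for i, (ti, ni) in enumerate(events):
    for j, (tj, nj) in enumerate(events):
        if i != j and ni & nj:
            co[i, j] = np.exp(-abs(ti - tj))      # shared nodes, decaying with time gap

u, s, _ = np.linalg.svd(co)
embeddings = u[:, :2] * s[:2]                     # 2-dimensional event embeddings
```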

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

Title Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness
Authors Andrey Malinin, Mark Gales
Abstract Ensemble approaches for uncertainty estimation have recently been applied to the tasks of misclassification detection, out-of-distribution input detection and adversarial attack detection. Prior Networks have been proposed as an approach to efficiently emulate an ensemble of models for classification by parameterising a Dirichlet prior distribution over output distributions. These models have been shown to outperform alternative ensemble approaches, such as Monte-Carlo Dropout, on the task of out-of-distribution input detection. However, scaling Prior Networks to complex datasets with many classes is difficult using the training criteria originally proposed. This paper makes two contributions. First, we show that the appropriate training criterion for Prior Networks is the reverse KL-divergence between Dirichlet distributions. This addresses issues in the nature of the training data target distributions, enabling prior networks to be successfully trained on classification tasks with arbitrarily many classes, as well as improving out-of-distribution detection performance. Second, taking advantage of this new training criterion, this paper investigates using Prior Networks to detect adversarial attacks and proposes a generalized form of adversarial training. It is shown that the construction of successful adaptive whitebox attacks, which affect the prediction and evade detection, against Prior Networks trained on CIFAR-10 and CIFAR-100 using the proposed approach requires a greater amount of computational effort than against networks defended using standard adversarial training or MC-dropout.
Tasks Adversarial Attack, Image Classification, Out-of-Distribution Detection
Published 2019-05-31
URL https://arxiv.org/abs/1905.13472v2
PDF https://arxiv.org/pdf/1905.13472v2.pdf
PWC https://paperswithcode.com/paper/reverse-kl-divergence-training-of-prior
Repo
Framework
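
The reverse KL criterion above compares the predicted Dirichlet against a target Dirichlet in the KL(model || target) direction. The PyTorch sketch below shows that loss; the peaked target concentrations and the exponential link from logits are illustrative choices, not the paper's exact construction.

```python
import torch
from torch.distributions import Dirichlet, kl_divergence

num_classes, beta = 10, 100.0
logits = torch.randn(4, num_classes, requires_grad=True)
alpha_model = torch.exp(logits)                               # predicted concentrations

labels = torch.tensor([0, 3, 7, 2])
alpha_target = torch.ones(4, num_classes)
alpha_target[torch.arange(4), labels] += beta                 # peaked target Dirichlet

# Reverse direction: KL(model || target), averaged over the batch.
loss = kl_divergence(Dirichlet(alpha_model), Dirichlet(alpha_target)).mean()
loss.backward()
```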

Solving Advanced Argumentation Problems with Answer Set Programming

Title Solving Advanced Argumentation Problems with Answer Set Programming
Authors Gerhard Brewka, Martin Diller, Georg Heissenberger, Thomas Linsbichler, Stefan Woltran
Abstract Powerful formalisms for abstract argumentation have been proposed, among them abstract dialectical frameworks (ADFs) that allow for a succinct and flexible specification of the relationship between arguments, and the GRAPPA framework which allows argumentation scenarios to be represented as arbitrary edge-labelled graphs. The complexity of ADFs and GRAPPA is located beyond NP and ranges up to the third level of the polynomial hierarchy. The combined complexity of Answer Set Programming (ASP) exactly matches this complexity when programs are restricted to predicates of bounded arity. In this paper, we exploit this coincidence and present novel efficient translations from ADFs and GRAPPA to ASP. More specifically, we provide reductions for the five main ADF semantics of admissible, complete, preferred, grounded, and stable interpretations, and exemplify how these reductions need to be adapted for GRAPPA for the admissible, complete and preferred semantics. Under consideration in Theory and Practice of Logic Programming (TPLP).
Tasks Abstract Argumentation
Published 2019-12-05
URL https://arxiv.org/abs/1912.02734v1
PDF https://arxiv.org/pdf/1912.02734v1.pdf
PWC https://paperswithcode.com/paper/solving-advanced-argumentation-problems-with
Repo
Framework
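
The paper's reductions target ADF and GRAPPA semantics, which sit beyond NP; as a much simpler point of reference, the sketch below brute-forces admissible sets of a plain abstract argumentation framework (conflict-free sets that defend all of their members). The tiny framework is made up for illustration.

```python
from itertools import combinations

args = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "c")}                 # a attacks b, b attacks c

def admissible(s):
    conflict_free = not any((x, y) in attacks for x in s for y in s)
    defended = all(
        any((z, y) in attacks for z in s)          # some member counter-attacks y
        for x in s for y in args if (y, x) in attacks
    )
    return conflict_free and defended

sets = [set(c) for r in range(len(args) + 1) for c in combinations(sorted(args), r)]
print([s for s in sets if admissible(s)])          # set(), {'a'}, {'a', 'c'}
```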

Weakly Supervised Semantic Segmentation Using Constrained Dominant Sets

Title Weakly Supervised Semantic Segmentation Using Constrained Dominant Sets
Authors Sinem Aslan, Marcello Pelillo
Abstract The availability of large-scale data sets is an essential prerequisite for deep learning based semantic segmentation schemes. Since obtaining pixel-level labels is extremely expensive, supervising deep semantic segmentation networks using low-cost weak annotations has been an attractive research problem in recent years. In this work, we explore the potential of Constrained Dominant Sets (CDS) for generating multi-labeled full mask predictions to train a fully convolutional network (FCN) for semantic segmentation. Our experimental results show that using CDS yields higher-quality mask predictions compared to methods that have been adopted in the literature for the same purpose.
Tasks Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2019-09-20
URL https://arxiv.org/abs/1909.09414v1
PDF https://arxiv.org/pdf/1909.09414v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-semantic-segmentation-using-3
Repo
Framework
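
The dominant-set machinery behind CDS can be illustrated with plain replicator dynamics on an affinity matrix, as in the sketch below. The constrained variant used in the paper additionally biases the dynamics toward seed nodes derived from the weak labels, and the resulting clusters are turned into mask supervision for the FCN; none of that is shown here.

```python
import numpy as np

def dominant_set(affinity, iters=200, tol=1e-8):
    """Replicator dynamics for dominant-set extraction (unconstrained core only)."""
    n = affinity.shape[0]
    x = np.full(n, 1.0 / n)                       # start from the barycenter
    for _ in range(iters):
        x_new = x * (affinity @ x)
        x_new /= x_new.sum()
        if np.linalg.norm(x_new - x, 1) < tol:
            break
        x = x_new
    return x                                      # support of x ~ cluster membership

A = np.random.rand(50, 50)
A = (A + A.T) / 2                                 # symmetric pairwise affinities
np.fill_diagonal(A, 0.0)
weights = dominant_set(A)
```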

GradMask: Reduce Overfitting by Regularizing Saliency

Title GradMask: Reduce Overfitting by Regularizing Saliency
Authors Becks Simpson, Francis Dutil, Yoshua Bengio, Joseph Paul Cohen
Abstract With too few samples or too many model parameters, overfitting can inhibit the ability to generalise predictions to new data. Within medical imaging, this can occur when features are incorrectly assigned importance, such as distinct hospital-specific artifacts, leading to poor performance on a new dataset from a different institution without those features, which is undesirable. Most regularization methods do not explicitly penalize the incorrect association of these features to the target class and hence fail to address this issue. We propose a regularization method, GradMask, which penalizes saliency maps inferred from the classifier gradients when they are not consistent with the lesion segmentation. This prevents non-tumor-related features from contributing to the classification of unhealthy samples. We demonstrate that this method can improve test accuracy by 1-3% compared to the baseline without GradMask, showing that it has an impact on reducing overfitting.
Tasks Lesion Segmentation
Published 2019-04-16
URL http://arxiv.org/abs/1904.07478v1
PDF http://arxiv.org/pdf/1904.07478v1.pdf
PWC https://paperswithcode.com/paper/gradmask-reduce-overfitting-by-regularizing
Repo
Framework
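
A minimal sketch of a GradMask-style penalty: input gradients of the target-class score are computed with a differentiable graph and penalized outside the lesion mask, so non-lesion regions cannot drive the prediction. The toy model, shapes, and the 0.1 weight are assumptions for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2))   # toy classifier
images = torch.randn(8, 1, 32, 32, requires_grad=True)
labels = torch.randint(0, 2, (8,))
lesion_mask = (torch.rand(8, 1, 32, 32) > 0.8).float()       # 1 inside the lesion

logits = model(images)
class_score = logits[torch.arange(8), labels].sum()
grads, = torch.autograd.grad(class_score, images, create_graph=True)

# Penalize saliency that falls outside the lesion segmentation.
saliency_outside = (grads.abs() * (1.0 - lesion_mask)).mean()
loss = nn.functional.cross_entropy(logits, labels) + 0.1 * saliency_outside
loss.backward()
```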

A tunable multiresolution smoother for scattered data with application to particle filtering

Title A tunable multiresolution smoother for scattered data with application to particle filtering
Authors Gregor A. Robinson, Ian G. Grooms
Abstract A smoothing algorithm is presented that can reduce the small-scale content of data observed at scattered locations in a spatially extended domain. The smoother works by forming a Gaussian interpolant of the input data, and then convolving the interpolant with a multiresolution Gaussian approximation of the Green’s function to a differential operator whose spectrum can be tuned for problem-specific considerations. This smoother is developed for its potential application to particle filtering, which often involves data scattered over a spatial domain, since preprocessing observations with a smoother reduces the ensemble size required to avoid particle filter collapse. An example on meteorological data verifies that our smoother improves the balance of particle filter weights.
Tasks
Published 2019-06-16
URL https://arxiv.org/abs/1906.06722v1
PDF https://arxiv.org/pdf/1906.06722v1.pdf
PWC https://paperswithcode.com/paper/a-tunable-multiresolution-smoother-for
Repo
Framework
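
A simplified one-dimensional version of the two-step idea above: fit a Gaussian (RBF) interpolant to scattered observations, then evaluate its coefficients through a broader Gaussian basis, which (up to normalization) corresponds to convolving the interpolant with a Gaussian and damps small scales. The paper replaces the single broad Gaussian used here with a tunable multiresolution approximation of a Green's function.

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.uniform(0, 1, 40)                          # scattered observation locations
ys = np.sin(8 * np.pi * xs) + 0.1 * rng.standard_normal(40)

def gauss(d, width):
    return np.exp(-(d / width) ** 2)

# Step 1: Gaussian interpolant of the scattered data (small ridge for stability).
K = gauss(np.abs(xs[:, None] - xs[None, :]), 0.05)
coeffs = np.linalg.solve(K + 1e-6 * np.eye(40), ys)

# Step 2: evaluate through a broader Gaussian, removing small-scale content.
grid = np.linspace(0, 1, 200)
smooth_kernel = gauss(np.abs(grid[:, None] - xs[None, :]), 0.15)
smoothed = smooth_kernel @ coeffs                   # smoothed field on the grid
```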

Compressing RNNs for IoT devices by 15-38x using Kronecker Products

Title Compressing RNNs for IoT devices by 15-38x using Kronecker Products
Authors Urmish Thakker, Jesse Beu, Dibakar Gope, Chu Zhou, Igor Fedorov, Ganesh Dasika, Matthew Mattina
Abstract Recurrent Neural Networks (RNNs) can be difficult to deploy on resource-constrained devices due to their size. As a result, there is a need for compression techniques that can significantly compress RNNs without negatively impacting task accuracy. This paper introduces a method to compress RNNs for resource-constrained environments using Kronecker products (KP). KPs can compress RNN layers by 15-38x with minimal accuracy loss. By quantizing the resulting models to 8-bits, we further push the compression factor to 50x. We show that KP can beat the task accuracy achieved by other state-of-the-art compression techniques across 5 benchmarks spanning 3 different applications, while simultaneously improving inference run-time. We show that the KP compression mechanism does introduce an accuracy loss, which can be mitigated by a proposed hybrid KP (HKP) approach. Our HKP algorithm provides fine-grained control over the compression ratio, enabling us to regain accuracy lost during compression by adding a small number of model parameters.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.02876v5
PDF https://arxiv.org/pdf/1906.02876v5.pdf
PWC https://paperswithcode.com/paper/compressing-rnns-for-iot-devices-by-15-38x
Repo
Framework
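
The core of the compression idea above is to replace a large recurrent weight matrix with the Kronecker product of two small factors, whose matrix-vector product can be applied without ever forming the full matrix. The factor sizes in the NumPy sketch below are illustrative.

```python
import numpy as np

# A 256x256 recurrent weight matrix replaced by the Kronecker product of two
# 16x16 factors: 65,536 parameters shrink to 512.
A = np.random.randn(16, 16)
B = np.random.randn(16, 16)
W = np.kron(A, B)                                   # stands in for the full matrix

h = np.random.randn(256)
# The matrix-vector product never needs W explicitly:
# kron(A, B) @ h  ==  (A @ H @ B.T).ravel()  with H = h reshaped row-major.
fast = (A @ h.reshape(16, 16) @ B.T).ravel()
assert np.allclose(W @ h, fast)
```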