February 1, 2020

3126 words 15 mins read

Paper Group AWR 314


MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification. DAG-GNN: DAG Structure Learning with Graph Neural Networks. High-Fidelity Image Generation With Fewer Labels. Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction. Dual Graph Convolutional Network for Semantic Segmentation. Learning …

MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification

Title MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification
Authors Lukas Liebel, Marco Körner
Abstract We introduce MultiDepth, a novel training strategy and convolutional neural network (CNN) architecture that allows approaching single-image depth estimation (SIDE) as a multi-task problem. SIDE is an important part of road scene understanding. It thus plays a vital role in advanced driver assistance systems and autonomous vehicles. Best results for the SIDE task so far have been achieved using deep CNNs. However, optimization of regression problems, such as estimating depth, is still a challenging task. For the related tasks of image classification and semantic segmentation, numerous CNN-based methods with robust training behavior have been proposed. Hence, in order to overcome the notorious instability and slow convergence of depth value regression during training, MultiDepth makes use of depth interval classification as an auxiliary task. The auxiliary task can be disabled at test time to predict continuous depth values using the main regression branch more efficiently. We applied MultiDepth to road scenes and present results on the KITTI depth prediction dataset. In experiments, we were able to show that end-to-end multi-task learning with both regression and classification is able to considerably improve training and yield more accurate results.
Tasks Autonomous Vehicles, Depth Estimation, Image Classification, Multi-Task Learning, Scene Understanding, Semantic Segmentation
Published 2019-07-25
URL https://arxiv.org/abs/1907.11111v1
PDF https://arxiv.org/pdf/1907.11111v1.pdf
PWC https://paperswithcode.com/paper/multidepth-single-image-depth-estimation-via
Repo https://github.com/lukasliebel/MultiDepth
Framework tf
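
The core idea is compact: one shared encoder feeds a regression head (the main task) and a depth-interval classification head (the auxiliary task), and their losses are summed during training. Below is a minimal PyTorch sketch of that structure, assuming a toy encoder, uniform depth binning, and a fixed auxiliary weight; the names `MultiTaskDepthNet` and `multitask_loss` are illustrative, not the paper's exact architecture or loss weighting.

```python
import torch
import torch.nn as nn

class MultiTaskDepthNet(nn.Module):
    """Shared encoder with a regression head (main task) and a
    depth-interval classification head (auxiliary task)."""
    def __init__(self, num_bins=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.regression_head = nn.Conv2d(64, 1, 1)             # continuous depth
        self.classification_head = nn.Conv2d(64, num_bins, 1)  # depth bins

    def forward(self, x):
        features = self.encoder(x)
        return self.regression_head(features), self.classification_head(features)

def multitask_loss(depth_pred, bin_logits, depth_gt,
                   num_bins=64, max_depth=80.0, aux_weight=0.5):
    # Main task: direct regression of metric depth.
    reg_loss = nn.functional.l1_loss(depth_pred.squeeze(1), depth_gt)
    # Auxiliary task: classify each pixel into a discrete depth interval.
    bins = (depth_gt / max_depth * (num_bins - 1)).clamp(0, num_bins - 1).long()
    cls_loss = nn.functional.cross_entropy(bin_logits, bins)
    return reg_loss + aux_weight * cls_loss
```

At test time, only `regression_head` is evaluated, which is how the auxiliary branch can be disabled without cost.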

DAG-GNN: DAG Structure Learning with Graph Neural Networks

Title DAG-GNN: DAG Structure Learning with Graph Neural Networks
Authors Yue Yu, Jie Chen, Tian Gao, Mo Yu
Abstract Learning a faithful directed acyclic graph (DAG) from samples of a joint distribution is a challenging combinatorial problem, owing to the intractable search space superexponential in the number of graph nodes. A recent breakthrough formulates the problem as a continuous optimization with a structural constraint that ensures acyclicity (Zheng et al., 2018). The authors apply the approach to the linear structural equation model (SEM) and the least-squares loss function that are statistically well justified but nevertheless limited. Motivated by the widespread success of deep learning that is capable of capturing complex nonlinear mappings, in this work we propose a deep generative model and apply a variant of the structural constraint to learn the DAG. At the heart of the generative model is a variational autoencoder parameterized by a novel graph neural network architecture, which we coin DAG-GNN. In addition to the richer capacity, an advantage of the proposed model is that it naturally handles discrete variables as well as vector-valued ones. We demonstrate that on synthetic data sets, the proposed method learns more accurate graphs for nonlinearly generated samples; and on benchmark data sets with discrete variables, the learned graphs are reasonably close to the global optima. The code is available at \url{https://github.com/fishmoon1234/DAG-GNN}.
Tasks
Published 2019-04-22
URL http://arxiv.org/abs/1904.10098v1
PDF http://arxiv.org/pdf/1904.10098v1.pdf
PWC https://paperswithcode.com/paper/dag-gnn-dag-structure-learning-with-graph
Repo https://github.com/fishmoon1234/DAG-GNN
Framework pytorch
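
The continuous acyclicity constraint is the hinge of this line of work. A minimal NumPy sketch of the NOTEARS constraint of Zheng et al. (2018) and the polynomial variant used in DAG-GNN-style methods; function names are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def acyclicity_penalty(A):
    """NOTEARS constraint (Zheng et al., 2018):
    h(A) = tr(exp(A ∘ A)) - d, which is zero iff A encodes a DAG."""
    d = A.shape[0]
    return np.trace(expm(A * A)) - d  # A * A is the elementwise square

def acyclicity_penalty_poly(A, alpha=None):
    """Polynomial variant: h(A) = tr[(I + α A∘A)^d] - d,
    cheaper and numerically gentler than the matrix exponential."""
    d = A.shape[0]
    alpha = 1.0 / d if alpha is None else alpha
    M = np.eye(d) + alpha * (A * A)
    return np.trace(np.linalg.matrix_power(M, d)) - d
```

Adding either penalty (via an augmented Lagrangian) to the VAE's reconstruction objective is what turns structure learning into continuous optimization.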

High-Fidelity Image Generation With Fewer Labels

Title High-Fidelity Image Generation With Fewer Labels
Authors Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly
Abstract Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the latest models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work we demonstrate how one can benefit from recent work on self- and semi-supervised learning to outperform the state of the art both on unsupervised ImageNet synthesis and in the conditional setting. In particular, the proposed approach is able to match the sample quality (as measured by FID) of the current state-of-the-art conditional model BigGAN on ImageNet using only 10% of the labels and outperform it using 20% of the labels.
Tasks Conditional Image Generation, Image Generation
Published 2019-03-06
URL https://arxiv.org/abs/1903.02271v2
PDF https://arxiv.org/pdf/1903.02271v2.pdf
PWC https://paperswithcode.com/paper/high-fidelity-image-generation-with-fewer
Repo https://github.com/google/compare_gan
Framework tf
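
One way to read the label-efficiency recipe: fit a cheap classifier on frozen self-supervised features of the small labeled subset, then pseudo-label the rest so a class-conditional GAN can be trained as usual. A simplified sketch of that step only; the paper's full pipeline (and its clustering-based variants) differs, and all names here are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def infer_labels(features_labeled, y_labeled, features_unlabeled):
    """Pseudo-label unlabeled images from frozen self-supervised features.
    The predicted labels can then condition a class-conditional GAN."""
    clf = LogisticRegression(max_iter=1000).fit(features_labeled, y_labeled)
    return clf.predict(features_unlabeled)
```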

Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction

Title Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction
Authors Zekun Li, Zeyu Cui, Shu Wu, Xiaoyu Zhang, Liang Wang
Abstract Click-through rate (CTR) prediction is an essential task in web applications such as online advertising and recommender systems, whose features are usually in multi-field form. The key to this task is to model feature interactions among different feature fields. Recently proposed deep learning based models follow a general paradigm: raw sparse input multi-field features are first mapped into dense field embedding vectors, and then simply concatenated together to feed into deep neural networks (DNN) or other specifically designed networks to learn high-order feature interactions. However, the simple \emph{unstructured combination} of feature fields will inevitably limit the capability to model sophisticated interactions among different fields in a sufficiently flexible and explicit fashion. In this work, we propose to represent the multi-field features in a graph structure intuitively, where each node corresponds to a feature field and different fields can interact through edges. The task of modeling feature interactions can thus be converted to modeling node interactions on the corresponding graph. To this end, we design a novel model, Feature Interaction Graph Neural Networks (Fi-GNN). Taking advantage of the strong representational power of graphs, our proposed model can not only model sophisticated feature interactions in a flexible and explicit fashion, but also provide good model explanations for CTR prediction. Experimental results on two real-world datasets show its superiority over state-of-the-art methods.
Tasks Click-Through Rate Prediction, Recommendation Systems
Published 2019-10-12
URL https://arxiv.org/abs/1910.05552v1
PDF https://arxiv.org/pdf/1910.05552v1.pdf
PWC https://paperswithcode.com/paper/fi-gnn-modeling-feature-interactions-via
Repo https://github.com/CRIPAC-DIG/Fi_GNNs
Framework tf
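
A minimal sketch of the field-graph idea: treat the field embeddings as nodes of a fully connected graph, exchange attention-weighted messages between them, and update node states with a GRU. This keeps only the skeleton; Fi-GNN's actual layer uses edge-wise transformations and an attentional readout on top, and the class name here is illustrative.

```python
import torch
import torch.nn as nn

class FieldGraphLayer(nn.Module):
    """One message-passing step over a fully connected graph whose nodes
    are field embeddings, so every pair of fields interacts via an edge."""
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, h):                                  # h: (batch, fields, dim)
        scores = torch.matmul(h, h.transpose(1, 2))        # pairwise field affinities
        attn = torch.softmax(scores, dim=-1)
        messages = torch.matmul(attn, self.msg(h))         # aggregate neighbor messages
        b, f, d = h.shape
        return self.gru(messages.reshape(b * f, d),
                        h.reshape(b * f, d)).reshape(b, f, d)
```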

Dual Graph Convolutional Network for Semantic Segmentation

Title Dual Graph Convolutional Network for Semantic Segmentation
Authors Li Zhang, Xiangtai Li, Anurag Arnab, Kuiyuan Yang, Yunhai Tong, Philip H. S. Torr
Abstract Exploiting long-range contextual information is key for pixel-wise prediction tasks such as semantic segmentation. In contrast to previous work that uses multi-scale feature fusion or dilated convolutions, we propose a novel graph-convolutional network (GCN) to address this problem. Our Dual Graph Convolutional Network (DGCNet) models the global context of the input feature by modelling two orthogonal graphs in a single framework. The first component models spatial relationships between pixels in the image, whilst the second models interdependencies along the channel dimensions of the network’s feature map. This is done efficiently by projecting the feature into a new, lower-dimensional space where all pairwise interactions can be modelled, before reprojecting into the original space. Our simple method provides substantial benefits over a strong baseline and achieves state-of-the-art results on both Cityscapes (82.0% mean IoU) and Pascal Context (53.7% mean IoU) datasets.
Tasks Semantic Segmentation
Published 2019-09-13
URL https://arxiv.org/abs/1909.06121v2
PDF https://arxiv.org/pdf/1909.06121v2.pdf
PWC https://paperswithcode.com/paper/dual-graph-convolutional-network-for-semantic
Repo https://github.com/lxtGH/GALD-Net
Framework pytorch
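
The efficiency trick described in the abstract (project to a small node space, model all pairwise interactions there, reproject) can be sketched as a generic latent-graph module. This is a simplified stand-in, not DGCNet's exact dual-graph design, and the names are illustrative.

```python
import torch
import torch.nn as nn

class ProjectedGraphConv(nn.Module):
    """Pool pixels into a few latent nodes, reason over all pairwise node
    interactions in that lower-dimensional space, then reproject."""
    def __init__(self, channels, nodes=16):
        super().__init__()
        self.assign = nn.Conv2d(channels, nodes, 1)   # soft pixel-to-node assignment
        self.gcn = nn.Linear(channels, channels)      # interaction in node space

    def forward(self, x):
        b, c, h, w = x.shape
        a = torch.softmax(self.assign(x).reshape(b, -1, h * w), dim=-1)  # (b, nodes, hw)
        v = torch.bmm(a, x.reshape(b, c, h * w).transpose(1, 2))          # (b, nodes, c)
        v = torch.relu(self.gcn(v))
        out = torch.bmm(a.transpose(1, 2), v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                 # residual fusion with the input
```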

Learning to Adapt for Stereo

Title Learning to Adapt for Stereo
Authors Alessio Tonioni, Oscar Rahnama, Thomas Joy, Luigi Di Stefano, Thalaiyasingam Ajanthan, Philip H. S. Torr
Abstract Real world applications of stereo depth estimation require models that are robust to dynamic variations in the environment. Even though deep learning based stereo methods are successful, they often fail to generalize to unseen variations in the environment, making them less suitable for practical applications such as autonomous driving. In this work, we introduce a “learning-to-adapt” framework that enables deep stereo methods to continuously adapt to new target domains in an unsupervised manner. Specifically, our approach incorporates the adaptation procedure into the learning objective to obtain a base set of parameters that are better suited for unsupervised online adaptation. To further improve the quality of the adaptation, we learn a confidence measure that effectively masks the errors introduced during the unsupervised adaptation. We evaluate our method on synthetic and real-world stereo datasets, and our experiments show that learning to adapt is indeed beneficial for online adaptation on vastly different domains.
Tasks Autonomous Driving, Depth Estimation, Stereo Depth Estimation
Published 2019-04-05
URL http://arxiv.org/abs/1904.02957v1
PDF http://arxiv.org/pdf/1904.02957v1.pdf
PWC https://paperswithcode.com/paper/learning-to-adapt-for-stereo
Repo https://github.com/CVLAB-Unibo/Learning2AdaptForStereo
Framework tf
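
The meta-objective has a MAML-like shape: take an inner unsupervised adaptation step, then backpropagate the post-adaptation supervised loss to the base parameters. A sketch assuming PyTorch ≥ 2.0, a stereo model called as `model(left, right)`, and hypothetical `unsup_loss_fn` (e.g. photometric) and `sup_loss_fn` callables; the paper's confidence-weighted masking is omitted.

```python
import torch

def learning_to_adapt_step(model, optimizer, unsup_loss_fn, sup_loss_fn,
                           batch, inner_lr=1e-4):
    left, right, gt_disp = batch
    # Inner step: adapt with an unsupervised objective
    # (create_graph=True makes this a second-order, MAML-style update).
    grads = torch.autograd.grad(
        unsup_loss_fn(model(left, right), left, right),
        model.parameters(), create_graph=True)
    adapted = {name: p - inner_lr * g
               for (name, p), g in zip(model.named_parameters(), grads)}
    # Outer step: supervised loss measured *after* adaptation.
    loss = sup_loss_fn(
        torch.func.functional_call(model, adapted, (left, right)), gt_disp)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```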

A Novel Independent RNN Approach to Classification of Seizures against Non-seizures

Title A Novel Independent RNN Approach to Classification of Seizures against Non-seizures
Authors Xinghua Yao, Qiang Cheng, Guo-Qiang Zhang
Abstract In current clinical practices, electroencephalograms (EEG) are reviewed and analyzed by trained neurologists to provide support for therapeutic decisions. Manual reviews can be laborious and error-prone. Automatic and accurate seizure/non-seizure classification methods are desirable. A critical challenge is that seizure morphologies exhibit considerable variabilities. In order to capture essential seizure features, this paper leverages an emerging deep learning model, the independently recurrent neural network (IndRNN), to construct a new approach for seizure/non-seizure classification. This new approach gradually expands the time scales, thereby extracting temporal and spatial features from the local time duration to the entire record. Evaluations are conducted with cross-validation experiments across subjects over the noisy data of CHB-MIT. Experimental results demonstrate that the proposed approach outperforms the current state-of-the-art methods. In addition, we explore how the segment length affects the classification performance. Thirteen different segment lengths are assessed, showing that the classification performance varies over the segment lengths by a margin of more than 4%. Thus, the segment length is an important factor influencing the classification performance.
Tasks EEG
Published 2019-03-22
URL http://arxiv.org/abs/1903.09326v1
PDF http://arxiv.org/pdf/1903.09326v1.pdf
PWC https://paperswithcode.com/paper/a-novel-independent-rnn-approach-to
Repo https://github.com/gabi-a/EEG-Literature
Framework none
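
The IndRNN building block itself is tiny: each hidden unit keeps its own scalar recurrent weight instead of a full recurrent matrix, which is what lets gradients survive over long EEG sequences. A minimal cell sketch (the class name is illustrative):

```python
import torch
import torch.nn as nn

class IndRNNCell(nn.Module):
    """Independently recurrent cell: h_t = relu(W x_t + u * h_{t-1} + b),
    where u is an elementwise (per-unit) recurrent weight."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.w = nn.Linear(input_size, hidden_size)
        self.u = nn.Parameter(torch.rand(hidden_size))  # per-unit recurrence

    def forward(self, x_seq):              # x_seq: (seq_len, batch, input_size)
        h = torch.zeros(x_seq.size(1), self.u.numel(), device=x_seq.device)
        for x_t in x_seq:
            h = torch.relu(self.w(x_t) + self.u * h)
        return h                           # final hidden state
```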

SleepEEGNet: Automated Sleep Stage Scoring with Sequence to Sequence Deep Learning Approach

Title SleepEEGNet: Automated Sleep Stage Scoring with Sequence to Sequence Deep Learning Approach
Authors Sajad Mousavi, Fatemeh Afghah, U. Rajendra Acharya
Abstract Electroencephalogram (EEG) is a common base signal used to monitor brain activity and diagnose sleep disorders. Manual sleep stage scoring is a time-consuming task for sleep experts and is limited by inter-rater reliability. In this paper, we propose an automatic sleep stage annotation method called SleepEEGNet using a single-channel EEG signal. SleepEEGNet is composed of deep convolutional neural networks (CNNs) to extract time-invariant features and frequency information, and a sequence-to-sequence model to capture the complex long and short-term context dependencies between sleep epochs and scores. In addition, to reduce the effect of the class imbalance problem present in the available sleep datasets, we applied novel loss functions to impose an equal misclassification cost for each sleep stage while training the network. We evaluated the proposed method on different single-EEG channels (i.e., Fpz-Cz and Pz-Oz EEG channels) from the Physionet Sleep-EDF datasets published in 2013 and 2018. The evaluation results demonstrate that the proposed method achieved the best annotation performance compared to the current literature, with an overall accuracy of 84.26%, a macro F1-score of 79.66% and a Cohen’s Kappa coefficient of 0.79. Our model is ready to be tested on additional sleep EEG signals and to aid sleep specialists in arriving at an accurate diagnosis. The source code is available at https://github.com/SajadMo/SleepEEGNet.
Tasks EEG, Sleep Stage Detection
Published 2019-03-05
URL http://arxiv.org/abs/1903.02108v1
PDF http://arxiv.org/pdf/1903.02108v1.pdf
PWC https://paperswithcode.com/paper/sleepeegnet-automated-sleep-stage-scoring
Repo https://github.com/SajadMo/SleepEEGNet
Framework tf
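
One simple way to equalize per-stage misclassification cost is inverse-frequency class weighting of the cross-entropy. The sketch below is a stand-in for the paper's novel loss functions, not a reproduction of them; the function name is illustrative.

```python
import torch
import torch.nn as nn

def balanced_loss(logits, targets, num_classes=5):
    """Weight each sleep stage inversely to its frequency in the batch,
    so rare stages contribute a comparable misclassification cost."""
    counts = torch.bincount(targets, minlength=num_classes).float().clamp(min=1)
    weights = counts.sum() / (num_classes * counts)
    return nn.functional.cross_entropy(logits, targets, weight=weights)
```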

Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

Title Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models
Authors Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr
Abstract Learning generative models that span multiple data modalities, such as vision and language, is often motivated by the desire to learn more useful, generalisable representations that faithfully capture common underlying factors between the modalities. In this work, we characterise successful learning of such models as the fulfillment of four criteria: i) implicit latent decomposition into shared and private subspaces, ii) coherent joint generation over all modalities, iii) coherent cross-generation across individual modalities, and iv) improved model learning for individual modalities through multi-modal integration. Here, we propose a mixture-of-experts multimodal variational autoencoder (MMVAE) to learn generative models on different sets of modalities, including a challenging image-language dataset, and demonstrate its ability to satisfy all four criteria, both qualitatively and quantitatively.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03393v1
PDF https://arxiv.org/pdf/1911.03393v1.pdf
PWC https://paperswithcode.com/paper/variational-mixture-of-experts-autoencoders
Repo https://github.com/iffsid/mmvae
Framework pytorch
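
Sampling from the mixture-of-experts posterior q(z|x_1..x_M) = (1/M) Σ_m q(z|x_m) is straightforward: pick an expert uniformly, then reparameterize through it. A minimal sketch (MMVAE's training objective additionally stratifies samples across experts; the function name is illustrative).

```python
import torch

def sample_moe_posterior(mus, logvars):
    """mus, logvars: lists of per-modality Gaussian parameters.
    Draw z from the uniform mixture of the unimodal posteriors."""
    m = torch.randint(len(mus), (1,)).item()   # choose an expert uniformly
    eps = torch.randn_like(mus[m])
    return mus[m] + eps * (0.5 * logvars[m]).exp()  # reparameterized sample
```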

Kernelized Bayesian Softmax for Text Generation

Title Kernelized Bayesian Softmax for Text Generation
Authors Ning Miao, Hao Zhou, Chengqi Zhao, Wenxian Shi, Lei Li
Abstract Neural models for text generation require a softmax layer with proper token embeddings during the decoding phase. Most existing approaches adopt a single point embedding for each token. However, a word may have multiple senses depending on its context, some of which can be quite distinct. In this paper, we propose KerBS, a novel approach for learning better embeddings for text generation. KerBS embodies two advantages: (a) it employs a Bayesian composition of embeddings for words with multiple senses; (b) it is adaptive to semantic variances of words and robust to rare sentence contexts by imposing learned kernels to capture the closeness of words (senses) in the embedding space. Empirical studies show that KerBS significantly boosts the performance of several text generation tasks.
Tasks Text Generation
Published 2019-11-01
URL https://arxiv.org/abs/1911.00274v1
PDF https://arxiv.org/pdf/1911.00274v1.pdf
PWC https://paperswithcode.com/paper/kernelized-bayesian-softmax-for-text
Repo https://github.com/NingMiao/KerBS
Framework tf
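
The multi-sense output layer can be sketched compactly: each word's logit log-sum-exps kernel similarities over its sense embeddings. The sketch uses a plain dot-product kernel; KerBS learns adaptive kernels per word, and the function name is illustrative.

```python
import torch

def kerbs_logits(h, sense_emb, temperature=1.0):
    """Multi-sense softmax logits:
    logit_w = log Σ_s exp(<h, e_{w,s}> / T).
    h: (batch, dim); sense_emb: (vocab, senses, dim)."""
    scores = torch.einsum('bd,vsd->bvs', h, sense_emb) / temperature
    return torch.logsumexp(scores, dim=-1)   # (batch, vocab)
```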

The Multi-Lane Capsule Network (MLCN)

Title The Multi-Lane Capsule Network (MLCN)
Authors Vanderson Martins do Rosario, Edson Borin, Mauricio Breternitz Jr
Abstract We introduce Multi-Lane Capsule Networks (MLCN), a separable and resource-efficient organization of Capsule Networks (CapsNet) that allows parallel processing while achieving high accuracy at reduced cost. An MLCN is composed of a number of (distinct) parallel lanes, each contributing to a dimension of the result, trained using the routing-by-agreement organization of CapsNet. Our results indicate similar accuracy with a much reduced cost in number of parameters for the Fashion-MNIST and CIFAR-10 datasets. They also indicate that the MLCN outperforms the original CapsNet when using a proposed novel configuration for the lanes. MLCN also has faster training and inference times, being more than two-fold faster than the original CapsNet on the same accelerator.
Tasks
Published 2019-02-22
URL http://arxiv.org/abs/1902.08431v1
PDF http://arxiv.org/pdf/1902.08431v1.pdf
PWC https://paperswithcode.com/paper/the-multi-lane-capsule-network-mlcn
Repo https://github.com/vandersonmr/lanes-capsnet
Framework tf
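
The lane idea can be sketched directly: independent sub-networks each emit a slice of every class capsule's dimensions, and the slices are concatenated. Routing-by-agreement inside each lane is omitted for brevity, and the class name is illustrative.

```python
import torch
import torch.nn as nn

class MultiLaneNet(nn.Module):
    """Parallel lanes with no shared weights; each lane produces a slice
    of the final capsule dimensions, so lanes can run independently."""
    def __init__(self, num_lanes=4, dims_per_lane=4, num_classes=10):
        super().__init__()
        self.lanes = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(1, 16, 9), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, num_classes * dims_per_lane),
            ) for _ in range(num_lanes))
        self.num_classes = num_classes

    def forward(self, x):
        parts = [lane(x).view(x.size(0), self.num_classes, -1)
                 for lane in self.lanes]
        capsules = torch.cat(parts, dim=-1)  # each lane fills a dimension slice
        return capsules.norm(dim=-1)         # capsule length as class score
```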

Progressive Stochastic Binarization of Deep Networks

Title Progressive Stochastic Binarization of Deep Networks
Authors David Hartmann, Michael Wand
Abstract A plethora of recent research has focused on improving the memory footprint and inference speed of deep networks by reducing the complexity of (i) numerical representations (for example, by deterministic or stochastic quantization) and (ii) arithmetic operations (for example, by binarization of weights). We propose a stochastic binarization scheme for deep networks that allows for efficient inference on hardware by restricting itself to additions of small integers and fixed shifts. Unlike previous approaches, the underlying randomized approximation is progressive, thus permitting an adaptive control of the accuracy of each operation at run-time. In a low-precision setting, we match the accuracy of previous binarized approaches. Our representation is unbiased - it approaches continuous computation with increasing sample size. In a high-precision regime, the computational costs are competitive with previous quantization schemes. Progressive stochastic binarization also permits localized, dynamic accuracy control within a single network, thereby providing a new tool for adaptively focusing computational attention. We evaluate our method on networks of various architectures, already pretrained on ImageNet. With representational costs comparable to previous schemes, we obtain accuracies close to the original floating point implementation. This includes pruned networks, except the known special case of certain types of separated convolutions. By focusing computational attention using progressive sampling, we reduce inference costs on ImageNet further by up to 33% (before network pruning).
Tasks Network Pruning, Quantization
Published 2019-04-03
URL http://arxiv.org/abs/1904.02205v1
PDF http://arxiv.org/pdf/1904.02205v1.pdf
PWC https://paperswithcode.com/paper/progressive-stochastic-binarization-of-deep
Repo https://github.com/JGU-VC/progressive_stochastic_binarization
Framework tf
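
The unbiasedness claim is easy to demonstrate: stochastic rounding has the true value as its expectation, so averaging more samples progressively recovers it. A NumPy sketch at a fixed scale of 2^-2; the paper additionally restricts arithmetic to small-integer additions and fixed shifts, which this sketch does not model.

```python
import numpy as np

def stochastic_round(x, rng):
    """Round to a neighboring integer with probability proportional to
    proximity, so E[stochastic_round(x)] == x."""
    lo = np.floor(x)
    return lo + (rng.random(x.shape) < (x - lo))

rng = np.random.default_rng(0)
w = np.array([0.3, -1.7, 2.2])
samples = np.stack([stochastic_round(w * 4, rng) / 4 for _ in range(1000)])
print(samples.mean(axis=0))  # approaches w as the sample count grows
```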

Amortized Population Gibbs Samplers with Neural Sufficient Statistics

Title Amortized Population Gibbs Samplers with Neural Sufficient Statistics
Authors Hao Wu, Heiko Zimmermann, Eli Sennesh, Tuan Anh Le, Jan-Willem van de Meent
Abstract Amortized variational methods have proven difficult to scale to structured problems, such as inferring positions of multiple objects from video images. We develop amortized population Gibbs (APG) samplers, a class of scalable methods that frames structured variational inference as adaptive importance sampling. APG samplers construct high-dimensional proposals by iterating over updates to lower-dimensional blocks of variables. We train each conditional proposal by minimizing the inclusive KL divergence with respect to the conditional posterior. To appropriately account for the size of the input data, we develop a new parameterization in terms of neural sufficient statistics. Experiments show that APG samplers can train highly structured deep generative models in an unsupervised manner, and achieve substantial improvements in inference accuracy relative to standard autoencoding variational methods.
Tasks
Published 2019-11-04
URL https://arxiv.org/abs/1911.01382v2
PDF https://arxiv.org/pdf/1911.01382v2.pdf
PWC https://paperswithcode.com/paper/amortized-population-gibbs-samplers-with
Repo https://github.com/hao-w/apg-samplers
Framework pytorch
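
A sketch of the block-sweep structure only: update one block of latents at a time from a learned conditional proposal, keeping the other blocks fixed. The importance weighting and resampling that make this a proper population sampler are omitted, and `propose[b]` and `log_joint` are hypothetical callables.

```python
def apg_sweep(propose, log_joint, x, z_blocks, num_sweeps=5):
    """Gibbs-style sweeps over blocks of latent variables, with each
    block drawn from an amortized conditional proposal q(z_b | x, z_-b)."""
    for _ in range(num_sweeps):
        for b in range(len(z_blocks)):
            z_blocks[b] = propose[b](x, z_blocks)  # resample block b only
    return z_blocks, log_joint(x, z_blocks)        # latents and joint density
```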

Stability of Graph Scattering Transforms

Title Stability of Graph Scattering Transforms
Authors Fernando Gama, Joan Bruna, Alejandro Ribeiro
Abstract Scattering transforms are non-trainable deep convolutional architectures that exploit the multi-scale resolution of a wavelet filter bank to obtain an appropriate representation of data. More importantly, they are proven invariant to translations, and stable to perturbations that are close to translations. This stability property endows the scattering transform with robustness to small changes in the metric domain of the data. When considering network data, regular convolutions no longer apply, since the data domain presents an irregular structure given by the network topology. In this work, we extend scattering transforms to network data by using multiresolution graph wavelets, whose computation can be obtained by means of graph convolutions. Furthermore, we prove that the resulting graph scattering transforms are stable to metric perturbations of the underlying network. This renders graph scattering transforms robust to changes in the network topology, making them particularly useful for cases of transfer learning, topology estimation or time-varying graphs.
Tasks Transfer Learning
Published 2019-06-11
URL https://arxiv.org/abs/1906.04784v1
PDF https://arxiv.org/pdf/1906.04784v1.pdf
PWC https://paperswithcode.com/paper/stability-of-graph-scattering-transforms
Repo https://github.com/alelab-upenn/graph-scattering-transforms
Framework pytorch
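
A sketch of a two-layer graph scattering transform with one simple diffusion-wavelet choice, Ψ_j = S^j − S^{2j}, built from powers of a graph shift operator S. The paper's wavelet families and stability analysis are more general; the function name and wavelet choice here are illustrative.

```python
import numpy as np

def graph_scattering(S, x, scales=(1, 2, 4), depth=2):
    """Cascade of graph wavelets, modulus nonlinearity, and averaging
    (the low-pass) as the scattering output. S: (n, n) shift operator,
    x: (n,) graph signal."""
    wavelets = [np.linalg.matrix_power(S, j) - np.linalg.matrix_power(S, 2 * j)
                for j in scales]
    coeffs, layer = [x.mean()], [x]
    for _ in range(depth):
        layer = [np.abs(W @ u) for u in layer for W in wavelets]  # |Ψ_j u|
        coeffs.extend(u.mean() for u in layer)                    # low-pass readout
    return np.array(coeffs)
```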

On Relativistic $f$-Divergences

Title On Relativistic $f$-Divergences
Authors Alexia Jolicoeur-Martineau
Abstract This paper provides a more rigorous look at Relativistic Generative Adversarial Networks (RGANs). We prove that the objective function of the discriminator is a statistical divergence for any concave function $f$ with minimal properties ($f(0)=0$, $f’(0) \neq 0$, $\sup_x f(x)>0$). We also devise a few variants of relativistic $f$-divergences. Wasserstein GAN was originally justified by the idea that the Wasserstein distance (WD) is most sensible because it is weak (i.e., it induces a weak topology). We show that the WD is weaker than $f$-divergences, which are weaker than relativistic $f$-divergences. Given the good performance of RGANs, this suggests that WGAN does not perform well primarily because of the weak metric, but rather because of regularization and the use of a relativistic discriminator. We also take a closer look at estimators of relativistic $f$-divergences. We introduce the minimum-variance unbiased estimator (MVUE) for Relativistic paired GANs (RpGANs; originally called RGANs, which could bring confusion) and show that it does not perform better. Furthermore, we show that the estimator of Relativistic average GANs (RaGANs) is only asymptotically unbiased, but that the finite-sample bias is small. Removing this bias does not improve performance.
Tasks
Published 2019-01-08
URL http://arxiv.org/abs/1901.02474v1
PDF http://arxiv.org/pdf/1901.02474v1.pdf
PWC https://paperswithcode.com/paper/on-relativistic-f-divergences
Repo https://github.com/AlexiaJM/relativistic-f-divergences
Framework tf
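
The relativistic paired objective itself fits in a few lines: the discriminator scores the *difference* between critic outputs on real and fake samples rather than each sample alone. A PyTorch sketch with $f(y) = -\log\sigma(y)$ (i.e., softplus), the relativistic analogue of the standard GAN loss; the MVUE estimator discussed in the paper is not shown.

```python
import torch

def rpgan_losses(critic_real, critic_fake):
    """Relativistic paired GAN losses with f(y) = softplus(-y):
    D minimizes f(C(x_real) - C(x_fake)); G minimizes the reverse."""
    diff = critic_real - critic_fake
    d_loss = torch.nn.functional.softplus(-diff).mean()
    g_loss = torch.nn.functional.softplus(diff).mean()
    return d_loss, g_loss
```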