Paper Group AWR 314
MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification. DAG-GNN: DAG Structure Learning with Graph Neural Networks. High-Fidelity Image Generation With Fewer Labels. Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction. Dual Graph Convolutional Network for Semantic Segmentation. Learning to Adapt for Stereo. …
MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification
Title | MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification |
Authors | Lukas Liebel, Marco Körner |
Abstract | We introduce MultiDepth, a novel training strategy and convolutional neural network (CNN) architecture that allows approaching single-image depth estimation (SIDE) as a multi-task problem. SIDE is an important part of road scene understanding and thus plays a vital role in advanced driver assistance systems and autonomous vehicles. The best results for the SIDE task so far have been achieved using deep CNNs. However, optimization of regression problems, such as estimating depth, is still a challenging task. For the related tasks of image classification and semantic segmentation, numerous CNN-based methods with robust training behavior have been proposed. Hence, in order to overcome the notorious instability and slow convergence of depth value regression during training, MultiDepth makes use of depth interval classification as an auxiliary task. The auxiliary task can be disabled at test time to predict continuous depth values using the main regression branch more efficiently. We applied MultiDepth to road scenes and present results on the KITTI depth prediction dataset. In experiments, we show that end-to-end multi-task learning with both regression and classification considerably improves training and yields more accurate results. |
Tasks | Autonomous Vehicles, Depth Estimation, Image Classification, Multi-Task Learning, Scene Understanding, Semantic Segmentation |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11111v1 |
https://arxiv.org/pdf/1907.11111v1.pdf | |
PWC | https://paperswithcode.com/paper/multidepth-single-image-depth-estimation-via |
Repo | https://github.com/lukasliebel/MultiDepth |
Framework | tf |
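
To make the core idea concrete, here is a minimal PyTorch sketch (not the authors' released TensorFlow model) of a shared encoder with a depth-regression head and an auxiliary depth-bin classification head. The toy encoder, the bin count, and the fixed equal loss weighting are illustrative assumptions; the paper's actual backbone and task weighting differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskDepthNet(nn.Module):
    def __init__(self, num_bins=64, max_depth=80.0):
        super().__init__()
        self.max_depth, self.num_bins = max_depth, num_bins
        # Toy shared encoder; the paper uses a full CNN backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.reg_head = nn.Conv2d(64, 1, 1)          # continuous depth
        self.cls_head = nn.Conv2d(64, num_bins, 1)   # auxiliary depth bins

    def forward(self, x, with_aux=True):
        feats = self.encoder(x)
        depth = F.softplus(self.reg_head(feats))     # keep depth positive
        if with_aux:
            return depth, self.cls_head(feats)       # plus logits over bins
        return depth                                 # aux branch off at test time

def joint_loss(depth_pred, bin_logits, depth_gt, num_bins=64, max_depth=80.0):
    reg = F.l1_loss(depth_pred, depth_gt)
    # Discretize ground truth into depth intervals for the auxiliary task.
    bins = (depth_gt / max_depth * num_bins).long().clamp(0, num_bins - 1)
    cls = F.cross_entropy(bin_logits, bins.squeeze(1))
    return reg + cls                                 # equal weighting (assumption)

net = MultiTaskDepthNet()
img, gt = torch.randn(2, 3, 64, 64), torch.rand(2, 1, 64, 64) * 80.0
loss = joint_loss(*net(img), gt)
```

At test time, `net(img, with_aux=False)` skips the classification branch and returns only the continuous depth map, matching the paper's idea of disabling the auxiliary task for efficient inference.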
DAG-GNN: DAG Structure Learning with Graph Neural Networks
Title | DAG-GNN: DAG Structure Learning with Graph Neural Networks |
Authors | Yue Yu, Jie Chen, Tian Gao, Mo Yu |
Abstract | Learning a faithful directed acyclic graph (DAG) from samples of a joint distribution is a challenging combinatorial problem, owing to the intractable search space superexponential in the number of graph nodes. A recent breakthrough formulates the problem as a continuous optimization with a structural constraint that ensures acyclicity (Zheng et al., 2018). The authors apply the approach to the linear structural equation model (SEM) and the least-squares loss function that are statistically well justified but nevertheless limited. Motivated by the widespread success of deep learning that is capable of capturing complex nonlinear mappings, in this work we propose a deep generative model and apply a variant of the structural constraint to learn the DAG. At the heart of the generative model is a variational autoencoder parameterized by a novel graph neural network architecture, which we coin DAG-GNN. In addition to the richer capacity, an advantage of the proposed model is that it naturally handles discrete variables as well as vector-valued ones. We demonstrate that on synthetic data sets, the proposed method learns more accurate graphs for nonlinearly generated samples; and on benchmark data sets with discrete variables, the learned graphs are reasonably close to the global optima. The code is available at \url{https://github.com/fishmoon1234/DAG-GNN}. |
Tasks | |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.10098v1 |
http://arxiv.org/pdf/1904.10098v1.pdf | |
PWC | https://paperswithcode.com/paper/dag-gnn-dag-structure-learning-with-graph |
Repo | https://github.com/fishmoon1234/DAG-GNN |
Framework | pytorch |
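
The continuous acyclicity constraint at the core of this line of work is easy to state in code. Below is a small NumPy sketch of the polynomial variant $h(A) = \mathrm{tr}[(I + \alpha A \circ A)^d] - d$, which is zero exactly when the weighted adjacency matrix $A$ encodes a DAG; the choice $\alpha = 1/d$ is an assumption for illustration.

```python
import numpy as np

def acyclicity(A, alpha=None):
    """h(A) = tr[(I + alpha * A∘A)^d] - d; zero iff A has no directed cycle."""
    d = A.shape[0]
    if alpha is None:
        alpha = 1.0 / d
    M = np.eye(d) + alpha * (A * A)      # elementwise square removes edge signs
    return np.trace(np.linalg.matrix_power(M, d)) - d

cycle = np.array([[0.0, 1.0], [1.0, 0.0]])   # 2-cycle: not a DAG
chain = np.array([[0.0, 1.0], [0.0, 0.0]])   # 1 -> 2: a DAG
print(acyclicity(cycle) > 0, np.isclose(acyclicity(chain), 0.0))  # True True
```

During training, this scalar is driven to zero with an augmented-Lagrangian scheme while the VAE's evidence lower bound is maximized.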
High-Fidelity Image Generation With Fewer Labels
Title | High-Fidelity Image Generation With Fewer Labels |
Authors | Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly |
Abstract | Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the latest models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work we demonstrate how one can benefit from recent work on self- and semi-supervised learning to outperform the state of the art in both unsupervised ImageNet synthesis and the conditional setting. In particular, the proposed approach is able to match the sample quality (as measured by FID) of the current state-of-the-art conditional model BigGAN on ImageNet using only 10% of the labels, and to outperform it using 20% of the labels. |
Tasks | Conditional Image Generation, Image Generation |
Published | 2019-03-06 |
URL | https://arxiv.org/abs/1903.02271v2 |
https://arxiv.org/pdf/1903.02271v2.pdf | |
PWC | https://paperswithcode.com/paper/high-fidelity-image-generation-with-fewer |
Repo | https://github.com/google/compare_gan |
Framework | tf |
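
The semi-supervised half of the approach can be sketched compactly: train a classifier on the small labeled subset, let it pseudo-label the rest, and condition the GAN on the inferred labels. The PyTorch snippet below shows only that pseudo-labeling step with a stand-in classifier; the paper additionally leverages self-supervised representation learning, and the GAN itself is BigGAN-scale.

```python
import torch
import torch.nn as nn

def infer_labels(classifier, unlabeled_batches):
    """Assign pseudo-labels to unlabeled images with a (pre)trained classifier."""
    classifier.eval()
    pseudo = []
    with torch.no_grad():
        for x in unlabeled_batches:
            pseudo.append(classifier(x).argmax(dim=1))
    return torch.cat(pseudo)

# Untrained stand-in classifier, just to exercise the function.
clf = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
batches = [torch.randn(8, 3, 32, 32) for _ in range(2)]
labels = infer_labels(clf, batches)    # shape (16,)
# The conditional GAN then consumes (z, pseudo_label) pairs, with a
# projection discriminator conditioned on the same inferred labels.
```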
Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction
Title | Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction |
Authors | Zekun Li, Zeyu Cui, Shu Wu, Xiaoyu Zhang, Liang Wang |
Abstract | Click-through rate (CTR) prediction is an essential task in web applications such as online advertising and recommender systems, whose features are usually in multi-field form. The key to this task is to model feature interactions among different feature fields. Recently proposed deep learning based models follow a general paradigm: raw sparse multi-field input features are first mapped into dense field embedding vectors, and then simply concatenated together and fed into deep neural networks (DNN) or other specifically designed networks to learn high-order feature interactions. However, the simple \emph{unstructured combination} of feature fields will inevitably limit the capability to model sophisticated interactions among different fields in a sufficiently flexible and explicit fashion. In this work, we propose to represent the multi-field features intuitively in a graph structure, where each node corresponds to a feature field and different fields can interact through edges. The task of modeling feature interactions can thus be converted to modeling node interactions on the corresponding graph. To this end, we design a novel model, Feature Interaction Graph Neural Networks (Fi-GNN). Taking advantage of the strong representative power of graphs, our proposed model can not only model sophisticated feature interactions in a flexible and explicit fashion, but also provide good model explanations for CTR prediction. Experimental results on two real-world datasets show its superiority over state-of-the-art methods. |
Tasks | Click-Through Rate Prediction, Recommendation Systems |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05552v1 |
https://arxiv.org/pdf/1910.05552v1.pdf | |
PWC | https://paperswithcode.com/paper/fi-gnn-modeling-feature-interactions-via |
Repo | https://github.com/CRIPAC-DIG/Fi_GNNs |
Framework | tf |
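
A minimal PyTorch sketch of the field-graph message-passing pattern: each feature field is a node, every node receives transformed messages from all other fields, and a GRU updates the node state, as in gated graph neural networks. The shared edge transform and the toy scoring head are simplifications; the released model uses attentional edge weights and an attention-based predictor.

```python
import torch
import torch.nn as nn

class FieldGraphLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim, bias=False)   # shared edge transform
        self.gru = nn.GRUCell(dim, dim)              # gated node-state update

    def forward(self, h):                            # h: (batch, fields, dim)
        b, f, d = h.shape
        t = self.msg(h)
        m = t.sum(dim=1, keepdim=True) - t           # messages from all other fields
        return self.gru(m.reshape(b * f, d),
                        h.reshape(b * f, d)).reshape(b, f, d)

fields = torch.randn(4, 10, 16)          # 10 embedded feature fields
out = FieldGraphLayer(16)(fields)        # interactions propagated on the graph
ctr_logit = out.sum(dim=(1, 2))          # toy scoring head; the paper's is richer
```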
Dual Graph Convolutional Network for Semantic Segmentation
Title | Dual Graph Convolutional Network for Semantic Segmentation |
Authors | Li Zhang, Xiangtai Li, Anurag Arnab, Kuiyuan Yang, Yunhai Tong, Philip H. S. Torr |
Abstract | Exploiting long-range contextual information is key for pixel-wise prediction tasks such as semantic segmentation. In contrast to previous work that uses multi-scale feature fusion or dilated convolutions, we propose a novel graph-convolutional network (GCN) to address this problem. Our Dual Graph Convolutional Network (DGCNet) captures the global context of the input features by modelling two orthogonal graphs in a single framework. The first component models spatial relationships between pixels in the image, whilst the second models interdependencies along the channel dimensions of the network’s feature map. This is done efficiently by projecting the feature into a new, lower-dimensional space where all pairwise interactions can be modelled, before reprojecting into the original space. Our simple method provides substantial benefits over a strong baseline and achieves state-of-the-art results on both the Cityscapes (82.0% mean IoU) and Pascal Context (53.7% mean IoU) datasets. |
Tasks | Semantic Segmentation |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06121v2 |
https://arxiv.org/pdf/1909.06121v2.pdf | |
PWC | https://paperswithcode.com/paper/dual-graph-convolutional-network-for-semantic |
Repo | https://github.com/lxtGH/GALD-Net |
Framework | pytorch |
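
The "project, reason, reproject" pattern the paper builds on can be sketched in a few lines of PyTorch: pixels are softly assigned to a small set of graph nodes, a graph convolution models all pairwise interactions in that low-dimensional space, and the result is projected back. Channel and node counts are illustrative assumptions; DGCNet runs one such branch over space and another over channels.

```python
import torch
import torch.nn as nn

class ProjectedGCN(nn.Module):
    def __init__(self, channels=64, nodes=16):
        super().__init__()
        self.assign = nn.Conv2d(channels, nodes, 1)   # soft pixel->node assignment
        self.adj = nn.Parameter(torch.eye(nodes))     # learnable node adjacency
        self.update = nn.Linear(channels, channels)   # node feature transform

    def forward(self, x):                             # x: (b, c, h, w)
        b, c, h, w = x.shape
        a = self.assign(x).flatten(2).softmax(dim=-1) # (b, nodes, h*w)
        v = a @ x.flatten(2).transpose(1, 2)          # project: (b, nodes, c)
        v = torch.relu(self.update(self.adj @ v))     # reason on the small graph
        y = (a.transpose(1, 2) @ v).transpose(1, 2)   # reproject to pixels
        return x + y.reshape(b, c, h, w)              # residual fusion

feat = torch.randn(2, 64, 32, 32)
out = ProjectedGCN()(feat)                            # same shape, global context mixed in
```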
Learning to Adapt for Stereo
Title | Learning to Adapt for Stereo |
Authors | Alessio Tonioni, Oscar Rahnama, Thomas Joy, Luigi Di Stefano, Thalaiyasingam Ajanthan, Philip H. S. Torr |
Abstract | Real world applications of stereo depth estimation require models that are robust to dynamic variations in the environment. Even though deep learning based stereo methods are successful, they often fail to generalize to unseen variations in the environment, making them less suitable for practical applications such as autonomous driving. In this work, we introduce a “learning-to-adapt” framework that enables deep stereo methods to continuously adapt to new target domains in an unsupervised manner. Specifically, our approach incorporates the adaptation procedure into the learning objective to obtain a base set of parameters that are better suited for unsupervised online adaptation. To further improve the quality of the adaptation, we learn a confidence measure that effectively masks the errors introduced during the unsupervised adaptation. We evaluate our method on synthetic and real-world stereo datasets, and our experiments show that learning-to-adapt is indeed beneficial for online adaptation on vastly different domains. |
Tasks | Autonomous Driving, Depth Estimation, Stereo Depth Estimation |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.02957v1 |
http://arxiv.org/pdf/1904.02957v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-adapt-for-stereo |
Repo | https://github.com/CVLAB-Unibo/Learning2AdaptForStereo |
Framework | tf |
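
At its core this is a MAML-style objective: adapt on a frame with an unsupervised loss, then evaluate the adapted weights with a supervised loss and backpropagate through the adaptation. A schematic PyTorch (>= 2.0) sketch follows; all interfaces are assumptions rather than the released code, and the learned confidence mask is assumed to be folded into `unsup_loss`.

```python
import torch
from torch.func import functional_call

def meta_step(model, seq, unsup_loss, sup_loss, inner_lr=1e-4):
    """One meta-training step over a short sequence of (left, right, gt) samples."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for left, right, gt in seq:
        # Inner step: unsupervised adaptation (e.g. a confidence-masked
        # photometric loss), exactly as it would run at deployment time.
        pred = functional_call(model, params, (left, right))
        grads = torch.autograd.grad(unsup_loss(pred, left, right),
                                    tuple(params.values()), create_graph=True)
        params = {n: p - inner_lr * g
                  for (n, p), g in zip(params.items(), grads)}
        # Outer objective: the adapted network should match ground truth.
        pred = functional_call(model, params, (left, right))
        meta_loss = meta_loss + sup_loss(pred, gt)
    return meta_loss          # backpropagates through the inner updates
```

Calling `meta_loss.backward()` then nudges the base parameters so that a few unsupervised gradient steps suffice in a new domain.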
A Novel Independent RNN Approach to Classification of Seizures against Non-seizures
Title | A Novel Independent RNN Approach to Classification of Seizures against Non-seizures |
Authors | Xinghua Yao, Qiang Cheng, Guo-Qiang Zhang |
Abstract | In current clinical practice, electroencephalograms (EEG) are reviewed and analyzed by trained neurologists to provide support for therapeutic decisions. Manual reviews can be laborious and error-prone. Automatic and accurate seizure/non-seizure classification methods are therefore desirable. A critical challenge is that seizure morphologies exhibit considerable variability. In order to capture essential seizure features, this paper leverages an emerging deep learning model, the independently recurrent neural network (IndRNN), to construct a new approach for seizure/non-seizure classification. This new approach gradually expands the time scales, thereby extracting temporal and spatial features from the local time duration to the entire record. Evaluations are conducted with cross-validation experiments across subjects on the noisy CHB-MIT data. Experimental results demonstrate that the proposed approach outperforms the current state-of-the-art methods. In addition, we explore how the segment length affects classification performance. Thirteen different segment lengths are assessed, showing that classification performance varies with segment length by a margin of more than 4%. Thus, segment length is an important factor influencing classification performance. |
Tasks | EEG |
Published | 2019-03-22 |
URL | http://arxiv.org/abs/1903.09326v1 |
http://arxiv.org/pdf/1903.09326v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-independent-rnn-approach-to |
Repo | https://github.com/gabi-a/EEG-Literature |
Framework | none |
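
The building block the approach leverages is the IndRNN cell, where the recurrent weight is an elementwise vector rather than a matrix, so each hidden unit carries its own independent memory and deep stacks over long sequences stay trainable. A minimal PyTorch sketch (sizes are illustrative; the paper stacks such layers over expanding time scales):

```python
import torch
import torch.nn as nn

class IndRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.w = nn.Linear(input_size, hidden_size)
        self.u = nn.Parameter(torch.ones(hidden_size))   # per-unit recurrence

    def forward(self, x, h):
        # h_t = relu(W x_t + u * h_{t-1}); '*' is elementwise, not a matmul.
        return torch.relu(self.w(x) + self.u * h)

cell = IndRNNCell(input_size=23, hidden_size=64)   # e.g. 23 EEG channels
h = torch.zeros(8, 64)
for x_t in torch.randn(100, 8, 23):                # 100 time steps, batch of 8
    h = cell(x_t, h)
```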
SleepEEGNet: Automated Sleep Stage Scoring with Sequence to Sequence Deep Learning Approach
Title | SleepEEGNet: Automated Sleep Stage Scoring with Sequence to Sequence Deep Learning Approach |
Authors | Sajad Mousavi, Fatemeh Afghah, U. Rajendra Acharya |
Abstract | The electroencephalogram (EEG) is a common base signal used to monitor brain activity and diagnose sleep disorders. Manual sleep stage scoring is a time-consuming task for sleep experts and is limited by inter-rater reliability. In this paper, we propose an automatic sleep stage annotation method called SleepEEGNet using a single-channel EEG signal. SleepEEGNet is composed of deep convolutional neural networks (CNNs) to extract time-invariant features and frequency information, and a sequence to sequence model to capture the complex and long short-term context dependencies between sleep epochs and scores. In addition, to reduce the effect of the class imbalance problem present in the available sleep datasets, we applied novel loss functions to enforce an equal misclassification error for each sleep stage while training the network. We evaluated the proposed method on different single-EEG channels (i.e., the Fpz-Cz and Pz-Oz EEG channels) from the Physionet Sleep-EDF datasets published in 2013 and 2018. The evaluation results demonstrate that the proposed method achieved the best annotation performance compared to the current literature, with an overall accuracy of 84.26%, a macro F1-score of 79.66% and a Cohen’s Kappa coefficient of 0.79. Our model is ready to be tested on further sleep EEG signals and to aid sleep specialists in arriving at an accurate diagnosis. The source code is available at https://github.com/SajadMo/SleepEEGNet. |
Tasks | EEG, Sleep Stage Detection |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.02108v1 |
http://arxiv.org/pdf/1903.02108v1.pdf | |
PWC | https://paperswithcode.com/paper/sleepeegnet-automated-sleep-stage-scoring |
Repo | https://github.com/SajadMo/SleepEEGNet |
Framework | tf |
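
The loss functions aim to give each sleep stage an equal misclassification error despite heavy class imbalance. As a hedged illustration of that intent (not the paper's exact formulation), inverse-frequency class weights in the cross-entropy have the same effect:

```python
import torch
import torch.nn.functional as F

def balanced_ce(logits, targets, num_classes=5):
    """Cross-entropy with inverse-frequency class weights computed per batch."""
    counts = torch.bincount(targets, minlength=num_classes).float()
    weights = counts.sum() / (num_classes * counts.clamp(min=1))
    return F.cross_entropy(logits, targets, weight=weights)

logits = torch.randn(32, 5)                # 5 sleep stages (W, N1-N3, REM)
targets = torch.randint(0, 5, (32,))
loss = balanced_ce(logits, targets)        # rare stages weigh as much as common ones
```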
Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models
Title | Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models |
Authors | Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr |
Abstract | Learning generative models that span multiple data modalities, such as vision and language, is often motivated by the desire to learn more useful, generalisable representations that faithfully capture common underlying factors between the modalities. In this work, we characterise successful learning of such models as the fulfillment of four criteria: i) implicit latent decomposition into shared and private subspaces, ii) coherent joint generation over all modalities, iii) coherent cross-generation across individual modalities, and iv) improved model learning for individual modalities through multi-modal integration. Here, we propose a mixture-of-experts multimodal variational autoencoder (MMVAE) to learn generative models on different sets of modalities, including a challenging image-language dataset, and demonstrate its ability to satisfy all four criteria, both qualitatively and quantitatively. |
Tasks | |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03393v1 |
https://arxiv.org/pdf/1911.03393v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-mixture-of-experts-autoencoders |
Repo | https://github.com/iffsid/mmvae |
Framework | pytorch |
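
The model's joint posterior is a mixture of experts over the unimodal encoders, $q(z \mid x_{1:M}) = \frac{1}{M}\sum_m q_m(z \mid x_m)$, so a joint sample is drawn by picking a modality uniformly and sampling from its Gaussian encoder. A minimal PyTorch sketch with stand-in encoder outputs:

```python
import torch

def sample_moe_posterior(mus, logvars):
    """mus, logvars: lists of (batch, dim) Gaussian params, one per modality."""
    m = torch.randint(len(mus), (1,)).item()        # pick an expert uniformly
    std = (0.5 * logvars[m]).exp()
    return mus[m] + std * torch.randn_like(std)     # reparameterized draw

mus = [torch.zeros(4, 8), torch.ones(4, 8)]         # two modalities' encoders
logvars = [torch.zeros(4, 8), torch.zeros(4, 8)]
z = sample_moe_posterior(mus, logvars)              # (4, 8) joint latent sample
```

Because each expert sees only its own modality, shared structure must live where the experts agree, which is what drives the coherent joint and cross-modal generation the paper evaluates.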
Kernelized Bayesian Softmax for Text Generation
Title | Kernelized Bayesian Softmax for Text Generation |
Authors | Ning Miao, Hao Zhou, Chengqi Zhao, Wenxian Shi, Lei Li |
Abstract | Neural models for text generation require a softmax layer with proper token embeddings during the decoding phase. Most existing approaches adopt a single point embedding for each token. However, a word may have multiple senses depending on context, and some of these senses can be quite distinct. In this paper, we propose KerBS, a novel approach for learning better embeddings for text generation. KerBS embodies two advantages: (a) it employs a Bayesian composition of embeddings for words with multiple senses; (b) it is adaptive to the semantic variance of words and robust to rare sentence contexts by imposing learned kernels to capture the closeness of words (senses) in the embedding space. Empirical studies show that KerBS significantly boosts the performance of several text generation tasks. |
Tasks | Text Generation |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00274v1 |
https://arxiv.org/pdf/1911.00274v1.pdf | |
PWC | https://paperswithcode.com/paper/kernelized-bayesian-softmax-for-text |
Repo | https://github.com/NingMiao/KerBS |
Framework | tf |
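
A simplified PyTorch sketch of the multi-sense output-layer idea: each token owns several sense embeddings, and its logit aggregates the senses with a log-sum-exp so that the sense best matching the decoder state dominates. KerBS's actual Bayesian composition and learned kernels are richer than this; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class MultiSenseSoftmax(nn.Module):
    def __init__(self, vocab, dim, senses=3):
        super().__init__()
        # One embedding per (token, sense) pair.
        self.emb = nn.Parameter(torch.randn(vocab, senses, dim) * 0.01)

    def forward(self, h):                    # h: (batch, dim) decoder state
        scores = torch.einsum('bd,vsd->bvs', h, self.emb)  # per-sense scores
        return scores.logsumexp(dim=-1)      # (batch, vocab) token logits

layer = MultiSenseSoftmax(vocab=10000, dim=256)
logits = layer(torch.randn(4, 256))
probs = logits.softmax(dim=-1)               # decoding distribution
```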
The Multi-Lane Capsule Network (MLCN)
Title | The Multi-Lane Capsule Network (MLCN) |
Authors | Vanderson Martins do Rosario, Edson Borin, Mauricio Breternitz Jr |
Abstract | We introduce Multi-Lane Capsule Networks (MLCN), a separable and resource-efficient organization of capsule networks (CapsNet) that allows parallel processing while achieving high accuracy at reduced cost. An MLCN is composed of a number of (distinct) parallel lanes, each contributing to a dimension of the result, trained using the routing-by-agreement organization of CapsNet. Our results indicate similar accuracy with a much reduced cost in number of parameters for the Fashion-MNIST and CIFAR-10 datasets. They also indicate that MLCN outperforms the original CapsNet when using a proposed novel configuration for the lanes. MLCN also has faster training and inference times, being more than two-fold faster than the original CapsNet on the same accelerator. |
Tasks | |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08431v1 |
http://arxiv.org/pdf/1902.08431v1.pdf | |
PWC | https://paperswithcode.com/paper/the-multi-lane-capsule-network-mlcn |
Repo | https://github.com/vandersonmr/lanes-capsnet |
Framework | tf |
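
The lane structure can be sketched as follows: the feature maps are split channel-wise into independent lanes, each lane runs its own capsule stack (reduced here to a toy stand-in), and each contributes a slice of the final capsule dimensions, which is what makes the lanes separable and parallelizable. All sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToyLane(nn.Module):
    """Stand-in for one lane's primary-capsule + routing stack."""
    def __init__(self, in_ch, out_dims):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(in_ch, 8, 3), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(8, out_dims))

    def forward(self, x):
        return self.net(x)

lanes = nn.ModuleList([ToyLane(in_ch=16, out_dims=4) for _ in range(4)])
x = torch.randn(2, 64, 12, 12)
chunks = x.chunk(4, dim=1)                       # one feature slice per lane
caps = torch.cat([lane(c) for lane, c in zip(lanes, chunks)], dim=-1)
print(caps.shape)                                # (2, 16): 4 lanes x 4 dims each
```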
Progressive Stochastic Binarization of Deep Networks
Title | Progressive Stochastic Binarization of Deep Networks |
Authors | David Hartmann, Michael Wand |
Abstract | A plethora of recent research has focused on improving the memory footprint and inference speed of deep networks by reducing the complexity of (i) numerical representations (for example, by deterministic or stochastic quantization) and (ii) arithmetic operations (for example, by binarization of weights). We propose a stochastic binarization scheme for deep networks that allows for efficient inference on hardware by restricting itself to additions of small integers and fixed shifts. Unlike previous approaches, the underlying randomized approximation is progressive, thus permitting an adaptive control of the accuracy of each operation at run-time. In a low-precision setting, we match the accuracy of previous binarized approaches. Our representation is unbiased - it approaches continuous computation with increasing sample size. In a high-precision regime, the computational costs are competitive with previous quantization schemes. Progressive stochastic binarization also permits localized, dynamic accuracy control within a single network, thereby providing a new tool for adaptively focusing computational attention. We evaluate our method on networks of various architectures, already pretrained on ImageNet. With representational costs comparable to previous schemes, we obtain accuracies close to the original floating point implementation. This includes pruned networks, except for the known special case of certain types of separated convolutions. By focusing computational attention using progressive sampling, we reduce inference costs on ImageNet further by up to 33% (before network pruning). |
Tasks | Network Pruning, Quantization |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.02205v1 |
http://arxiv.org/pdf/1904.02205v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-stochastic-binarization-of-deep |
Repo | https://github.com/JGU-VC/progressive_stochastic_binarization |
Framework | tf |
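
The essential mechanism is an unbiased stochastic binarization whose accuracy is controlled at run time by the number of samples averaged. A NumPy sketch under the assumption of weights pre-scaled to $[-1, 1]$ (the paper's scheme additionally restricts arithmetic to small-integer additions and fixed shifts):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_binarize(w, samples=1):
    """Unbiased binary approximation of w in [-1, 1]: E[output] == w."""
    p = (w + 1.0) / 2.0                       # P(sample = +1)
    draws = np.where(rng.random((samples,) + w.shape) < p, 1.0, -1.0)
    return draws.mean(axis=0)                 # more samples -> higher accuracy

w = rng.uniform(-1, 1, size=1000)
for k in (1, 16, 256):
    err = np.abs(stochastic_binarize(w, k) - w).mean()
    print(f"{k:4d} samples: mean |error| = {err:.3f}")   # shrinks ~ 1/sqrt(k)
```

This is what "progressive" means here: the same representation serves low-precision fast paths and, with more samples, approaches full-precision computation, so accuracy can be reallocated dynamically within one network.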
Amortized Population Gibbs Samplers with Neural Sufficient Statistics
Title | Amortized Population Gibbs Samplers with Neural Sufficient Statistics |
Authors | Hao Wu, Heiko Zimmermann, Eli Sennesh, Tuan Anh Le, Jan-Willem van de Meent |
Abstract | Amortized variational methods have proven difficult to scale to structured problems, such as inferring positions of multiple objects from video images. We develop amortized population Gibbs (APG) samplers, a class of scalable methods that frames structured variational inference as adaptive importance sampling. APG samplers construct high-dimensional proposals by iterating over updates to lower-dimensional blocks of variables. We train each conditional proposal by minimizing the inclusive KL divergence with respect to the conditional posterior. To appropriately account for the size of the input data, we develop a new parameterization in terms of neural sufficient statistics. Experiments show that APG samplers can train highly structured deep generative models in an unsupervised manner, and achieve substantial improvements in inference accuracy relative to standard autoencoding variational methods. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01382v2 |
https://arxiv.org/pdf/1911.01382v2.pdf | |
PWC | https://paperswithcode.com/paper/amortized-population-gibbs-samplers-with |
Repo | https://github.com/hao-w/apg-samplers |
Framework | pytorch |
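
Schematically, an APG sweep revisits one block of latent variables at a time, proposes from a learned conditional, and uses importance weights to resample the particle population. The PyTorch sketch below simplifies the incremental weight (the paper's version also accounts for the reverse-proposal density), and all interfaces are assumptions:

```python
import torch

def apg_sweep(x, z, blocks, propose, log_joint, sweeps=2):
    """z: dict of latent tensors, each with a leading particle dimension."""
    for _ in range(sweeps):
        for b in blocks:                          # revisit one block at a time
            cand, log_q = propose(b, x, z)        # learned conditional proposal
            z_new = {**z, b: cand}
            # Simplified incremental importance weight per particle.
            log_w = log_joint(x, z_new) - log_joint(x, z) - log_q
            idx = torch.multinomial(torch.softmax(log_w, dim=0),
                                    num_samples=log_w.numel(), replacement=True)
            z = {k: v[idx] for k, v in z_new.items()}   # resample population
    return z
```

Training minimizes the inclusive KL between each conditional posterior and its proposal, which is what lets a few learned sweeps stand in for a long Gibbs chain.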
Stability of Graph Scattering Transforms
Title | Stability of Graph Scattering Transforms |
Authors | Fernando Gama, Joan Bruna, Alejandro Ribeiro |
Abstract | Scattering transforms are non-trainable deep convolutional architectures that exploit the multi-scale resolution of a wavelet filter bank to obtain an appropriate representation of data. More importantly, they are provably invariant to translations and stable to perturbations that are close to translations. This stability property endows the scattering transform with robustness to small changes in the metric domain of the data. For network data, however, regular convolutions no longer apply, since the data domain has an irregular structure given by the network topology. In this work, we extend scattering transforms to network data by using multiresolution graph wavelets, whose computation can be obtained by means of graph convolutions. Furthermore, we prove that the resulting graph scattering transforms are stable to metric perturbations of the underlying network. This renders graph scattering transforms robust to changes in the network topology, making them particularly useful for transfer learning, topology estimation and time-varying graphs. |
Tasks | Transfer Learning |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04784v1 |
https://arxiv.org/pdf/1906.04784v1.pdf | |
PWC | https://paperswithcode.com/paper/stability-of-graph-scattering-transforms |
Repo | https://github.com/alelab-upenn/graph-scattering-transforms |
Framework | pytorch |
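
A minimal NumPy sketch of a graph scattering transform: dyadic diffusion wavelets built from a lazy-diffusion operator, cascaded with a modulus nonlinearity and graph averaging, with no trained coefficients anywhere. The wavelet construction shown is one standard choice among several the framework admits.

```python
import numpy as np

def diffusion_wavelets(A, scales=3):
    """Dyadic wavelets psi_j = T^(2^j) - T^(2^(j+1)) from lazy diffusion T."""
    deg = np.maximum(A.sum(1), 1e-8)
    T = 0.5 * (np.eye(len(A)) + A / np.sqrt(np.outer(deg, deg)))
    powers = [np.linalg.matrix_power(T, 2 ** j) for j in range(scales + 1)]
    return [powers[j] - powers[j + 1] for j in range(scales)]

def scatter(x, wavelets, depth=2):
    coeffs, layer = [x.mean()], [x]
    for _ in range(depth):
        layer = [np.abs(W @ u) for u in layer for W in wavelets]  # filter + modulus
        coeffs += [u.mean() for u in layer]      # graph-average each path
    return np.array(coeffs)

rng = np.random.default_rng(1)
A = (rng.random((10, 10)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T                   # random undirected graph
x = rng.standard_normal(10)                      # signal on the nodes
print(scatter(x, diffusion_wavelets(A)).shape)   # (13,) = 1 + 3 + 9 path features
```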
On Relativistic $f$-Divergences
Title | On Relativistic $f$-Divergences |
Authors | Alexia Jolicoeur-Martineau |
Abstract | This paper provides a more rigorous look at Relativistic Generative Adversarial Networks (RGANs). We prove that the objective function of the discriminator is a statistical divergence for any concave function $f$ with minimal properties ($f(0)=0$, $f'(0) \neq 0$, $\sup_x f(x)>0$). We also devise a few variants of relativistic $f$-divergences. Wasserstein GAN was originally justified by the idea that the Wasserstein distance (WD) is most sensible because it is weak (i.e., it induces a weak topology). We show that the WD is weaker than $f$-divergences, which are in turn weaker than relativistic $f$-divergences. Given the good performance of RGANs, this suggests that WGAN does not perform well primarily because of the weak metric, but rather because of regularization and the use of a relativistic discriminator. We also take a closer look at estimators of relativistic $f$-divergences. We introduce the minimum-variance unbiased estimator (MVUE) for Relativistic paired GANs (RpGANs; originally called RGANs, a name that could cause confusion) and show that it does not perform better. Furthermore, we show that the estimator of Relativistic average GANs (RaGANs) is only asymptotically unbiased, but that the finite-sample bias is small. Removing this bias does not improve performance. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02474v1 |
http://arxiv.org/pdf/1901.02474v1.pdf | |
PWC | https://paperswithcode.com/paper/on-relativistic-f-divergences |
Repo | https://github.com/AlexiaJM/relativistic-f-divergences |
Framework | tf |
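
For the standard log-sigmoid choice of $f$, the relativistic paired objective depends only on the difference of critic scores $C(x_r) - C(x_f)$, i.e. on how much more realistic the real sample looks than the fake one. A minimal PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def rpgan_d_loss(c_real, c_fake):
    """-E[log sigmoid(C(x_r) - C(x_f))]; softplus(-z) == -log sigmoid(z)."""
    return F.softplus(-(c_real - c_fake)).mean()

def rpgan_g_loss(c_real, c_fake):
    """Same objective with the roles of real and fake swapped."""
    return F.softplus(-(c_fake - c_real)).mean()

c_real, c_fake = torch.randn(16), torch.randn(16)   # critic outputs per sample
print(rpgan_d_loss(c_real, c_fake), rpgan_g_loss(c_real, c_fake))
```

Other choices of $f$ satisfying the paper's three conditions yield the other relativistic $f$-divergences it analyzes; only these two loss functions change.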