October 19, 2019

3033 words 15 mins read

Paper Group ANR 208

HMS-Net: Hierarchical Multi-scale Sparsity-invariant Network for Sparse Depth Completion. SPIGAN: Privileged Adversarial Learning from Simulation. Rain Removal By Image Quasi-Sparsity Priors. Removing rain streaks by a linear model. A Deep Tree-Structured Fusion Model for Single Image Deraining. Towards Semi-Supervised Learning for Deep Semantic Ro …

HMS-Net: Hierarchical Multi-scale Sparsity-invariant Network for Sparse Depth Completion

Title HMS-Net: Hierarchical Multi-scale Sparsity-invariant Network for Sparse Depth Completion
Authors Zixuan Huang, Junming Fan, Shenggan Cheng, Shuai Yi, Xiaogang Wang, Hongsheng Li
Abstract Dense depth cues are important and have wide applications in various computer vision tasks. In autonomous driving, LIDAR sensors are adopted to acquire depth measurements around the vehicle to perceive the surrounding environment. However, depth maps obtained by LIDAR are generally sparse because of hardware limitations. The task of depth completion, which aims at generating a dense depth map from an input sparse depth map, has therefore attracted increasing attention. To effectively utilize multi-scale features, we propose three novel sparsity-invariant operations, based on which we build a sparsity-invariant multi-scale encoder-decoder network (HMS-Net) for handling sparse inputs and sparse feature maps. Additional RGB features can be incorporated to further improve depth completion performance. Our extensive experiments and component analysis on two public benchmarks, the KITTI depth completion benchmark and the NYU-depth-v2 dataset, demonstrate the effectiveness of the proposed approach. As of Aug. 12th, 2018, on the KITTI depth completion leaderboard, our proposed model without RGB guidance ranks first among all peer-reviewed methods that do not use RGB information, and our model with RGB guidance ranks second among all RGB-guided methods.
Tasks Autonomous Driving, Depth Completion
Published 2018-08-27
URL https://arxiv.org/abs/1808.08685v2
PDF https://arxiv.org/pdf/1808.08685v2.pdf
PWC https://paperswithcode.com/paper/hms-net-hierarchical-multi-scale-sparsity
Repo
Framework
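
The sparsity-invariant operations that HMS-Net builds on extend the basic sparsity-invariant convolution, in which only valid depth pixels contribute to the output and the result is renormalized by the number of valid neighbors. The following is a minimal NumPy/SciPy sketch of that basic operation, not the authors' implementation; the kernel here merely stands in for learned weights.

```python
import numpy as np
from scipy.ndimage import convolve, maximum_filter

def sparsity_invariant_conv(depth, mask, kernel, eps=1e-8):
    """Toy single-channel sparsity-invariant convolution.

    depth  : 2D array of sparse depth values (entries without a measurement are ignored)
    mask   : 2D binary array, 1 where a depth measurement exists
    kernel : 2D kernel standing in for learned convolution weights
    """
    # Convolve only the observed values, then renormalize by how many
    # valid pixels contributed at each output location.
    num = convolve(depth * mask, kernel, mode="constant")
    den = convolve(mask.astype(float), np.ones_like(kernel), mode="constant")
    out = num / (den + eps)
    # Propagate the validity mask (max-pooling over the kernel footprint).
    new_mask = maximum_filter(mask, size=kernel.shape)
    return out, new_mask

# Example: a 5x5 all-ones kernel on a 90%-sparse depth map,
# which here reduces to a masked local average.
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 80.0, size=(64, 64))
mask = (rng.random((64, 64)) < 0.1).astype(float)
dense, valid = sparsity_invariant_conv(depth, mask, np.ones((5, 5)))
```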

SPIGAN: Privileged Adversarial Learning from Simulation

Title SPIGAN: Privileged Adversarial Learning from Simulation
Authors Kuan-Hui Lee, German Ros, Jie Li, Adrien Gaidon
Abstract Deep Learning for Computer Vision depends mainly on the source of supervision. Photo-realistic simulators can generate large-scale automatically labeled synthetic data, but introduce a domain gap negatively impacting performance. We propose a new unsupervised domain adaptation algorithm, called SPIGAN, relying on Simulator Privileged Information (PI) and Generative Adversarial Networks (GAN). We use internal data from the simulator as PI during the training of a target task network. We experimentally evaluate our approach on semantic segmentation. We train the networks on real-world Cityscapes and Vistas datasets, using only unlabeled real-world images and synthetic labeled data with z-buffer (depth) PI from the SYNTHIA dataset. Our method improves over no adaptation and state-of-the-art unsupervised domain adaptation techniques.
Tasks Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation
Published 2018-10-09
URL http://arxiv.org/abs/1810.03756v3
PDF http://arxiv.org/pdf/1810.03756v3.pdf
PWC https://paperswithcode.com/paper/spigan-privileged-adversarial-learning-from
Repo
Framework
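
Concretely, SPIGAN-style training couples an image-to-image generator with a task network and a privileged-information network supervised by the simulator's z-buffer. The sketch below shows one plausible way to combine the adversarial, task, and PI losses in PyTorch; the networks, loss form, and weights are placeholders, and any additional regularization terms used in the paper are omitted here.

```python
import torch
import torch.nn.functional as F

# G, D, T, P are assumed to be torch.nn.Module instances supplied by the caller:
# G adapts simulated images toward the real domain, D discriminates real vs. adapted,
# T predicts semantic segmentation, and P regresses depth (the privileged information).
def spigan_losses(G, D, T, P, x_sim, y_sim_seg, z_sim_depth, x_real,
                  w_adv=1.0, w_task=1.0, w_pi=0.1):
    x_fake = G(x_sim)                             # adapted synthetic image

    # Adversarial loss (least-squares GAN form, one of several possible choices).
    d_real, d_fake = D(x_real), D(x_fake)
    loss_d = ((d_real - 1) ** 2).mean() + (d_fake ** 2).mean()
    loss_g_adv = ((d_fake - 1) ** 2).mean()

    # Task loss: segmentation on adapted images, supervised by simulator labels.
    loss_task = F.cross_entropy(T(x_fake), y_sim_seg)

    # Privileged-information loss: regress the simulator's z-buffer (depth).
    loss_pi = F.l1_loss(P(x_fake), z_sim_depth)

    loss_g = w_adv * loss_g_adv + w_task * loss_task + w_pi * loss_pi
    return loss_g, loss_d
```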

Rain Removal By Image Quasi-Sparsity Priors

Title Rain Removal By Image Quasi-Sparsity Priors
Authors Yinglong Wang, Shuaicheng Liu, Chen Chen, Dehua Xie, Bing Zeng
Abstract Rain streaks will inevitably be captured by some outdoor vision systems, which lowers image visual quality and also interferes with various computer vision applications. We present a novel rain removal method in this paper, which consists of two steps, i.e., detection of rain streaks and reconstruction of the rain-removed image. An accurate detection of rain streaks determines the quality of the overall performance. To this end, we first detect rain streaks according to pixel intensities, motivated by the observation that rain streaks often possess higher intensities than other neighboring image structures. Some mis-detected locations are then refined through morphological processing and principal component analysis (PCA) such that only locations corresponding to real rain streaks are retained. In the second step, we separate image gradients into a background layer and a rain streak layer, thanks to the image quasi-sparsity prior, so that a rain image can be decomposed into a background layer and a rain layer. We validate the effectiveness of our method through quantitative and qualitative evaluations. We show that our method can remove rain (even some relatively bright rain) from images robustly and outperforms several state-of-the-art rain removal algorithms.
Tasks Rain Removal
Published 2018-12-20
URL http://arxiv.org/abs/1812.08348v1
PDF http://arxiv.org/pdf/1812.08348v1.pdf
PWC https://paperswithcode.com/paper/rain-removal-by-image-quasi-sparsity-priors
Repo
Framework
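
The first step (rain-streak detection by pixel intensity, refined by morphology) can be illustrated with a toy sketch like the one below; the window size and threshold are arbitrary illustrative values, and the PCA-based refinement described in the abstract is omitted.

```python
import numpy as np
from scipy.ndimage import uniform_filter, binary_opening

def detect_rain_candidates(gray, window=15, tau=0.02):
    """Toy rain-streak candidate detector (illustrative parameters only).

    gray : 2D float image in [0, 1].
    Flags pixels noticeably brighter than their local neighborhood,
    then removes isolated false detections with a morphological opening.
    """
    local_mean = uniform_filter(gray, size=window)
    candidates = (gray - local_mean) > tau          # rain streaks tend to be brighter
    refined = binary_opening(candidates, structure=np.ones((2, 2), dtype=bool))
    return refined
```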

Removing rain streaks by a linear model

Title Removing rain streaks by a linear model
Authors Yinglong Wang, Shuaicheng Liu, Bing Zeng
Abstract Removing rain streaks from a single image continues to draw attention in outdoor vision systems. In this paper, we present an efficient method to remove rain streaks. First, the location map of rain pixels needs to be known as precisely as possible, for which we implement a relatively accurate detection of rain streaks by utilizing two characteristics of rain streaks. The key component of our method is to represent the intensity of each detected rain pixel using a linear model: $p=\alpha s + \beta$, where $p$ is the observed intensity of a rain pixel and $s$ represents the intensity of the background (i.e., before being affected by rain). To solve for $\alpha$ and $\beta$ at each detected rain pixel, we consider a window centered around it and form an $L_2$-norm cost function over all detected rain pixels within the window, where the corresponding rain-removed intensity of each detected rain pixel is estimated from some neighboring non-rain pixels. By minimizing this cost function, we determine $\alpha$ and $\beta$ and then construct the final rain-removed pixel intensity. Compared with several state-of-the-art works, our proposed method removes rain streaks from a single color image much more efficiently - it offers not only better visual quality but also a speed-up ranging from several times to an order of magnitude.
Tasks Rain Removal
Published 2018-12-19
URL http://arxiv.org/abs/1812.07870v1
PDF http://arxiv.org/pdf/1812.07870v1.pdf
PWC https://paperswithcode.com/paper/removing-rain-streaks-by-a-linear-model
Repo
Framework
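
The per-pixel model $p=\alpha s + \beta$ with an $L_2$-norm cost over a local window reduces to an ordinary least-squares fit. The snippet below sketches that fit for one window; how the background estimates are obtained from neighboring non-rain pixels is left out, so treat it as an illustration of the cost-minimization step only.

```python
import numpy as np

def fit_alpha_beta(p_obs, s_est):
    """Least-squares fit of p = alpha * s + beta over one local window.

    p_obs : observed intensities of the detected rain pixels in the window
    s_est : their estimated background intensities (in the paper these are
            estimated from neighboring non-rain pixels; omitted here)
    """
    A = np.stack([s_est, np.ones_like(s_est)], axis=1)      # columns [s, 1]
    (alpha, beta), *_ = np.linalg.lstsq(A, p_obs, rcond=None)
    return alpha, beta

# The rain-removed intensity of a detected pixel then follows from the model:
#   s = (p - beta) / alpha   (assuming alpha != 0)
```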

A Deep Tree-Structured Fusion Model for Single Image Deraining

Title A Deep Tree-Structured Fusion Model for Single Image Deraining
Authors Xueyang Fu, Qi Qi, Yue Huang, Xinghao Ding, Feng Wu, John Paisley
Abstract We propose a simple yet effective deep tree-structured fusion model based on feature aggregation for the deraining problem. We argue that by effectively aggregating features, a relatively simple network can still handle tough image deraining problems well. First, to capture the spatial structure of rain we use dilated convolutions as our basic network block. We then design a tree-structured fusion architecture which is deployed within each block (spatial information) and across all blocks (content information). Our method is based on the assumption that adjacent features contain redundant information. This redundancy obstructs generation of new representations and can be reduced by hierarchically fusing adjacent features. Thus, the proposed model is more compact and can effectively use spatial and content information. Experiments on synthetic and real-world datasets show that our network achieves better deraining results with fewer parameters.
Tasks Rain Removal, Single Image Deraining
Published 2018-11-21
URL http://arxiv.org/abs/1811.08632v1
PDF http://arxiv.org/pdf/1811.08632v1.pdf
PWC https://paperswithcode.com/paper/a-deep-tree-structured-fusion-model-for
Repo
Framework
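
The two ingredients named in the abstract, dilated-convolution blocks and hierarchical fusion of adjacent features, can be sketched roughly as below. This is a toy PyTorch illustration with arbitrary channel counts, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Toy basic block: a dilated convolution to enlarge the receptive field."""
    def __init__(self, ch, dilation):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.conv(x))

class PairwiseFusion(nn.Module):
    """Fuse two adjacent feature maps into one (1x1 conv), reducing redundancy."""
    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, a, b):
        return self.fuse(torch.cat([a, b], dim=1))

# Tree-structured fusion over four block outputs f1..f4 (illustrative only):
#   g1 = fuse(f1, f2); g2 = fuse(f3, f4); out = fuse(g1, g2)
```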

Towards Semi-Supervised Learning for Deep Semantic Role Labeling

Title Towards Semi-Supervised Learning for Deep Semantic Role Labeling
Authors Sanket Vaibhav Mehta, Jay Yoon Lee, Jaime Carbonell
Abstract Neural models have achieved state-of-the-art performance on Semantic Role Labeling (SRL). However, these models require immense amounts of semantic-role annotated corpora and are thus not well suited for low-resource languages or domains. This paper proposes a semi-supervised semantic role labeling method that outperforms the state of the art when SRL training corpora are limited. The method explicitly enforces syntactic constraints by augmenting the training objective with a syntactic-inconsistency loss component and uses SRL-unlabeled instances to train a joint-objective LSTM. On the CoNLL-2012 English section, the proposed semi-supervised training with 1% and 10% SRL-labeled data and varying amounts of SRL-unlabeled data achieves gains of +1.58 and +0.78 F1, respectively, over pre-trained models trained with a state-of-the-art architecture and ELMo on the same SRL-labeled data. Additionally, by applying the syntactic-inconsistency loss at inference time, the proposed model achieves gains of +3.67 and +2.1 F1 over the pre-trained model on 1% and 10% SRL-labeled data, respectively.
Tasks Semantic Role Labeling
Published 2018-08-28
URL http://arxiv.org/abs/1808.09543v1
PDF http://arxiv.org/pdf/1808.09543v1.pdf
PWC https://paperswithcode.com/paper/towards-semi-supervised-learning-for-deep
Repo
Framework

FBI-Pose: Towards Bridging the Gap between 2D Images and 3D Human Poses using Forward-or-Backward Information

Title FBI-Pose: Towards Bridging the Gap between 2D Images and 3D Human Poses using Forward-or-Backward Information
Authors Yulong Shi, Xiaoguang Han, Nianjuan Jiang, Kun Zhou, Kui Jia, Jiangbo Lu
Abstract Although significant advances have been made in human pose estimation from images using deep Convolutional Neural Networks (ConvNets), 3D pose inference in the wild remains a big challenge, due to the difficulty of obtaining 3D pose ground truth for outdoor environments. In this paper, we propose a novel framework that tackles this problem by exploiting, for each bone, the information of whether it points forward or backward with respect to the camera view (we term this Forward-or-Backward Information, abbreviated as FBI). Our method first trains a ConvNet with two branches that maps an image of a human to both the 2D joint locations and the FBI of the bones. This information is then fed into a deep regression network to predict the 3D positions of the joints. To support training, we also develop an annotation user interface and label the FBI for around 12K in-the-wild images randomly selected from MPII (a public dataset with 2D pose annotations). Our experimental results on standard benchmarks demonstrate that our approach outperforms state-of-the-art methods both qualitatively and quantitatively.
Tasks
Published 2018-06-25
URL http://arxiv.org/abs/1806.09241v1
PDF http://arxiv.org/pdf/1806.09241v1.pdf
PWC https://paperswithcode.com/paper/fbi-pose-towards-bridging-the-gap-between-2d
Repo
Framework

On the Statistical and Information-theoretic Characteristics of Deep Network Representations

Title On the Statistical and Information-theoretic Characteristics of Deep Network Representations
Authors Daeyoung Choi, Kyungeun Lee, Changho Shin, Wonjong Rhee
Abstract It has been common to argue or imply that a regularizer can be used to alter a statistical property of a hidden layer’s representation and thus improve generalization or performance of deep networks. For instance, dropout has been known to improve performance by reducing co-adaptation, and representational sparsity has been argued to be a good characteristic because many data-generation processes have a small number of independent factors. In this work, we analytically and empirically investigate popular characteristics of learned representations, including correlation, sparsity, dead units, rank, and mutual information, and disprove much of this conventional wisdom. We first show that infinitely many Identical Output Networks (IONs) can be constructed for any deep network with a linear layer, where any invertible affine transformation can be applied to alter the layer’s representation characteristics. The existence of IONs proves that the correlation characteristics of a representation are irrelevant to performance. Extensions to ReLU layers are provided as well. Then, we consider sparsity, dead units, and rank, and show that only loose relationships exist among the three characteristics: higher sparsity or additional dead units do not imply better or worse performance when the rank of the representation is fixed. We also develop a rank regularizer and show that neither representation sparsity nor lower rank is helpful for improving performance, even when the data-generation process has a small number of independent factors. Mutual information $I(\mathbf{z}_l;\mathbf{x})$ and $I(\mathbf{z}_l;\mathbf{y})$ are investigated, and we show that regularizers can affect $I(\mathbf{z}_l;\mathbf{x})$ and thus indirectly influence performance. Finally, we explain how a rich set of regularizers can be used as a powerful tool for performance tuning.
Tasks
Published 2018-11-08
URL http://arxiv.org/abs/1811.03666v1
PDF http://arxiv.org/pdf/1811.03666v1.pdf
PWC https://paperswithcode.com/paper/on-the-statistical-and-information-theoretic
Repo
Framework
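
The ION construction for a linear layer is easy to verify numerically: applying an invertible matrix $A$ to a linear layer's output and folding $A^{-1}$ into the next layer's weights changes the hidden representation's statistics (correlation, sparsity, and so on) but leaves the network output untouched. A small NumPy check follows (with no nonlinearity between the two layers, for clarity; the paper also covers ReLU extensions).

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 16, 4
x = rng.normal(size=(32, d_in))

W1, b1 = rng.normal(size=(d_in, d_hidden)), rng.normal(size=d_hidden)
W2, b2 = rng.normal(size=(d_hidden, d_out)), rng.normal(size=d_out)

def net(x, W1, b1, W2, b2):
    z = x @ W1 + b1            # linear hidden layer (no nonlinearity in this toy check)
    return z @ W2 + b2

# Invertible transformation of the hidden representation.
A = rng.normal(size=(d_hidden, d_hidden))
A += d_hidden * np.eye(d_hidden)          # keep it comfortably invertible
W1_t, b1_t = W1 @ A, b1 @ A               # transformed hidden layer
W2_t = np.linalg.inv(A) @ W2              # compensating next-layer weights

y = net(x, W1, b1, W2, b2)
y_t = net(x, W1_t, b1_t, W2_t, b2)
print(np.allclose(y, y_t))                # True: an Identical Output Network
```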

Aligning Points to Lines: Provable Approximations

Title Aligning Points to Lines: Provable Approximations
Authors Ibrahim Jubran, Dan Feldman
Abstract We suggest a new optimization technique for minimizing the sum $\sum_{i=1}^n f_i(x)$ of $n$ non-convex real functions that satisfy a property that we call piecewise log-Lipschitz. This is by forging links between techniques in computational geometry, combinatorics and convex optimization. As an example application, we provide the first constant-factor approximation algorithms whose running-time is polynomial in $n$ for the fundamental problem of \emph{Points-to-Lines alignment}: Given $n$ points $p_1,\cdots,p_n$ and $n$ lines $\ell_1,\cdots,\ell_n$ on the plane and $z>0$, compute the matching $\pi:[n]\to[n]$ and alignment (rotation matrix $R$ and a translation vector $t$) that minimize the sum of Euclidean distances $\sum_{i=1}^n \mathrm{dist}(Rp_i-t,\ell_{\pi(i)})^z$ between each point to its corresponding line. This problem is non-trivial even if $z=1$ and the matching $\pi$ is given. If $\pi$ is given, the running time of our algorithms is $O(n^3)$, and even near-linear in $n$ using core-sets that support: streaming, dynamic, and distributed parallel computations in poly-logarithmic update time. Generalizations for handling e.g. outliers or pseudo-distances such as $M$-estimators for the problem are also provided. Experimental results and open source code show that our provable algorithms improve existing heuristics also in practice. A companion demonstration video in the context of Augmented Reality shows how such algorithms may be used in real-time systems.
Tasks
Published 2018-07-23
URL https://arxiv.org/abs/1807.08446v3
PDF https://arxiv.org/pdf/1807.08446v3.pdf
PWC https://paperswithcode.com/paper/minimizing-sum-of-non-convex-but-piecewise
Repo
Framework
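
For reference, the Points-to-Lines objective from the abstract is straightforward to evaluate once a matching $\pi$ and an alignment $(R, t)$ are given; the paper's contribution is its provably approximate minimization, which is not reproduced here. A small NumPy helper, with each line represented by a point on it and a unit direction:

```python
import numpy as np

def alignment_cost(P, line_pts, line_dirs, R, t, pi, z=1.0):
    """Evaluate sum_i dist(R p_i - t, line_{pi(i)})^z on the plane.

    P         : (n, 2) points
    line_pts  : (n, 2) a point on each line
    line_dirs : (n, 2) unit direction of each line
    R, t      : 2x2 rotation matrix and translation vector
    pi        : matching, an integer array with pi[i] = index of the line matched to point i
    """
    q = P @ R.T - t                          # transformed points R p_i - t
    a = line_pts[pi]
    d = line_dirs[pi]
    v = q - a
    # Distance from q to the line through a with unit direction d.
    proj = (v * d).sum(axis=1, keepdims=True) * d
    dist = np.linalg.norm(v - proj, axis=1)
    return (dist ** z).sum()
```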

XGBoost: Scalable GPU Accelerated Learning

Title XGBoost: Scalable GPU Accelerated Learning
Authors Rory Mitchell, Andrey Adinets, Thejaswi Rao, Eibe Frank
Abstract We describe the multi-GPU gradient boosting algorithm implemented in the XGBoost library (https://github.com/dmlc/xgboost). Our algorithm allows fast, scalable training on multi-GPU systems with all of the features of the XGBoost library. We employ data compression techniques to minimise the usage of scarce GPU memory while still allowing highly efficient implementation. Using our algorithm we show that it is possible to process 115 million training instances in under three minutes on a publicly available cloud computing instance. The algorithm is implemented using end-to-end GPU parallelism, with prediction, gradient calculation, feature quantisation, decision tree construction and evaluation phases all computed on device.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11248v1
PDF http://arxiv.org/pdf/1806.11248v1.pdf
PWC https://paperswithcode.com/paper/xgboost-scalable-gpu-accelerated-learning
Repo
Framework
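
In practice, the GPU histogram algorithm described here is exposed through the XGBoost library's tree_method parameter. The snippet below is a minimal usage sketch on synthetic data; parameter names vary across XGBoost versions (newer releases select the GPU via device="cuda" with tree_method="hist"), so check the documentation of the installed version.

```python
import numpy as np
import xgboost as xgb

# Synthetic regression data standing in for a large training set.
X = np.random.rand(100_000, 50).astype(np.float32)
y = X[:, 0] * 2.0 + np.random.randn(100_000) * 0.1

dtrain = xgb.DMatrix(X, label=y)
params = {
    "tree_method": "gpu_hist",       # GPU histogram algorithm (older-style parameter)
    "max_depth": 8,
    "objective": "reg:squarederror",
}
booster = xgb.train(params, dtrain, num_boost_round=100)
```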

Running on Fumes–Preventing Out-of-Gas Vulnerabilities in Ethereum Smart Contracts using Static Resource Analysis

Title Running on Fumes–Preventing Out-of-Gas Vulnerabilities in Ethereum Smart Contracts using Static Resource Analysis
Authors Elvira Albert, Pablo Gordillo, Albert Rubio, Ilya Sergey
Abstract Gas is a measurement unit of the computational effort required to execute every single operation that takes place on the Ethereum blockchain platform. Each instruction executed by the Ethereum Virtual Machine (EVM) has an associated gas consumption specified by Ethereum. If a transaction exceeds the amount of gas allotted by the user (known as the gas limit), an out-of-gas exception is raised. There is a wide family of contract vulnerabilities due to out-of-gas behaviours. We report on the design and implementation of GASTAP, a Gas-Aware Smart contracT Analysis Platform, which takes as input a smart contract (either in EVM, disassembled EVM, or Solidity source code) and automatically infers sound gas upper bounds for all its public functions. Our bounds ensure that if the gas limit paid by the user is higher than our inferred gas bounds, the contract is free of out-of-gas vulnerabilities.
Tasks
Published 2018-11-22
URL https://arxiv.org/abs/1811.10403v2
PDF https://arxiv.org/pdf/1811.10403v2.pdf
PWC https://paperswithcode.com/paper/gastap-a-gas-analyzer-for-smart-contracts
Repo
Framework

Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization

Title Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization
Authors Simon Leglaive, Laurent Girin, Radu Horaud
Abstract In this paper we address speaker-independent multichannel speech enhancement in unknown noisy environments. Our work is based on a well-established multichannel local Gaussian modeling framework. We propose to use a neural network for modeling the speech spectro-temporal content. The parameters of this supervised model are learned using the framework of variational autoencoders. The noisy recording environment is supposed to be unknown, so the noise spectro-temporal modeling remains unsupervised and is based on non-negative matrix factorization (NMF). We develop a Monte Carlo expectation-maximization algorithm and we experimentally show that the proposed approach outperforms its NMF-based counterpart, where speech is modeled using supervised NMF.
Tasks Speech Enhancement
Published 2018-11-16
URL http://arxiv.org/abs/1811.06713v3
PDF http://arxiv.org/pdf/1811.06713v3.pdf
PWC https://paperswithcode.com/paper/semi-supervised-multichannel-speech
Repo
Framework
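
As a reference for the unsupervised half of the model, the standard multiplicative-update NMF (here with the KL divergence) that such noise-spectrogram models build on looks as follows. This is only the generic NMF building block, not the paper's Monte Carlo expectation-maximization algorithm, and the divergence choice is an assumption made for illustration.

```python
import numpy as np

def nmf_kl(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Standard multiplicative-update NMF (KL divergence), V ≈ W @ H.

    Here V would be a nonnegative noise power spectrogram (frequencies x frames).
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, N)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)   # update activations
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)   # update spectral basis
    return W, H
```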

Named-Entity Linking Using Deep Learning For Legal Documents: A Transfer Learning Approach

Title Named-Entity Linking Using Deep Learning For Legal Documents: A Transfer Learning Approach
Authors Ahmed Elnaggar, Robin Otto, Florian Matthes
Abstract In the legal domain it is important to differentiate between words in general and, afterwards, to link occurrences of the same entities. The task addressing these challenges is called Named-Entity Linking (NEL). Current supervised neural networks designed for NEL use publicly available datasets for training and testing. This paper, however, focuses on applying a transfer learning approach, using networks trained for NEL, to legal documents. Experiments show consistent improvement on the legal datasets that were created from European Union law in the scope of this research. Using the transfer learning approach, we reach F1-scores of 98.90% and 98.01% on the small and large legal test datasets, respectively.
Tasks Entity Linking, Transfer Learning
Published 2018-10-15
URL http://arxiv.org/abs/1810.06673v1
PDF http://arxiv.org/pdf/1810.06673v1.pdf
PWC https://paperswithcode.com/paper/named-entity-linking-using-deep-learning-for
Repo
Framework

Expressive power of outer product manifolds on feed-forward neural networks

Title Expressive power of outer product manifolds on feed-forward neural networks
Authors Bálint Daróczy, Rita Aleksziev, András Benczúr
Abstract Hierarchical neural networks are exponentially more efficient than their corresponding “shallow” counterparts with the same expressive power, but involve a huge number of parameters and require tedious amounts of training. Our main idea is to mathematically understand and describe the hierarchical structure of feedforward neural networks via reparametrization-invariant Riemannian metrics. By computing or approximating the tangent subspace, we better utilize the original network via sparse representations that enable switching to shallow networks after a very early training stage. Our experiments show that the proposed approximation of the metric improves, and sometimes even significantly surpasses, the achievable performance of the original network even after only a few epochs of training the original feedforward network.
Tasks
Published 2018-07-17
URL http://arxiv.org/abs/1807.06630v1
PDF http://arxiv.org/pdf/1807.06630v1.pdf
PWC https://paperswithcode.com/paper/expressive-power-of-outer-product-manifolds
Repo
Framework

Why don’t the modules dominate - Investigating the Structure of a Well-Known Modularity-Inducing Problem Domain

Title Why don’t the modules dominate - Investigating the Structure of a Well-Known Modularity-Inducing Problem Domain
Authors Zhenyue Qin, Robert McKay, Tom Gedeon
Abstract Wagner’s modularity-inducing problem domain is a key contribution to the study of the evolution of modularity, in both evolutionary theory and evolutionary computation. We study its behavior under classical genetic algorithms. Unlike what we seem to observe in nature, the emergence of modularity is highly conditional and dependent, for example, on the eagerness of search. In nature, modular solutions generally dominate populations, whereas in this domain modularity, when it emerges, is a relatively rare variant. Emergence of modularity depends heavily on random fluctuations in the fitness function; with a randomly varied but unchanging fitness function, modularity evolved far more rarely. Interestingly, high-fitness non-modular solutions could frequently be converted into even-higher-fitness modular solutions by manually removing all inter-module edges. Despite careful exploration, we do not yet have a full explanation of why the genetic algorithm was unable to find these better solutions.
Tasks
Published 2018-07-11
URL http://arxiv.org/abs/1807.05976v2
PDF http://arxiv.org/pdf/1807.05976v2.pdf
PWC https://paperswithcode.com/paper/why-dont-the-modules-dominate-investigating
Repo
Framework