January 28, 2020

Paper Group ANR 871

Deep Learning on Mobile Devices - A Review


Title	Deep Learning on Mobile Devices - A Review
Authors	Yunbin Deng
Abstract	Recent breakthroughs in deep learning and artificial intelligence technologies have enabled numerous mobile applications. While traditional computation paradigms rely on mobile sensing and cloud computing, deep learning implemented on mobile devices provides several advantages. These advantages include low communication bandwidth, small cloud computing resource cost, quick response time, and improved data privacy. Research and development of deep learning on mobile and embedded devices has recently attracted much attention. This paper provides a timely review of this fast-paced field to give the researcher, engineer, practitioner, and graduate student a quick grasp of the recent advancements of deep learning on mobile devices. In this paper, we discuss hardware architectures for mobile deep learning, including Field Programmable Gate Arrays, Application Specific Integrated Circuits, and recent mobile Graphics Processing Units. We present Size, Weight, Area and Power considerations and their relation to algorithm optimizations, such as quantization, pruning, compression, and approximations that simplify computation while retaining performance accuracy. We cover existing systems and give a state-of-the-industry review of the TensorFlow, MXNet, Mobile AI Compute Engine, and Paddle-mobile deep learning platforms. We discuss resources for mobile deep learning practitioners, including tools, libraries, models, and performance benchmarks. We present applications of various mobile sensing modalities to industries ranging from robotics, healthcare, multimedia, and biometrics to autonomous driving and defense. We address the key deep learning challenges to overcome, including low-quality data and small training/adaptation data sets. In addition, the review provides numerous citations and links to existing code bases implementing various technologies.
Tasks	Quantization
Published	2019-03-21
URL	http://arxiv.org/abs/1904.09274v1
PDF	http://arxiv.org/pdf/1904.09274v1.pdf
PWC	https://paperswithcode.com/paper/190409274
Repo
Framework
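
The abstract lists quantization among the optimizations that shrink models for mobile hardware. As a rough illustration only, the sketch below applies a generic affine (min/max) quantization of float weights to 8-bit codes. The function names `quantize_uint8` and `dequantize` and the sample weights are hypothetical and are not taken from the review or from any of the platforms it covers.

```python
def quantize_uint8(weights):
    """Affine-quantize a list of floats to integer codes in [0, 255]."""
    lo, hi = min(weights), max(weights)
    # Guard against a constant tensor, where the value range would be zero.
    scale = (hi - lo) / 255.0 or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [c * scale + lo for c in codes]


# Hypothetical weights, chosen only to exercise the functions.
weights = [-1.0, -0.25, 0.0, 0.4, 1.0]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the price of a reconstruction error of at most half a quantization step.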

Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network


Title	Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network
Authors	Kyung-Su Kim, Sae-Young Chung
Abstract	We consider the problem of sparse phase retrieval from Fourier transform magnitudes to recover the $k$-sparse signal vector and its support $\mathcal{T}$. We exploit an extended support estimate $\mathcal{E}$ with size larger than $k$ that satisfies $\mathcal{E} \supseteq \mathcal{T}$ and is obtained by a trained deep neural network (DNN). To make the DNN learnable, it provides $\mathcal{E}$ as the union of equivalent solutions of $\mathcal{T}$ by utilizing modulo Fourier invariances. The set $\mathcal{E}$ can be estimated with short running time via the DNN, and the support $\mathcal{T}$ can be determined from the DNN output rather than from the full index set by applying hard thresholding to $\mathcal{E}$. Thus, the DNN-based extended support estimation improves the reconstruction performance of the signal with a low complexity burden dependent on $k$. Numerical results verify that the proposed scheme has superior performance with lower complexity compared to local search-based greedy sparse phase retrieval and a state-of-the-art variant of the Fienup method.
Tasks
Published	2019-04-03
URL	https://arxiv.org/abs/1904.01821v3
PDF	https://arxiv.org/pdf/1904.01821v3.pdf
PWC	https://paperswithcode.com/paper/fourier-phase-retrieval-with-extended-support
Repo
Framework
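
The abstract says the support is recovered by hard thresholding restricted to the extended support estimate rather than the full index set. The sketch below shows one plausible reading of that step: keep the $k$ largest-magnitude entries among the indices in $\mathcal{E}$. The function name, the signal estimate, and the index set are hypothetical, and the paper's actual procedure may differ.

```python
def hard_threshold_support(estimate, extended_support, k):
    """Return the k indices in extended_support with the largest magnitudes."""
    ranked = sorted(extended_support, key=lambda i: abs(estimate[i]), reverse=True)
    return sorted(ranked[:k])


# Hypothetical signal estimate over 8 indices and a hypothetical extended support E.
estimate = [0.0, 2.1, 0.05, -1.7, 0.0, 0.3, 0.0, 0.9]
extended_support = [1, 2, 3, 5, 7]

# Only the indices in E are searched, not all 8 positions.
support = hard_threshold_support(estimate, extended_support, k=3)
```

Because the search runs over $\mathcal{E}$ only, its cost depends on the size of $\mathcal{E}$ rather than on the signal length.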

Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues


Title	Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues
Authors	Natalia Neverova, James Thewlis, Rıza Alp Güler, Iasonas Kokkinos, Andrea Vedaldi
Abstract	DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates. This power, however, comes at a greatly increased annotation time, as supervising the model requires manually labeling hundreds of points per pose instance. In this work, we thus seek methods to significantly slim down the DensePose annotations, proposing more efficient data collection strategies. In particular, we demonstrate that if annotations are collected in video frames, their efficacy can be multiplied for free by using motion cues. To explore this idea, we introduce DensePose-Track, a dataset of videos where selected frames are annotated in the traditional DensePose manner. Then, building on geometric properties of the DensePose mapping, we use the video dynamics to propagate ground-truth annotations in time as well as to learn from Siamese equivariance constraints. Having performed an exhaustive empirical evaluation of various data annotation and learning strategies, we demonstrate that doing so can deliver significantly improved pose estimation results over strong baselines. However, despite what is suggested by some recent works, we show that merely synthesizing motion patterns by applying geometric transformations to isolated frames is significantly less effective, and that motion cues help much more when they are extracted from videos.
Tasks	Pose Estimation
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05706v1
PDF	https://arxiv.org/pdf/1906.05706v1.pdf
PWC	https://paperswithcode.com/paper/slim-densepose-thrifty-learning-from-sparse-1
Repo
Framework

ThumbNet: One Thumbnail Image Contains All You Need for Recognition


Title	ThumbNet: One Thumbnail Image Contains All You Need for Recognition
Authors	Chen Zhao, Bernard Ghanem
Abstract	Although deep convolutional neural networks (CNNs) have achieved great success in the computer vision community, their real-world application is still impeded by their voracious demand for computational resources. Current works mostly seek to compress the network by reducing its parameters or parameter-incurred computation, neglecting the influence of the input image on the system complexity. Based on the fact that input images of a CNN contain much redundant spatial content, we propose in this paper an efficient and unified framework, dubbed ThumbNet, to simultaneously accelerate and compress CNN models by enabling them to infer on one thumbnail image. We provide three effective strategies to train ThumbNet. In doing so, ThumbNet learns an inference network that performs equally well on small images as the original-input network on large images. With ThumbNet, not only do we obtain the thumbnail-input inference network that can drastically reduce computation and memory requirements, but we also obtain an image downscaler that can generate thumbnail images for generic classification tasks. Extensive experiments show the effectiveness of ThumbNet, and demonstrate that the thumbnail-input inference network learned by ThumbNet can adequately retain the accuracy of the original-input network even when the input images are downscaled 16 times.
Tasks
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05034v1
PDF	http://arxiv.org/pdf/1904.05034v1.pdf
PWC	https://paperswithcode.com/paper/thumbnet-one-thumbnail-image-contains-all-you
Repo
Framework

Disentangling Redundancy for Multi-Task Pruning


Title	Disentangling Redundancy for Multi-Task Pruning
Authors	Xiaoxi He, Dawei Gao, Zimu Zhou, Yongxin Tong, Lothar Thiele
Abstract	Can prior network pruning strategies eliminate redundancy in multiple correlated pre-trained deep neural networks? The answer seems to be positive if multiple networks are first combined and then pruned. However, we argue that an arbitrarily combined network may lead to sub-optimal pruning performance because its intra- and inter-redundancy may not be minimised at the same time while retaining the inference accuracy in each task. In this paper, we define and analyse the redundancy in multi-task networks from an information theoretic perspective, and identify challenges for existing pruning methods to function effectively for multi-task pruning. We propose Redundancy-Disentangled Networks (RDNets), which decouple intra- and inter-redundancy such that all redundancy can be suppressed via previous network pruning schemes. A pruned RDNet also ensures minimal computation in any subset of tasks, a desirable feature for selective task execution. Moreover, a heuristic is devised to construct an RDNet from multiple pre-trained networks. Experiments on CelebA show that the same pruning method on an RDNet achieves at least 1.8x lower memory usage and 1.4x lower computation cost than on a multi-task network constructed by the state-of-the-art network merging scheme.
Tasks	Network Pruning
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09676v1
PDF	https://arxiv.org/pdf/1905.09676v1.pdf
PWC	https://paperswithcode.com/paper/disentangling-redundancy-for-multi-task
Repo
Framework

Lsh-sampling Breaks the Computation Chicken-and-egg Loop in Adaptive Stochastic Gradient Estimation


Title	Lsh-sampling Breaks the Computation Chicken-and-egg Loop in Adaptive Stochastic Gradient Estimation
Authors	Beidi Chen, Yingchen Xu, Anshumali Shrivastava
Abstract	Stochastic Gradient Descent or SGD is the most popular optimization algorithm for large-scale problems. SGD estimates the gradient by uniform sampling with sample size one. There have been several other works that suggest faster epoch-wise convergence by using weighted non-uniform sampling for better gradient estimates. Unfortunately, the per-iteration cost of maintaining this adaptive distribution for gradient estimation is more than calculating the full gradient itself, which we call the chicken-and-the-egg loop. As a result, the false impression of faster convergence in iterations, in reality, leads to slower convergence in time. In this paper, we break this barrier by providing the first demonstration of a scheme, Locality sensitive hashing (LSH) sampled Stochastic Gradient Descent (LGD), which leads to superior gradient estimation while keeping the sampling cost per iteration similar to that of uniform sampling. Such an algorithm is possible due to the sampling view of LSH, which came to light recently. As a consequence of superior and fast estimation, we reduce the running time of all existing gradient descent algorithms that rely on gradient estimates, including Adam, AdaGrad, etc. We demonstrate the effectiveness of our proposal with experiments on linear models as well as the non-linear BERT, which is a recent popular deep learning based language representation model.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14162v1
PDF	https://arxiv.org/pdf/1910.14162v1.pdf
PWC	https://paperswithcode.com/paper/lsh-sampling-breaks-the-computation-chicken
Repo
Framework
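
To make the chicken-and-egg loop concrete, the sketch below shows plain importance-weighted gradient sampling for a one-parameter least-squares model. It does not implement LSH sampling or LGD. The data, the model, and the function names are hypothetical. Note that building the sampling distribution here requires every per-example gradient, which is exactly the cost the abstract says an adaptive scheme must avoid.

```python
import random


def per_example_grads(w, xs, ys):
    """Gradients of 0.5 * (w * x - y) ** 2 with respect to w, one per example."""
    return [(w * x - y) * x for x, y in zip(xs, ys)]


def sampled_gradient(grads, probs, rng):
    """Draw one example with probability probs[i] and reweight it to stay unbiased."""
    n = len(grads)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return grads[i] / (n * probs[i])


# Hypothetical toy data and current parameter value.
xs = [0.5, -1.0, 2.0, 3.0]
ys = [1.0, 0.0, 1.5, 2.0]
grads = per_example_grads(0.2, xs, ys)
full_gradient = sum(grads) / len(grads)

# Non-uniform distribution proportional to gradient magnitude (small floor added).
# Computing it touches all examples: the chicken-and-egg cost.
mags = [abs(g) + 1e-3 for g in grads]
probs = [m / sum(mags) for m in mags]

# Exact expectation of the one-sample estimator under probs.
expected_estimate = sum(p * g / (len(grads) * p) for p, g in zip(probs, grads))
one_draw = sampled_gradient(grads, probs, random.Random(0))
```

The expectation of the reweighted estimator equals the full gradient for any strictly positive `probs`, so the choice of distribution changes only the variance and the cost of sampling.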

Unsupervised Deep Learning Algorithm for PDE-based Forward and Inverse Problems


Title	Unsupervised Deep Learning Algorithm for PDE-based Forward and Inverse Problems
Authors	Leah Bar, Nir Sochen
Abstract	We propose a neural network-based algorithm for solving forward and inverse problems for partial differential equations in an unsupervised fashion. The solution is approximated by a deep neural network which is the minimizer of a cost function, and satisfies the PDE, boundary conditions, and additional regularizations. The method is mesh free and can be easily applied to an arbitrary regular domain. We focus on a 2D second-order elliptic system with non-constant coefficients, with application to Electrical Impedance Tomography.
Tasks
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05417v1
PDF	http://arxiv.org/pdf/1904.05417v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-deep-learning-algorithm-for-pde
Repo
Framework

Variational Regret Bounds for Reinforcement Learning


Title	Variational Regret Bounds for Reinforcement Learning
Authors	Pratik Gajane, Ronald Ortner, Peter Auer
Abstract	We consider undiscounted reinforcement learning in Markov decision processes (MDPs) where both the reward functions and the state-transition probabilities may vary (gradually or abruptly) over time. For this problem setting, we propose an algorithm and provide performance guarantees for the regret evaluated against the optimal non-stationary policy. The upper bound on the regret is given in terms of the total variation in the MDP. This is the first variational regret bound for the general reinforcement learning setting.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05857v3
PDF	https://arxiv.org/pdf/1905.05857v3.pdf
PWC	https://paperswithcode.com/paper/variational-regret-bounds-for-reinforcement
Repo
Framework

The Liar’s Walk: Detecting Deception with Gait and Gesture


Title	The Liar’s Walk: Detecting Deception with Gait and Gesture
Authors	Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha
Abstract	We present a data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures. We conducted an elaborate user study, where we recorded many participants performing tasks involving deceptive walking. We extract the participants’ walking gaits as series of 3D poses. We annotate various gestures performed by participants during their tasks. Based on the gait and gesture data, we train an LSTM-based deep neural network to obtain deep features. Finally, we use a combination of psychology-based gait, gesture, and deep features to detect deceptive walking with an accuracy of 88.41%. This is an improvement of 10.6% over handcrafted gait and gesture features and an improvement of 4.7% and 9.2% over classifiers based on the state-of-the-art emotion and action classification algorithms, respectively. Additionally, we present a novel dataset, DeceptiveWalk, that contains gaits and gestures with their associated deception labels. To the best of our knowledge, ours is the first algorithm to detect deceptive behavior using non-verbal cues of gait and gesture.
Tasks	Action Classification
Published	2019-12-14
URL	https://arxiv.org/abs/1912.06874v3
PDF	https://arxiv.org/pdf/1912.06874v3.pdf
PWC	https://paperswithcode.com/paper/the-liars-walk-detecting-deception-with-gait
Repo
Framework

Deep Green Function Convolution for Improving Saliency in Convolutional Neural Networks


Title	Deep Green Function Convolution for Improving Saliency in Convolutional Neural Networks
Authors	Dominique Beaini, Sofiane Achiche, Alexandre Duperré, Maxime Raison
Abstract	Current saliency methods need to learn large-scale regional features using small convolutional kernels, which is not possible with a simple feed-forward network. Some methods solve this problem by using segmentation into superpixels while others downscale the image through the network and rescale it back to its original size. The objective of this paper is to show that saliency convolutional neural networks (CNN) can be improved by using a Green’s function convolution (GFC) to extrapolate edge features into salient regions. The GFC acts as a gradient integrator, allowing saliency features to be produced by filling thin edges directly inside the CNN. Hence, we propose the gradient integration and sum (GIS) layer that combines the edge features with the saliency features. Using the HED and DSS architectures, we demonstrated that adding a GIS layer near the network’s output reduces the sensitivity to parameter initialization, reduces overfitting, and improves the repeatability of the training. By simply adding a GIS layer to the state-of-the-art DSS model, there is an absolute increase of 1.6% for the F-measure on the DUT-OMRON dataset, with only 10ms of additional computation time. The GIS layer further allows the network to perform significantly better in the case of highly noisy images or low-brightness images. In fact, we observed an F-measure improvement of 5.2% when noise was added to the dataset and 2.8% when the brightness was reduced. Since the GIS layer is model agnostic, it can be implemented into different fully convolutional networks. A major contribution of the current work is the first implementation of Green’s function convolution inside a neural network, which allows the network to operate in the feature domain and in the gradient domain at the same time, thus improving the regional representation via edge filling.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08331v2
PDF	https://arxiv.org/pdf/1908.08331v2.pdf
PWC	https://paperswithcode.com/paper/deep-green-function-convolution-for-improving
Repo
Framework

Three-Stage Subspace Clustering Framework with Graph-Based Transformation and Optimization


Title	Three-Stage Subspace Clustering Framework with Graph-Based Transformation and Optimization
Authors	Shuai Yang, Wenqi Zhu, Yuesheng Zhu
Abstract	Subspace clustering (SC) refers to the problem of clustering high-dimensional data into a union of low-dimensional subspaces. Based on spectral clustering, state-of-the-art approaches solve the SC problem within a two-stage framework. In the first stage, data representation techniques are applied to draw an affinity matrix from the original data. In the second stage, spectral clustering is directly applied to the affinity matrix so that data can be grouped into different subspaces. However, the affinity matrix obtained in the first stage usually fails to reveal the authentic relationship between data points, which leads to inaccurate clustering results. In this paper, we propose a universal Three-Stage Subspace Clustering framework (3S-SC). Graph-Based Transformation and Optimization (GBTO) is added between data representation and spectral clustering. The affinity matrix is obtained in the first stage, then it goes through the second stage, where the proposed GBTO is applied to generate a reconstructed affinity matrix with more authentic similarity between data points. Spectral clustering is applied after GBTO, which is the third stage. We verify our 3S-SC framework with GBTO through theoretical analysis. Experiments on both synthetic data and the real-world data sets of handwritten digits and human faces demonstrate the universality of the proposed 3S-SC framework in improving the connectivity and accuracy of SC methods based on $\ell_0$, $\ell_1$, $\ell_2$ or nuclear norm regularization.
Tasks
Published	2019-05-02
URL	http://arxiv.org/abs/1905.01145v1
PDF	http://arxiv.org/pdf/1905.01145v1.pdf
PWC	https://paperswithcode.com/paper/three-stage-subspace-clustering-framework
Repo
Framework

Mutual Clustering on Comparative Texts via Heterogeneous Information Networks


Title	Mutual Clustering on Comparative Texts via Heterogeneous Information Networks
Authors	Jianping Cao, Senzhang Wang, Danyan Wen, Zhaohui Peng, Philip S. Yu, Fei-yue Wang
Abstract	Currently, many intelligence systems contain texts from multiple sources, e.g., bulletin board system (BBS) posts, tweets and news. These texts can be “comparative” since they may be semantically correlated and thus provide us with different perspectives toward the same topics or events. To better organize the multi-sourced texts and obtain more comprehensive knowledge, we propose to study the novel problem of Mutual Clustering on Comparative Texts (MCCT), which aims to cluster the comparative texts simultaneously and collaboratively. The MCCT problem is difficult to address because 1) comparative texts usually present different data formats and structures and thus they are hard to organize, and 2) there is no effective method to connect the semantically correlated comparative texts to facilitate clustering them in a unified way. To this end, in this paper we propose a Heterogeneous Information Network-based Text clustering framework, HINT. HINT first models multi-sourced texts (e.g. news and tweets) as heterogeneous information networks by introducing the shared “anchor texts” to connect the comparative texts. Next, two similarity matrices based on HINT as well as a transition matrix for cross-text-source knowledge transfer are constructed. Clustering of the comparative texts is then conducted by utilizing the constructed matrices. Finally, a mutual clustering algorithm is also proposed to further unify the separate clustering results of the comparative texts by introducing a clustering consistency constraint. We conduct extensive experiments on three tweets-news datasets, and the results demonstrate the effectiveness and robustness of the proposed method in addressing the MCCT problem.
Tasks	Text Clustering, Transfer Learning
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03762v1
PDF	http://arxiv.org/pdf/1903.03762v1.pdf
PWC	https://paperswithcode.com/paper/mutual-clustering-on-comparative-texts-via
Repo
Framework

Consistent Regression using Data-Dependent Coverings


Title	Consistent Regression using Data-Dependent Coverings
Authors	Vincent Margot, Jean-Patrick Baudry, Frédéric Guilloux, Olivier Wintenberger
Abstract	In this paper, we introduce a novel method to generate interpretable regression function estimators. The idea is based on so-called data-dependent coverings. The aim is to extract from the data a covering of the feature space instead of a partition. The estimator predicts the empirical conditional expectation over the cells of the partitions generated from the coverings. Thus, such an estimator has the same form as those issued from data-dependent partitioning algorithms. We give sufficient conditions to ensure the consistency, avoiding the sufficient condition of shrinkage of the cells that appears in the former literature. In doing so, we reduce the number of covering elements. We show that such coverings are interpretable and each element of the covering is tagged as significant or insignificant. The proof of the consistency is based on a control of the error of the empirical estimation of conditional expectations, which is interesting on its own.
Tasks
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02306v3
PDF	https://arxiv.org/pdf/1907.02306v3.pdf
PWC	https://paperswithcode.com/paper/consistent-regression-using-data-dependent
Repo
Framework

Deep Learning-Based Strategy for Macromolecules Classification with Imbalanced Data from Cellular Electron Cryotomography


Title	Deep Learning-Based Strategy for Macromolecules Classification with Imbalanced Data from Cellular Electron Cryotomography
Authors	Ziqian Luo, Xiangrui Zeng, Zhipeng Bao, Min Xu
Abstract	A deep learning model trained on imbalanced data may not work satisfactorily, since it can be dominated by the major classes and thus may ignore the classes with small amounts of data. In this paper, we apply deep learning based imbalanced data classification for the first time to cellular macromolecular complexes captured by Cryo-electron tomography (Cryo-ET). We adopt a range of strategies to cope with imbalanced data, including data sampling, bagging, boosting, and a Genetic Programming based method. In particular, inspired by the Inception 3D network, we propose a multi-path CNN model combining focal loss and mixup on the Cryo-ET dataset to expand the dataset, where each path performs best on one type of data and the network learns the combinations of the paths to improve the classification performance. In addition, extensive experiments have been conducted to show that our proposed method is flexible enough to cope with different numbers of classes by adjusting the number of paths in our multi-path model. To our knowledge, this work is the first application of deep learning methods for dealing with imbalanced data to the internal tissue classification of cell macromolecular complexes, which opens up a new path for cell classification in the field of computational biology.
Tasks	Electron Tomography
Published	2019-08-27
URL	https://arxiv.org/abs/1908.09993v1
PDF	https://arxiv.org/pdf/1908.09993v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-strategy-for
Repo
Framework
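
The abstract combines focal loss and mixup to cope with class imbalance. The sketch below writes out the commonly used forms of both: focal loss as $-(1 - p_t)^\gamma \log p_t$ for the probability $p_t$ assigned to the true class, and mixup as a convex combination of two examples and their labels. The optional class-weighting factor of focal loss is omitted, and the mixing coefficient is fixed here although it is usually drawn at random. The inputs are hypothetical, and the paper's multi-path CNN is not reproduced.

```python
import math


def focal_loss(p_true, gamma=2.0):
    """Focal loss for the probability assigned to the true class."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def mixup(x1, y1, x2, y2, lam):
    """Convex combination of two examples and of their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# A confidently classified example contributes far less loss than a hard one.
easy, hard = focal_loss(0.9), focal_loss(0.1)

# Hypothetical two-feature examples from two classes, mixed with lam = 0.75.
x_mix, y_mix = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 1.0], lam=0.75)
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy. Larger values of `gamma` shrink the contribution of well-classified examples, which in an imbalanced dataset tend to come from the majority classes.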

Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients


Title	Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients
Authors	Jun Sun, Tianyi Chen, Georgios B. Giannakis, Zaiyue Yang
Abstract	The present paper develops a novel aggregated gradient approach for distributed machine learning that adaptively compresses the gradient communication. The key idea is to first quantize the computed gradients, and then skip less informative quantized gradient communications by reusing outdated gradients. Quantizing and skipping result in ‘lazy’ worker-server communications, which justifies the term Lazily Aggregated Quantized gradient that is henceforth abbreviated as LAQ. Our LAQ can provably attain the same linear convergence rate as the gradient descent in the strongly convex case, while effecting major savings in the communication overhead both in transmitted bits and in communication rounds. Empirically, experiments with real data corroborate a significant communication reduction compared to existing gradient- and stochastic gradient-based algorithms.
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07588v1
PDF	https://arxiv.org/pdf/1909.07588v1.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-distributed-learning-1
Repo
Framework
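
The quantize-then-skip idea from the abstract can be sketched from a single worker's point of view as follows. This is a simplified illustration, not the LAQ algorithm: the uniform grid quantizer, the fixed squared-distance threshold used as the skipping rule, and all names and numbers are assumptions, and the paper's actual criterion is not given in the abstract.

```python
def quantize(vec, levels=16):
    """Uniformly quantize each entry onto a grid spanning [-r, r]."""
    r = max(abs(v) for v in vec) or 1.0
    step = 2.0 * r / (levels - 1)
    return [round((v + r) / step) * step - r for v in vec]


def maybe_upload(grad, last_sent, threshold):
    """Return (vector the server uses, whether an upload happened)."""
    q = quantize(grad)
    change = sum((a - b) ** 2 for a, b in zip(q, last_sent))
    if change <= threshold:
        # Too little new information: the server reuses the outdated gradient.
        return last_sent, False
    return q, True


# Hypothetical gradient sequence on one worker.
last_sent = [0.0, 0.0]
log = []
for grad in ([1.0, -0.5], [1.01, -0.5], [0.2, 0.4]):
    last_sent, uploaded = maybe_upload(grad, last_sent, threshold=0.05)
    log.append(uploaded)
```

In this run the second gradient barely differs from the first after quantization, so its upload is skipped and the server keeps using the earlier quantized vector. That saves a communication round, and the rounds that do occur send quantized entries instead of full-precision floats.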