Paper Group ANR 871
Deep Learning on Mobile Devices - A Review. Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network. Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues. ThumbNet: One Thumbnail Image Contains All You Need for Recognition. Disentangling Redundancy for Multi-Task Pruning. Lsh-sampling Breaks the Computat …
Deep Learning on Mobile Devices - A Review
Title | Deep Learning on Mobile Devices - A Review |
Authors | Yunbin Deng |
Abstract | Recent breakthroughs in deep learning and artificial intelligence technologies have enabled numerous mobile applications. While traditional computation paradigms rely on mobile sensing and cloud computing, deep learning implemented on mobile devices provides several advantages. These advantages include low communication bandwidth, small cloud computing resource cost, quick response time, and improved data privacy. Research and development of deep learning on mobile and embedded devices has recently attracted much attention. This paper provides a timely review of this fast-paced field to give the researcher, engineer, practitioner, and graduate student a quick grasp on the recent advancements of deep learning on mobile devices. In this paper, we discuss hardware architectures for mobile deep learning, including Field Programmable Gate Arrays, Application Specific Integrated Circuit, and recent mobile Graphic Processing Units. We present Size, Weight, Area and Power considerations and their relation to algorithm optimizations, such as quantization, pruning, compression, and approximations that simplify computation while retaining performance accuracy. We cover existing systems and give a state-of-the-industry review of TensorFlow, MXNet, Mobile AI Compute Engine, and Paddle-mobile deep learning platform. We discuss resources for mobile deep learning practitioners, including tools, libraries, models, and performance benchmarks. We present applications of various mobile sensing modalities to industries, ranging from robotics, healthcare and multi-media, biometrics to autonomous drive and defense. We address the key deep learning challenges to overcome, including low quality data, and small training/adaptation data sets. In addition, the review provides numerous citations and links to existing code bases implementing various technologies. |
Tasks | Quantization |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1904.09274v1 |
http://arxiv.org/pdf/1904.09274v1.pdf | |
PWC | https://paperswithcode.com/paper/190409274 |
Repo | |
Framework | |
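The quantization mentioned in the abstract above can be illustrated with a minimal sketch. It assumes plain affine (min/max) uniform quantization of a list of floats. The function names `quantize_uniform` and `dequantize` are hypothetical and are not taken from the review or from any of the listed frameworks.

```python
def quantize_uniform(weights, num_bits=8):
    """Affine uniform quantization of a list of floats (illustrative only)."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant tensor, where the value range would be zero.
    scale = (w_max - w_min) / qmax or 1.0
    # Map each weight to the nearest integer code in [0, qmax].
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Recover approximate float weights from integer codes."""
    return [c * scale + w_min for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, w_min = quantize_uniform(weights)
restored = dequantize(codes, scale, w_min)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy that is traded for storing integers instead of floats.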
Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network
Title | Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network |
Authors | Kyung-Su Kim, Sae-Young Chung |
Abstract | We consider the problem of sparse phase retrieval from Fourier transform magnitudes to recover the $k$-sparse signal vector and its support $\mathcal{T}$. We exploit extended support estimate $\mathcal{E}$ with size larger than $k$ satisfying $\mathcal{E} \supseteq \mathcal{T}$ and obtained by a trained deep neural network (DNN). To make the DNN learnable, it provides $\mathcal{E}$ as the union of equivalent solutions of $\mathcal{T}$ by utilizing modulo Fourier invariances. Set $\mathcal{E}$ can be estimated with short running time via the DNN, and support $\mathcal{T}$ can be determined from the DNN output rather than from the full index set by applying hard thresholding to $\mathcal{E}$. Thus, the DNN-based extended support estimation improves the reconstruction performance of the signal with a low complexity burden dependent on $k$. Numerical results verify that the proposed scheme has a superior performance with lower complexity compared to local search-based greedy sparse phase retrieval and a state-of-the-art variant of the Fienup method. |
Tasks | |
Published | 2019-04-03 |
URL | https://arxiv.org/abs/1904.01821v3 |
https://arxiv.org/pdf/1904.01821v3.pdf | |
PWC | https://paperswithcode.com/paper/fourier-phase-retrieval-with-extended-support |
Repo | |
Framework | |
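The hard-thresholding step described in the abstract above, which narrows the extended support $\mathcal{E}$ down to a size-$k$ support estimate, might look like the following sketch. It assumes that some signal estimate (here called `scores`, a hypothetical name) is available on the indices in $\mathcal{E}$. The DNN and the reconstruction procedure themselves are not shown.

```python
def hard_threshold_support(scores, extended_support, k):
    """Keep the k indices of extended_support with the largest |score|."""
    ranked = sorted(extended_support, key=lambda i: abs(scores[i]), reverse=True)
    return sorted(ranked[:k])


# Hypothetical signal estimate over 6 positions and an extended support of size 4.
scores = [0.1, -2.0, 0.0, 0.7, 3.0, -0.2]
extended_support = [1, 3, 4, 5]
support = hard_threshold_support(scores, extended_support, k=2)
```

Because the search runs over $\mathcal{E}$ rather than the full index set, the cost of this step depends on the size of $\mathcal{E}$ and not on the signal length.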
Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues
Title | Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues |
Authors | Natalia Neverova, James Thewlis, Rıza Alp Güler, Iasonas Kokkinos, Andrea Vedaldi |
Abstract | DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates. This power, however, comes at a greatly increased annotation time, as supervising the model requires manually labeling hundreds of points per pose instance. In this work, we thus seek methods to significantly slim down the DensePose annotations, proposing more efficient data collection strategies. In particular, we demonstrate that if annotations are collected in video frames, their efficacy can be multiplied for free by using motion cues. To explore this idea, we introduce DensePose-Track, a dataset of videos where selected frames are annotated in the traditional DensePose manner. Then, building on geometric properties of the DensePose mapping, we use the video dynamics to propagate ground-truth annotations in time as well as to learn from Siamese equivariance constraints. Having performed an exhaustive empirical evaluation of various data annotation and learning strategies, we demonstrate that doing so can deliver significantly improved pose estimation results over strong baselines. However, despite what is suggested by some recent works, we show that merely synthesizing motion patterns by applying geometric transformations to isolated frames is significantly less effective, and that motion cues help much more when they are extracted from videos. |
Tasks | Pose Estimation |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05706v1 |
https://arxiv.org/pdf/1906.05706v1.pdf | |
PWC | https://paperswithcode.com/paper/slim-densepose-thrifty-learning-from-sparse-1 |
Repo | |
Framework | |
ThumbNet: One Thumbnail Image Contains All You Need for Recognition
Title | ThumbNet: One Thumbnail Image Contains All You Need for Recognition |
Authors | Chen Zhao, Bernard Ghanem |
Abstract | Although deep convolutional neural networks (CNNs) have achieved great success in the computer vision community, their real-world application is still impeded by their voracious demand for computational resources. Current works mostly seek to compress the network by reducing its parameters or parameter-incurred computation, neglecting the influence of the input image on the system complexity. Based on the fact that input images of a CNN contain much redundant spatial content, we propose in this paper an efficient and unified framework, dubbed ThumbNet, to simultaneously accelerate and compress CNN models by enabling them to infer on one thumbnail image. We provide three effective strategies to train ThumbNet. In doing so, ThumbNet learns an inference network that performs equally well on small images as the original-input network on large images. With ThumbNet, not only do we obtain the thumbnail-input inference network that can drastically reduce computation and memory requirements, but we also obtain an image downscaler that can generate thumbnail images for generic classification tasks. Extensive experiments show the effectiveness of ThumbNet, and demonstrate that the thumbnail-input inference network learned by ThumbNet can adequately retain the accuracy of the original-input network even when the input images are downscaled 16 times. |
Tasks | |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05034v1 |
http://arxiv.org/pdf/1904.05034v1.pdf | |
PWC | https://paperswithcode.com/paper/thumbnet-one-thumbnail-image-contains-all-you |
Repo | |
Framework | |
Disentangling Redundancy for Multi-Task Pruning
Title | Disentangling Redundancy for Multi-Task Pruning |
Authors | Xiaoxi He, Dawei Gao, Zimu Zhou, Yongxin Tong, Lothar Thiele |
Abstract | Can prior network pruning strategies eliminate redundancy in multiple correlated pre-trained deep neural networks? The answer seems positive if multiple networks are first combined and then pruned. However, we argue that an arbitrarily combined network may lead to sub-optimal pruning performance because its intra- and inter-redundancy may not be minimised at the same time while retaining the inference accuracy in each task. In this paper, we define and analyse the redundancy in multi-task networks from an information theoretic perspective, and identify challenges for existing pruning methods to function effectively for multi-task pruning. We propose Redundancy-Disentangled Networks (RDNets), which decouple intra- and inter-redundancy such that all redundancy can be suppressed via previous network pruning schemes. A pruned RDNet also ensures minimal computation in any subset of tasks, a desirable feature for selective task execution. Moreover, a heuristic is devised to construct an RDNet from multiple pre-trained networks. Experiments on CelebA show that the same pruning method on an RDNet achieves at least 1.8x lower memory usage and 1.4x lower computation cost than on a multi-task network constructed by the state-of-the-art network merging scheme. |
Tasks | Network Pruning |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09676v1 |
https://arxiv.org/pdf/1905.09676v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-redundancy-for-multi-task |
Repo | |
Framework | |
Lsh-sampling Breaks the Computation Chicken-and-egg Loop in Adaptive Stochastic Gradient Estimation
Title | Lsh-sampling Breaks the Computation Chicken-and-egg Loop in Adaptive Stochastic Gradient Estimation |
Authors | Beidi Chen, Yingchen Xu, Anshumali Shrivastava |
Abstract | Stochastic Gradient Descent or SGD is the most popular optimization algorithm for large-scale problems. SGD estimates the gradient by uniform sampling with sample size one. There have been several other works that suggest faster epoch-wise convergence by using weighted non-uniform sampling for better gradient estimates. Unfortunately, the per-iteration cost of maintaining this adaptive distribution for gradient estimation is more than calculating the full gradient itself, which we call the chicken-and-the-egg loop. As a result, the false impression of faster convergence in iterations, in reality, leads to slower convergence in time. In this paper, we break this barrier by providing the first demonstration of a scheme, Locality sensitive hashing (LSH) sampled Stochastic Gradient Descent (LGD), which leads to superior gradient estimation while keeping the sampling cost per iteration similar to that of uniform sampling. Such an algorithm is possible due to the sampling view of LSH, which came to light recently. As a consequence of superior and fast estimation, we reduce the running time of all existing gradient descent algorithms that rely on gradient estimates, including Adam, Ada-grad, etc. We demonstrate the effectiveness of our proposal with experiments on linear models as well as the non-linear BERT, which is a recent popular deep learning based language representation model. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14162v1 |
https://arxiv.org/pdf/1910.14162v1.pdf | |
PWC | https://paperswithcode.com/paper/lsh-sampling-breaks-the-computation-chicken |
Repo | |
Framework | |
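The trade-off described in the abstract above can be made concrete with a small sketch. It assumes a one-parameter least-squares model and compares uniform sampling with an adaptive distribution proportional to the per-sample gradient magnitude. This is generic importance sampling, not the LSH-based sampler of the paper, and all function names and the toy data are illustrative.

```python
def per_sample_grads(w, xs, ys):
    """Gradient of 0.5 * (w * x - y)**2 with respect to w, for each sample."""
    return [(w * x - y) * x for x, y in zip(xs, ys)]


def estimator_moments(grads, probs):
    """Exact mean and variance of the one-sample estimator g_i / (n * p_i)."""
    n = len(grads)
    # Samples with zero probability are never drawn, so they are skipped.
    pairs = [(p, g / (n * p)) for g, p in zip(grads, probs) if p > 0]
    mean = sum(p * v for p, v in pairs)
    var = sum(p * (v - mean) ** 2 for p, v in pairs)
    return mean, var


xs, ys, w = [1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 9.0], 1.0
grads = per_sample_grads(w, xs, ys)
full_grad = sum(grads) / len(grads)

uniform = [1.0 / len(grads)] * len(grads)
total = sum(abs(g) for g in grads)
adaptive = [abs(g) / total for g in grads]  # needs every per-sample gradient

mean_u, var_u = estimator_moments(grads, uniform)
mean_a, var_a = estimator_moments(grads, adaptive)
```

Both estimators are unbiased for the full gradient, and the adaptive one has lower variance. Building `adaptive` exactly, however, touches every sample, which is the per-iteration cost that the abstract calls the chicken-and-the-egg loop.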
Unsupervised Deep Learning Algorithm for PDE-based Forward and Inverse Problems
Title | Unsupervised Deep Learning Algorithm for PDE-based Forward and Inverse Problems |
Authors | Leah Bar, Nir Sochen |
Abstract | We propose a neural network-based algorithm for solving forward and inverse problems for partial differential equations in unsupervised fashion. The solution is approximated by a deep neural network which is the minimizer of a cost function, and satisfies the PDE, boundary conditions, and additional regularizations. The method is mesh free and can be easily applied to an arbitrary regular domain. We focus on 2D second order elliptical system with non-constant coefficients, with application to Electrical Impedance Tomography. |
Tasks | |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05417v1 |
http://arxiv.org/pdf/1904.05417v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-deep-learning-algorithm-for-pde |
Repo | |
Framework | |
Variational Regret Bounds for Reinforcement Learning
Title | Variational Regret Bounds for Reinforcement Learning |
Authors | Pratik Gajane, Ronald Ortner, Peter Auer |
Abstract | We consider undiscounted reinforcement learning in Markov decision processes (MDPs) where both the reward functions and the state-transition probabilities may vary (gradually or abruptly) over time. For this problem setting, we propose an algorithm and provide performance guarantees for the regret evaluated against the optimal non-stationary policy. The upper bound on the regret is given in terms of the total variation in the MDP. This is the first variational regret bound for the general reinforcement learning setting. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05857v3 |
https://arxiv.org/pdf/1905.05857v3.pdf | |
PWC | https://paperswithcode.com/paper/variational-regret-bounds-for-reinforcement |
Repo | |
Framework | |
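As a rough illustration of the "total variation in the MDP" mentioned in the abstract above, the sketch below sums the largest per-step change in mean rewards and in transition distributions. The exact definition used in the paper is not given in the abstract, so this form (maximum over state-action pairs, L1 distance for transitions) is an assumption, and the data layout is hypothetical.

```python
def reward_variation(rewards):
    """Sum over time of the largest change in mean reward of any (state, action)."""
    return sum(
        max(abs(nxt[key] - cur[key]) for key in cur)
        for cur, nxt in zip(rewards, rewards[1:])
    )


def transition_variation(transitions):
    """Sum over time of the largest L1 change in any next-state distribution."""
    return sum(
        max(sum(abs(q - p) for p, q in zip(cur[key], nxt[key])) for key in cur)
        for cur, nxt in zip(transitions, transitions[1:])
    )


# rewards[t][(state, action)] is the mean reward at time step t.
rewards = [
    {("s0", "a"): 0.5, ("s1", "a"): 0.2},
    {("s0", "a"): 0.5, ("s1", "a"): 0.6},
    {("s0", "a"): 0.1, ("s1", "a"): 0.6},
]
# transitions[t][(state, action)] is the distribution over next states at time t.
transitions = [
    {("s0", "a"): [1.0, 0.0]},
    {("s0", "a"): [0.5, 0.5]},
]
v_r = reward_variation(rewards)
v_p = transition_variation(transitions)
```

A stationary MDP has zero variation under this measure, while gradual drift and abrupt changes both add to the same total.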
The Liar’s Walk: Detecting Deception with Gait and Gesture
Title | The Liar’s Walk: Detecting Deception with Gait and Gesture |
Authors | Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha |
Abstract | We present a data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures. We conducted an elaborate user study, where we recorded many participants performing tasks involving deceptive walking. We extract the participants’ walking gaits as series of 3D poses. We annotate various gestures performed by participants during their tasks. Based on the gait and gesture data, we train an LSTM-based deep neural network to obtain deep features. Finally, we use a combination of psychology-based gait, gesture, and deep features to detect deceptive walking with an accuracy of 88.41%. This is an improvement of 10.6% over handcrafted gait and gesture features and an improvement of 4.7% and 9.2% over classifiers based on the state-of-the-art emotion and action classification algorithms, respectively. Additionally, we present a novel dataset, DeceptiveWalk, that contains gaits and gestures with their associated deception labels. To the best of our knowledge, ours is the first algorithm to detect deceptive behavior using non-verbal cues of gait and gesture. |
Tasks | Action Classification |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06874v3 |
https://arxiv.org/pdf/1912.06874v3.pdf | |
PWC | https://paperswithcode.com/paper/the-liars-walk-detecting-deception-with-gait |
Repo | |
Framework | |
Deep Green Function Convolution for Improving Saliency in Convolutional Neural Networks
Title | Deep Green Function Convolution for Improving Saliency in Convolutional Neural Networks |
Authors | Dominique Beaini, Sofiane Achiche, Alexandre Duperré, Maxime Raison |
Abstract | Current saliency methods need to learn large-scale regional features using small convolutional kernels, which is not possible with a simple feed-forward network. Some methods solve this problem by using segmentation into superpixels while others downscale the image through the network and rescale it back to its original size. The objective of this paper is to show that saliency convolutional neural networks (CNN) can be improved by using a Green’s function convolution (GFC) to extrapolate edge features into salient regions. The GFC acts as a gradient integrator, making it possible to produce saliency features by filling thin edges directly inside the CNN. Hence, we propose the gradient integration and sum (GIS) layer that combines the edge features with the saliency features. Using the HED and DSS architectures, we demonstrated that adding a GIS layer near the network’s output reduces the sensitivity to the parameter initialization, reduces overfitting, and improves the repeatability of the training. By simply adding a GIS layer to the state-of-the-art DSS model, there is an absolute increase of 1.6% for the F-measure on the DUT-OMRON dataset, with only 10ms of additional computation time. The GIS layer further allows the network to perform significantly better in the case of highly noisy images or low-brightness images. In fact, we observed an F-measure improvement of 5.2% when noise was added to the dataset and 2.8% when the brightness was reduced. Since the GIS layer is model agnostic, it can be implemented into different fully convolutional networks. A major contribution of the current work is the first implementation of Green’s function convolution inside a neural network, which allows the network to operate in the feature domain and in the gradient domain at the same time, thus improving the regional representation via edge filling. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08331v2 |
https://arxiv.org/pdf/1908.08331v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-green-function-convolution-for-improving |
Repo | |
Framework | |
Three-Stage Subspace Clustering Framework with Graph-Based Transformation and Optimization
Title | Three-Stage Subspace Clustering Framework with Graph-Based Transformation and Optimization |
Authors | Shuai Yang, Wenqi Zhu, Yuesheng Zhu |
Abstract | Subspace clustering (SC) refers to the problem of clustering high-dimensional data into a union of low-dimensional subspaces. Based on spectral clustering, state-of-the-art approaches solve the SC problem within a two-stage framework. In the first stage, data representation techniques are applied to draw an affinity matrix from the original data. In the second stage, spectral clustering is directly applied to the affinity matrix so that data can be grouped into different subspaces. However, the affinity matrix obtained in the first stage usually fails to reveal the authentic relationship between data points, which leads to inaccurate clustering results. In this paper, we propose a universal Three-Stage Subspace Clustering framework (3S-SC). Graph-Based Transformation and Optimization (GBTO) is added between data representation and spectral clustering. The affinity matrix is obtained in the first stage, then it goes through the second stage, where the proposed GBTO is applied to generate a reconstructed affinity matrix with more authentic similarity between data points. Spectral clustering is applied after GBTO, which is the third stage. We verify our 3S-SC framework with GBTO through theoretical analysis. Experiments on both synthetic data and the real-world data sets of handwritten digits and human faces demonstrate the universality of the proposed 3S-SC framework in improving the connectivity and accuracy of SC methods based on $\ell_0$, $\ell_1$, $\ell_2$ or nuclear norm regularization. |
Tasks | |
Published | 2019-05-02 |
URL | http://arxiv.org/abs/1905.01145v1 |
http://arxiv.org/pdf/1905.01145v1.pdf | |
PWC | https://paperswithcode.com/paper/three-stage-subspace-clustering-framework |
Repo | |
Framework | |
Mutual Clustering on Comparative Texts via Heterogeneous Information Networks
Title | Mutual Clustering on Comparative Texts via Heterogeneous Information Networks |
Authors | Jianping Cao, Senzhang Wang, Danyan Wen, Zhaohui Peng, Philip S. Yu, Fei-yue Wang |
Abstract | Currently, many intelligence systems contain texts from multiple sources, e.g., bulletin board system (BBS) posts, tweets and news. These texts can be "comparative" since they may be semantically correlated and thus provide us with different perspectives toward the same topics or events. To better organize the multi-sourced texts and obtain more comprehensive knowledge, we propose to study the novel problem of Mutual Clustering on Comparative Texts (MCCT), which aims to cluster the comparative texts simultaneously and collaboratively. The MCCT problem is difficult to address because 1) comparative texts usually present different data formats and structures and thus they are hard to organize, and 2) there is no effective method to connect the semantically correlated comparative texts to facilitate clustering them in a unified way. To this aim, in this paper we propose a Heterogeneous Information Network-based Text clustering framework HINT. HINT first models multi-sourced texts (e.g. news and tweets) as heterogeneous information networks by introducing the shared "anchor texts" to connect the comparative texts. Next, two similarity matrices based on HINT as well as a transition matrix for cross-text-source knowledge transfer are constructed. Comparative text clustering is then conducted by utilizing the constructed matrices. Finally, a mutual clustering algorithm is also proposed to further unify the separate clustering results of the comparative texts by introducing a clustering consistency constraint. We conduct extensive experiments on three tweets-news datasets, and the results demonstrate the effectiveness and robustness of the proposed method in addressing the MCCT problem. |
Tasks | Text Clustering, Transfer Learning |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03762v1 |
http://arxiv.org/pdf/1903.03762v1.pdf | |
PWC | https://paperswithcode.com/paper/mutual-clustering-on-comparative-texts-via |
Repo | |
Framework | |
Consistent Regression using Data-Dependent Coverings
Title | Consistent Regression using Data-Dependent Coverings |
Authors | Vincent Margot, Jean-Patrick Baudry, Frédéric Guilloux, Olivier Wintenberger |
Abstract | In this paper, we introduce a novel method to generate interpretable regression function estimators. The idea is based on so-called data-dependent coverings. The aim is to extract from the data a covering of the feature space instead of a partition. The estimator predicts the empirical conditional expectation over the cells of the partitions generated from the coverings. Thus, such an estimator has the same form as those issued from data-dependent partitioning algorithms. We give sufficient conditions to ensure the consistency, avoiding the sufficient condition of shrinkage of the cells that appears in the former literature. Doing so, we reduce the number of covering elements. We show that such coverings are interpretable and each element of the covering is tagged as significant or insignificant. The proof of the consistency is based on a control of the error of the empirical estimation of conditional expectations which is interesting on its own. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02306v3 |
https://arxiv.org/pdf/1907.02306v3.pdf | |
PWC | https://paperswithcode.com/paper/consistent-regression-using-data-dependent |
Repo | |
Framework | |
Deep Learning-Based Strategy for Macromolecules Classification with Imbalanced Data from Cellular Electron Cryotomography
Title | Deep Learning-Based Strategy for Macromolecules Classification with Imbalanced Data from Cellular Electron Cryotomography |
Authors | Ziqian Luo, Xiangrui Zeng, Zhipeng Bao, Min Xu |
Abstract | A deep learning model trained on imbalanced data may not work satisfactorily, since it could be determined by the major classes and thus may ignore the classes with a small amount of data. In this paper, we apply deep learning based imbalanced data classification for the first time to cellular macromolecular complexes captured by Cryo-electron tomography (Cryo-ET). We adopt a range of strategies to cope with imbalanced data, including data sampling, bagging, boosting, Genetic Programming based method and. Particularly, inspired by the Inception 3D network, we propose a multi-path CNN model combining focal loss and mixup on the Cryo-ET dataset to expand the dataset, where each path has its best performance on the corresponding type of data and the network learns the combinations of the paths to improve the classification performance. In addition, extensive experiments have been conducted to show that our proposed method is flexible enough to cope with different numbers of classes by adjusting the number of paths in our multi-path model. To our knowledge, this work is the first application of deep learning methods for dealing with imbalanced data to the internal tissue classification of cell macromolecular complexes, which opens up a new path for cell classification in the field of computational biology. |
Tasks | Electron Tomography |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.09993v1 |
https://arxiv.org/pdf/1908.09993v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-strategy-for |
Repo | |
Framework | |
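The two ingredients named in the abstract above, focal loss and mixup, can be sketched in a few lines. This is a minimal per-example version that follows the commonly used formulas. The class-balancing weight of focal loss is omitted, the mixing coefficient `lam` is passed in rather than sampled, and none of it is taken from the authors' multi-path model.

```python
import math


def focal_loss(p_true, gamma=2.0):
    """Focal loss for one example, given the predicted probability of its true class."""
    # The (1 - p)^gamma factor shrinks the loss of well-classified examples.
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def mixup(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two inputs and their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


easy = focal_loss(0.9)   # confidently correct prediction
hard = focal_loss(0.1)   # badly misclassified example
mixed_x, mixed_y = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy. Larger values of `gamma` shift the training signal toward hard examples, which on imbalanced data are often those from minority classes.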
Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients
Title | Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients |
Authors | Jun Sun, Tianyi Chen, Georgios B. Giannakis, Zaiyue Yang |
Abstract | The present paper develops a novel aggregated gradient approach for distributed machine learning that adaptively compresses the gradient communication. The key idea is to first quantize the computed gradients, and then skip less informative quantized gradient communications by reusing outdated gradients. Quantizing and skipping result in 'lazy' worker-server communications, which justifies the term Lazily Aggregated Quantized gradient that is henceforth abbreviated as LAQ. Our LAQ can provably attain the same linear convergence rate as the gradient descent in the strongly convex case, while effecting major savings in the communication overhead both in transmitted bits as well as in communication rounds. Empirically, experiments with real data corroborate a significant communication reduction compared to existing gradient- and stochastic gradient-based algorithms. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07588v1 |
https://arxiv.org/pdf/1909.07588v1.pdf | |
PWC | https://paperswithcode.com/paper/communication-efficient-distributed-learning-1 |
Repo | |
Framework | |
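The "quantize, then skip" idea from the abstract above can be sketched as follows. This is a deliberately simplified illustration: the quantizer is a fixed-step grid and the skipping rule is a plain threshold on how much the quantized gradient has changed. The actual LAQ quantizer and skipping criterion are not specified in the abstract, and the class and parameter names here are hypothetical.

```python
def quantize(vec, step=0.1):
    """Round every coordinate to the nearest multiple of step."""
    return [round(v / step) * step for v in vec]


class LazyWorker:
    """Worker that uploads a quantized gradient only when it changed enough."""

    def __init__(self, dim, threshold):
        self.last_sent = [0.0] * dim   # what the server currently holds
        self.threshold = threshold
        self.uploads = 0

    def maybe_upload(self, grad):
        q = quantize(grad)
        change = sum((a - b) ** 2 for a, b in zip(q, self.last_sent))
        if change > self.threshold:
            # Informative update: send it and remember what was sent.
            self.last_sent = q
            self.uploads += 1
        # Otherwise the server keeps reusing the outdated gradient.
        return self.last_sent


worker = LazyWorker(dim=2, threshold=0.05)
first = worker.maybe_upload([1.0, -0.5])     # large change, uploaded
second = worker.maybe_upload([1.02, -0.48])  # same grid point, skipped
third = worker.maybe_upload([0.5, -0.5])     # large change, uploaded
```

In this toy run only two of the three rounds cause communication, and each upload carries grid indices rather than full-precision floats, which mirrors the savings in rounds and in transmitted bits that the abstract describes.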