Paper Group ANR 396
Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning
Title | Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning |
Authors | Seul-Ki Yeom, Philipp Seegerer, Sebastian Lapuschkin, Simon Wiedemann, Klaus-Robert Müller, Wojciech Samek |
Abstract | The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the weights of various layers while at the same time aiming to not sacrifice performance. In this paper, we propose a novel criterion for CNN pruning inspired by neural network interpretability: The most relevant elements, i.e. weights or filters, are automatically found using their relevance score in the sense of explainable AI (XAI). By that we for the first time link the two disconnected lines of interpretability and model compression research. We show in particular that our proposed method can efficiently prune transfer-learned CNN models where networks pre-trained on large corpora are adapted to specialized tasks. To this end, the method is evaluated on a broad range of computer vision datasets. Notably, our novel criterion is not only competitive or better compared to state-of-the-art pruning criteria when successive retraining is performed, but clearly outperforms these previous criteria in the common application setting where the data of the task to be transferred to are very scarce and no retraining is possible. Our method can iteratively compress the model while maintaining or even improving accuracy. At the same time, it has a computational cost in the order of gradient computation and is comparatively simple to apply without the need for tuning hyperparameters for pruning. |
Tasks | Model Compression, Network Pruning |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08881v1 |
https://arxiv.org/pdf/1912.08881v1.pdf | |
PWC | https://paperswithcode.com/paper/pruning-by-explaining-a-novel-criterion-for |
Repo | |
Framework | |
Neural Network Applications in Earthquake Prediction (1994-2019): Meta-Analytic Insight on their Limitations
Title | Neural Network Applications in Earthquake Prediction (1994-2019): Meta-Analytic Insight on their Limitations |
Authors | Arnaud Mignan, Marco Broccardo |
Abstract | In the last few years, deep learning has solved seemingly intractable problems, boosting the hope to find approximate solutions to problems that now are considered unsolvable. Earthquake prediction, the Grail of Seismology, is, in this context of continuous exciting discoveries, an obvious choice for deep learning exploration. We review the entire literature of artificial neural network (ANN) applications for earthquake prediction (77 articles, 1994-2019 period) and find two emerging trends: an increasing interest in this domain, and a complexification of ANN models over time, towards deep learning. Despite apparent positive results observed in this corpus, we demonstrate that simpler models seem to offer similar predictive powers, if not better ones. Due to the structured, tabulated nature of earthquake catalogues, and the limited number of features so far considered, simpler and more transparent machine learning models seem preferable at the present stage of research. Those baseline models follow first physical principles and are consistent with the known empirical laws of Statistical Seismology, which have minimal abilities to predict large earthquakes. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01178v1 |
https://arxiv.org/pdf/1910.01178v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-applications-in-earthquake |
Repo | |
Framework | |
A gray-box approach for curriculum learning
Title | A gray-box approach for curriculum learning |
Authors | Francesco Foglino, Matteo Leonetti, Simone Sagratella, Ruggiero Seccia |
Abstract | Curriculum learning is often employed in deep reinforcement learning to let the agent progress more quickly towards better behaviors. Numerical methods for curriculum learning in the literature provide only initial heuristic solutions, with little to no guarantee on their quality. We define a new gray-box function that, including a suitable scheduling problem, can be effectively used to reformulate the curriculum learning problem. We propose different efficient numerical methods to address this gray-box reformulation. Preliminary numerical results on a benchmark task in the curriculum learning literature show the viability of the proposed approach. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.06812v1 |
https://arxiv.org/pdf/1906.06812v1.pdf | |
PWC | https://paperswithcode.com/paper/a-gray-box-approach-for-curriculum-learning |
Repo | |
Framework | |
Towards Building a Real Time Mobile Device Bird Counting System Through Synthetic Data Training and Model Compression
Title | Towards Building a Real Time Mobile Device Bird Counting System Through Synthetic Data Training and Model Compression |
Authors | Runde Yang |
Abstract | Counting the number of birds in an open sky setting has been a challenging problem due to the large number of bird flocks and the fact that birds can overlap. Another difficulty is the lack of accurate training samples, since the cost of labeling images of bird flocks can be extremely high and each sample picture can contain thousands of birds in a high resolution image. Inspired by recent work on training with synthetic data to perform crowd counting, we design a mechanism to generate a synthetic bird dataset with precise bird counts and the corresponding density maps. We then train a Unet model on the synthetic dataset to perform density map estimation that produces the count for each input. Our method is able to achieve an MSE of approximately 12.4 on a real dataset. In order to build a scalable system for fast bird counting under storage and computational constraints, we use model compression techniques and efficient model structures to increase the inference speed and save storage cost. We are able to reduce storage cost from 55MB to less than 5MB for the model with minimal loss of accuracy. This paper describes the pipeline for building an efficient bird counting system. |
Tasks | Crowd Counting, Model Compression |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.07106v2 |
https://arxiv.org/pdf/1912.07106v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-building-a-real-time-mobile-device |
Repo | |
Framework | |
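The density-map idea in the abstract above can be illustrated with a minimal sketch. It is not the author's generator: the Gaussian kernel, the `sigma` value, the helper names `density_map` and `count_from_density`, and the toy bird coordinates are all assumptions made for illustration. Each annotated bird contributes a small blob whose values sum to 1, so the sum over the whole map recovers the count.

```python
import math

def density_map(height, width, points, sigma=1.5):
    """Build a density map from (row, col) bird locations.

    Assumption: each bird is rendered as a truncated Gaussian that is
    renormalised to sum to exactly 1, even when clipped at the border.
    Points are assumed to lie inside the image.
    """
    dmap = [[0.0] * width for _ in range(height)]
    radius = int(3 * sigma)
    for py, px in points:
        cells = []
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                dist_sq = (y - py) ** 2 + (x - px) ** 2
                cells.append((y, x, math.exp(-dist_sq / (2 * sigma ** 2))))
        total = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / total  # unit mass per bird
    return dmap

def count_from_density(dmap):
    """The predicted count is the integral (sum) of the density map."""
    return sum(sum(row) for row in dmap)

# Hypothetical annotations: two overlapping birds, one isolated, one in a corner.
birds = [(2, 3), (2, 4), (7, 7), (0, 0)]
dmap = density_map(10, 10, birds)
estimated = count_from_density(dmap)
print(f"estimated count: {estimated:.3f}")
```

Because overlapping blobs simply add, the summed map still yields the right count for overlapping birds, which is why a network can be trained to regress the map rather than to detect individual birds.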
Deep Model Compression via Deep Reinforcement Learning
Title | Deep Model Compression via Deep Reinforcement Learning |
Authors | Huixin Zhan, Yongcan Cao |
Abstract | Besides accuracy, the storage of convolutional neural network (CNN) models is another important factor considering limited hardware resources in practical applications. For example, autonomous driving requires the design of accurate yet fast CNNs for low latency in object detection and classification. To fulfill the need, we aim at obtaining CNN models with both high testing accuracy and small size/storage to address resource constraints in many embedded systems. In particular, this paper focuses on proposing a generic reinforcement learning based model compression approach in a two-stage compression pipeline: pruning and quantization. The first stage of compression, i.e., pruning, is achieved via exploiting deep reinforcement learning (DRL) to co-learn the accuracy of CNN models updated after layer-wise channel pruning on a testing dataset and the FLOPs (the number of floating point operations in each layer) updated after kernel-wise variational pruning using information dropout. Layer-wise channel pruning removes unimportant kernels from the input channel dimension, while kernel-wise variational pruning removes unimportant kernels from the 2D-kernel dimensions, namely, height and width. The second stage, i.e., quantization, is achieved via a similar DRL approach but focuses on obtaining the optimal weight bits for individual layers. We further report experimental results on the CIFAR-10 and ImageNet datasets. For the CIFAR-10 dataset, the proposed method can reduce the size of VGGNet by 9x from 20.04MB to 2.2MB with a 0.2% accuracy increase. For the ImageNet dataset, the proposed method can reduce the size of VGG-16 by 33x from 138MB to 4.14MB with no accuracy loss. |
Tasks | Autonomous Driving, Model Compression, Object Detection, Quantization |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02254v1 |
https://arxiv.org/pdf/1912.02254v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-model-compression-via-deep-reinforcement |
Repo | |
Framework | |
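As a quick check on the figures quoted in the abstract above, the compression factors follow directly from the reported model sizes. The helper name `compression_ratio` is hypothetical; only the four sizes in megabytes come from the abstract.

```python
def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """Return how many times smaller the compressed model is."""
    if compressed_mb <= 0:
        raise ValueError("compressed size must be positive")
    return original_mb / compressed_mb

# Model sizes (MB) as quoted in the abstract.
vgg_cifar10 = compression_ratio(20.04, 2.2)
vgg16_imagenet = compression_ratio(138.0, 4.14)

print(f"CIFAR-10 VGGNet: {vgg_cifar10:.1f}x")    # prints "CIFAR-10 VGGNet: 9.1x"
print(f"ImageNet VGG-16: {vgg16_imagenet:.1f}x")  # prints "ImageNet VGG-16: 33.3x"
```

Both values agree with the rounded 9x and 33x factors stated in the abstract.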
Pruning at a Glance: Global Neural Pruning for Model Compression
Title | Pruning at a Glance: Global Neural Pruning for Model Compression |
Authors | Abdullah Salama, Oleksiy Ostapenko, Tassilo Klein, Moin Nabi |
Abstract | Deep Learning models have become the dominant approach in several areas due to their high performance. Unfortunately, the size and hence computational requirements of operating such models can be considerably high. Therefore, this constitutes a limitation for deployment on memory and battery constrained devices such as mobile phones or embedded systems. To address these limitations, we propose a novel and simple pruning method that compresses neural networks by removing entire filters and neurons according to a global threshold across the network without any pre-calculation of layer sensitivity. The resulting model is compact, non-sparse, with the same accuracy as the non-compressed model, and most importantly requires no special infrastructure for deployment. We prove the viability of our method by producing highly compressed models, namely VGG-16, ResNet-56, and ResNet-110 on CIFAR10 without losing any performance compared to the baseline, as well as ResNet-34 and ResNet-50 on ImageNet without a significant loss of accuracy. We also provide a well-retrained 30% compressed ResNet-50 that slightly surpasses the base model accuracy. Additionally, our method compresses AlexNet and LeNet-5 by more than 56% and 97%, respectively. Interestingly, the resulting models’ pruning patterns are highly similar to those of other methods that use a layer sensitivity pre-calculation step. Our method not only exhibits good performance but is also easy to implement. |
Tasks | Model Compression |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.00200v2 |
https://arxiv.org/pdf/1912.00200v2.pdf | |
PWC | https://paperswithcode.com/paper/pruning-at-a-glance-global-neural-pruning-for |
Repo | |
Framework | |
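The global-threshold idea described in the abstract above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the abstract does not state the importance score, so the L1 norm of each filter is an assumption, as are the function names, the `prune_fraction` parameter, and the rule that keeps at least one filter per layer.

```python
def filter_scores(layers):
    """Score each filter by the L1 norm of its weights (assumed criterion)."""
    return {
        name: [sum(abs(w) for w in f) for f in filters]
        for name, filters in layers.items()
    }

def global_prune(layers, prune_fraction):
    """Remove the lowest-scoring filters using one threshold for all layers."""
    scores = filter_scores(layers)
    ranked = sorted(s for layer in scores.values() for s in layer)
    n_prune = int(len(ranked) * prune_fraction)
    if n_prune == 0:
        return {name: list(filters) for name, filters in layers.items()}
    threshold = ranked[n_prune - 1]  # single global cut-off
    kept = {}
    for name, filters in layers.items():
        # Filters scoring at or below the threshold are dropped (ties included).
        survivors = [f for f, s in zip(filters, scores[name]) if s > threshold]
        if not survivors:
            # Assumption: never delete a whole layer; keep its strongest filter.
            best = max(range(len(filters)), key=lambda i: scores[name][i])
            survivors = [filters[best]]
        kept[name] = survivors
    return kept

# Hypothetical two-layer network; each filter is a flat list of weights.
layers = {
    "conv1": [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4]],
    "conv2": [[0.05, -0.05], [1.0, 1.0]],
}
pruned = global_prune(layers, prune_fraction=0.4)
for name, filters in pruned.items():
    print(name, len(filters), "filters kept")
```

Because whole filters are removed rather than individual weights, the result is a smaller dense model, which matches the abstract's point that no special sparse infrastructure is needed at deployment.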
Algorithmic decision-making in AVs: Understanding ethical and technical concerns for smart cities
Title | Algorithmic decision-making in AVs: Understanding ethical and technical concerns for smart cities |
Authors | Hazel Si Min Lim, Araz Taeihagh |
Abstract | Autonomous Vehicles (AVs) are increasingly embraced around the world to advance smart mobility and more broadly, smart, and sustainable cities. Algorithms form the basis of decision-making in AVs, allowing them to perform driving tasks autonomously, efficiently, and more safely than human drivers and offering various economic, social, and environmental benefits. However, algorithmic decision-making in AVs can also introduce new issues that create new safety risks and perpetuate discrimination. We identify bias, ethics, and perverse incentives as key ethical issues in the AV algorithms’ decision-making that can create new safety risks and discriminatory outcomes. Technical issues in the AVs’ perception, decision-making and control algorithms, limitations of existing AV testing and verification methods, and cybersecurity vulnerabilities can also undermine the performance of the AV system. This article investigates the ethical and technical concerns surrounding algorithmic decision-making in AVs by exploring how driving decisions can perpetuate discrimination and create new safety risks for the public. We discuss steps taken to address these issues, highlight the existing research gaps and the need to mitigate these issues through the design of AV’s algorithms and of policies and regulations to fully realise AVs’ benefits for smart and sustainable cities. |
Tasks | Autonomous Vehicles, Decision Making |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13122v1 |
https://arxiv.org/pdf/1910.13122v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-decision-making-in-avs |
Repo | |
Framework | |
Data-Driven Compression of Convolutional Neural Networks
Title | Data-Driven Compression of Convolutional Neural Networks |
Authors | Ramit Pahwa, Manoj Ghuhan Arivazhagan, Ankur Garg, Siddarth Krishnamoorthy, Rohit Saxena, Sunav Choudhary |
Abstract | Deploying trained convolutional neural networks (CNNs) to mobile devices is a challenging task because of the simultaneous requirements of the deployed model to be fast, lightweight and accurate. Designing and training a CNN architecture that does well on all three metrics is highly non-trivial and can be very time-consuming if done by hand. One way to solve this problem is to compress the trained CNN models before deploying to mobile devices. This work asks and answers three questions on compressing CNN models automatically: a) How to control the trade-off between speed, memory and accuracy during model compression? b) In practice, a deployed model may not see all classes and/or may not need to produce all class labels. Can this fact be used to improve the trade-off? c) How to scale the compression algorithm to execute within a reasonable amount of time for many deployments? The paper demonstrates that a model compression algorithm utilizing reinforcement learning with architecture search and knowledge distillation can answer these questions in the affirmative. Experimental results are provided for current state-of-the-art CNN model families for image feature extraction like VGG and ResNet with CIFAR datasets. |
Tasks | Model Compression |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12740v1 |
https://arxiv.org/pdf/1911.12740v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-compression-of-convolutional |
Repo | |
Framework | |
ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences
Title | ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences |
Authors | Zhizhong Han, Chao Chen, Yu-Shen Liu, Matthias Zwicker |
Abstract | 3D shape captioning is a challenging application in 3D shape understanding. Captions from recent multi-view based methods reveal that they cannot capture part-level characteristics of 3D shapes. This leads to a lack of detailed part-level description in captions, which humans tend to focus on. To resolve this issue, we propose ShapeCaptioner, a generative caption network, to perform 3D shape captioning from semantic parts detected in multiple views. Our novelty lies in learning the knowledge of part detection in multiple views from 3D shape segmentations and transferring this knowledge to facilitate learning the mapping from 3D shapes to sentences. Specifically, ShapeCaptioner aggregates the parts detected in multiple colored views using our novel part class specific aggregation to represent a 3D shape, and then employs a sequence to sequence model to generate the caption. Our results show that ShapeCaptioner can learn 3D shape features with more detailed part characteristics to facilitate better 3D shape captioning than previous work. |
Tasks | |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1908.00120v1 |
https://arxiv.org/pdf/1908.00120v1.pdf | |
PWC | https://paperswithcode.com/paper/shapecaptioner-generative-caption-network-for |
Repo | |
Framework | |
Implementation of a modified Nesterov’s Accelerated quasi-Newton Method on Tensorflow
Title | Implementation of a modified Nesterov’s Accelerated quasi-Newton Method on Tensorflow |
Authors | S. Indrapriyadarsini, Shahrzad Mahboubi, Hiroshi Ninomiya, Hideki Asai |
Abstract | Recent studies incorporate Nesterov’s accelerated gradient method for the acceleration of gradient based training. Nesterov’s Accelerated Quasi-Newton (NAQ) method has been shown to drastically improve the convergence speed compared to the conventional quasi-Newton method. This paper implements NAQ for non-convex optimization on Tensorflow. Two modifications to the original NAQ algorithm are proposed to ensure global convergence and eliminate linesearch. The performance of the proposed algorithm, mNAQ, is evaluated on standard non-convex function approximation benchmark problems and microwave circuit modelling problems. The results show that the improved algorithm converges better and faster compared to first order optimizers such as AdaGrad, RMSProp, Adam, and second order methods such as the quasi-Newton method. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09158v1 |
https://arxiv.org/pdf/1910.09158v1.pdf | |
PWC | https://paperswithcode.com/paper/implementation-of-a-modified-nesterovs |
Repo | |
Framework | |
Repetitive Reprediction Deep Decipher for Semi-Supervised Learning
Title | Repetitive Reprediction Deep Decipher for Semi-Supervised Learning |
Authors | Guo-Hua Wang, Jianxin Wu |
Abstract | Most recent semi-supervised deep learning (deep SSL) methods use a similar paradigm: use network predictions to update pseudo-labels and use pseudo-labels to update network parameters iteratively. However, they lack theoretical support and cannot explain why predictions are good candidates for pseudo-labels. In this paper, we propose a principled end-to-end framework named deep decipher (D2) for SSL. Within the D2 framework, we prove that pseudo-labels are related to network predictions by an exponential link function, which gives theoretical support for using predictions as pseudo-labels. Furthermore, we demonstrate that updating pseudo-labels by network predictions will make them uncertain. To mitigate this problem, we propose a training strategy called repetitive reprediction (R2). Finally, the proposed R2-D2 method is tested on the large-scale ImageNet dataset and outperforms state-of-the-art methods by 5 percentage points. |
Tasks | |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.04345v2 |
https://arxiv.org/pdf/1908.04345v2.pdf | |
PWC | https://paperswithcode.com/paper/repetitive-reprediction-deep-decipher-for |
Repo | |
Framework | |
Learning User Preferences for Trajectories from Brain Signals
Title | Learning User Preferences for Trajectories from Brain Signals |
Authors | Henrich Kolkhorst, Wolfram Burgard, Michael Tangermann |
Abstract | Robot motions in the presence of humans should not only be feasible and safe, but also conform to human preferences. This, however, requires user feedback on the robot’s behavior. In this work, we propose a novel approach to leverage the user’s brain signals as a feedback modality in order to decode the judgment of robot trajectories and rank them according to the user’s preferences. We show that brain signals measured using electroencephalography during observation of a robotic arm’s trajectory as well as in response to preference statements are informative regarding the user’s preference. Furthermore, we demonstrate that user feedback from brain signals can be used to reliably infer pairwise trajectory preferences as well as to retrieve the preferred observed trajectories of the user with a performance comparable to explicit behavioral feedback. |
Tasks | |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01039v2 |
https://arxiv.org/pdf/1909.01039v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-user-preferences-for-trajectories |
Repo | |
Framework | |
LiDAR-Flow: Dense Scene Flow Estimation from Sparse LiDAR and Stereo Images
Title | LiDAR-Flow: Dense Scene Flow Estimation from Sparse LiDAR and Stereo Images |
Authors | Ramy Battrawy, René Schuster, Oliver Wasenmüller, Qing Rao, Didier Stricker |
Abstract | We propose a new approach called LiDAR-Flow to robustly estimate a dense scene flow by fusing sparse LiDAR with stereo images. We take advantage of the high accuracy of LiDAR to resolve the lack of information in some regions of stereo images due to textureless objects, shadows, ill-conditioned light environments and many more. Additionally, this fusion can overcome the difficulty of matching unstructured 3D points between LiDAR-only scans. Our LiDAR-Flow approach consists of three main steps; each of them exploits LiDAR measurements. First, we build strong seeds from LiDAR to enhance the robustness of matches between stereo images. The imagery part seeks the motion matches and increases the density of scene flow estimation. Then, a consistency check employs LiDAR seeds to remove the possible mismatches. Finally, LiDAR measurements constrain the edge-preserving interpolation method to fill the remaining gaps. In our evaluation we investigate the individual processing steps of our LiDAR-Flow approach and demonstrate the superior performance compared to an image-only approach. |
Tasks | Scene Flow Estimation |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14453v2 |
https://arxiv.org/pdf/1910.14453v2.pdf | |
PWC | https://paperswithcode.com/paper/lidar-flow-dense-scene-flow-estimation-from |
Repo | |
Framework | |
Lung Cancer Detection and Classification based on Image Processing and Statistical Learning
Title | Lung Cancer Detection and Classification based on Image Processing and Statistical Learning |
Authors | Md Rashidul Hasan, Muntasir Al Kabir |
Abstract | Lung cancer is one of the life-threatening diseases among human beings. Early and accurate detection of lung cancer can increase the survival rate from lung cancer. Computed Tomography (CT) images are commonly used for detecting lung cancer. Using a data set of thousands of high-resolution lung scans collected from a Kaggle competition [1], we will develop algorithms that accurately determine whether the lungs are cancerous or not. The proposed system promises better results than the existing systems, which would be beneficial for the radiologist for the accurate and early detection of cancer. The method has been tested on 198 slices of CT images of various stages of cancer obtained from the Kaggle dataset [1] and was found to give satisfactory results. The accuracy of the proposed method on this dataset is 72.2%. |
Tasks | Computed Tomography (CT) |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10654v1 |
https://arxiv.org/pdf/1911.10654v1.pdf | |
PWC | https://paperswithcode.com/paper/lung-cancer-detection-and-classification |
Repo | |
Framework | |
On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals
Title | On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals |
Authors | Ricardo Darío Pérez Principi, Cristina Palmero, Julio C. S. Jacques Junior, Sergio Escalera |
Abstract | Personality perception is implicitly biased due to many subjective factors, such as cultural, social, contextual, gender and appearance. Approaches developed for automatic personality perception are not expected to predict the real personality of the target, but the personality external observers attribute to it. Hence, they have to deal with human bias, inherently transferred to the training data. However, bias analysis in personality computing is an almost unexplored area. In this work, we study different possible sources of bias affecting personality perception, including emotions from facial expressions, attractiveness, age, gender, and ethnicity, as well as their influence on prediction ability for apparent personality estimation. To this end, we propose a multi-modal deep neural network that combines raw audio and visual information alongside predictions of attribute-specific models to regress apparent personality. We also analyse spatio-temporal aggregation schemes and the effect of different time intervals on first impressions. We base our study on the ChaLearn First Impressions dataset, consisting of one-person conversational videos. Our model shows state-of-the-art results regressing apparent personality based on the Big-Five model. Furthermore, given the interpretable nature of our network design, we provide an incremental analysis on the impact of each possible source of bias on final network predictions. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05568v2 |
https://arxiv.org/pdf/1909.05568v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-effect-of-observed-subject-biases-in |
Repo | |
Framework | |