Paper Group ANR 69
Convolutional Neural Networks for User Identification based on Motion Sensors Represented as Image
Title | Convolutional Neural Networks for User Identification based on Motion Sensors Represented as Image |
Authors | Cezara Benegui, Radu Tudor Ionescu |
Abstract | In this paper, we propose a deep learning approach for smartphone user identification based on analyzing motion signals recorded by the accelerometer and the gyroscope during a single tap gesture performed by the user on the screen. We transform the discrete 3-axis signals from the motion sensors into a gray-scale image representation, which is provided as input to a convolutional neural network (CNN) that is pre-trained for multi-class user classification. In the pre-training stage, we benefit from different users and multiple samples per user. After pre-training, we use our CNN as a feature extractor, generating an embedding associated with each single tap on the screen. The resulting embeddings are used to train a Support Vector Machines (SVM) model in a few-shot user identification setting, i.e. requiring only 20 taps on the screen during the registration phase. We compare our identification system based on CNN features with two baseline systems, one that employs handcrafted features and another that employs recurrent neural network (RNN) features. All systems are based on the same classifier, namely SVM. To pre-train the CNN and the RNN models for multi-class user classification, we use a different set of users than the set used for few-shot user identification, ensuring a realistic scenario. The empirical results demonstrate that our CNN model yields a top accuracy of 89.75% in multi-class user classification and a top accuracy of 96.72% in few-shot user identification. In conclusion, we believe that our system is ready for practical use, having a better generalization capacity than both baselines. |
Tasks | |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.03760v2 |
https://arxiv.org/pdf/1912.03760v2.pdf | |
PWC | https://paperswithcode.com/paper/a-convolutional-neural-network-for-user |
Repo | |
Framework | |
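The abstract above says the 3-axis accelerometer and gyroscope signals are turned into a gray-scale image before being fed to the CNN, but it does not spell out the mapping. The following is only a minimal sketch of one plausible mapping, assuming the six signal channels are stacked as image rows and min-max scaled to the 0..255 range; the function name `signals_to_gray_image` and the scaling choice are assumptions, not the authors' method.

```python
import numpy as np


def signals_to_gray_image(accel, gyro):
    """Stack two (3, T) sensor arrays into a (6, T) uint8 gray-scale image.

    Hypothetical mapping: global min-max scaling to [0, 255].
    """
    rows = np.vstack([accel, gyro]).astype(np.float64)  # shape (6, T)
    lo, hi = rows.min(), rows.max()
    if hi == lo:
        # A constant signal carries no contrast; return an all-black image.
        return np.zeros(rows.shape, dtype=np.uint8)
    scaled = (rows - lo) / (hi - lo)  # values in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)


# Example: two time steps per axis, purely synthetic values.
accel = np.arange(6).reshape(3, 2)
gyro = accel + 6
image = signals_to_gray_image(accel, gyro)
```

An image built this way could then be passed to any 2D convolutional network; the embedding and SVM stages described in the abstract are not covered by this sketch.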
Sparsification as a Remedy for Staleness in Distributed Asynchronous SGD
Title | Sparsification as a Remedy for Staleness in Distributed Asynchronous SGD |
Authors | Rosa Candela, Giulio Franzese, Maurizio Filippone, Pietro Michiardi |
Abstract | Large scale machine learning is increasingly relying on distributed optimization, whereby several machines contribute to the training process of a statistical model. While there exists a large literature on stochastic gradient descent (SGD) and variants, the study of countermeasures to mitigate problems arising in asynchronous distributed settings is still in its infancy. The key question of this work is whether sparsification, a technique predominantly used to reduce communication overheads, can also mitigate the staleness problem that affects asynchronous SGD. We study the role of sparsification both theoretically and empirically. Our theory indicates that, in an asynchronous, non-convex setting, the ergodic convergence rate of sparsified SGD matches the known result $\mathcal{O} \left( 1/\sqrt{T} \right)$ of non-convex SGD. We then carry out an empirical study to complement our theory and show that, in practice, sparsification consistently improves over vanilla SGD and current alternatives to mitigate the effects of staleness. |
Tasks | Distributed Optimization |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09466v1 |
https://arxiv.org/pdf/1910.09466v1.pdf | |
PWC | https://paperswithcode.com/paper/sparsification-as-a-remedy-for-staleness-in |
Repo | |
Framework | |
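To make the notion of gradient sparsification concrete, here is a minimal sketch of one common operator, top-k selection by magnitude. The abstract does not state which operator the authors analyze, so treat the choice of top-k and the helper name `top_k_sparsify` as assumptions made for illustration.

```python
import numpy as np


def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries of grad and zero out the rest."""
    flat = grad.ravel()
    if k <= 0:
        return np.zeros_like(grad)
    if k >= flat.size:
        return grad.copy()
    # argpartition places the indices of the k largest magnitudes last.
    keep = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[keep] = flat[keep]
    return sparse.reshape(grad.shape)


g = np.array([0.1, -3.0, 0.5, 2.0])
sparse_g = top_k_sparsify(g, 2)  # only -3.0 and 2.0 survive
```

In a distributed setting each worker would send only the surviving entries (values and indices) to the parameter server, which is where the communication saving mentioned in the abstract comes from.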
Feature Engineering and Forecasting via Derivative-free Optimization and Ensemble of Sequence-to-sequence Networks with Applications in Renewable Energy
Title | Feature Engineering and Forecasting via Derivative-free Optimization and Ensemble of Sequence-to-sequence Networks with Applications in Renewable Energy |
Authors | Mohammad Pirhooshyaran, Katya Scheinberg, Lawrence V. Snyder |
Abstract | This study introduces a framework for the forecasting, reconstruction and feature engineering of multivariate processes along with its renewable energy applications. We integrate derivative-free optimization with an ensemble of sequence-to-sequence networks and design a new resampling technique called additive resampling, which, along with Bootstrap aggregating (bagging) resampling, is applied to initialize the ensemble structure. Moreover, we explore the performance of the proposed framework on three renewable energy sources (wind, solar and ocean wave) and conduct several short- to long-term forecasts showing the superiority of the proposed method compared to numerous machine learning techniques. The findings indicate that the introduced method performs more accurately when the forecasting horizon becomes longer. In addition, we modify the framework for automated feature selection. The model provides a clear interpretation of the selected features. Furthermore, we investigate the effects of different environmental and marine factors on the wind speed and ocean output power, respectively, and report the selected features. Finally, we explore the online forecasting setting and illustrate that the model outperforms alternatives through different measurement errors. |
Tasks | Feature Engineering, Feature Selection |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05447v3 |
https://arxiv.org/pdf/1909.05447v3.pdf | |
PWC | https://paperswithcode.com/paper/feature-engineering-and-forecasting-via |
Repo | |
Framework | |
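As a small illustration of the bagging resampling mentioned in the abstract, the sketch below draws one bootstrap sample of training indices per ensemble member. It shows plain bootstrap sampling only; the paper's "additive resampling" is not defined in the abstract and is not reproduced here. The function name and the seed parameter are hypothetical.

```python
import numpy as np


def bootstrap_indices(n_samples, n_models, seed=0):
    """Return an (n_models, n_samples) array of indices drawn with replacement."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_samples, size=(n_models, n_samples))


# One row of indices per ensemble member; each member would be trained
# on the training examples selected by its own row.
idx = bootstrap_indices(n_samples=10, n_models=4)
```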
A Universally Optimal Multistage Accelerated Stochastic Gradient Method
Title | A Universally Optimal Multistage Accelerated Stochastic Gradient Method |
Authors | Necdet Serhat Aybat, Alireza Fallah, Mert Gurbuzbalaban, Asuman Ozdaglar |
Abstract | We study the problem of minimizing a strongly convex, smooth function when we have noisy estimates of its gradient. We propose a novel multistage accelerated algorithm that is universally optimal in the sense that it achieves the optimal rate both in the deterministic and stochastic case and operates without knowledge of noise characteristics. The algorithm consists of stages that use a stochastic version of Nesterov’s method with a specific restart and parameters selected to achieve the fastest reduction in the bias-variance terms in the convergence rate bounds. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.08022v3 |
https://arxiv.org/pdf/1901.08022v3.pdf | |
PWC | https://paperswithcode.com/paper/a-universally-optimal-multistage-accelerated |
Repo | |
Framework | |
Towards Reliable Evaluation of Road Network Reconstructions
Title | Towards Reliable Evaluation of Road Network Reconstructions |
Authors | Leonardo Citraro, Mateusz Koziński, Pascal Fua |
Abstract | Existing performance measures rank delineation algorithms inconsistently, which makes it difficult to decide which one is best in any given situation. We show that these inconsistencies stem from design flaws that make the metrics insensitive to whole classes of errors. To provide more reliable evaluation, we design three new metrics that are far more consistent even though they use very different approaches to comparing ground-truth and reconstructed road networks. We use both synthetic and real data to demonstrate this and advocate the use of these corrected metrics as a tool to gauge future progress. |
Tasks | |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12467v1 |
https://arxiv.org/pdf/1911.12467v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-reliable-evaluation-of-road-network |
Repo | |
Framework | |
A Deep Learning Approach for Robust Corridor Following
Title | A Deep Learning Approach for Robust Corridor Following |
Authors | Vishnu Sashank Dorbala, A. H. Abdul Hafez, C. V. Jawahar |
Abstract | For an autonomous corridor following task where the environment is continuously changing, several forms of environmental noise prevent an automated feature extraction procedure from performing reliably. Moreover, in cases where pre-defined features are absent from the captured data, a well-defined control signal for performing the servoing task cannot be produced. To overcome these drawbacks, we propose using a convolutional neural network (CNN) to directly estimate the required control signal from an image, combining feature extraction and control law computation into a single end-to-end framework. In particular, we study the task of autonomous corridor following using a CNN and present clear advantages in cases where a traditional method used for performing the same task fails to give a reliable outcome. We evaluate the performance of our method on this task on a wheelchair platform developed at our institute for this purpose. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07896v1 |
https://arxiv.org/pdf/1911.07896v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-for-robust-corridor |
Repo | |
Framework | |
Non-Gaussianity of Stochastic Gradient Noise
Title | Non-Gaussianity of Stochastic Gradient Noise |
Authors | Abhishek Panigrahi, Raghav Somani, Navin Goyal, Praneeth Netrapalli |
Abstract | What enables Stochastic Gradient Descent (SGD) to achieve better generalization than Gradient Descent (GD) in Neural Network training? This question has attracted much attention. In this paper, we study the distribution of the Stochastic Gradient Noise (SGN) vectors during training. We observe that for batch sizes 256 and above, the distribution is best described as Gaussian, at least in the early phases of training. This holds across datasets, architectures, and other choices. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09626v2 |
https://arxiv.org/pdf/1910.09626v2.pdf | |
PWC | https://paperswithcode.com/paper/non-gaussianity-of-stochastic-gradient-noise |
Repo | |
Framework | |
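For readers unfamiliar with the term, a stochastic gradient noise vector is usually taken to be the difference between a mini-batch gradient and the full-batch gradient at the same parameters. The sketch below collects such vectors for a least-squares loss. It assumes that definition and a toy linear model; it is not the authors' experimental code, and the function name is hypothetical.

```python
import numpy as np


def gradient_noise_samples(X, y, w, batch_size, rng):
    """Return one SGN vector per disjoint mini-batch for the loss
    0.5 * mean((X @ w - y) ** 2)."""
    n = X.shape[0]
    full_grad = X.T @ (X @ w - y) / n
    perm = rng.permutation(n)
    noises = []
    for start in range(0, n - batch_size + 1, batch_size):
        idx = perm[start:start + batch_size]
        batch_grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch_size
        noises.append(batch_grad - full_grad)
    return np.array(noises)


rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = rng.normal(size=64)
noise = gradient_noise_samples(X, y, np.zeros(3), batch_size=16, rng=rng)
```

When the batches partition the dataset evenly, as here, the noise vectors average to zero. A distributional study like the one in the abstract would then look at statistics of these vectors, for example along random projections.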
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
Title | SCAFFOLD: Stochastic Controlled Averaging for Federated Learning |
Authors | Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh |
Abstract | Federated Averaging (FedAvg) has emerged as the algorithm of choice for federated learning due to its simplicity and low communication cost. However, in spite of recent research efforts, its performance is not fully understood. We obtain tight convergence rates for FedAvg and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence. As a solution, we propose a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client-drift' in its local updates. We prove that SCAFFOLD requires significantly fewer communication rounds and is not affected by data heterogeneity or client sampling. Further, we show that (for quadratics) SCAFFOLD can take advantage of similarity in the client’s data yielding even faster convergence. The latter is the first result to quantify the usefulness of local-steps in distributed optimization. |
Tasks | Distributed Optimization |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06378v2 |
https://arxiv.org/pdf/1910.06378v2.pdf | |
PWC | https://paperswithcode.com/paper/scaffold-stochastic-controlled-averaging-for |
Repo | |
Framework | |
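The client-drift effect and the control-variate correction can be seen on a toy problem. The sketch below is a simplified illustration, assuming scalar quadratic client losses, exact gradients, full client participation and a global step size of 1. The update rules follow the general shape of a control-variate corrected local step (local gradient minus client control variate plus server control variate), but the function name, parameters and toy problem are choices made here, not taken from the paper.

```python
import numpy as np


def run_rounds(a, b, rounds=200, local_steps=5, lr=0.05,
               use_control_variates=True):
    """Client i holds f_i(x) = 0.5 * a[i] * (x - b[i])**2.

    With use_control_variates=False this reduces to plain local SGD
    with averaging (FedAvg-style) on the same problem.
    """
    n = len(a)
    x = 0.0             # server model
    c = 0.0             # server control variate
    c_i = np.zeros(n)   # per-client control variates
    for _ in range(rounds):
        delta_y = np.zeros(n)
        delta_c = np.zeros(n)
        for i in range(n):
            y = x
            for _ in range(local_steps):
                grad = a[i] * (y - b[i])
                correction = (c - c_i[i]) if use_control_variates else 0.0
                y -= lr * (grad + correction)
            if use_control_variates:
                # Re-estimate the client control variate from the local path.
                new_c_i = c_i[i] - c + (x - y) / (local_steps * lr)
                delta_c[i] = new_c_i - c_i[i]
                c_i[i] = new_c_i
            delta_y[i] = y - x
        x += delta_y.mean()
        c += delta_c.mean()
    return x


a = np.array([1.0, 2.0, 3.0])   # heterogeneous curvatures
b = np.array([0.0, 1.0, 4.0])   # heterogeneous client minimizers
x_star = (a * b).sum() / a.sum()  # minimizer of the average loss
x_corrected = run_rounds(a, b)
x_uncorrected = run_rounds(a, b, use_control_variates=False)
```

On this toy problem the uncorrected run settles at a point biased away from `x_star`, because each client drifts toward its own minimizer during its local steps, while the corrected run approaches `x_star`.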
Late fusion of deep learning and hand-crafted features for Achilles tendon healing monitoring
Title | Late fusion of deep learning and hand-crafted features for Achilles tendon healing monitoring |
Authors | Norbert Kapinski, Jedrzej M. Nowosielski, Maciej E. Marchwiany, Jakub Zielinski, Beata Ciszkowska-Lyson, Bartosz A. Borucki, Tomasz Trzcinski, Krzysztof S. Nowinski |
Abstract | Healing process assessment of the Achilles tendon is usually a complex procedure that relies on a combination of biomechanical and medical imaging tests. As a result, diagnostics remains a tedious and long-lasting task. Recently, a novel method for the automatic assessment of tendon healing based on Magnetic Resonance Imaging and deep learning was introduced. The method assesses six parameters related to the treatment progress utilizing a modified pre-trained network, PCA-reduced space, and linear regression. In this paper, we propose to improve this approach by incorporating hand-crafted features. We first perform a feature selection in order to obtain optimal sets of mixed hand-crafted and deep learning predictors. With the use of approx. 20,000 MRI slices, we then train a meta-regression algorithm that performs the tendon healing assessment. Finally, we evaluate the method against scores given by an experienced radiologist. In comparison with the previous baseline method, our approach significantly improves correlation in all of the six parameters assessed. Furthermore, our method uses only one MRI protocol and saves up to 60% of the time needed for data acquisition. |
Tasks | Feature Selection |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05687v1 |
https://arxiv.org/pdf/1909.05687v1.pdf | |
PWC | https://paperswithcode.com/paper/late-fusion-of-deep-learning-and-hand-crafted |
Repo | |
Framework | |
Bootstrapping Conditional GANs for Video Game Level Generation
Title | Bootstrapping Conditional GANs for Video Game Level Generation |
Authors | Ruben Rodriguez Torrado, Ahmed Khalifa, Michael Cerny Green, Niels Justesen, Sebastian Risi, Julian Togelius |
Abstract | Generative Adversarial Networks (GANs) have shown impressive results for image generation. However, GANs face challenges in generating contents with certain types of constraints, such as game levels. Specifically, it is difficult to generate levels that have aesthetic appeal and are playable at the same time. Additionally, because training data usually is limited, it is challenging to generate unique levels with current GANs. In this paper, we propose a new GAN architecture named Conditional Embedding Self-Attention Generative Adversarial Network (CESAGAN) and a new bootstrapping training procedure. The CESAGAN is a modification of the self-attention GAN that incorporates an embedding feature vector input to condition the training of the discriminator and generator. This allows the network to model non-local dependency between game objects, and to count objects. Additionally, to reduce the number of levels necessary to train the GAN, we propose a bootstrapping mechanism in which playable generated levels are added to the training set. The results demonstrate that the new approach does not only generate a larger number of levels that are playable but also generates fewer duplicate levels compared to a standard GAN. |
Tasks | Image Generation |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01603v1 |
https://arxiv.org/pdf/1910.01603v1.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-conditional-gans-for-video-game |
Repo | |
Framework | |
Image Generation and Recognition (Emotions)
Title | Image Generation and Recognition (Emotions) |
Authors | Hanne Carlsson, Dimitrios Kollias |
Abstract | Generative Adversarial Networks (GANs) were proposed in 2014 by Goodfellow et al., and have since been extended into multiple computer vision applications. This report provides a thorough survey of recent GAN research, outlining the various architectures and applications, as well as methods for training GANs and dealing with latent space. This is followed by a discussion of potential areas for future GAN research, including: evaluating GANs, better understanding GANs, and techniques for training GANs. The second part of this report outlines the compilation of a dataset of images 'in the wild' representing each of the 7 basic human emotions, and analyses experiments done when training a StarGAN on this dataset combined with the FER2013 dataset. |
Tasks | Image Generation |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05774v2 |
https://arxiv.org/pdf/1910.05774v2.pdf | |
PWC | https://paperswithcode.com/paper/image-generation-and-recognition-emotions |
Repo | |
Framework | |
Picking groups instead of samples: A close look at Static Pool-based Meta-Active Learning
Title | Picking groups instead of samples: A close look at Static Pool-based Meta-Active Learning |
Authors | Ignasi Mas, Josep Ramon Morros, Veronica Vilaplana |
Abstract | Active Learning techniques are used to tackle learning problems where obtaining training labels is costly. In this work we use Meta-Active Learning to learn to select a subset of samples from a pool of unsupervised input for further annotation. This scenario is called Static Pool-based Meta-Active Learning. We propose to extend existing approaches by performing the selection in a manner that, unlike previous works, can handle the selection of each sample based on the whole selected subset. |
Tasks | Active Learning |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00314v1 |
https://arxiv.org/pdf/1911.00314v1.pdf | |
PWC | https://paperswithcode.com/paper/picking-groups-instead-of-samples-a-close |
Repo | |
Framework | |
On the Strong Equivalences of LPMLN Programs
Title | On the Strong Equivalences of LPMLN Programs |
Authors | Bin Wang, Jun Shen, Shutao Zhang, Zhizheng Zhang |
Abstract | By incorporating the methods of Answer Set Programming (ASP) and Markov Logic Networks (MLN), LPMLN becomes a powerful tool for non-monotonic, inconsistent and uncertain knowledge representation and reasoning. To facilitate the applications and extend the understanding of LPMLN, we investigate in this paper the strong equivalences between LPMLN programs, which are regarded as an important property in the field of logic programming. In the field of ASP, two programs P and Q are strongly equivalent iff, for any ASP program R, the programs P and Q extended by R have the same stable models. In other words, an ASP program can be replaced by one of its strong equivalents without considering its context, which helps us to simplify logic programs, enhance inference engines, construct human-friendly knowledge bases, etc. Since LPMLN is a combination of ASP and MLN, the notion of strong equivalence in LPMLN is quite different from that in ASP. Firstly, we present the notions of p-strong and w-strong equivalences between LPMLN programs. Secondly, we present a characterization of the notions by generalizing the SE-model approach in ASP. Finally, we show the use of strong equivalences in simplifying LPMLN programs, and present a sufficient and necessary syntactic condition that guarantees the strong equivalence between a single LPMLN rule and the empty program. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08238v1 |
https://arxiv.org/pdf/1909.08238v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-strong-equivalences-of-lpmln-programs |
Repo | |
Framework | |
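The ASP definition quoted in the abstract can be illustrated with a tiny brute-force stable-model checker. The sketch below handles only plain normal logic programs (no weights, so it says nothing about the p-strong or w-strong notions for LPMLN), and the rule encoding as `(head, positive_body, negative_body)` tuples is an assumption made for this example. It shows two programs that have the same stable models on their own yet are not strongly equivalent, because a single extra fact separates them.

```python
from itertools import combinations


def least_model(positive_rules):
    """Least model of a negation-free program given as (head, pos_body) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model


def stable_models(rules):
    """Enumerate stable models via the Gelfond-Lifschitz reduct (brute force)."""
    atoms = set()
    for head, pos, neg in rules:
        atoms |= {head} | pos | neg
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(sorted(atoms), size):
            s = set(candidate)
            # Reduct: drop rules blocked by s, then drop negative bodies.
            reduct = [(h, p) for h, p, n in rules if not (n & s)]
            if least_model(reduct) == s:
                found.append(frozenset(s))
    return found


EMPTY = frozenset()
P = [("a", EMPTY, frozenset({"b"}))]   # a :- not b.
Q = [("a", EMPTY, EMPTY)]              # a.
R = [("b", EMPTY, EMPTY)]              # b.

same_alone = stable_models(P) == stable_models(Q)
same_extended = stable_models(P + R) == stable_models(Q + R)
```

Here `same_alone` is true and `same_extended` is false, so P and Q are equivalent but not strongly equivalent. Strong equivalence quantifies over every extension R, so it cannot be decided by enumeration like this, which is why characterizations such as the SE-model approach mentioned in the abstract are needed.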
Privacy Preserving Adjacency Spectral Embedding on Stochastic Blockmodels
Title | Privacy Preserving Adjacency Spectral Embedding on Stochastic Blockmodels |
Authors | Li Chen |
Abstract | For graphs generated from stochastic blockmodels, adjacency spectral embedding is asymptotically consistent. Further, adjacency spectral embedding composed with universally consistent classifiers is universally consistent to achieve the Bayes error. However, when the graph contains private or sensitive information, treating the data as non-private can potentially leak private information and incur disclosure risks. In this paper, we propose a differentially private adjacency spectral embedding algorithm for stochastic blockmodels. We demonstrate that our proposed methodology can estimate latent positions that are close, in Frobenius norm, to the latent positions obtained by adjacency spectral embedding, and that it achieves comparable accuracy at desired privacy parameters in simulated and real world networks. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.07065v1 |
https://arxiv.org/pdf/1905.07065v1.pdf | |
PWC | https://paperswithcode.com/paper/privacy-preserving-adjacency-spectral |
Repo | |
Framework | |
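As background for the abstract above, the sketch below shows the standard (non-private) adjacency spectral embedding: take the d eigenpairs of the adjacency matrix with the largest eigenvalue magnitudes and scale each eigenvector by the square root of that magnitude. The privacy mechanism proposed in the paper is not described in the abstract and is deliberately not reproduced. The function name and the toy two-block matrix are assumptions for illustration.

```python
import numpy as np


def adjacency_spectral_embedding(A, d):
    """Embed the nodes of a symmetric adjacency matrix A into d dimensions."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]  # d largest |eigenvalues|
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


# Toy check on the expected adjacency matrix of a two-block model
# (three nodes per block), which is positive semidefinite with rank 2.
B = np.array([[0.8, 0.2],
              [0.2, 0.8]])
Z = np.repeat(np.eye(2), 3, axis=0)   # block membership indicators
A_expected = Z @ B @ Z.T
X = adjacency_spectral_embedding(A_expected, d=2)
```

For this rank-2 positive semidefinite input, `X @ X.T` reproduces `A_expected` up to floating-point error, and nodes in the same block have identical inner-product structure. On a sampled 0/1 graph the embedding would only approximate the latent positions.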
Integrated Neural Network and Machine Vision Approach For Leather Defect Classification
Title | Integrated Neural Network and Machine Vision Approach For Leather Defect Classification |
Authors | Sze-Teng Liong, Y. S. Gan, Yen-Chang Huang, Kun-Hong Liu, Wei-Chuen Yau |
Abstract | Leather is a type of natural, durable, flexible, soft, supple and pliable material with smooth texture. It is commonly used as a raw material to manufacture luxury consumer goods for high-end customers. To ensure good quality control on the leather products, one of the critical processes is the visual inspection step to spot the random defects on the leather surfaces, which is usually conducted by experienced experts. This paper presents an automatic mechanism to perform the leather defect classification. In particular, we focus on detecting tick-bite defects on a specific type of calf leather. Both handcrafted feature extractors (i.e., edge detectors and a statistical approach) and data-driven methods (i.e., an artificial neural network) are utilized to represent the leather patches. Then, multiple classifiers (i.e., decision trees, Support Vector Machines, nearest neighbour and ensemble classifiers) are exploited to determine whether the test sample patches contain defective segments. Using the proposed method, we obtain a classification accuracy rate of 84% from a sample of approximately 2500 pieces of 400 × 400 leather patches. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11731v1 |
https://arxiv.org/pdf/1905.11731v1.pdf | |
PWC | https://paperswithcode.com/paper/integrated-neural-network-and-machine-vision |
Repo | |
Framework | |