Paper Group ANR 1573
SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training
Title | SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training |
Authors | Ahmed T. Elthakeb, Prannoy Pilligundla, Hadi Esmaeilzadeh |
Abstract | Deep quantization of neural networks (below eight bits) offers significant promise in reducing their compute and storage cost. Albeit alluring, without special techniques for training and optimization, deep quantization results in significant accuracy loss. To further mitigate this loss, we propose a novel sinusoidal regularization, called SinReQ, for deep quantized training. SinReQ adds a periodic term to the original objective function of the underlying training algorithm. SinReQ exploits the periodicity, differentiability, and the desired convexity profile in sinusoidal functions to automatically propel weights towards values that are inherently closer to quantization levels. Since this technique does not require invasive changes to the training procedure, SinReQ can harmoniously enhance quantized training algorithms. SinReQ offers generality and flexibility as it is not limited to a certain bitwidth or a uniform assignment of bitwidths across layers. We carry out experimentation using the AlexNet, CIFAR-10, ResNet-18, ResNet-20, SVHN, and VGG-11 DNNs with three to five bits for quantization and show the versatility of SinReQ in enhancing multiple quantized training algorithms, DoReFa [32] and WRPN [24]. Averaging across all the bit configurations shows that SinReQ closes the accuracy gap between these two techniques and the full-precision runs by 32.4% and 27.5%, respectively. That is, it improves the absolute accuracy of DoReFa and WRPN by 2.8% and 2.1%, respectively. |
Tasks | Quantization |
Published | 2019-05-04 |
URL | https://arxiv.org/abs/1905.01416v3 | https://arxiv.org/pdf/1905.01416v3.pdf |
PWC | https://paperswithcode.com/paper/sinreq-generalized-sinusoidal-regularization |
Repo | |
Framework | |
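The SinReQ abstract above describes adding a periodic term to the training objective so that weights drift toward quantization levels. The following is a minimal sketch of that idea, not the authors' implementation: it assumes uniform quantization levels at integer multiples of a hypothetical step size `delta`, and uses `sin^2(pi * w / delta)` as the periodic term. The function names, `strength`, and `lr` are illustrative assumptions.

```python
import math

def sin_regularizer(weights, delta):
    """Sum of sin^2(pi * w / delta); it vanishes when every w is a multiple of delta.

    Assumption: quantization levels sit at integer multiples of `delta`.
    """
    return sum(math.sin(math.pi * w / delta) ** 2 for w in weights)

def sin_regularizer_grad(weights, delta):
    """Gradient of the term above, using d/dw sin^2(a w) = a * sin(2 a w)."""
    a = math.pi / delta
    return [a * math.sin(2 * a * w) for w in weights]

def regularized_step(weights, task_grad, delta, lr=0.01, strength=0.1):
    """One gradient-descent step on (task loss + strength * periodic term).

    `task_grad` is the gradient of the original objective, supplied by the caller.
    """
    reg_grad = sin_regularizer_grad(weights, delta)
    return [w - lr * (g + strength * r)
            for w, g, r in zip(weights, task_grad, reg_grad)]
```

With a zero task gradient, repeated calls to `regularized_step` move a weight such as 0.3 toward the nearest level (0.25 when `delta` is 0.25). In real training the task gradient competes with this pull, and `strength` sets the balance between the two.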
Chatter Detection in Turning Using Machine Learning and Similarity Measures of Time Series via Dynamic Time Warping
Title | Chatter Detection in Turning Using Machine Learning and Similarity Measures of Time Series via Dynamic Time Warping |
Authors | Melih C. Yesilli, Firas A. Khasawneh, Andreas Otto |
Abstract | Chatter detection from sensor signals has been an active field of research. While some success has been reported using several featurization tools and machine learning algorithms, existing methods have several drawbacks such as manual preprocessing and requiring a large data set. In this paper, we present an alternative approach for chatter detection based on the K-Nearest Neighbor (kNN) algorithm for classification and Dynamic Time Warping (DTW) as a time series similarity measure. The time series used are the acceleration signals acquired from the tool holder in a series of turning experiments. Our results show that this approach achieves detection accuracies that in most cases outperform existing methods. We compare our results to the traditional methods based on Wavelet Packet Transform (WPT) and the Ensemble Empirical Mode Decomposition (EEMD), as well as to the more recent Topological Data Analysis (TDA) based approach. We show that in three out of four cutting configurations our DTW-based approach attains the highest average classification rate, reaching in one case as high as 99% accuracy. Our approach does not require feature extraction, is capable of reusing a classifier across different cutting configurations, and uses reasonably sized training sets. Although the resulting high accuracy in our approach is associated with high computational cost, this is specific to the DTW implementation that we used. Specifically, we highlight available, very fast DTW implementations that can even be implemented on small consumer electronics. Therefore, further code optimization and the significantly reduced computational effort during the implementation phase make our approach a viable option for in-process chatter detection. |
Tasks | Time Series, Topological Data Analysis |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01678v1 | https://arxiv.org/pdf/1908.01678v1.pdf |
PWC | https://paperswithcode.com/paper/chatter-detection-in-turning-using-machine |
Repo | |
Framework | |
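The chatter-detection abstract above combines a kNN classifier with DTW as the similarity measure. A minimal sketch of that combination is given below, assuming one-dimensional series, an absolute-difference local cost, and the textbook quadratic-time DTW recurrence. The paper's actual implementation and parameters are not given in this listing, and the labels in the usage note are made up for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (an assumption)
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def knn_predict(query, train_series, train_labels, k=1):
    """Label `query` by majority vote among its k DTW-nearest training series."""
    ranked = sorted(range(len(train_series)),
                    key=lambda i: dtw_distance(query, train_series[i]))
    votes = {}
    for i in ranked[:k]:
        votes[train_labels[i]] = votes.get(train_labels[i], 0) + 1
    return max(votes, key=votes.get)
```

For example, `knn_predict([0, 1, 0, 1], [[0, 0, 0, 0], [0, 1, 0, 1, 0, 1]], ["stable", "chatter"])` picks the oscillating training series as the nearest neighbour, even though its length differs from the query. No feature extraction step is involved, which matches the claim in the abstract.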
Learning the optimal state-feedback via supervised imitation learning
Title | Learning the optimal state-feedback via supervised imitation learning |
Authors | Dharmesh Tailor, Dario Izzo |
Abstract | Imitation learning is a control design paradigm that seeks to learn a control policy reproducing demonstrations from expert agents. By substituting expert demonstrations for optimal behaviours, the same paradigm leads to the design of control policies closely approximating the optimal state-feedback. This approach requires training a machine learning algorithm (in our case deep neural networks) directly on state-control pairs originating from optimal trajectories. We have shown in previous work that, when restricted to low-dimensional state and control spaces, this approach is very successful in several deterministic, non-linear problems in continuous-time. In this work, we refine our previous studies using as a test case a simple quadcopter model with quadratic and time-optimal objective functions. We describe in detail the best learning pipeline we have developed, which is able to approximate the state-feedback map via deep neural networks to a very high accuracy. We introduce the use of the softplus activation function in the hidden units of neural networks, showing that it results in a smoother control profile whilst retaining the benefits of rectifiers. We show how to evaluate the optimality of the trained state-feedback, and find that already with two layers the objective function reached and its optimal value differ by less than one percent. We later also consider an additional metric linked to the system's asymptotic behaviour - time taken to converge to the policy’s fixed point. With respect to these metrics, we show that improvements in the mean absolute error do not necessarily correspond to better policies. |
Tasks | Imitation Learning |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.02369v2 | http://arxiv.org/pdf/1901.02369v2.pdf |
PWC | https://paperswithcode.com/paper/learning-the-optimal-state-feedback-via |
Repo | |
Framework | |
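The abstract above credits the softplus activation with a smoother control profile than rectifiers. As a small illustration of why, the sketch below compares the two functions and their derivatives. Only the standard definitions are used; the function names are illustrative and nothing here is taken from the authors' pipeline.

```python
import math

def relu(x):
    """Rectifier: piecewise linear with a kink at zero."""
    return max(0.0, x)

def softplus(x):
    """log(1 + e^x), written in a form that does not overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def softplus_grad(x):
    """Derivative of softplus: the logistic sigmoid, continuous everywhere."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)
```

The derivative of `relu` jumps from 0 to 1 at the origin, while `softplus_grad` passes smoothly through 0.5 there. Far from the origin the two activations nearly coincide, which is one way to read "retaining the benefits of rectifiers".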
A Characteristic Function Approach to Deep Implicit Generative Modeling
Title | A Characteristic Function Approach to Deep Implicit Generative Modeling |
Authors | Abdul Fatir Ansari, Jonathan Scarlett, Harold Soh |
Abstract | In this paper, we formulate the problem of learning an Implicit Generative Model (IGM) as minimizing the expected distance between characteristic functions. Specifically, we match the characteristic functions of the real and generated data distributions under a suitably-chosen weighting distribution. This distance measure, which we term the characteristic function distance (CFD), can be (approximately) computed with linear time-complexity in the number of samples, compared to the quadratic-time Maximum Mean Discrepancy (MMD). By replacing the discrepancy measure in the critic of a GAN with the CFD, we obtain a model that is simple to implement and stable to train; the proposed metric enjoys desirable theoretical properties including continuity and differentiability with respect to generator parameters, and continuity in the weak topology. We further propose a variation of the CFD in which the weighting distribution parameters are also optimized during training; this obviates the need for manual tuning and leads to an improvement in test power relative to CFD. Experiments show that our proposed method outperforms WGAN and MMD-GAN variants on a variety of unsupervised image generation benchmark datasets. |
Tasks | Image Generation |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07425v1 | https://arxiv.org/pdf/1909.07425v1.pdf |
PWC | https://paperswithcode.com/paper/a-characteristic-function-approach-to-deep |
Repo | |
Framework | |
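The abstract above defines the CFD as an expected distance between characteristic functions under a weighting distribution, computable in time linear in the number of samples. The sketch below estimates a squared CFD for scalar data under two assumptions that are not confirmed by this listing: the weighting distribution is a zero-mean Gaussian, and the expectation is approximated with a small set of sampled frequencies. All function names and defaults are illustrative.

```python
import cmath
import random

def empirical_cf(samples, t):
    """Empirical characteristic function: the mean of exp(i * t * x) over the samples."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)

def sample_frequencies(k, scale=1.0, rng=random):
    """Draw k frequencies from a Gaussian weighting distribution (an assumption)."""
    return [rng.gauss(0.0, scale) for _ in range(k)]

def cfd_squared(xs, ys, freqs):
    """Monte Carlo estimate of E_t |phi_x(t) - phi_y(t)|^2 over the given frequencies.

    The cost is O(len(freqs) * (len(xs) + len(ys))), so linear in the sample count.
    """
    total = 0.0
    for t in freqs:
        total += abs(empirical_cf(xs, t) - empirical_cf(ys, t)) ** 2
    return total / len(freqs)
```

Two sample sets drawn from the same distribution should give a value close to zero, and sets drawn from clearly different distributions a noticeably larger one. Unlike an MMD estimate, no pairwise comparison between samples is needed.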
Neural Density Estimation and Likelihood-free Inference
Title | Neural Density Estimation and Likelihood-free Inference |
Authors | George Papamakarios |
Abstract | I consider two problems in machine learning and statistics: the problem of estimating the joint probability density of a collection of random variables, known as density estimation, and the problem of inferring model parameters when their likelihood is intractable, known as likelihood-free inference. The contribution of the thesis is a set of new methods for addressing these problems that are based on recent advances in neural networks and deep learning. |
Tasks | Density Estimation |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13233v1 | https://arxiv.org/pdf/1910.13233v1.pdf |
PWC | https://paperswithcode.com/paper/191013233 |
Repo | |
Framework | |
Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains
Title | Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains |
Authors | Navid Rekabsaz, Nikolaos Pappas, James Henderson, Banriskhem K. Khonglah, Srikanth Madikeri |
Abstract | Neural language modeling (LM) has led to significant improvements in several applications, including Automatic Speech Recognition. However, neural LMs typically require large amounts of training data, which are not available for many domains and languages. In this study, we propose a multilingual neural language model architecture, trained jointly on the domain-specific data of several low-resource languages. The proposed multilingual LM consists of language-specific word embeddings in the encoder and decoder, and one language-specific LSTM layer, plus two LSTM layers with shared parameters across the languages. This multilingual LM facilitates transfer learning across the languages, acting as an extra regularizer in very low-resource scenarios. We integrate our proposed multilingual approach with a state-of-the-art highly-regularized neural LM, and evaluate on the conversational data domain for four languages over a range of training data sizes. Compared to monolingual LMs, the results show significant improvements of our proposed multilingual LM when the amount of available training data is limited, indicating the advantages of cross-lingual parameter sharing in very low-resource language modeling. |
Tasks | Language Modelling, Speech Recognition, Transfer Learning, Word Embeddings |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1906.01496v1 | https://arxiv.org/pdf/1906.01496v1.pdf |
PWC | https://paperswithcode.com/paper/190601496 |
Repo | |
Framework | |
Is Free Choice Permission Admissible in Classical Deontic Logic?
Title | Is Free Choice Permission Admissible in Classical Deontic Logic? |
Authors | Guido Governatori, Antonino Rotolo |
Abstract | In this paper, we explore how, and if, free choice permission (FCP) can be accepted when we consider deontic conflicts between certain types of permissions and obligations. As is well known, FCP can license, under some minimal conditions, the derivation of an indefinite number of permissions. We discuss this and other drawbacks and present six Hilbert-style classical deontic systems admitting a guarded version of FCP. The systems that we present are not too weak from the inferential viewpoint, as far as permission is concerned, and do not commit to weakening any specific logic for obligations. |
Tasks | |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07696v2 | https://arxiv.org/pdf/1905.07696v2.pdf |
PWC | https://paperswithcode.com/paper/is-free-choice-permission-admissible-in |
Repo | |
Framework | |
Refactoring Neural Networks for Verification
Title | Refactoring Neural Networks for Verification |
Authors | David Shriver, Dong Xu, Sebastian Elbaum, Matthew B. Dwyer |
Abstract | Deep neural networks (DNN) are growing in capability and applicability. Their effectiveness has led to their use in safety critical and autonomous systems, yet there is a dearth of cost-effective methods available for reasoning about the behavior of a DNN. In this paper, we seek to expand the applicability and scalability of existing DNN verification techniques through DNN refactoring. A DNN refactoring defines (a) the transformation of the DNN’s architecture, i.e., the number and size of its layers, and (b) the distillation of the learned relationships between the input features and function outputs of the original to train the transformed network. Unlike with traditional code refactoring, DNN refactoring does not guarantee functional equivalence of the two networks, but rather it aims to preserve the accuracy of the original network while producing a simpler network that is amenable to more efficient property verification. We present an automated framework for DNN refactoring, and demonstrate its potential effectiveness through three case studies on networks used in autonomous systems. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.08026v1 | https://arxiv.org/pdf/1908.08026v1.pdf |
PWC | https://paperswithcode.com/paper/refactoring-neural-networks-for-verification |
Repo | |
Framework | |
A Survey on Neural Network Language Models
Title | A Survey on Neural Network Language Models |
Authors | Kun Jing, Jungang Xu |
Abstract | As a core component of Natural Language Processing (NLP) systems, a Language Model (LM) can provide word representations and probability estimates for word sequences. Neural Network Language Models (NNLMs) overcome the curse of dimensionality and improve the performance of traditional LMs. This paper presents a survey of NNLMs. The structure of classic NNLMs is described first, and then some major improvements are introduced and analyzed. We summarize and compare corpora and toolkits of NNLMs. Further, some research directions of NNLMs are discussed. |
Tasks | Language Modelling |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03591v2 | https://arxiv.org/pdf/1906.03591v2.pdf |
PWC | https://paperswithcode.com/paper/a-survey-on-neural-network-language-models |
Repo | |
Framework | |
Simulation of virtual cohorts increases predictive accuracy of cognitive decline in MCI subjects
Title | Simulation of virtual cohorts increases predictive accuracy of cognitive decline in MCI subjects |
Authors | Igor Koval, Stéphanie Allassonnière, Stanley Durrleman |
Abstract | The ability to predict the progression of biomarkers, notably in NDD, is limited by the size of the longitudinal data sets, in terms of number of patients, number of visits per patient and total follow-up time. To this end, we introduce a data augmentation technique that is able to reproduce the variability seen in a longitudinal training data set and simulate continuous biomarker trajectories for any number of virtual patients. Thanks to this simulation framework, we propose to transform the training set into a simulated data set with more patients, more time-points per patient and longer follow-up duration. We illustrate this approach on the prediction of the MMSE of MCI subjects of the ADNI data set. We show that it yields predictions with errors comparable to the noise in the data, estimated in test/retest studies, achieving a 37% improvement in the mean absolute error compared to the same non-augmented model. |
Tasks | Data Augmentation |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.02921v1 | http://arxiv.org/pdf/1904.02921v1.pdf |
PWC | https://paperswithcode.com/paper/simulation-of-virtual-cohorts-increases |
Repo | |
Framework | |
Monitoring stance towards vaccination in Twitter messages
Title | Monitoring stance towards vaccination in Twitter messages |
Authors | Florian Kunneman, Mattijs Lambooij, Albert Wong, Antal van den Bosch, Liesbeth Mollema |
Abstract | We developed a system to automatically classify stance towards vaccination in Twitter messages, with a focus on messages with a negative stance. Such a system makes it possible to monitor the ongoing stream of messages on social media, offering actionable insights into public hesitance with respect to vaccination. For Dutch Twitter messages that mention vaccination-related key terms, we annotated their stance and feeling in relation to vaccination (provided that they referred to this topic). Subsequently, we used these coded data to train and test different machine learning set-ups. With the aim to best identify messages with a negative stance towards vaccination, we compared set-ups at an increasing dataset size and decreasing reliability, at an increasing number of categories to distinguish, and with different classification algorithms. We found that Support Vector Machines trained on a combination of strictly and laxly labeled data with a more fine-grained labeling yielded the best result, at an F1-score of 0.36 and an Area under the ROC curve of 0.66, outperforming a rule-based sentiment analysis baseline that yielded an F1-score of 0.25 and an Area under the ROC curve of 0.57. The outcomes of our study indicate that stance prediction by a computerized system only is a challenging task. Our analysis of the data and behavior of our system suggests that an approach is needed in which the use of a larger training dataset is combined with a setting in which a human-in-the-loop provides the system with feedback on its predictions. |
Tasks | Sentiment Analysis |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00338v1 | https://arxiv.org/pdf/1909.00338v1.pdf |
PWC | https://paperswithcode.com/paper/monitoring-stance-towards-vaccination-in |
Repo | |
Framework | |
Radar Emitter Classification with Attribute-specific Recurrent Neural Networks
Title | Radar Emitter Classification with Attribute-specific Recurrent Neural Networks |
Authors | Paolo Notaro, Magdalini Paschali, Carsten Hopke, David Wittmann, Nassir Navab |
Abstract | Radar pulse streams exhibit increasingly complex temporal patterns, so emitter classification can no longer rely on a purely value-based analysis of the pulse attributes. In this paper, we employ Recurrent Neural Networks (RNNs) to efficiently model and exploit the temporal dependencies present inside pulse streams. With the purpose of enhancing the network prediction capability, we introduce two novel techniques: a per-sequence normalization, able to mine the useful temporal patterns; and attribute-specific RNN processing, capable of processing the extracted information effectively. The new techniques are evaluated with an ablation study and the proposed solution is compared to previous Deep Learning (DL) approaches. Finally, a comparative study on the robustness of the same approaches is conducted and its results are presented. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07683v2 | https://arxiv.org/pdf/1911.07683v2.pdf |
PWC | https://paperswithcode.com/paper/radar-emitter-classification-with-attribute |
Repo | |
Framework | |
Learning to Think Outside the Box: Wide-Baseline Light Field Depth Estimation with EPI-Shift
Title | Learning to Think Outside the Box: Wide-Baseline Light Field Depth Estimation with EPI-Shift |
Authors | Titus Leistner, Hendrik Schilling, Radek Mackowiak, Stefan Gumhold, Carsten Rother |
Abstract | We propose a method for depth estimation from light field data, based on a fully convolutional neural network architecture. Our goal is to design a pipeline which achieves highly accurate results for small- and wide-baseline light fields. Since light field training data is scarce, all learning-based approaches use a small receptive field and operate on small disparity ranges. In order to work with wide-baseline light fields, we introduce the idea of EPI-Shift: virtually shifting the light field stack, which makes it possible to retain a small receptive field, independent of the disparity range. In this way, our approach “learns to think outside the box of the receptive field”. Our network performs joint classification of integer disparities and regression of disparity-offsets. A U-Net component provides excellent long-range smoothing. EPI-Shift considerably outperforms the state-of-the-art learning-based approaches and is on par with hand-crafted methods. We demonstrate this on a publicly available, synthetic, small-baseline benchmark and on large-baseline real-world recordings. |
Tasks | Depth Estimation |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09059v1 | https://arxiv.org/pdf/1909.09059v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-think-outside-the-box-wide |
Repo | |
Framework | |
A geometry-inspired decision-based attack
Title | A geometry-inspired decision-based attack |
Authors | Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard |
Abstract | Deep neural networks have recently achieved tremendous success in image classification. Recent studies have however shown that they are easily misled into incorrect classification decisions by adversarial examples. Adversaries can even craft attacks by querying the model in black-box settings, where no information about the model is released except its final decision. Such decision-based attacks usually require lots of queries, while real-world image recognition systems might actually restrict the number of queries. In this paper, we propose qFool, a novel decision-based attack algorithm that can generate adversarial examples using a small number of queries. The qFool method can drastically reduce the number of queries compared to previous decision-based attacks while reaching the same quality of adversarial examples. We also enhance our method by constraining adversarial perturbations to a low-frequency subspace, which can make qFool even more computationally efficient. Altogether, we manage to fool commercial image recognition systems with a small number of queries, which demonstrates the actual effectiveness of our new algorithm in practice. |
Tasks | Image Classification |
Published | 2019-03-26 |
URL | http://arxiv.org/abs/1903.10826v1 | http://arxiv.org/pdf/1903.10826v1.pdf |
PWC | https://paperswithcode.com/paper/a-geometry-inspired-decision-based-attack |
Repo | |
Framework | |
Towards Automated Biometric Identification of Sea Turtles (Chelonia mydas)
Title | Towards Automated Biometric Identification of Sea Turtles (Chelonia mydas) |
Authors | Irwandi Hipiny, Hamimah Ujir, Aazani Mujahid, Nurhartini Kamalia Yahya |
Abstract | Passive biometric identification enables wildlife monitoring with minimal disturbance. Using a motion-activated camera placed at an elevated position and facing downwards, we collected images of sea turtle carapaces, each belonging to one of sixteen Chelonia mydas juveniles. We then learned co-variant and robust image descriptors from these images, enabling indexing and retrieval. In this work, we present several classification results of sea turtle carapaces using the learned image descriptors. We found that a template-based descriptor, i.e., Histogram of Oriented Gradients (HOG), performed markedly better during classification than keypoint-based descriptors. For our dataset, a high-dimensional descriptor is a must due to the minimal gradient and color information inside the carapace images. Using HOG, we obtained an average classification accuracy of 65%. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11277v1 | https://arxiv.org/pdf/1909.11277v1.pdf |
PWC | https://paperswithcode.com/paper/towards-automated-biometric-identification-of |
Repo | |
Framework | |