January 26, 2020

Paper Group ANR 1573

SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training. Chatter Detection in Turning Using Machine Learning and Similarity Measures of Time Series via Dynamic Time Warping. Learning the optimal state-feedback via supervised imitation learning. A Characteristic Function Approach to Deep Implicit Generative Modeling. N …

SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training


Title	SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training
Authors	Ahmed T. Elthakeb, Prannoy Pilligundla, Hadi Esmaeilzadeh
Abstract	Deep quantization of neural networks (below eight bits) offers significant promise in reducing their compute and storage cost. Albeit alluring, without special techniques for training and optimization, deep quantization results in significant accuracy loss. To further mitigate this loss, we propose a novel sinusoidal regularization, called SinReQ, for deep quantized training. SinReQ adds a periodic term to the original objective function of the underlying training algorithm. SinReQ exploits the periodicity, differentiability, and the desired convexity profile in sinusoidal functions to automatically propel weights towards values that are inherently closer to quantization levels. Since this technique does not require invasive changes to the training procedure, SinReQ can harmoniously enhance quantized training algorithms. SinReQ offers generality and flexibility as it is not limited to a certain bitwidth or a uniform assignment of bitwidths across layers. We carry out experimentation using the AlexNet, CIFAR-10, ResNet-18, ResNet-20, SVHN, and VGG-11 DNNs with three to five bits for quantization and show the versatility of SinReQ in enhancing multiple quantized training algorithms, DoReFa [32] and WRPN [24]. Averaging across all the bit configurations shows that SinReQ closes the accuracy gap between these two techniques and the full-precision runs by 32.4% and 27.5%, respectively. That is, it improves the absolute accuracy of DoReFa and WRPN by 2.8% and 2.1%, respectively.
Tasks	Quantization
Published	2019-05-04
URL	https://arxiv.org/abs/1905.01416v3
PDF	https://arxiv.org/pdf/1905.01416v3.pdf
PWC	https://paperswithcode.com/paper/sinreq-generalized-sinusoidal-regularization
Repo
Framework
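
The idea of a periodic penalty that pulls weights toward quantization levels can be illustrated with a minimal sketch. It assumes a uniform quantization grid with spacing `step` and uses a sin² penalty whose minima sit exactly on that grid; the function names, the learning rate, and the exact form of the penalty are illustrative assumptions, not the paper's implementation.

```python
import math

def sin_reg(weights, step):
    """Periodic penalty that is zero when every weight is a multiple of `step`."""
    return sum(math.sin(math.pi * w / step) ** 2 for w in weights)

def sin_reg_grad(weights, step):
    # d/dw sin^2(pi * w / step) = (pi / step) * sin(2 * pi * w / step)
    return [(math.pi / step) * math.sin(2 * math.pi * w / step) for w in weights]

def regularize(weights, step, lr=0.001, iters=500):
    """Plain gradient descent on the penalty alone (no task loss, for illustration)."""
    w = list(weights)
    for _ in range(iters):
        grad = sin_reg_grad(w, step)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

if __name__ == "__main__":
    start = [0.3, -0.1, 0.6]
    print(regularize(start, step=0.25))
```

In actual training such a term would be added to the task loss with a weighting coefficient, so the weights settle near, rather than exactly on, the quantization levels.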

Chatter Detection in Turning Using Machine Learning and Similarity Measures of Time Series via Dynamic Time Warping


Title	Chatter Detection in Turning Using Machine Learning and Similarity Measures of Time Series via Dynamic Time Warping
Authors	Melih C. Yesilli, Firas A. Khasawneh, Andreas Otto
Abstract	Chatter detection from sensor signals has been an active field of research. While some success has been reported using several featurization tools and machine learning algorithms, existing methods have several drawbacks such as manual preprocessing and requiring a large data set. In this paper, we present an alternative approach for chatter detection based on the K-Nearest Neighbor (kNN) algorithm for classification and Dynamic Time Warping (DTW) as a time series similarity measure. The time series used are the acceleration signals acquired from the tool holder in a series of turning experiments. Our results show that this approach achieves detection accuracies that in most cases outperform existing methods. We compare our results to the traditional methods based on Wavelet Packet Transform (WPT) and the Ensemble Empirical Mode Decomposition (EEMD), as well as to the more recent Topological Data Analysis (TDA) based approach. We show that in three out of four cutting configurations our DTW-based approach attains the highest average classification rate, reaching in one case as high as 99% accuracy. Our approach does not require feature extraction, is capable of reusing a classifier across different cutting configurations, and uses reasonably sized training sets. Although the resulting high accuracy in our approach is associated with high computational cost, this is specific to the DTW implementation that we used. Specifically, we highlight available, very fast DTW implementations that can even be implemented on small consumer electronics. Therefore, further code optimization and the significantly reduced computational effort during the implementation phase make our approach a viable option for in-process chatter detection.
Tasks	Time Series, Topological Data Analysis
Published	2019-08-05
URL	https://arxiv.org/abs/1908.01678v1
PDF	https://arxiv.org/pdf/1908.01678v1.pdf
PWC	https://paperswithcode.com/paper/chatter-detection-in-turning-using-machine
Repo
Framework
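
The combination of DTW as a similarity measure with a nearest-neighbor classifier can be sketched as follows. This is a textbook quadratic-time DTW with an absolute-difference local cost, not the implementation used by the authors; the toy signals and the labels `"stable"` and `"chatter"` are made up for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def knn_predict(train, query, k=1):
    """Majority label among the k training series closest to `query` under DTW."""
    ranked = sorted(train, key=lambda item: dtw_distance(item[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

if __name__ == "__main__":
    train = [
        ([0.0, 0.1, 0.0, -0.1, 0.0], "stable"),
        ([0.0, 1.0, -1.0, 1.0, -1.0], "chatter"),
    ]
    print(knn_predict(train, [0.0, 0.0, 0.9, -1.1, 1.0, -0.9]))
```

No feature extraction is involved: the raw series are compared directly, which is why the cost of each DTW call dominates the run time.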

Learning the optimal state-feedback via supervised imitation learning


Title	Learning the optimal state-feedback via supervised imitation learning
Authors	Dharmesh Tailor, Dario Izzo
Abstract	Imitation learning is a control design paradigm that seeks to learn a control policy reproducing demonstrations from expert agents. By substituting expert demonstrations for optimal behaviours, the same paradigm leads to the design of control policies closely approximating the optimal state-feedback. This approach requires training a machine learning algorithm (in our case deep neural networks) directly on state-control pairs originating from optimal trajectories. We have shown in previous work that, when restricted to low-dimensional state and control spaces, this approach is very successful in several deterministic, non-linear problems in continuous-time. In this work, we refine our previous studies using as a test case a simple quadcopter model with quadratic and time-optimal objective functions. We describe in detail the best learning pipeline we have developed, which is able to approximate the state-feedback map via deep neural networks to a very high accuracy. We introduce the use of the softplus activation function in the hidden units of neural networks, showing that it results in a smoother control profile whilst retaining the benefits of rectifiers. We show how to evaluate the optimality of the trained state-feedback, and find that already with two layers the objective function reached and its optimal value differ by less than one percent. We also consider an additional metric linked to the system's asymptotic behaviour: the time taken to converge to the policy’s fixed point. With respect to these metrics, we show that improvements in the mean absolute error do not necessarily correspond to better policies.
Tasks	Imitation Learning
Published	2019-01-07
URL	http://arxiv.org/abs/1901.02369v2
PDF	http://arxiv.org/pdf/1901.02369v2.pdf
PWC	https://paperswithcode.com/paper/learning-the-optimal-state-feedback-via
Repo
Framework
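
The softplus activation mentioned in the abstract is the function log(1 + exp(x)), a smooth counterpart of the rectifier max(x, 0). The sketch below only illustrates that relationship; the helper names are illustrative and nothing here reflects the authors' network code.

```python
import math

def softplus(x):
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def relu(x):
    return max(x, 0.0)

def softplus_slope(x):
    # The derivative of softplus is the logistic sigmoid, written here
    # so that exp() never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, relu(x), round(softplus(x), 4), round(softplus_slope(x), 4))
```

Away from zero, softplus tracks the rectifier closely, while its slope changes continuously through zero instead of jumping from 0 to 1.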

A Characteristic Function Approach to Deep Implicit Generative Modeling


Title	A Characteristic Function Approach to Deep Implicit Generative Modeling
Authors	Abdul Fatir Ansari, Jonathan Scarlett, Harold Soh
Abstract	In this paper, we formulate the problem of learning an Implicit Generative Model (IGM) as minimizing the expected distance between characteristic functions. Specifically, we match the characteristic functions of the real and generated data distributions under a suitably-chosen weighting distribution. This distance measure, which we term the characteristic function distance (CFD), can be (approximately) computed with linear time-complexity in the number of samples, compared to the quadratic-time Maximum Mean Discrepancy (MMD). By replacing the discrepancy measure in the critic of a GAN with the CFD, we obtain a model that is simple to implement and stable to train; the proposed metric enjoys desirable theoretical properties including continuity and differentiability with respect to generator parameters, and continuity in the weak topology. We further propose a variation of the CFD in which the weighting distribution parameters are also optimized during training; this obviates the need for manual tuning and leads to an improvement in test power relative to CFD. Experiments show that our proposed method outperforms WGAN and MMD-GAN variants on a variety of unsupervised image generation benchmark datasets.
Tasks	Image Generation
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07425v1
PDF	https://arxiv.org/pdf/1909.07425v1.pdf
PWC	https://paperswithcode.com/paper/a-characteristic-function-approach-to-deep
Repo
Framework
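
A rough sketch of a characteristic-function distance between two one-dimensional samples is given below. It assumes a Gaussian weighting distribution over frequencies and a plain Monte Carlo average; the function names, the scalar setting, and the parameters `num_freqs`, `scale`, and `seed` are illustrative assumptions rather than the paper's formulation. For a fixed number of frequencies, the cost grows linearly with the number of samples.

```python
import cmath
import random

def ecf(samples, t):
    """Empirical characteristic function: mean of exp(i * t * x) over the samples."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)

def cfd_squared(xs, ys, num_freqs=64, scale=1.0, seed=0):
    """Monte Carlo estimate of E_t |phi_x(t) - phi_y(t)|^2 with t ~ N(0, scale^2)."""
    rng = random.Random(seed)
    freqs = [rng.gauss(0.0, scale) for _ in range(num_freqs)]
    return sum(abs(ecf(xs, t) - ecf(ys, t)) ** 2 for t in freqs) / num_freqs

if __name__ == "__main__":
    rng = random.Random(1)
    real = [rng.gauss(0.0, 1.0) for _ in range(500)]
    same = [rng.gauss(0.0, 1.0) for _ in range(500)]
    shifted = [rng.gauss(3.0, 1.0) for _ in range(500)]
    print(cfd_squared(real, same), cfd_squared(real, shifted))
```

In a GAN setting, a quantity like this would be computed on critic features rather than raw scalars, and minimized with respect to the generator.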

Neural Density Estimation and Likelihood-free Inference


Title	Neural Density Estimation and Likelihood-free Inference
Authors	George Papamakarios
Abstract	I consider two problems in machine learning and statistics: the problem of estimating the joint probability density of a collection of random variables, known as density estimation, and the problem of inferring model parameters when their likelihood is intractable, known as likelihood-free inference. The contribution of the thesis is a set of new methods for addressing these problems that are based on recent advances in neural networks and deep learning.
Tasks	Density Estimation
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13233v1
PDF	https://arxiv.org/pdf/1910.13233v1.pdf
PWC	https://paperswithcode.com/paper/191013233
Repo
Framework

Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains


Title	Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains
Authors	Navid Rekabsaz, Nikolaos Pappas, James Henderson, Banriskhem K. Khonglah, Srikanth Madikeri
Abstract	Neural language modeling (LM) has led to significant improvements in several applications, including Automatic Speech Recognition. However, neural LMs typically require large amounts of training data, which are not available for many domains and languages. In this study, we propose a multilingual neural language model architecture, trained jointly on the domain-specific data of several low-resource languages. The proposed multilingual LM consists of language-specific word embeddings in the encoder and decoder, and one language-specific LSTM layer, plus two LSTM layers with shared parameters across the languages. This multilingual LM facilitates transfer learning across the languages, acting as an extra regularizer in very low-resource scenarios. We integrate our proposed multilingual approach with a state-of-the-art highly-regularized neural LM, and evaluate on the conversational data domain for four languages over a range of training data sizes. Compared to monolingual LMs, the results show significant improvements of our proposed multilingual LM when the amount of available training data is limited, indicating the advantages of cross-lingual parameter sharing in very low-resource language modeling.
Tasks	Language Modelling, Speech Recognition, Transfer Learning, Word Embeddings
Published	2019-05-29
URL	https://arxiv.org/abs/1906.01496v1
PDF	https://arxiv.org/pdf/1906.01496v1.pdf
PWC	https://paperswithcode.com/paper/190601496
Repo
Framework

Is Free Choice Permission Admissible in Classical Deontic Logic?


Title	Is Free Choice Permission Admissible in Classical Deontic Logic?
Authors	Guido Governatori, Antonino Rotolo
Abstract	In this paper, we explore how, and if, free choice permission (FCP) can be accepted when we consider deontic conflicts between certain types of permissions and obligations. As is well known, FCP can license, under some minimal conditions, the derivation of an indefinite number of permissions. We discuss this and other drawbacks and present six Hilbert-style classical deontic systems admitting a guarded version of FCP. The systems that we present are not too weak from the inferential viewpoint, as far as permission is concerned, and do not commit to weakening any specific logic for obligations.
Tasks
Published	2019-05-19
URL	https://arxiv.org/abs/1905.07696v2
PDF	https://arxiv.org/pdf/1905.07696v2.pdf
PWC	https://paperswithcode.com/paper/is-free-choice-permission-admissible-in
Repo
Framework

Refactoring Neural Networks for Verification


Title	Refactoring Neural Networks for Verification
Authors	David Shriver, Dong Xu, Sebastian Elbaum, Matthew B. Dwyer
Abstract	Deep neural networks (DNN) are growing in capability and applicability. Their effectiveness has led to their use in safety critical and autonomous systems, yet there is a dearth of cost-effective methods available for reasoning about the behavior of a DNN. In this paper, we seek to expand the applicability and scalability of existing DNN verification techniques through DNN refactoring. A DNN refactoring defines (a) the transformation of the DNN’s architecture, i.e., the number and size of its layers, and (b) the distillation of the learned relationships between the input features and function outputs of the original to train the transformed network. Unlike with traditional code refactoring, DNN refactoring does not guarantee functional equivalence of the two networks, but rather it aims to preserve the accuracy of the original network while producing a simpler network that is amenable to more efficient property verification. We present an automated framework for DNN refactoring, and demonstrate its potential effectiveness through three case studies on networks used in autonomous systems.
Tasks
Published	2019-08-06
URL	https://arxiv.org/abs/1908.08026v1
PDF	https://arxiv.org/pdf/1908.08026v1.pdf
PWC	https://paperswithcode.com/paper/refactoring-neural-networks-for-verification
Repo
Framework

A Survey on Neural Network Language Models


Title	A Survey on Neural Network Language Models
Authors	Kun Jing, Jungang Xu
Abstract	As a core component of Natural Language Processing (NLP) systems, a Language Model (LM) can provide word representations and probability estimates for word sequences. Neural Network Language Models (NNLMs) overcome the curse of dimensionality and improve the performance of traditional LMs. This paper presents a survey of NNLMs. The structure of classic NNLMs is described first, and then some major improvements are introduced and analyzed. We summarize and compare corpora and toolkits of NNLMs. Further, some research directions of NNLMs are discussed.
Tasks	Language Modelling
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03591v2
PDF	https://arxiv.org/pdf/1906.03591v2.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-neural-network-language-models
Repo
Framework

Simulation of virtual cohorts increases predictive accuracy of cognitive decline in MCI subjects


Title	Simulation of virtual cohorts increases predictive accuracy of cognitive decline in MCI subjects
Authors	Igor Koval, Stéphanie Allassonnière, Stanley Durrleman
Abstract	The ability to predict the progression of biomarkers, notably in neurodegenerative diseases (NDD), is limited by the size of the longitudinal data sets, in terms of number of patients, number of visits per patient and total follow-up time. To this end, we introduce a data augmentation technique that is able to reproduce the variability seen in a longitudinal training data set and simulate continuous biomarker trajectories for any number of virtual patients. Thanks to this simulation framework, we propose to transform the training set into a simulated data set with more patients, more time-points per patient and longer follow-up duration. We illustrate this approach on the prediction of the MMSE of MCI subjects of the ADNI data set. We show that it yields predictions with errors comparable to the noise in the data, estimated in test/retest studies, achieving a 37% improvement in the mean absolute error compared to the same non-augmented model.
Tasks	Data Augmentation
Published	2019-04-05
URL	http://arxiv.org/abs/1904.02921v1
PDF	http://arxiv.org/pdf/1904.02921v1.pdf
PWC	https://paperswithcode.com/paper/simulation-of-virtual-cohorts-increases
Repo
Framework

Monitoring stance towards vaccination in Twitter messages


Title	Monitoring stance towards vaccination in Twitter messages
Authors	Florian Kunneman, Mattijs Lambooij, Albert Wong, Antal van den Bosch, Liesbeth Mollema
Abstract	We developed a system to automatically classify stance towards vaccination in Twitter messages, with a focus on messages with a negative stance. Such a system makes it possible to monitor the ongoing stream of messages on social media, offering actionable insights into public hesitance with respect to vaccination. For Dutch Twitter messages that mention vaccination-related key terms, we annotated their stance and feeling in relation to vaccination (provided that they referred to this topic). Subsequently, we used these coded data to train and test different machine learning set-ups. With the aim to best identify messages with a negative stance towards vaccination, we compared set-ups at an increasing dataset size and decreasing reliability, at an increasing number of categories to distinguish, and with different classification algorithms. We found that Support Vector Machines trained on a combination of strictly and laxly labeled data with a more fine-grained labeling yielded the best result, at an F1-score of 0.36 and an Area under the ROC curve of 0.66, outperforming a rule-based sentiment analysis baseline that yielded an F1-score of 0.25 and an Area under the ROC curve of 0.57. The outcomes of our study indicate that stance prediction by a computerized system only is a challenging task. Our analysis of the data and behavior of our system suggests that an approach is needed in which the use of a larger training dataset is combined with a setting in which a human-in-the-loop provides the system with feedback on its predictions.
Tasks	Sentiment Analysis
Published	2019-09-01
URL	https://arxiv.org/abs/1909.00338v1
PDF	https://arxiv.org/pdf/1909.00338v1.pdf
PWC	https://paperswithcode.com/paper/monitoring-stance-towards-vaccination-in
Repo
Framework
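
For readers unfamiliar with the reported metric, the F1-score of a single target class (here, the negative stance) is the harmonic mean of precision and recall for that class. The sketch below uses made-up labels and a hypothetical helper name; it is not the authors' evaluation code.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one target class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    # Hypothetical gold labels and predictions for six messages.
    gold = ["neg", "neg", "neg", "other", "other", "other"]
    pred = ["neg", "neg", "other", "neg", "other", "other"]
    print(f1_score(gold, pred, positive="neg"))
```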

Radar Emitter Classification with Attribute-specific Recurrent Neural Networks


Title	Radar Emitter Classification with Attribute-specific Recurrent Neural Networks
Authors	Paolo Notaro, Magdalini Paschali, Carsten Hopke, David Wittmann, Nassir Navab
Abstract	Radar pulse streams exhibit increasingly complex temporal patterns, and emitter classification can no longer rely on a purely value-based analysis of the pulse attributes. In this paper, we employ Recurrent Neural Networks (RNNs) to efficiently model and exploit the temporal dependencies present inside pulse streams. With the purpose of enhancing the network prediction capability, we introduce two novel techniques: a per-sequence normalization, able to mine the useful temporal patterns; and attribute-specific RNN processing, capable of processing the extracted information effectively. The new techniques are evaluated with an ablation study and the proposed solution is compared to previous Deep Learning (DL) approaches. Finally, a comparative study on the robustness of the same approaches is conducted and its results are presented.
Tasks
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07683v2
PDF	https://arxiv.org/pdf/1911.07683v2.pdf
PWC	https://paperswithcode.com/paper/radar-emitter-classification-with-attribute
Repo
Framework
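
One plausible reading of per-sequence normalization is to standardize each attribute using statistics computed within a single pulse sequence, so that the temporal pattern remains while the absolute level is removed. The sketch below assumes a z-score normalization and a made-up attribute (pulse repetition intervals); the paper's exact scheme may differ.

```python
import statistics

def normalize_sequence(values):
    """Z-score one attribute using only the statistics of this sequence."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        # A constant sequence carries no temporal pattern.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

if __name__ == "__main__":
    # Two hypothetical emitters with the same alternating pattern
    # at very different absolute values.
    emitter_a = [1000.0, 1010.0, 1000.0, 1010.0]
    emitter_b = [10.0, 20.0, 10.0, 20.0]
    print(normalize_sequence(emitter_a))
    print(normalize_sequence(emitter_b))
```

After normalization both sequences map to the same alternating pattern, which is the kind of information a recurrent model can then exploit.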

Learning to Think Outside the Box: Wide-Baseline Light Field Depth Estimation with EPI-Shift


Title	Learning to Think Outside the Box: Wide-Baseline Light Field Depth Estimation with EPI-Shift
Authors	Titus Leistner, Hendrik Schilling, Radek Mackowiak, Stefan Gumhold, Carsten Rother
Abstract	We propose a method for depth estimation from light field data, based on a fully convolutional neural network architecture. Our goal is to design a pipeline which achieves highly accurate results for small- and wide-baseline light fields. Since light field training data is scarce, all learning-based approaches use a small receptive field and operate on small disparity ranges. In order to work with wide-baseline light fields, we introduce the idea of EPI-Shift: virtually shifting the light field stack, which makes it possible to retain a small receptive field, independent of the disparity range. In this way, our approach “learns to think outside the box of the receptive field”. Our network performs joint classification of integer disparities and regression of disparity-offsets. A U-Net component provides excellent long-range smoothing. EPI-Shift considerably outperforms the state-of-the-art learning-based approaches and is on par with hand-crafted methods. We demonstrate this on a publicly available, synthetic, small-baseline benchmark and on large-baseline real-world recordings.
Tasks	Depth Estimation
Published	2019-09-19
URL	https://arxiv.org/abs/1909.09059v1
PDF	https://arxiv.org/pdf/1909.09059v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-think-outside-the-box-wide
Repo
Framework

A geometry-inspired decision-based attack


Title	A geometry-inspired decision-based attack
Authors	Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard
Abstract	Deep neural networks have recently achieved tremendous success in image classification. Recent studies have however shown that they are easily misled into incorrect classification decisions by adversarial examples. Adversaries can even craft attacks by querying the model in black-box settings, where no information about the model is released except its final decision. Such decision-based attacks usually require lots of queries, while real-world image recognition systems might actually restrict the number of queries. In this paper, we propose qFool, a novel decision-based attack algorithm that can generate adversarial examples using a small number of queries. The qFool method can drastically reduce the number of queries compared to previous decision-based attacks while reaching the same quality of adversarial examples. We also enhance our method by constraining adversarial perturbations to a low-frequency subspace, which can make qFool even more computationally efficient. Altogether, we manage to fool commercial image recognition systems with a small number of queries, which demonstrates the actual effectiveness of our new algorithm in practice.
Tasks	Image Classification
Published	2019-03-26
URL	http://arxiv.org/abs/1903.10826v1
PDF	http://arxiv.org/pdf/1903.10826v1.pdf
PWC	https://paperswithcode.com/paper/a-geometry-inspired-decision-based-attack
Repo
Framework

Towards Automated Biometric Identification of Sea Turtles (Chelonia mydas)


Title	Towards Automated Biometric Identification of Sea Turtles (Chelonia mydas)
Authors	Irwandi Hipiny, Hamimah Ujir, Aazani Mujahid, Nurhartini Kamalia Yahya
Abstract	Passive biometric identification enables wildlife monitoring with minimal disturbance. Using a motion-activated camera placed at an elevated position and facing downwards, we collected images of sea turtle carapaces, each belonging to one of sixteen Chelonia mydas juveniles. We then learned co-variant and robust image descriptors from these images, enabling indexing and retrieval. In this work, we present several classification results of sea turtle carapaces using the learned image descriptors. We found that a template-based descriptor, i.e., Histogram of Oriented Gradients (HOG), performed considerably better during classification than keypoint-based descriptors. For our dataset, a high-dimensional descriptor is a must due to the minimal gradient and color information inside the carapace images. Using HOG, we obtained an average classification accuracy of 65%.
Tasks
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11277v1
PDF	https://arxiv.org/pdf/1909.11277v1.pdf
PWC	https://paperswithcode.com/paper/towards-automated-biometric-identification-of
Repo
Framework