January 29, 2020


Paper Group ANR 592

Overparameterized Neural Networks Can Implement Associative Memory. Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition. Design and Interpretation of Universal Adversarial Patches in Face Detection. Automobile Theft Detection by Clustering Owner Driver Data. Mitigation of Adversarial Examples in RF Deep Classifiers …

Overparameterized Neural Networks Can Implement Associative Memory


Title	Overparameterized Neural Networks Can Implement Associative Memory
Authors	Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler
Abstract	Identifying computational mechanisms for memorization and retrieval is a long-standing problem at the intersection of machine learning and neuroscience. In this work, we demonstrate empirically that overparameterized deep neural networks trained using standard optimization methods provide a mechanism for memorization and retrieval of real-valued data. In particular, we show that overparameterized autoencoders store training examples as attractors, and thus, can be viewed as implementations of associative memory with the retrieval mechanism given by iterating the map. We study this phenomenon under a variety of common architectures and optimization methods and construct a network that can recall 500 real-valued images without any apparent spurious attractor states. Lastly, we demonstrate how the same mechanism allows encoding sequences, including movies and audio, instead of individual examples. Interestingly, this appears to provide an even more efficient mechanism for storage and retrieval than autoencoding single instances.
Tasks
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12362v1
PDF	https://arxiv.org/pdf/1909.12362v1.pdf
PWC	https://paperswithcode.com/paper/overparameterized-neural-networks-can
Repo
Framework

Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition


Title	Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition
Authors	Suraj Tripathi, Abhay Kumar, Abhiram Ramesh, Chirag Singh, Promod Yenigalla
Abstract	This paper proposes a Residual Convolutional Neural Network (ResNet) based on speech features and trained under Focal Loss to recognize emotion in speech. Speech features such as Spectrogram and Mel-frequency Cepstral Coefficients (MFCCs) have shown the ability to characterize emotion better than just plain text. Further, Focal Loss, first used in One-Stage Object Detectors, has shown the ability to focus the training process more towards hard examples and down-weight the loss assigned to well-classified examples, thus preventing the model from being overwhelmed by easily classifiable examples.
Tasks	Emotion Recognition, Speech Emotion Recognition
Published	2019-06-11
URL	https://arxiv.org/abs/1906.05682v1
PDF	https://arxiv.org/pdf/1906.05682v1.pdf
PWC	https://paperswithcode.com/paper/focal-loss-based-residual-convolutional
Repo
Framework
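
The down-weighting described in the abstract above can be illustrated with the commonly cited focal loss form, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The following is a minimal sketch only: the values of `gamma` and `alpha` and the example probabilities are assumptions for illustration and are not taken from the paper.

```python
import math

def cross_entropy(probs, target):
    """Standard cross-entropy for one example: -log(p_t)."""
    return -math.log(max(probs[target], 1e-12))

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults, not the paper's settings.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical softmax outputs over three emotion classes; true class is index 1.
easy = [0.05, 0.90, 0.05]  # well-classified example
hard = [0.60, 0.30, 0.10]  # misclassified example

for name, probs in [("easy", easy), ("hard", hard)]:
    ce = cross_entropy(probs, 1)
    fl = focal_loss(probs, 1)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} ratio={fl / ce:.2f}")
```

With `gamma=2`, the well-classified example keeps about 1% of its cross-entropy loss while the hard example keeps about 49%, so hard examples dominate the gradient. Setting `gamma=0` recovers ordinary cross-entropy.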

Design and Interpretation of Universal Adversarial Patches in Face Detection


Title	Design and Interpretation of Universal Adversarial Patches in Face Detection
Authors	Xiao Yang, Fangyun Wei, Hongyang Zhang, Xiang Ming, Jun Zhu
Abstract	We consider universal adversarial patches for faces - small visual elements whose addition to a face image reliably destroys the performance of face detectors. Unlike previous work that mostly focused on the algorithmic design of adversarial examples in terms of improving the success rate as an attacker, in this work we show an interpretation of such patches that can prevent the state-of-the-art face detectors from detecting the real faces. We investigate a phenomenon: patches designed to suppress real face detection appear face-like. This phenomenon holds generally across different initialization, locations, scales of patches, backbones, and state-of-the-art face detection frameworks. We propose new optimization-based approaches to automatic design of universal adversarial patches for varying goals of the attack, including scenarios in which true positives are suppressed without introducing false positives. Our proposed algorithms perform well on real-world datasets, deceiving state-of-the-art face detectors in terms of multiple precision/recall metrics and transferring between different detection frameworks.
Tasks	Face Detection
Published	2019-11-30
URL	https://arxiv.org/abs/1912.05021v1
PDF	https://arxiv.org/pdf/1912.05021v1.pdf
PWC	https://paperswithcode.com/paper/design-and-interpretation-of-universal
Repo
Framework

Automobile Theft Detection by Clustering Owner Driver Data


Title	Automobile Theft Detection by Clustering Owner Driver Data
Authors	Yong Goo Kang, Kyung Ho Park, Huy Kang Kim
Abstract	As automobiles become intelligent, automobile theft methods are evolving as well. Therefore, automobile theft detection has become a major research challenge. Previous studies have proposed data mining, biometrics, and additional authentication methods to address automobile theft. Among these methods, data mining can be used to analyze driving characteristics and identify a driver comprehensively. However, it requires a labeled driving dataset to achieve high accuracy. This is impractical for an actual automobile theft detection system because real theft driving data cannot be collected in advance. Hence, we propose a method to detect an automobile theft attempt using only owner driving data. We cluster the key features of the owner driving data using the k-means algorithm. After reconstructing the driving data into one of these clusters, theft is detected using the error from the original driving data. To validate the proposed models, we tested them on our actual driving data and obtained 99% accuracy from the best model. This result demonstrates that our proposed method can detect vehicle theft by using only the car owner’s driving data.
Tasks
Published	2019-09-19
URL	https://arxiv.org/abs/1909.08929v1
PDF	https://arxiv.org/pdf/1909.08929v1.pdf
PWC	https://paperswithcode.com/paper/automobile-theft-detection-by-clustering
Repo
Framework
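
The idea in the abstract above (cluster owner-only data with k-means, then flag samples that reconstruct poorly) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the two-dimensional synthetic features, the use of nearest-centroid distance as the reconstruction error, and the max-error threshold rule are all hypothetical choices.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a list of k centroid tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def reconstruction_error(point, centroids):
    """Distance from a sample to its nearest centroid (assumed error measure)."""
    return min(math.dist(point, c) for c in centroids)

def is_theft(point, centroids, threshold):
    """Flag a sample whose error exceeds what was seen on owner data."""
    return reconstruction_error(point, centroids) > threshold

# Synthetic stand-in for owner driving features (two driving "modes").
rng = random.Random(1)
owner = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
         for cx, cy in [(0.0, 0.0), (5.0, 5.0)] for _ in range(50)]

centroids = kmeans(owner, k=2)
# Hypothetical threshold rule: the largest error observed on owner data.
threshold = max(reconstruction_error(p, centroids) for p in owner)

print(is_theft((0.1, -0.2), centroids, threshold))    # sample near an owner mode
print(is_theft((100.0, 100.0), centroids, threshold)) # sample far from all modes
```

Because only owner data is needed to fit the centroids and the threshold, no labeled theft data enters the procedure, which is the property the abstract emphasizes.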

Mitigation of Adversarial Examples in RF Deep Classifiers Utilizing AutoEncoder Pre-training


Title	Mitigation of Adversarial Examples in RF Deep Classifiers Utilizing AutoEncoder Pre-training
Authors	Silvija Kokalj-Filipovic, Rob Miller, Nicholas Chang, Chi Leung Lau
Abstract	Adversarial examples in machine learning for images are widely publicized and explored. Illustrations of misclassifications caused by slightly perturbed inputs are abundant and commonly known (e.g., a picture of a panda imperceptibly perturbed to fool the classifier into incorrectly labeling it as a gibbon). Similar attacks on deep learning (DL) for radio frequency (RF) signals and their mitigation strategies are scarcely addressed in published work. Yet, RF adversarial examples (AdExs) with minimal waveform perturbations can cause drastic, targeted misclassification results, particularly against spectrum sensing/survey applications (e.g., BPSK is mistaken for 8-PSK). Our research on deep learning AdExs and our proposed defense mechanisms are RF-centric and incorporate physical-world, over-the-air (OTA) effects. We herein present defense mechanisms based on pre-training the target classifier using an autoencoder. Our results validate this approach as a viable mitigation method to subvert adversarial attacks against deep learning-based communications and radar sensing systems.
Tasks
Published	2019-02-16
URL	http://arxiv.org/abs/1902.08034v1
PDF	http://arxiv.org/pdf/1902.08034v1.pdf
PWC	https://paperswithcode.com/paper/mitigation-of-adversarial-examples-in-rf-deep
Repo
Framework

Recurrent Neural Networks: An Embedded Computing Perspective


Title	Recurrent Neural Networks: An Embedded Computing Perspective
Authors	Nesma M. Rezk, Madhura Purnaprajna, Tomas Nordström, Zain Ul-Abdin
Abstract	Recurrent Neural Networks (RNNs) are a class of machine learning algorithms used for applications with time-series and sequential data. Recently, there has been a strong interest in executing RNNs on embedded devices. However, difficulties have arisen because RNNs require high computational capability and a large memory space. In this paper, we review existing implementations of RNN models on embedded platforms and discuss the methods adopted to overcome the limitations of embedded systems. We define the objectives of mapping RNN algorithms on embedded platforms and the challenges facing their realization. Then, we explain the components of RNN models from an implementation perspective. We also discuss the optimizations applied to RNNs to run efficiently on embedded platforms. Finally, we compare the defined objectives with the implementations and highlight some open research questions and aspects currently not addressed for embedded RNNs. Overall, applying algorithmic optimizations to RNN models and decreasing the memory access overhead are vital to obtain high efficiency. To further increase the implementation efficiency, we point out the more promising optimizations that could be applied in future research. Additionally, this article observes that high performance has been targeted by many implementations, while flexibility has, as yet, been attempted less often. Thus, the article provides some guidelines for RNN hardware designers to support flexibility in a better manner.
Tasks	Time Series
Published	2019-07-23
URL	https://arxiv.org/abs/1908.07062v3
PDF	https://arxiv.org/pdf/1908.07062v3.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-networks-an-embedded
Repo
Framework

Early Detection of Long Term Evaluation Criteria in Online Controlled Experiments


Title	Early Detection of Long Term Evaluation Criteria in Online Controlled Experiments
Authors	Yoni Schamroth, Liron Gat Kahlon, Boris Rabinovich, David Steinberg
Abstract	A common dilemma encountered when implementing an optimization method or experiment, whether it be a reinforcement learning algorithm or A/B testing, is deciding what metric to optimize for. Very often, short-term metrics, which are easier to measure, are chosen over long-term metrics, which have undesirable time considerations and often a more complex calculation. In this paper, we argue for the importance of choosing a metric that focuses on long-term effects. With this comes the need to measure significant differences between groups relatively early. We present here an efficient methodology for early detection of lifetime differences between groups based on bootstrap hypothesis testing of the lifetime forecast of the response. We present an application of this method in the domain of online advertising, and we argue that this approach not only allows one to focus on the ultimate metric of importance but also provides a means of accelerating the testing period.
Tasks
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05959v1
PDF	https://arxiv.org/pdf/1906.05959v1.pdf
PWC	https://paperswithcode.com/paper/early-detection-of-long-term-evaluation
Repo
Framework
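
As a rough illustration of bootstrap hypothesis testing on forecasted lifetime values, the sketch below tests whether two groups differ in mean. It is a generic two-sample bootstrap test, not the authors' exact procedure: the forecasts are taken as given, and the group values, `n_boot`, and the mean-shift construction of the null are assumptions made for the example.

```python
import random

def _mean(xs):
    return sum(xs) / len(xs)

def bootstrap_p_value(a, b, n_boot=2000, seed=0):
    """Two-sided bootstrap test for a difference in means.

    Both groups are shifted to the pooled mean so that the null hypothesis
    (equal means) holds, then resampled with replacement.
    """
    rng = random.Random(seed)
    observed = _mean(a) - _mean(b)
    pooled = _mean(a + b)
    a0 = [x - _mean(a) + pooled for x in a]
    b0 = [x - _mean(b) + pooled for x in b]
    hits = 0
    for _ in range(n_boot):
        ra = rng.choices(a0, k=len(a0))
        rb = rng.choices(b0, k=len(b0))
        if abs(_mean(ra) - _mean(rb)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_boot + 1)

# Hypothetical forecasted lifetime values per user for two experiment arms.
control = [10.2, 9.8, 11.1, 10.5, 9.9, 10.7, 10.0, 10.4]
treatment = [11.9, 12.3, 11.4, 12.8, 12.0, 11.7, 12.5, 12.1]

print(f"p-value: {bootstrap_p_value(control, treatment):.4f}")
```

Running the test on forecasts rather than on fully observed lifetimes is what allows a decision early in the experiment; the quality of the conclusion then depends on the quality of the forecast model.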

Rethinking Classification and Localization for Object Detection


Title	Rethinking Classification and Localization for Object Detection
Authors	Yue Wu, Yinpeng Chen, Lu Yuan, Zicheng Liu, Lijuan Wang, Hongzhi Li, Yun Fu
Abstract	Two head structures (i.e., fully connected head and convolution head) have been widely used in R-CNN based detectors for classification and localization tasks. However, there is a lack of understanding of how these two head structures work for these two tasks. To address this issue, we perform a thorough analysis and find an interesting fact: the two head structures have opposite preferences towards the two tasks. Specifically, the fully connected head (fc-head) is more suitable for the classification task, while the convolution head (conv-head) is more suitable for the localization task. Furthermore, we examine the weight matrix in the fc-head and find that it learns spatially sensitive transformations. We believe that this allows the fc-head to distinguish a complete object from part of an object, but makes it less robust for regressing the whole object. Based upon these findings, we propose a Double-Head method, which has a fully connected head focusing on classification and a convolution head for bounding box regression. Without bells and whistles, our method gains +3.5 and +2.8 AP on the MS COCO dataset from Feature Pyramid Network (FPN) baselines with ResNet-50 and ResNet-101 backbones, respectively.
Tasks	Object Detection
Published	2019-04-13
URL	https://arxiv.org/abs/1904.06493v3
PDF	https://arxiv.org/pdf/1904.06493v3.pdf
PWC	https://paperswithcode.com/paper/rethinking-classification-and-localization-in
Repo
Framework

Learning to Conceal: A Deep Learning Based Method for Preserving Privacy and Avoiding Prejudice


Title	Learning to Conceal: A Deep Learning Based Method for Preserving Privacy and Avoiding Prejudice
Authors	Moshe Hanukoglu, Nissan Goldberg, Aviv Rovshitz, Amos Azaria
Abstract	In this paper, we introduce a learning model able to conceal personal information (e.g., gender, age, ethnicity) from an image, while maintaining any additional information present in the image (e.g., smile, hair-style, brightness). Our trained model is not provided the information that it is concealing, and does not try learning it either. Namely, we created a variational autoencoder (VAE) model that is trained on a dataset including labels of the information one would like to conceal (e.g., gender, ethnicity, age). These labels are directly added to the VAE’s sampled latent vector. Due to the limited number of neurons in the latent vector and its appended noise, the VAE avoids learning any relation between the given images and the given labels, as those are given directly. Therefore, the encoded image lacks any of the information one wishes to conceal. The encoding may be decoded back into an image according to any provided properties (e.g., a 40 year old woman). The proposed architecture can be used as a means of preserving privacy and can serve as an input to systems, which will then be unbiased and not suffer from prejudice. We believe that privacy and discrimination are two of the most important aspects in which the community should try to develop methods to prevent misuse of technological advances.
Tasks
Published	2019-09-19
URL	https://arxiv.org/abs/1909.09156v1
PDF	https://arxiv.org/pdf/1909.09156v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-conceal-a-deep-learning-based
Repo
Framework

Optimizing Design Verification using Machine Learning: Doing better than Random


Title	Optimizing Design Verification using Machine Learning: Doing better than Random
Authors	William Hughes, Sandeep Srinivasan, Rohit Suvarna, Maithilee Kulkarni
Abstract	As integrated circuits have become progressively more complex, constrained random stimulus has become ubiquitous as a means of stimulating a design's functionality and ensuring it fully meets expectations. In theory, random stimulus allows all possible combinations to be exercised given enough time, but in practice, with highly complex designs, a purely random approach will have difficulty in exercising all possible combinations in a timely fashion. As a result, it is often necessary to steer the Design Verification (DV) environment to generate hard-to-hit combinations. The resulting constrained-random approach is powerful but often relies on extensive human expertise to guide the DV environment in order to fully exercise the design. As designs become more complex, the guidance aspect becomes progressively more challenging and time consuming, often resulting in design schedules in which the verification time to hit all possible design coverage points is the dominant schedule limitation. This paper describes an approach which leverages existing constrained-random DV environment tools but which further enhances them using supervised learning and reinforcement learning techniques. This approach provides better-than-random results in a highly automated fashion, thereby ensuring DV objectives of full design coverage can be achieved on an accelerated timescale and with fewer resources. Two hardware verification examples are presented, one of a Cache Controller design and one using the open-source RISCV-Ariane design and Google’s RISCV Random Instruction Generator. We demonstrate that a machine-learning based approach can perform significantly better on functional coverage and reaching complex hard-to-hit states than a random or constrained-random approach.
Tasks
Published	2019-09-28
URL	https://arxiv.org/abs/1909.13168v1
PDF	https://arxiv.org/pdf/1909.13168v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-design-verification-using-machine
Repo
Framework

User Evaluation of a Multi-dimensional Statistical Dialogue System


Title	User Evaluation of a Multi-dimensional Statistical Dialogue System
Authors	Simon Keizer, Ondřej Dušek, Xingkun Liu, Verena Rieser
Abstract	We present the first complete spoken dialogue system driven by a multi-dimensional statistical dialogue manager. This framework has been shown to substantially reduce data needs by leveraging domain-independent dimensions, such as social obligations or feedback, which (as we show) can be transferred between domains. In this paper, we conduct a user study and show that the performance of a multi-dimensional system, which can be adapted from a source domain, is equivalent to that of a one-dimensional baseline, which can only be trained from scratch.
Tasks
Published	2019-09-06
URL	https://arxiv.org/abs/1909.02965v1
PDF	https://arxiv.org/pdf/1909.02965v1.pdf
PWC	https://paperswithcode.com/paper/user-evaluation-of-a-multi-dimensional
Repo
Framework

Improving Generalization by Incorporating Coverage in Natural Language Inference


Title	Improving Generalization by Incorporating Coverage in Natural Language Inference
Authors	Nafise Sadat Moosavi, Prasetya Ajie Utama, Andreas Rücklé, Iryna Gurevych
Abstract	The task of natural language inference (NLI) is to identify the relation between the given premise and hypothesis. While recent NLI models achieve very high performance on individual datasets, they fail to generalize across similar datasets. This indicates that they are solving NLI datasets instead of the task itself. In order to improve generalization, we propose to extend the input representations with an abstract view of the relation between the hypothesis and the premise, i.e., how well the individual words, or word n-grams, of the hypothesis are covered by the premise. Our experiments show that the use of this information considerably improves generalization across different NLI datasets without requiring any external knowledge or additional data. Finally, we show that using the coverage information is not only beneficial for improving the performance across different datasets of the same task. The resulting generalization improves the performance across datasets that belong to similar but not the same tasks.
Tasks	Natural Language Inference
Published	2019-09-19
URL	https://arxiv.org/abs/1909.08940v1
PDF	https://arxiv.org/pdf/1909.08940v1.pdf
PWC	https://paperswithcode.com/paper/improving-generalization-by-incorporating
Repo
Framework
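
The notion of coverage in the abstract above (how well the words or word n-grams of the hypothesis are covered by the premise) can be illustrated with a deliberately simplified exact-match version. This is a sketch under assumptions: whitespace tokenization, lowercasing, and exact n-gram matching are choices made here for brevity and are not claimed to be the paper's formulation.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def coverage(premise, hypothesis, n=1):
    """Fraction of hypothesis n-grams that also occur in the premise."""
    premise_grams = set(ngrams(premise.lower().split(), n))
    hypothesis_grams = ngrams(hypothesis.lower().split(), n)
    if not hypothesis_grams:
        return 0.0
    covered = sum(gram in premise_grams for gram in hypothesis_grams)
    return covered / len(hypothesis_grams)

# Hypothetical premise/hypothesis pair.
premise = "A man is playing a guitar on stage"
hypothesis = "A man is playing music"

print(coverage(premise, hypothesis, n=1))  # unigram coverage
print(coverage(premise, hypothesis, n=2))  # bigram coverage
```

Scores such as these could be appended to a model's input representation as extra features; a softer similarity-based match would be needed to credit paraphrases such as "guitar" and "music".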

Uncertainty quantification of molecular property prediction with Bayesian neural networks


Title	Uncertainty quantification of molecular property prediction with Bayesian neural networks
Authors	Seongok Ryu, Yongchan Kwon, Woo Youn Kim
Abstract	Deep neural networks have outperformed existing machine learning models in various molecular applications. In practical applications, it is still difficult to make confident decisions because of the uncertainty in predictions arising from insufficient quality and quantity of training data. Here, we show with three numerical experiments that Bayesian neural networks are useful for quantifying the uncertainty of molecular property prediction. In particular, they enable us to decompose the predictive variance into model- and data-driven uncertainties, which helps to elucidate the source of errors. In the logP predictions, we show that data noise affected the data-driven uncertainties more significantly than the model-driven ones. Based on this analysis, we were able to find unexpected errors in the Harvard Clean Energy Project dataset. Lastly, we show that the confidence of prediction is closely related to the predictive uncertainty by evaluating on bio-activity and toxicity classification problems.
Tasks	Molecular Property Prediction
Published	2019-03-20
URL	http://arxiv.org/abs/1903.08375v1
PDF	http://arxiv.org/pdf/1903.08375v1.pdf
PWC	https://paperswithcode.com/paper/uncertainty-quantification-of-molecular
Repo
Framework
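
A common way to decompose predictive variance for regression, shown here as an assumption rather than as the paper's exact estimator, uses T stochastic forward passes that each return a predicted mean and a predicted noise variance. The model-driven part is the variance of the predicted means across passes, and the data-driven part is the average predicted noise variance. The sample values below are hypothetical.

```python
def decompose_variance(means, variances):
    """Split predictive variance into model- and data-driven parts.

    means[t] and variances[t] are the predicted mean and predicted noise
    variance from the t-th stochastic forward pass.
    Returns (model_driven, data_driven); their sum is the total variance.
    """
    t = len(means)
    avg = sum(means) / t
    model_driven = sum((m - avg) ** 2 for m in means) / t  # spread of the means
    data_driven = sum(variances) / t                       # average noise variance
    return model_driven, data_driven

# Hypothetical outputs from five stochastic forward passes for one molecule.
pred_means = [2.10, 2.30, 1.95, 2.20, 2.05]
pred_vars = [0.04, 0.05, 0.04, 0.06, 0.05]

model_u, data_u = decompose_variance(pred_means, pred_vars)
print(f"model-driven={model_u:.4f} data-driven={data_u:.4f} total={model_u + data_u:.4f}")
```

A large data-driven term relative to the model-driven term points at noisy labels rather than at a lack of training examples, which matches the kind of diagnosis the abstract describes for the logP experiment.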

Hand Sign to Bangla Speech: A Deep Learning in Vision based system for Recognizing Hand Sign Digits and Generating Bangla Speech


Title	Hand Sign to Bangla Speech: A Deep Learning in Vision based system for Recognizing Hand Sign Digits and Generating Bangla Speech
Authors	Shahjalal Ahmed, Md. Rafiqul Islam, Jahid Hassan, Minhaz Uddin Ahmed, Bilkis Jamal Ferdosi, Sanjay Saha, Md. Shopon
Abstract	Recent advancements in the field of computer vision with the help of deep neural networks have led us to explore and address many existing challenges that were once unattended due to the lack of necessary technologies. Hand Sign/Gesture Recognition is one of the significant areas where deep neural networks are making a substantial impact. In the last few years, a large body of research has been conducted to recognize hand signs and hand gestures, which we aim to extend to our mother tongue, Bangla (also known as Bengali). The primary goal of our work is to make an automated tool to aid people who are unable to speak. We developed a system that automatically detects hand sign based digits and speaks out the result in the Bangla language. According to a report of the World Health Organization (WHO), 15% of people in the world live with some kind of disability. Among them, individuals with communication impairments such as speech disabilities experience substantial barriers in social interaction. The proposed system can be invaluable in mitigating such barriers. The core of the system is built with a deep learning model based on convolutional neural networks (CNN). The model classifies hand sign based digits with 92% accuracy on validation data, which makes it a highly trustworthy system. Upon classification of the digits, the resulting output is fed to the text-to-speech engine and the translator unit, which eventually generate audio output in the Bangla language. A web application to demonstrate our tool is available at http://bit.ly/signdigits2banglaspeech.
Tasks	Gesture Recognition
Published	2019-01-17
URL	http://arxiv.org/abs/1901.05613v1
PDF	http://arxiv.org/pdf/1901.05613v1.pdf
PWC	https://paperswithcode.com/paper/hand-sign-to-bangla-speech-a-deep-learning-in
Repo
Framework

Generation High resolution 3D model from natural language by Generative Adversarial Network


Title	Generation High resolution 3D model from natural language by Generative Adversarial Network
Authors	Kentaro Fukamizu, Masaaki Kondo, Ryuichi Sakamoto
Abstract	We present a method of generating high resolution 3D shapes from natural language descriptions. To achieve this goal, we propose two steps: generating low resolution shapes that roughly reflect the text, and generating high resolution shapes that reflect the details of the text. In a previous paper, the authors have shown a method of generating low resolution shapes. We improve it to generate 3D shapes more faithful to natural language and test the effectiveness of the method. To generate high resolution 3D shapes, we use the framework of Conditional Wasserstein GAN. We propose two separate roles for the Critic, which calculates the Wasserstein distance between two probability distributions, so that we achieve either higher quality shapes or faster training of the model. To evaluate our approach, we performed quantitative evaluation with several numerical metrics for Critic models. Our method is the first to realize the generation of high quality models by propagating text embedding information to the high resolution task when generating 3D models.
Tasks
Published	2019-01-22
URL	http://arxiv.org/abs/1901.07165v1
PDF	http://arxiv.org/pdf/1901.07165v1.pdf
PWC	https://paperswithcode.com/paper/generation-high-resolution-3d-model-from
Repo
Framework