Paper Group ANR 1325
A Formalization of Robustness for Deep Neural Networks. Attribution-driven Causal Analysis for Detection of Adversarial Examples. Convolutional Neural Network-based Topology Optimization (CNN-TO) By Estimating Sensitivity of Compliance from Material Distribution. Face Liveness Detection Based on Client Identity Using Siamese Network. Updates-Leak: …
A Formalization of Robustness for Deep Neural Networks
Title | A Formalization of Robustness for Deep Neural Networks |
Authors | Tommaso Dreossi, Shromona Ghosh, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia |
Abstract | Deep neural networks have been shown to lack robustness to small input perturbations. The process of generating the perturbations that expose the lack of robustness of neural networks is known as adversarial input generation. This process depends on the goals and capabilities of the adversary. In this paper, we propose a unifying formalization of the adversarial input generation process from a formal methods perspective. We provide a definition of robustness that is general enough to capture different formulations. The expressiveness of our formalization is shown by modeling and comparing a variety of adversarial attack techniques. (A common local-robustness formulation is sketched after this entry.) |
Tasks | Adversarial Attack |
Published | 2019-03-24 |
URL | http://arxiv.org/abs/1903.10033v1 |
http://arxiv.org/pdf/1903.10033v1.pdf | |
PWC | https://paperswithcode.com/paper/a-formalization-of-robustness-for-deep-neural |
Repo | |
Framework | |
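For reference, a common local (pointwise) robustness formulation from the adversarial-examples literature — not necessarily the paper's exact unifying definition, which generalizes over different adversary goals and capabilities — is:

```latex
% Local robustness of classifier f at input x, with radius epsilon under metric d:
% every perturbed input within the epsilon-ball receives the same label as x.
\forall x' .\; d(x, x') \le \epsilon \;\Rightarrow\; f(x') = f(x)
```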
Attribution-driven Causal Analysis for Detection of Adversarial Examples
Title | Attribution-driven Causal Analysis for Detection of Adversarial Examples |
Authors | Susmit Jha, Sunny Raj, Steven Lawrence Fernandes, Sumit Kumar Jha, Somesh Jha, Gunjan Verma, Brian Jalaian, Ananthram Swami |
Abstract | Attribution methods have been developed to explain the decision of a machine learning model on a given input. We use the Integrated Gradients method for finding attributions to define the causal neighborhood of an input by incrementally masking high-attribution features. We study the robustness of machine learning models on benign and adversarial inputs in this neighborhood. Our study indicates that benign inputs are robust to the masking of high-attribution features, but adversarial inputs generated by state-of-the-art adversarial attack methods such as DeepFool, FGSM, CW, and PGD are not robust to such masking. Further, our study demonstrates that this concentration of high-attribution features responsible for the incorrect decision is more pronounced in physically realizable adversarial examples. This difference in the attributions of benign and adversarial inputs can be used to detect adversarial examples. Such a defense approach is independent of training data and attack method, and we demonstrate its effectiveness on digital and physically realizable perturbations. (A sketch of the masking test follows this entry.) |
Tasks | Adversarial Attack |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1903.05821v1 |
http://arxiv.org/pdf/1903.05821v1.pdf | |
PWC | https://paperswithcode.com/paper/attribution-driven-causal-analysis-for |
Repo | |
Framework | |
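A minimal sketch of the masking test described in the abstract above, assuming a generic `predict` callable and precomputed attributions; the function name, masking fractions, and the gradient-times-input proxy used in place of true Integrated Gradients are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def prediction_flips_under_masking(predict, x, attributions, mask_value=0.0,
                                   fractions=(0.02, 0.05, 0.1)):
    """Incrementally mask the highest-attribution features and check whether the
    predicted label changes. Per the paper's claim, benign inputs tend to keep
    their label while adversarial inputs tend to flip."""
    original_label = int(np.argmax(predict(x)))
    order = np.argsort(np.abs(attributions).ravel())[::-1]  # highest attribution first
    flips = []
    for frac in fractions:
        k = max(1, int(frac * x.size))
        masked = x.copy().ravel()
        masked[order[:k]] = mask_value
        flips.append(int(np.argmax(predict(masked.reshape(x.shape)))) != original_label)
    return any(flips)

# Toy usage with a linear "model" and a gradient*input proxy for the attributions.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 784))
predict = lambda x: W @ x.ravel()
x = rng.normal(size=(28, 28))
attributions = x * W[int(np.argmax(predict(x)))].reshape(28, 28)
print(prediction_flips_under_masking(predict, x, attributions))
```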
Convolutional Neural Network-based Topology Optimization (CNN-TO) By Estimating Sensitivity of Compliance from Material Distribution
Title | Convolutional Neural Network-based Topology Optimization (CNN-TO) By Estimating Sensitivity of Compliance from Material Distribution |
Authors | Yusuke Takahashi, Yoshiro Suzuki, Akira Todoroki |
Abstract | This paper proposes a new topology optimization method that applies a convolutional neural network (CNN), a deep learning technique, to topology optimization problems. Using this method, we obtain structures with slightly higher performance than those produced by the conventional topology optimization method. In particular, we solve a topology optimization problem that maximizes stiffness under a mass constraint, which is a common type of topology optimization. We first formulate conventional topology optimization with the solid isotropic material with penalization (SIMP) method. Next, we formulate topology optimization using a CNN. Finally, we show the effectiveness of the proposed method by solving a verification example, namely a stiffness-maximization problem. Solving this example on a small design domain of 16x32 elements, we obtain a solution different from that of the conventional method. This result suggests that stiffness information can be extracted for structural design by analyzing the density distribution with a CNN as if it were an image, and that CNN techniques can be utilized in structural design and topology optimization. (A sketch of the CNN-based sensitivity estimation follows this entry.) |
Tasks | |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/2001.00635v1 |
https://arxiv.org/pdf/2001.00635v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-based-topology |
Repo | |
Framework | |
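A hedged sketch of the core idea above: a small CNN maps the current density field to an element-wise sensitivity estimate, which then drives a density update. The architecture, the heuristic optimality-criteria-style update, and all sizes are assumptions for illustration, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class SensitivityCNN(nn.Module):
    """Predicts element-wise compliance sensitivities from the density field."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
    def forward(self, rho):  # rho: (B, 1, H, W) element densities in [0, 1]
        return self.net(rho)

def oc_update(rho, sensitivity, mass_fraction=0.5, move=0.2):
    """Heuristic optimality-criteria-style update driven by predicted sensitivities."""
    score = torch.clamp(-sensitivity, min=1e-9)   # more negative sensitivity -> add material
    new_rho = torch.minimum(torch.maximum(rho * score.sqrt(), rho - move), rho + move)
    new_rho = new_rho.clamp(1e-3, 1.0)
    # Rescale to (approximately) satisfy the mass constraint.
    new_rho = new_rho * (mass_fraction * new_rho.numel() / new_rho.sum())
    return new_rho.clamp(1e-3, 1.0)

model = SensitivityCNN()
rho = torch.full((1, 1, 16, 32), 0.5)             # 16x32 design domain as in the paper
with torch.no_grad():
    rho = oc_update(rho, model(rho))
print(rho.shape)
```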
Face Liveness Detection Based on Client Identity Using Siamese Network
Title | Face Liveness Detection Based on Client Identity Using Siamese Network |
Authors | Huiling Hao, Mingtao Pei |
Abstract | Face liveness detection is an essential prerequisite for face recognition applications. Previous face liveness detection methods usually train a binary classifier to differentiate between a fake face and a real face before face recognition, and they do not utilize the client identity information. However, in practical face recognition applications, face spoofing attacks are always aimed at a specific client, and the client identity information can provide useful clues for face liveness detection. In this paper, we propose a face liveness detection method based on the client identity using a Siamese network. We detect face liveness after face recognition instead of before it, that is, we detect face liveness with the client identity information. We train a Siamese network with image pairs; each pair consists of two real face images, or of one real and one fake face image, from the same client. Given a test face image, the image is first recognized by the face recognition system, and the real face image of the identified client is then retrieved to help the face liveness detection. Experimental results demonstrate the effectiveness of our method. (A sketch of the pairing setup follows this entry.) |
Tasks | Face Recognition |
Published | 2019-03-13 |
URL | http://arxiv.org/abs/1903.05369v1 |
http://arxiv.org/pdf/1903.05369v1.pdf | |
PWC | https://paperswithcode.com/paper/face-liveness-detection-based-on-client |
Repo | |
Framework | |
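A minimal sketch of the pairing setup described above, assuming a shared embedding branch and a distance-based liveness decision; the layer sizes, normalization, and scoring rule are illustrative assumptions rather than the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Shared branch of the Siamese network: face image -> embedding."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )
    def forward(self, x):
        return F.normalize(self.features(x), dim=1)

def liveness_score(net, enrolled_real, test_image):
    """Small distance to the identified client's enrolled real face -> likely live."""
    with torch.no_grad():
        d = F.pairwise_distance(net(enrolled_real), net(test_image))
    return -d  # higher score = more likely a real (live) face

net = EmbeddingNet()
enrolled = torch.randn(1, 3, 112, 112)   # retrieved real face of the recognised client
probe = torch.randn(1, 3, 112, 112)      # test image after face recognition
print(liveness_score(net, enrolled, probe))
```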
Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning
Title | Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning |
Authors | Ahmed Salem, Apratim Bhattacharyya, Michael Backes, Mario Fritz, Yang Zhang |
Abstract | Machine learning (ML) has progressed rapidly during the past decade, and the major factor driving this development is the unprecedented availability of large-scale data. As data generation is a continuous process, ML model owners update their models frequently with newly-collected data in an online learning scenario. In consequence, if an ML model is queried with the same set of data samples at two different points in time, it will provide different results. In this paper, we investigate whether the change in the output of a black-box ML model before and after an update can leak information about the dataset used to perform the update, namely the updating set. This constitutes a new attack surface against black-box ML models, and such information leakage may compromise the intellectual property and data privacy of the ML model owner. We propose four attacks following an encoder-decoder formulation that allows inferring diverse information about the updating set. Our new attacks are facilitated by state-of-the-art deep learning techniques. In particular, we propose a hybrid generative model (CBM-GAN) that is based on generative adversarial networks (GANs) but includes a reconstructive loss that enables accurate sample reconstruction. Our experiments show that the proposed attacks achieve strong performance. (A sketch of the posterior-difference probing follows this entry.) |
Tasks | |
Published | 2019-04-01 |
URL | https://arxiv.org/abs/1904.01067v2 |
https://arxiv.org/pdf/1904.01067v2.pdf | |
PWC | https://paperswithcode.com/paper/updates-leak-data-set-inference-and |
Repo | |
Framework | |
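A hedged sketch of the attack's input construction as described in the abstract above: the adversary queries the black-box model with the same probe samples before and after the update and uses the difference of the output posteriors as the feature vector fed to the encoder-decoder. The probe set, toy models, and feature layout below are illustrative assumptions:

```python
import numpy as np

def posterior_difference_features(query_before, query_after, probe_set):
    """Stack per-sample posterior differences into one attack feature vector."""
    deltas = [query_after(x) - query_before(x) for x in probe_set]
    return np.concatenate(deltas)

# Toy black-box models: same architecture, weights perturbed by an "update".
rng = np.random.default_rng(0)
W0 = rng.normal(size=(10, 64))
W1 = W0 + 0.05 * rng.normal(size=W0.shape)        # stands in for the updated model
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()
query_before = lambda x: softmax(W0 @ x)
query_after = lambda x: softmax(W1 @ x)

probe_set = [rng.normal(size=64) for _ in range(8)]
features = posterior_difference_features(query_before, query_after, probe_set)
print(features.shape)   # (8 * 10,) -> input to the attack's encoder network
```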
Attack Type Agnostic Perceptual Enhancement of Adversarial Images
Title | Attack Type Agnostic Perceptual Enhancement of Adversarial Images |
Authors | Bilgin Aksoy, Alptekin Temizel |
Abstract | Adversarial images are samples that are intentionally modified to deceive machine learning systems. They are widely used in applications such as CAPTCHAs to help distinguish legitimate human users from bots. However, the noise introduced during the adversarial image generation process degrades the perceptual quality and introduces artificial colours, making it difficult even for humans to classify the images and recognise objects. In this letter, we propose a method to enhance the perceptual quality of these adversarial images. The proposed method is attack-type agnostic and can be used in combination with existing attacks in the literature. Our experiments show that the generated adversarial images have lower Euclidean distance values while maintaining the same adversarial attack performance. Distances are reduced by 5.88% to 41.27%, with an average reduction of 22% over the different attack and network types. (A generic perturbation-shrinking sketch follows this entry.) |
Tasks | Adversarial Attack, Image Generation |
Published | 2019-03-07 |
URL | https://arxiv.org/abs/1903.03029v3 |
https://arxiv.org/pdf/1903.03029v3.pdf | |
PWC | https://paperswithcode.com/paper/attack-type-agnostic-perceptual-enhancement |
Repo | |
Framework | |
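A hedged sketch of the general idea of reducing distortion while preserving the attack, implemented here as a simple binary search over a scaling factor on the perturbation; this is a generic stand-in, not the letter's actual enhancement method, and the toy classifier is an assumption:

```python
import numpy as np

def shrink_perturbation(predict_label, x, x_adv, steps=20):
    """Binary-search the smallest perturbation scale that still fools the model."""
    true_label = predict_label(x)
    delta = x_adv - x
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if predict_label(x + mid * delta) != true_label:
            hi = mid          # still adversarial: try a smaller perturbation
        else:
            lo = mid          # no longer adversarial: need a larger perturbation
    return x + hi * delta

# Toy linear classifier and a crude adversarial example (illustrative only).
rng = np.random.default_rng(1)
w = rng.normal(size=16)
predict_label = lambda x: int(w @ x > 0)
x = rng.normal(size=16)
x_adv = x - 2.0 * (w @ x) / (w @ w) * w          # reflect across the decision boundary
x_small = shrink_perturbation(predict_label, x, x_adv)
print(np.linalg.norm(x_adv - x), np.linalg.norm(x_small - x))
```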
Casimir effect with machine learning
Title | Casimir effect with machine learning |
Authors | M. N. Chernodub, Harold Erbin, I. V. Grishmanovskii, V. A. Goy, A. V. Molochkov |
Abstract | Vacuum fluctuations of quantum fields between physical objects depend on the shapes, positions, and internal composition of the latter. For objects of arbitrary shapes, even made from idealized materials, the calculation of the associated zero-point (Casimir) energy is an analytically intractable challenge. We propose a new numerical approach to this problem based on machine-learning techniques and illustrate the effectiveness of the method in a (2+1)-dimensional scalar field theory. The Casimir energy is first calculated numerically using a Monte-Carlo algorithm for a set of Dirichlet boundaries of various shapes. Then, a neural network is trained to compute this energy given the Dirichlet domain, treating the latter as a black-and-white pixelated image. We show that after the learning phase, the neural network is able to quickly predict the Casimir energy for new boundaries of general shapes with reasonable accuracy. (A sketch of the regression setup follows this entry.) |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07571v1 |
https://arxiv.org/pdf/1911.07571v1.pdf | |
PWC | https://paperswithcode.com/paper/casimir-effect-with-machine-learning |
Repo | |
Framework | |
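A minimal sketch of the regression setup described above, mapping a black-and-white pixelated image of the Dirichlet boundary to a scalar energy; the architecture, image size, and synthetic data are illustrative assumptions (real targets would come from the Monte-Carlo lattice runs):

```python
import torch
import torch.nn as nn

class CasimirEnergyCNN(nn.Module):
    """Regresses the Casimir energy from a binary image of the Dirichlet boundary."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),
        )
    def forward(self, boundary_image):
        return self.net(boundary_image).squeeze(-1)

# One toy training step on synthetic data (illustrative only).
model = CasimirEnergyCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
boundaries = (torch.rand(8, 1, 32, 32) > 0.9).float()   # stand-in pixelated boundaries
energies = torch.randn(8)                                # stand-in Monte-Carlo energies
loss = nn.functional.mse_loss(model(boundaries), energies)
loss.backward()
optimizer.step()
print(float(loss))
```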
Modeling Long-Range Context for Concurrent Dialogue Acts Recognition
Title | Modeling Long-Range Context for Concurrent Dialogue Acts Recognition |
Authors | Yue Yu, Siyao Peng, Grace Hui Yang |
Abstract | In dialogues, an utterance is a chain of consecutive sentences produced by one speaker, ranging from a short sentence to a thousand-word post. When studying dialogues at the utterance level, it is not uncommon for an utterance to serve multiple functions. For instance, “Thank you. It works great.” expresses both gratitude and positive feedback in the same utterance. Multiple dialogue acts (DAs) for one utterance breed complex dependencies across dialogue turns. Therefore, DA recognition challenges a model’s predictive power over long utterances and complex DA context. We term this problem Concurrent Dialogue Acts (CDA) recognition. Previous work on DA recognition either assumes one DA per utterance or fails to capture the sequential nature of dialogues. In this paper, we present an adapted Convolutional Recurrent Neural Network (CRNN) which models the interactions between utterances of long-range context. Our model significantly outperforms existing work on CDA recognition on a tech forum dataset. (A compact CRNN sketch follows this entry.) |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00521v2 |
https://arxiv.org/pdf/1909.00521v2.pdf | |
PWC | https://paperswithcode.com/paper/modeling-long-range-context-for-concurrent |
Repo | |
Framework | |
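A compact sketch of a CNN-plus-RNN arrangement for multi-label (concurrent) dialogue act tagging, in the spirit of the adapted CRNN described above; the vocabulary size, dimensions, and layer choices are illustrative assumptions, not the paper's exact model:

```python
import torch
import torch.nn as nn

class CRNNTagger(nn.Module):
    """CNN encodes each utterance; a GRU over the dialogue captures long-range context;
    sigmoid outputs allow several dialogue acts per utterance."""
    def __init__(self, vocab=5000, emb=64, hidden=128, n_acts=12):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb, padding_idx=0)
        self.conv = nn.Conv1d(emb, hidden, kernel_size=3, padding=1)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_acts)

    def forward(self, dialogue):            # dialogue: (n_utts, max_tokens) token ids
        e = self.embed(dialogue).transpose(1, 2)        # (n_utts, emb, max_tokens)
        u = torch.relu(self.conv(e)).max(dim=2).values  # utterance vectors (n_utts, hidden)
        ctx, _ = self.rnn(u.unsqueeze(0))               # context over the dialogue turns
        return torch.sigmoid(self.out(ctx.squeeze(0)))  # per-utterance DA probabilities

model = CRNNTagger()
dialogue = torch.randint(1, 5000, (6, 40))   # 6 utterances, 40 tokens each
probs = model(dialogue)
print(probs.shape)                           # (6, 12): concurrent DAs per utterance
```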
The Efficacy of SHIELD under Different Threat Models
Title | The Efficacy of SHIELD under Different Threat Models |
Authors | Cory Cornelius, Nilaksh Das, Shang-Tse Chen, Li Chen, Michael E. Kounavis, Duen Horng Chau |
Abstract | In this appraisal paper, we evaluate the efficacy of SHIELD, a compression-based defense framework for countering adversarial attacks on image classification models, which was published at KDD 2018. Here, we consider alternative threat models not studied in the original work, where we assume that an adaptive adversary is aware of the ensemble defense approach, the defensive pre-processing, and the architecture and weights of the models used in the ensemble. We define scenarios with varying levels of threat and empirically analyze the proposed defense by varying the degree of information available to the attacker, spanning from a full white-box attack to the gray-box threat model described in the original work. To evaluate the robustness of the defense against an adaptive attacker, we consider the targeted-attack success rate of the Projected Gradient Descent (PGD) attack, which is a strong gradient-based adversarial attack proposed in adversarial machine learning research. We also experiment with training the SHIELD ensemble from scratch, which is different from re-training using a pre-trained model as done in the original work. We find that the targeted PGD attack has a success rate of 64.3% against the original SHIELD ensemble in the full white-box scenario, but this drops to 48.9% if the models used in the ensemble are trained from scratch instead of being retrained. Our experiments further reveal that an ensemble whose models are re-trained indeed has higher correlation in the cosine similarity space, and that models trained from scratch are less vulnerable to targeted attacks in the white-box and gray-box scenarios. (A generic targeted PGD sketch follows this entry.) |
Tasks | Adversarial Attack, Image Classification |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00541v2 |
https://arxiv.org/pdf/1902.00541v2.pdf | |
PWC | https://paperswithcode.com/paper/the-efficacy-of-shield-under-different-threat |
Repo | |
Framework | |
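A standard targeted PGD attack of the kind used in the evaluation above; this is the generic algorithm with illustrative step sizes and a toy classifier, not SHIELD-specific code:

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=8/255, alpha=2/255, steps=40):
    """Projected Gradient Descent towards a chosen target class (L-infinity ball)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)     # minimise loss w.r.t. the target
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()          # step towards the target class
            x_adv = x + (x_adv - x).clamp(-eps, eps)     # project back into the eps-ball
            x_adv = x_adv.clamp(0, 1)                    # keep a valid image
    return x_adv.detach()

# Toy usage with a small untrained classifier (illustrative only).
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
target = torch.zeros(4, dtype=torch.long)     # push all samples towards class 0
x_adv = targeted_pgd(model, x, target)
print((x_adv - x).abs().max())
```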
DOB-Net: Actively Rejecting Unknown Excessive Time-Varying Disturbances
Title | DOB-Net: Actively Rejecting Unknown Excessive Time-Varying Disturbances |
Authors | Tianming Wang, Wenjie Lu, Zheng Yan, Dikai Liu |
Abstract | This paper presents an observer-integrated Reinforcement Learning (RL) approach, called Disturbance OBserver Network (DOB-Net), for robots operating in environments where disturbances are unknown and time-varying, and may frequently exceed robot control capabilities. The DOB-Net integrates a disturbance dynamics observer network and a controller network. Originating from conventional DOB mechanisms, the observer is built and enhanced via Recurrent Neural Networks (RNNs), encoding estimates of past values and predictions of future values of the unknown disturbances in the RNN hidden state. Such encoding allows the controller to generate optimal control signals that actively reject disturbances, under the constraints of the robot control capabilities. The observer and the controller are jointly learned within policy optimization by advantage actor-critic. Numerical simulations on position regulation tasks demonstrate that the proposed DOB-Net significantly outperforms a conventional feedback controller and classical RL algorithms. (A sketch of the observer-controller arrangement follows this entry.) |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04514v2 |
https://arxiv.org/pdf/1907.04514v2.pdf | |
PWC | https://paperswithcode.com/paper/dob-net-actively-rejecting-unknown-excessive |
Repo | |
Framework | |
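A hedged sketch of the architecture described above: a recurrent observer encodes the recent state-action history into a disturbance estimate, which is concatenated with the current state and fed to a controller network. The dimensions and layers are illustrative assumptions, and training with advantage actor-critic is omitted:

```python
import torch
import torch.nn as nn

class DOBNet(nn.Module):
    """Recurrent disturbance observer + feedforward controller."""
    def __init__(self, state_dim=6, action_dim=3, hidden=64):
        super().__init__()
        self.observer = nn.GRU(state_dim + action_dim, hidden, batch_first=True)
        self.controller = nn.Sequential(
            nn.Linear(state_dim + hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim), nn.Tanh(),   # bounded actions mimic control limits
        )

    def forward(self, history, state):
        # history: (B, T, state_dim + action_dim) past states and actions
        _, h = self.observer(history)                   # hidden state encodes the disturbance
        disturbance_code = h.squeeze(0)                 # (B, hidden)
        return self.controller(torch.cat([state, disturbance_code], dim=1))

net = DOBNet()
history = torch.randn(2, 10, 9)   # 10 past (state, action) pairs per rollout
state = torch.randn(2, 6)
action = net(history, state)
print(action.shape)               # (2, 3) control commands
```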
Strong Black-box Adversarial Attacks on Unsupervised Machine Learning Models
Title | Strong Black-box Adversarial Attacks on Unsupervised Machine Learning Models |
Authors | Anshuman Chhabra, Abhishek Roy, Prasant Mohapatra |
Abstract | Machine Learning (ML) and Deep Learning (DL) models have achieved state-of-the-art performance on multiple learning tasks, from vision to natural language modelling. With the growing adoption of ML and DL in many areas of computer science, recent research has also started focusing on the security properties of these models. Much work has been undertaken to understand whether (deep) neural network architectures are resilient to black-box adversarial attacks, which craft perturbed input samples that fool the classifier without knowledge of the architecture used. Recent work has also focused on the transferability of adversarial attacks, finding that adversarial attacks are generally easily transferable between models, datasets, and techniques. However, such attacks and their analysis have not been covered from the perspective of unsupervised machine learning algorithms. In this paper, we seek to bridge this gap through multiple contributions. We first provide a strong (iterative) black-box adversarial attack that can craft adversarial samples which will be incorrectly clustered irrespective of the choice of clustering algorithm. We choose four prominent clustering algorithms and a real-world dataset to demonstrate the proposed adversarial algorithm. Using these clustering algorithms, we also carry out a simple study of cross-technique adversarial attack transferability. (A didactic clustering-attack sketch follows this entry.) |
Tasks | Adversarial Attack, Language Modelling |
Published | 2019-01-28 |
URL | https://arxiv.org/abs/1901.09493v3 |
https://arxiv.org/pdf/1901.09493v3.pdf | |
PWC | https://paperswithcode.com/paper/strong-black-box-adversarial-attacks-on |
Repo | |
Framework | |
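A simple illustration of the black-box idea described above: nudge a sample toward the nearest foreign cluster centre until its assignment changes. This is a didactic stand-in using k-means, not the paper's iterative attack, and the dataset and step sizes are arbitrary choices:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

def cluster_flip_attack(assign, x, direction, step=0.02, max_steps=100):
    """Move x along `direction` in small steps until its cluster label changes."""
    original = assign(x)
    x_adv = x.copy()
    for _ in range(max_steps):
        x_adv = x_adv + step * direction
        if assign(x_adv) != original:
            return x_adv
    return x_adv   # may still have the original label if the budget ran out

assign = lambda x: int(kmeans.predict(x.reshape(1, -1))[0])
x = X[0]
own = assign(x)
others = np.delete(kmeans.cluster_centers_, own, axis=0)
nearest_foreign = others[np.argmin(np.linalg.norm(others - x, axis=1))]
direction = nearest_foreign - x            # head towards the nearest foreign centre
x_adv = cluster_flip_attack(assign, x, direction)
print(own, assign(x_adv), np.linalg.norm(x_adv - x))
```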
Semantic Noise Matters for Neural Natural Language Generation
Title | Semantic Noise Matters for Neural Natural Language Generation |
Authors | Ondřej Dušek, David M. Howcroft, Verena Rieser |
Abstract | Neural natural language generation (NNLG) systems are known for their pathological outputs, i.e. generating text which is unrelated to the input specification. In this paper, we show the impact of semantic noise on state-of-the-art NNLG models which implement different semantic control mechanisms. We find that cleaned data can improve semantic correctness by up to 97%, while maintaining fluency. We also find that the most common error is omitting information, rather than hallucination. |
Tasks | Text Generation |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03905v1 |
https://arxiv.org/pdf/1911.03905v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-noise-matters-for-neural-natural |
Repo | |
Framework | |
If dropout limits trainable depth, does critical initialisation still matter? A large-scale statistical analysis on ReLU networks
Title | If dropout limits trainable depth, does critical initialisation still matter? A large-scale statistical analysis on ReLU networks |
Authors | Arnu Pretorius, Elan van Biljon, Benjamin van Niekerk, Ryan Eloff, Matthew Reynard, Steve James, Benjamin Rosman, Herman Kamper, Steve Kroon |
Abstract | Recent work in signal propagation theory has shown that dropout limits the depth to which information can propagate through a neural network. In this paper, we investigate the effect of initialisation on training speed and generalisation for ReLU networks within this depth limit. We ask the following research question: given that critical initialisation is crucial for training at large depth, if dropout limits the depth at which networks are trainable, does initialising critically still matter? We conduct a large-scale controlled experiment, and perform a statistical analysis of over $12000$ trained networks. We find that (1) trainable networks show no statistically significant difference in performance over a wide range of non-critical initialisations; (2) for initialisations that show a statistically significant difference, the net effect on performance is small; (3) only extreme initialisations (very small or very large) perform worse than criticality. These findings also apply to standard ReLU networks of moderate depth as a special case of zero dropout. Our results therefore suggest that, in the shallow-to-moderate depth setting, critical initialisation provides zero performance gains when compared to off-critical initialisations and that searching for off-critical initialisations that might improve training speed or generalisation is likely to be a fruitless endeavour. (A small variance-propagation illustration follows this entry.) |
Tasks | |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05725v2 |
https://arxiv.org/pdf/1910.05725v2.pdf | |
PWC | https://paperswithcode.com/paper/if-dropout-limits-trainable-depth-does |
Repo | |
Framework | |
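A small numerical illustration of the question being studied: propagate random inputs through a deep ReLU network with dropout under the standard He initialisation and under rescaled variants, and observe how the pre-activation variance behaves with depth. The scaling factors, depth, and width are arbitrary choices for illustration, not the paper's experimental setup:

```python
import numpy as np

def variance_profile(depth=50, width=512, weight_scale=1.0, keep_prob=0.8, seed=0):
    """Track pre-activation variance through a ReLU network with (inverted) dropout."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=width)
    variances = []
    for _ in range(depth):
        W = rng.normal(scale=weight_scale * np.sqrt(2.0 / width), size=(width, width))
        h = W @ x
        variances.append(h.var())
        x = np.maximum(h, 0.0)                    # ReLU
        mask = rng.random(width) < keep_prob
        x = x * mask / keep_prob                  # inverted dropout
    return variances

for scale in (0.8, 1.0, 1.2):                     # He init and two rescaled variants
    v = variance_profile(weight_scale=scale)
    print(f"scale={scale}: layer-5 var={v[4]:.3f}, layer-50 var={v[-1]:.3e}")
```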
Functional Asplund’s metrics for pattern matching robust to variable lighting conditions
Title | Functional Asplund’s metrics for pattern matching robust to variable lighting conditions |
Authors | Guillaume Noyel, Michel Jourlin |
Abstract | In this paper, we propose a complete framework to process images captured under uncontrolled lighting and especially under low lighting. By taking advantage of the Logarithmic Image Processing (LIP) context, we study two novel functional metrics: i) the LIP-multiplicative Asplund’s metric, which is robust to variations in object absorption, and ii) the LIP-additive Asplund’s metric, which is robust to variations in source intensity and exposure time. We introduce noise-robust versions of these metrics. We demonstrate that the maps of their corresponding distances between an image and a reference template are linked to Mathematical Morphology, which facilitates their implementation. We assess them in various situations with different lightings and movements. Results show that those distance maps are robust to lighting variations. Importantly, they are effective at detecting patterns in low-contrast images using a template acquired under different lighting. (A sketch of the classical Asplund distance follows this entry.) |
Tasks | |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01585v1 |
https://arxiv.org/pdf/1909.01585v1.pdf | |
PWC | https://paperswithcode.com/paper/functional-asplunds-metrics-for-pattern |
Repo | |
Framework | |
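For reference, a sketch of the classical multiplicative Asplund distance between two positive images, on which the paper's LIP variants build; this is the standard non-LIP definition computed globally, not the paper's template-matching distance maps:

```python
import numpy as np

def asplund_distance(f, g, eps=1e-12):
    """Classical multiplicative Asplund distance between two positive images:
    d(f, g) = ln(lambda / mu), with lambda the smallest factor such that f <= lambda*g
    and mu the largest factor such that mu*g <= f."""
    f = np.asarray(f, dtype=float) + eps
    g = np.asarray(g, dtype=float) + eps
    ratio = f / g
    lam, mu = ratio.max(), ratio.min()
    return float(np.log(lam / mu))

rng = np.random.default_rng(0)
image = rng.uniform(50, 200, size=(8, 8))
brighter = 1.7 * image                             # multiplicative lighting change
print(asplund_distance(image, brighter))           # ~0: invariant to multiplicative scaling
print(asplund_distance(image, rng.uniform(50, 200, size=(8, 8))))   # noticeably larger
```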
Do Facial Expressions Predict Ad Sharing? A Large-Scale Observational Study
Title | Do Facial Expressions Predict Ad Sharing? A Large-Scale Observational Study |
Authors | Daniel McDuff, Jonah Berger |
Abstract | People often share news and information with their social connections, but why do some advertisements get shared more than others? A large-scale test examines whether facial responses predict sharing. Facial expressions play a key role in emotional expression. Using scalable automated facial coding algorithms, we quantify the facial expressions of thousands of individuals in response to hundreds of advertisements. Results suggest that not all emotions expressed during viewing increase sharing, and that the relationship between emotion and transmission is more complex than mere valence alone. Facial actions linked to positive emotions (i.e., smiles) were associated with increased sharing. But while some actions associated with negative emotion (e.g., lip depressor, associated with sadness) were linked to decreased sharing, others (i.e., nose wrinkles, associated with disgust) were linked to increased sharing. The ability to quickly collect facial responses at scale in people’s natural environment has important implications for marketers and opens up a range of avenues for further research. |
Tasks | |
Published | 2019-12-21 |
URL | https://arxiv.org/abs/1912.10311v1 |
https://arxiv.org/pdf/1912.10311v1.pdf | |
PWC | https://paperswithcode.com/paper/do-facial-expressions-predict-ad-sharing-a |
Repo | |
Framework | |