Paper Group NANR 26
Evaluations and Methods for Explanation through Robustness Analysis
Title | Evaluations and Methods for Explanation through Robustness Analysis |
Authors | Anonymous |
Abstract | Among multiple ways of interpreting a machine learning model, measuring the importance of a set of features tied to a prediction is probably one of the most intuitive ways to explain a model. In this paper, we link a set of features to a prediction through a new evaluation criterion, robustness analysis, which measures the minimum tolerance of adversarial perturbation. By measuring the tolerance level for an adversarial attack, we can extract the set of features that provides the most robust support for a current prediction and, by mounting a targeted adversarial attack, a set of features that contrasts the current prediction with a target class. Applying this methodology to various prediction tasks across multiple domains, we observe that the derived explanations indeed capture the significant feature set, both qualitatively and quantitatively. |
Tasks | Adversarial Attack |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Hye4KeSYDr |
https://openreview.net/pdf?id=Hye4KeSYDr | |
PWC | https://paperswithcode.com/paper/evaluations-and-methods-for-explanation |
Repo | |
Framework | |
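The criterion lends itself to a simple sampling approximation. Below is a minimal sketch, assuming a feature set S is scored by the smallest perturbation, restricted to the complement of S, that flips the prediction; the toy linear model, random-direction attack, and bisection are illustrative, not the paper's optimizer.

```python
# A minimal sketch: score a feature set S by the smallest perturbation on the
# remaining features that changes the prediction (larger = more robust support).
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=8), 0.1                 # toy linear classifier
predict = lambda x: int(w @ x + b > 0)

def min_adv_radius(x, keep, n_trials=2000, hi=10.0, tol=1e-3):
    """Estimate the smallest L2 radius of a perturbation on features
    outside `keep` that changes the prediction."""
    y0, lo = predict(x), 0.0
    mask = np.ones_like(x)
    mask[list(keep)] = 0.0                     # only perturb the complement
    def flips(r):
        d = rng.normal(size=(n_trials, x.size)) * mask
        d = d / np.linalg.norm(d, axis=1, keepdims=True) * r
        return any(predict(x + di) != y0 for di in d)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (lo, mid) if flips(mid) else (mid, hi)
    return hi

x = rng.normal(size=8)
# A larger radius means S lends more robust support to the prediction.
for S in [{0, 1}, {int(np.argmax(np.abs(w)))}]:
    print(S, round(min_adv_radius(x, S), 3))
```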
Learning Mahalanobis Metric Spaces via Geometric Approximation Algorithms
Title | Learning Mahalanobis Metric Spaces via Geometric Approximation Algorithms |
Authors | Anonymous |
Abstract | Learning Mahalanobis metric spaces is an important problem that has found numerous applications. Several algorithms have been designed for this problem, including Information Theoretic Metric Learning (ITML) [Davis et al. 2007] and Large Margin Nearest Neighbor (LMNN) classification [Weinberger and Saul 2009]. We consider a formulation of Mahalanobis metric learning as an optimization problem, where the objective is to minimize the number of violated similarity/dissimilarity constraints. We show that for any fixed ambient dimension, there exists a fully polynomial time approximation scheme (FPTAS) with nearly-linear running time. This result is obtained using tools from the theory of linear programming in low dimensions. We also discuss improvements of the algorithm in practice, and present experimental results on synthetic and real-world data sets. Our algorithm is fully parallelizable and performs favorably in the presence of adversarial noise. |
Tasks | Metric Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SkluFgrFwH |
https://openreview.net/pdf?id=SkluFgrFwH | |
PWC | https://paperswithcode.com/paper/learning-mahalanobis-metric-spaces-via-1 |
Repo | |
Framework | |
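The objective is easy to state concretely. A minimal sketch, assuming similarity constraints require d_A(x_i, x_j) <= u and dissimilarity constraints require d_A(x_i, x_j) >= l under the Mahalanobis distance; the thresholds and data below are illustrative.

```python
# Count the similarity/dissimilarity constraints violated by a PSD matrix A
# under the Mahalanobis distance d_A(x, y) = sqrt((x - y)^T A (x - y)).
import numpy as np

def violated(A, X, constraints, u=1.0, l=2.0):
    """Objective value: number of constraints violated by A."""
    count = 0
    for i, j, label in constraints:
        diff = X[i] - X[j]
        d = np.sqrt(diff @ A @ diff)
        if (label == +1 and d > u) or (label == -1 and d < l):
            count += 1
    return count

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
cons = [(0, 1, +1), (0, 2, -1), (3, 4, +1)]   # (i, j, similar/dissimilar)
print(violated(np.eye(3), X, cons))            # objective for A = I
```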
Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
Title | Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks |
Authors | Anonymous |
Abstract | A number of techniques for interpretability have been presented for deep learning in computer vision, typically with the goal of understanding what the networks have actually learned underlying a given classification decision. However, when it comes to deep video architectures, interpretability is still in its infancy, and we do not yet have a clear concept of how spatiotemporal features should be decoded. In this paper, we present a study comparing how 3D convolutional networks and convolutional LSTM networks learn features across temporally dependent frames. This is the first comparison of two video models that both convolve to learn spatial features but have principally different methods of modeling time. Additionally, we extend the concept of meaningful perturbation introduced by Fong & Vedaldi (2017) to the temporal dimension, to search for the most meaningful part of a sequence for a classification decision. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=S1efAp4YvB |
https://openreview.net/pdf?id=S1efAp4YvB | |
PWC | https://paperswithcode.com/paper/interpreting-video-features-a-comparison-of |
Repo | |
Framework | |
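The temporal extension of meaningful perturbation can be sketched as learning a per-frame mask. A minimal sketch, assuming masked frames are blended with a temporally blurred reference and the mask is kept small; the random model and hyperparameters are placeholders, not the paper's setup.

```python
# Learn a per-frame mask m in [0, 1]: minimizing the class score plus a
# sparsity penalty makes high-mask frames the "most meaningful" ones.
import torch

T, C, H, W = 16, 3, 32, 32
video = torch.randn(T, C, H, W)
blurred = video.mean(dim=0, keepdim=True).expand_as(video)  # crude reference
model = torch.nn.Sequential(torch.nn.Flatten(0), torch.nn.Linear(T*C*H*W, 10))

m = torch.zeros(T, 1, 1, 1, requires_grad=True)   # logits of the frame mask
opt = torch.optim.Adam([m], lr=0.05)
target = 3                                        # class under inspection
for _ in range(100):
    mask = torch.sigmoid(m)
    perturbed = mask * blurred + (1 - mask) * video
    score = model(perturbed)[target]
    loss = score + 0.1 * mask.mean()  # drop the score with as small a mask as possible
    opt.zero_grad(); loss.backward(); opt.step()
print(torch.sigmoid(m).squeeze())     # high values mark frames that mattered
```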
TOWARDS FEATURE SPACE ADVERSARIAL ATTACK
Title | TOWARDS FEATURE SPACE ADVERSARIAL ATTACK |
Authors | Anonymous |
Abstract | We propose a new type of adversarial attack on Deep Neural Networks (DNNs) for image classification. Unlike most existing attacks, which directly perturb input pixels, our attack perturbs abstract features, more specifically features that denote styles, including interpretable styles such as vivid colors and sharp outlines as well as uninterpretable ones. It induces model misclassification by injecting style changes imperceptible to humans, through an optimization procedure. We show that state-of-the-art pixel-space adversarial attack detection and defense techniques are ineffective in guarding against feature-space attacks. |
Tasks | Adversarial Attack, Image Classification |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=S1eqj1SKvr |
https://openreview.net/pdf?id=S1eqj1SKvr | |
PWC | https://paperswithcode.com/paper/towards-feature-space-adversarial-attack |
Repo | |
Framework | |
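The core idea can be sketched by optimizing in a style bottleneck instead of pixel space. A minimal sketch, where `enc`, `dec`, and `clf` are hypothetical stand-ins for a style encoder, decoder, and classifier; the untargeted objective and weights are illustrative.

```python
# Optimize a small perturbation delta on style features (not pixels) so that
# the decoded image is misclassified; all networks are hypothetical stand-ins.
import torch

enc = torch.nn.Linear(3*32*32, 64)    # hypothetical style encoder
dec = torch.nn.Linear(64, 3*32*32)    # hypothetical decoder
clf = torch.nn.Linear(3*32*32, 10)    # hypothetical classifier

x = torch.randn(3*32*32)
y = 7                                 # true label
delta = torch.zeros(64, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.01)
for _ in range(200):
    x_adv = dec(enc(x) + delta)                   # perturb styles, not pixels
    logits = clf(x_adv)
    # maximize loss on the true class while keeping the style change small
    loss = -torch.nn.functional.cross_entropy(logits[None], torch.tensor([y]))
    loss = loss + 0.1 * delta.norm()
    opt.zero_grad(); loss.backward(); opt.step()
print(clf(dec(enc(x) + delta)).argmax().item())   # ideally != y
```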
Credible Sample Elicitation by Deep Learning, for Deep Learning
Title | Credible Sample Elicitation by Deep Learning, for Deep Learning |
Authors | Anonymous |
Abstract | It is important to collect credible training samples $(x,y)$ for building data-intensive learning systems (e.g., a deep learning system). In the literature, there is a line of studies on eliciting distributional information from self-interested agents who hold relevant information. Asking people to report a complex distribution $p(x)$, though theoretically viable, is challenging in practice, primarily due to the heavy cognitive load required for human agents to reason about and report this high-dimensional information. Consider the example where we are interested in building an image classifier by first collecting a certain category of high-dimensional image data. While classical elicitation results apply to eliciting a complex, generative (and continuous) distribution $p(x)$ for this image data, we are interested in eliciting samples $x_i \sim p(x)$ from agents. This paper introduces a deep-learning-aided method to incentivize credible sample contributions from selfish and rational agents. The challenge is to design an incentive-compatible score function for each reported sample that induces truthful reports rather than arbitrary or even adversarial ones. We show that with accurate estimation of a certain $f$-divergence function, we are able to achieve approximate incentive compatibility in eliciting truthful samples. We then present an efficient estimator with theoretical guarantees by studying the variational forms of the $f$-divergence function. Our work complements the literature on information elicitation by introducing the problem of \emph{sample elicitation}. We also show a connection between this sample elicitation problem and $f$-GAN, and how this connection can help reconstruct an estimator of the distribution based on collected samples. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SkgQwpVYwH |
https://openreview.net/pdf?id=SkgQwpVYwH | |
PWC | https://paperswithcode.com/paper/credible-sample-elicitation-by-deep-learning-1 |
Repo | |
Framework | |
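The variational form referred to in the abstract is the standard Fenchel-dual representation of an $f$-divergence, which is also what $f$-GAN optimizes; a sketch:

```latex
D_f(P \,\|\, Q) \;=\; \sup_{T}\;\Big(\,
  \mathbb{E}_{x \sim P}\big[T(x)\big]
  \;-\; \mathbb{E}_{x \sim Q}\big[f^{*}(T(x))\big] \Big)
```

Here $f^{*}$ is the convex conjugate of $f$; parameterizing $T$ with a neural network yields a tractable estimator from samples of $P$ and $Q$.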
Yet another but more efficient black-box adversarial attack: tiling and evolution strategies
Title | Yet another but more efficient black-box adversarial attack: tiling and evolution strategies |
Authors | Anonymous |
Abstract | We introduce a new black-box attack achieving state-of-the-art performance. Our approach is based on a new objective function, borrowing ideas from $\ell_\infty$ white-box attacks and particularly designed to fit derivative-free optimization requirements. It only requires access to the logits of the classifier, without any other information, which is a more realistic scenario. Beyond introducing a new objective function, we extend previous work on black-box adversarial attacks to a larger spectrum of evolution strategies and other derivative-free optimization methods. We also highlight a new intriguing property: deep neural networks are not robust to single-shot tiled attacks. With a budget limited to $10,000$ queries, our models achieve a success rate of up to $99.2\%$ against the InceptionV3 classifier, with $630$ queries to the network on average in the untargeted setting, an improvement of $90$ queries over the current state of the art. In the targeted setting, with a budget limited to $100,000$ queries, we reach a $100\%$ success rate with $6,662$ queries on average, i.e., $800$ fewer queries than the current state of the art. |
Tasks | Adversarial Attack |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rygEokBKPS |
https://openreview.net/pdf?id=rygEokBKPS | |
PWC | https://paperswithcode.com/paper/yet-another-but-more-efficient-black-box |
Repo | |
Framework | |
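The tiling idea can be sketched with a simple (1+1) evolution strategy over per-tile signs of an $\ell_\infty$-bounded perturbation; the toy logits function, tile size, and budget below are illustrative.

```python
# (1+1)-ES over a tiled sign pattern, querying only the logits.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 3*64*64))
logits = lambda x: W @ x.ravel()               # black box: logits only

def margin(x, y):                              # untargeted objective
    z = logits(x)
    return z[y] - np.max(np.delete(z, y))      # negative once misclassified

x = rng.uniform(size=(3, 64, 64)); y = int(np.argmax(logits(x)))
eps, tiles = 0.05, 8
theta = rng.choice([-1.0, 1.0], size=(3, tiles, tiles))  # one sign per tile
best = margin(x + eps * np.kron(theta, np.ones((8, 8))), y)
for _ in range(2000):                          # query budget
    cand = theta.copy()
    c, i, j = rng.integers(3), rng.integers(tiles), rng.integers(tiles)
    cand[c, i, j] *= -1                        # mutate one tile's sign
    val = margin(x + eps * np.kron(cand, np.ones((8, 8))), y)
    if val < best:
        theta, best = cand, val
    if best < 0:
        break
print("misclassified:", best < 0)
```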
Disentangled GANs for Controllable Generation of High-Resolution Images
Title | Disentangled GANs for Controllable Generation of High-Resolution Images |
Authors | Anonymous |
Abstract | Generative adversarial networks (GANs) have achieved great success at generating realistic samples. However, achieving disentangled and controllable generation remains challenging for GANs, especially in the high-resolution image domain. Motivated by this, we introduce AC-StyleGAN, a combination of AC-GAN and StyleGAN, demonstrating that controllable generation of high-resolution images is possible with sufficient supervision. More importantly, using as little as 5% of the labelled data significantly improves the disentanglement quality. Inspired by the observed separation of fine and coarse styles in StyleGAN, we then extend AC-StyleGAN to a new image-to-image model called FC-StyleGAN for semantic manipulation of fine-grained factors in a high-resolution image. In experiments, we show that FC-StyleGAN performs well in controlling only fine-grained factors, with the use of instance normalization, and also demonstrate its good generalization to unseen images. Finally, we create two new datasets – Falcor3D and Isaac3D – with higher resolution, more photorealism, and richer variation than existing disentanglement datasets. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SyezSCNYPB |
https://openreview.net/pdf?id=SyezSCNYPB | |
PWC | https://paperswithcode.com/paper/disentangled-gans-for-controllable-generation |
Repo | |
Framework | |
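The AC-GAN ingredient that AC-StyleGAN builds on can be sketched as an auxiliary classification head on the discriminator; the networks and shapes below are placeholders, not the paper's architecture.

```python
# AC-GAN sketch: the discriminator carries an auxiliary classifier head, and
# the generator is also trained to make the factor labels recoverable.
import torch, torch.nn.functional as F

G = torch.nn.Linear(64 + 5, 3*32*32)            # generator(z, label)
D_feat = torch.nn.Linear(3*32*32, 128)
D_adv = torch.nn.Linear(128, 1)                 # real/fake head
D_cls = torch.nn.Linear(128, 5)                 # auxiliary classifier head

z = torch.randn(16, 64)
labels = torch.randint(0, 5, (16,))
fake = G(torch.cat([z, F.one_hot(labels, 5).float()], dim=1))
h = D_feat(fake)
adv_loss = F.binary_cross_entropy_with_logits(D_adv(h), torch.ones(16, 1))
aux_loss = F.cross_entropy(D_cls(h), labels)    # labels must be recoverable
g_loss = adv_loss + aux_loss                    # generator objective sketch
print(g_loss.item())
```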
UWGAN: UNDERWATER GAN FOR REAL-WORLD UNDERWATER COLOR RESTORATION AND DEHAZING
Title | UWGAN: UNDERWATER GAN FOR REAL-WORLD UNDERWATER COLOR RESTORATION AND DEHAZING |
Authors | Anonymous |
Abstract | In real-world underwater environments, exploration of seabed resources, underwater archaeology, and underwater fishing rely on a variety of sensors, of which the vision sensor is the most important due to its high information content and its non-intrusive, passive nature. However, wavelength-dependent light attenuation and back-scattering result in color distortion and a haze effect, which degrade the visibility of images. To address this problem, we first propose an unsupervised generative adversarial network (GAN) for generating realistic underwater images (simulating color distortion and the haze effect) from in-air image and depth map pairs. Second, a U-Net, trained efficiently on the synthetic underwater dataset, is adopted for color restoration and dehazing. Our model directly reconstructs clear underwater images using end-to-end autoencoder networks, while maintaining scene-content structural similarity. The results obtained by our method were compared with existing methods qualitatively and quantitatively. Experimental results on open real-world underwater datasets demonstrate that the presented method performs well on different real underwater scenes, and the processing speed can reach 125 FPS on a single NVIDIA 1060 GPU. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HkgMxkHtPH |
https://openreview.net/pdf?id=HkgMxkHtPH | |
PWC | https://paperswithcode.com/paper/uwgan-underwater-gan-for-real-world |
Repo | |
Framework | |
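The synthesis step typically rests on the standard attenuation-plus-backscatter image formation model; a minimal sketch, with illustrative (not fitted) coefficients ordered R, G, B:

```python
# I_c = J_c * exp(-beta_c * d) + B_c * (1 - exp(-beta_c * d)):
# wavelength-dependent attenuation of the scene radiance plus backscatter.
import numpy as np

def underwater(J, depth, beta=(0.40, 0.15, 0.08), B=(0.05, 0.25, 0.30)):
    """Simulate an underwater image from an in-air image J (H, W, 3 in [0, 1])
    and a per-pixel depth map (H, W) in meters."""
    t = np.exp(-np.asarray(beta) * depth[..., None])  # per-channel transmission
    return J * t + np.asarray(B) * (1 - t)            # attenuation + backscatter

J = np.random.default_rng(0).uniform(size=(64, 64, 3))
depth = np.full((64, 64), 5.0)
print(underwater(J, depth).mean(axis=(0, 1)))         # red attenuates fastest
```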
Generating valid Euclidean distance matrices
Title | Generating valid Euclidean distance matrices |
Authors | Anonymous |
Abstract | Generating point clouds, e.g., molecular structures, in arbitrary rotations, translations, and enumerations remains a challenging task. Meanwhile, neural networks utilizing symmetry-invariant layers have been shown to optimize their training objective in a data-efficient way. In this spirit, we present an architecture that produces valid Euclidean distance matrices, which by construction are already invariant under rotation and translation of the described object. Motivated by the goal of generating molecular structures in Cartesian space, we use this architecture to construct a Wasserstein GAN with a permutation-invariant critic network. This makes it possible to generate molecular structures in a one-shot fashion by producing Euclidean distance matrices that have a three-dimensional embedding. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Skl3SkSKDr |
https://openreview.net/pdf?id=Skl3SkSKDr | |
PWC | https://paperswithcode.com/paper/generating-valid-euclidean-distance-matrices |
Repo | |
Framework | |
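What makes a distance matrix "valid" here is the classical Schoenberg criterion; a minimal sketch of the check and of recovering a point embedding from a valid squared EDM:

```python
# Schoenberg: a symmetric, zero-diagonal D is a matrix of squared Euclidean
# distances iff the Gram matrix -0.5 * J D J is PSD, with J = I - (1/n) 11^T.
import numpy as np

def is_edm(D, tol=1e-8):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ D @ J                     # Gram matrix of centered points
    return bool(np.all(np.linalg.eigvalsh(G) >= -tol))

def embed(D, dim=3):
    """Recover a `dim`-dimensional point cloud from a valid squared EDM."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(-0.5 * J @ D @ J)
    vals = np.clip(vals[-dim:], 0, None)     # top eigenpairs, clipped at 0
    return vecs[:, -dim:] * np.sqrt(vals)

X = np.random.default_rng(0).normal(size=(6, 3))
D = ((X[:, None] - X[None]) ** 2).sum(-1)    # squared pairwise distances
Y = embed(D)
print(is_edm(D), np.allclose(((Y[:, None] - Y[None]) ** 2).sum(-1), D))
```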
From English to Foreign Languages: Transferring Pre-trained Language Models
Title | From English to Foreign Languages: Transferring Pre-trained Language Models |
Authors | Anonymous |
Abstract | Pre-trained models have demonstrated their effectiveness in many downstream natural language processing (NLP) tasks. The availability of multilingual pre-trained models enables zero-shot transfer of NLP tasks from high resource languages to low resource ones. However, recent research in improving pre-trained models focuses heavily on English. While it is possible to train the latest neural architectures for other languages from scratch, it is undesirable due to the required amount of compute. In this work, we tackle the problem of transferring an existing pre-trained model from English to other languages under a limited computational budget. With a single GPU, our approach can obtain a foreign BERT-base model within a day and a foreign BERT-large within two days. Furthermore, evaluating our models on six languages, we demonstrate that our models are better than multilingual BERT on two zero-shot tasks: natural language inference and dependency parsing. |
Tasks | Dependency Parsing, Natural Language Inference |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Bkle6T4YvB |
https://openreview.net/pdf?id=Bkle6T4YvB | |
PWC | https://paperswithcode.com/paper/from-english-to-foreign-languages |
Repo | |
Framework | |
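One plausible warm-start is sketched below, under the assumption that foreign word embeddings are initialized as weighted combinations of English embeddings (weights e.g. from a bilingual dictionary or word alignments); the random alignment matrix is purely illustrative of the shape of the computation, not the paper's method.

```python
# Warm-start foreign embeddings as convex combinations of English ones.
import numpy as np

rng = np.random.default_rng(0)
en_emb = rng.normal(size=(3000, 128))       # pre-trained English embeddings
align = rng.uniform(size=(500, 3000))       # foreign-to-English weights
align /= align.sum(axis=1, keepdims=True)   # rows sum to one (convex weights)
fr_emb = align @ en_emb                     # warm-started foreign embeddings
print(fr_emb.shape)                         # (500, 128)
```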
Enforcing Physical Constraints in Neural Networks through Differentiable PDE Layer
Title | Enforcing Physical Constraints in Neural Networks through Differentiable PDE Layer |
Authors | Chiyu “Max” Jiang, Karthik Kashinath, Prabhat, Philip Marcus |
Abstract | Recent studies at the intersection of physics and deep learning have illustrated successes in applying deep neural networks to partially or fully replace costly physics simulations. Enforcing physical constraints on solutions generated by neural networks remains a challenge, yet it is essential to the accuracy and trustworthiness of such model predictions. Many systems in the physical sciences are governed by Partial Differential Equations (PDEs). We show that enforcing these as hard constraints is inefficient in conventional frameworks due to the high dimensionality of the generated fields. To this end, we propose a novel spectral projection layer for neural networks that efficiently enforces spatial PDE constraints using spectral methods, yet is fully differentiable, allowing its use as a layer in neural networks that supports end-to-end training. We show that its computational cost is lower than that of a regular convolution layer. We apply it to an important class of physical systems – incompressible turbulent flows, where the divergence-free PDE constraint is required. We efficiently train a 3D Conditional Generative Adversarial Network (CGAN) for turbulent-flow super-resolution while guaranteeing the spatial PDE constraint of zero divergence. Furthermore, our empirical results show that the model produces realistic flow fields with more accurate flow statistics when trained with hard constraints imposed via the proposed differentiable spectral projection layer, as compared to soft-constrained and unconstrained counterparts. |
Tasks | Super-Resolution |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=B1eyA3VFwS |
https://openreview.net/pdf?id=B1eyA3VFwS | |
PWC | https://paperswithcode.com/paper/enforcing-physical-constraints-in-neural |
Repo | |
Framework | |
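For the divergence-free case, the projection has a classical closed form in Fourier space (the Leray/Helmholtz projection); a minimal sketch, with an illustrative grid size and test field:

```python
# Remove the component of the velocity field parallel to the wavevector k in
# Fourier space, yielding the nearest divergence-free field.
import numpy as np

def project_div_free(u):
    """Project a velocity field u of shape (3, N, N, N) onto its
    divergence-free part using FFTs."""
    n = u.shape[-1]
    k = np.fft.fftfreq(n) * n                       # integer wavenumbers
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    K = np.stack([kx, ky, kz])                      # (3, N, N, N)
    u_hat = np.fft.fftn(u, axes=(1, 2, 3))
    k2 = (K ** 2).sum(0); k2[0, 0, 0] = 1.0         # avoid division by zero
    div = (K * u_hat).sum(0)                        # k . u_hat
    u_hat -= K * div / k2                           # subtract parallel part
    return np.real(np.fft.ifftn(u_hat, axes=(1, 2, 3)))

u = np.random.default_rng(0).normal(size=(3, 16, 16, 16))
v = project_div_free(u)
K = np.stack(np.meshgrid(*[np.fft.fftfreq(16) * 16] * 3, indexing="ij"))
div_hat = (K * np.fft.fftn(v, axes=(1, 2, 3))).sum(0)
print(np.abs(div_hat).max())                        # ~0: divergence-free
```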
Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning
Title | Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning |
Authors | Anonymous |
Abstract | Reinforcement Learning (RL) has demonstrated promising results across several sequential decision-making tasks. However, reinforcement learning struggles to learn efficiently, thus limiting its pervasive application to several challenging problems. A typical RL agent learns solely from its own trial-and-error experiences, requiring many experiences to learn a successful policy. To alleviate this problem, we propose collaborative inter-agent knowledge distillation (CIKD). CIKD is a learning framework that uses an ensemble of RL agents to execute different policies in the environment while sharing knowledge amongst agents in the ensemble. Our experiments demonstrate that CIKD improves upon state-of-the-art RL methods in sample efficiency and performance on several challenging MuJoCo benchmark tasks. Additionally, we present an in-depth investigation on how CIKD leads to performance improvements. |
Tasks | Decision Making |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkeYSlrYwH |
https://openreview.net/pdf?id=BkeYSlrYwH | |
PWC | https://paperswithcode.com/paper/collaborative-inter-agent-knowledge |
Repo | |
Framework | |
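The knowledge-sharing step can be sketched as periodically regressing each agent toward the action distribution of the currently best-performing agent on shared states; the toy policies and batch below are placeholders, not the paper's training loop.

```python
# Distill each ensemble member toward the best agent's policy on shared states.
import torch, torch.nn.functional as F

policies = [torch.nn.Linear(8, 4) for _ in range(3)]    # ensemble of agents
opts = [torch.optim.Adam(p.parameters(), lr=1e-3) for p in policies]
states = torch.randn(64, 8)                             # shared state batch
best = 0                                                # index of best agent

with torch.no_grad():
    teacher = F.log_softmax(policies[best](states), dim=-1)
for i, (pi, opt) in enumerate(zip(policies, opts)):
    if i == best:
        continue
    student = F.log_softmax(pi(states), dim=-1)
    loss = F.kl_div(student, teacher, log_target=True, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```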
Where is the Information in a Deep Network?
Title | Where is the Information in a Deep Network? |
Authors | Anonymous |
Abstract | Whatever information a deep neural network has gleaned from past data is encoded in its weights. How this information affects the response of the network to future data is largely an open question. In fact, even how to define and measure information in a network entails some subtleties. We measure information in the weights of a deep neural network as the optimal trade-off between accuracy of the network and complexity of the weights relative to a prior. Depending on the prior, the definition reduces to known information measures such as Shannon Mutual Information and Fisher Information, but in general it affords added flexibility that enables us to relate it to generalization, via the PAC-Bayes bound, and to invariance. For the latter, we introduce a notion of effective information in the activations, which are deterministic functions of future inputs. We relate this to the Information in the Weights, and use this result to show that models of low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs. These relations hinge not only on the architecture of the model, but also on how it is trained. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkgHWkrtPB |
https://openreview.net/pdf?id=BkgHWkrtPB | |
PWC | https://paperswithcode.com/paper/where-is-the-information-in-a-deep-network |
Repo | |
Framework | |
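The trade-off in the definition can be sketched as an information Lagrangian:

```latex
\mathcal{L}(q) \;=\;
  \underbrace{\mathbb{E}_{w \sim q}\big[L_{\mathcal{D}}(w)\big]}_{\text{accuracy on the data}}
  \;+\; \beta\,
  \underbrace{\mathrm{KL}\big(q(w)\,\|\,p(w)\big)}_{\text{complexity of the weights}}
```

with the KL term at the optimum playing the role of the Information in the Weights; different choices of the prior $p$ recover the Shannon and Fisher special cases mentioned in the abstract.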
Understanding and Training Deep Diagonal Circulant Neural Networks
Title | Understanding and Training Deep Diagonal Circulant Neural Networks |
Authors | Anonymous |
Abstract | In this paper, we study deep diagonal circulant neural networks, that is, deep neural networks in which weight matrices are products of diagonal and circulant ones. Besides a theoretical analysis of their expressivity, we introduce principled techniques for training these models: we devise an initialization scheme and propose a smart use of non-linearity functions in order to train deep diagonal circulant networks. Furthermore, we show that these networks outperform recently introduced deep networks with other types of structured layers. We conduct a thorough experimental study comparing the performance of deep diagonal circulant networks with state-of-the-art models based on structured matrices and with dense models. We show that our models achieve better accuracy than other structured approaches while requiring 2x fewer weights than the next best approach. Finally, we train deep diagonal circulant networks to build compact and accurate models on a real-world video classification dataset with over 3.8 million training examples. |
Tasks | Video Classification |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rygL4gStDS |
https://openreview.net/pdf?id=rygL4gStDS | |
PWC | https://paperswithcode.com/paper/understanding-and-training-deep-diagonal |
Repo | |
Framework | |
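A diagonal-circulant layer is cheap because a circulant matrix is determined by its first column and multiplies a vector via the FFT; a minimal sketch with a dense reference check:

```python
# diag(d) @ Circ(c) @ x in O(n log n): the circulant product is a circular
# convolution, so each layer stores two vectors instead of a dense matrix.
import numpy as np

def diag_circulant(x, c, d):
    """Compute diag(d) @ Circ(c) @ x via the FFT."""
    circ_x = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))
    return d * circ_x

n = 8
rng = np.random.default_rng(0)
x, c, d = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

# Dense reference: Circ(c)[i, j] = c[(i - j) mod n]
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])
print(np.allclose(diag_circulant(x, c, d), np.diag(d) @ C @ x))   # True
```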
A GOODNESS OF FIT MEASURE FOR GENERATIVE NETWORKS
Title | A GOODNESS OF FIT MEASURE FOR GENERATIVE NETWORKS |
Authors | Anonymous |
Abstract | We define a goodness of fit measure for generative networks which captures how well the network can generate the training data, which is necessary to learn the true data distribution. We demonstrate how our measure can be leveraged to understand mode collapse in generative adversarial networks and provide practitioners with a novel way to perform model comparison and early stopping without having to access another trained model as with Frechet Inception Distance or Inception Score. This measure shows that several successful, popular generative models, such as DCGAN and WGAN, fall very short of learning the data distribution. We identify this issue in generative models and empirically show that overparameterization via subsampling data and using a mixture of models improves performance in terms of goodness of fit. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BklsagBYPS |
https://openreview.net/pdf?id=BklsagBYPS | |
PWC | https://paperswithcode.com/paper/a-goodness-of-fit-measure-for-generative |
Repo | |
Framework | |
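One way to probe whether a generator can reproduce a training point, in the spirit of the abstract, is latent recovery by gradient descent; the toy generator and L2 objective below are illustrative stand-ins for the paper's measure.

```python
# Recover the best latent code for a sample and report the residual error:
# a small residual means the point is (approximately) generable.
import torch

G = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.Tanh(),
                        torch.nn.Linear(64, 3*32*32))
x = torch.randn(3*32*32)                        # a "training" sample

z = torch.zeros(16, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(500):
    loss = (G(z) - x).pow(2).mean()             # min_z ||G(z) - x||^2
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())                              # residual reconstruction error
```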