Paper Group AWR 124
Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction. On the “steerability” of generative adversarial networks. Learning to compute inner consensus: A novel approach to modeling agreement between Capsules. SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications. Simultaneous Object Detection and Sema …
Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction
Title | Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction |
Authors | Yunhan Jia, Yantao Lu, Senem Velipasalar, Zhenyu Zhong, Tao Wei |
Abstract | Neural networks are known to be vulnerable to carefully crafted adversarial examples, and these malicious samples often transfer, i.e., they maintain their effectiveness even against other models. While great effort has been devoted to the transferability of adversarial examples, surprisingly little attention has been paid to its impact on real-world deep learning deployments. In this paper, we investigate the transferability of adversarial examples across a wide range of real-world computer vision tasks, including image classification, explicit content detection, optical character recognition (OCR), and object detection. This setting represents the cybercriminal's situation, in which an ensemble of different detection mechanisms must be evaded all at once. We propose a practical attack that overcomes existing attacks' limitation of requiring task-specific loss functions by instead targeting the 'dispersion' of internal feature maps. We report evaluations on four different computer vision tasks provided by the Google Cloud Vision APIs, showing that our approach outperforms existing attacks and degrades the performance of multiple CV tasks by a large margin with only modest perturbations.
Tasks | Image Classification, Object Detection, Optical Character Recognition |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03333v1 |
PDF | https://arxiv.org/pdf/1905.03333v1.pdf
PWC | https://paperswithcode.com/paper/190503333 |
Repo | https://github.com/lyt910522/Dispersion_reduction |
Framework | pytorch |
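The attack above replaces task-specific losses with a feature-space objective. A minimal PyTorch sketch of that idea follows, using PGD to minimize the standard deviation ("dispersion") of an intermediate feature map of a surrogate model; the surrogate (VGG-16), the layer cut-off, and the step sizes are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torchvision.models as models

model = models.vgg16(pretrained=True).eval()
feature_extractor = model.features[:15]  # hypothetical intermediate layer

def dispersion_reduction(x, eps=16/255, alpha=2/255, steps=20):
    """Perturb x so the chosen feature map's dispersion (std) collapses."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = feature_extractor(x_adv).std()  # dispersion of the feature map
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - alpha * x_adv.grad.sign()         # descend on dispersion
            x_adv = x.clone() + (x_adv - x).clamp(-eps, eps)  # project to L_inf ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

Because the objective never references a task loss, the same perturbation can be handed to classification, OCR, or detection models, which is what makes the attack cross-task.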
On the “steerability” of generative adversarial networks
Title | On the “steerability” of generative adversarial networks |
Authors | Ali Jahanian, Lucy Chai, Phillip Isola |
Abstract | An open secret in contemporary machine learning is that many models work beautifully on standard benchmarks but fail to generalize outside the lab. This has been attributed to biased training data, which provide poor coverage over real world events. Generative models are no exception, but recent advances in generative adversarial networks (GANs) suggest otherwise - these models can now synthesize strikingly realistic and diverse images. Is generative modeling of photos a solved problem? We show that although current GANs can fit standard datasets very well, they still fall short of being comprehensive models of the visual manifold. In particular, we study their ability to fit simple transformations such as camera movements and color changes. We find that the models reflect the biases of the datasets on which they are trained (e.g., centered objects), but that they also exhibit some capacity for generalization: by “steering” in latent space, we can shift the distribution while still creating realistic images. We hypothesize that the degree of distributional shift is related to the breadth of the training data distribution. Thus, we conduct experiments to quantify the limits of GAN transformations and introduce techniques to mitigate the problem. Code is released on our project page: https://ali-design.github.io/gan_steerability/ |
Tasks | |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07171v4 |
PDF | https://arxiv.org/pdf/1907.07171v4.pdf
PWC | https://paperswithcode.com/paper/on-the-steerability-of-generative-adversarial |
Repo | https://github.com/ali-design/gan_steerability |
Framework | pytorch |
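The "steering" described in the abstract amounts to learning a latent walk whose effect on generated images matches a target edit. Below is a minimal sketch of that training objective; the generator G is a placeholder callable, and the zoom edit is a hypothetical stand-in for the paper's transformations.

```python
import torch
import torch.nn.functional as F

def edit(imgs, alpha):
    """Hypothetical target transform: center zoom by a factor (1 + alpha)."""
    zoomed = F.interpolate(imgs, scale_factor=1 + alpha, mode="bilinear",
                           align_corners=False)
    h, w = imgs.shape[-2:]
    top = (zoomed.shape[-2] - h) // 2
    left = (zoomed.shape[-1] - w) // 2
    return zoomed[..., top:top + h, left:left + w]

def train_walk(G, z_dim, steps=1000, lr=1e-3, device="cpu"):
    """Learn a direction w such that G(z + alpha*w) ~= edit(G(z), alpha)."""
    w = torch.zeros(1, z_dim, device=device, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        z = torch.randn(16, z_dim, device=device)
        alpha = torch.rand(1).item()      # random step size per batch
        loss = F.mse_loss(G(z + alpha * w), edit(G(z), alpha))
        opt.zero_grad(); loss.backward(); opt.step()
    return w.detach()
```

How far the learned walk can push the output before realism breaks down is exactly the "limit of GAN transformations" the paper quantifies.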
Learning to compute inner consensus: A novel approach to modeling agreement between Capsules
Title | Learning to compute inner consensus: A novel approach to modeling agreement between Capsules |
Authors | Gonçalo Faria |
Abstract | This project considers Capsule Networks, a recently introduced machine learning model that has shown promising results regarding generalization and preservation of spatial information with few parameters. The inner routing procedures proposed for Capsule Networks thus far establish a priori how the routing relations are modeled, which limits the expressiveness of the underlying model. In this project, we propose two distinct ways in which the routing procedure can be learned like any other network parameter.
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12737v5 |
PDF | https://arxiv.org/pdf/1909.12737v5.pdf
PWC | https://paperswithcode.com/paper/learning-to-compute-inner-consensus-a-noble |
Repo | https://github.com/goncalorafaria/learning-inner-consensus |
Framework | tf |
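The abstract's central claim is that the routing procedure can be learned like any other parameter. The sketch below shows one plausible parameterization of this idea, an assumption rather than the paper's exact design: a small MLP scores the agreement between lower-capsule predictions and candidate higher-capsule outputs in place of a fixed dot-product rule.

```python
import torch
import torch.nn as nn

class LearnedRouting(nn.Module):
    def __init__(self, caps_dim, iters=3):
        super().__init__()
        self.iters = iters
        self.agree = nn.Sequential(          # learned agreement function
            nn.Linear(2 * caps_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, u_hat):
        # u_hat: (batch, n_in, n_out, caps_dim) predictions from lower capsules
        b, n_in, n_out, d = u_hat.shape
        logits = torch.zeros(b, n_in, n_out, 1, device=u_hat.device)
        for _ in range(self.iters):
            c = logits.softmax(dim=2)                 # routing coefficients
            v = (c * u_hat).sum(dim=1, keepdim=True)  # candidate higher capsules
            pair = torch.cat([u_hat, v.expand_as(u_hat)], dim=-1)
            logits = logits + self.agree(pair)        # learned agreement update
        return v.squeeze(1)                           # (batch, n_out, caps_dim)
```

Since `self.agree` trains end to end with the rest of the network, the routing relation is no longer fixed a priori, which is the expressiveness gain the abstract argues for.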
SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications
Title | SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications |
Authors | Pengyi Zhang, Yunxin Zhong, Xiaoqiong Li |
Abstract | Drones, or general Unmanned Aerial Vehicles (UAVs), endowed with computer vision functions by on-board cameras and embedded systems, have become popular in a wide range of applications. However, real-time scene parsing through object detection running on a UAV platform is very challenging due to the limited memory and computing power of embedded devices. To deal with these challenges, in this paper we propose to learn efficient deep object detectors through channel pruning of convolutional layers. To this end, we enforce channel-level sparsity of convolutional layers by imposing L1 regularization on channel scaling factors and prune less informative feature channels to obtain "slim" object detectors. Based on this approach, we present SlimYOLOv3, which has fewer trainable parameters and floating point operations (FLOPs) than the original YOLOv3 (Joseph Redmon et al., 2018), as a promising solution for real-time object detection on UAVs. We evaluate SlimYOLOv3 on the VisDrone2018-Det benchmark dataset; it achieves compelling results in comparison with its unpruned counterpart, including a ~90.8% decrease in FLOPs, a ~92.0% decline in parameter size, roughly 2x faster runtime, and detection accuracy comparable to YOLOv3. Experimental results with different pruning ratios consistently verify that the proposed SlimYOLOv3, with its narrower structure, is more efficient, faster, and better than YOLOv3, and thus more suitable for real-time object detection on UAVs. Our code is made publicly available at https://github.com/PengyiZhang/SlimYOLOv3.
Tasks | Object Detection, Real-Time Object Detection, Scene Parsing |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11093v1 |
PDF | https://arxiv.org/pdf/1907.11093v1.pdf
PWC | https://paperswithcode.com/paper/slimyolov3-narrower-faster-and-better-for |
Repo | https://github.com/PengyiZhang/SlimYOLOv3 |
Framework | pytorch |
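The pruning recipe in the abstract (L1 regularization on channel scaling factors, then dropping low-scale channels) can be sketched compactly, with batch-norm gammas serving as the scaling factors. The global threshold policy and the YOLOv3-specific wiring are simplifications.

```python
import torch
import torch.nn as nn

def sparsity_penalty(model, lam=1e-4):
    """L1 regularizer on channel scaling factors (batch-norm gammas)."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def channel_mask(model, prune_ratio=0.5):
    """Global threshold on |gamma|; returns a keep-mask per BN layer."""
    gammas = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    thresh = gammas.sort().values[int(len(gammas) * prune_ratio)]
    return {name: m.weight.data.abs() > thresh
            for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

# During training:  loss = detection_loss + sparsity_penalty(model)
# After training:   masks = channel_mask(model); rebuild a narrower model that
#                   keeps only the masked channels, then fine-tune it.
```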
Simultaneous Object Detection and Semantic Segmentation
Title | Simultaneous Object Detection and Semantic Segmentation |
Authors | Niels Ole Salscheider |
Abstract | Both object detection in and semantic segmentation of camera images are important tasks for automated vehicles. Object detection is necessary so that the planning and behavior modules can reason about other road users. Semantic segmentation provides, for example, free-space information and information about static and dynamic parts of the environment. There has been a lot of research on solving both tasks using Convolutional Neural Networks. These approaches give good results but are computationally demanding. In practice, a compromise has to be found between detection performance, detection quality, and the number of tasks; otherwise it is not possible to meet the real-time requirements of automated vehicles. In this work, we propose a neural network architecture to solve both tasks simultaneously. The architecture was designed to run at around 10 Hz on 1 MP images on current hardware. Our approach achieves a mean IoU of 61.2% for the semantic segmentation task on the challenging Cityscapes benchmark. It also achieves an average precision of 69.3% for cars and 67.7% on the moderate difficulty level of the KITTI benchmark.
Tasks | Object Detection, Semantic Segmentation |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.02285v2 |
PDF | https://arxiv.org/pdf/1905.02285v2.pdf
PWC | https://paperswithcode.com/paper/simultaneous-object-detection-and-semantic |
Repo | https://github.com/fzi-forschungszentrum-informatik/NNAD |
Framework | none |
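A minimal sketch of the shared-backbone, two-head design this entry describes: one feature extractor feeds both a detection head and a segmentation head, trained under a combined loss. The layer sizes, anchor count, and loss weighting below are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class JointDetSeg(nn.Module):
    def __init__(self, n_classes_seg=19, n_anchors=9, n_classes_det=8):
        super().__init__()
        self.backbone = nn.Sequential(              # stand-in feature extractor
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # detection head: per-anchor box (4) + objectness (1) + class scores
        self.det_head = nn.Conv2d(128, n_anchors * (5 + n_classes_det), 1)
        # segmentation head: per-pixel class logits, upsampled to input size
        self.seg_head = nn.Sequential(
            nn.Conv2d(128, n_classes_seg, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False))

    def forward(self, x):
        feats = self.backbone(x)                    # computed once, used twice
        return self.det_head(feats), self.seg_head(feats)

# total_loss = det_loss + lambda_seg * seg_loss    # the weight is a tuning choice
```

Sharing the backbone is what buys the real-time budget: the expensive features are computed once for both tasks.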
Conditioning LSTM Decoder and Bi-directional Attention Based Question Answering System
Title | Conditioning LSTM Decoder and Bi-directional Attention Based Question Answering System |
Authors | Heguang Liu |
Abstract | Applying neural networks to question answering has gained increasing popularity in recent years. In this paper, I implement a model with a bi-directional attention flow layer, connected to a multi-layer LSTM encoder, connected to one start-index decoder and one conditioning end-index decoder. I introduce a new end-index decoder layer that conditions on the start-index output; experiments show this increases model performance by 15.16%. For prediction, I propose a new smart-span equation that rewards both short answer length and high probability for the start and end indices, which further improves prediction accuracy. The best single model achieves an F1 score of 73.97% and an EM score of 64.95% on the test set.
Tasks | Question Answering |
Published | 2019-05-02 |
URL | http://arxiv.org/abs/1905.02019v1 |
PDF | http://arxiv.org/pdf/1905.02019v1.pdf
PWC | https://paperswithcode.com/paper/conditioning-lstm-decoder-and-bi-directional |
Repo | https://github.com/hyuna915/squad |
Framework | none |
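The smart-span selection in the abstract can be sketched as a constrained search over start/end pairs. The exact reward used in the paper is not reproduced here, so the particular form below (log-probabilities minus a linear length penalty) is an assumption.

```python
import numpy as np

def smart_span(p_start, p_end, max_len=15, length_weight=0.05):
    """Return (i, j) maximizing log p_start[i] + log p_end[j] - penalty(j - i)."""
    best, best_score = (0, 0), -np.inf
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            score = (np.log(p_start[i] + 1e-12) + np.log(p_end[j] + 1e-12)
                     - length_weight * (j - i))    # reward short, probable spans
            if score > best_score:
                best, best_score = (i, j), score
    return best

# Usage: p_start = softmax(start_logits); p_end = softmax(end_logits)
```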
Meta Relational Learning for Few-Shot Link Prediction in Knowledge Graphs
Title | Meta Relational Learning for Few-Shot Link Prediction in Knowledge Graphs |
Authors | Mingyang Chen, Wen Zhang, Wei Zhang, Qiang Chen, Huajun Chen |
Abstract | Link prediction is an important way to complete knowledge graphs (KGs), but embedding-based methods, while effective for link prediction in KGs, perform poorly on relations that have only a few associative triples. In this work, we propose a Meta Relational Learning (MetaR) framework for the common but challenging task of few-shot link prediction in KGs, namely predicting new triples about a relation after observing only a few of its associative triples. We solve few-shot link prediction by transferring relation-specific meta information that makes the model learn the most important knowledge and learn it faster, corresponding to relation meta and gradient meta respectively in MetaR. Empirically, our model achieves state-of-the-art results on few-shot link prediction KG benchmarks.
Tasks | Knowledge Graphs, Link Prediction, Relational Reasoning |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01515v1 |
PDF | https://arxiv.org/pdf/1909.01515v1.pdf
PWC | https://paperswithcode.com/paper/meta-relational-learning-for-few-shot-link |
Repo | https://github.com/AnselCmy/MetaR |
Framework | pytorch |
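A sketch of the two meta components named in the abstract: relation meta extracted from the support entity pairs, and gradient meta applied as a single refinement step on the support loss before scoring queries. The TransE-style score and the network sizes are simplifying assumptions.

```python
import torch
import torch.nn as nn

class MetaRSketch(nn.Module):
    def __init__(self, emb_dim=100):
        super().__init__()
        self.relation_learner = nn.Sequential(   # entity pair -> relation meta
            nn.Linear(2 * emb_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))

    def score(self, h, r, t):
        return -(h + r - t).norm(p=2, dim=-1)    # TransE-style triple score

    def forward(self, support_h, support_t, query_h, query_t, beta=1.0):
        # relation meta: averaged over the few support (head, tail) pairs
        r = self.relation_learner(torch.cat([support_h, support_t], -1)).mean(0)
        # gradient meta: one gradient step on the support loss refines r
        support_loss = -self.score(support_h, r, support_t).mean()
        grad = torch.autograd.grad(support_loss, r, create_graph=True)[0]
        return self.score(query_h, r - beta * grad, query_t)  # query scores
```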
Implicit Background Estimation for Semantic Segmentation
Title | Implicit Background Estimation for Semantic Segmentation |
Authors | Charles Lehman, Dogancan Temel, Ghassan AlRegib |
Abstract | Scene understanding and semantic segmentation are at the core of many computer vision tasks, many of which involve interacting with humans in potentially dangerous ways. It is therefore paramount that techniques for the principled design of robust models be developed. In this paper, we provide analytic and empirical evidence that correcting the potentially errant non-distinct mappings that result from the softmax function can improve robustness characteristics on a state-of-the-art semantic segmentation model, with minimal impact on performance and minimal changes to the code base.
Tasks | Scene Understanding, Semantic Segmentation |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.13306v1 |
PDF | https://arxiv.org/pdf/1905.13306v1.pdf
PWC | https://paperswithcode.com/paper/190513306 |
Repo | https://github.com/olivesgatech/implicit-background-estimation |
Framework | pytorch |
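The "non-distinct mappings" of the softmax are its shift invariance: adding a constant to every logit leaves the output unchanged, so infinitely many logit vectors map to one prediction. The snippet below demonstrates that invariance and shows one way to remove the slack, pinning an implicit background class to a fixed zero logit; this is an illustration of the idea, not necessarily the paper's exact correction.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(2, 5, 4, 4)      # (batch, foreground classes, H, W)
shifted = logits + 3.0                # a non-distinct logit vector
assert torch.allclose(F.softmax(logits, 1), F.softmax(shifted, 1), atol=1e-6)

def implicit_background_softmax(fg_logits):
    """Append a fixed zero logit as the background class before softmax."""
    zeros = torch.zeros_like(fg_logits[:, :1])
    return F.softmax(torch.cat([zeros, fg_logits], dim=1), dim=1)

probs = implicit_background_softmax(logits)   # channel 0 = background
```

With the background pinned at zero, foreground logits acquire an absolute scale, removing the shift ambiguity.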
Kernel Stein Tests for Multiple Model Comparison
Title | Kernel Stein Tests for Multiple Model Comparison |
Authors | Jen Ning Lim, Makoto Yamada, Bernhard Schölkopf, Wittawat Jitkrittum |
Abstract | We address the problem of non-parametric multiple model comparison: given $l$ candidate models, decide whether each candidate is as good as the best one(s) or worse than it. We propose two statistical tests, each controlling a different notion of decision error. The first test, building on the post-selection inference framework, provably controls the number of best models that are wrongly declared worse (false positive rate). The second test is based on multiple correction and controls the proportion of models declared worse that are in fact as good as the best (false discovery rate). We prove that under appropriate conditions the first test can yield a higher true positive rate than the second. Experimental results on toy and real (CelebA, Chicago Crime data) problems show that the two tests have high true positive rates with well-controlled error rates. By contrast, the naive approach of choosing the model with the lowest score without correction leads to more false positives.
Tasks | |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12252v1 |
PDF | https://arxiv.org/pdf/1910.12252v1.pdf
PWC | https://paperswithcode.com/paper/kernel-stein-tests-for-multiple-model |
Repo | https://github.com/jenninglim/model-comparison-test |
Framework | none |
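The second test's error notion, false discovery rate under multiple correction, can be illustrated with a standard step-up procedure. The Benjamini-Hochberg sketch below shows the correction style only; the paper's actual tests are built on kernel Stein discrepancies, which this snippet does not implement.

```python
import numpy as np

def declare_worse(p_values, alpha=0.05):
    """BH step-up over p-values of H0: 'model i is as good as the best'."""
    p = np.asarray(p_values)
    order = np.argsort(p)
    m = len(p)
    thresh = alpha * np.arange(1, m + 1) / m       # BH step-up thresholds
    passed = p[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True                     # declared worse than the best
    return rejected
```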
Disentangling Influence: Using Disentangled Representations to Audit Model Predictions
Title | Disentangling Influence: Using Disentangled Representations to Audit Model Predictions |
Authors | Charles T. Marx, Richard Lanas Phillips, Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian |
Abstract | Motivated by the need to audit complex and black box models, there has been extensive research on quantifying how data features influence model predictions. Feature influence can be direct (a direct influence on model outcomes) and indirect (model outcomes are influenced via proxy features). Feature influence can also be expressed in aggregate over the training or test data or locally with respect to a single point. Current research has typically focused on one of each of these dimensions. In this paper, we develop disentangled influence audits, a procedure to audit the indirect influence of features. Specifically, we show that disentangled representations provide a mechanism to identify proxy features in the dataset, while allowing an explicit computation of feature influence on either individual outcomes or aggregate-level outcomes. We show through both theory and experiments that disentangled influence audits can both detect proxy features and show, for each individual or in aggregate, which of these proxy features affects the classifier being audited the most. In this respect, our method is more powerful than existing methods for ascertaining feature influence. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08652v1 |
PDF | https://arxiv.org/pdf/1906.08652v1.pdf
PWC | https://paperswithcode.com/paper/disentangling-influence-using-disentangled |
Repo | https://github.com/charliemarx/disentangling-influence |
Framework | none |
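A sketch of the audit under strong simplifying assumptions: given an encoder that isolates a protected feature in one latent coordinate, the indirect influence of that feature is estimated by intervening on its coordinate, decoding, and measuring how far the audited model's predictions move. The encoder, decoder, and audited model are placeholder callables operating on NumPy arrays.

```python
import numpy as np

def indirect_influence(encode, decode, model, X, feature_idx, value=0.0):
    """Average prediction shift when the disentangled coordinate is fixed."""
    Z = encode(X)                             # disentangled representation
    Z_intervened = Z.copy()
    Z_intervened[:, feature_idx] = value      # do(feature = value) in latent space
    X_counterfactual = decode(Z_intervened)   # map back to input space
    return np.mean(np.abs(model(X) - model(X_counterfactual)))
```

Because the intervention happens in the disentangled space, proxy features correlated with the audited feature move with it, which is what makes the measured influence indirect rather than direct.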
Temporal Feedback Convolutional Recurrent Neural Networks for Keyword Spotting
Title | Temporal Feedback Convolutional Recurrent Neural Networks for Keyword Spotting |
Authors | Taejun Kim, Juhan Nam |
Abstract | While end-to-end learning has become a trend in deep learning, model architectures are often designed to incorporate domain knowledge. We propose a novel convolutional recurrent neural network (CRNN) architecture with temporal feedback connections, inspired by the feedback pathways from the brain to the ears in the human auditory system. The proposed architecture uses the hidden state of the RNN module at the previous time step to control the sensitivity of channel-wise feature activations in the CNN blocks at the current time step, which is analogous to the mechanism of the outer hair cell. We apply the proposed model to keyword spotting, where the speech commands have a sequential nature. We show that the proposed model consistently outperforms the compared model without temporal feedback across different input/output settings in the CRNN framework. We also investigate the details of the performance improvement by conducting a failure analysis of the keyword spotting task and a visualization of the channel-wise feature scaling in each CNN block.
Tasks | Keyword Spotting |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1911.01803v1 |
PDF | https://arxiv.org/pdf/1911.01803v1.pdf
PWC | https://paperswithcode.com/paper/temporal-feedback-convolutional-recurrent |
Repo | https://github.com/tae-jun/temporal-feedback-crnn |
Framework | none |
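The feedback mechanism in the abstract, the previous RNN hidden state gating the channel-wise sensitivity of the current CNN activations, can be sketched as below; the tensor sizes and the sigmoid gating are assumptions.

```python
import torch
import torch.nn as nn

class FeedbackCNNBlock(nn.Module):
    def __init__(self, in_ch, out_ch, rnn_hidden):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(in_ch, out_ch, 3, padding=1),
                                  nn.BatchNorm1d(out_ch), nn.ReLU())
        self.gate = nn.Linear(rnn_hidden, out_ch)  # hidden state -> channel gains

    def forward(self, x, h_prev):
        # x: (batch, in_ch, time), h_prev: (batch, rnn_hidden) from step t-1
        feats = self.conv(x)
        gains = torch.sigmoid(self.gate(h_prev)).unsqueeze(-1)
        return feats * gains                       # channel-wise modulation
```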
The Search for Sparse, Robust Neural Networks
Title | The Search for Sparse, Robust Neural Networks |
Authors | Justin Cosentino, Federico Zaiter, Dan Pei, Jun Zhu |
Abstract | Recent work on deep neural network pruning has shown that there exist sparse subnetworks that achieve equal or improved accuracy, training time, and loss using fewer network parameters than their dense counterparts. Orthogonal to the pruning literature, deep neural networks are known to be susceptible to adversarial examples, which may pose risks in security- or safety-critical applications. Intuition suggests that there is an inherent trade-off between sparsity and robustness such that these characteristics cannot co-exist. We perform an extensive empirical evaluation and analysis testing the Lottery Ticket Hypothesis with adversarial training and show that this approach enables us to find sparse, robust neural networks. Code for reproducing experiments is available here: https://github.com/justincosentino/robust-sparse-networks.
Tasks | Network Pruning |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02386v1 |
PDF | https://arxiv.org/pdf/1912.02386v1.pdf
PWC | https://paperswithcode.com/paper/the-search-for-sparse-robust-neural-networks |
Repo | https://github.com/justincosentino/robust-sparse-networks |
Framework | tf |
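A sketch combining the abstract's two ingredients: lottery-ticket-style iterative magnitude pruning with adversarial training inside each round. FGSM stands in for the attack, and weight rewinding and schedules are omitted, so this is a simplified outline rather than the paper's full procedure.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8/255):
    """One-step adversarial example for the training loop."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def prune_round(model, loader, opt, masks, prune_frac=0.2, epochs=1):
    # masks: {parameter: float mask tensor of the same shape}
    for _ in range(epochs):
        for x, y in loader:                          # adversarial training pass
            loss = F.cross_entropy(model(fgsm(model, x, y)), y)
            opt.zero_grad(); loss.backward()
            for p, m in masks.items():
                p.grad *= m                          # pruned weights stay zero
            opt.step()
    for p, m in masks.items():                       # magnitude-prune survivors
        alive = p.data[m.bool()].abs()
        thresh = alive.sort().values[int(prune_frac * len(alive))]
        m *= (p.data.abs() > thresh).float()
        p.data *= m
    return masks
```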
Towards a Deeper Understanding of Adversarial Losses
Title | Towards a Deeper Understanding of Adversarial Losses |
Authors | Hao-Wen Dong, Yi-Hsuan Yang |
Abstract | Recent work has proposed various adversarial losses for training generative adversarial networks. Yet, it remains unclear which types of functions are valid adversarial loss functions, and how these loss functions perform against one another. In this paper, we aim to gain a deeper understanding of adversarial losses by decoupling the effects of their component functions and regularization terms. We first derive some necessary and sufficient conditions on the component functions such that the adversarial loss is a divergence-like measure between the data and model distributions. In order to systematically compare different adversarial losses, we then propose DANTest, a new, simple framework based on discriminative adversarial networks. With this framework, we evaluate an extensive set of adversarial losses by combining different component functions and regularization approaches. This study leads to some new insights into adversarial losses. For reproducibility, all source code is available at https://github.com/salu133445/dan .
Tasks | |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.08753v1 |
PDF | http://arxiv.org/pdf/1901.08753v1.pdf
PWC | https://paperswithcode.com/paper/towards-a-deeper-understanding-of-adversarial |
Repo | https://github.com/salu133445/dan |
Framework | tf |
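Many adversarial losses fit the template L_D = E_x[f1(D(x))] + E_z[f2(D(G(z)))], differing only in the component functions f1 and f2; the regularization term (e.g., a gradient penalty) is the other axis the paper studies. Two standard instantiations, shown as a sketch of the decomposition rather than the paper's DANTest framework:

```python
import torch
import torch.nn.functional as F

def d_loss_classic(d_real, d_fake):
    """Classic GAN components: f1 = -log sigmoid, f2 = -log(1 - sigmoid)."""
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
            F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def d_loss_hinge(d_real, d_fake):
    """Hinge components: f1(y) = max(0, 1 - y), f2(y) = max(0, 1 + y)."""
    return F.relu(1 - d_real).mean() + F.relu(1 + d_fake).mean()
```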
Are Few-Shot Learning Benchmarks too Simple?
Title | Are Few-Shot Learning Benchmarks too Simple?
Authors | Gabriel Huang, Hugo Larochelle, Simon Lacoste-Julien |
Abstract | We argue that the widely used Omniglot and miniImageNet benchmarks are too simple because their class semantics do not vary across episodes, which defeats their intended purpose of evaluating few-shot classification methods. The class semantics of Omniglot are invariably "characters" and the class semantics of miniImageNet, "object category". Because the class semantics are so similar, we propose a new method called Centroid Networks which can achieve surprisingly high accuracies on Omniglot and miniImageNet without using any labels at meta-evaluation time. Our results suggest that those benchmarks are not adapted for supervised few-shot classification, since the supervision itself is not necessary during meta-evaluation. The Meta-Dataset, a collection of 10 datasets, was recently proposed as a harder few-shot classification benchmark. Using our method, we derive a new metric, the Class Semantics Consistency Criterion, and use it to quantify the difficulty of Meta-Dataset. Finally, under some restrictive assumptions, we show that Centroid Networks is faster and more accurate than a state-of-the-art learning-to-cluster method (Hsu et al., 2018).
Tasks | Few-Shot Learning, Meta-Learning, Omniglot |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08605v2 |
PDF | https://arxiv.org/pdf/1902.08605v2.pdf
PWC | https://paperswithcode.com/paper/centroid-networks-for-few-shot-clustering-and |
Repo | https://github.com/gabrielhuang/centroid-networks |
Framework | pytorch |
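A sketch of the label-free meta-evaluation trick the abstract describes: embed the support set, recover class centroids by clustering rather than by using labels, and classify queries by nearest centroid. Plain k-means stands in for the paper's Sinkhorn-based clustering, and the embedding network is assumed.

```python
import torch

def kmeans(x, k, iters=10):
    """Plain k-means over embeddings (a stand-in for Sinkhorn clustering)."""
    centroids = x[torch.randperm(len(x))[:k]]
    for _ in range(iters):
        assign = torch.cdist(x, centroids).argmin(dim=1)
        centroids = torch.stack(
            [x[assign == c].mean(0) if (assign == c).any() else centroids[c]
             for c in range(k)])
    return centroids

def centroid_predict(embed, support_x, query_x, n_classes):
    centroids = kmeans(embed(support_x), n_classes)   # no support labels used
    return torch.cdist(embed(query_x), centroids).argmin(dim=1)
```

If this label-free procedure matches supervised accuracy on a benchmark, the episodes were solvable without supervision, which is the paper's diagnostic for benchmark simplicity.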
Joint Training of Neural Network Ensembles
Title | Joint Training of Neural Network Ensembles |
Authors | Andrew M. Webb, Charles Reynolds, Dan-Andrei Iliescu, Henry Reeve, Mikel Lujan, Gavin Brown |
Abstract | We examine the practice of joint training for neural network ensembles, in which a multi-branch architecture is trained via a single loss. This approach has recently gained traction, with claims of greater accuracy per parameter along with increased parallelism. We introduce a family of novel loss functions generalizing multiple previously proposed approaches, with which we study the theoretical and empirical properties of joint training. These losses interpolate smoothly between independent and joint training of predictors, demonstrating that joint training has several disadvantages not observed in prior work. However, with appropriate regularization via our proposed loss, the method shows new promise in resource-limited scenarios and fault-tolerant systems, e.g., IoT and edge devices. Finally, we discuss how these results may have implications for general multi-branch architectures such as ResNeXt and Inception.
Tasks | |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04422v2 |
PDF | http://arxiv.org/pdf/1902.04422v2.pdf
PWC | https://paperswithcode.com/paper/joint-training-of-neural-network-ensembles |
Repo | https://github.com/grey-area/modular-loss-experiments |
Framework | pytorch |
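A sketch of a loss that interpolates between independent and joint training of ensemble branches, in the spirit of the family the abstract describes: at lambda = 0 each branch minimizes its own loss, and at lambda = 1 only the averaged ensemble output is trained. This simple convex combination is an illustrative instance, not necessarily the paper's exact parameterization.

```python
import torch
import torch.nn.functional as F

def coupled_ensemble_loss(branch_logits, y, lam=0.5):
    """branch_logits: list of (batch, classes) tensors, one per branch."""
    independent = torch.stack(
        [F.cross_entropy(z, y) for z in branch_logits]).mean()  # lam = 0 end
    joint = F.cross_entropy(torch.stack(branch_logits).mean(0), y)  # lam = 1 end
    return (1 - lam) * independent + lam * joint
```

Sweeping lam exposes the trade-off the paper studies: fully joint training couples the branches (hurting fault tolerance), while intermediate values act as a regularizer between the two regimes.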