Paper Group AWR 124
Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction. On the “steerability” of generative adversarial networks. Learning to compute inner consensus: A novel approach to modeling agreement between Capsules. SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications. Simultaneous Object Detection and Sema …
Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction
Title | Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction |
Authors | Yunhan Jia, Yantao Lu, Senem Velipasalar, Zhenyu Zhong, Tao Wei |
Abstract | Neural networks are known to be vulnerable to carefully crafted adversarial examples, and these malicious samples often transfer, i.e., they maintain their effectiveness even against other models. While great effort has been devoted to the transferability of adversarial examples, surprisingly little attention has been paid to its impact on real-world deep learning deployments. In this paper, we investigate the transferability of adversarial examples across a wide range of real-world computer vision tasks, including image classification, explicit content detection, optical character recognition (OCR), and object detection. This setting represents the cybercriminal's situation, in which an ensemble of different detection mechanisms must be evaded all at once. We propose a practical attack that overcomes existing attacks' limitation of requiring task-specific loss functions by instead targeting the 'dispersion' of internal feature maps. We report evaluations on four different computer vision tasks provided by the Google Cloud Vision APIs, showing that our approach outperforms existing attacks and degrades the performance of multiple CV tasks by a large margin with only modest perturbations.
Tasks | Image Classification, Object Detection, Optical Character Recognition |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03333v1 |
PDF | https://arxiv.org/pdf/1905.03333v1.pdf
PWC | https://paperswithcode.com/paper/190503333 |
Repo | https://github.com/lyt910522/Dispersion_reduction |
Framework | pytorch |
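The attack above replaces task-specific losses with a feature-space objective. A minimal PyTorch sketch of that idea follows, using PGD to minimize the standard deviation ("dispersion") of an intermediate feature map of a surrogate model; the surrogate (VGG-16), the layer cut-off, and the step sizes are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torchvision.models as models

model = models.vgg16(pretrained=True).eval()
feature_extractor = model.features[:15]  # hypothetical intermediate layer

def dispersion_reduction(x, eps=16/255, alpha=2/255, steps=20):
    """Perturb x so the chosen feature map's dispersion (std) collapses."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = feature_extractor(x_adv).std()  # dispersion of the feature map
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - alpha * x_adv.grad.sign()         # descend on dispersion
            x_adv = x.clone() + (x_adv - x).clamp(-eps, eps)  # project to L_inf ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

Because the objective never references a task loss, the same perturbation can be handed to classification, OCR, or detection models, which is what makes the attack cross-task.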
On the “steerability” of generative adversarial networks
Title | On the “steerability” of generative adversarial networks |
Authors | Ali Jahanian, Lucy Chai, Phillip Isola |
Abstract | An open secret in contemporary machine learning is that many models work beautifully on standard benchmarks but fail to generalize outside the lab. This has been attributed to biased training data, which provide poor coverage over real world events. Generative models are no exception, but recent advances in generative adversarial networks (GANs) suggest otherwise - these models can now synthesize strikingly realistic and diverse images. Is generative modeling of photos a solved problem? We show that although current GANs can fit standard datasets very well, they still fall short of being comprehensive models of the visual manifold. In particular, we study their ability to fit simple transformations such as camera movements and color changes. We find that the models reflect the biases of the datasets on which they are trained (e.g., centered objects), but that they also exhibit some capacity for generalization: by “steering” in latent space, we can shift the distribution while still creating realistic images. We hypothesize that the degree of distributional shift is related to the breadth of the training data distribution. Thus, we conduct experiments to quantify the limits of GAN transformations and introduce techniques to mitigate the problem. Code is released on our project page: https://ali-design.github.io/gan_steerability/ |
Tasks | |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07171v4 |
PDF | https://arxiv.org/pdf/1907.07171v4.pdf
PWC | https://paperswithcode.com/paper/on-the-steerability-of-generative-adversarial |
Repo | https://github.com/ali-design/gan_steerability |
Framework | pytorch |
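The "steering" described in the abstract amounts to learning a latent walk whose effect on generated images matches a target edit. Below is a minimal sketch of that training objective; the generator G is a placeholder callable, and the zoom edit is a hypothetical stand-in for the paper's transformations.

```python
import torch
import torch.nn.functional as F

def edit(imgs, alpha):
    """Hypothetical target transform: center zoom by a factor (1 + alpha)."""
    zoomed = F.interpolate(imgs, scale_factor=1 + alpha, mode="bilinear",
                           align_corners=False)
    h, w = imgs.shape[-2:]
    top = (zoomed.shape[-2] - h) // 2
    left = (zoomed.shape[-1] - w) // 2
    return zoomed[..., top:top + h, left:left + w]

def train_walk(G, z_dim, steps=1000, lr=1e-3, device="cpu"):
    """Learn a direction w such that G(z + alpha*w) ~= edit(G(z), alpha)."""
    w = torch.zeros(1, z_dim, device=device, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        z = torch.randn(16, z_dim, device=device)
        alpha = torch.rand(1).item()      # random step size per batch
        loss = F.mse_loss(G(z + alpha * w), edit(G(z), alpha))
        opt.zero_grad(); loss.backward(); opt.step()
    return w.detach()
```

How far the learned walk can push the output before realism breaks down is exactly the "limit of GAN transformations" the paper quantifies.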
Learning to compute inner consensus: A novel approach to modeling agreement between Capsules
Title | Learning to compute inner consensus: A novel approach to modeling agreement between Capsules |
Authors | Gonçalo Faria |
Abstract | This project considers Capsule Networks, a recently introduced machine learning model that has shown promising results regarding generalization and preservation of spatial information with few parameters. The inner routing procedures proposed for Capsule Networks thus far establish a priori how the routing relations are modeled, which limits the expressiveness of the underlying model. In this project, we propose two distinct ways in which the routing procedure can be learned like any other network parameter.
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12737v5 |
PDF | https://arxiv.org/pdf/1909.12737v5.pdf
PWC | https://paperswithcode.com/paper/learning-to-compute-inner-consensus-a-noble |
Repo | https://github.com/goncalorafaria/learning-inner-consensus |
Framework | tf |
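The abstract's central claim is that the routing procedure can be learned like any other parameter. The sketch below shows one plausible parameterization of this idea, an assumption rather than the paper's exact design: a small MLP scores the agreement between lower-capsule predictions and candidate higher-capsule outputs in place of a fixed dot-product rule.

```python
import torch
import torch.nn as nn

class LearnedRouting(nn.Module):
    def __init__(self, caps_dim, iters=3):
        super().__init__()
        self.iters = iters
        self.agree = nn.Sequential(          # learned agreement function
            nn.Linear(2 * caps_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, u_hat):
        # u_hat: (batch, n_in, n_out, caps_dim) predictions from lower capsules
        b, n_in, n_out, d = u_hat.shape
        logits = torch.zeros(b, n_in, n_out, 1, device=u_hat.device)
        for _ in range(self.iters):
            c = logits.softmax(dim=2)                 # routing coefficients
            v = (c * u_hat).sum(dim=1, keepdim=True)  # candidate higher capsules
            pair = torch.cat([u_hat, v.expand_as(u_hat)], dim=-1)
            logits = logits + self.agree(pair)        # learned agreement update
        return v.squeeze(1)                           # (batch, n_out, caps_dim)
```

Since `self.agree` trains end to end with the rest of the network, the routing relation is no longer fixed a priori, which is the expressiveness gain the abstract argues for.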
SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications
Title | SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications |
Authors | Pengyi Zhang, Yunxin Zhong, Xiaoqiong Li |
Abstract | Drones, or general Unmanned Aerial Vehicles (UAVs), endowed with computer vision functions by on-board cameras and embedded systems, have become popular in a wide range of applications. However, real-time scene parsing through object detection running on a UAV platform is very challenging due to the limited memory and computing power of embedded devices. To deal with these challenges, in this paper we propose to learn efficient deep object detectors through channel pruning of convolutional layers. To this end, we enforce channel-level sparsity of convolutional layers by imposing L1 regularization on channel scaling factors and prune less informative feature channels to obtain "slim" object detectors. Based on this approach, we present SlimYOLOv3, which has fewer trainable parameters and floating point operations (FLOPs) than the original YOLOv3 (Joseph Redmon et al., 2018), as a promising solution for real-time object detection on UAVs. We evaluate SlimYOLOv3 on the VisDrone2018-Det benchmark dataset; it achieves compelling results in comparison with its unpruned counterpart, including a ~90.8% decrease in FLOPs, a ~92.0% decline in parameter size, roughly 2x faster runtime, and detection accuracy comparable to YOLOv3. Experimental results with different pruning ratios consistently verify that the proposed SlimYOLOv3, with its narrower structure, is more efficient, faster, and better than YOLOv3, and thus more suitable for real-time object detection on UAVs. Our code is made publicly available at https://github.com/PengyiZhang/SlimYOLOv3.
Tasks | Object Detection, Real-Time Object Detection, Scene Parsing |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11093v1 |
PDF | https://arxiv.org/pdf/1907.11093v1.pdf
PWC | https://paperswithcode.com/paper/slimyolov3-narrower-faster-and-better-for |
Repo | https://github.com/PengyiZhang/SlimYOLOv3 |
Framework | pytorch |
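The pruning recipe in the abstract (L1 regularization on channel scaling factors, then dropping low-scale channels) can be sketched compactly, with batch-norm gammas serving as the scaling factors. The global threshold policy and the YOLOv3-specific wiring are simplifications.

```python
import torch
import torch.nn as nn

def sparsity_penalty(model, lam=1e-4):
    """L1 regularizer on channel scaling factors (batch-norm gammas)."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def channel_mask(model, prune_ratio=0.5):
    """Global threshold on |gamma|; returns a keep-mask per BN layer."""
    gammas = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    thresh = gammas.sort().values[int(len(gammas) * prune_ratio)]
    return {name: m.weight.data.abs() > thresh
            for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

# During training:  loss = detection_loss + sparsity_penalty(model)
# After training:   masks = channel_mask(model); rebuild a narrower model that
#                   keeps only the masked channels, then fine-tune it.
```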
Simultaneous Object Detection and Semantic Segmentation
Title | Simultaneous Object Detection and Semantic Segmentation |
Authors | Niels Ole Salscheider |
Abstract | Both object detection in and semantic segmentation of camera images are important tasks for automated vehicles. Object detection is necessary so that the planning and behavior modules can reason about other road users. Semantic segmentation provides, for example, free-space information and information about static and dynamic parts of the environment. There has been a lot of research on solving both tasks using Convolutional Neural Networks. These approaches give good results but are computationally demanding. In practice, a compromise has to be found between detection performance, detection quality, and the number of tasks; otherwise it is not possible to meet the real-time requirements of automated vehicles. In this work, we propose a neural network architecture to solve both tasks simultaneously. The architecture was designed to run at around 10 Hz on 1 MP images on current hardware. Our approach achieves a mean IoU of 61.2% for the semantic segmentation task on the challenging Cityscapes benchmark. It also achieves an average precision of 69.3% for cars and 67.7% on the moderate difficulty level of the KITTI benchmark.
Tasks | Object Detection, Semantic Segmentation |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.02285v2 |
PDF | https://arxiv.org/pdf/1905.02285v2.pdf
PWC | https://paperswithcode.com/paper/simultaneous-object-detection-and-semantic |
Repo | https://github.com/fzi-forschungszentrum-informatik/NNAD |
Framework | none |
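A minimal sketch of the shared-backbone, two-head design this entry describes: one feature extractor feeds both a detection head and a segmentation head, trained under a combined loss. The layer sizes, anchor count, and loss weighting below are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class JointDetSeg(nn.Module):
    def __init__(self, n_classes_seg=19, n_anchors=9, n_classes_det=8):
        super().__init__()
        self.backbone = nn.Sequential(              # stand-in feature extractor
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # detection head: per-anchor box (4) + objectness (1) + class scores
        self.det_head = nn.Conv2d(128, n_anchors * (5 + n_classes_det), 1)
        # segmentation head: per-pixel class logits, upsampled to input size
        self.seg_head = nn.Sequential(
            nn.Conv2d(128, n_classes_seg, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False))

    def forward(self, x):
        feats = self.backbone(x)                    # computed once, used twice
        return self.det_head(feats), self.seg_head(feats)

# total_loss = det_loss + lambda_seg * seg_loss    # the weight is a tuning choice
```

Sharing the backbone is what buys the real-time budget: the expensive features are computed once for both tasks.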
Conditioning LSTM Decoder and Bi-directional Attention Based Question Answering System
Title | Conditioning LSTM Decoder and Bi-directional Attention Based Question Answering System |
Authors | Heguang Liu |
Abstract | Applying neural networks to question answering has gained increasing popularity in recent years. In this paper, I implement a model with a bi-directional attention flow layer, connected to a multi-layer LSTM encoder, connected to one start-index decoder and one conditioning end-index decoder. I introduce a new end-index decoder layer that conditions on the start-index output; experiments show this increases model performance by 15.16%. For prediction, I propose a new smart-span equation that rewards both short answer length and high probability for the start and end indices, which further improves prediction accuracy. The best single model achieves an F1 score of 73.97% and an EM score of 64.95% on the test set.
Tasks | Question Answering |
Published | 2019-05-02 |
URL | http://arxiv.org/abs/1905.02019v1 |
PDF | http://arxiv.org/pdf/1905.02019v1.pdf
PWC | https://paperswithcode.com/paper/conditioning-lstm-decoder-and-bi-directional |
Repo | https://github.com/hyuna915/squad |
Framework | none |
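The smart-span selection in the abstract can be sketched as a constrained search over start/end pairs. The exact reward used in the paper is not reproduced here, so the particular form below (log-probabilities minus a linear length penalty) is an assumption.

```python
import numpy as np

def smart_span(p_start, p_end, max_len=15, length_weight=0.05):
    """Return (i, j) maximizing log p_start[i] + log p_end[j] - penalty(j - i)."""
    best, best_score = (0, 0), -np.inf
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            score = (np.log(p_start[i] + 1e-12) + np.log(p_end[j] + 1e-12)
                     - length_weight * (j - i))    # reward short, probable spans
            if score > best_score:
                best, best_score = (i, j), score
    return best

# Usage: p_start = softmax(start_logits); p_end = softmax(end_logits)
```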
Meta Relational Learning for Few-Shot Link Prediction in Knowledge Graphs
Title | Meta Relational Learning for Few-Shot Link Prediction in Knowledge Graphs |
Authors | Mingyang Chen, Wen Zhang, Wei Zhang, Qiang Chen, Huajun Chen |
Abstract | Link prediction is an important way to complete knowledge graphs (KGs), but embedding-based methods, while effective for link prediction in KGs, perform poorly on relations that have only a few associative triples. In this work, we propose a Meta Relational Learning (MetaR) framework for the common but challenging task of few-shot link prediction in KGs, namely predicting new triples about a relation after observing only a few of its associative triples. We solve few-shot link prediction by transferring relation-specific meta information that makes the model learn the most important knowledge and learn it faster, corresponding to relation meta and gradient meta respectively in MetaR. Empirically, our model achieves state-of-the-art results on few-shot link prediction KG benchmarks.
Tasks | Knowledge Graphs, Link Prediction, Relational Reasoning |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01515v1 |
PDF | https://arxiv.org/pdf/1909.01515v1.pdf
PWC | https://paperswithcode.com/paper/meta-relational-learning-for-few-shot-link |
Repo | https://github.com/AnselCmy/MetaR |
Framework | pytorch |
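A sketch of the two meta components named in the abstract: relation meta extracted from the support entity pairs, and gradient meta applied as a single refinement step on the support loss before scoring queries. The TransE-style score and the network sizes are simplifying assumptions.

```python
import torch
import torch.nn as nn

class MetaRSketch(nn.Module):
    def __init__(self, emb_dim=100):
        super().__init__()
        self.relation_learner = nn.Sequential(   # entity pair -> relation meta
            nn.Linear(2 * emb_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))

    def score(self, h, r, t):
        return -(h + r - t).norm(p=2, dim=-1)    # TransE-style triple score

    def forward(self, support_h, support_t, query_h, query_t, beta=1.0):
        # relation meta: averaged over the few support (head, tail) pairs
        r = self.relation_learner(torch.cat([support_h, support_t], -1)).mean(0)
        # gradient meta: one gradient step on the support loss refines r
        support_loss = -self.score(support_h, r, support_t).mean()
        grad = torch.autograd.grad(support_loss, r, create_graph=True)[0]
        return self.score(query_h, r - beta * grad, query_t)  # query scores
```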
Implicit Background Estimation for Semantic Segmentation
Title | Implicit Background Estimation for Semantic Segmentation |
Authors | Charles Lehman, Dogancan Temel, Ghassan AlRegib |
Abstract | Scene understanding and semantic segmentation are at the core of many computer vision tasks, many of which involve interacting with humans in potentially dangerous ways. It is therefore paramount that techniques for the principled design of robust models be developed. In this paper, we provide analytic and empirical evidence that correcting the potentially errant non-distinct mappings that result from the softmax function can improve robustness characteristics on a state-of-the-art semantic segmentation model, with minimal impact on performance and minimal changes to the code base.
Tasks | Scene Understanding, Semantic Segmentation |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.13306v1 |
PDF | https://arxiv.org/pdf/1905.13306v1.pdf
PWC | https://paperswithcode.com/paper/190513306 |
Repo | https://github.com/olivesgatech/implicit-background-estimation |
Framework | pytorch |
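The "non-distinct mappings" of the softmax are its shift invariance: adding a constant to every logit leaves the output unchanged, so infinitely many logit vectors map to one prediction. The snippet below demonstrates that invariance and shows one way to remove the slack, pinning an implicit background class to a fixed zero logit; this is an illustration of the idea, not necessarily the paper's exact correction.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(2, 5, 4, 4)      # (batch, foreground classes, H, W)
shifted = logits + 3.0                # a non-distinct logit vector
assert torch.allclose(F.softmax(logits, 1), F.softmax(shifted, 1), atol=1e-6)

def implicit_background_softmax(fg_logits):
    """Append a fixed zero logit as the background class before softmax."""
    zeros = torch.zeros_like(fg_logits[:, :1])
    return F.softmax(torch.cat([zeros, fg_logits], dim=1), dim=1)

probs = implicit_background_softmax(logits)   # channel 0 = background
```

With the background pinned at zero, foreground logits acquire an absolute scale, removing the shift ambiguity.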
Kernel Stein Tests for Multiple Model Comparison
Title | Kernel Stein Tests for Multiple Model Comparison |
Authors | Jen Ning Lim, Makoto Yamada, Bernhard Schölkopf, Wittawat Jitkrittum |
Abstract | We address the problem of non-parametric multiple model comparison: given $l$ candidate models, decide whether each candidate is as good as the best one(s) or worse than it. We propose two statistical tests, each controlling a different notion of decision error. The first test, building on the post-selection inference framework, provably controls the number of best models that are wrongly declared worse (false positive rate). The second test is based on multiple correction and controls the proportion of models declared worse that are in fact as good as the best (false discovery rate). We prove that under appropriate conditions the first test can yield a higher true positive rate than the second. Experimental results on toy and real (CelebA, Chicago Crime data) problems show that the two tests have high true positive rates with well-controlled error rates. By contrast, the naive approach of choosing the model with the lowest score without correction leads to more false positives.
Tasks | |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12252v1 |
PDF | https://arxiv.org/pdf/1910.12252v1.pdf
PWC | https://paperswithcode.com/paper/kernel-stein-tests-for-multiple-model |
Repo | https://github.com/jenninglim/model-comparison-test |
Framework | none |
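The second test's error notion, false discovery rate under multiple correction, can be illustrated with a standard step-up procedure. The Benjamini-Hochberg sketch below shows the correction style only; the paper's actual tests are built on kernel Stein discrepancies, which this snippet does not implement.

```python
import numpy as np

def declare_worse(p_values, alpha=0.05):
    """BH step-up over p-values of H0: 'model i is as good as the best'."""
    p = np.asarray(p_values)
    order = np.argsort(p)
    m = len(p)
    thresh = alpha * np.arange(1, m + 1) / m       # BH step-up thresholds
    passed = p[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True                     # declared worse than the best
    return rejected
```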
Disentangling Influence: Using Disentangled Representations to Audit Model Predictions
Title | Disentangling Influence: Using Disentangled Representations to Audit Model Predictions |
Authors | Charles T. Marx, Richard Lanas Phillips, Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian |
Abstract | Motivated by the need to audit complex and black box models, there has been extensive research on quantifying how data features influence model predictions. Feature influence can be direct (a direct influence on model outcomes) and indirect (model outcomes are influenced via proxy features). Feature influence can also be expressed in aggregate over the training or test data or locally with respect to a single point. Current research has typically focused on one of each of these dimensions. In this paper, we develop disentangled influence audits, a procedure to audit the indirect influence of features. Specifically, we show that disentangled representations provide a mechanism to identify proxy features in the dataset, while allowing an explicit computation of feature influence on either individual outcomes or aggregate-level outcomes. We show through both theory and experiments that disentangled influence audits can both detect proxy features and show, for each individual or in aggregate, which of these proxy features affects the classifier being audited the most. In this respect, our method is more powerful than existing methods for ascertaining feature influence. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08652v1 |
PDF | https://arxiv.org/pdf/1906.08652v1.pdf
PWC | https://paperswithcode.com/paper/disentangling-influence-using-disentangled |
Repo | https://github.com/charliemarx/disentangling-influence |
Framework | none |
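A sketch of the audit under strong simplifying assumptions: given an encoder that isolates a protected feature in one latent coordinate, the indirect influence of that feature is estimated by intervening on its coordinate, decoding, and measuring how far the audited model's predictions move. The encoder, decoder, and audited model are placeholder callables operating on NumPy arrays.

```python
import numpy as np

def indirect_influence(encode, decode, model, X, feature_idx, value=0.0):
    """Average prediction shift when the disentangled coordinate is fixed."""
    Z = encode(X)                             # disentangled representation
    Z_intervened = Z.copy()
    Z_intervened[:, feature_idx] = value      # do(feature = value) in latent space
    X_counterfactual = decode(Z_intervened)   # map back to input space
    return np.mean(np.abs(model(X) - model(X_counterfactual)))
```

Because the intervention happens in the disentangled space, proxy features correlated with the audited feature move with it, which is what makes the measured influence indirect rather than direct.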
Temporal Feedback Convolutional Recurrent Neural Networks for Keyword Spotting
Title | Temporal Feedback Convolutional Recurrent Neural Networks for Keyword Spotting |
Authors | Taejun Kim, Juhan Nam |
Abstract | While end-to-end learning has become a trend in deep learning, model architectures are often designed to incorporate domain knowledge. We propose a novel convolutional recurrent neural network (CRNN) architecture with temporal feedback connections, inspired by the feedback pathways from the brain to the ears in the human auditory system. The proposed architecture uses the hidden state of the RNN module at the previous time step to control the sensitivity of channel-wise feature activations in the CNN blocks at the current time step, which is analogous to the mechanism of the outer hair cell. We apply the proposed model to keyword spotting, where the speech commands have a sequential nature. We show that the proposed model consistently outperforms the compared model without temporal feedback across different input/output settings in the CRNN framework. We also investigate the details of the performance improvement by conducting a failure analysis of the keyword spotting task and a visualization of the channel-wise feature scaling in each CNN block.
Tasks | Keyword Spotting |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1911.01803v1 |
PDF | https://arxiv.org/pdf/1911.01803v1.pdf
PWC | https://paperswithcode.com/paper/temporal-feedback-convolutional-recurrent |
Repo | https://github.com/tae-jun/temporal-feedback-crnn |
Framework | none |
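The feedback mechanism in the abstract, the previous RNN hidden state gating the channel-wise sensitivity of the current CNN activations, can be sketched as below; the tensor sizes and the sigmoid gating are assumptions.

```python
import torch
import torch.nn as nn

class FeedbackCNNBlock(nn.Module):
    def __init__(self, in_ch, out_ch, rnn_hidden):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(in_ch, out_ch, 3, padding=1),
                                  nn.BatchNorm1d(out_ch), nn.ReLU())
        self.gate = nn.Linear(rnn_hidden, out_ch)  # hidden state -> channel gains

    def forward(self, x, h_prev):
        # x: (batch, in_ch, time), h_prev: (batch, rnn_hidden) from step t-1
        feats = self.conv(x)
        gains = torch.sigmoid(self.gate(h_prev)).unsqueeze(-1)
        return feats * gains                       # channel-wise modulation
```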
The Search for Sparse, Robust Neural Networks
Title | The Search for Sparse, Robust Neural Networks |
Authors | Justin Cosentino, Federico Zaiter, Dan Pei, Jun Zhu |
Abstract | Recent work on deep neural network pruning has shown that there exist sparse subnetworks that achieve equal or improved accuracy, training time, and loss using fewer network parameters than their dense counterparts. Orthogonal to the pruning literature, deep neural networks are known to be susceptible to adversarial examples, which may pose risks in security- or safety-critical applications. Intuition suggests that there is an inherent trade-off between sparsity and robustness such that these characteristics cannot co-exist. We perform an extensive empirical evaluation and analysis testing the Lottery Ticket Hypothesis with adversarial training and show that this approach enables us to find sparse, robust neural networks. Code for reproducing experiments is available here: https://github.com/justincosentino/robust-sparse-networks.
Tasks | Network Pruning |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02386v1 |
PDF | https://arxiv.org/pdf/1912.02386v1.pdf
PWC | https://paperswithcode.com/paper/the-search-for-sparse-robust-neural-networks |
Repo | https://github.com/justincosentino/robust-sparse-networks |
Framework | tf |
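A sketch combining the abstract's two ingredients: lottery-ticket-style iterative magnitude pruning with adversarial training inside each round. FGSM stands in for the attack, and weight rewinding and schedules are omitted, so this is a simplified outline rather than the paper's full procedure.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8/255):
    """One-step adversarial example for the training loop."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def prune_round(model, loader, opt, masks, prune_frac=0.2, epochs=1):
    # masks: {parameter: float mask tensor of the same shape}
    for _ in range(epochs):
        for x, y in loader:                          # adversarial training pass
            loss = F.cross_entropy(model(fgsm(model, x, y)), y)
            opt.zero_grad(); loss.backward()
            for p, m in masks.items():
                p.grad *= m                          # pruned weights stay zero
            opt.step()
    for p, m in masks.items():                       # magnitude-prune survivors
        alive = p.data[m.bool()].abs()
        thresh = alive.sort().values[int(prune_frac * len(alive))]
        m *= (p.data.abs() > thresh).float()
        p.data *= m
    return masks
```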
Towards a Deeper Understanding of Adversarial Losses
Title | Towards a Deeper Understanding of Adversarial Losses |
Authors | Hao-Wen Dong, Yi-Hsuan Yang |
Abstract | Recent work has proposed various adversarial losses for training generative adversarial networks. Yet, it remains unclear which types of functions are valid adversarial loss functions, and how these loss functions perform against one another. In this paper, we aim to gain a deeper understanding of adversarial losses by decoupling the effects of their component functions and regularization terms. We first derive some necessary and sufficient conditions on the component functions such that the adversarial loss is a divergence-like measure between the data and model distributions. In order to systematically compare different adversarial losses, we then propose DANTest, a new, simple framework based on discriminative adversarial networks. With this framework, we evaluate an extensive set of adversarial losses by combining different component functions and regularization approaches. This study leads to some new insights into adversarial losses. For reproducibility, all source code is available at https://github.com/salu133445/dan .
Tasks | |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.08753v1 |
PDF | http://arxiv.org/pdf/1901.08753v1.pdf
PWC | https://paperswithcode.com/paper/towards-a-deeper-understanding-of-adversarial |
Repo | https://github.com/salu133445/dan |
Framework | tf |
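Many adversarial losses fit the template L_D = E_x[f1(D(x))] + E_z[f2(D(G(z)))], differing only in the component functions f1 and f2; the regularization term (e.g., a gradient penalty) is the other axis the paper studies. Two standard instantiations, shown as a sketch of the decomposition rather than the paper's DANTest framework:

```python
import torch
import torch.nn.functional as F

def d_loss_classic(d_real, d_fake):
    """Classic GAN components: f1 = -log sigmoid, f2 = -log(1 - sigmoid)."""
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
            F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def d_loss_hinge(d_real, d_fake):
    """Hinge components: f1(y) = max(0, 1 - y), f2(y) = max(0, 1 + y)."""
    return F.relu(1 - d_real).mean() + F.relu(1 + d_fake).mean()
```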
Are Few-Shot Learning Benchmarks too Simple?
Title | Are Few-Shot Learning Benchmarks too Simple?
Authors | Gabriel Huang, Hugo Larochelle, Simon Lacoste-Julien |
Abstract | We argue that the widely used Omniglot and miniImageNet benchmarks are too simple because their class semantics do not vary across episodes, which defeats their intended purpose of evaluating few-shot classification methods. The class semantics of Omniglot are invariably "characters" and the class semantics of miniImageNet, "object category". Because the class semantics are so similar, we propose a new method called Centroid Networks which can achieve surprisingly high accuracies on Omniglot and miniImageNet without using any labels at meta-evaluation time. Our results suggest that those benchmarks are not adapted for supervised few-shot classification, since the supervision itself is not necessary during meta-evaluation. The Meta-Dataset, a collection of 10 datasets, was recently proposed as a harder few-shot classification benchmark. Using our method, we derive a new metric, the Class Semantics Consistency Criterion, and use it to quantify the difficulty of Meta-Dataset. Finally, under some restrictive assumptions, we show that Centroid Networks is faster and more accurate than a state-of-the-art learning-to-cluster method (Hsu et al., 2018).
Tasks | Few-Shot Learning, Meta-Learning, Omniglot |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08605v2 |
PDF | https://arxiv.org/pdf/1902.08605v2.pdf
PWC | https://paperswithcode.com/paper/centroid-networks-for-few-shot-clustering-and |
Repo | https://github.com/gabrielhuang/centroid-networks |
Framework | pytorch |
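A sketch of the label-free meta-evaluation trick the abstract describes: embed the support set, recover class centroids by clustering rather than by using labels, and classify queries by nearest centroid. Plain k-means stands in for the paper's Sinkhorn-based clustering, and the embedding network is assumed.

```python
import torch

def kmeans(x, k, iters=10):
    """Plain k-means over embeddings (a stand-in for Sinkhorn clustering)."""
    centroids = x[torch.randperm(len(x))[:k]]
    for _ in range(iters):
        assign = torch.cdist(x, centroids).argmin(dim=1)
        centroids = torch.stack(
            [x[assign == c].mean(0) if (assign == c).any() else centroids[c]
             for c in range(k)])
    return centroids

def centroid_predict(embed, support_x, query_x, n_classes):
    centroids = kmeans(embed(support_x), n_classes)   # no support labels used
    return torch.cdist(embed(query_x), centroids).argmin(dim=1)
```

If this label-free procedure matches supervised accuracy on a benchmark, the episodes were solvable without supervision, which is the paper's diagnostic for benchmark simplicity.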
Joint Training of Neural Network Ensembles
Title | Joint Training of Neural Network Ensembles |
Authors | Andrew M. Webb, Charles Reynolds, Dan-Andrei Iliescu, Henry Reeve, Mikel Lujan, Gavin Brown |
Abstract | We examine the practice of joint training for neural network ensembles, in which a multi-branch architecture is trained via a single loss. This approach has recently gained traction, with claims of greater accuracy per parameter along with increased parallelism. We introduce a family of novel loss functions generalizing multiple previously proposed approaches, with which we study the theoretical and empirical properties of joint training. These losses interpolate smoothly between independent and joint training of predictors, demonstrating that joint training has several disadvantages not observed in prior work. However, with appropriate regularization via our proposed loss, the method shows new promise in resource-limited scenarios and fault-tolerant systems, e.g., IoT and edge devices. Finally, we discuss how these results may have implications for general multi-branch architectures such as ResNeXt and Inception.
Tasks | |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04422v2 |
PDF | http://arxiv.org/pdf/1902.04422v2.pdf
PWC | https://paperswithcode.com/paper/joint-training-of-neural-network-ensembles |
Repo | https://github.com/grey-area/modular-loss-experiments |
Framework | pytorch |
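A sketch of a loss that interpolates between independent and joint training of ensemble branches, in the spirit of the family the abstract describes: at lambda = 0 each branch minimizes its own loss, and at lambda = 1 only the averaged ensemble output is trained. This simple convex combination is an illustrative instance, not necessarily the paper's exact parameterization.

```python
import torch
import torch.nn.functional as F

def coupled_ensemble_loss(branch_logits, y, lam=0.5):
    """branch_logits: list of (batch, classes) tensors, one per branch."""
    independent = torch.stack(
        [F.cross_entropy(z, y) for z in branch_logits]).mean()  # lam = 0 end
    joint = F.cross_entropy(torch.stack(branch_logits).mean(0), y)  # lam = 1 end
    return (1 - lam) * independent + lam * joint
```

Sweeping lam exposes the trade-off the paper studies: fully joint training couples the branches (hurting fault tolerance), while intermediate values act as a regularizer between the two regimes.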