April 1, 2020

2825 words 14 mins read

Paper Group NANR 32

Shape Features Improve General Model Robustness. Deep Network classification by Scattering and Homotopy dictionary learning. HUBERT Untangles BERT to Improve Transfer across NLP Tasks. Deeper Insights into Weight Sharing in Neural Architecture Search. DyNet: Dynamic Convolution for Accelerating Convolution Neural Networks. A Simple Geometric Proof …

Shape Features Improve General Model Robustness

Title Shape Features Improve General Model Robustness
Authors Anonymous
Abstract Recent studies show that convolutional neural networks (CNNs) are vulnerable under various settings, including adversarial examples, backdoor attacks, and distribution shift. Motivated by the finding that the human visual system pays more attention to global structure (e.g., shape) for recognition while CNNs are biased towards local texture features, we propose a unified framework, EdgeGANRob, based on robust edge features to improve the robustness of CNNs in general. It first explicitly extracts shape/structure features from a given image and then reconstructs a new image by refilling the texture information with a trained generative adversarial network (GAN). In addition, to reduce the sensitivity of the edge detection algorithm to adversarial perturbation, we propose a robust edge detection approach, Robust Canny, based on the vanilla Canny algorithm. To gain more insight, we also compare EdgeGANRob with its simplified backbone procedure EdgeNetRob, which performs learning tasks directly on the extracted robust edge features. We find that EdgeNetRob can boost model robustness significantly, but at the cost of clean accuracy. EdgeGANRob, on the other hand, improves clean accuracy over EdgeNetRob without losing the robustness benefits EdgeNetRob introduces. Extensive experiments show that EdgeGANRob is resilient across different learning tasks under diverse settings.
Tasks Edge Detection
Published 2020-01-01
URL https://openreview.net/forum?id=SJlPZlStwS
PDF https://openreview.net/pdf?id=SJlPZlStwS
PWC https://paperswithcode.com/paper/shape-features-improve-general-model
Repo
Framework
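
The abstract does not spell out Robust Canny, so the following is only a minimal sketch of the general idea under assumed choices: denoise and quantize the image before running OpenCV's vanilla Canny detector, so that small adversarial perturbations are less likely to flip edge responses. The function name and all parameters are illustrative, not the paper's.

```python
import cv2
import numpy as np

def robust_canny(img_bgr, blur_ksize=5, n_levels=8, lo=50, hi=150):
    """Hypothetical robust edge extraction: denoise and quantize the image
    before running the vanilla Canny detector, so that small perturbations
    are less likely to flip edge responses."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    smoothed = cv2.GaussianBlur(gray, (blur_ksize, blur_ksize), 0)
    # Quantize intensities to a few levels to suppress small perturbations.
    step = 256 // n_levels
    quantized = (smoothed // step) * step
    return cv2.Canny(quantized.astype(np.uint8), lo, hi)

# Usage: edges = robust_canny(cv2.imread("input.png"))
```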

Deep Network classification by Scattering and Homotopy dictionary learning

Title Deep Network classification by Scattering and Homotopy dictionary learning
Authors Anonymous
Abstract We introduce a sparse scattering deep convolutional neural network, which provides a simple model to analyze properties of deep representation learning for classification. Learning a single dictionary matrix with a classifier yields a higher classification accuracy than AlexNet over the ImageNet ILSVRC2012 dataset. The network first applies a scattering transform which linearizes variabilities due to geometric transformations such as translations and small deformations. A sparse l1 dictionary coding reduces intra-class variability while preserving class separation through projections over unions of linear spaces. It is implemented in a deep convolutional network with a homotopy algorithm having an exponential convergence. A convergence proof is given in a general framework including ALISTA. Classification results are analyzed over ImageNet.
Tasks Dictionary Learning, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=SJxWS64FwH
PDF https://openreview.net/pdf?id=SJxWS64FwH
PWC https://paperswithcode.com/paper/deep-network-classification-by-scattering-and-1
Repo
Framework
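
The paper's contribution is a homotopy sparse-coding algorithm with exponential convergence (related to ALISTA); as a minimal illustration of the underlying l1 sparse coding step only, here is a plain ISTA iteration in NumPy. The dictionary `D` and all hyperparameters are placeholders, not the paper's setup.

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.1, n_iter=100):
    """Plain ISTA for min_z 0.5*||x - D z||^2 + lam*||z||_1.
    Illustrates the l1 sparse coding step; the paper uses a faster
    homotopy scheme rather than this vanilla iteration."""
    L = np.linalg.norm(D, ord=2) ** 2          # Lipschitz constant of D^T D
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)
        z = z - grad / L
        z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return z

# Example: D = np.random.randn(64, 256); x = np.random.randn(64); z = ista_sparse_code(D, x)
```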

HUBERT Untangles BERT to Improve Transfer across NLP Tasks

Title HUBERT Untangles BERT to Improve Transfer across NLP Tasks
Authors Anonymous
Abstract We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional transformer language model. We validate the effectiveness of our model on the GLUE benchmark and HANS dataset. We also show that there is shared structure between different NLP datasets which HUBERT, but not BERT, is able to learn and leverage. Extensive transfer-learning experiments are conducted to confirm this proposition.
Tasks Language Modelling, Transfer Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HJxnM1rFvr
PDF https://openreview.net/pdf?id=HJxnM1rFvr
PWC https://paperswithcode.com/paper/hubert-untangles-bert-to-improve-transfer
Repo
Framework
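
BERT and the full TPR machinery are too heavy to reproduce here, but the core binding operation of Tensor-Product Representations is compact: each filler (content) vector is bound to a role vector by an outer product and the results are summed over positions. The sketch below assumes toy dimensions and random role embeddings; it is not HUBERT's actual architecture.

```python
import torch

def tpr_bind(fillers, roles):
    """Tensor-Product Representation: bind each filler vector to a role
    vector via an outer product and sum over positions.
    fillers: (seq_len, d_f), roles: (seq_len, d_r) -> (d_f, d_r) TPR."""
    return torch.einsum('sf,sr->fr', fillers, roles)

# Hypothetical use on top of BERT token states (d_f = hidden size),
# with learned role embeddings per position (assumed here to be random):
seq_len, d_f, d_r = 12, 768, 32
token_states = torch.randn(seq_len, d_f)     # e.g. BERT's last hidden states
role_embeddings = torch.randn(seq_len, d_r)  # stand-in for learned role vectors
tpr = tpr_bind(token_states, role_embeddings)
print(tpr.shape)  # torch.Size([768, 32])
```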

Deeper Insights into Weight Sharing in Neural Architecture Search

Title Deeper Insights into Weight Sharing in Neural Architecture Search
Authors Anonymous
Abstract With the success of deep neural networks, Neural Architecture Search (NAS) as a way of automatic model design has attracted wide attention. As training every child model from scratch is very time-consuming, recent works leverage weight sharing to speed up model evaluation. These approaches greatly reduce computation by maintaining a single copy of weights on the super-net and sharing them among every child model. However, weight sharing has no theoretical guarantee and its impact has not been well studied before. In this paper, we conduct comprehensive experiments to reveal the impact of weight sharing: (1) the best-performing models from different runs, or even from consecutive epochs within the same run, vary significantly; (2) even with this high variance, we can extract valuable information from training the super-net with shared weights; (3) interference between child models is a main factor inducing the high variance; (4) properly reducing the degree of weight sharing effectively reduces variance and improves performance.
Tasks Neural Architecture Search
Published 2020-01-01
URL https://openreview.net/forum?id=ryxmrpNtvH
PDF https://openreview.net/pdf?id=ryxmrpNtvH
PWC https://paperswithcode.com/paper/deeper-insights-into-weight-sharing-in-neural
Repo
Framework
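
As a minimal picture of what weight sharing means in this setting, the toy super-net cell below keeps one shared copy of each candidate operation's weights; every sampled child architecture indexes into the same parameters, which is exactly where the interference discussed in the abstract comes from. The operation set and sizes are illustrative, not any specific NAS benchmark.

```python
import random
import torch
import torch.nn as nn

class SharedCell(nn.Module):
    """Toy super-net cell: candidate operations keep a single shared copy of
    their weights; each sampled child model picks one op per cell, so
    different children reuse (and interfere through) the same parameters."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])

    def forward(self, x, op_idx):
        return self.ops[op_idx](x)

cell = SharedCell(16)
x = torch.randn(1, 16, 8, 8)
child_arch = random.randrange(len(cell.ops))   # sample a child model
y = cell(x, child_arch)                        # evaluated with shared weights
```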

DyNet: Dynamic Convolution for Accelerating Convolution Neural Networks

Title DyNet: Dynamic Convolution for Accelerating Convolution Neural Networks
Authors Anonymous
Abstract The convolution operator is the core of convolutional neural networks (CNNs) and accounts for most of their computation cost. To make CNNs more efficient, many methods have been proposed to either design lightweight networks or compress models. Although some efficient network structures, such as MobileNet and ShuffleNet, have been proposed, we find that redundant information still exists between convolution kernels. To address this issue, we propose a novel dynamic convolution method named DyNet, which adaptively generates convolution kernels based on image content. To demonstrate its effectiveness, we apply DyNet to multiple state-of-the-art CNNs. The experimental results show that DyNet reduces the computation cost remarkably while keeping performance nearly unchanged. Specifically, for ShuffleNetV2 (1.0), MobileNetV2 (1.0), ResNet18 and ResNet50, DyNet reduces FLOPs by 40.0%, 56.7%, 68.2% and 72.4% respectively, while the Top-1 accuracy on ImageNet changes by only +1.0%, -0.27%, -0.6% and -0.08%. Meanwhile, DyNet accelerates the inference of MobileNetV2 (1.0), ResNet18 and ResNet50 by 1.87x, 1.32x and 1.48x respectively on a CPU platform. To verify scalability, we also apply DyNet to a segmentation task; the results show that DyNet reduces FLOPs by 69.3% while maintaining mean IoU.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SyeZIkrKwS
PDF https://openreview.net/pdf?id=SyeZIkrKwS
PWC https://paperswithcode.com/paper/dynet-dynamic-convolution-for-accelerating
Repo
Framework
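
The exact kernel-generation module is not given here, so the sketch below follows the general recipe the abstract describes, in the style of content-conditioned (dynamic) convolutions: a small gating branch predicts per-image coefficients that mix a bank of kernels, and the mixed kernel is applied via a grouped convolution. All sizes and the gating design are assumptions, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """Content-adaptive convolution: a gating branch (global pool + linear +
    softmax) predicts coefficients that mix `num_kernels` kernel banks into
    one kernel per input image."""
    def __init__(self, in_ch, out_ch, k=3, num_kernels=4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_kernels, out_ch, in_ch, k, k) * 0.02)
        self.gate = nn.Linear(in_ch, num_kernels)
        self.k = k

    def forward(self, x):
        b = x.size(0)
        coeff = F.softmax(self.gate(x.mean(dim=(2, 3))), dim=1)    # (B, K)
        # Mix the kernel banks per sample, then run one grouped convolution.
        w = torch.einsum('bk,koihw->boihw', coeff, self.weight)
        w = w.reshape(-1, *self.weight.shape[2:])                   # (B*out, in, k, k)
        out = F.conv2d(x.reshape(1, -1, *x.shape[2:]), w,
                       padding=self.k // 2, groups=b)
        return out.reshape(b, -1, *out.shape[2:])

y = DynamicConv2d(16, 32)(torch.randn(2, 16, 8, 8))   # -> (2, 32, 8, 8)
```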

A Simple Geometric Proof for the Benefit of Depth in ReLU Networks

Title A Simple Geometric Proof for the Benefit of Depth in ReLU Networks
Authors Anonymous
Abstract We present a simple proof for the benefit of depth in multi-layer feedforward networks with rectified activations ("depth separation"). Specifically, we present a sequence of classification problems f_i such that (a) for any fixed-depth rectified network, we can find an index m such that problems with index > m require exponential network width to fully represent the function f_m; and (b) for any problem f_m in the family, we present a concrete neural network with linear depth and bounded width that fully represents it. While several previous works show similar results, our proof uses substantially simpler tools and techniques, and should be accessible to undergraduate students in computer science and people with similar backgrounds.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SkxEWgStDr
PDF https://openreview.net/pdf?id=SkxEWgStDr
PWC https://paperswithcode.com/paper/a-simple-geometric-proof-for-the-benefit-of
Repo
Framework
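
The classic textbook illustration of depth separation (not necessarily the paper's construction) fits in a few lines: composing the tent map with itself k times yields a sawtooth with 2^k linear pieces using O(k) ReLU units, whereas a single hidden layer needs a number of units that grows with the number of pieces.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tent(x):
    """One hidden layer of two ReLUs realizes the tent map on [0, 1]:
    tent(x) = 2*relu(x) - 4*relu(x - 0.5)."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    """Composing the tent map `depth` times gives a sawtooth with 2**depth
    linear pieces using O(depth) ReLU units; matching it with one hidden
    layer requires width exponential in `depth`."""
    for _ in range(depth):
        x = tent(x)
    return x

xs = np.linspace(0.0, 1.0, 9)
print(deep_sawtooth(xs, depth=3))
```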

Towards Fast Adaptation of Neural Architectures with Meta Learning

Title Towards Fast Adaptation of Neural Architectures with Meta Learning
Authors Anonymous
Abstract Recently, Neural Architecture Search (NAS) has been successfully applied to multiple areas of artificial intelligence and shows better performance than hand-designed networks. However, existing NAS methods only target a specific task: most do well at searching an architecture for a single task but are cumbersome for multiple datasets or multiple tasks. Generally, the architecture for a new task is either searched from scratch, which is neither efficient nor flexible enough for practical application scenarios, or borrowed from architectures searched on other tasks, which might not be optimal. In order to tackle the transferability of NAS and conduct fast adaptation of neural architectures, we propose a novel Transferable Neural Architecture Search method based on meta-learning, termed T-NAS. T-NAS learns a meta-architecture that is able to adapt to a new task quickly through a few gradient steps, which makes the transferred architecture suitable for the specific task. Extensive experiments show that T-NAS achieves state-of-the-art performance in few-shot learning and comparable performance in supervised learning, but with 50x less search cost, which demonstrates the effectiveness of our method.
Tasks Few-Shot Learning, Meta-Learning, Neural Architecture Search
Published 2020-01-01
URL https://openreview.net/forum?id=r1eowANFvr
PDF https://openreview.net/pdf?id=r1eowANFvr
PWC https://paperswithcode.com/paper/towards-fast-adaptation-of-neural
Repo
Framework
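
T-NAS applies a MAML-style inner/outer loop to architecture parameters. The snippet below shows only the generic second-order meta-learning skeleton on a toy parameter vector and synthetic tasks; in T-NAS the adapted object would be the architecture encoding, which is not reproduced here.

```python
import torch

# Minimal MAML-style sketch: meta-learn a parameter vector that adapts to a
# new task in one gradient step. `theta` and `task_loss` are toy stand-ins.
theta = torch.zeros(8, requires_grad=True)            # meta-parameters
meta_opt = torch.optim.SGD([theta], lr=0.1)

def task_loss(params, target):
    return ((params - target) ** 2).mean()

for step in range(100):
    meta_opt.zero_grad()
    meta_loss = 0.0
    for _ in range(4):                                 # sample a batch of tasks
        target = torch.randn(8)
        # Inner loop: one adaptation step, keeping the graph for 2nd-order grads.
        grad, = torch.autograd.grad(task_loss(theta, target), theta, create_graph=True)
        adapted = theta - 0.5 * grad
        meta_loss = meta_loss + task_loss(adapted, target)
    meta_loss.backward()                               # outer (meta) update
    meta_opt.step()
```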

Higher-Order Function Networks for Learning Composable 3D Object Representations

Title Higher-Order Function Networks for Learning Composable 3D Object Representations
Authors Anonymous
Abstract We present a new approach to 3D object representation where the geometry of an object is encoded directly into the weights and biases of a second ‘mapping’ network. This mapping network can be used to reconstruct an object by applying its encoded transformation to points randomly sampled from a simple geometric space, such as the unit sphere. Next, we extend this concept to enable the composition of multiple mapping functions. This capability provides a method for mixing features of different objects through function composition in a latent function space. Our experiments examine the effectiveness of our method on a subset of the ShapeNet dataset. We find that this representation can reconstruct objects with accuracy equal to or exceeding state-of-the-art methods with orders of magnitude fewer parameters. Our smallest reconstruction network has only about 7000 parameters and shows reconstruction quality on par with state-of-the-art object representation architectures with millions of parameters.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJgfDREKDB
PDF https://openreview.net/pdf?id=HJgfDREKDB
PWC https://paperswithcode.com/paper/higher-order-function-networks-for-learning-1
Repo
Framework
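
A minimal version of the "mapping network" idea: an encoder emits the weights and biases of a small MLP, and that MLP is applied to points sampled from the unit sphere to produce the reconstruction. Layer sizes, the linear encoder, and the class name below are all illustrative placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

class HyperMapping(nn.Module):
    """Toy sketch: encode an observation into the weights of a small
    3 -> hidden -> 3 'mapping' MLP, then apply it to unit-sphere samples."""
    def __init__(self, obs_dim=256, hidden=32):
        super().__init__()
        self.hidden = hidden
        n_params = 3 * hidden + hidden + hidden * 3 + 3
        self.encoder = nn.Linear(obs_dim, n_params)

    def forward(self, obs, points):
        p = self.encoder(obs)
        h = self.hidden
        w1, p = p[:3 * h].view(h, 3), p[3 * h:]
        b1, p = p[:h], p[h:]
        w2, b2 = p[:h * 3].view(3, h), p[h * 3:]
        return torch.relu(points @ w1.t() + b1) @ w2.t() + b2

sphere = torch.nn.functional.normalize(torch.randn(1024, 3), dim=1)  # unit-sphere samples
recon = HyperMapping()(torch.randn(256), sphere)                      # (1024, 3) points
```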

Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards

Title Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards
Authors Anonymous
Abstract While recent progress in deep reinforcement learning has enabled robots to learn complex behaviors, tasks with long horizons and sparse rewards remain an ongoing challenge. In this work, we propose an effective reward shaping method through predictive coding to tackle sparse reward problems. By learning predictive representations offline and using them for reward shaping, we gain access to reward signals that understand the structure and dynamics of the environment. In particular, our method achieves better learning by providing reward signals that 1) understand environment dynamics, 2) emphasize the features most useful for learning, and 3) resist noise in the learned representations through reward accumulation. We demonstrate the usefulness of this approach in domains ranging from robotic manipulation to navigation, and we show that reward signals produced through predictive coding are as effective for learning as hand-crafted rewards.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Hkxi2gHYvH
PDF https://openreview.net/pdf?id=Hkxi2gHYvH
PWC https://paperswithcode.com/paper/predictive-coding-for-boosting-deep
Repo
Framework
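
The abstract describes using learned predictive representations for reward shaping. A generic sketch of that pattern is a potential-based bonus computed in representation space; the encoder `phi`, the goal-distance potential, and the coefficient below are assumptions for illustration, not the paper's exact shaping scheme.

```python
import numpy as np

def shaped_reward(phi, s, s_next, goal, env_reward, coef=0.1):
    """Illustrative reward shaping with a learned state encoder `phi`:
    add a potential-based bonus for moving closer to the goal in
    representation space. The paper learns phi offline with predictive
    coding; here phi is any callable mapping observations to vectors."""
    potential = lambda s_: -np.linalg.norm(phi(s_) - phi(goal))
    return env_reward + coef * (potential(s_next) - potential(s))

# Usage with a placeholder identity encoder:
phi = lambda s: np.asarray(s, dtype=float)
r = shaped_reward(phi, s=[0.0, 0.0], s_next=[0.5, 0.0], goal=[1.0, 0.0], env_reward=0.0)
```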

Representation Quality Explain Adversarial Attacks

Title Representation Quality Explain Adversarial Attacks
Authors Anonymous
Abstract Neural networks have been shown to be vulnerable to adversarial samples: slightly perturbed input images can change the classification of accurate models, showing that the learned representation is not as good as previously thought. To aid the development of better neural networks, it is important to evaluate to what extent current networks' representations capture the existing features. Here we propose a way to evaluate the representation quality of neural networks using a novel type of zero-shot test, entitled Raw Zero-Shot. The main idea lies in the fact that some features are present in unknown classes, and that unknown classes can be defined as a combination of previously learned features without representation bias (a bias towards representations that map only the current set of input-outputs and their boundary). To evaluate the soft-labels of unknown classes, two metrics are proposed: one based on a clustering validation technique (the Davies-Bouldin index) and the other based on the distance to a given correct soft-label. Experiments show that these metrics are in accordance with robustness to adversarial attacks and might serve as guidance for building better models, as well as be used in loss functions to create new types of neural networks. Interestingly, the results suggest that dynamic routing networks such as CapsNet have better representations, while current deeper DNNs trade off representation quality for accuracy.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SklfY6EFDH
PDF https://openreview.net/pdf?id=SklfY6EFDH
PWC https://paperswithcode.com/paper/representation-quality-explain-adversarial
Repo
Framework
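
One of the two proposed metrics builds on the Davies-Bouldin index over soft-labels of held-out ("unknown") classes. A minimal sketch using scikit-learn's implementation, with synthetic soft-labels standing in for a real network's outputs:

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score

def representation_quality(soft_labels, class_ids):
    """Davies-Bouldin index over soft-labels of held-out classes: lower
    scores indicate tighter, better-separated clusters, i.e. a representation
    that still organizes unknown classes from previously learned features."""
    return davies_bouldin_score(soft_labels, class_ids)

# Toy example: 2 held-out classes, 10-dim soft-labels per sample.
soft = np.vstack([np.random.randn(50, 10) + 2.0, np.random.randn(50, 10) - 2.0])
ids = np.array([0] * 50 + [1] * 50)
print(representation_quality(soft, ids))
```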

Learning by shaking: Computing policy gradients by physical forward-propagation

Title Learning by shaking: Computing policy gradients by physical forward-propagation
Authors Anonymous
Abstract Model-free and model-based reinforcement learning are two ends of a spectrum. Learning a good policy without a dynamic model can be prohibitively expensive. Learning the dynamic model of a system can reduce the cost of learning the policy, but it can also introduce bias if it is not accurate. We propose a middle ground where instead of the transition model, the sensitivity of the trajectories with respect to the perturbation (shaking) of the parameters is learned. This allows us to predict the local behavior of the physical system around a set of nominal policies without knowing the actual model. We assay our method on a custom-built physical robot in extensive experiments and show the feasibility of the approach in practice. We investigate potential challenges when applying our method to physical systems and propose solutions to each of them.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=r1gfweBFPB
PDF https://openreview.net/pdf?id=r1gfweBFPB
PWC https://paperswithcode.com/paper/learning-by-shaking-computing-policy
Repo
Framework
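
The sensitivity-learning machinery is not described in detail here; as a rough illustration of the underlying "shake the parameters and watch the trajectory" idea, the snippet below uses a finite-difference, smoothing-style estimate of how the return changes under parameter perturbations. The paper learns this sensitivity model rather than re-sampling it for every update.

```python
import numpy as np

def perturbation_sensitivity(rollout_return, theta, sigma=0.05, n_samples=16):
    """Illustrative 'shaking' estimate: perturb the policy parameters, observe
    how the trajectory return changes, and average to get a local gradient
    direction (a finite-difference / smoothing style estimator)."""
    base = rollout_return(theta)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = np.random.randn(*theta.shape)
        grad += (rollout_return(theta + sigma * eps) - base) * eps
    return grad / (n_samples * sigma)

# Usage with a stand-in for running the physical system:
rollout_return = lambda th: -np.sum((th - 1.0) ** 2)   # placeholder return
theta = np.zeros(4)
print(perturbation_sensitivity(rollout_return, theta))
```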

Learning Latent State Spaces for Planning through Reward Prediction

Title Learning Latent State Spaces for Planning through Reward Prediction
Authors Anonymous
Abstract Model-based reinforcement learning methods typically learn models for high-dimensional state spaces by aiming to reconstruct and predict the original observations. However, drawing inspiration from model-free reinforcement learning, we propose learning a latent dynamics model directly from rewards. In this work, we introduce a model-based planning framework which learns a latent reward prediction model and then plans in the latent state space. The latent representation is learned exclusively from multi-step reward prediction, which we show to be the only information necessary for successful planning. With this framework, we are able to benefit from the concise model-free representation while still enjoying the data-efficiency of model-based algorithms. We demonstrate our framework in multi-pendulum and multi-cheetah environments, where several pendulums or cheetahs are shown to the agent but only one of them produces rewards. In these environments, it is important for the agent to construct a concise latent representation that filters out irrelevant observations. We find that our method successfully learns an accurate latent reward prediction model in the presence of irrelevant information, while existing model-based methods fail. Planning in the learned latent state space shows strong performance and high sample efficiency over model-free and model-based baselines.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=ByxJjlHKwr
PDF https://openreview.net/pdf?id=ByxJjlHKwr
PWC https://paperswithcode.com/paper/learning-latent-state-spaces-for-planning
Repo
Framework
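
A minimal sketch of the training signal the abstract describes: encode the first observation, roll the latent state forward with a learned dynamics model conditioned on actions, and fit everything only to multi-step reward prediction, with no observation reconstruction. Module shapes and names are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

obs_dim, act_dim, latent_dim, horizon = 32, 4, 16, 5
encoder  = nn.Linear(obs_dim, latent_dim)
dynamics = nn.Linear(latent_dim + act_dim, latent_dim)
reward   = nn.Linear(latent_dim, 1)

def multi_step_reward_loss(obs0, actions, rewards):
    """obs0: (B, obs_dim); actions: (B, H, act_dim); rewards: (B, H).
    The only supervision is the multi-step reward prediction error."""
    z = encoder(obs0)
    loss = 0.0
    for t in range(actions.size(1)):
        z = dynamics(torch.cat([z, actions[:, t]], dim=1))
        loss = loss + ((reward(z).squeeze(1) - rewards[:, t]) ** 2).mean()
    return loss / actions.size(1)

loss = multi_step_reward_loss(torch.randn(8, obs_dim),
                              torch.randn(8, horizon, act_dim),
                              torch.randn(8, horizon))
```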

DeepAGREL: Biologically plausible deep learning via direct reinforcement

Title DeepAGREL: Biologically plausible deep learning via direct reinforcement
Authors Anonymous
Abstract While much recent work has focused on biologically plausible variants of error-backpropagation, learning in the brain seems to mostly adhere to a reinforcement learning paradigm; biologically plausible neural reinforcement learning frameworks, however, were limited to shallow networks learning from compact and abstract sensory representations. Here, we show that it is possible to generalize such approaches to deep networks with an arbitrary number of layers. We demonstrate the learning scheme - DeepAGREL - on classical and hard image-classification benchmarks requiring deep networks, namely MNIST, CIFAR10, and CIFAR100, cast as direct reward tasks, for deep fully connected, convolutional, and locally connected architectures. We show that for these tasks, DeepAGREL achieves an accuracy equal to that of supervised error-backpropagation, and that the trial-and-error nature of such learning imposes only a very limited cost in terms of training time. Thus, our results provide new insights into how deep learning may be implemented in the brain.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=ryl4-pEKvB
PDF https://openreview.net/pdf?id=ryl4-pEKvB
PWC https://paperswithcode.com/paper/deepagrel-biologically-plausible-deep
Repo
Framework
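
The exact DeepAGREL update is not reproduced here; the toy trial below only illustrates the flavour of reward-modulated, feedback-gated learning on a two-layer network: pick an action, receive a direct reward, and scale weight changes by the reward prediction error, gating the hidden-layer update through the chosen output unit. All details are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def agrel_trial(W1, W2, x, correct_class, lr=0.05):
    """Toy reward-modulated trial, loosely in the spirit of AGREL; not the
    paper's exact DeepAGREL rule."""
    h = np.maximum(W1 @ x, 0.0)                       # hidden ReLU activity
    q = W2 @ h                                        # output/action values
    p = np.exp(q - q.max()); p /= p.sum()             # softmax action selection
    a = rng.choice(len(q), p=p)
    r = 1.0 if a == correct_class else 0.0            # direct reward
    delta = r - q[a]                                  # reward prediction error
    W2[a] += lr * delta * h                           # update only the chosen unit
    gate = (h > 0) * W2[a]                            # feedback gating through that unit
    W1 += lr * delta * np.outer(gate, x)              # gated hidden-layer update
    return r

W1 = rng.normal(0, 0.1, (20, 10)); W2 = rng.normal(0, 0.1, (3, 20))
agrel_trial(W1, W2, rng.normal(size=10), correct_class=1)
```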

CrossNorm: On Normalization for Off-Policy Reinforcement Learning

Title CrossNorm: On Normalization for Off-Policy Reinforcement Learning
Authors Anonymous
Abstract Off-policy temporal difference (TD) methods are a powerful class of reinforcement learning (RL) algorithms. Intriguingly, deep off-policy TD algorithms are not commonly used in combination with feature normalization techniques, despite positive effects of normalization in other domains. We show that naive application of existing normalization techniques is indeed not effective, but that well-designed normalization improves optimization stability and removes the necessity of target networks. In particular, we introduce a normalization based on a mixture of on- and off-policy transitions, which we call cross-normalization. It can be regarded as an extension of batch normalization that re-centers data for two different distributions, as present in off-policy learning. Applied to DDPG and TD3, cross-normalization improves over the state of the art across a range of MuJoCo benchmark tasks.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=SyeMblBtwr
PDF https://openreview.net/pdf?id=SyeMblBtwr
PWC https://paperswithcode.com/paper/crossnorm-on-normalization-for-off-policy
Repo
Framework
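
A small sketch of the cross-normalization idea under assumed interfaces: compute normalization statistics over a mixture of on-policy and off-policy feature batches and re-center both with the shared statistics. In the paper this lives inside the networks as a batch-norm-style layer; the standalone function below is only illustrative.

```python
import torch

def cross_normalize(on_policy, off_policy, eps=1e-5):
    """Normalize both batches with statistics computed over their mixture,
    i.e. over on- and off-policy transitions together."""
    mixed = torch.cat([on_policy, off_policy], dim=0)
    mean, var = mixed.mean(dim=0), mixed.var(dim=0, unbiased=False)
    normalize = lambda x: (x - mean) / torch.sqrt(var + eps)
    return normalize(on_policy), normalize(off_policy)

on_feats, off_feats = torch.randn(64, 128), torch.randn(256, 128)
on_norm, off_norm = cross_normalize(on_feats, off_feats)
```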

Imbalanced Classification via Adversarial Minority Over-sampling

Title Imbalanced Classification via Adversarial Minority Over-sampling
Authors Anonymous
Abstract In most real-world scenarios, training datasets are highly class-imbalanced, and deep neural networks trained on them generalize poorly to a balanced testing criterion. In this paper, we explore a novel yet simple way to alleviate this issue by synthesizing less-frequent classes from adversarial examples of other classes. Surprisingly, we find that this counter-intuitive method can effectively learn generalizable features of minority classes by transferring and leveraging the diversity of the majority information. Our experimental results on various types of class-imbalanced datasets in image classification and natural language processing show that the proposed method not only improves the generalization of minority classes significantly compared to other re-sampling or re-weighting methods, but also surpasses other state-of-the-art methods for class-imbalanced classification.
Tasks Image Classification
Published 2020-01-01
URL https://openreview.net/forum?id=HJxaC1rKDS
PDF https://openreview.net/pdf?id=HJxaC1rKDS
PWC https://paperswithcode.com/paper/imbalanced-classification-via-adversarial
Repo
Framework
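
As a rough sketch of the over-sampling mechanism the abstract describes (with assumed hyperparameters and objective): start from majority-class inputs and take a few signed-gradient steps that push the classifier's prediction toward a chosen minority label, then treat the perturbed inputs as additional minority samples.

```python
import torch
import torch.nn.functional as F

def synthesize_minority(model, x_majority, minority_class, step=0.01, n_steps=10):
    """Illustrative adversarial minority over-sampling: perturb majority-class
    inputs toward a minority label with small signed-gradient steps."""
    x = x_majority.clone().detach().requires_grad_(True)
    target = torch.full((x.size(0),), minority_class, dtype=torch.long)
    for _ in range(n_steps):
        loss = F.cross_entropy(model(x), target)
        grad, = torch.autograd.grad(loss, x)
        x = (x - step * grad.sign()).detach().requires_grad_(True)
    return x.detach()

# Usage: synthetic = synthesize_minority(classifier, majority_batch, minority_class=3)
```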