April 1, 2020

2877 words 14 mins read

Paper Group NANR 108

Symplectic Recurrent Neural Networks. Model-Augmented Actor-Critic: Backpropagating through Paths. Detecting Change in Seasonal Pattern via Autoencoder and Temporal Regularization. Weakly Supervised Clustering by Exploiting Unique Class Count. LARGE SCALE REPRESENTATION LEARNING FROM TRIPLET COMPARISONS. ICNN: INPUT-CONDITIONED FEATURE REPRESENTATI …

Symplectic Recurrent Neural Networks

Title Symplectic Recurrent Neural Networks
Authors Anonymous
Abstract We propose Symplectic Recurrent Neural Networks (SRNNs) as learning algorithms that capture the dynamics of physical systems from observed trajectories. SRNNs model the Hamiltonian function of the system with a neural network and leverage symplectic integration, multiple-step training, and initial state optimization to address the challenging numerical issues associated with Hamiltonian systems. We show that SRNNs succeed reliably on complex and noisy Hamiltonian systems. Finally, we show how to augment the SRNN integration scheme to handle stiff dynamical systems such as bouncing billiards.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BkgYPREtPr
PDF https://openreview.net/pdf?id=BkgYPREtPr
PWC https://paperswithcode.com/paper/symplectic-recurrent-neural-networks-1
Repo
Framework
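
A minimal sketch of the recipe the abstract describes, assuming a separable Hamiltonian H(q, p) = T(p) + V(q) (that split, and all identifiers, are illustrative rather than the authors' code): two small networks model kinetic and potential energy, and a symplectic leapfrog integrator unrolls the learned dynamics; multiple-step training would backpropagate a trajectory loss through this rollout.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))
    def forward(self, x):
        return self.net(x)

class SRNNSketch(nn.Module):
    """Hypothetical SRNN core: learned H(q, p) = T(p) + V(q), leapfrog rollout."""
    def __init__(self, dim):
        super().__init__()
        self.T = MLP(dim)  # kinetic energy T(p)
        self.V = MLP(dim)  # potential energy V(q)

    def leapfrog_step(self, q, p, dt):
        # Hamilton's equations for a separable H: dq/dt = dT/dp, dp/dt = -dV/dq.
        dVdq = torch.autograd.grad(self.V(q).sum(), q, create_graph=True)[0]
        p_half = p - 0.5 * dt * dVdq
        dTdp = torch.autograd.grad(self.T(p_half).sum(), p_half, create_graph=True)[0]
        q_next = q + dt * dTdp
        dVdq = torch.autograd.grad(self.V(q_next).sum(), q_next, create_graph=True)[0]
        p_next = p_half - 0.5 * dt * dVdq
        return q_next, p_next

    def rollout(self, q0, p0, dt, steps):
        q, p = q0.requires_grad_(True), p0.requires_grad_(True)
        traj = []
        for _ in range(steps):
            q, p = self.leapfrog_step(q, p, dt)
            traj.append((q, p))
        return traj  # compare against observed states for a multi-step loss
```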

Model-Augmented Actor-Critic: Backpropagating through Paths

Title Model-Augmented Actor-Critic: Backpropagating through Paths
Authors Anonymous
Abstract Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator to augment the data for policy optimization or value function learning. In this paper, we show how to make more effective use of the model by exploiting its differentiability. We construct a policy optimization algorithm that uses the pathwise derivative of the learned model and policy across future timesteps. Instabilities of learning across many timesteps are prevented by using a terminal value function, learning the policy in an actor-critic fashion. Furthermore, we present a derivation of the monotonic improvement of our objective in terms of the gradient error in the model and value function. We show that our approach (i) is consistently more sample efficient than existing state-of-the-art model-based algorithms, (ii) matches the asymptotic performance of model-free algorithms, and (iii) scales to long horizons, a regime where past model-based approaches have typically struggled.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Skln2A4YDB
PDF https://openreview.net/pdf?id=Skln2A4YDB
PWC https://paperswithcode.com/paper/model-augmented-actor-critic-backpropagating
Repo
Framework
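
A hedged sketch of the pathwise objective the abstract outlines (function names and the exact discounting are my assumptions, not the paper's code): roll a differentiable learned model forward for H steps under the current policy, cap the rollout with a terminal critic, and backpropagate the return directly into the policy parameters.

```python
import torch

def pathwise_objective(policy, model, reward_fn, value_fn, s0,
                       horizon=10, gamma=0.99):
    """policy, model, reward_fn, value_fn: differentiable torch modules."""
    s, ret, discount = s0, 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)                   # reparameterized action keeps the path differentiable
        ret = ret + discount * reward_fn(s, a)
        s = model(s, a)                 # learned dynamics step; gradients flow through it
        discount *= gamma
    ret = ret + discount * value_fn(s)  # terminal value function stabilizes long horizons
    return ret.mean()

# Policy update: ascend the objective w.r.t. policy parameters only, e.g.
#   loss = -pathwise_objective(pi, f, r, V, batch_states)
#   loss.backward(); policy_optimizer.step()
```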

Detecting Change in Seasonal Pattern via Autoencoder and Temporal Regularization

Title Detecting Change in Seasonal Pattern via Autoencoder and Temporal Regularization
Authors Anonymous
Abstract The change-point detection problem consists of discovering abrupt property changes in the process that generates a time series. Most state-of-the-art models optimize the power of a kernel two-sample test, with only a few assumptions on the distribution of the data. Unfortunately, because they presume the samples are distributed i.i.d., they are unable to use information about the seasonality of a time series. In this paper, we present a novel approach, ATR-CSPD, that allows detecting changes in the seasonal pattern of a time series. Our method uses an autoencoder together with a temporal regularization to learn the pattern of each seasonal cycle. Using a low-dimensional representation of the seasonal patterns, it is possible to accurately and efficiently estimate the existence of a change point using a clustering algorithm. Through experiments on artificial and real-world data sets, we demonstrate the usefulness of the proposed method for several applications.
Tasks Change Point Detection, Time Series
Published 2020-01-01
URL https://openreview.net/forum?id=B1esygHFwS
PDF https://openreview.net/pdf?id=B1esygHFwS
PWC https://paperswithcode.com/paper/detecting-change-in-seasonal-pattern-via
Repo
Framework
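
A minimal sketch of my reading of the method (identifiers and the loss weighting are hypothetical): slice the series into seasonal cycles, encode each cycle with an autoencoder, and penalize jumps between consecutive cycle embeddings; a change in the seasonal pattern then shows up as a split when clustering the embeddings (e.g., k-means with k = 2).

```python
import torch
import torch.nn as nn

def cycles(series, period):
    """Reshape a 1-D series into (n_cycles, period) rows, one per seasonal cycle."""
    n = len(series) // period
    return torch.as_tensor(series[: n * period], dtype=torch.float32).view(n, period)

class CycleAE(nn.Module):
    def __init__(self, period, latent=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(period, 32), nn.ReLU(), nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, period))
    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

def loss_fn(x, x_hat, z, lam=0.1):
    recon = ((x - x_hat) ** 2).mean()
    temporal = ((z[1:] - z[:-1]) ** 2).mean()  # temporal regularizer: adjacent cycles stay close
    return recon + lam * temporal
```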

Weakly Supervised Clustering by Exploiting Unique Class Count

Title Weakly Supervised Clustering by Exploiting Unique Class Count
Authors Anonymous
Abstract A weakly supervised learning based clustering framework is proposed in this paper. As the core of this framework, we introduce a novel multiple instance learning task based on a bag-level label called unique class count (ucc), which is the number of unique classes among all instances inside the bag. In this task, no annotations on individual instances inside the bag are needed during training of the models. We mathematically prove that with a perfect ucc classifier, perfect clustering of individual instances inside the bags is possible even when no annotations on individual instances are given during training. We have constructed a neural network based ucc classifier and experimentally shown that the clustering performance of our framework with our weakly supervised ucc classifier is comparable to that of fully supervised learning models where labels for all instances are known. Furthermore, we have tested the applicability of our framework to a real-world task of semantic segmentation of breast cancer metastases in histological lymph node sections and shown that the performance of our weakly supervised framework is comparable to that of a fully supervised U-Net model.
Tasks Multiple Instance Learning, Semantic Segmentation
Published 2020-01-01
URL https://openreview.net/forum?id=B1xIj3VYvr
PDF https://openreview.net/pdf?id=B1xIj3VYvr
PWC https://paperswithcode.com/paper/weakly-supervised-clustering-by-exploiting
Repo
Framework
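
A hedged sketch of a bag-level ucc classifier (the pooling and head below are my simplification; the paper's construction differs in detail): embed instances, pool them into a permutation-invariant bag descriptor, and classify the descriptor into ucc values, using only the bag label. After training, clustering the instance embeddings is the step the paper's guarantee concerns.

```python
import torch
import torch.nn as nn

class UCCClassifier(nn.Module):
    def __init__(self, in_dim, max_ucc):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                   nn.Linear(64, 16))
        self.head = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                                  nn.Linear(64, max_ucc))

    def forward(self, bag):  # bag: (n_instances, in_dim), no instance labels
        z = self.embed(bag)
        pooled = torch.cat([z.mean(0), z.max(0).values])  # permutation-invariant descriptor
        return self.head(pooled)  # logits over ucc = 1..max_ucc
```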

LARGE SCALE REPRESENTATION LEARNING FROM TRIPLET COMPARISONS

Title LARGE SCALE REPRESENTATION LEARNING FROM TRIPLET COMPARISONS
Authors Anonymous
Abstract In this paper, we discuss the fundamental problem of representation learning from a new perspective. It has been observed in many supervised/unsupervised DNNs that the final layer of the network often provides an informative representation for many tasks, even though the network has been trained to perform a particular task. The common ingredient in all previous studies is a low-level feature representation for items, for example, RGB values of images in the image context. In the present work, we assume that no meaningful representation of the items is given. Instead, we are provided with the answers to some triplet comparisons of the following form: Is item A more similar to item B or item C? We provide a fast algorithm based on DNNs that constructs a Euclidean representation for the items, using solely the answers to the above-mentioned triplet comparisons. This problem has been studied in a sub-community of machine learning under the name “Ordinal Embedding”. Previous approaches to the problem are painfully slow and cannot scale to larger datasets. We demonstrate that our proposed approach is significantly faster than available methods and can scale to real-world large datasets. Thereby, we also draw attention to the less explored idea of using neural networks to directly and approximately solve non-convex, NP-hard optimization problems that arise naturally in unsupervised learning problems.
Tasks Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=rklhqkHFDB
PDF https://openreview.net/pdf?id=rklhqkHFDB
PWC https://paperswithcode.com/paper/large-scale-representation-learning-from-1
Repo
Framework
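
A minimal sketch of the objective (the paper trains a DNN; the plain embedding table below is the simplest instance of the same triplet loss, and all names are illustrative): each item gets a trainable vector, and each answered comparison "A is more similar to B than to C" becomes an anchor/positive/negative triplet.

```python
import torch
import torch.nn as nn

n_items, dim = 10000, 16
emb = nn.Embedding(n_items, dim)        # no item features: one trainable row per item
opt = torch.optim.Adam(emb.parameters(), lr=1e-2)
triplet_loss = nn.TripletMarginLoss(margin=1.0)

def step(triplets):
    """triplets: LongTensor (batch, 3) of item ids (anchor, closer, farther)."""
    a, p, n = emb(triplets[:, 0]), emb(triplets[:, 1]), emb(triplets[:, 2])
    loss = triplet_loss(a, p, n)        # pull anchor toward B, push away from C
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```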

ICNN: INPUT-CONDITIONED FEATURE REPRESENTATION LEARNING FOR TRANSFORMATION-INVARIANT NEURAL NETWORK

Title ICNN: INPUT-CONDITIONED FEATURE REPRESENTATION LEARNING FOR TRANSFORMATION-INVARIANT NEURAL NETWORK
Authors Anonymous
Abstract We propose a novel framework, ICNN, which combines an input-conditioned filter generation module and a decoder based network to incorporate contextual information present in images into Convolutional Neural Networks (CNNs). In contrast to traditional CNNs, we do not employ the same set of learned convolution filters for all input image instances. Our proposed decoder network serves the purpose of reducing the transformation present in the input image by learning to construct a representative image of the input image class. Our proposed input-aware framework with joint supervision, when combined with techniques inspired by multiple instance learning and max-pooling, results in a transformation-invariant neural network. We investigated the performance of our proposed framework on three MNIST variations, which cover both rotation and scaling variance, and achieved 0.98% error on MNIST-rot-12k, 1.12% error on Half-rotated MNIST and 0.68% error on Scaling MNIST, which is significantly better than the state-of-the-art results. Our proposed model also showcased consistent improvement on the CIFAR dataset. We make use of visualization to further demonstrate the effectiveness of our input-aware convolution filters. Our proposed convolution filter generation framework can also serve as a plugin for any CNN based architecture and enhance its modeling capacity.
Tasks Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=SJecKyrKPH
PDF https://openreview.net/pdf?id=SJecKyrKPH
PWC https://paperswithcode.com/paper/icnn-input-conditioned-feature-representation
Repo
Framework
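
A hedged sketch of an input-conditioned convolution (my simplification of the filter-generation idea; class and parameter names are hypothetical): a small generator predicts per-input kernels from a global summary of the image, and a grouped convolution applies each sample's own filters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InputConditionedConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.gen = nn.Sequential(          # filter generator conditioned on the input
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, out_ch * in_ch * k * k))

    def forward(self, x):                  # x: (B, in_ch, H, W)
        B = x.size(0)
        w = self.gen(x).view(B * self.out_ch, self.in_ch, self.k, self.k)
        # Grouped-conv trick: fold the batch into channels so each sample
        # is convolved with its own generated filters.
        y = F.conv2d(x.view(1, B * self.in_ch, *x.shape[2:]), w,
                     padding=self.k // 2, groups=B)
        return y.view(B, self.out_ch, *x.shape[2:])
```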

LDMGAN: Reducing Mode Collapse in GANs with Latent Distribution Matching

Title LDMGAN: Reducing Mode Collapse in GANs with Latent Distribution Matching
Authors Zhiwen Zuo, Lei Zhao, Huiming Zhang, Qihang Mo, Haibo Chen, Zhizhong Wang, AiLin Li, Lihong Qiu, Wei Xing, Dongming Lu
Abstract Generative Adversarial Networks (GANs) have shown impressive results in modeling distributions over complicated manifolds such as those of natural images. However, GANs often suffer from mode collapse, which means they are prone to characterize only a single or a few modes of the data distribution. In order to address this problem, we propose a novel framework called LDMGAN. We first introduce the Latent Distribution Matching (LDM) constraint, which regularizes the generator by aligning the distribution of generated samples with that of real samples in latent space. To make use of such a latent space, we propose a regularized AutoEncoder (AE) that maps the data distribution to the prior distribution in encoded space. Extensive experiments on synthetic data and real-world datasets show that our proposed framework significantly improves GANs’ stability and diversity.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HygHbTVYPB
PDF https://openreview.net/pdf?id=HygHbTVYPB
PWC https://paperswithcode.com/paper/ldmgan-reducing-mode-collapse-in-gans-with
Repo
Framework
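
A hedged sketch of the LDM constraint as I read the abstract (the released method may use a different statistic; the moment-matching choice here is mine): compare encoded real and generated batches in the regularized autoencoder's latent space and penalize the mismatch as part of the generator loss.

```python
import torch

def ldm_loss(encoder, x_real, x_fake):
    """Simple latent distribution match: first/second moments of encoded batches."""
    z_r, z_f = encoder(x_real), encoder(x_fake)
    mean_term = (z_r.mean(0) - z_f.mean(0)).pow(2).sum()
    var_term = (z_r.var(0) - z_f.var(0)).pow(2).sum()
    return mean_term + var_term  # added to the generator loss as a regularizer
```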

Generalized Natural Language Grounded Navigation via Environment-agnostic Multitask Learning

Title Generalized Natural Language Grounded Navigation via Environment-agnostic Multitask Learning
Authors Anonymous
Abstract Recent research efforts enable the study of natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog. However, existing methods tend to overfit training data in seen environments and fail to generalize well in previously unseen environments. In order to close the gap between seen and unseen environments, we aim at learning a generalizable navigation model from two novel perspectives: (1) we introduce a multitask navigation model that can be seamlessly trained on both Vision-Language Navigation (VLN) and Navigation from Dialog History (NDH) tasks, which benefits from richer natural language guidance and effectively transfers knowledge across tasks; (2) we propose to learn environment-agnostic representations for the navigation policy that are invariant among environments, thus generalizing better on unseen environments. Extensive experiments show that our environment-agnostic multitask navigation model significantly reduces the performance gap between seen and unseen environments and outperforms the baselines on unseen environments by 16% (relative measure on success rate) on VLN and 120% (goal progress) on NDH, establishing a new state of the art for the NDH task.
Tasks Vision-Language Navigation
Published 2020-01-01
URL https://openreview.net/forum?id=HkxzNpNtDS
PDF https://openreview.net/pdf?id=HkxzNpNtDS
PWC https://paperswithcode.com/paper/generalized-natural-language-grounded
Repo
Framework
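
The abstract does not spell out how the environment-agnostic representation is learned; one standard mechanism consistent with it (offered here as an assumption, not the paper's confirmed design) is an environment classifier trained through a gradient-reversal layer, so the shared encoder is pushed to discard environment-specific cues while the VLN and NDH heads train normally.

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()
    @staticmethod
    def backward(ctx, grad_out):
        return -grad_out  # flipped gradient: encoder learns to confuse the env classifier

# Hypothetical usage inside the navigation model:
#   env_logits = env_head(GradReverse.apply(shared_features))
#   loss = nav_loss + lambda_env * cross_entropy(env_logits, env_ids)
```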

Scheduling the Learning Rate Via Hypergradients: New Insights and a New Algorithm

Title Scheduling the Learning Rate Via Hypergradients: New Insights and a New Algorithm
Authors Anonymous
Abstract We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization. This allows us to explicitly search for schedules that achieve good generalization. We describe the structure of the gradient of a validation error w.r.t. the learning rates, the hypergradient, and based on this we introduce a novel online algorithm. Our method adaptively interpolates between two recently proposed techniques (Franceschi et al., 2017; Baydin et al., 2018), featuring increased stability and faster convergence. We show empirically that the proposed technique compares favorably with baselines and related methods in terms of final test accuracy.
Tasks Hyperparameter Optimization
Published 2020-01-01
URL https://openreview.net/forum?id=Ske6qJSKPH
PDF https://openreview.net/pdf?id=Ske6qJSKPH
PWC https://paperswithcode.com/paper/scheduling-the-learning-rate-via
Repo
Framework
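
A minimal sketch of the hypergradient update this line of work builds on (Baydin et al., 2018; variable names are mine): since the SGD step is θ_t = θ_{t-1} − α g_{t-1}, the derivative of the loss at step t with respect to α is −g_t · g_{t-1}, so the learning rate can be adapted online from two consecutive gradients.

```python
import torch

def hypergradient_sgd(params, closure, alpha=0.01, beta=1e-4, steps=100):
    """closure() recomputes the loss; params: list of tensors with requires_grad=True."""
    prev_grads = None
    for _ in range(steps):
        loss = closure()
        grads = torch.autograd.grad(loss, params)
        if prev_grads is not None:
            # d loss / d alpha = -(g_t . g_{t-1}); descending that hypergradient
            # means alpha += beta * (g_t . g_{t-1}).
            h = sum((g * pg).sum() for g, pg in zip(grads, prev_grads))
            alpha = alpha + beta * h.item()
        with torch.no_grad():
            for p, g in zip(params, grads):
                p -= alpha * g
        prev_grads = [g.detach() for g in grads]
    return alpha
```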

Filter redistribution templates for iteration-less convolutional model reduction

Title Filter redistribution templates for iteration-less convolutional model reduction
Authors Anonymous
Abstract Automatic neural network discovery methods face an enormous challenge caused by the size of the search space. A common practice is to split this space at different levels and to explore only a part of it. Neural architecture search methods look for how to combine a subset of the most promising layers to create an architecture, while keeping a predefined number of filters in each layer. Pruning techniques, on the other hand, take a well-known architecture and look for the appropriate number of filters per layer. In both cases the exploration is made iteratively, training models several times during the search. Inspired by the advantages of both approaches, we propose a fast method to find models with improved characteristics. We apply a small set of templates, which are considered promising, to redistribute the number of filters in an already existing neural network. Compared to the initial base models, we find that the resulting architectures, trained from scratch, surpass the original accuracy even after being reduced to fit the same amount of resources.
Tasks Neural Architecture Search
Published 2020-01-01
URL https://openreview.net/forum?id=SkxMjxHYPS
PDF https://openreview.net/pdf?id=SkxMjxHYPS
PWC https://paperswithcode.com/paper/filter-redistribution-templates-for-iteration
Repo
Framework
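
A hedged sketch of the redistribution step (the template shapes and the budget rule are my guesses from the abstract, not the paper's exact procedure): keep the depth, rescale a template of relative filter proportions so the total filter count matches the base model, and train the result from scratch.

```python
def redistribute(filters, template):
    """filters: original per-layer filter counts; template: relative proportions."""
    assert len(filters) == len(template)
    budget = sum(filters)                      # keep the total filter budget fixed
    scale = budget / sum(template)
    return [max(1, round(t * scale)) for t in template]

base = [64, 128, 256, 512]                     # e.g., a VGG-like schedule
uniform = redistribute(base, [1, 1, 1, 1])     # -> [240, 240, 240, 240]
reverse = redistribute(base, [512, 256, 128, 64])  # capacity early instead of late
```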

Extreme Values are Accurate and Robust in Deep Networks

Title Extreme Values are Accurate and Robust in Deep Networks
Authors Anonymous
Abstract Recent evidence shows that convolutional neural networks (CNNs) are biased towards textures, so that CNNs are non-robust to adversarial perturbations over textures, while traditional robust visual features like SIFT (scale-invariant feature transforms) are designed to be robust across a substantial range of affine distortions, addition of noise, etc., mimicking the nature of human perception. This paper aims to leverage the good properties of SIFT to renovate CNN architectures towards better accuracy and robustness. We borrow the scale-space extreme value idea from SIFT and propose EVPNet (extreme value preserving network), which contains three novel components to model the extreme values: (1) parametric differences of Gaussian (DoG) to extract extrema, (2) truncated ReLU to suppress non-stable extrema, and (3) a projected normalization layer (PNL) to mimic PCA-SIFT-like feature normalization. Experiments demonstrate that EVPNets can achieve similar or better accuracy than conventional CNNs, while achieving much better robustness against a set of adversarial attacks (FGSM, PGD, etc.), even without adversarial training.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1gHb1rFwr
PDF https://openreview.net/pdf?id=H1gHb1rFwr
PWC https://paperswithcode.com/paper/extreme-values-are-accurate-and-robust-in
Repo
Framework
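
A hedged sketch of two of EVPNet's three components as the abstract names them (kernel size and parameterization are my choices): a parametric difference-of-Gaussians acting as a band-pass extremum detector, and a truncated ReLU that clamps unstable, overly large activations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def truncated_relu(x, tau=1.0):
    # passes moderate activations, clamps extreme ones that tend to be non-robust
    return torch.clamp(x, min=0.0, max=tau)

class ParametricDoG(nn.Module):
    """Difference of two learnable Gaussian blurs (depthwise), exposing extrema."""
    def __init__(self, channels, k=5):
        super().__init__()
        self.log_sigma1 = nn.Parameter(torch.tensor(0.0))
        self.log_sigma2 = nn.Parameter(torch.tensor(0.5))
        self.k, self.channels = k, channels

    def gaussian_kernel(self, log_sigma):
        sigma = log_sigma.exp()
        ax = torch.arange(self.k, dtype=torch.float32) - self.k // 2
        g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
        g2d = torch.outer(g, g)
        g2d = g2d / g2d.sum()
        return g2d.expand(self.channels, 1, self.k, self.k).contiguous()

    def forward(self, x):  # x: (B, channels, H, W)
        blur1 = F.conv2d(x, self.gaussian_kernel(self.log_sigma1),
                         padding=self.k // 2, groups=self.channels)
        blur2 = F.conv2d(x, self.gaussian_kernel(self.log_sigma2),
                         padding=self.k // 2, groups=self.channels)
        return blur1 - blur2
```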

Scaling Laws for the Principled Design, Initialization, and Preconditioning of ReLU Networks

Title Scaling Laws for the Principled Design, Initialization, and Preconditioning of ReLU Networks
Authors Aaron Defazio, Leon Bottou
Abstract In this work, we describe a set of rules for the design and initialization of well-conditioned neural networks, guided by the goal of naturally balancing the diagonal blocks of the Hessian at the start of training. We show how our measure of conditioning of a block relates to another natural measure of conditioning, the ratio of weight gradients to the weights. We prove that for a ReLU-based deep multilayer perceptron, a simple initialization scheme using the geometric mean of the fan-in and fan-out satisfies our scaling rule. For more sophisticated architectures, we show how our scaling principle can be used to guide design choices to produce well-conditioned neural networks, reducing guess-work.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=BJedt6VKPS
PDF https://openreview.net/pdf?id=BJedt6VKPS
PWC https://paperswithcode.com/paper/scaling-laws-for-the-principled-design-1
Repo
Framework
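
A minimal sketch of the stated rule, assuming a He-style variance convention (that gain factor is my assumption, not spelled out in the abstract): initialize with standard deviation sqrt(2 / sqrt(fan_in * fan_out)), the geometric mean sitting between the usual fan-in and fan-out schemes.

```python
import math
import torch
import torch.nn as nn

def geometric_init_(linear: nn.Linear):
    fan_in, fan_out = linear.in_features, linear.out_features
    fan_gm = math.sqrt(fan_in * fan_out)   # geometric mean of the two fans
    std = math.sqrt(2.0 / fan_gm)          # He-style gain, assumed for ReLU layers
    with torch.no_grad():
        linear.weight.normal_(0.0, std)
        if linear.bias is not None:
            linear.bias.zero_()
```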

Overcoming Catastrophic Forgetting via Hessian-free Curvature Estimates

Title Overcoming Catastrophic Forgetting via Hessian-free Curvature Estimates
Authors Anonymous
Abstract Learning neural networks with gradient descent over a long sequence of tasks is problematic, as fine-tuning to new tasks overwrites the network weights that are important for previous tasks. This leads to poor performance on old tasks, a phenomenon framed as catastrophic forgetting. While early approaches use task rehearsal and growing networks, both of which limit the scalability of the task sequence, orthogonal approaches build on regularization. Based on the Fisher information matrix (FIM), changes to parameters that are relevant to old tasks are penalized, which forces the task to be mapped into the remaining available capacity of the network. This requires calculating the Hessian around a mode, which makes learning tractable. In this paper, we introduce Hessian-free curvature estimates as an alternative to actually calculating the Hessian. In contrast to previous work, we exploit the fact that most regions in the loss surface are flat and hence only calculate a Hessian-vector product around the surface that is relevant for the current task. Our experiments show that on a variety of well-known task sequences we either significantly outperform or are on par with previous work.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1ls_eSKPH
PDF https://openreview.net/pdf?id=H1ls_eSKPH
PWC https://paperswithcode.com/paper/overcoming-catastrophic-forgetting-via
Repo
Framework
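
A minimal sketch of the Hessian-free ingredient: a Hessian-vector product computed by double backprop (Pearlmutter's trick), which yields curvature information along a chosen direction without ever materializing the Hessian.

```python
import torch

def hessian_vector_product(loss, params, vec):
    """Compute H v for the Hessian of `loss` w.r.t. `params`, without forming H.
    `vec` is a list of tensors shaped like `params`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)  # d(g . v)/d params = H v
```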

Provenance detection through learning transformation-resilient watermarking

Title Provenance detection through learning transformation-resilient watermarking
Authors Anonymous
Abstract Advancements in deep generative models have made it possible to synthesize images, videos and audio signals that are hard to distinguish from natural signals, creating opportunities for potential abuse of these capabilities. This motivates the problem of tracking the provenance of signals, i.e., being able to determine the original source of a signal. Watermarking the signal at the time of signal creation is a potential solution, but current techniques are brittle, and watermark detection mechanisms can easily be bypassed by applying post-processing (cropping images, shifting pitch in the audio, etc.). In this paper, we introduce ReSWAT (Resilient Signal Watermarking via Adversarial Training), a framework for learning transformation-resilient watermark detectors that are able to detect a watermark even after a signal has been through several post-processing transformations. Our detection method can be applied to domains with continuous data representations such as images, videos or sound signals. Experiments on watermarking image and audio signals show that our method can reliably detect the provenance of a synthetic signal, even if the signal has been through several post-processing transformations, and improve upon related work in this setting. Furthermore, we show that for specific kinds of transformations (perturbations bounded in the $\ell_2$ norm), we can even get formal guarantees on the ability of our model to detect the watermark. We provide qualitative examples of watermarked image and audio samples in the anonymous code submission link.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=S1gmvyHFDS
PDF https://openreview.net/pdf?id=S1gmvyHFDS
PWC https://paperswithcode.com/paper/provenance-detection-through-learning
Repo
Framework
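
A hedged sketch of the training loop the abstract implies (the transform list and all module names are placeholders, not the paper's): the detector must still recognize the watermark after random post-processing, so transformations are applied inside the loop to both watermarked and clean signals before classification.

```python
import random
import torch
import torchvision.transforms as T

transforms = [T.RandomResizedCrop(224), T.GaussianBlur(5), T.ColorJitter(0.4, 0.4)]

def train_step(detector, embed_watermark, x, optimizer, loss_fn):
    """x: batch of clean images; loss_fn: e.g. nn.BCEWithLogitsLoss()."""
    x_wm = embed_watermark(x)                  # watermark at creation time
    t = random.choice(transforms)              # random post-processing per step
    batch = torch.cat([t(x_wm), t(x)])
    labels = torch.cat([torch.ones(len(x), device=x.device),
                        torch.zeros(len(x), device=x.device)])
    loss = loss_fn(detector(batch).squeeze(1), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```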

Adversarial Training Generalizes Data-dependent Spectral Norm Regularization

Title Adversarial Training Generalizes Data-dependent Spectral Norm Regularization
Authors Anonymous
Abstract We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks. Specifically, we present a data-dependent variant of spectral norm regularization and prove that it is equivalent to adversarial training based on a specific $\ell_2$-norm constrained projected gradient ascent attack. This fundamental connection confirms the long-standing argument that a network’s sensitivity to adversarial examples is tied to its spectral properties and hints at novel ways to robustify and defend against adversarial attacks. We provide extensive empirical evidence to support our theoretical results.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=S1ervgHFwS
PDF https://openreview.net/pdf?id=S1ervgHFwS
PWC https://paperswithcode.com/paper/adversarial-training-generalizes-data-1
Repo
Framework
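
A hedged sketch of the data-dependent quantity at play (the estimator choice is mine; only the abstract's framing is assumed): the local spectral norm of the network's input-output Jacobian at a data point, estimated by power iteration using only Jacobian-vector products.

```python
import torch
from torch.autograd.functional import jvp, vjp

def local_spectral_norm(f, x, iters=10):
    """Estimate the largest singular value of the Jacobian of f at x
    by power iteration on J^T J, never forming J explicitly."""
    u = torch.randn_like(x)
    u = u / u.norm()
    for _ in range(iters):
        _, Ju = jvp(f, x, u)      # J u   (forward-mode product)
        _, JtJu = vjp(f, x, Ju)   # J^T (J u)  (reverse-mode product)
        u = JtJu / (JtJu.norm() + 1e-12)
    _, Ju = jvp(f, x, u)
    return Ju.norm()               # sigma_max estimate at the data point x
```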