April 1, 2020


Paper Group NAWR 4


Generative Adversarial Nets for Multiple Text Corpora


Title	Generative Adversarial Nets for Multiple Text Corpora
Authors	Anonymous
Abstract	Generative adversarial nets (GANs) have been successfully applied to the artificial generation of image data. In terms of text data, much has been done on the artificial generation of natural language from a single corpus. We consider multiple text corpora as the input data, for which there can be two applications of GANs: (1) the creation of consistent cross-corpus word embeddings given different word embeddings per corpus; (2) the generation of robust bag-of-words document embeddings for each corpus. We demonstrate our GAN models on real-world text data sets from different corpora, and show that embeddings from both models lead to improvements in supervised learning problems.
Tasks	Word Embeddings
Published	2020-01-01
URL	https://openreview.net/forum?id=BkexaxBKPB
PDF	https://openreview.net/pdf?id=BkexaxBKPB
PWC	https://paperswithcode.com/paper/generative-adversarial-nets-for-multiple-text-1
Repo	https://github.com/deeplearning2018/emgan
Framework	none

Hyperbolic Image Embeddings


Title	Hyperbolic Image Embeddings
Authors	Anonymous
Abstract	Computer vision tasks such as image classification, image retrieval and few-shot learning are currently dominated by Euclidean and spherical embeddings, so that the final decisions about class belongings or the degree of similarity are made using linear hyperplanes, Euclidean distances, or spherical geodesic distances (cosine similarity). In this work, we demonstrate that in many practical scenarios hyperbolic embeddings provide a better alternative.
Tasks	Few-Shot Learning, Image Classification, Image Retrieval
Published	2020-01-01
URL	https://openreview.net/forum?id=SkgC6yHtvB
PDF	https://openreview.net/pdf?id=SkgC6yHtvB
PWC	https://paperswithcode.com/paper/hyperbolic-image-embeddings-1
Repo	https://github.com/hyperbolic-embeddings/hyperbolic-image-embeddings
Framework	pytorch
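The abstract contrasts Euclidean and cosine similarity with distances in hyperbolic space. As a rough illustration, the sketch below computes the geodesic distance in the Poincaré ball model with curvature -1. The choice of model and curvature is an assumption made here, since the abstract does not say which one the paper uses, and the function name is hypothetical.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points strictly inside the unit ball.

    Assumes the Poincare ball model with curvature -1 (an illustrative
    choice, not confirmed by the abstract).
    """
    def squared_norm(v):
        return sum(c * c for c in v)

    if squared_norm(x) >= 1.0 or squared_norm(y) >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")

    diff = squared_norm([a - b for a, b in zip(x, y)])
    denom = (1.0 - squared_norm(x)) * (1.0 - squared_norm(y))
    return math.acosh(1.0 + 2.0 * diff / denom)
```

Distances grow quickly as points approach the boundary of the ball, which is the property usually cited as making hyperbolic space a good fit for hierarchical data.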

Localised Generative Flows


Title	Localised Generative Flows
Authors	Anonymous
Abstract	We argue that flow-based density models based on continuous bijections are limited in their ability to learn target distributions with complicated topologies, and propose localised generative flows (LGFs) to address this problem. LGFs are composed of stacked continuous mixtures of bijections, which enables each bijection to learn a local region of the target rather than its entirety. Our method is a generalisation of existing flow-based methods, which can be used without modification as the basis for an LGF model. Unlike normalising flows, LGFs do not permit exact computation of log likelihoods, but we propose a simple variational scheme that performs well in practice. We show empirically that LGFs yield improved performance across a variety of common density estimation tasks.
Tasks	Density Estimation, Normalising Flows
Published	2020-01-01
URL	https://openreview.net/forum?id=SyegvgHtwr
PDF	https://openreview.net/pdf?id=SyegvgHtwr
PWC	https://paperswithcode.com/paper/localised-generative-flows
Repo	https://github.com/anonsubmission974/lgf
Framework	pytorch

Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model


Title	Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
Authors	Anonymous
Abstract	Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these kinds of observation spaces present a number of challenges in practice, since the policy must now solve two problems: a representation learning problem, and a task learning problem. In this paper, we aim to explicitly learn representations that can accelerate reinforcement learning from images. We propose the stochastic latent actor-critic (SLAC) algorithm: a sample-efficient and high-performing RL algorithm for learning policies for complex continuous control tasks directly from high-dimensional image inputs. SLAC learns a compact latent representation space using a stochastic sequential latent variable model, and then learns a critic model within this latent space. By learning a critic within a compact state space, SLAC can learn much more efficiently than standard RL methods. The proposed model improves performance substantially over alternative representations as well, such as variational autoencoders. In fact, our experimental evaluation demonstrates that the sample efficiency of our resulting method is comparable to that of model-based RL methods that directly use a similar type of model for control. Furthermore, our method outperforms both model-free and model-based alternatives in terms of final performance and sample efficiency, on a range of difficult image-based control tasks. Our code and videos of our results are available at our website.
Tasks	Continuous Control, Representation Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=HJxDugSFDB
PDF	https://openreview.net/pdf?id=HJxDugSFDB
PWC	https://paperswithcode.com/paper/stochastic-latent-actor-critic-deep-1
Repo	https://github.com/alexlee-gk/slac
Framework	tf

Detection and Classification of Cardiac Arrhythmias by a Challenge-Best Deep Learning Neural Network Model


Title	Detection and Classification of Cardiac Arrhythmias by a Challenge-Best Deep Learning Neural Network Model
Authors	Tsai-Min Chen, Chih-Han Huang, Edward S.C. Shih, Yu-Feng Hu, Ming-Jing Hwang
Abstract	Electrocardiograms (ECGs) are widely used to clinically detect cardiac arrhythmias (CAs). They are also being used to develop computer-assisted methods for heart disease diagnosis. We have developed a convolutional neural network model to detect and classify CAs, using a large 12-lead ECG dataset (6,877 recordings) provided by the China Physiological Signal Challenge (CPSC) 2018. Our model, which was ranked first in the challenge competition, achieved a median overall F1-score of 0.84 for the nine-type CA classification of CPSC2018’s hidden test set of 2,954 ECG recordings. Further analysis showed that concurrent CAs were adequately predictive for 476 patients with multiple types of CA diagnoses in the dataset. Using only single-lead data yielded a performance that was only slightly worse than using the full 12-lead data, with leads aVR and V1 being the most prominent. We extensively consider these results in the context of their agreement with and relevance to clinical observations.
Tasks	Arrhythmia Detection
Published	2020-02-04
URL	https://doi.org/10.1016/j.isci.2020.100886
PDF	https://www.cell.com/action/showPdf?pii=S2589-0042%2820%2930070-5
PWC	https://paperswithcode.com/paper/detection-and-classification-of-cardiac
Repo	https://github.com/ChihHanHuang/The-China-Physiological-Signal-Challenge-2018-champion
Framework	tf

Interpretations are useful: penalizing explanations to align neural networks with prior knowledge


Title	Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
Authors	Anonymous
Abstract	For an explanation of a deep learning model to be effective, it must both provide insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning methods stops at the first step, providing practitioners with insight into a model, but no way to act on it. In this paper, we propose contextual decomposition explanation penalization (CDEP), a method which enables practitioners to leverage existing explanation methods in order to increase the predictive accuracy of deep learning models. In particular, when shown that a model has incorrectly assigned importance to some features, CDEP enables practitioners to correct these errors by directly regularizing the provided explanations. Using explanations provided by contextual decomposition (CD) (Murdoch et al., 2018), we demonstrate the ability of our method to increase performance on an array of toy and real datasets.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=Syx7WyBtwB
PDF	https://openreview.net/pdf?id=Syx7WyBtwB
PWC	https://paperswithcode.com/paper/interpretations-are-useful-penalizing-1
Repo	https://github.com/laura-rieger/deep-explanation-penalization
Framework	pytorch

Network Deconvolution


Title	Network Deconvolution
Authors	Anonymous
Abstract	Convolution is a central operation in Convolutional Neural Networks (CNNs), which applies a kernel to overlapping regions shifted across the image. However, because of the immense amount of correlations in real-world image data, convolutional kernels are in effect re-learning redundant data. In this work, we show that this redundancy has made neural network training challenging, and propose network deconvolution, a procedure which optimally removes pixel-wise and channel-wise correlations before the data is fed into each layer. Network deconvolution can be efficiently calculated at a fraction of the computation cost of a convolution layer. We also show that the deconvolution filters in the first layer of the network resemble the center-surround structure found in biological neurons in the visual regions of the brain. Filtering with such kernels results in a sparse representation, a desired property that has been missing in the training of neural networks. Learning from the sparse representation promotes faster convergence and superior results without the use of batch normalization. We apply our network deconvolution operation to 10 modern neural network models by replacing batch normalization within each. Our extensive experiments show the network deconvolution operation is able to deliver performance improvement in all cases on CIFAR-10, CIFAR-100, MNIST, Fashion-MNIST and ImageNet datasets.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=rkeu30EtvS
PDF	https://openreview.net/pdf?id=rkeu30EtvS
PWC	https://paperswithcode.com/paper/network-deconvolution
Repo	https://github.com/deconvolutionpaper/deconvolution
Framework	pytorch
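The abstract describes removing pixel-wise and channel-wise correlations before data enters each layer. As a much-simplified illustration of decorrelation (not the paper's procedure, which the abstract does not spell out), the sketch below whitens two correlated channels by multiplying the centred data with the inverse square root of their 2x2 covariance matrix. All function names are hypothetical, and the closed-form square root is specific to the 2x2 case.

```python
import math


def covariance_2d(points):
    """Return the mean and the (population) covariance entries of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def inverse_sqrt_2x2(sxx, sxy, syy, eps=1e-8):
    """Inverse square root of a symmetric positive-definite 2x2 matrix.

    Uses the identity sqrt(C) = (C + sqrt(det C) * I) / sqrt(tr C + 2 sqrt(det C)),
    then inverts the result. eps is a small ridge for numerical safety.
    """
    a, b, d = sxx + eps, sxy, syy + eps
    s = math.sqrt(a * d - b * b)
    t = math.sqrt(a + d + 2.0 * s)
    ra, rb, rd = (a + s) / t, b / t, (d + s) / t
    det = ra * rd - rb * rb
    return rd / det, -rb / det, ra / det


def whiten(points):
    """Centre the points and remove the correlation between the two channels."""
    (mx, my), (sxx, sxy, syy) = covariance_2d(points)
    wa, wb, wd = inverse_sqrt_2x2(sxx, sxy, syy)
    return [
        (wa * (x - mx) + wb * (y - my), wb * (x - mx) + wd * (y - my))
        for x, y in points
    ]
```

After whitening, the two channels have approximately unit variance and zero covariance, which is the sense in which redundant (correlated) information has been removed.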

Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving


Title	Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
Authors	Anonymous
Abstract	Detecting objects such as cars and pedestrians in 3D plays an indispensable role in autonomous driving. Existing approaches largely rely on expensive LiDAR sensors for accurate depth information. While recently pseudo-LiDAR has been introduced as a promising alternative, at a much lower cost based solely on stereo images, there is still a notable performance gap. In this paper, we provide substantial advances to the pseudo-LiDAR framework through improvements in stereo depth estimation. Concretely, we adapt the stereo network architecture and loss function to be more aligned with accurate depth estimation of faraway objects — currently the primary weakness of pseudo-LiDAR. Further, we explore the idea to leverage cheaper but extremely sparse LiDAR sensors, which alone provide insufficient information for 3D detection, to de-bias our depth estimation. We propose a depth-propagation algorithm, guided by the initial depth estimates, to diffuse these few exact measurements across the entire depth map. We show on the KITTI object detection benchmark that our combined approach yields substantial improvements in depth estimation and stereo-based 3D object detection — outperforming the previous state-of-the-art detection accuracy for faraway objects by 40%.
Tasks	3D Object Detection, 3D object detection from stereo images, Autonomous Driving, Depth Estimation, Object Detection, Stereo Depth Estimation
Published	2020-01-01
URL	https://openreview.net/forum?id=BJedHRVtPB
PDF	https://openreview.net/pdf?id=BJedHRVtPB
PWC	https://paperswithcode.com/paper/pseudo-lidar-accurate-depth-for-3d-object-1
Repo	https://github.com/mileyan/Pseudo_Lidar_V2
Framework	pytorch

Neural Stored-program Memory


Title	Neural Stored-program Memory
Authors	Anonymous
Abstract	Neural networks powered with external memory simulate computer behaviors. These models, which use the memory to store data for a neural controller, can learn algorithms and other complex tasks. In this paper, we introduce a new memory to store weights for the controller, analogous to the stored-program memory in modern computer architectures. The proposed model, dubbed Neural Stored-program Memory, augments current memory-augmented neural networks, creating differentiable machines that can switch programs through time, adapt to variable contexts and thus fully resemble the Universal Turing Machine or Von Neumann Architecture. A wide range of experiments demonstrate that the resulting machines not only excel in classical algorithmic problems, but also have potential for compositional, continual, few-shot learning and question-answering tasks.
Tasks	Few-Shot Learning, Question Answering
Published	2020-01-01
URL	https://openreview.net/forum?id=rkxxA24FDr
PDF	https://openreview.net/pdf?id=rkxxA24FDr
PWC	https://paperswithcode.com/paper/neural-stored-program-memory-1
Repo	https://github.com/thaihungle/NSM
Framework	pytorch

Zero-Shot Out-of-Distribution Detection with Feature Correlations


Title	Zero-Shot Out-of-Distribution Detection with Feature Correlations
Authors	Anonymous
Abstract	When presented with Out-of-Distribution (OOD) examples, deep neural networks yield confident, incorrect predictions. Detecting OOD examples is challenging, and the potential risks are high. In this paper, we propose to detect OOD examples by identifying inconsistencies between activity patterns and the predicted class. We find that characterizing activity patterns by feature correlations and identifying anomalies in pairwise feature correlation values can yield high OOD detection rates. We identify anomalies in the pairwise feature correlations by simply comparing each pairwise correlation value with its respective range observed over the training data. Unlike many approaches, this can be used with any pre-trained softmax classifier and does not require access to OOD data for fine-tuning hyperparameters, nor does it require OOD access for inferring parameters. The method is applicable across a variety of architectures and vision datasets and generally performs better than or equal to state-of-the-art OOD detection methods, including those that do assume access to OOD examples.
Tasks	Out-of-Distribution Detection
Published	2020-01-01
URL	https://openreview.net/forum?id=r1g6MCEtwr
PDF	https://openreview.net/pdf?id=r1g6MCEtwr
PWC	https://paperswithcode.com/paper/zero-shot-out-of-distribution-detection-with
Repo	https://github.com/zeroshot-ood/ood-detection
Framework	pytorch
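The range check described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: pairwise feature correlations are taken to be plain dot products between channel activation vectors, the deviation is the summed distance outside the training range, and the per-class bookkeeping implied by "class predicted" is omitted. The function names are hypothetical and do not come from the linked repository.

```python
def pairwise_correlations(channels):
    """Dot product of every pair of channels (upper triangle, diagonal included)."""
    n = len(channels)
    return [
        sum(a * b for a, b in zip(channels[i], channels[j]))
        for i in range(n)
        for j in range(i, n)
    ]


def fit_ranges(training_samples):
    """Record the minimum and maximum of each pairwise value over the training data."""
    all_values = [pairwise_correlations(sample) for sample in training_samples]
    lows = [min(column) for column in zip(*all_values)]
    highs = [max(column) for column in zip(*all_values)]
    return lows, highs


def deviation(sample, lows, highs):
    """Total amount by which a sample's pairwise values fall outside the ranges."""
    total = 0.0
    for value, low, high in zip(pairwise_correlations(sample), lows, highs):
        if value < low:
            total += low - value
        elif value > high:
            total += value - high
    return total


# Two training samples, each with two channels of two activations.
train = [[[1, 0], [0, 1]], [[1, 1], [0, 1]]]
lows, highs = fit_ranges(train)
in_range_score = deviation([[1, 0], [0, 1]], lows, highs)
out_of_range_score = deviation([[3, 0], [0, 1]], lows, highs)
```

A sample whose score exceeds a chosen threshold would be flagged as out-of-distribution; how that threshold is set is not covered by this sketch.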

Directional Message Passing for Molecular Graphs


Title	Directional Message Passing for Molecular Graphs
Authors	Anonymous
Abstract	Graph neural networks have recently achieved great successes in predicting quantum mechanical properties of molecules. These models represent a molecule as a graph using only the distance between atoms (nodes) and not the spatial direction from one atom to another. However, directional information plays a central role in empirical potentials for molecules, e.g. in angular potentials. To alleviate this limitation we propose directional message passing, in which we embed the messages passed between atoms instead of the atoms themselves. Each message is associated with a direction in coordinate space. These directional message embeddings are rotationally equivariant since the associated directions rotate with the molecule. We propose a message passing scheme analogous to belief propagation, which uses the directional information by transforming messages based on the angle between them. Additionally, we use spherical Bessel functions to construct a theoretically well-founded, orthogonal radial basis that achieves better performance than the currently prevalent Gaussian radial basis functions while using more than 4x fewer parameters. We leverage these innovations to construct the directional message passing neural network (DimeNet). DimeNet outperforms previous GNNs on average by 77% on MD17 and by 41% on QM9.
Tasks	Drug Discovery, Formation Energy
Published	2020-01-01
URL	https://openreview.net/forum?id=B1eWbxStPH
PDF	https://openreview.net/pdf?id=B1eWbxStPH
PWC	https://paperswithcode.com/paper/directional-message-passing-for-molecular
Repo	https://github.com/klicperajo/dimenet
Framework	tf
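The abstract mentions an orthogonal radial basis built from spherical Bessel functions but does not state it. The sketch below assumes the zeroth-order form e_n(d) = sqrt(2/c) * sin(n*pi*d/c) / d with cutoff c, and checks numerically that these functions are orthonormal under the radial weight d^2 on (0, c]. Both the formula and the helper names are assumptions made for illustration.

```python
import math


def bessel_rbf(d, n, cutoff):
    """Assumed zeroth-order spherical Bessel radial basis function of order n."""
    return math.sqrt(2.0 / cutoff) * math.sin(n * math.pi * d / cutoff) / d


def radial_inner_product(n, m, cutoff, steps=2000):
    """Midpoint-rule estimate of the integral of e_n(d) * e_m(d) * d^2 over (0, cutoff)."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        d = (i + 0.5) * h  # midpoints avoid the removable singularity at d = 0
        total += bessel_rbf(d, n, cutoff) * bessel_rbf(d, m, cutoff) * d * d
    return total * h
```

With a cutoff of 5.0, `radial_inner_product(1, 1, 5.0)` comes out close to 1 and `radial_inner_product(1, 2, 5.0)` close to 0, consistent with an orthonormal basis.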

At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?


Title	At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Authors	Anonymous
Abstract	Background: Recent developments have made it possible to accelerate neural network training significantly using large batch sizes and data parallelism. Training in an asynchronous fashion, where delay occurs, can make training even more scalable. However, asynchronous training has its pitfalls, mainly a degradation in generalization, even after convergence of the algorithm. This gap is still not well understood, as theoretical analysis has so far focused mainly on the convergence rate of asynchronous methods. Contributions: We examine asynchronous training from the perspective of dynamical stability. We find that the degree of delay interacts with the learning rate, to change the set of minima accessible by an asynchronous stochastic gradient descent algorithm. We derive closed-form rules on how the learning rate could be changed, while keeping the accessible set the same. Specifically, for high delay values, we find that the learning rate should be kept inversely proportional to the delay. We then extend this analysis to include momentum. We find momentum should be either turned off, or modified to improve training stability. We provide empirical experiments to validate our theoretical findings.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=Bkeb7lHtvH
PDF	https://openreview.net/pdf?id=Bkeb7lHtvH
PWC	https://paperswithcode.com/paper/at-stabilitys-edge-how-to-adjust
Repo	https://github.com/paper-submissions/delay_stability
Framework	pytorch
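The high-delay rule stated in the abstract (learning rate inversely proportional to delay) can be written as a one-line adjustment. The sketch below only encodes that proportionality by keeping the product of learning rate and delay constant. It does not reproduce the paper's closed-form rules, and the function and parameter names are hypothetical.

```python
def delay_adjusted_lr(base_lr, base_delay, delay):
    """Scale the learning rate so that lr * delay stays constant.

    Intended only for the high-delay regime mentioned in the abstract.
    """
    if base_delay <= 0 or delay <= 0:
        raise ValueError("delays must be positive")
    return base_lr * base_delay / delay


# Doubling the delay from 4 to 8 halves the learning rate.
halved_lr = delay_adjusted_lr(base_lr=0.1, base_delay=4, delay=8)
```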

Deep Mining: Detecting Anomalous Patterns in Neural Network Activations with Subset Scanning


Title	Deep Mining: Detecting Anomalous Patterns in Neural Network Activations with Subset Scanning
Authors	Skyler Speakman, Celia Cintas, Victor Akinwande, Srihari Sridharan, Edward McFowland III
Abstract	This work views neural networks as data generating systems and applies anomalous pattern detection techniques on that data in order to detect when a network is processing a group of anomalous inputs. Detecting anomalies is a critical component for multiple machine learning problems including detecting the presence of adversarial noise added to inputs. More broadly, this work is a step towards giving neural networks the ability to detect groups of out-of-distribution samples. This work introduces Subset Scanning methods from the anomalous pattern detection domain to the task of detecting anomalous inputs to neural networks. Subset Scanning allows us to answer the question: "Which subset of inputs have larger-than-expected activations at which subset of nodes?" Framing the adversarial detection problem this way allows us to identify systematic patterns in the activation space that span multiple adversarially noised images. Such images are "weird together". Leveraging this common anomalous pattern, we show increased detection power as the proportion of noised images increases in a test set. Detection power and accuracy results are provided for targeted adversarial noise added to CIFAR-10 images on a 20-layer ResNet using the Basic Iterative Method attack.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=Skld1aVtPB
PDF	https://openreview.net/pdf?id=Skld1aVtPB
PWC	https://paperswithcode.com/paper/deep-mining-detecting-anomalous-patterns-in
Repo	https://github.com/hikayifix/adversarialdetector
Framework	none

Black-Box Adversarial Attack with Transferable Model-based Embedding


Title	Black-Box Adversarial Attack with Transferable Model-based Embedding
Authors	Anonymous
Abstract	We present a new method for black-box adversarial attack. Unlike previous methods that combined transfer-based and score-based methods by using the gradient or initialization of a surrogate white-box model, this new method tries to learn a low-dimensional embedding using a pretrained model, and then performs efficient search within the embedding space to attack an unknown target network. The method produces adversarial perturbations with high-level semantic patterns that are easily transferable. We show that this approach can greatly improve the query efficiency of black-box adversarial attack across different target network architectures. We evaluate our approach on MNIST, ImageNet and Google Cloud Vision API, resulting in a significant reduction in the number of queries. We also attack adversarially defended networks on CIFAR10 and ImageNet, where our method not only reduces the number of queries, but also improves the attack success rate.
Tasks	Adversarial Attack
Published	2020-01-01
URL	https://openreview.net/forum?id=SJxhNTNYwB
PDF	https://openreview.net/pdf?id=SJxhNTNYwB
PWC	https://paperswithcode.com/paper/black-box-adversarial-attack-with
Repo	https://github.com/TransEmbedBA/TREMBA
Framework	pytorch

N-BEATS: Neural basis expansion analysis for interpretable time series forecasting


Title	N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
Authors	Anonymous
Abstract	We focus on solving the univariate time series point forecasting problem using deep learning. We propose a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. The architecture has a number of desirable properties, being interpretable, applicable without modification to a wide array of target domains, and fast to train. We test the proposed architecture on several well-known datasets, including M3, M4 and TOURISM competition datasets containing time series from diverse domains. We demonstrate state-of-the-art performance for two configurations of N-BEATS for all the datasets, improving forecast accuracy by 11% over a statistical benchmark and by 3% over last year’s winner of the M4 competition, a domain-adjusted hand-crafted hybrid between neural network and statistical time series models. The first configuration of our model does not employ any time-series-specific components and its performance on heterogeneous datasets strongly suggests that, contrary to received wisdom, deep learning primitives such as residual blocks are by themselves sufficient to solve a wide range of forecasting problems. Finally, we demonstrate how the proposed architecture can be augmented to provide outputs that are interpretable without considerable loss in accuracy.
Tasks	Time Series, Time Series Forecasting
Published	2020-01-01
URL	https://openreview.net/forum?id=r1ecqn4YwB
PDF	https://openreview.net/pdf?id=r1ecqn4YwB
PWC	https://paperswithcode.com/paper/n-beats-neural-basis-expansion-analysis-for-1
Repo	https://github.com/amitesh863/nbeats_forecast
Framework	pytorch
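One reading of "backward and forward residual links" is that each block returns a backcast, which is subtracted from the block's input before the next block sees it, and a partial forecast, which is added to a running total. The sketch below shows only that data flow. The blocks are toy stand-ins (a mean-level predictor) rather than trained fully-connected layers, and every name in it is hypothetical.

```python
def run_stack(history, blocks, horizon):
    """Pass a history window through blocks linked by backward/forward residuals."""
    residual = list(history)
    forecast = [0.0] * horizon
    for block in blocks:
        backcast, partial_forecast = block(residual)
        # Backward link: remove what this block explained from its input.
        residual = [r - b for r, b in zip(residual, backcast)]
        # Forward link: accumulate this block's contribution to the forecast.
        forecast = [f + p for f, p in zip(forecast, partial_forecast)]
    return residual, forecast


def mean_block(horizon):
    """Toy block: explains the input by its mean level and forecasts that level."""
    def block(window):
        level = sum(window) / len(window)
        return [level] * len(window), [level] * horizon
    return block


residual, forecast = run_stack(
    [1.0, 2.0, 3.0], [mean_block(2), mean_block(2)], horizon=2
)
```

Here the first block removes the mean level of 2.0, so the second block sees a zero-mean residual and contributes nothing further to the forecast.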