Paper Group NAWR 4
Generative Adversarial Nets for Multiple Text Corpora
Title | Generative Adversarial Nets for Multiple Text Corpora |
Authors | Anonymous |
Abstract | Generative adversarial nets (GANs) have been successfully applied to the artificial generation of image data. In terms of text data, much has been done on the artificial generation of natural language from a single corpus. We consider multiple text corpora as the input data, for which there can be two applications of GANs: (1) the creation of consistent cross-corpus word embeddings given different word embeddings per corpus; (2) the generation of robust bag-of-words document embeddings for each corpus. We demonstrate our GAN models on real-world text data sets from different corpora, and show that embeddings from both models lead to improvements in supervised learning problems. |
Tasks | Word Embeddings |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkexaxBKPB |
PDF | https://openreview.net/pdf?id=BkexaxBKPB |
PWC | https://paperswithcode.com/paper/generative-adversarial-nets-for-multiple-text-1 |
Repo | https://github.com/deeplearning2018/emgan |
Framework | none |
Hyperbolic Image Embeddings
Title | Hyperbolic Image Embeddings |
Authors | Anonymous |
Abstract | Computer vision tasks such as image classification, image retrieval and few-shot learning are currently dominated by Euclidean and spherical embeddings, so that the final decisions about class belongings or the degree of similarity are made using linear hyperplanes, Euclidean distances, or spherical geodesic distances (cosine similarity). In this work, we demonstrate that in many practical scenarios hyperbolic embeddings provide a better alternative. |
Tasks | Few-Shot Learning, Image Classification, Image Retrieval |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SkgC6yHtvB |
PDF | https://openreview.net/pdf?id=SkgC6yHtvB |
PWC | https://paperswithcode.com/paper/hyperbolic-image-embeddings-1 |
Repo | https://github.com/hyperbolic-embeddings/hyperbolic-image-embeddings |
Framework | pytorch |
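To make the contrast with Euclidean distance concrete, here is a minimal sketch of the geodesic distance in the Poincaré ball model of hyperbolic space. It assumes the unit ball with curvature -1; the function name `poincare_distance` is illustrative, and the paper's own layers and curvature settings are not reproduced here.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Assumes curvature -1 (an assumption of this sketch, not taken from the abstract).
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)
```

The same Euclidean gap costs more hyperbolic distance the closer the points sit to the boundary of the ball, which is the property that makes this geometry attractive for hierarchical data.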
Localised Generative Flows
Title | Localised Generative Flows |
Authors | Anonymous |
Abstract | We argue that flow-based density models based on continuous bijections are limited in their ability to learn target distributions with complicated topologies, and propose localised generative flows (LGFs) to address this problem. LGFs are composed of stacked continuous mixtures of bijections, which enables each bijection to learn a local region of the target rather than its entirety. Our method is a generalisation of existing flow-based methods, which can be used without modification as the basis for an LGF model. Unlike normalising flows, LGFs do not permit exact computation of log likelihoods, but we propose a simple variational scheme that performs well in practice. We show empirically that LGFs yield improved performance across a variety of common density estimation tasks. |
Tasks | Density Estimation, Normalising Flows |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SyegvgHtwr |
PDF | https://openreview.net/pdf?id=SyegvgHtwr |
PWC | https://paperswithcode.com/paper/localised-generative-flows |
Repo | https://github.com/anonsubmission974/lgf |
Framework | pytorch |
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
Title | Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model |
Authors | Anonymous |
Abstract | Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these kinds of observation spaces present a number of challenges in practice, since the policy must now solve two problems: a representation learning problem, and a task learning problem. In this paper, we aim to explicitly learn representations that can accelerate reinforcement learning from images. We propose the stochastic latent actor-critic (SLAC) algorithm: a sample-efficient and high-performing RL algorithm for learning policies for complex continuous control tasks directly from high-dimensional image inputs. SLAC learns a compact latent representation space using a stochastic sequential latent variable model, and then learns a critic model within this latent space. By learning a critic within a compact state space, SLAC can learn much more efficiently than standard RL methods. The proposed model improves performance substantially over alternative representations as well, such as variational autoencoders. In fact, our experimental evaluation demonstrates that the sample efficiency of our resulting method is comparable to that of model-based RL methods that directly use a similar type of model for control. Furthermore, our method outperforms both model-free and model-based alternatives in terms of final performance and sample efficiency, on a range of difficult image-based control tasks. Our code and videos of our results are available at our website. |
Tasks | Continuous Control, Representation Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HJxDugSFDB |
PDF | https://openreview.net/pdf?id=HJxDugSFDB |
PWC | https://paperswithcode.com/paper/stochastic-latent-actor-critic-deep-1 |
Repo | https://github.com/alexlee-gk/slac |
Framework | tf |
Detection and Classification of Cardiac Arrhythmias by a Challenge-Best Deep Learning Neural Network Model
Title | Detection and Classification of Cardiac Arrhythmias by a Challenge-Best Deep Learning Neural Network Model |
Authors | Tsai-Min Chen, Chih-Han Huang, Edward S.C. Shih, Yu-Feng Hu, Ming-Jing Hwang |
Abstract | Electrocardiograms (ECGs) are widely used to clinically detect cardiac arrhythmias (CAs). They are also being used to develop computer-assisted methods for heart disease diagnosis. We have developed a convolutional neural network model to detect and classify CAs, using a large 12-lead ECG dataset (6,877 recordings) provided by the China Physiological Signal Challenge (CPSC) 2018. Our model, which was ranked first in the challenge competition, achieved a median overall F1-score of 0.84 for the nine-type CA classification of CPSC2018’s hidden test set of 2,954 ECG recordings. Further analysis showed that concurrent CAs were adequately predictive for 476 patients with multiple types of CA diagnoses in the dataset. Using only single-lead data yielded a performance that was only slightly worse than using the full 12-lead data, with leads aVR and V1 being the most prominent. We extensively consider these results in the context of their agreement with and relevance to clinical observations. |
Tasks | Arrhythmia Detection |
Published | 2020-02-04 |
URL | https://doi.org/10.1016/j.isci.2020.100886 |
PDF | https://www.cell.com/action/showPdf?pii=S2589-0042%2820%2930070-5 |
PWC | https://paperswithcode.com/paper/detection-and-classification-of-cardiac |
Repo | https://github.com/ChihHanHuang/The-China-Physiological-Signal-Challenge-2018-champion |
Framework | tf |
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
Title | Interpretations are useful: penalizing explanations to align neural networks with prior knowledge |
Authors | Anonymous |
Abstract | For an explanation of a deep learning model to be effective, it must provide both insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning methods stops at the first step, providing practitioners with insight into a model, but no way to act on it. In this paper, we propose contextual decomposition explanation penalization (CDEP), a method which enables practitioners to leverage existing explanation methods in order to increase the predictive accuracy of deep learning models. In particular, when shown that a model has incorrectly assigned importance to some features, CDEP enables practitioners to correct these errors by directly regularizing the provided explanations. Using explanations provided by contextual decomposition (CD) (Murdoch et al., 2018), we demonstrate the ability of our method to increase performance on an array of toy and real datasets. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Syx7WyBtwB |
PDF | https://openreview.net/pdf?id=Syx7WyBtwB |
PWC | https://paperswithcode.com/paper/interpretations-are-useful-penalizing-1 |
Repo | https://github.com/laura-rieger/deep-explanation-penalization |
Framework | pytorch |
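The idea of "directly regularizing the provided explanations" can be sketched as a prediction loss plus a penalty on the importance assigned to features that prior knowledge marks as irrelevant. This is a simplified illustration only: the function `penalized_loss`, its arguments, and the use of an absolute-value penalty are assumptions of this sketch, and the attribution scores are taken as given rather than computed with contextual decomposition.

```python
def penalized_loss(prediction_loss, attributions, should_be_irrelevant, strength=1.0):
    """Add an explanation penalty to an ordinary prediction loss.

    attributions: per-feature importance scores from some explanation method.
    should_be_irrelevant: per-feature flags from prior knowledge (True = the
        model should not rely on this feature).
    strength: hypothetical weight trading accuracy against the penalty.
    """
    penalty = sum(
        abs(score)
        for score, flagged in zip(attributions, should_be_irrelevant)
        if flagged
    )
    return prediction_loss + strength * penalty
```

With `strength=0` the objective reduces to the plain prediction loss; larger values push importance away from the flagged features.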
Network Deconvolution
Title | Network Deconvolution |
Authors | Anonymous |
Abstract | Convolution is a central operation in Convolutional Neural Networks (CNNs), which applies a kernel to overlapping regions shifted across the image. However, because of the immense amount of correlations in real-world image data, convolutional kernels are in effect re-learning redundant data. In this work, we show that this redundancy has made neural network training challenging, and propose network deconvolution, a procedure which optimally removes pixel-wise and channel-wise correlations before the data is fed into each layer. Network deconvolution can be efficiently calculated at a fraction of the computation cost of a convolution layer. We also show that the deconvolution filters in the first layer of the network resemble the center-surround structure found in biological neurons in the visual regions of the brain. Filtering with such kernels results in a sparse representation, a desired property that has been missing in the training of neural networks. Learning from the sparse representation promotes faster convergence and superior results without the use of batch normalization. We apply our network deconvolution operation to 10 modern neural network models by replacing batch normalization within each. Our extensive experiments show the network deconvolution operation is able to deliver performance improvement in all cases on CIFAR-10, CIFAR-100, MNIST, Fashion-MNIST and ImageNet datasets. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rkeu30EtvS |
PDF | https://openreview.net/pdf?id=rkeu30EtvS |
PWC | https://paperswithcode.com/paper/network-deconvolution |
Repo | https://github.com/deconvolutionpaper/deconvolution |
Framework | pytorch |
Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
Title | Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving |
Authors | Anonymous |
Abstract | Detecting objects such as cars and pedestrians in 3D plays an indispensable role in autonomous driving. Existing approaches largely rely on expensive LiDAR sensors for accurate depth information. While pseudo-LiDAR has recently been introduced as a promising alternative, at a much lower cost based solely on stereo images, there is still a notable performance gap. In this paper, we provide substantial advances to the pseudo-LiDAR framework through improvements in stereo depth estimation. Concretely, we adapt the stereo network architecture and loss function to be more aligned with accurate depth estimation of faraway objects, currently the primary weakness of pseudo-LiDAR. Further, we explore the idea of leveraging cheaper but extremely sparse LiDAR sensors, which alone provide insufficient information for 3D detection, to de-bias our depth estimation. We propose a depth-propagation algorithm, guided by the initial depth estimates, to diffuse these few exact measurements across the entire depth map. We show on the KITTI object detection benchmark that our combined approach yields substantial improvements in depth estimation and stereo-based 3D object detection, outperforming the previous state-of-the-art detection accuracy for faraway objects by 40%. |
Tasks | 3D Object Detection, 3D object detection from stereo images, Autonomous Driving, Depth Estimation, Object Detection, Stereo Depth Estimation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BJedHRVtPB |
PDF | https://openreview.net/pdf?id=BJedHRVtPB |
PWC | https://paperswithcode.com/paper/pseudo-lidar-accurate-depth-for-3d-object-1 |
Repo | https://github.com/mileyan/Pseudo_Lidar_V2 |
Framework | pytorch |
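The step from stereo images to a LiDAR-like point cloud rests on the standard pinhole stereo relations, sketched below. This is a minimal illustration, not the authors' pipeline: the function names and the camera parameters (`focal_px`, `baseline_m`, `fx`, `fy`, `cx`, `cy`) are assumptions of this sketch.

```python
def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a stereo disparity (pixels) to depth (metres): Z = f * b / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a known depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

Because depth is inversely proportional to disparity, a fixed disparity error translates into a depth error that grows roughly with the square of the depth, which is consistent with the abstract's focus on faraway objects.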
Neural Stored-program Memory
Title | Neural Stored-program Memory |
Authors | Anonymous |
Abstract | Neural networks powered with external memory simulate computer behaviors. These models, which use the memory to store data for a neural controller, can learn algorithms and other complex tasks. In this paper, we introduce a new memory to store weights for the controller, analogous to the stored-program memory in modern computer architectures. The proposed model, dubbed Neural Stored-program Memory, augments current memory-augmented neural networks, creating differentiable machines that can switch programs through time, adapt to variable contexts and thus fully resemble the Universal Turing Machine or Von Neumann Architecture. A wide range of experiments demonstrate that the resulting machines not only excel in classical algorithmic problems, but also have potential for compositional, continual, few-shot learning and question-answering tasks. |
Tasks | Few-Shot Learning, Question Answering |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rkxxA24FDr |
PDF | https://openreview.net/pdf?id=rkxxA24FDr |
PWC | https://paperswithcode.com/paper/neural-stored-program-memory-1 |
Repo | https://github.com/thaihungle/NSM |
Framework | pytorch |
Zero-Shot Out-of-Distribution Detection with Feature Correlations
Title | Zero-Shot Out-of-Distribution Detection with Feature Correlations |
Authors | Anonymous |
Abstract | When presented with Out-of-Distribution (OOD) examples, deep neural networks yield confident, incorrect predictions. Detecting OOD examples is challenging, and the potential risks are high. In this paper, we propose to detect OOD examples by identifying inconsistencies between activity patterns and the predicted class. We find that characterizing activity patterns by feature correlations and identifying anomalies in pairwise feature correlation values can yield high OOD detection rates. We identify anomalies in the pairwise feature correlations by simply comparing each pairwise correlation value with its respective range observed over the training data. Unlike many approaches, this can be used with any pre-trained softmax classifier and does not require access to OOD data for fine-tuning hyperparameters, nor does it require OOD access for inferring parameters. The method is applicable across a variety of architectures and vision datasets and generally performs better than or equal to state-of-the-art OOD detection methods, including those that do assume access to OOD examples. |
Tasks | Out-of-Distribution Detection |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1g6MCEtwr |
PDF | https://openreview.net/pdf?id=r1g6MCEtwr |
PWC | https://paperswithcode.com/paper/zero-shot-out-of-distribution-detection-with |
Repo | https://github.com/zeroshot-ood/ood-detection |
Framework | pytorch |
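The range check described in the abstract can be sketched as follows: record the minimum and maximum of every pairwise feature product over the training data, then score a test input by how far its products fall outside those ranges. This is a simplified illustration under stated assumptions: raw products stand in for the paper's correlation statistics, ranges are not kept per predicted class, and all function names are hypothetical.

```python
from itertools import combinations


def pairwise_products(features):
    """Product of every pair of feature activations for one input."""
    return {
        (i, j): features[i] * features[j]
        for i, j in combinations(range(len(features)), 2)
    }


def fit_ranges(train_features):
    """Track the (min, max) of each pairwise product over the training inputs."""
    ranges = {}
    for feats in train_features:
        for pair, value in pairwise_products(feats).items():
            low, high = ranges.get(pair, (value, value))
            ranges[pair] = (min(low, value), max(high, value))
    return ranges


def deviation(features, ranges):
    """Sum of the amounts by which pairwise products leave their training range."""
    total = 0.0
    for pair, value in pairwise_products(features).items():
        low, high = ranges[pair]
        if value < low:
            total += low - value
        elif value > high:
            total += value - high
    return total
```

A score of zero means every pairwise value lies inside the range seen in training; larger scores suggest the input is out of distribution. No OOD data is needed to fit the ranges.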
Directional Message Passing for Molecular Graphs
Title | Directional Message Passing for Molecular Graphs |
Authors | Anonymous |
Abstract | Graph neural networks have recently achieved great successes in predicting quantum mechanical properties of molecules. These models represent a molecule as a graph using only the distance between atoms (nodes) and not the spatial direction from one atom to another. However, directional information plays a central role in empirical potentials for molecules, e.g. in angular potentials. To alleviate this limitation we propose directional message passing, in which we embed the messages passed between atoms instead of the atoms themselves. Each message is associated with a direction in coordinate space. These directional message embeddings are rotationally equivariant since the associated directions rotate with the molecule. We propose a message passing scheme analogous to belief propagation, which uses the directional information by transforming messages based on the angle between them. Additionally, we use spherical Bessel functions to construct a theoretically well-founded, orthogonal radial basis that achieves better performance than the currently prevalent Gaussian radial basis functions while using more than 4x fewer parameters. We leverage these innovations to construct the directional message passing neural network (DimeNet). DimeNet outperforms previous GNNs on average by 77% on MD17 and by 41% on QM9. |
Tasks | Drug Discovery, Formation Energy |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=B1eWbxStPH |
PDF | https://openreview.net/pdf?id=B1eWbxStPH |
PWC | https://paperswithcode.com/paper/directional-message-passing-for-molecular |
Repo | https://github.com/klicperajo/dimenet |
Framework | tf |
At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Title | At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks? |
Authors | Anonymous |
Abstract | Background: Recent developments have made it possible to accelerate neural network training significantly using large batch sizes and data parallelism. Training in an asynchronous fashion, where delay occurs, can make training even more scalable. However, asynchronous training has its pitfalls, mainly a degradation in generalization, even after convergence of the algorithm. This gap remains poorly understood, as theoretical analysis has so far mainly focused on the convergence rate of asynchronous methods. Contributions: We examine asynchronous training from the perspective of dynamical stability. We find that the degree of delay interacts with the learning rate, to change the set of minima accessible by an asynchronous stochastic gradient descent algorithm. We derive closed-form rules on how the learning rate could be changed, while keeping the accessible set the same. Specifically, for high delay values, we find that the learning rate should be kept inversely proportional to the delay. We then extend this analysis to include momentum. We find momentum should be either turned off, or modified to improve training stability. We provide empirical experiments to validate our theoretical findings. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Bkeb7lHtvH |
PDF | https://openreview.net/pdf?id=Bkeb7lHtvH |
PWC | https://paperswithcode.com/paper/at-stabilitys-edge-how-to-adjust |
Repo | https://github.com/paper-submissions/delay_stability |
Framework | pytorch |
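The high-delay rule stated in the abstract (learning rate inversely proportional to the delay) can be written as a one-line scaling helper. This sketch covers only that proportionality: the paper's full closed-form rules are not given in the abstract, and the reference pair `base_lr` / `base_delay` is a hypothetical calibration point chosen by the user.

```python
def scale_learning_rate(base_lr, base_delay, delay):
    """Rescale a learning rate so that lr * delay stays constant.

    base_lr, base_delay: a hypothetical reference setting known to train well.
    delay: the new gradient delay (in steps) of the asynchronous setup.
    """
    if base_delay <= 0 or delay <= 0:
        raise ValueError("delays must be positive")
    return base_lr * base_delay / delay
```

For example, a setting tuned at a delay of 1 step with learning rate 0.1 would be scaled to 0.025 at a delay of 4 steps.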
Deep Mining: Detecting Anomalous Patterns in Neural Network Activations with Subset Scanning
Title | Deep Mining: Detecting Anomalous Patterns in Neural Network Activations with Subset Scanning |
Authors | Skyler Speakman, Celia Cintas, Victor Akinwande, Srihari Sridharan, Edward McFowland III |
Abstract | This work views neural networks as data generating systems and applies anomalous pattern detection techniques on that data in order to detect when a network is processing a group of anomalous inputs. Detecting anomalies is a critical component for multiple machine learning problems including detecting the presence of adversarial noise added to inputs. More broadly, this work is a step towards giving neural networks the ability to detect groups of out-of-distribution samples. This work introduces Subset Scanning methods from the anomalous pattern detection domain to the task of detecting anomalous inputs to neural networks. Subset Scanning allows us to answer the question: “Which subset of inputs have larger-than-expected activations at which subset of nodes?” Framing the adversarial detection problem this way allows us to identify systematic patterns in the activation space that span multiple adversarially noised images. Such images are “weird together”. Leveraging this common anomalous pattern, we show increased detection power as the proportion of noised images increases in a test set. Detection power and accuracy results are provided for targeted adversarial noise added to CIFAR-10 images on a 20-layer ResNet using the Basic Iterative Method attack. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Skld1aVtPB |
PDF | https://openreview.net/pdf?id=Skld1aVtPB |
PWC | https://paperswithcode.com/paper/deep-mining-detecting-anomalous-patterns-in |
Repo | https://github.com/hikayifix/adversarialdetector |
Framework | none |
Black-Box Adversarial Attack with Transferable Model-based Embedding
Title | Black-Box Adversarial Attack with Transferable Model-based Embedding |
Authors | Anonymous |
Abstract | We present a new method for black-box adversarial attack. Unlike previous methods that combined transfer-based and score-based methods by using the gradient or initialization of a surrogate white-box model, this new method tries to learn a low-dimensional embedding using a pretrained model, and then performs efficient search within the embedding space to attack an unknown target network. The method produces adversarial perturbations with high-level semantic patterns that are easily transferable. We show that this approach can greatly improve the query efficiency of black-box adversarial attack across different target network architectures. We evaluate our approach on MNIST, ImageNet and Google Cloud Vision API, resulting in a significant reduction in the number of queries. We also attack adversarially defended networks on CIFAR10 and ImageNet, where our method not only reduces the number of queries, but also improves the attack success rate. |
Tasks | Adversarial Attack |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJxhNTNYwB |
PDF | https://openreview.net/pdf?id=SJxhNTNYwB |
PWC | https://paperswithcode.com/paper/black-box-adversarial-attack-with |
Repo | https://github.com/TransEmbedBA/TREMBA |
Framework | pytorch |
N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
Title | N-BEATS: Neural basis expansion analysis for interpretable time series forecasting |
Authors | Anonymous |
Abstract | We focus on solving the univariate time series point forecasting problem using deep learning. We propose a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. The architecture has a number of desirable properties, being interpretable, applicable without modification to a wide array of target domains, and fast to train. We test the proposed architecture on several well-known datasets, including M3, M4 and TOURISM competition datasets containing time series from diverse domains. We demonstrate state-of-the-art performance for two configurations of N-BEATS for all the datasets, improving forecast accuracy by 11% over a statistical benchmark and by 3% over last year’s winner of the M4 competition, a domain-adjusted hand-crafted hybrid between neural network and statistical time series models. The first configuration of our model does not employ any time-series-specific components and its performance on heterogeneous datasets strongly suggests that, contrary to received wisdom, deep learning primitives such as residual blocks are by themselves sufficient to solve a wide range of forecasting problems. Finally, we demonstrate how the proposed architecture can be augmented to provide outputs that are interpretable without considerable loss in accuracy. |
Tasks | Time Series, Time Series Forecasting |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1ecqn4YwB |
PDF | https://openreview.net/pdf?id=r1ecqn4YwB |
PWC | https://paperswithcode.com/paper/n-beats-neural-basis-expansion-analysis-for-1 |
Repo | https://github.com/amitesh863/nbeats_forecast |
Framework | pytorch |
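The "backward and forward residual links" mentioned in the abstract can be illustrated with a small stacking loop: each block explains part of the input window (its backcast), passes the unexplained remainder to the next block, and contributes a partial forecast to a running sum. This is a sketch under stated assumptions: `run_stack` and `make_mean_block` are hypothetical names, and the toy mean block stands in for the fully-connected blocks of the actual architecture.

```python
def make_mean_block(horizon):
    """Toy block: backcast and forecast are both the mean of its input window."""
    def block(window):
        mean = sum(window) / len(window)
        return [mean] * len(window), [mean] * horizon
    return block


def run_stack(blocks, history, horizon):
    """Chain blocks with backward (residual) and forward (forecast) links."""
    residual = list(history)
    forecast = [0.0] * horizon
    for block in blocks:
        backcast, partial_forecast = block(residual)
        # Backward link: remove what this block explained from the input.
        residual = [r - b for r, b in zip(residual, backcast)]
        # Forward link: accumulate this block's contribution to the forecast.
        forecast = [f + p for f, p in zip(forecast, partial_forecast)]
    return forecast, residual
```

Running two mean blocks on the history `[1.0, 2.0, 3.0]` with a horizon of 2 gives a forecast of `[2.0, 2.0]`: the first block removes the mean, so the second block sees a zero-mean residual and adds nothing.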