Paper Group ANR 77
Towards the Use of Neural Networks for Influenza Prediction at Multiple Spatial Resolutions
Title | Towards the Use of Neural Networks for Influenza Prediction at Multiple Spatial Resolutions |
Authors | Emily L. Aiken, Andre T. Nguyen, Mauricio Santillana |
Abstract | We introduce the use of a Gated Recurrent Unit (GRU) for influenza prediction at the state- and city-level in the US, and experiment with the inclusion of real-time flu-related Internet search data. We find that a GRU has lower prediction error than current state-of-the-art methods for data-driven influenza prediction at time horizons of over two weeks. In contrast with other machine learning approaches, the inclusion of real-time Internet search data does not improve GRU predictions. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02673v2 |
PDF | https://arxiv.org/pdf/1911.02673v2.pdf |
PWC | https://paperswithcode.com/paper/towards-the-use-of-neural-networks-for |
Repo | |
Framework | |
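As a rough illustration of the model class described above, here is a minimal GRU forecaster for weekly flu activity. The feature layout, hidden size, and two-week horizon are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of a GRU forecaster for weekly influenza-like-illness (ILI)
# rates, in the spirit of the paper. Input layout and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    def __init__(self, n_features: int, hidden_size: int = 32, horizon: int = 2):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, horizon)  # predict `horizon` weeks ahead

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, weeks, n_features); features could be past ILI rates
        # plus, optionally, real-time Internet-search signals.
        _, h_n = self.gru(x)          # h_n: (1, batch, hidden_size)
        return self.head(h_n[-1])     # (batch, horizon)

model = GRUForecaster(n_features=2)   # e.g. [ILI rate, search volume]
x = torch.randn(8, 52, 2)             # one year of weekly history per series
y_hat = model(x)                      # two-week-ahead forecasts
```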
Spatiotemporal Local Propagation
Title | Spatiotemporal Local Propagation |
Authors | Alessandro Betti, Marco Gori |
Abstract | This paper proposes an in-depth rethinking of neural computation that parallels apparently unrelated laws of physics formulated in the variational framework of the least action principle. The theory holds for neural networks based on any digraph, and the resulting computational scheme exhibits the intriguing property of being truly biologically plausible. The scheme, referred to as SpatioTemporal Local Propagation (STLP), is local in both space and time. Space locality comes from expressing the network connections by an appropriate Lagrangian term, so that the corresponding computational scheme does not need backpropagation (BP) of the error, while temporal locality is the outcome of the variational formulation of the problem. Overall, in addition to achieving the often-invoked biological plausibility that BP lacks, the locality in both space and time that arises from the proposed theory can be exhibited neither by Backpropagation Through Time (BPTT) nor by Real-Time Recurrent Learning (RTRL). |
Tasks | |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05106v1 |
PDF | https://arxiv.org/pdf/1907.05106v1.pdf |
PWC | https://paperswithcode.com/paper/spatiotemporal-local-propagation |
Repo | |
Framework | |
Siamese recurrent networks learn first-order logic reasoning and exhibit zero-shot compositional generalization
Title | Siamese recurrent networks learn first-order logic reasoning and exhibit zero-shot compositional generalization |
Authors | Mathijs Mul, Willem Zuidema |
Abstract | Can neural nets learn logic? We approach this classic question with current methods, and demonstrate that recurrent neural networks can learn to recognize first-order logical entailment relations between expressions. We define an artificial language in first-order predicate logic, generate a large dataset of sample ‘sentences’, and use an automatic theorem prover to infer the relation between random pairs of such sentences. We describe a Siamese neural architecture trained to predict the logical relation, and experiment with recurrent and recursive networks. Siamese Recurrent Networks are surprisingly successful at the entailment recognition task, reaching near-perfect performance on novel sentences (consisting of known words), and even outperforming recursive networks. We report a series of experiments to test the ability of the models to perform compositional generalization. In particular, we study how they deal with sentences of unseen length, and sentences containing unseen words. We show that set-ups using LSTMs and GRUs obtain high scores on these tests, demonstrating a form of compositionality. |
Tasks | |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00180v1 |
PDF | https://arxiv.org/pdf/1906.00180v1.pdf |
PWC | https://paperswithcode.com/paper/190600180 |
Repo | |
Framework | |
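A minimal sketch of a Siamese recurrent entailment classifier in the spirit of the paper: one shared LSTM encodes both sentences, and a small feed-forward head predicts the logical relation. Vocabulary size, dimensions, and the number of relation classes are illustrative assumptions.

```python
# Siamese LSTM sketch: both sentences pass through the same encoder, and the
# pair of sentence vectors (plus element-wise interactions) is classified.
import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, vocab_size=100, embed_dim=32, hidden=64, n_relations=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True)  # shared weights
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden, 128), nn.ReLU(), nn.Linear(128, n_relations)
        )

    def encode(self, tokens):
        _, (h_n, _) = self.encoder(self.embed(tokens))
        return h_n[-1]                              # (batch, hidden)

    def forward(self, left, right):
        a, b = self.encode(left), self.encode(right)
        pair = torch.cat([a, b, a * b, torch.abs(a - b)], dim=-1)
        return self.classifier(pair)                # relation logits

logits = SiameseLSTM()(torch.randint(0, 100, (4, 12)),
                       torch.randint(0, 100, (4, 12)))
```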
Gland Segmentation in Histopathological Images by Deep Neural Network
Title | Gland Segmentation in Histopathological Images by Deep Neural Network |
Authors | Safiye Rezaei, Ali Emami, Nader Karimi, Shadrokh Samavi |
Abstract | Histology is vital in the diagnosis and prognosis of cancers and many other diseases. To analyze histopathological images, we need to detect and segment all gland structures. These images are very challenging, and the segmentation task is difficult even for specialists. Segmentation of glands determines the grade of cancers such as colon, breast, and prostate cancer. Given that deep neural networks have achieved high performance on medical images, we propose a method based on the LinkNet network for gland segmentation. We investigate the effects of using different loss functions. Using the Warwick-QU dataset, which contains two test sets and one training set, we show that our approach is comparable to state-of-the-art methods. Finally, we show that enhancing the gland edges and using hematoxylin components can improve the performance of the proposed model. |
Tasks | |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00909v1 |
PDF | https://arxiv.org/pdf/1911.00909v1.pdf |
PWC | https://paperswithcode.com/paper/gland-segmentation-in-histopathological |
Repo | |
Framework | |
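The abstract highlights the effect of different loss functions. A common choice for gland segmentation is a weighted combination of binary cross-entropy and Dice loss; the sketch below is a generic implementation of that idea, not the authors' exact formulation.

```python
# Combined BCE + Dice loss sketch for binary gland masks. The 0.5 weighting
# is an assumed default, not a value from the paper.
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    # pred: predicted probabilities in [0, 1]; target: binary gland mask.
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def combined_loss(logits, target, w_dice=0.5):
    probs = torch.sigmoid(logits)
    bce = torch.nn.functional.binary_cross_entropy_with_logits(logits, target)
    return (1 - w_dice) * bce + w_dice * dice_loss(probs, target)
```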
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
Title | Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation |
Authors | Greg Yang |
Abstract | Several recent trends in machine learning theory and practice, from the design of state-of-the-art Gaussian Processes to the convergence analysis of deep neural nets (DNNs) under stochastic gradient descent (SGD), have found it fruitful to study wide random neural networks. Central to these approaches are certain scaling limits of such networks. We unify these results by introducing a notion of a straightline *tensor program* that can express most neural network computations, and we characterize its scaling limit when its tensors are large and randomized. From our framework follows (1) the convergence of random neural networks to Gaussian processes for architectures such as recurrent neural networks, convolutional neural networks, residual networks, attention, and any combination thereof, with or without batch normalization; (2) conditions under which the *gradient independence assumption* – that weights in backpropagation can be assumed to be independent from weights in the forward pass – leads to correct computation of gradient dynamics, and corrections when it does not; (3) the convergence of the Neural Tangent Kernel, a recently proposed kernel used to predict training dynamics of neural networks under gradient descent, at initialization for all architectures in (1) without batch normalization. Mathematically, our framework is general enough to rederive classical random matrix results such as the semicircle and the Marchenko-Pastur laws, as well as recent results in neural network Jacobian singular values. We hope our work opens a way toward design of even stronger Gaussian Processes, initialization schemes to avoid gradient explosion/vanishing, and deeper understanding of SGD dynamics in modern architectures. |
Tasks | Gaussian Processes |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.04760v2 |
PDF | http://arxiv.org/pdf/1902.04760v2.pdf |
PWC | https://paperswithcode.com/paper/scaling-limits-of-wide-neural-networks-with |
Repo | |
Framework | |
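The Neural Tangent Kernel whose convergence the paper derives is, at initialization, the Gram matrix of parameter gradients, K(x, x') = ⟨∂f(x)/∂θ, ∂f(x')/∂θ⟩. The finite-width empirical version can be computed directly with autograd; the tiny MLP here is only for illustration.

```python
# Empirical NTK entry for a small randomly initialized MLP.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))
params = [p for p in net.parameters() if p.requires_grad]

def grad_vector(x):
    # Flattened gradient of the scalar network output w.r.t. all parameters.
    grads = torch.autograd.grad(net(x).squeeze(), params)
    return torch.cat([g.reshape(-1) for g in grads])

x1, x2 = torch.randn(1, 3), torch.randn(1, 3)
ntk_12 = grad_vector(x1) @ grad_vector(x2)   # scalar kernel entry K(x1, x2)
```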
Learning Representations in Reinforcement Learning:An Information Bottleneck Approach
Title | Learning Representations in Reinforcement Learning:An Information Bottleneck Approach |
Authors | Pei Yingjun, Hou Xinwen |
Abstract | The information bottleneck principle is an elegant and useful approach to representation learning. In this paper, we investigate the problem of representation learning in the context of reinforcement learning using the information bottleneck framework, aiming at improving the sample efficiency of the learning algorithms. We analytically derive the optimal conditional distribution of the representation and provide a variational lower bound. Then, we maximize this lower bound with the Stein variational (SV) gradient method. We incorporate this framework into the advantage actor-critic algorithm (A2C) and the proximal policy optimization algorithm (PPO). Our experimental results show that our framework can significantly improve the sample efficiency of vanilla A2C and PPO. Finally, we study the information bottleneck (IB) perspective in deep RL with the algorithm called mutual information neural estimation (MINE). We experimentally verify that the information extraction-compression process also exists in deep RL, and our framework is capable of accelerating this process. We also analyze the relationship between MINE and our method; through this relationship, we theoretically derive an algorithm to optimize our IB framework without constructing the lower bound. |
Tasks | Representation Learning |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.05695v1 |
PDF | https://arxiv.org/pdf/1911.05695v1.pdf |
PWC | https://paperswithcode.com/paper/learning-representations-in-reinforcement |
Repo | |
Framework | |
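The paper's objective is optimized with Stein variational gradients; as a simpler stand-in, the sketch below shows a standard variational information bottleneck regularizer (a stochastic encoder with a KL penalty toward a unit Gaussian prior) that could be attached to an A2C or PPO loss. The Stein variational machinery itself is omitted.

```python
# Standard variational IB regularizer as a stand-in for the paper's scheme.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, obs_dim, z_dim=16):
        super().__init__()
        self.mu = nn.Linear(obs_dim, z_dim)
        self.log_std = nn.Linear(obs_dim, z_dim)

    def forward(self, obs):
        mu, std = self.mu(obs), self.log_std(obs).exp()
        z = mu + std * torch.randn_like(std)          # reparameterized sample
        # KL( N(mu, std) || N(0, I) ), summed over latent dims.
        kl = 0.5 * (mu**2 + std**2 - 2 * std.log() - 1).sum(-1)
        return z, kl

enc = StochasticEncoder(obs_dim=8)
z, kl = enc(torch.randn(32, 8))
beta = 1e-3                       # bottleneck strength (assumed hyperparameter)
# total_loss = a2c_loss(policy(z), value(z)) + beta * kl.mean()
```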
Graph Markov Network for Traffic Forecasting with Missing Data
Title | Graph Markov Network for Traffic Forecasting with Missing Data |
Authors | Zhiyong Cui, Longfei Lin, Ziyuan Pu, Yinhai Wang |
Abstract | Traffic forecasting is a classical task for traffic management and it plays an important role in intelligent transportation systems. However, since traffic data are mostly collected by traffic sensors or probe vehicles, sensor failures and the lack of probe vehicles will inevitably result in missing values in the collected raw data for some specific links in the traffic network. Although missing values can be imputed, existing data imputation methods normally need long-term historical traffic state data. As for short-term traffic forecasting, especially under edge computing and online prediction scenarios, traffic forecasting models with the capability of handling missing values are needed. In this study, we consider the traffic network as a graph and define the transition between network-wide traffic states at consecutive time steps as a graph Markov process. In this way, missing traffic states can be inferred step by step and the spatial-temporal relationships among the roadway links can be incorporated. Based on the graph Markov process, we propose a new neural network architecture for spatial-temporal data forecasting, i.e., the graph Markov network (GMN). By incorporating the spectral graph convolution operation, we also propose a spectral graph Markov network (SGMN). The proposed models are compared with baseline models and tested on three real-world traffic state datasets with various missing rates. Experimental results show that the proposed GMN and SGMN achieve superior prediction performance in terms of both accuracy and efficiency. In addition, the proposed models’ parameters, weights, and predicted results are comprehensively analyzed and visualized. |
Tasks | Imputation |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.05457v1 |
PDF | https://arxiv.org/pdf/1912.05457v1.pdf |
PWC | https://paperswithcode.com/paper/graph-markov-network-for-traffic-forecasting |
Repo | |
Framework | |
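A schematic reading of the graph Markov idea in the abstract: the next network-wide traffic state is a learned, adjacency-masked linear transition of the current state, and missing readings are filled step by step with the model's own inferences. This is a sketch of the concept, not the released GMN/SGMN code.

```python
# One graph Markov step: transition the previous state along road links, then
# keep observed sensor readings and fill the missing ones with the prediction.
import torch
import torch.nn as nn

class GraphMarkovStep(nn.Module):
    def __init__(self, adj: torch.Tensor):
        super().__init__()
        self.register_buffer("adj", adj)                 # (N, N) 0/1 road adjacency
        self.weight = nn.Parameter(0.01 * torch.randn_like(adj))

    def forward(self, x_prev, obs, mask):
        # x_prev: (N,) previous fully inferred state; obs: (N,) current
        # readings; mask: 1 where observed, 0 where missing.
        x_pred = (self.weight * self.adj) @ x_prev       # graph Markov transition
        return mask * obs + (1 - mask) * x_pred          # fill missing readings

N = 5
adj = torch.eye(N) + torch.diag(torch.ones(N - 1), 1)   # toy chain of road links
step = GraphMarkovStep(adj)
state = step(torch.rand(N), torch.rand(N), torch.tensor([1., 1., 0., 1., 0.]))
```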
RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving
Title | RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving |
Authors | Jean Marie Uwabeza Vianney, Shubhra Aich, Bingbing Liu |
Abstract | In this paper, we address the ambiguities that arise from the astoundingly high density of raw PseudoLiDAR in monocular 3D object detection for autonomous driving. Without much computational overhead, we propose a supervised and an unsupervised sparsification scheme for PseudoLiDAR prior to 3D detection. Both strategies help the standard 3D detector outperform the raw PseudoLiDAR baseline using only ~5% of its points on the KITTI object detection benchmark, thus making our monocular framework and LiDAR-based counterparts computationally equivalent (Figure 1). Moreover, our architecture-agnostic refinements provide state-of-the-art results on the KITTI3D test set for the “Car” and “Pedestrian” categories, with a 54% relative improvement for “Pedestrian”. Finally, exploratory analysis is performed on the discrepancy between monocular and LiDAR-based 3D detection frameworks to guide future endeavours. |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09712v1 |
PDF | https://arxiv.org/pdf/1911.09712v1.pdf |
PWC | https://paperswithcode.com/paper/refinedmpl-refined-monocular-pseudolidar-for |
Repo | |
Framework | |
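The paper keeps only ~5% of the raw PseudoLiDAR points before detection. Its actual supervised and unsupervised schemes are more involved; as a generic stand-in, the sketch below uses farthest point sampling, a standard way to retain a small, well-spread subset of a point cloud.

```python
# Farthest point sampling (FPS): a generic point-cloud sparsifier, used here
# only to illustrate the idea of pre-detection sparsification.
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    # points: (N, 3) pseudo-LiDAR points; returns indices of k kept points.
    n = len(points)
    chosen = [np.random.randint(n)]
    dist = np.full(n, np.inf)
    for _ in range(k - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(dist.argmax()))
    return np.array(chosen)

cloud = np.random.rand(10_000, 3)
kept = cloud[farthest_point_sampling(cloud, k=500)]   # ~5% of the points
```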
Don’t Blame the ELBO! A Linear VAE Perspective on Posterior Collapse
Title | Don’t Blame the ELBO! A Linear VAE Perspective on Posterior Collapse |
Authors | James Lucas, George Tucker, Roger Grosse, Mohammad Norouzi |
Abstract | Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior distribution closely matches the prior for a subset of latent variables. This paper presents a simple and intuitive explanation for posterior collapse through the analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We explain how posterior collapse may occur in pPCA due to local maxima in the log marginal likelihood. Unexpectedly, we prove that the ELBO objective for the linear VAE does not introduce additional spurious local maxima relative to log marginal likelihood. We show further that training a linear VAE with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions. Empirically, we find that our linear analysis is predictive even for high-capacity, non-linear VAEs and helps explain the relationship between the observation noise, local maxima, and posterior collapse in deep Gaussian VAEs. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02469v1 |
PDF | https://arxiv.org/pdf/1911.02469v1.pdf |
PWC | https://paperswithcode.com/paper/dont-blame-the-elbo-a-linear-vae-perspective |
Repo | |
Framework | |
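The analysis centers on linear VAEs, whose exact-inference training recovers the principal component directions. A minimal linear VAE in that spirit follows; the dimensions and the data-independent posterior variance are illustrative choices, not the paper's exact setup.

```python
# Linear VAE: linear encoder mean, data-independent diagonal posterior
# variance, linear Gaussian decoder. Returns the negative ELBO (up to constants).
import torch
import torch.nn as nn

class LinearVAE(nn.Module):
    def __init__(self, x_dim=10, z_dim=2):
        super().__init__()
        self.enc_mu = nn.Linear(x_dim, z_dim)
        self.enc_logvar = nn.Parameter(torch.zeros(z_dim))  # data-independent variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, var = self.enc_mu(x), self.enc_logvar.exp()
        z = mu + var.sqrt() * torch.randn_like(mu)           # reparameterization
        recon = self.dec(z)
        kl = 0.5 * (mu**2 + var - 1 - var.log()).sum(-1)     # KL to N(0, I)
        nll = 0.5 * ((x - recon) ** 2).sum(-1)               # unit observation noise
        return (nll + kl).mean()

loss = LinearVAE()(torch.randn(64, 10))
```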
Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks
Title | Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks |
Authors | Ka-Ho Chow, Wenqi Wei, Yanzhao Wu, Ling Liu |
Abstract | Deep neural networks (DNNs) have demonstrated impressive performance on many challenging machine learning tasks. However, DNNs are vulnerable to adversarial inputs generated by adding maliciously crafted perturbations to benign inputs. As a growing number of attacks have been reported to generate adversarial inputs of varying sophistication, the defense-attack arms race has accelerated. In this paper, we present MODEF, a cross-layer model diversity ensemble framework. MODEF intelligently combines an unsupervised model denoising ensemble with a supervised model verification ensemble by quantifying model diversity, aiming to boost the robustness of the target model against adversarial examples. Evaluated using eleven representative attacks on popular benchmark datasets, we show that MODEF achieves remarkable defense success rates compared with existing defense methods, and provides a superior capability of repairing adversarial inputs and making correct predictions with high accuracy in the presence of black-box attacks. |
Tasks | Denoising |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07667v2 |
PDF | https://arxiv.org/pdf/1908.07667v2.pdf |
PWC | https://paperswithcode.com/paper/190807667 |
Repo | |
Framework | |
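A skeleton of the MODEF idea, with all model internals and the agreement rule as placeholder assumptions: denoise the input with an ensemble, then accept the target model's prediction only when diverse verifiers agree.

```python
# Cross-layer ensemble skeleton: denoising ensemble + verification ensemble.
# The majority-vote threshold and model choices are placeholder assumptions.
import torch

def modef_predict(x, denoisers, verifiers, target_model):
    # Denoising ensemble: average several denoised views of the input.
    x_clean = torch.stack([d(x) for d in denoisers]).mean(0)
    y = target_model(x_clean).argmax(-1)
    # Verification ensemble: diverse models vote on the denoised input.
    votes = torch.stack([v(x_clean).argmax(-1) for v in verifiers])
    agree = (votes == y).float().mean(0)
    # Accept the prediction only where the ensemble agrees; flag the rest.
    return y, agree > 0.5
```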
Improved histogram-based anomaly detector with the extended principal component features
Title | Improved histogram-based anomaly detector with the extended principal component features |
Authors | Sunil Aryal, Arbind Agrahari Baniya, KC Santosh |
Abstract | In this era of big data, databases are growing rapidly in terms of the number of records. Fast automatic detection of anomalous records in these massive databases is a challenging task. Traditional distance-based anomaly detectors are not applicable to these massive datasets. Recently, a simple but extremely fast anomaly detector using one-dimensional histograms has been introduced. The anomaly score of a data instance is computed as the product, over dimensions, of the probability mass of the histogram bin it falls into. It is shown to produce competitive results compared to many state-of-the-art methods on many datasets. Because it assumes data features are independent of each other, it yields poor detection accuracy when there is correlation between features. To address this issue, we propose to enlarge the feature set by adding features based on principal components. Our results show that using the original input features together with principal components significantly improves the detection accuracy of the histogram-based anomaly detector without compromising much run-time. |
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12702v1 |
PDF | https://arxiv.org/pdf/1909.12702v1.pdf |
PWC | https://paperswithcode.com/paper/improved-histogram-based-anomaly-detector |
Repo | |
Framework | |
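The scoring rule is concrete enough to sketch: an instance's anomaly score is the product, over dimensions, of the probability mass of the histogram bin it falls into, and the paper's extension appends principal components to the raw features. A compact NumPy version follows (the bin count is an assumed parameter).

```python
# Histogram-based anomaly scores over the extended feature set
# (original features + principal components).
import numpy as np

def histogram_scores(X: np.ndarray, bins: int = 10) -> np.ndarray:
    # Smaller product of bin masses => more anomalous; work in log space.
    log_score = np.zeros(len(X))
    for j in range(X.shape[1]):
        counts, edges = np.histogram(X[:, j], bins=bins)
        mass = counts / counts.sum()
        idx = np.clip(np.searchsorted(edges, X[:, j], side="right") - 1, 0, bins - 1)
        log_score += np.log(mass[idx] + 1e-12)
    return -log_score                     # higher = more anomalous

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_ext = np.hstack([X, Xc @ Vt.T])        # append PCs to capture correlations
scores = histogram_scores(X_ext)
```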
Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection
Title | Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection |
Authors | Taekyung Kim, Minki Jeong, Seunghyeon Kim, Seokeon Choi, Changick Kim |
Abstract | We introduce a novel unsupervised domain adaptation approach for object detection. We aim to alleviate the imperfect translation problem of pixel-level adaptations and the source-biased discriminativity problem of feature-level adaptations simultaneously. Our approach is composed of two stages, i.e., Domain Diversification (DD) and Multi-domain-invariant Representation Learning (MRL). At the DD stage, we diversify the distribution of the labeled data by generating various distinctive shifted domains from the source domain. At the MRL stage, we apply adversarial learning with a multi-domain discriminator to encourage features to be indistinguishable among the domains. DD addresses the source-biased discriminativity, while MRL mitigates the imperfect image translation. We construct a structured domain adaptation framework for our learning paradigm and introduce a practical way of implementing DD. Our method outperforms the state-of-the-art methods by a large margin of 3–11% in terms of mean average precision (mAP) on various datasets. |
Tasks | Domain Adaptation, Object Detection, Representation Learning, Unsupervised Domain Adaptation |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05396v1 |
PDF | https://arxiv.org/pdf/1905.05396v1.pdf |
PWC | https://paperswithcode.com/paper/diversify-and-match-a-domain-adaptive |
Repo | |
Framework | |
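A standard way to implement the multi-domain adversarial objective of the MRL stage is a gradient reversal layer feeding a K-way domain classifier. The sketch below follows that common recipe; it is an assumption about the implementation, not the authors' released code.

```python
# Gradient reversal + multi-domain discriminator: the discriminator learns to
# tell domains apart while reversed gradients push features to be
# domain-indistinguishable.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -grad_out          # flip gradients so features fool the discriminator

def multi_domain_loss(features, domain_labels, discriminator):
    # features: (B, D); domain_labels in {0..K-1}, one per shifted domain.
    logits = discriminator(GradReverse.apply(features))
    return nn.functional.cross_entropy(logits, domain_labels)

disc = nn.Linear(128, 4)          # e.g. source + 3 diversified domains (assumed K)
loss = multi_domain_loss(torch.randn(16, 128), torch.randint(0, 4, (16,)), disc)
```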
Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz ‘96 Model
Title | Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz ‘96 Model |
Authors | David John Gagne II, Hannah M. Christensen, Aneesh C. Subramanian, Adam H. Monahan |
Abstract | Stochastic parameterizations account for uncertainty in the representation of unresolved sub-grid processes by sampling from the distribution of possible sub-grid forcings. Some existing stochastic parameterizations utilize data-driven approaches to characterize uncertainty, but these approaches require significant structural assumptions that can limit their scalability. Machine learning models, including neural networks, are able to represent a wide range of distributions and build optimized mappings between a large number of inputs and sub-grid forcings. Recent research on machine learning parameterizations has focused only on deterministic parameterizations. In this study, we develop a stochastic parameterization using the generative adversarial network (GAN) machine learning framework. The GAN stochastic parameterization is trained and evaluated on output from the Lorenz ‘96 model, which is a common baseline model for evaluating both parameterization and data assimilation techniques. We evaluate different ways of characterizing the input noise for the model and perform model runs with the GAN parameterization at weather and climate timescales. Some of the GAN configurations perform better than a baseline bespoke parameterization at both timescales, and the networks closely reproduce the spatio-temporal correlations and regimes of the Lorenz ‘96 system. We also find that in general those models which produce skillful forecasts are also associated with the best climate simulations. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04711v1 |
PDF | https://arxiv.org/pdf/1909.04711v1.pdf |
PWC | https://paperswithcode.com/paper/machine-learning-for-stochastic |
Repo | |
Framework | |
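The stochastic parameterization is a conditional GAN whose generator maps the resolved state plus noise to a sample of the sub-grid forcing. A minimal generator in that spirit follows; shapes and layer sizes are illustrative, and the discriminator and training loop are omitted.

```python
# Conditional GAN generator for sub-grid forcing: resampling the noise input
# yields different stochastic forcings for the same resolved state.
import torch
import torch.nn as nn

class ForcingGenerator(nn.Module):
    def __init__(self, state_dim=8, noise_dim=4):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim),           # one forcing value per grid point
        )

    def forward(self, x, z=None):
        if z is None:
            z = torch.randn(x.shape[0], self.noise_dim)
        return self.net(torch.cat([x, z], dim=-1))

gen = ForcingGenerator()
x = torch.randn(32, 8)                          # resolved Lorenz '96 variables
forcings = gen(x)                               # one stochastic sample per state
```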
Efficient Cloth Simulation using Miniature Cloth and Upscaling Deep Neural Networks
Title | Efficient Cloth Simulation using Miniature Cloth and Upscaling Deep Neural Networks |
Authors | Tae Min Lee, Young Jin Oh, In-Kwon Lee |
Abstract | Cloth simulation requires a fast and stable method for interactively and realistically visualizing fabric materials using computer graphics. We propose an efficient cloth simulation method using miniature cloth simulation and upscaling Deep Neural Networks (DNN). The upscaling DNNs generate the target cloth simulation from the results of physically-based simulations of a miniature cloth that has similar physical properties to those of the target cloth. We have verified the utility of the proposed method through experiments, and the results demonstrate that it is possible to generate fast and stable cloth simulations under various conditions. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.03953v1 |
PDF | https://arxiv.org/pdf/1907.03953v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-cloth-simulation-using-miniature |
Repo | |
Framework | |
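Treating the cloth as a regular grid of vertex positions, a simple upsampling CNN conveys the idea of generating the full-resolution simulation from the miniature one. The architecture below is an illustrative guess, not the authors' network.

```python
# Toy upscaling network: maps a simulated miniature cloth vertex grid to a
# higher-resolution vertex grid (3 channels = x, y, z positions).
import torch
import torch.nn as nn

upscaler = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),   # back to (x, y, z) per vertex
)

mini = torch.randn(1, 3, 16, 16)    # miniature cloth state from the physics sim
full = upscaler(mini)               # predicted 32x32 full-resolution cloth state
```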
Learning Permutation Invariant Representations using Memory Networks
Title | Learning Permutation Invariant Representations using Memory Networks |
Authors | Shivam Kalra, Mohammed Adnan, Graham Taylor, Hamid Tizhoosh |
Abstract | Many real-world tasks such as 3D object detection and high-resolution image classification involve learning from a set of instances. In these cases, only a group of instances, a set, collectively contains meaningful information; therefore only the sets have labels, and not individual data instances. In this work, we present a permutation-invariant neural network called the **Memory-based Exchangeable Model (MEM)** for learning set functions. The model consists of memory units that embed an input sequence into high-level features (memories), enabling the model to learn inter-dependencies among instances of the set in the form of attention vectors. To demonstrate its learning ability, we evaluated our model on test datasets created using MNIST, point cloud classification, and population estimation. We also tested the model on classifying histopathology whole-slide images to discriminate between two subtypes of lung cancer: lung adenocarcinoma and lung squamous cell carcinoma. We systematically extracted patches from lung cancer images from The Cancer Genome Atlas (TCGA) dataset, the largest public repository of histopathology images. The proposed method achieved a competitive classification accuracy of 84.84%. The results on other datasets are promising and demonstrate the efficacy of our model. |
Tasks | 3D Object Detection, Image Classification, Object Detection |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07984v1 |
PDF | https://arxiv.org/pdf/1911.07984v1.pdf |
PWC | https://paperswithcode.com/paper/learning-permutation-invariant |
Repo | |
Framework | |
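The core property MEM needs is permutation invariance over the set of instances, obtained via attention over the set. The sketch below shows only that invariance trick, not the full memory-unit architecture.

```python
# Attention pooling over an unordered set: the output is unchanged under any
# permutation of the set axis, the key property for set-level labels.
import torch
import torch.nn as nn

class AttentionSetPool(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.score = nn.Linear(hidden, 1)

    def forward(self, x):
        # x: (batch, set_size, in_dim)
        h = torch.tanh(self.embed(x))
        attn = torch.softmax(self.score(h), dim=1)   # (batch, set_size, 1)
        return (attn * h).sum(dim=1)                 # (batch, hidden)

pool = AttentionSetPool(in_dim=128)
patch_features = torch.randn(2, 50, 128)   # e.g. 50 patch embeddings per slide
set_repr = pool(patch_features)            # permutation-invariant set vector
```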