Paper Group ANR 77
Towards the Use of Neural Networks for Influenza Prediction at Multiple Spatial Resolutions
Title | Towards the Use of Neural Networks for Influenza Prediction at Multiple Spatial Resolutions |
Authors | Emily L. Aiken, Andre T. Nguyen, Mauricio Santillana |
Abstract | We introduce the use of a Gated Recurrent Unit (GRU) for influenza prediction at the state- and city-level in the US, and experiment with the inclusion of real-time flu-related Internet search data. We find that a GRU has lower prediction error than current state-of-the-art methods for data-driven influenza prediction at time horizons of over two weeks. In contrast with other machine learning approaches, the inclusion of real-time Internet search data does not improve GRU predictions. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02673v2 |
PDF | https://arxiv.org/pdf/1911.02673v2.pdf |
PWC | https://paperswithcode.com/paper/towards-the-use-of-neural-networks-for |
Repo | |
Framework | |
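As a rough illustration of the model class described above, here is a minimal GRU forecaster for weekly flu activity. The feature layout, hidden size, and two-week horizon are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of a GRU forecaster for weekly influenza-like-illness (ILI)
# rates, in the spirit of the paper. Input layout and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    def __init__(self, n_features: int, hidden_size: int = 32, horizon: int = 2):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, horizon)  # predict `horizon` weeks ahead

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, weeks, n_features); features could be past ILI rates
        # plus, optionally, real-time Internet-search signals.
        _, h_n = self.gru(x)          # h_n: (1, batch, hidden_size)
        return self.head(h_n[-1])     # (batch, horizon)

model = GRUForecaster(n_features=2)   # e.g. [ILI rate, search volume]
x = torch.randn(8, 52, 2)             # one year of weekly history per series
y_hat = model(x)                      # two-week-ahead forecasts
```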
Spatiotemporal Local Propagation
Title | Spatiotemporal Local Propagation |
Authors | Alessandro Betti, Marco Gori |
Abstract | This paper proposes an in-depth rethinking of neural computation that parallels apparently unrelated laws of physics formulated in the variational framework of the least action principle. The theory holds for neural networks based on any digraph, and the resulting computational scheme exhibits the intriguing property of being truly biologically plausible. The scheme, referred to as SpatioTemporal Local Propagation (STLP), is local in both space and time. Space locality comes from expressing the network connections by an appropriate Lagrangian term, so that the corresponding computational scheme does not need backpropagation (BP) of the error, while temporal locality is the outcome of the variational formulation of the problem. Overall, in addition to achieving the often-invoked biological plausibility that BP lacks, the locality in both space and time that arises from the proposed theory can be exhibited neither by Backpropagation Through Time (BPTT) nor by Real-Time Recurrent Learning (RTRL). |
Tasks | |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05106v1 |
PDF | https://arxiv.org/pdf/1907.05106v1.pdf |
PWC | https://paperswithcode.com/paper/spatiotemporal-local-propagation |
Repo | |
Framework | |
Siamese recurrent networks learn first-order logic reasoning and exhibit zero-shot compositional generalization
Title | Siamese recurrent networks learn first-order logic reasoning and exhibit zero-shot compositional generalization |
Authors | Mathijs Mul, Willem Zuidema |
Abstract | Can neural nets learn logic? We approach this classic question with current methods, and demonstrate that recurrent neural networks can learn to recognize first-order logical entailment relations between expressions. We define an artificial language in first-order predicate logic, generate a large dataset of sample ‘sentences’, and use an automatic theorem prover to infer the relation between random pairs of such sentences. We describe a Siamese neural architecture trained to predict the logical relation, and experiment with recurrent and recursive networks. Siamese Recurrent Networks are surprisingly successful at the entailment recognition task, reaching near-perfect performance on novel sentences (consisting of known words), and even outperforming recursive networks. We report a series of experiments to test the ability of the models to perform compositional generalization. In particular, we study how they deal with sentences of unseen length, and sentences containing unseen words. We show that set-ups using LSTMs and GRUs obtain high scores on these tests, demonstrating a form of compositionality. |
Tasks | |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00180v1 |
PDF | https://arxiv.org/pdf/1906.00180v1.pdf |
PWC | https://paperswithcode.com/paper/190600180 |
Repo | |
Framework | |
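A minimal sketch of a Siamese recurrent entailment classifier in the spirit of the paper: one shared LSTM encodes both sentences, and a small feed-forward head predicts the logical relation. Vocabulary size, dimensions, and the number of relation classes are illustrative assumptions.

```python
# Siamese LSTM sketch: both sentences pass through the same encoder, and the
# pair of sentence vectors (plus element-wise interactions) is classified.
import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, vocab_size=100, embed_dim=32, hidden=64, n_relations=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True)  # shared weights
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden, 128), nn.ReLU(), nn.Linear(128, n_relations)
        )

    def encode(self, tokens):
        _, (h_n, _) = self.encoder(self.embed(tokens))
        return h_n[-1]                              # (batch, hidden)

    def forward(self, left, right):
        a, b = self.encode(left), self.encode(right)
        pair = torch.cat([a, b, a * b, torch.abs(a - b)], dim=-1)
        return self.classifier(pair)                # relation logits

logits = SiameseLSTM()(torch.randint(0, 100, (4, 12)),
                       torch.randint(0, 100, (4, 12)))
```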
Gland Segmentation in Histopathological Images by Deep Neural Network
Title | Gland Segmentation in Histopathological Images by Deep Neural Network |
Authors | Safiye Rezaei, Ali Emami, Nader Karimi, Shadrokh Samavi |
Abstract | Histology is vital in the diagnosis and prognosis of cancers and many other diseases. To analyze histopathological images, we need to detect and segment all gland structures. These images are very challenging, and the segmentation task is difficult even for specialists. Segmentation of glands determines the grade of cancers such as colon, breast, and prostate cancer. Given that deep neural networks have achieved high performance on medical images, we propose a method based on the LinkNet network for gland segmentation. We investigate the effects of using different loss functions. Using the Warwick-QU dataset, which contains two test sets and one training set, we show that our approach is comparable to state-of-the-art methods. Finally, we show that enhancing the gland edges and using hematoxylin components can improve the performance of the proposed model. |
Tasks | |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00909v1 |
PDF | https://arxiv.org/pdf/1911.00909v1.pdf |
PWC | https://paperswithcode.com/paper/gland-segmentation-in-histopathological |
Repo | |
Framework | |
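The abstract highlights the effect of different loss functions. A common choice for gland segmentation is a weighted combination of binary cross-entropy and Dice loss; the sketch below is a generic implementation of that idea, not the authors' exact formulation.

```python
# Combined BCE + Dice loss sketch for binary gland masks. The 0.5 weighting
# is an assumed default, not a value from the paper.
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    # pred: predicted probabilities in [0, 1]; target: binary gland mask.
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def combined_loss(logits, target, w_dice=0.5):
    probs = torch.sigmoid(logits)
    bce = torch.nn.functional.binary_cross_entropy_with_logits(logits, target)
    return (1 - w_dice) * bce + w_dice * dice_loss(probs, target)
```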
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
Title | Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation |
Authors | Greg Yang |
Abstract | Several recent trends in machine learning theory and practice, from the design of state-of-the-art Gaussian Processes to the convergence analysis of deep neural nets (DNNs) under stochastic gradient descent (SGD), have found it fruitful to study wide random neural networks. Central to these approaches are certain scaling limits of such networks. We unify these results by introducing a notion of a straightline *tensor program* that can express most neural network computations, and we characterize its scaling limit when its tensors are large and randomized. From our framework follows (1) the convergence of random neural networks to Gaussian processes for architectures such as recurrent neural networks, convolutional neural networks, residual networks, attention, and any combination thereof, with or without batch normalization; (2) conditions under which the *gradient independence assumption* – that weights in backpropagation can be assumed to be independent from weights in the forward pass – leads to correct computation of gradient dynamics, and corrections when it does not; (3) the convergence of the Neural Tangent Kernel, a recently proposed kernel used to predict training dynamics of neural networks under gradient descent, at initialization for all architectures in (1) without batch normalization. Mathematically, our framework is general enough to rederive classical random matrix results such as the semicircle and the Marchenko-Pastur laws, as well as recent results in neural network Jacobian singular values. We hope our work opens a way toward design of even stronger Gaussian Processes, initialization schemes to avoid gradient explosion/vanishing, and deeper understanding of SGD dynamics in modern architectures. |
Tasks | Gaussian Processes |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.04760v2 |
PDF | http://arxiv.org/pdf/1902.04760v2.pdf |
PWC | https://paperswithcode.com/paper/scaling-limits-of-wide-neural-networks-with |
Repo | |
Framework | |
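The Neural Tangent Kernel whose convergence the paper derives is, at initialization, the Gram matrix of parameter gradients, K(x, x') = ⟨∂f(x)/∂θ, ∂f(x')/∂θ⟩. The finite-width empirical version can be computed directly with autograd; the tiny MLP here is only for illustration.

```python
# Empirical NTK entry for a small randomly initialized MLP.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))
params = [p for p in net.parameters() if p.requires_grad]

def grad_vector(x):
    # Flattened gradient of the scalar network output w.r.t. all parameters.
    grads = torch.autograd.grad(net(x).squeeze(), params)
    return torch.cat([g.reshape(-1) for g in grads])

x1, x2 = torch.randn(1, 3), torch.randn(1, 3)
ntk_12 = grad_vector(x1) @ grad_vector(x2)   # scalar kernel entry K(x1, x2)
```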
Learning Representations in Reinforcement Learning:An Information Bottleneck Approach
Title | Learning Representations in Reinforcement Learning:An Information Bottleneck Approach |
Authors | Pei Yingjun, Hou Xinwen |
Abstract | The information bottleneck principle is an elegant and useful approach to representation learning. In this paper, we investigate the problem of representation learning in the context of reinforcement learning using the information bottleneck framework, aiming at improving the sample efficiency of the learning algorithms. We analytically derive the optimal conditional distribution of the representation and provide a variational lower bound. Then, we maximize this lower bound with the Stein variational (SV) gradient method. We incorporate this framework into the advantage actor-critic algorithm (A2C) and the proximal policy optimization algorithm (PPO). Our experimental results show that our framework can significantly improve the sample efficiency of vanilla A2C and PPO. Finally, we study the information bottleneck (IB) perspective in deep RL with the algorithm called mutual information neural estimation (MINE). We experimentally verify that the information extraction-compression process also exists in deep RL, and our framework is capable of accelerating this process. We also analyze the relationship between MINE and our method; through this relationship, we theoretically derive an algorithm to optimize our IB framework without constructing the lower bound. |
Tasks | Representation Learning |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.05695v1 |
PDF | https://arxiv.org/pdf/1911.05695v1.pdf |
PWC | https://paperswithcode.com/paper/learning-representations-in-reinforcement |
Repo | |
Framework | |
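The paper's objective is optimized with Stein variational gradients; as a simpler stand-in, the sketch below shows a standard variational information bottleneck regularizer (a stochastic encoder with a KL penalty toward a unit Gaussian prior) that could be attached to an A2C or PPO loss. The Stein variational machinery itself is omitted.

```python
# Standard variational IB regularizer as a stand-in for the paper's scheme.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, obs_dim, z_dim=16):
        super().__init__()
        self.mu = nn.Linear(obs_dim, z_dim)
        self.log_std = nn.Linear(obs_dim, z_dim)

    def forward(self, obs):
        mu, std = self.mu(obs), self.log_std(obs).exp()
        z = mu + std * torch.randn_like(std)          # reparameterized sample
        # KL( N(mu, std) || N(0, I) ), summed over latent dims.
        kl = 0.5 * (mu**2 + std**2 - 2 * std.log() - 1).sum(-1)
        return z, kl

enc = StochasticEncoder(obs_dim=8)
z, kl = enc(torch.randn(32, 8))
beta = 1e-3                       # bottleneck strength (assumed hyperparameter)
# total_loss = a2c_loss(policy(z), value(z)) + beta * kl.mean()
```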
Graph Markov Network for Traffic Forecasting with Missing Data
Title | Graph Markov Network for Traffic Forecasting with Missing Data |
Authors | Zhiyong Cui, Longfei Lin, Ziyuan Pu, Yinhai Wang |
Abstract | Traffic forecasting is a classical task for traffic management and it plays an important role in intelligent transportation systems. However, since traffic data are mostly collected by traffic sensors or probe vehicles, sensor failures and the lack of probe vehicles will inevitably result in missing values in the collected raw data for some specific links in the traffic network. Although missing values can be imputed, existing data imputation methods normally need long-term historical traffic state data. As for short-term traffic forecasting, especially under edge computing and online prediction scenarios, traffic forecasting models with the capability of handling missing values are needed. In this study, we consider the traffic network as a graph and define the transition between network-wide traffic states at consecutive time steps as a graph Markov process. In this way, missing traffic states can be inferred step by step and the spatial-temporal relationships among the roadway links can be incorporated. Based on the graph Markov process, we propose a new neural network architecture for spatial-temporal data forecasting, i.e., the graph Markov network (GMN). By incorporating the spectral graph convolution operation, we also propose a spectral graph Markov network (SGMN). The proposed models are compared with baseline models and tested on three real-world traffic state datasets with various missing rates. Experimental results show that the proposed GMN and SGMN achieve superior prediction performance in terms of both accuracy and efficiency. In addition, the proposed models’ parameters, weights, and predicted results are comprehensively analyzed and visualized. |
Tasks | Imputation |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.05457v1 |
PDF | https://arxiv.org/pdf/1912.05457v1.pdf |
PWC | https://paperswithcode.com/paper/graph-markov-network-for-traffic-forecasting |
Repo | |
Framework | |
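A schematic reading of the graph Markov idea in the abstract: the next network-wide traffic state is a learned, adjacency-masked linear transition of the current state, and missing readings are filled step by step with the model's own inferences. This is a sketch of the concept, not the released GMN/SGMN code.

```python
# One graph Markov step: transition the previous state along road links, then
# keep observed sensor readings and fill the missing ones with the prediction.
import torch
import torch.nn as nn

class GraphMarkovStep(nn.Module):
    def __init__(self, adj: torch.Tensor):
        super().__init__()
        self.register_buffer("adj", adj)                 # (N, N) 0/1 road adjacency
        self.weight = nn.Parameter(0.01 * torch.randn_like(adj))

    def forward(self, x_prev, obs, mask):
        # x_prev: (N,) previous fully inferred state; obs: (N,) current
        # readings; mask: 1 where observed, 0 where missing.
        x_pred = (self.weight * self.adj) @ x_prev       # graph Markov transition
        return mask * obs + (1 - mask) * x_pred          # fill missing readings

N = 5
adj = torch.eye(N) + torch.diag(torch.ones(N - 1), 1)   # toy chain of road links
step = GraphMarkovStep(adj)
state = step(torch.rand(N), torch.rand(N), torch.tensor([1., 1., 0., 1., 0.]))
```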
RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving
Title | RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving |
Authors | Jean Marie Uwabeza Vianney, Shubhra Aich, Bingbing Liu |
Abstract | In this paper, we address the ambiguities that arise from the astoundingly high density of raw PseudoLiDAR in monocular 3D object detection for autonomous driving. Without much computational overhead, we propose a supervised and an unsupervised sparsification scheme for PseudoLiDAR prior to 3D detection. Both strategies help the standard 3D detector outperform the raw PseudoLiDAR baseline using only ~5% of its points on the KITTI object detection benchmark, thus making our monocular framework and LiDAR-based counterparts computationally equivalent (Figure 1). Moreover, our architecture-agnostic refinements provide state-of-the-art results on the KITTI3D test set for the “Car” and “Pedestrian” categories, with a 54% relative improvement for “Pedestrian”. Finally, exploratory analysis is performed on the discrepancy between monocular and LiDAR-based 3D detection frameworks to guide future endeavours. |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09712v1 |
PDF | https://arxiv.org/pdf/1911.09712v1.pdf |
PWC | https://paperswithcode.com/paper/refinedmpl-refined-monocular-pseudolidar-for |
Repo | |
Framework | |
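The paper keeps only ~5% of the raw PseudoLiDAR points before detection. Its actual supervised and unsupervised schemes are more involved; as a generic stand-in, the sketch below uses farthest point sampling, a standard way to retain a small, well-spread subset of a point cloud.

```python
# Farthest point sampling (FPS): a generic point-cloud sparsifier, used here
# only to illustrate the idea of pre-detection sparsification.
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    # points: (N, 3) pseudo-LiDAR points; returns indices of k kept points.
    n = len(points)
    chosen = [np.random.randint(n)]
    dist = np.full(n, np.inf)
    for _ in range(k - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(dist.argmax()))
    return np.array(chosen)

cloud = np.random.rand(10_000, 3)
kept = cloud[farthest_point_sampling(cloud, k=500)]   # ~5% of the points
```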
Don’t Blame the ELBO! A Linear VAE Perspective on Posterior Collapse
Title | Don’t Blame the ELBO! A Linear VAE Perspective on Posterior Collapse |
Authors | James Lucas, George Tucker, Roger Grosse, Mohammad Norouzi |
Abstract | Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior distribution closely matches the prior for a subset of latent variables. This paper presents a simple and intuitive explanation for posterior collapse through the analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We explain how posterior collapse may occur in pPCA due to local maxima in the log marginal likelihood. Unexpectedly, we prove that the ELBO objective for the linear VAE does not introduce additional spurious local maxima relative to log marginal likelihood. We show further that training a linear VAE with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions. Empirically, we find that our linear analysis is predictive even for high-capacity, non-linear VAEs and helps explain the relationship between the observation noise, local maxima, and posterior collapse in deep Gaussian VAEs. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02469v1 |
PDF | https://arxiv.org/pdf/1911.02469v1.pdf |
PWC | https://paperswithcode.com/paper/dont-blame-the-elbo-a-linear-vae-perspective |
Repo | |
Framework | |
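The analysis centers on linear VAEs, whose exact-inference training recovers the principal component directions. A minimal linear VAE in that spirit follows; the dimensions and the data-independent posterior variance are illustrative choices, not the paper's exact setup.

```python
# Linear VAE: linear encoder mean, data-independent diagonal posterior
# variance, linear Gaussian decoder. Returns the negative ELBO (up to constants).
import torch
import torch.nn as nn

class LinearVAE(nn.Module):
    def __init__(self, x_dim=10, z_dim=2):
        super().__init__()
        self.enc_mu = nn.Linear(x_dim, z_dim)
        self.enc_logvar = nn.Parameter(torch.zeros(z_dim))  # data-independent variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, var = self.enc_mu(x), self.enc_logvar.exp()
        z = mu + var.sqrt() * torch.randn_like(mu)           # reparameterization
        recon = self.dec(z)
        kl = 0.5 * (mu**2 + var - 1 - var.log()).sum(-1)     # KL to N(0, I)
        nll = 0.5 * ((x - recon) ** 2).sum(-1)               # unit observation noise
        return (nll + kl).mean()

loss = LinearVAE()(torch.randn(64, 10))
```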
Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks
Title | Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks |
Authors | Ka-Ho Chow, Wenqi Wei, Yanzhao Wu, Ling Liu |
Abstract | Deep neural networks (DNNs) have demonstrated impressive performance on many challenging machine learning tasks. However, DNNs are vulnerable to adversarial inputs generated by adding maliciously crafted perturbations to benign inputs. As a growing number of attacks have been reported to generate adversarial inputs of varying sophistication, the defense-attack arms race has accelerated. In this paper, we present MODEF, a cross-layer model diversity ensemble framework. MODEF intelligently combines an unsupervised model denoising ensemble with a supervised model verification ensemble by quantifying model diversity, aiming to boost the robustness of the target model against adversarial examples. Evaluated using eleven representative attacks on popular benchmark datasets, we show that MODEF achieves remarkable defense success rates compared with existing defense methods, and provides a superior capability of repairing adversarial inputs and making correct predictions with high accuracy in the presence of black-box attacks. |
Tasks | Denoising |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07667v2 |
PDF | https://arxiv.org/pdf/1908.07667v2.pdf |
PWC | https://paperswithcode.com/paper/190807667 |
Repo | |
Framework | |
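A skeleton of the MODEF idea, with all model internals and the agreement rule as placeholder assumptions: denoise the input with an ensemble, then accept the target model's prediction only when diverse verifiers agree.

```python
# Cross-layer ensemble skeleton: denoising ensemble + verification ensemble.
# The majority-vote threshold and model choices are placeholder assumptions.
import torch

def modef_predict(x, denoisers, verifiers, target_model):
    # Denoising ensemble: average several denoised views of the input.
    x_clean = torch.stack([d(x) for d in denoisers]).mean(0)
    y = target_model(x_clean).argmax(-1)
    # Verification ensemble: diverse models vote on the denoised input.
    votes = torch.stack([v(x_clean).argmax(-1) for v in verifiers])
    agree = (votes == y).float().mean(0)
    # Accept the prediction only where the ensemble agrees; flag the rest.
    return y, agree > 0.5
```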
Improved histogram-based anomaly detector with the extended principal component features
Title | Improved histogram-based anomaly detector with the extended principal component features |
Authors | Sunil Aryal, Arbind Agrahari Baniya, KC Santosh |
Abstract | In this era of big data, databases are growing rapidly in terms of the number of records. Fast automatic detection of anomalous records in these massive databases is a challenging task. Traditional distance-based anomaly detectors are not applicable to these massive datasets. Recently, a simple but extremely fast anomaly detector using one-dimensional histograms has been introduced. The anomaly score of a data instance is computed as the product, over dimensions, of the probability mass of the histogram bin it falls into. It is shown to produce competitive results compared to many state-of-the-art methods on many datasets. Because it assumes data features are independent of each other, it yields poor detection accuracy when there is correlation between features. To address this issue, we propose to enlarge the feature set by adding features based on principal components. Our results show that using the original input features together with principal components significantly improves the detection accuracy of the histogram-based anomaly detector without compromising much run-time. |
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12702v1 |
PDF | https://arxiv.org/pdf/1909.12702v1.pdf |
PWC | https://paperswithcode.com/paper/improved-histogram-based-anomaly-detector |
Repo | |
Framework | |
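The scoring rule is concrete enough to sketch: an instance's anomaly score is the product, over dimensions, of the probability mass of the histogram bin it falls into, and the paper's extension appends principal components to the raw features. A compact NumPy version follows (the bin count is an assumed parameter).

```python
# Histogram-based anomaly scores over the extended feature set
# (original features + principal components).
import numpy as np

def histogram_scores(X: np.ndarray, bins: int = 10) -> np.ndarray:
    # Smaller product of bin masses => more anomalous; work in log space.
    log_score = np.zeros(len(X))
    for j in range(X.shape[1]):
        counts, edges = np.histogram(X[:, j], bins=bins)
        mass = counts / counts.sum()
        idx = np.clip(np.searchsorted(edges, X[:, j], side="right") - 1, 0, bins - 1)
        log_score += np.log(mass[idx] + 1e-12)
    return -log_score                     # higher = more anomalous

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_ext = np.hstack([X, Xc @ Vt.T])        # append PCs to capture correlations
scores = histogram_scores(X_ext)
```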
Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection
Title | Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection |
Authors | Taekyung Kim, Minki Jeong, Seunghyeon Kim, Seokeon Choi, Changick Kim |
Abstract | We introduce a novel unsupervised domain adaptation approach for object detection. We aim to alleviate the imperfect translation problem of pixel-level adaptations and the source-biased discriminativity problem of feature-level adaptations simultaneously. Our approach is composed of two stages, i.e., Domain Diversification (DD) and Multi-domain-invariant Representation Learning (MRL). At the DD stage, we diversify the distribution of the labeled data by generating various distinctive shifted domains from the source domain. At the MRL stage, we apply adversarial learning with a multi-domain discriminator to encourage features to be indistinguishable among the domains. DD addresses the source-biased discriminativity, while MRL mitigates the imperfect image translation. We construct a structured domain adaptation framework for our learning paradigm and introduce a practical way of implementing DD. Our method outperforms the state-of-the-art methods by a large margin of 3–11% in terms of mean average precision (mAP) on various datasets. |
Tasks | Domain Adaptation, Object Detection, Representation Learning, Unsupervised Domain Adaptation |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05396v1 |
PDF | https://arxiv.org/pdf/1905.05396v1.pdf |
PWC | https://paperswithcode.com/paper/diversify-and-match-a-domain-adaptive |
Repo | |
Framework | |
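A standard way to implement the multi-domain adversarial objective of the MRL stage is a gradient reversal layer feeding a K-way domain classifier. The sketch below follows that common recipe; it is an assumption about the implementation, not the authors' released code.

```python
# Gradient reversal + multi-domain discriminator: the discriminator learns to
# tell domains apart while reversed gradients push features to be
# domain-indistinguishable.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -grad_out          # flip gradients so features fool the discriminator

def multi_domain_loss(features, domain_labels, discriminator):
    # features: (B, D); domain_labels in {0..K-1}, one per shifted domain.
    logits = discriminator(GradReverse.apply(features))
    return nn.functional.cross_entropy(logits, domain_labels)

disc = nn.Linear(128, 4)          # e.g. source + 3 diversified domains (assumed K)
loss = multi_domain_loss(torch.randn(16, 128), torch.randint(0, 4, (16,)), disc)
```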
Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz ‘96 Model
Title | Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz ‘96 Model |
Authors | David John Gagne II, Hannah M. Christensen, Aneesh C. Subramanian, Adam H. Monahan |
Abstract | Stochastic parameterizations account for uncertainty in the representation of unresolved sub-grid processes by sampling from the distribution of possible sub-grid forcings. Some existing stochastic parameterizations utilize data-driven approaches to characterize uncertainty, but these approaches require significant structural assumptions that can limit their scalability. Machine learning models, including neural networks, are able to represent a wide range of distributions and build optimized mappings between a large number of inputs and sub-grid forcings. Recent research on machine learning parameterizations has focused only on deterministic parameterizations. In this study, we develop a stochastic parameterization using the generative adversarial network (GAN) machine learning framework. The GAN stochastic parameterization is trained and evaluated on output from the Lorenz ‘96 model, which is a common baseline model for evaluating both parameterization and data assimilation techniques. We evaluate different ways of characterizing the input noise for the model and perform model runs with the GAN parameterization at weather and climate timescales. Some of the GAN configurations perform better than a baseline bespoke parameterization at both timescales, and the networks closely reproduce the spatio-temporal correlations and regimes of the Lorenz ‘96 system. We also find that in general those models which produce skillful forecasts are also associated with the best climate simulations. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04711v1 |
PDF | https://arxiv.org/pdf/1909.04711v1.pdf |
PWC | https://paperswithcode.com/paper/machine-learning-for-stochastic |
Repo | |
Framework | |
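The stochastic parameterization is a conditional GAN whose generator maps the resolved state plus noise to a sample of the sub-grid forcing. A minimal generator in that spirit follows; shapes and layer sizes are illustrative, and the discriminator and training loop are omitted.

```python
# Conditional GAN generator for sub-grid forcing: resampling the noise input
# yields different stochastic forcings for the same resolved state.
import torch
import torch.nn as nn

class ForcingGenerator(nn.Module):
    def __init__(self, state_dim=8, noise_dim=4):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim),           # one forcing value per grid point
        )

    def forward(self, x, z=None):
        if z is None:
            z = torch.randn(x.shape[0], self.noise_dim)
        return self.net(torch.cat([x, z], dim=-1))

gen = ForcingGenerator()
x = torch.randn(32, 8)                          # resolved Lorenz '96 variables
forcings = gen(x)                               # one stochastic sample per state
```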
Efficient Cloth Simulation using Miniature Cloth and Upscaling Deep Neural Networks
Title | Efficient Cloth Simulation using Miniature Cloth and Upscaling Deep Neural Networks |
Authors | Tae Min Lee, Young Jin Oh, In-Kwon Lee |
Abstract | Cloth simulation requires a fast and stable method for interactively and realistically visualizing fabric materials using computer graphics. We propose an efficient cloth simulation method using miniature cloth simulation and upscaling Deep Neural Networks (DNN). The upscaling DNNs generate the target cloth simulation from the results of physically-based simulations of a miniature cloth that has similar physical properties to those of the target cloth. We have verified the utility of the proposed method through experiments, and the results demonstrate that it is possible to generate fast and stable cloth simulations under various conditions. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.03953v1 |
PDF | https://arxiv.org/pdf/1907.03953v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-cloth-simulation-using-miniature |
Repo | |
Framework | |
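Treating the cloth as a regular grid of vertex positions, a simple upsampling CNN conveys the idea of generating the full-resolution simulation from the miniature one. The architecture below is an illustrative guess, not the authors' network.

```python
# Toy upscaling network: maps a simulated miniature cloth vertex grid to a
# higher-resolution vertex grid (3 channels = x, y, z positions).
import torch
import torch.nn as nn

upscaler = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),   # back to (x, y, z) per vertex
)

mini = torch.randn(1, 3, 16, 16)    # miniature cloth state from the physics sim
full = upscaler(mini)               # predicted 32x32 full-resolution cloth state
```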
Learning Permutation Invariant Representations using Memory Networks
Title | Learning Permutation Invariant Representations using Memory Networks |
Authors | Shivam Kalra, Mohammed Adnan, Graham Taylor, Hamid Tizhoosh |
Abstract | Many real-world tasks such as 3D object detection and high-resolution image classification involve learning from a set of instances. In these cases, only a group of instances, a set, collectively contains meaningful information; therefore only the sets have labels, and not individual data instances. In this work, we present a permutation-invariant neural network called the **Memory-based Exchangeable Model (MEM)** for learning set functions. The model consists of memory units that embed an input sequence into high-level features (memories), enabling the model to learn inter-dependencies among instances of the set in the form of attention vectors. To demonstrate its learning ability, we evaluated our model on test datasets created using MNIST, point cloud classification, and population estimation. We also tested the model on classifying histopathology whole-slide images to discriminate between two subtypes of lung cancer: lung adenocarcinoma and lung squamous cell carcinoma. We systematically extracted patches from lung cancer images from The Cancer Genome Atlas (TCGA) dataset, the largest public repository of histopathology images. The proposed method achieved a competitive classification accuracy of 84.84%. The results on other datasets are promising and demonstrate the efficacy of our model. |
Tasks | 3D Object Detection, Image Classification, Object Detection |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07984v1 |
PDF | https://arxiv.org/pdf/1911.07984v1.pdf |
PWC | https://paperswithcode.com/paper/learning-permutation-invariant |
Repo | |
Framework | |
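The core property MEM needs is permutation invariance over the set of instances, obtained via attention over the set. The sketch below shows only that invariance trick, not the full memory-unit architecture.

```python
# Attention pooling over an unordered set: the output is unchanged under any
# permutation of the set axis, the key property for set-level labels.
import torch
import torch.nn as nn

class AttentionSetPool(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.score = nn.Linear(hidden, 1)

    def forward(self, x):
        # x: (batch, set_size, in_dim)
        h = torch.tanh(self.embed(x))
        attn = torch.softmax(self.score(h), dim=1)   # (batch, set_size, 1)
        return (attn * h).sum(dim=1)                 # (batch, hidden)

pool = AttentionSetPool(in_dim=128)
patch_features = torch.randn(2, 50, 128)   # e.g. 50 patch embeddings per slide
set_repr = pool(patch_features)            # permutation-invariant set vector
```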