January 30, 2020

3260 words 16 mins read

Paper Group ANR 436


Estimating Dynamic Conditional Spread Densities to Optimise Daily Storage Trading of Electricity

Title Estimating Dynamic Conditional Spread Densities to Optimise Daily Storage Trading of Electricity
Authors Ekaterina Abramova, Derek Bunn
Abstract This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast electricity price spreads between different hours of the day. This supports an optimal day-ahead storage and discharge schedule, and thereby facilitates a bidding strategy for a merchant arbitrage facility into the day-ahead auctions for wholesale electricity. The four latent moments of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the mean, variance, skewness and kurtosis of the densities to respond hourly to such factors as weather and demand forecasts. The best specification for each spread is selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative distribution functions. Those analytical properties also allow the calculation of risk associated with the spread arbitrages. From these spread densities, the optimal daily operation of a battery storage facility is determined.
Tasks
Published 2019-03-09
URL http://arxiv.org/abs/1903.06668v1
PDF http://arxiv.org/pdf/1903.06668v1.pdf
PWC https://paperswithcode.com/paper/estimating-dynamic-conditional-spread
Repo
Framework
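
The pinball (quantile) loss used for model selection above has a simple closed form. A minimal sketch in Python, assuming forecast quantiles and realized spreads (variable names are illustrative, not from the paper):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss at quantile level tau in (0, 1).

    Under-prediction is penalized by tau and over-prediction by (1 - tau),
    so minimizing it targets the tau-quantile of the spread distribution.
    """
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# Score a density forecast across a grid of quantile levels, as is common
# when comparing full predictive distributions.
taus = np.linspace(0.05, 0.95, 19)
y_true = np.array([12.0, -3.5, 8.1])   # realized hourly spreads (illustrative)
y_pred = {tau: np.full(3, np.quantile(y_true, tau)) for tau in taus}
score = sum(pinball_loss(y_true, y_pred[t], t) for t in taus) / len(taus)
print(f"mean pinball loss: {score:.3f}")
```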

Towards Unsupervised Cancer Subtyping: Predicting Prognosis Using A Histologic Visual Dictionary

Title Towards Unsupervised Cancer Subtyping: Predicting Prognosis Using A Histologic Visual Dictionary
Authors Hassan Muhammad, Carlie S. Sigel, Gabriele Campanella, Thomas Boerner, Linda M. Pak, Stefan Büttner, Jan N. M. IJzermans, Bas Groot Koerkamp, Michael Doukas, William R. Jarnagin, Amber Simpson, Thomas J. Fuchs
Abstract Unlike common cancers, such as those of the prostate and breast, tumor grading in rare cancers is difficult and largely undefined because of small sample sizes, the sheer volume of time needed to undertake such a task, and the inherent difficulty of extracting human-observed patterns. One of the most challenging examples is intrahepatic cholangiocarcinoma (ICC), a primary liver cancer arising from the biliary system, for which there is well-recognized tumor heterogeneity and no grading paradigm or prognostic biomarkers. In this paper, we propose a new unsupervised deep convolutional autoencoder-based clustering model that groups together cellular and structural morphologies of tumor in 246 digitized ICC whole slides, based on visual similarity. From this visual dictionary of histologic patterns, we use the clusters as covariates to train Cox-proportional hazard survival models. In univariate analysis, three clusters were significantly associated with recurrence-free survival. Combinations of these clusters were significant in multivariate analysis. In a multivariate analysis of all clusters, five showed significant association with recurrence-free survival; however, the overall model was not significant. Finally, a pathologist assigned clinical terminology to the significant clusters in the visual dictionary and found evidence supporting the hypothesis that collagen-enriched fibrosis plays a role in disease severity. These results offer insight into the future of cancer subtyping and show that computational pathology can contribute to disease prognostication, especially in rare cancers.
Tasks
Published 2019-03-12
URL http://arxiv.org/abs/1903.05257v1
PDF http://arxiv.org/pdf/1903.05257v1.pdf
PWC https://paperswithcode.com/paper/towards-unsupervised-cancer-subtyping
Repo
Framework
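
The pipeline the abstract describes — embed tissue tiles, cluster them into a visual dictionary, then use per-slide cluster frequencies as covariates in a Cox model — can be sketched with standard tooling. A rough outline on synthetic stand-in data, assuming scikit-learn and lifelines (not the paper's code):

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
# Stand-ins for autoencoder embeddings of tissue tiles, grouped by slide.
embeddings = rng.standard_normal((5000, 64))
slide_ids = rng.integers(0, 100, size=5000)

# Build the visual dictionary: cluster tiles by visual similarity.
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(embeddings)

# Per-slide covariates: fraction of a slide's tiles in each cluster.
freq = np.zeros((100, 10))
for s, c in zip(slide_ids, labels):
    freq[s, c] += 1
freq /= np.maximum(freq.sum(axis=1, keepdims=True), 1)

# Drop one cluster as the reference level to avoid collinearity.
df = pd.DataFrame(freq[:, :-1], columns=[f"cluster_{i}" for i in range(9)])
df["T"] = rng.exponential(24, size=100)   # synthetic follow-up time (months)
df["E"] = rng.integers(0, 2, size=100)    # synthetic recurrence indicator

cph = CoxPHFitter(penalizer=0.1).fit(df, duration_col="T", event_col="E")
cph.print_summary()                       # hazard ratios per cluster covariate
```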

Boolean matrix factorization meets consecutive ones property

Title Boolean matrix factorization meets consecutive ones property
Authors Nikolaj Tatti, Pauli Miettinen
Abstract Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization where we additionally require that the factor matrices have the consecutive-ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layouts, where nodes are ordered on a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbons. We show that we can use OBMF for edge bundling combined with circular or linear layout techniques. We demonstrate that not only is this problem NP-hard, but there cannot be a polynomial-time algorithm that yields a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm where at each step we look for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm where we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using pq-trees. We also extend the problem to the cyclic-ones property and to symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well.
Tasks
Published 2019-01-17
URL http://arxiv.org/abs/1901.05797v1
PDF http://arxiv.org/pdf/1901.05797v1.pdf
PWC https://paperswithcode.com/paper/boolean-matrix-factorization-meets
Repo
Framework
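
The alternating step described above — fix one side of a rank-1 Boolean factorization, solve for the other, then swap roles — is easy to write down if we ignore the consecutive-ones constraint and the pq-tree machinery that are the paper's actual contribution. A simplified sketch:

```python
import numpy as np

def best_other_factor(A, u):
    """Given binary A (m x n) and a fixed factor u (m,), pick v (n,)
    minimizing the Hamming distance between A and the Boolean outer
    product of u and v: setting v[j] = 1 covers column j on the rows
    where u is 1, gaining a point per 1 and losing one per 0 there."""
    rows = u.astype(bool)
    ones = A[rows].sum(axis=0)           # 1s covered in each column
    zeros = rows.sum() - ones            # 0s wrongly covered in each column
    return (ones > zeros).astype(int)

def rank1_boolean(A, iters=20, seed=0):
    """Alternate sides until the rank-1 cover stabilizes."""
    rng = np.random.default_rng(seed)
    u = rng.integers(0, 2, size=A.shape[0])
    for _ in range(iters):
        v = best_other_factor(A, u)      # fix u, solve for v ...
        u = best_other_factor(A.T, v)    # ... then reverse the roles
    return u, best_other_factor(A, u)

A = (np.random.default_rng(1).random((8, 10)) > 0.6).astype(int)
u, v = rank1_boolean(A)
print("Hamming error:", int(np.sum(A != np.outer(u, v))))
```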

Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics

Title Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Authors Denis Steckelmacher, Hélène Plisnier, Diederik M. Roijers, Ann Nowé
Abstract Value-based reinforcement-learning algorithms provide state-of-the-art results in model-free discrete-action settings, and tend to outperform actor-critic algorithms. We argue that actor-critic algorithms are limited by their need for an on-policy critic. We propose Bootstrapped Dual Policy Iteration (BDPI), a novel model-free reinforcement-learning algorithm for continuous states and discrete actions, with an actor and several off-policy critics. Off-policy critics are compatible with experience replay, ensuring high sample-efficiency, without the need for off-policy corrections. The actor, by slowly imitating the average greedy policy of the critics, leads to high-quality and state-specific exploration, which we compare to Thompson sampling. Because the actor and critics are fully decoupled, BDPI is remarkably stable, and unusually robust to its hyper-parameters. BDPI is significantly more sample-efficient than Bootstrapped DQN, PPO, and ACKTR, on discrete, continuous and pixel-based tasks. Source code: https://github.com/vub-ai-lab/bdpi.
Tasks
Published 2019-03-11
URL https://arxiv.org/abs/1903.04193v2
PDF https://arxiv.org/pdf/1903.04193v2.pdf
PWC https://paperswithcode.com/paper/sample-efficient-model-free-reinforcement
Repo
Framework
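
The actor update sketched in the abstract — slowly imitating the average greedy policy of several off-policy critics — looks roughly like the following in a tabular setting (an illustration of the idea only; the linked repository has the real algorithm):

```python
import numpy as np

def actor_update(pi, critic_qs, state, lr=0.05):
    """Move the actor's distribution at `state` toward the average greedy
    policy of the off-policy critics (tabular illustration; names are ours).

    pi        : (n_states, n_actions) actor policy probabilities
    critic_qs : list of (n_states, n_actions) Q-value tables
    """
    greedy = np.zeros(pi.shape[1])
    for q in critic_qs:                   # each critic votes for its greedy action
        greedy[np.argmax(q[state])] += 1.0
    greedy /= len(critic_qs)              # average greedy policy of the critics
    pi[state] = (1 - lr) * pi[state] + lr * greedy   # slow imitation
    pi[state] /= pi[state].sum()          # renormalize for numerical safety
    return pi

# Usage: 3 critics over a toy 4-state, 2-action problem.
rng = np.random.default_rng(0)
pi = np.full((4, 2), 0.5)
critics = [rng.standard_normal((4, 2)) for _ in range(3)]
pi = actor_update(pi, critics, state=1)
print(pi[1])
```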

Neural Forest Learning

Title Neural Forest Learning
Authors Yun-Hao Cao, Jianxin Wu
Abstract We propose Neural Forest Learning (NFL), a novel deep-learning-based, random-forest-like method. In contrast to previous forest methods, NFL enjoys the benefits of end-to-end, data-driven representation learning, as well as pervasive support from deep learning software and hardware platforms, hence achieving faster inference speed and higher accuracy than previous forest methods. Furthermore, NFL learns non-linear feature representations in CNNs more efficiently than previous higher-order pooling methods, producing good results with a negligible increase in parameters, floating point operations (FLOPs) and real running time. We achieve superior performance on 7 machine learning datasets when compared to random forests and GBDTs. On the fine-grained benchmarks CUB-200-2011, FGVC-Aircraft and Stanford Cars, we achieve gains of over 5.7%, 6.9% and 7.8% for VGG-16, respectively. Moreover, NFL can converge in far fewer epochs, further accelerating network training. On the large-scale ImageNet ILSVRC-12 validation set, integrating NFL into ResNet-18 achieves top-1/top-5 errors of 28.32%/9.77%, which outperforms ResNet-18 by 1.92%/1.15% with negligible extra cost; the improvement is consistent across various architectures.
Tasks Representation Learning
Published 2019-11-18
URL https://arxiv.org/abs/1911.07845v1
PDF https://arxiv.org/pdf/1911.07845v1.pdf
PWC https://paperswithcode.com/paper/neural-forest-learning
Repo
Framework

All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance

Title All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance
Authors J. Camacho, A. K. Smilde, E. Saccenti, J. A. Westerhuis
Abstract Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA) that combines variance maximization and sparsity with the ultimate goal of improving data interpretation. When moving from PCA to sPCA, there are a number of implications that the practitioner needs to be aware of. A relevant one is that scores and loadings in sPCA may not be orthogonal. For this reason, the traditional way of computing scores, residuals and explained variance used in classical PCA cannot be applied directly to sPCA models. This also affects how sPCA components should be visualized. In this paper we illustrate this problem both theoretically and numerically using simulations for several state-of-the-art sPCA algorithms, and provide proper computations of the different elements mentioned. We show that sPCA approaches present disparate and limited performance when modeling noise-free, sparse data. In a follow-up paper, we discuss the theoretical properties that lead to this problem.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.03989v1
PDF https://arxiv.org/pdf/1907.03989v1.pdf
PWC https://paperswithcode.com/paper/all-sparse-pca-models-are-wrong-but-some-are
Repo
Framework
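
The orthogonality issue has a concrete computational consequence: with non-orthogonal loadings, summing per-component variances double-counts shared directions. One remedy consistent with the paper's point (a sketch of the general idea, not necessarily the paper's exact procedure) is to compute residuals by least-squares projection onto the span of the loadings:

```python
import numpy as np

def spca_explained_variance(X, P):
    """Explained variance for (possibly non-orthogonal) loadings P.

    X : (n, p) centered data;  P : (p, k) sparse loadings.
    Projects X onto span(P) with the least-squares projector
    P (P'P)^{-1} P', which reduces to the usual PCA formula when
    the loadings are orthonormal.
    """
    scores, *_ = np.linalg.lstsq(P, X.T, rcond=None)   # (k, n) LS scores
    X_hat = (P @ scores).T                             # projection onto span(P)
    resid = X - X_hat
    return 1.0 - np.sum(resid**2) / np.sum(X**2)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
X -= X.mean(axis=0)
P = rng.standard_normal((10, 2))
P[np.abs(P) < 0.8] = 0.0                               # make the loadings sparse
print(f"explained variance: {spca_explained_variance(X, P):.3f}")
```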

Transferable Semi-supervised 3D Object Detection from RGB-D Data

Title Transferable Semi-supervised 3D Object Detection from RGB-D Data
Authors Yew Siang Tang, Gim Hee Lee
Abstract We investigate the direction of training a 3D object detector for new object classes from only 2D bounding box labels of these new classes, while simultaneously transferring information from 3D bounding box labels of the existing classes. To this end, we propose a transferable semi-supervised 3D object detection model that learns a 3D object detector network from training data with two disjoint sets of object classes - a set of strong classes with both 2D and 3D box labels, and another set of weak classes with only 2D box labels. In particular, we suggest a relaxed reprojection loss, a box prior loss and a Box-to-Point Cloud Fit network that allow us to effectively transfer useful 3D information from the strong classes to the weak classes during training, and consequently, enable the network to detect 3D objects in the weak classes during inference. Experimental results show that our proposed algorithm outperforms baseline approaches and achieves promising results compared to fully-supervised approaches on the SUN-RGBD and KITTI datasets. Furthermore, we show that our Box-to-Point Cloud Fit network improves the performance of the fully-supervised approaches on both datasets.
Tasks 3D Object Detection, Object Detection
Published 2019-04-23
URL http://arxiv.org/abs/1904.10300v1
PDF http://arxiv.org/pdf/1904.10300v1.pdf
PWC https://paperswithcode.com/paper/transferable-semi-supervised-3d-object
Repo
Framework
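
A relaxed reprojection loss of the kind named above can be illustrated in simplified form: project the predicted 3D box corners into the image and penalize only the extent to which they fall outside the labeled 2D box (the paper's exact formulation may differ):

```python
import numpy as np

def relaxed_reprojection_loss(corners_3d, K, box_2d):
    """Penalize predicted 3D corners only where their image projection
    spills outside the labeled 2D box (a simplified illustration).

    corners_3d : (8, 3) box corners in camera coordinates (z > 0)
    K          : (3, 3) camera intrinsics
    box_2d     : (x_min, y_min, x_max, y_max) 2D label in pixels
    """
    uvw = corners_3d @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]                        # perspective projection
    x_min, y_min, x_max, y_max = box_2d
    lo = np.maximum(np.array([x_min, y_min]) - uv, 0.0)  # left/top violation
    hi = np.maximum(uv - np.array([x_max, y_max]), 0.0)  # right/bottom violation
    return float(np.sum(lo**2 + hi**2))

K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])
corners = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (9, 11)],
                   dtype=float)
print(relaxed_reprojection_loss(corners, K, (260, 170, 380, 310)))
```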

Single neuron-based neural networks are as efficient as dense deep neural networks in binary and multi-class recognition problems

Title Single neuron-based neural networks are as efficient as dense deep neural networks in binary and multi-class recognition problems
Authors Yassin Khalifa, Justin Hawks, Ervin Sejdic
Abstract Recent advances in neuroscience have revealed many principles of neural processing. In particular, many biological systems were found to reconfigure/recruit single neurons to generate multiple kinds of decisions. Such findings have the potential to advance our understanding of the design and optimization process of artificial neural networks. Previous work demonstrated that dense neural networks are needed to shape the complex decision surfaces required for AI-level recognition tasks. We investigate the ability to model high-dimensional recognition problems using networks of single or several neurons, which are relatively easy to train. By employing three datasets, we test the use of a population of single-neuron networks in performing multi-class recognition tasks. Surprisingly, we find that sparse networks can be as efficient as dense networks in both binary and multi-class tasks. Moreover, single-neuron networks demonstrate superior performance in the binary classification setting and competitive results when combined for multi-class recognition.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.12135v1
PDF https://arxiv.org/pdf/1905.12135v1.pdf
PWC https://paperswithcode.com/paper/single-neuron-based-neural-networks-are-as
Repo
Framework
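
One way to read a "population of single-neuron networks" for multi-class recognition is as independently trained one-vs-rest logistic units; a minimal sketch of that interpretation (not necessarily the paper's exact setup):

```python
import numpy as np

class SingleNeuron:
    """One logistic neuron trained with plain gradient descent."""
    def __init__(self, dim, lr=0.1, epochs=200):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr, self.epochs = lr, epochs

    def fit(self, X, y):                       # y in {0, 1}
        for _ in range(self.epochs):
            p = 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))
            grad = p - y                       # dL/dlogit for cross-entropy
            self.w -= self.lr * (X.T @ grad) / len(y)
            self.b -= self.lr * grad.mean()
        return self

    def score(self, X):
        return X @ self.w + self.b

def predict_multiclass(neurons, X):
    """One-vs-rest: each class's neuron votes with its logit."""
    logits = np.stack([n.score(X) for n in neurons], axis=1)
    return logits.argmax(axis=1)

# Usage: one neuron per class on toy 3-class data.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int) + (X[:, 2] > 1).astype(int)
neurons = [SingleNeuron(5).fit(X, (y == c).astype(float)) for c in range(3)]
print("train accuracy:", np.mean(predict_multiclass(neurons, X) == y))
```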

Disentangling trainability and generalization in deep learning

Title Disentangling trainability and generalization in deep learning
Authors Lechao Xiao, Jeffrey Pennington, Samuel S. Schoenholz
Abstract A fundamental goal in deep learning is the characterization of trainability and generalization of neural networks as a function of their architecture and hyperparameters. In this paper, we discuss these challenging issues in the context of wide neural networks at large depths where we will see that the situation simplifies considerably. To do this, we leverage recent advances that have separately shown: (1) that in the wide network limit, random networks before training are Gaussian Processes governed by a kernel known as the Neural Network Gaussian Process (NNGP) kernel, (2) that at large depths the spectrum of the NNGP kernel simplifies considerably and becomes “weakly data-dependent” and (3) that gradient descent training of wide neural networks is described by a kernel called the Neural Tangent Kernel (NTK) that is related to the NNGP. Here we show that in the large depth limit the spectrum of the NTK simplifies in much the same way as that of the NNGP kernel. By analyzing this spectrum, we arrive at a precise characterization of trainability and a necessary condition for generalization across a range of architectures including Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs). In particular, we find that there are large regions of hyperparameter space where networks can only memorize the training set in the sense they reach perfect training accuracy but completely fail to generalize outside the training set, in contrast with several recent results. By comparing CNNs with and without global average pooling, we show that CNNs without average pooling have very nearly identical learning dynamics to FCNs, while CNNs with pooling contain a correction that alters their generalization performance. We perform a thorough empirical investigation of these theoretical results and find excellent agreement on real datasets.
Tasks Gaussian Processes
Published 2019-12-30
URL https://arxiv.org/abs/1912.13053v1
PDF https://arxiv.org/pdf/1912.13053v1.pdf
PWC https://paperswithcode.com/paper/disentangling-trainability-and-generalization-1
Repo
Framework
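
The NNGP and NTK kernels discussed above can be computed in closed form for wide networks, e.g. with the neural-tangents JAX library (a small example under that assumption; the spectrum of the NTK is what the paper's trainability analysis examines):

```python
import jax.numpy as jnp
from jax import random
from neural_tangents import stax

# A deep fully connected architecture in the wide-network limit;
# depth controls how quickly the kernel spectrum degenerates.
init_fn, apply_fn, kernel_fn = stax.serial(
    *([stax.Dense(512), stax.Relu()] * 10),
    stax.Dense(1)
)

key = random.PRNGKey(0)
x = random.normal(key, (20, 8))

# Closed-form infinite-width kernels: NNGP (Bayesian prior) and NTK
# (gradient-descent training dynamics).
kernels = kernel_fn(x, x, ('nngp', 'ntk'))
eigs = jnp.linalg.eigvalsh(kernels.ntk)      # ascending eigenvalues
print("NTK condition number:", float(eigs[-1] / eigs[0]))
```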

Noise Analysis of Photonic Modulator Neurons

Title Noise Analysis of Photonic Modulator Neurons
Authors Thomas Ferreira de Lima, Alexander N. Tait, Hooman Saeidi, Mitchell A. Nahmias, Hsuan-Tung Peng, Siamak Abbaslou, Bhavin J. Shastri, Paul R. Prucnal
Abstract Neuromorphic photonics relies on efficiently emulating analog neural networks at high speeds. Prior work showed that transducing signals from the optical to the electrical domain and back with transimpedance gain was an efficient approach to implementing analog photonic neurons and scalable networks. Here, we examine modulator-based photonic neuron circuits with passive and active transimpedance gains, with special attention to the sources of noise propagation. We find that a modulator nonlinear transfer function can suppress noise, which is necessary to avoid noise propagation in hardware neural networks. In addition, while efficient modulators can reduce power for an individual neuron, signal-to-noise ratios must be traded off with power consumption at a system level. Active transimpedance amplifiers may help relax this tradeoff for conventional p-n junction silicon photonic modulators, but a passive transimpedance circuit is sufficient when very efficient modulators (i.e. low C and low V-pi) are employed.
Tasks
Published 2019-07-17
URL https://arxiv.org/abs/1907.07325v1
PDF https://arxiv.org/pdf/1907.07325v1.pdf
PWC https://paperswithcode.com/paper/noise-analysis-of-photonic-modulator-neurons
Repo
Framework
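
The claim that a nonlinear modulator transfer function can suppress noise is easy to see numerically: near a flat region of a sigmoid-like transfer curve, input fluctuations are compressed. A toy illustration (not the paper's device model):

```python
import numpy as np

def modulator_transfer(x, v_pi=1.0):
    """Toy sigmoid-like electro-optic transfer curve (not a device model)."""
    return 0.5 * (1.0 + np.sin(np.pi * x / v_pi))

rng = np.random.default_rng(0)
for bias in (0.0, 0.5):                       # mid-slope vs. flat region
    x = bias + 0.05 * rng.standard_normal(100_000)   # noisy drive signal
    y = modulator_transfer(x)
    # At bias 0.5 the local slope is ~0, so the output noise collapses.
    print(f"bias={bias}: in-std={x.std():.4f}  out-std={y.std():.4f}")
```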

Bayesian Tensor Factorisation for Bottom-up Hidden Tree Markov Models

Title Bayesian Tensor Factorisation for Bottom-up Hidden Tree Markov Models
Authors Daniele Castellana, Davide Bacciu
Abstract The Bottom-Up Hidden Tree Markov Model is a highly expressive model for tree-structured data. Unfortunately, it cannot be used in practice due to the intractable size of its state-transition matrix. We propose a new approximation which relies on the Tucker factorisation of tensors. The probabilistic interpretation of this approximation allows us to define a new probabilistic model for tree-structured data. Hence, we define the new approximated model and derive its learning algorithm. Then, we empirically assess the effective power of the new model by evaluating it on two different tasks. In both cases, our model outperforms the other approximated model known in the literature.
Tasks
Published 2019-05-31
URL https://arxiv.org/abs/1905.13528v1
PDF https://arxiv.org/pdf/1905.13528v1.pdf
PWC https://paperswithcode.com/paper/bayesian-tensor-factorisation-for-bottom-up
Repo
Framework
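
The intractability being addressed comes from the bottom-up state-transition tensor: with C hidden states and L children, the full conditional table has C^(L+1) entries, while a Tucker factorisation replaces it with a small core plus factor matrices. A parameter-count sketch in our own notation (illustrative only):

```python
def full_params(C, L):
    """Entries in the full bottom-up transition table
    P(parent | child_1, ..., child_L) with C hidden states."""
    return C ** (L + 1)

def tucker_params(C, L, R):
    """Tucker approximation: an (L+1)-mode core with R entries per mode
    plus L+1 factor matrices of shape (C, R). Far smaller when R << C."""
    return R ** (L + 1) + (L + 1) * C * R

for L in (2, 5, 10):
    print(f"L={L}: full={full_params(10, L):,}  tucker={tucker_params(10, L, 2):,}")
```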

CollaGAN: Collaborative GAN for Missing Image Data Imputation

Title CollaGAN: Collaborative GAN for Missing Image Data Imputation
Authors Dongwook Lee, Junyoung Kim, Won-Jin Moon, Jong Chul Ye
Abstract In many applications requiring multiple inputs to obtain a desired output, if any of the input data is missing, large amounts of bias are often introduced. Although many techniques have been developed for imputing missing data, image imputation is still difficult due to the complicated nature of natural images. To address this problem, here we propose a novel framework for missing image data imputation, called Collaborative Generative Adversarial Network (CollaGAN). CollaGAN converts an image imputation problem into a multi-domain images-to-image translation task, so that a single generator and discriminator network can successfully estimate the missing data using the remaining clean data set. We demonstrate that CollaGAN produces images with higher visual quality compared to existing competing approaches in various image imputation tasks.
Tasks Image Imputation, Imputation
Published 2019-01-28
URL http://arxiv.org/abs/1901.09764v3
PDF http://arxiv.org/pdf/1901.09764v3.pdf
PWC https://paperswithcode.com/paper/collagan-collaborative-gan-for-missing-image
Repo
Framework
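
The key move — casting imputation as multi-domain translation with a single generator — amounts to feeding the generator every available domain plus a code for the missing one. A schematic sketch of the input assembly (our reading of the abstract, not the paper's architecture):

```python
import numpy as np

def collagan_input(images, missing_idx):
    """Assemble the generator input: zero out the missing domain and
    append a one-hot mask telling the generator which domain to fill.

    images : dict domain_index -> (H, W, C) array, one entry per domain
    """
    n_domains = len(images)
    h, w, c = next(iter(images.values())).shape
    stack = []
    for d in range(n_domains):
        x = np.zeros((h, w, c)) if d == missing_idx else images[d]
        stack.append(x)
    mask = np.zeros((h, w, n_domains))
    mask[..., missing_idx] = 1.0                     # target-domain code
    return np.concatenate(stack + [mask], axis=-1)   # (H, W, n*C + n)

imgs = {d: np.ones((4, 4, 1)) * d for d in range(3)}
x = collagan_input(imgs, missing_idx=2)
print(x.shape)   # (4, 4, 6): 3 domains x 1 channel + 3 mask channels
```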

Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics & Losses

Title Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics & Losses
Authors Anna Jungbluth, Xavier Gitiaux, Shane A. Maloney, Carl Shneider, Paul J. Wright, Alfredo Kalaitzis, Michel Deudon, Atılım Güneş Baydin, Yarin Gal, Andrés Muñoz-Jaramillo
Abstract Breakthroughs in our understanding of physical phenomena have traditionally followed improvements in instrumentation. Studies of the magnetic field of the Sun, and its influence on the solar dynamo and space weather events, have benefited from improvements in resolution and measurement frequency of new instruments. However, in order to fully understand the solar cycle, high-quality data across time-scales longer than the typical lifespan of a solar instrument are required. At the moment, discrepancies between measurement surveys prevent the combined use of all available data. In this work, we show that machine learning can help bridge the gap between measurement surveys by learning to super-resolve low-resolution magnetic field images and translate between characteristics of contemporary instruments in orbit. We also introduce the notion of physics-based metrics and losses for super-resolution to preserve underlying physics and constrain the solution space of possible super-resolution outputs.
Tasks Super-Resolution
Published 2019-11-04
URL https://arxiv.org/abs/1911.01490v1
PDF https://arxiv.org/pdf/1911.01490v1.pdf
PWC https://paperswithcode.com/paper/single-frame-super-resolution-of-solar
Repo
Framework
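
A physics-based loss of the kind advocated here can be as simple as penalizing disagreement in a conserved quantity; for magnetograms, total unsigned flux is a natural candidate. An illustrative example (the paper investigates several such metrics and losses):

```python
import numpy as np

def flux_loss(sr, hr, pixel_area=1.0):
    """Penalize mismatch in total unsigned flux between the super-resolved
    (sr) and ground-truth (hr) magnetograms. Inputs are 2D arrays of
    line-of-sight field strength (e.g. in gauss)."""
    flux_sr = np.sum(np.abs(sr)) * pixel_area
    flux_hr = np.sum(np.abs(hr)) * pixel_area
    return (flux_sr - flux_hr) ** 2

def total_loss(sr, hr, lam=1e-3):
    """Pixel loss plus a physics term that constrains the solution space."""
    return np.mean((sr - hr) ** 2) + lam * flux_loss(sr, hr)
```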

Unsupervised Curricula for Visual Meta-Reinforcement Learning

Title Unsupervised Curricula for Visual Meta-Reinforcement Learning
Authors Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn
Abstract In principle, meta-reinforcement learning algorithms leverage experience across many tasks to learn fast reinforcement learning (RL) strategies that transfer to similar tasks. However, current meta-RL approaches rely on manually-defined distributions of training tasks, and hand-crafting these task distributions can be challenging and time-consuming. Can “useful” pre-training tasks be discovered in an unsupervised manner? We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e. an automatic curriculum, by modeling unsupervised interaction in a visual environment. The task distribution is scaffolded by a parametric density model of the meta-learner’s trajectory distribution. We formulate unsupervised meta-RL as information maximization between a latent task variable and the meta-learner’s data distribution, and describe a practical instantiation which alternates between integration of recent experience into the task distribution and meta-learning of the updated tasks. Repeating this procedure leads to iterative reorganization such that the curriculum adapts as the meta-learner’s data distribution shifts. In particular, we show how discriminative clustering for visual representation can support trajectory-level task acquisition and exploration in domains with pixel observations, avoiding pitfalls of alternatives. In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.
Tasks Meta-Learning
Published 2019-12-09
URL https://arxiv.org/abs/1912.04226v1
PDF https://arxiv.org/pdf/1912.04226v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-curricula-for-visual-meta-1
Repo
Framework
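
The information-maximization objective between a latent task variable and the learner's data is, in this family of methods, often approximated with a learned task discriminator whose log-likelihood serves as the reward. A generic sketch of that bound (not the paper's exact instantiation):

```python
import numpy as np

def task_reward(disc_logits, z, log_p_z):
    """Reward approximating a variational lower bound on mutual information:
    r = log q(z | trajectory features) - log p(z), where q is a learned
    task discriminator and p is the task prior.

    disc_logits : (n_tasks,) discriminator logits for the current sample
    z           : index of the latent task being executed
    log_p_z     : log prior probability of task z
    """
    m = disc_logits.max()                          # stable log-softmax
    log_q = disc_logits - m - np.log(np.sum(np.exp(disc_logits - m)))
    return log_q[z] - log_p_z
```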

Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling

Title Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling
Authors Nasser Zalmout, Nizar Habash
Abstract Morphological tagging is challenging for morphologically rich languages due to the large target space and the need for more training data to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variations in the context of full morphological tagging. We use multitask learning for joint morphological modeling of the features within two dialects, and as a knowledge-transfer scheme for cross-dialectal modeling. We use adversarial training to learn dialect-invariant features that can help the knowledge-transfer scheme from the high- to the low-resource variant. We work with two dialectal variants: Modern Standard Arabic (a high-resource “dialect”) and Egyptian Arabic (a low-resource dialect) as a case study. Our models achieve state-of-the-art results for both. Furthermore, adversarial training provides particularly significant improvements when using smaller training datasets.
Tasks Morphological Tagging, Transfer Learning
Published 2019-10-28
URL https://arxiv.org/abs/1910.12702v1
PDF https://arxiv.org/pdf/1910.12702v1.pdf
PWC https://paperswithcode.com/paper/adversarial-multitask-learning-for-joint-1
Repo
Framework
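
Adversarial training for dialect-invariant features is commonly implemented with a gradient reversal layer between the shared encoder and a dialect classifier. A minimal PyTorch sketch of that standard component (the paper may implement its adversary differently; shapes and sizes are illustrative):

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies the gradient by -lambda on
    the backward pass, so the encoder learns features that *fool* the
    dialect classifier while still serving the tagging head."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
tagger_head = nn.Linear(128, 200)      # morphological tag space (illustrative)
dialect_head = nn.Linear(128, 2)       # MSA vs. Egyptian Arabic

x = torch.randn(8, 20, 64)             # batch of token embeddings
h, _ = encoder(x)
tag_logits = tagger_head(h)                                   # main task
dialect_logits = dialect_head(GradReverse.apply(h.mean(dim=1), 1.0))
```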